Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Thomas Renninger
On Sunday 10 October 2010 14:19:28 Ingo Molnar wrote:
 
 * Arjan van de Ven ar...@linux.intel.com wrote:
 
...
  also I have to say that some events are more likely to change than others
  
  function foo in the kernel called is more likely to change than the 
  processor went to THIS frequency. The concept of CPU frequencies has 
  been with us fora long time and is going to be there for a long time 
  as well ..
Right, it's a frequency and a CPU that should get passed along with the
event. The X86/ACPI specific X-state data (even there unused and never will 
get used) should vanish before ARM starts to make use of it.
The idle (power_start/power_end) state definition is worse...

 Most definitely. It's no accident that it took such a long time for this 
 issue to be raised in the first place.
 It's a rare occurance -
Do you agree that this occurance happened now and these events
should get cleaned up before ARM and other archs make use of the broken
interface?
If not, discussing this further, is a big waste of time... and Jean
would have to try to adapt his ARM code on the broken ABI...

 and then 
 we can deal with it intelligently, without breaking stuff unnecessarily.
Can we get this defined a bit clearer so that a patch can be created?

Compatibility can only be achieved by still firing the old events for
some kernel rounds.

I'll send some patches in a new thread with these people in CC.
It would be great to see a decision (in a way that a patch can be created)
how an event change can/should look like if there is urgent need.

Thanks,

 Thomas
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Ingo Molnar

* Thomas Renninger tr...@suse.de wrote:

  Most definitely. It's no accident that it took such a long time for this 
  issue 
  to be raised in the first place. It's a rare occurance -

 Do you agree that this occurance happened now and these events should get 
 cleaned 
 up before ARM and other archs make use of the broken interface?

 If not, discussing this further, is a big waste of time... and Jean would 
 have to 
 try to adapt his ARM code on the broken ABI...

The discussion seems to have died down somewhat. Please re-send to lkml the 
latest 
patches you have to remind everyone of the latest state of things - the merge 
window 
is getting near.

My only compatibility/ABI point is basically that it shouldnt break _existing_ 
tracepoints (and users thereof). If your latest bits meet that then it ought to 
be a 
good first step. You are free to (and encouraged to) introduce more complete 
sets of 
events.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
  
  * Thomas Renninger tr...@suse.de wrote:
  
Most definitely. It's no accident that it took such a long time for 
this issue 
to be raised in the first place. It's a rare occurance -
  
   Do you agree that this occurance happened now and these events should get 
   cleaned 
   up before ARM and other archs make use of the broken interface?
  
   If not, discussing this further, is a big waste of time... and Jean would 
   have to 
   try to adapt his ARM code on the broken ABI...
  
  The discussion seems to have died down somewhat. Please re-send to lkml the 
  latest 
  patches you have to remind everyone of the latest state of things - the 
  merge window 
  is getting near.
  
  My only compatibility/ABI point is basically that it shouldnt break 
  _existing_ 
  tracepoints (and users thereof). If your latest bits meet that then it 
  ought to be a 
  good first step. You are free to (and encouraged to) introduce more 
  complete sets of 
  events.
 
 Can we deprecate and eventually remove the old ones, or will we be forever 
 obliged 
 to carry the old ones too?

We most definitely want to deprecate and remove the old ones - but we want to 
give 
instrumentation software some migration time for that.

Jean, Arjan, what would be a feasible and practical deprecation period for 
that? One 
kernel cycle?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Peter Zijlstra
On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
 
 * Thomas Renninger tr...@suse.de wrote:
 
   Most definitely. It's no accident that it took such a long time for this 
   issue 
   to be raised in the first place. It's a rare occurance -
 
  Do you agree that this occurance happened now and these events should get 
  cleaned 
  up before ARM and other archs make use of the broken interface?
 
  If not, discussing this further, is a big waste of time... and Jean would 
  have to 
  try to adapt his ARM code on the broken ABI...
 
 The discussion seems to have died down somewhat. Please re-send to lkml the 
 latest 
 patches you have to remind everyone of the latest state of things - the merge 
 window 
 is getting near.
 
 My only compatibility/ABI point is basically that it shouldnt break 
 _existing_ 
 tracepoints (and users thereof). If your latest bits meet that then it ought 
 to be a 
 good first step. You are free to (and encouraged to) introduce more complete 
 sets of 
 events.

Can we deprecate and eventually remove the old ones, or will we be
forever obliged to carry the old ones too?
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Arjan van de Ven

 On 10/19/2010 4:52 AM, Ingo Molnar wrote:

* Peter Zijlstrapet...@infradead.org  wrote:


On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:

* Thomas Renningertr...@suse.de  wrote:


Most definitely. It's no accident that it took such a long time for this issue
to be raised in the first place. It's a rare occurance -

Do you agree that this occurance happened now and these events should get 
cleaned
up before ARM and other archs make use of the broken interface?

If not, discussing this further, is a big waste of time... and Jean would have 
to
try to adapt his ARM code on the broken ABI...

The discussion seems to have died down somewhat. Please re-send to lkml the 
latest
patches you have to remind everyone of the latest state of things - the merge 
window
is getting near.

My only compatibility/ABI point is basically that it shouldnt break _existing_
tracepoints (and users thereof). If your latest bits meet that then it ought to 
be a
good first step. You are free to (and encouraged to) introduce more complete 
sets of
events.

Can we deprecate and eventually remove the old ones, or will we be forever 
obliged
to carry the old ones too?

We most definitely want to deprecate and remove the old ones - but we want to 
give
instrumentation software some migration time for that.

Jean, Arjan, what would be a feasible and practical deprecation period for 
that? One
kernel cycle?


more like a year

for some time software needs to support both, especially if popular 
distros stick to an older kernel like *cough* RHEL6


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Ingo Molnar

* Arjan van de Ven ar...@linux.intel.com wrote:

  On 10/19/2010 4:52 AM, Ingo Molnar wrote:
 * Peter Zijlstrapet...@infradead.org  wrote:
 
 On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
 * Thomas Renningertr...@suse.de  wrote:
 
 Most definitely. It's no accident that it took such a long time for this 
 issue
 to be raised in the first place. It's a rare occurance -
 Do you agree that this occurance happened now and these events should get 
 cleaned
 up before ARM and other archs make use of the broken interface?
 
 If not, discussing this further, is a big waste of time... and Jean would 
 have to
 try to adapt his ARM code on the broken ABI...
 The discussion seems to have died down somewhat. Please re-send to lkml 
 the latest
 patches you have to remind everyone of the latest state of things - the 
 merge window
 is getting near.
 
 My only compatibility/ABI point is basically that it shouldnt break 
 _existing_
 tracepoints (and users thereof). If your latest bits meet that then it 
 ought to be a
 good first step. You are free to (and encouraged to) introduce more 
 complete sets of
 events.
 Can we deprecate and eventually remove the old ones, or will we be forever 
 obliged
 to carry the old ones too?
 We most definitely want to deprecate and remove the old ones - but we want 
 to give
 instrumentation software some migration time for that.
 
 Jean, Arjan, what would be a feasible and practical deprecation period for 
 that? One
 kernel cycle?
 
 more like a year
 
 for some time software needs to support both, especially if popular distros 
 stick 
 to an older kernel like *cough* RHEL6

Sure, you can support both. But as long as support for the _new_ events is 
included 
in PowerTop there's no need to keep the duality upstream. Running ancient 
PowerTop 
on fresh kernels is not common.

An old RHEL kernel will still keep on working as you can keep support for old 
events 
in PowerTop as long as you wish to.

The new kernel also wont 'overwrite' old events with new definitions in the 
future, 
so PowerTop will keep working for as long as you want to support older kernels.

Does that sound good?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-19 Thread Arjan van de Ven

 On 10/19/2010 6:50 AM, Ingo Molnar wrote:

* Arjan van de Venar...@linux.intel.com  wrote:


  On 10/19/2010 4:52 AM, Ingo Molnar wrote:

* Peter Zijlstrapet...@infradead.org   wrote:


On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:

* Thomas Renningertr...@suse.de   wrote:


Most definitely. It's no accident that it took such a long time for this issue
to be raised in the first place. It's a rare occurance -

Do you agree that this occurance happened now and these events should get 
cleaned
up before ARM and other archs make use of the broken interface?

If not, discussing this further, is a big waste of time... and Jean would have 
to
try to adapt his ARM code on the broken ABI...

The discussion seems to have died down somewhat. Please re-send to lkml the 
latest
patches you have to remind everyone of the latest state of things - the merge 
window
is getting near.

My only compatibility/ABI point is basically that it shouldnt break _existing_
tracepoints (and users thereof). If your latest bits meet that then it ought to 
be a
good first step. You are free to (and encouraged to) introduce more complete 
sets of
events.

Can we deprecate and eventually remove the old ones, or will we be forever 
obliged
to carry the old ones too?

We most definitely want to deprecate and remove the old ones - but we want to 
give
instrumentation software some migration time for that.

Jean, Arjan, what would be a feasible and practical deprecation period for 
that? One
kernel cycle?

more like a year

for some time software needs to support both, especially if popular distros 
stick
to an older kernel like *cough* RHEL6

Sure, you can support both. But as long as support for the _new_ events is 
included
in PowerTop there's no need to keep the duality upstream. Running ancient 
PowerTop
on fresh kernels is not common.

An old RHEL kernel will still keep on working as you can keep support for old 
events
in PowerTop as long as you wish to.

The new kernel also wont 'overwrite' old events with new definitions in the 
future,
so PowerTop will keep working for as long as you want to support older kernels.

Does that sound good?


this does not scale much long term, eg this only works if this is only 
done once, and these points are stable afterwards.
otherwise we get 25 of those different  workarounds for kernel ABI 
breakage into all different projects, and it becomes

untestable for all the poor software writers...

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-18 Thread Jean Pihet
On Sat, Oct 9, 2010 at 8:36 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
 On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy tar...@gmail.com wrote:
 On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:


 The thing is, Arjan is 100% right that a library for this is not a
 'solution', it's an unnecessary complication.
 Yes. sounds like overengineering.

 I also want to remind people that backwards compatibility should
 always absolutely be the #1 priority. Using libraries to hide
 differences is a totally moronic thing to do, because if you can do a
 compatibility library with good interfaces, then damn it, the kernel
 interface should already _be_ that good interface.
Agree on that. The idea is to have the kernel interfaces cleaned up
and so have better user space apps in the end.

 And no, even if you interact purely with open source programs, the
 backwards compatibility requirement doesn't go away. It's a damn pain
 in the ass to have to recompile, and it means that you have a much
 harder time mixing and matching, and just updating the kernel on top
 of a standard distribution.

 So changing kernel interfaces that get exported to user space is
 always a disaster. Anybody who _designs_ for that kind of disaster
 shouldn't be participating in kernel development, because they've
 shown themselves to be unable to understand the pain and suffering.

 Yes, we do it. Sometimes we change interfaces because not changing
 them is too damn painful. But it should absolutely not be the design
 model.

So what is the best way to have the power tracing events elegantly cleaned up?
The proposed patch 4/4 [1] introduces a new Kconfig option
CONFIG_DEPRECATED_POWER_EVENT_TRACING which allows to select to map
the trace points to the old _OR_ to the new events API, only for the
already existing events. This gives some time for the adaptation of
the user space apps.

I understand this could require a kernel re-compilation in order to
use the old events API but I really want to avoid to duplicate the
trace points in the code to instrument, e.g.:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 199dcb9..013c274 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -354,5 +354,7 @@ void cpufreq_notify_transition(struct
cpufreq_freqs *freqs, unsigned int state)
adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
dprintk(FREQ: %lu - CPU: %lu, (unsigned long)freqs-new,
(unsigned long)freqs-cpu);
trace_power_frequency(POWER_PSTATE, freqs-new, freqs-cpu);
+   trace_power_switch_state(POWER_PSTATE, freqs-new, freqs-cpu,
+smp_processor_id());

The proposed patch only changes the power tracing events API
definition files, not the code to instrument.

I am OK to re-spin the patches, only if a compromise is agreed on.
As said before the kernel Documentation and pytimechart user space
tools patches will be provided as well.


Linus


Thanks,
Jean

[1] http://marc.info/?l=linux-omapm=128620575900689w=2
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-10 Thread Peter Zijlstra
On Sat, 2010-10-09 at 21:39 -0400, Steven Rostedt wrote:
 I've been hesitant in the pass from doing the TRACE_EVENT_ABI()
 before, because Peter Zijlstra (who is currently MIA) has been strongly
 against it. 

I see no point in the TRACE_EVENT_ABI() because if I need to change such
a tracepoint to reflect changes in the kernel then I will freely do so.

Even seemingly stable points like sched_switch(), which we all agree
will stay around forever (gotta have context switches on a multi-tasking
OS) will not stay stable when we add/change scheduling policies.

Sure, the prev and next task thing will stay the same, but the meaning
and interpretation of things like the prio field will not, esp when we
go add something like a deadline scheduler that isn't priority based.

So one possibility is to simply remove all that information from the
tracepoints, remove the prio and state fields, but how useful is that?

I guess what I'm saying is that even if we were to provide _ABI I see us
getting into this very same argument over and over again, making me want
to remove all this trace event muck right now before it gets worse.

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-10 Thread Ingo Molnar

* Arjan van de Ven ar...@linux.intel.com wrote:

  On 10/8/2010 11:28 PM, Ingo Molnar wrote:
 * Mathieu Desnoyersmathieu.desnoy...@efficios.com  wrote:
 
 * Arjan van de Ven (ar...@linux.intel.com) wrote:
   On 10/8/2010 1:38 AM, Ingo Molnar wrote:
 The fundamental thing about tracing/instrumentation is that there
 are no deep ABI needs: it's all about analyzing development kernels
 (and a few select versions that get the enterprise treatment) but
 otherwise the half-life of this kind of information is very short.
 
 So we dont want to tie ourselves down with excessive ABIs.
 ok I'll start working on a second mechanism then to export
 information that applications need ;-( it'll look a lot like tracing
 I suppose ;-(
 What's wrong with doing the compatibility layer in a LGPL library
 shipped with the kernel tree under tools/ ? Why does everything *have*
 to be done in kernel-space ? Why are you so focused on making your
 application interact directly with kernel ABIs ?
 The thing is, Arjan is 100% right that a library for this is not a
 'solution', it's an unnecessary complication.
 
 What i suggested in my mail was to _keep existing events_. I.e. do not
 break powertop. We are 100% happy that we _have_ such apps, and we
 should do reasonable things to not break them.
 
 If we need to change events, we can add a new event. The old events will
 lose their relevance without us having to do much - and without us
 having to break powertop, pytimechart, etc. We can even have periods of
 overlap when both events are available - to give instrumentation apps
 time to learn the new events.
 
 I.e. it's not an ABI in the classic sense - we do not (because we
 cannot) guarantee the infinite availability of these events. But we can
 guarantee that the fields do not change in some stupid, avoidable way.
 
 also I have to say that some events are more likely to change than others
 
 function foo in the kernel called is more likely to change than the 
 processor went to THIS frequency. The concept of CPU frequencies has 
 been with us fora long time and is going to be there for a long time 
 as well ..

Most definitely. It's no accident that it took such a long time for this 
issue to be raised in the first place. It's a rare occurance - and then 
we can deal with it intelligently, without breaking stuff unnecessarily.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-10 Thread Steven Rostedt
On Sun, 2010-10-10 at 08:41 +0200, Peter Zijlstra wrote:
 On Sat, 2010-10-09 at 21:39 -0400, Steven Rostedt wrote:
  I've been hesitant in the pass from doing the TRACE_EVENT_ABI()
  before, because Peter Zijlstra (who is currently MIA) has been strongly
  against it. 
 
 I see no point in the TRACE_EVENT_ABI() because if I need to change such
 a tracepoint to reflect changes in the kernel then I will freely do so.
 
 Even seemingly stable points like sched_switch(), which we all agree
 will stay around forever (gotta have context switches on a multi-tasking
 OS) will not stay stable when we add/change scheduling policies.
 
 Sure, the prev and next task thing will stay the same, but the meaning
 and interpretation of things like the prio field will not, esp when we
 go add something like a deadline scheduler that isn't priority based.
 
 So one possibility is to simply remove all that information from the
 tracepoints, remove the prio and state fields, but how useful is that?
 
 I guess what I'm saying is that even if we were to provide _ABI I see us
 getting into this very same argument over and over again, making me want
 to remove all this trace event muck right now before it gets worse.

Then how's this as a compromise. We do not add a TRACE_EVENT_ABI(), but
instead manually add the ABI interface to existing tracepoints. Let's
use the sched example you shown above.

We can connect to the sched_switch() tracepoint manually in something
perhaps called kernel/abi_trace.c or trace_abi.c (whatever).

Here we create the directories manually there:

/sys/kernel/event/sched/sched_switch/

But this sched_switch will only include the prev and next pids, comms,
and perhaps even run state. But not the prio (since we see that
changing).

It would then need the code to enable the trace point with:

register_trace_sched_switch(sched_switch_abi_probe, NULL);

Where we have

static void
sched_switch_abi_probe(void *ignore,
struct task_switch *prev,
struct task_struct *next)
{
/* code to grab just the ABI stuff */
}

And this code can then record to what ever hooked to it.

Making this a manual effort will make it easier to control what becomes
an ABI. We can have long discussions and flames over what goes here. But
that's good since debates before an ABI is created is much better than
debates after one is created.

I'm afraid that a easy macro called TRACE_EVENT_ABI() would have the
same issue. ABIs may be created too quickly before they are thought
through.

Thoughts?

-- Steve


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Ingo Molnar

* Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote:

 * Arjan van de Ven (ar...@linux.intel.com) wrote:
   On 10/8/2010 1:38 AM, Ingo Molnar wrote:
 
  The fundamental thing about tracing/instrumentation is that there 
  are no deep ABI needs: it's all about analyzing development kernels 
  (and a few select versions that get the enterprise treatment) but 
  otherwise the half-life of this kind of information is very short.
 
  So we dont want to tie ourselves down with excessive ABIs.
 
  ok I'll start working on a second mechanism then to export 
  information that applications need ;-( it'll look a lot like tracing 
  I suppose ;-(
 
 What's wrong with doing the compatibility layer in a LGPL library 
 shipped with the kernel tree under tools/ ? Why does everything *have* 
 to be done in kernel-space ? Why are you so focused on making your 
 application interact directly with kernel ABIs ?

The thing is, Arjan is 100% right that a library for this is not a 
'solution', it's an unnecessary complication.

What i suggested in my mail was to _keep existing events_. I.e. do not 
break powertop. We are 100% happy that we _have_ such apps, and we 
should do reasonable things to not break them.

If we need to change events, we can add a new event. The old events will 
lose their relevance without us having to do much - and without us 
having to break powertop, pytimechart, etc. We can even have periods of 
overlap when both events are available - to give instrumentation apps 
time to learn the new events.

I.e. it's not an ABI in the classic sense - we do not (because we 
cannot) guarantee the infinite availability of these events. But we can 
guarantee that the fields do not change in some stupid, avoidable way.

Changing an existing event in some non-append way is just sloppy and we 
can do better.

Arjan, Pierre, does that sound OK to you?

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Pierre Tardy
On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:


 The thing is, Arjan is 100% right that a library for this is not a
 'solution', it's an unnecessary complication.
Yes. sounds like overengineering.

 If we need to change events, we can add a new event. The old events will
 lose their relevance without us having to do much - and without us
 having to break powertop, pytimechart, etc. We can even have periods of
 overlap when both events are available - to give instrumentation apps
 time to learn the new events.

 I.e. it's not an ABI in the classic sense - we do not (because we
 cannot) guarantee the infinite availability of these events. But we can
 guarantee that the fields do not change in some stupid, avoidable way.

 Changing an existing event in some non-append way is just sloppy and we
 can do better.

 Arjan, Pierre, does that sound OK to you?
Yes. Its reasonable.
Note that pytimechart patch for new power API is already ready and
just waiting for this issue to be decided.

Regards,
Pierre
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Arjan van de Ven

 On 10/8/2010 11:28 PM, Ingo Molnar wrote:

* Mathieu Desnoyersmathieu.desnoy...@efficios.com  wrote:


* Arjan van de Ven (ar...@linux.intel.com) wrote:

  On 10/8/2010 1:38 AM, Ingo Molnar wrote:

The fundamental thing about tracing/instrumentation is that there
are no deep ABI needs: it's all about analyzing development kernels
(and a few select versions that get the enterprise treatment) but
otherwise the half-life of this kind of information is very short.

So we dont want to tie ourselves down with excessive ABIs.

ok I'll start working on a second mechanism then to export
information that applications need ;-( it'll look a lot like tracing
I suppose ;-(

What's wrong with doing the compatibility layer in a LGPL library
shipped with the kernel tree under tools/ ? Why does everything *have*
to be done in kernel-space ? Why are you so focused on making your
application interact directly with kernel ABIs ?

The thing is, Arjan is 100% right that a library for this is not a
'solution', it's an unnecessary complication.

What i suggested in my mail was to _keep existing events_. I.e. do not
break powertop. We are 100% happy that we _have_ such apps, and we
should do reasonable things to not break them.

If we need to change events, we can add a new event. The old events will
lose their relevance without us having to do much - and without us
having to break powertop, pytimechart, etc. We can even have periods of
overlap when both events are available - to give instrumentation apps
time to learn the new events.

I.e. it's not an ABI in the classic sense - we do not (because we
cannot) guarantee the infinite availability of these events. But we can
guarantee that the fields do not change in some stupid, avoidable way.


also I have to say that some events are more likely to change than others

function foo in the kernel called is more likely to change than the 
processor went to THIS frequency.
The concept of CPU frequencies has been with us fora long time and is 
going to be there for a long time as well ..


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Linus Torvalds
On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy tar...@gmail.com wrote:
 On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:


 The thing is, Arjan is 100% right that a library for this is not a
 'solution', it's an unnecessary complication.
 Yes. sounds like overengineering.

I also want to remind people that backwards compatibility should
always absolutely be the #1 priority. Using libraries to hide
differences is a totally moronic thing to do, because if you can do a
compatibility library with good interfaces, then damn it, the kernel
interface should already _be_ that good interface.

And no, even if you interact purely with open source programs, the
backwards compatibility requirement doesn't go away. It's a damn pain
in the ass to have to recompile, and it means that you have a much
harder time mixing and matching, and just updating the kernel on top
of a standard distribution.

So changing kernel interfaces that get exported to user space is
always a disaster. Anybody who _designs_ for that kind of disaster
shouldn't be participating in kernel development, because they've
shown themselves to be unable to understand the pain and suffering.

Yes, we do it. Sometimes we change interfaces because not changing
them is too damn painful. But it should absolutely not be the design
model.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Steven Rostedt
On Sat, 2010-10-09 at 11:36 -0700, Linus Torvalds wrote:
 On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy tar...@gmail.com wrote:
  On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:
 
 
  The thing is, Arjan is 100% right that a library for this is not a
  'solution', it's an unnecessary complication.
  Yes. sounds like overengineering.
 
 I also want to remind people that backwards compatibility should
 always absolutely be the #1 priority. Using libraries to hide
 differences is a totally moronic thing to do, because if you can do a
 compatibility library with good interfaces, then damn it, the kernel
 interface should already _be_ that good interface.
 
 And no, even if you interact purely with open source programs, the
 backwards compatibility requirement doesn't go away. It's a damn pain
 in the ass to have to recompile, and it means that you have a much
 harder time mixing and matching, and just updating the kernel on top
 of a standard distribution.
 
 So changing kernel interfaces that get exported to user space is
 always a disaster. Anybody who _designs_ for that kind of disaster
 shouldn't be participating in kernel development, because they've
 shown themselves to be unable to understand the pain and suffering.
 
 Yes, we do it. Sometimes we change interfaces because not changing
 them is too damn painful. But it should absolutely not be the design
 model.

The difference here compared to all other user interfaces, is that this
interface has the sole purpose of showing what is happening inside the
kernel. By saying that we expose this to userspace, it must too be
stable is saying that all kernel internals that use trace events must
never change.

The big push against tracepoints/trace-markers/trace-events in the
beginning was the fear that they will hinder kernel development because
they become interfaces for users to see what is happening inside the
kernel. When I wrote the interface, I put it in the debugfs system so
people will know that this is a debug interface and can change without
notice.

Trace-events, unlike syscalls, may change depending on how you compiled
the kernel. There's no guarantee that they will even exist on a system.

If all trace-events are now stable ABI, then I suggest we stop adding
any more events, and only add new ones to places that we do not expect
to develop the kernel on anymore.

Not sure what other solution there is. Trace points have been added way
too freely, because any maintainer could add them to their system any
way they felt like it. Now if they are frozen in stone, then the code
that they expose must also be frozen.

-- Steve


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Steven Rostedt
On Sat, 2010-10-09 at 09:19 -0700, Arjan van de Ven wrote:

  I.e. it's not an ABI in the classic sense - we do not (because we
  cannot) guarantee the infinite availability of these events. But we can
  guarantee that the fields do not change in some stupid, avoidable way.
 
 also I have to say that some events are more likely to change than others
 
 function foo in the kernel called is more likely to change than the 
 processor went to THIS frequency.
 The concept of CPU frequencies has been with us fora long time and is 
 going to be there for a long time as well ..

Perhaps for basic concepts, we need a standard trace-event. Are people
willing to have a TRACE_EVENT_ABI() (it's trivial to write), and we can
mark those events with that macro that we know tools depend on.

These events can be exposed in a /sys/kernel/events/... directory, to
let tools know what what events they can rely on.

We've talked about doing this before, I've just been waiting to hear a
consensus on if we should. I know Peter Zijlstra was against the idea,
and too bad he's off gallivanting to share his input now.

-- Steve
 

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Linus Torvalds
On Sat, Oct 9, 2010 at 2:15 PM, Steven Rostedt rost...@goodmis.org wrote:

 The difference here compared to all other user interfaces, is that this
 interface has the sole purpose of showing what is happening inside the
 kernel.

Bogus and dishonest argument.

Listen to yourself, and read this thread again.

The thread was about doing some kind of open-source library to allow
non-open-source access to these events, and keeping backwards
compatibility in user space. In fact, that is what you yourself said.

So you claimed it could be backwards-compatible. If that's the case,
then there is no excuse for not being so in the kernel.

You can't have it both ways. Stop the f*cking waffling.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-09 Thread Steven Rostedt
On Sat, 2010-10-09 at 16:20 -0700, Linus Torvalds wrote:
 On Sat, Oct 9, 2010 at 2:15 PM, Steven Rostedt rost...@goodmis.org wrote:
 
  The difference here compared to all other user interfaces, is that this
  interface has the sole purpose of showing what is happening inside the
  kernel.
 
 Bogus and dishonest argument.
 
 Listen to yourself, and read this thread again.
 
 The thread was about doing some kind of open-source library to allow
 non-open-source access to these events, and keeping backwards
 compatibility in user space. In fact, that is what you yourself said.
 
 So you claimed it could be backwards-compatible. If that's the case,
 then there is no excuse for not being so in the kernel.
 
 You can't have it both ways. Stop the f*cking waffling.

Let me rephrase it then, and lets forget about the library. I was just
brain storming ideas.

I'm all for labeling specific trace points as ABI, such that, these
trace points have had sufficient thought and are not expected to change
in the near future. But I'm against the idea that any tracepoint that
has been shown to userspace can be considered stable.

With or without libraries, I'm for two kinds of interfaces: One that is
stable and has been thoroughly thought through, and one that is free for
the maintainers to have an interface to let them see what is happening
in the kernel, even on a production system, but be able to change them
whenever they feel the need.

That's the basis of my idea. A stable backward-compatible interface, and
an interface that is unstable for developers. Whether we put the stable
interface into a library (to keep the ugliness from developers, which
you obviously do not like), or two, distinctly label the tracepoint as
ABI, to let the developers and everyone else know what tracepoints an
application can count on and what ones they should not.

Thus my waffling is really wanting both, a stable ABI and an unstable
one. I've been hesitant in the pass from doing the TRACE_EVENT_ABI()
before, because Peter Zijlstra (who is currently MIA) has been strongly
against it.

-- Steve
 



--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Tejun Heo
Hello,

On 10/07/2010 05:58 PM, Frederic Weisbecker wrote:
 I really feel uncomfortable with this tracepoint/ABI problem
 Mathieu suggested we start a user library that could handle these
 changes when they are really necessary.
 
 Thoughts?
 
 (Adding Tejun in Cc).

Given that tracepoints are supposed to make internal operation
visible.  I don't think it's a good idea to make it part of fixed ABI.
Maybe some core part can be put in stone but I think things like
internal workqueue implementation should be changeable without
worrying about ABI issues.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Ingo Molnar

* Tejun Heo t...@kernel.org wrote:

 Hello,
 
 On 10/07/2010 05:58 PM, Frederic Weisbecker wrote:
  I really feel uncomfortable with this tracepoint/ABI problem
  Mathieu suggested we start a user library that could handle these
  changes when they are really necessary.
  
  Thoughts?
  
  (Adding Tejun in Cc).
 
 Given that tracepoints are supposed to make internal operation 
 visible.  I don't think it's a good idea to make it part of fixed ABI.

Yep, exactly.

OTOH since it exports information we can do disciplined versioning and 
extensions only - i.e. leave the old power events around, add the new 
ones with new distinct names, and phase out the old ones in a kernel 
cycle or two. It's not hard to do.

That way apps can support old kernels too (if they want to), but new 
events as well - and all in a controlled, non-disruptive manner.

More importantly, the kernel wont have cruft and will have no ABI 
restrictions - the only 'restriction' is to treat information in an 
append-only manner (i.e. change the event name if you change it 
materially) - and that's not a big deal here.

The fundamental thing about tracing/instrumentation is that there are no 
deep ABI needs: it's all about analyzing development kernels (and a few 
select versions that get the enterprise treatment) but otherwise the 
half-life of this kind of information is very short.

So we dont want to tie ourselves down with excessive ABIs.

 Maybe some core part can be put in stone but I think things like 
 internal workqueue implementation should be changeable without 
 worrying about ABI issues.

That's most definitely so! There is and will be zero back-coupling from 
workqueue tracepoints to workqueue internals. Dont worry about this.

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Arjan van de Ven

 On 10/8/2010 1:38 AM, Ingo Molnar wrote:


The fundamental thing about tracing/instrumentation is that there are no
deep ABI needs: it's all about analyzing development kernels (and a few
select versions that get the enterprise treatment) but otherwise the
half-life of this kind of information is very short.

So we dont want to tie ourselves down with excessive ABIs.



ok I'll start working on a second mechanism then to export information 
that applications need ;-(

it'll look a lot like tracing I suppose ;-(

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Mathieu Desnoyers
* Arjan van de Ven (ar...@linux.intel.com) wrote:
  On 10/8/2010 1:38 AM, Ingo Molnar wrote:

 The fundamental thing about tracing/instrumentation is that there are no
 deep ABI needs: it's all about analyzing development kernels (and a few
 select versions that get the enterprise treatment) but otherwise the
 half-life of this kind of information is very short.

 So we dont want to tie ourselves down with excessive ABIs.


 ok I'll start working on a second mechanism then to export information  
 that applications need ;-(
 it'll look a lot like tracing I suppose ;-(

What's wrong with doing the compatibility layer in a LGPL library shipped with
the kernel tree under tools/ ? Why does everything *have* to be done in
kernel-space ? Why are you so focused on making your application interact
directly with kernel ABIs ?

I'm being direct because there are trivial solutions to your problem that you
are rejecting without due consideration. (and also I just had one coffee too
many) ;-)

Regards,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency RD Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Arjan van de Ven

 On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote:

* Arjan van de Ven (ar...@linux.intel.com) wrote:

  On 10/8/2010 1:38 AM, Ingo Molnar wrote:

The fundamental thing about tracing/instrumentation is that there are no
deep ABI needs: it's all about analyzing development kernels (and a few
select versions that get the enterprise treatment) but otherwise the
half-life of this kind of information is very short.

So we dont want to tie ourselves down with excessive ABIs.


ok I'll start working on a second mechanism then to export information
that applications need ;-(
it'll look a lot like tracing I suppose ;-(

What's wrong with doing the compatibility layer in a LGPL library shipped with
the kernel tree under tools/ ?


because that is not workable... at least nobody has shown to be able to 
make this work.
libraries (after compilation) live in /lib or /usr/lib (or lib64 I 
suppose).
what mechanism ensures that a user who compiles his kernel gets a 
library compatible with that kernel in /usr/lib?

and can said library deal with older kernels too? And distro kernels?


Why does everything *have* to be done in
kernel-space

it doesn't. but the alternative must be workable.

  Why are you so focused on making your application interact
directly with kernel ABIs ?

I'm being direct because there are trivial solutions to your problem that you
are rejecting without due consideration. (and also I just had one coffee too
many) ;-)


since you seem to think that dealing with such a library is trivial... 
how about you do it for one function even, to

show that the deployment/use-in-an-app is workable.
I'd be more than happy to use it if it's workable and the API is at 
least halfway sane.


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Steven Rostedt
On Fri, 2010-10-08 at 09:22 -0700, Arjan van de Ven wrote:
 On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote:

 because that is not workable... at least nobody has shown to be able to 
 make this work.
 libraries (after compilation) live in /lib or /usr/lib (or lib64 I 
 suppose).
 what mechanism ensures that a user who compiles his kernel gets a 
 library compatible with that kernel in /usr/lib?
 and can said library deal with older kernels too? And distro kernels?

Perhaps we should have make install of a kernel also install this
library?

Have two libraries? One that is linked to the app, the other that can
search for another library to link on load too (like a kernel.ld.so)

Then we could see the kernel version, and search for a library that is
compatible, and load that one.

The app only needs to worry about loading the generic library. The
generic library can test for compatible libraries for the kernel.

Could just be..

  libkernel.ld.so  which then loads..

  /lib/modules/2.6.36/libkernel.so


Just a little brain storming.

-- Steve


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Mathieu Desnoyers
* Arjan van de Ven (ar...@linux.intel.com) wrote:
  On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote:
 * Arjan van de Ven (ar...@linux.intel.com) wrote:
   On 10/8/2010 1:38 AM, Ingo Molnar wrote:
 The fundamental thing about tracing/instrumentation is that there are no
 deep ABI needs: it's all about analyzing development kernels (and a few
 select versions that get the enterprise treatment) but otherwise the
 half-life of this kind of information is very short.

 So we dont want to tie ourselves down with excessive ABIs.

 ok I'll start working on a second mechanism then to export information
 that applications need ;-(
 it'll look a lot like tracing I suppose ;-(
 What's wrong with doing the compatibility layer in a LGPL library shipped 
 with
 the kernel tree under tools/ ?

 because that is not workable... at least nobody has shown to be able to  
 make this work.
 libraries (after compilation) live in /lib or /usr/lib (or lib64 I  
 suppose).
 what mechanism ensures that a user who compiles his kernel gets a  
 library compatible with that kernel in /usr/lib?

I don't think the perf tools/ do it right at the moment, but here is my
proposal:

Currently, we need to do make install from the tools/ directory. Since kernel
developers are lazy, I would propose a CONFIG_INSTALL_TOOLS default N config
option that would let make install from the root of the kernel tree install the
tools too. (looking at my inbox..) As Steven just beat me to it, see his lib
versioning proposal. ;)

The library would present an API to the application that would let apps consume
specific events of interest. Translation of fixed event names/fields into the
current kernel version tracepoint names/fields would be performed by the lib,
and the library would also deal with reading the perf events through the perf
ABI and would act as a middle-man to make sure they are always perceived by the
application in the same way.

 and can said library deal with older kernels too? And distro kernels?

Steven's proposal should work.


 Why does everything *have* to be done in
 kernel-space
 it doesn't. but the alternative must be workable.
   Why are you so focused on making your application interact
 directly with kernel ABIs ?

 I'm being direct because there are trivial solutions to your problem that you
 are rejecting without due consideration. (and also I just had one coffee too
 many) ;-)

 since you seem to think that dealing with such a library is trivial...  
 how about you do it for one function even, to
 show that the deployment/use-in-an-app is workable.
 I'd be more than happy to use it if it's workable and the API is at  
 least halfway sane.

I currently have a lot on my plate with trace format and ring buffer, but if
anyone is interested in trying to implement this, I can look at it and provide
feedback/hints.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency RD Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Frank Ch. Eigler
Hi -

On Fri, Oct 08, 2010 at 01:21:35PM -0400, Steven Rostedt wrote:
 [...]
 Perhaps we should have make install of a kernel also install this
 library?
 [...]
 The app only needs to worry about loading the generic library. The
 generic library can test for compatible libraries for the kernel.
 [...]

If this library were to be distributed with the kernel, what would
make the generic side of the interface any less permanent than a
kernel ABI?  That is, if there is a libkernel-internals.so built from
kernel sources, wouldn't its ABI become necessarily as fixed as any
old syscall or procfs file?

One can have some backward compatibility with symbol versioning et
al., but would that be sufficiently powerful to avoid handcuffing
kernel developers' inclinations to make random future changes?

- FChE
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-08 Thread Steven Rostedt
On Fri, 2010-10-08 at 13:49 -0400, Frank Ch. Eigler wrote:
 Hi -
 
 On Fri, Oct 08, 2010 at 01:21:35PM -0400, Steven Rostedt wrote:
  [...]
  Perhaps we should have make install of a kernel also install this
  library?
  [...]
  The app only needs to worry about loading the generic library. The
  generic library can test for compatible libraries for the kernel.
  [...]
 
 If this library were to be distributed with the kernel, what would
 make the generic side of the interface any less permanent than a
 kernel ABI?  That is, if there is a libkernel-internals.so built from
 kernel sources, wouldn't its ABI become necessarily as fixed as any
 old syscall or procfs file?

One thing, the backwards compatibility would reside in user space. The
big advantage to that than for this to be in kernel space is that it is
only there when used. When we have backward compatibility in the kernel,
it's there in memory for everyone, whether you want it or not.

 
 One can have some backward compatibility with symbol versioning et
 al., but would that be sufficiently powerful to avoid handcuffing
 kernel developers' inclinations to make random future changes?
 

Sure, also note, that this is a two lib design. We still have
a /usr/lib/libkernel.so that the apps will interface with. This will
need to load in the other kernel versions. When we change interfaces, we
can make the /usr/lib/libkernel.so.1 .2 etc.

Also doing this dynamically from a library, we can check if the kernel
versions work. It can test if the used function is compatible or not,
use an older version, or just tell the user sorry, please update your
libkernel.so for this kernel. Doing this in userspace will allow a lot
more flexibility. We just need to think hard how these transactions will
work, and make it flexible for future enhancements.

But the kernel is free to do whatever it wants. The libraries will need
to worry about keeping the applications happy ;-)

-- Steve


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Mathieu Desnoyers
[ Adding a few more CCs, since this discussion is about a tracepoint
  userspace ABI policy, which is a topic of general interest. ]

* Thomas Renninger (tr...@suse.de) wrote:
 Hi,
 
 On Monday 04 October 2010 17:20:57 Jean Pihet wrote:
  Here is a re-spin of the patches after discussion.
 
 what is going to happen here now?
 
 Is this supposed to go through Ingo's tree?
 
 Ingo: do you mind commenting on this.

Meanwhile, here are some ideas...


 I see 3 possibilities:
   1) Power (or all) perf events are never going to change.

Persisting with bad interfaces (which were never meant to be stable, and were
actually explicitely said to be non-stable) for the sake of poorly written
proprietary userspace apps does not seem like viable to me. Since when did we
start designing kernel code for broken proprierary apps ? (see below for
solutions on how to fix the apps)

The only reason we have these tracepoints in there is because they can follow
kernel code changes, thanks to their flexible nature. Being stucked with
badly named tracepoints because of some monolithic analyzer app is just insane.

   If they are going to change, then now is the right time and
   2) Backward compatibility is provided in some way for some time.

I've looked at the resulting code, and, honestly, it's ugly and it complexifies
the test matrix. I would really prefer to move this compatibility crap out of
the kernel out into userspace libraries, where it belongs. It should have got
there in the first place when the developers of these propritary
tracepoint-consumers got the hint that those were going to change. Then you have
a sane design:

1) The kernel, providing a tracepoint ABI that *can change over time*, because
   tracepoints are too tied to kernel code to afford not being changed.
2) Adapdation libraries, some which could be provided with the perf userspace
   libraries, some which could be provided along with the tracepoint consumer
   application, so the proprietary application can link on an open-source
   library that can be upgraded when needed.
3) The trace analyzer. So if the analyzer is open source, then it's fine, it
   could follow the rare ABI breakups that are needed by a simple upgrade.
   Ideally we might want to keep backward compatibility code in there too, but
   it's OK to require users to upgrade their tools if the kernel is upgraded.
   If the analyzer is closed-source, then it should interact with an open source
   library rather than with the kernel tracepoints ABI.

So, given that I don't want to uglify kernel code based on some badly written
proprietary userspace tools, and given we've given all possible warnings telling
that the tracepoint ABI might change, I really don't see why we should bother
bloating the kernel with this. The analyzers should be changed to use adaptation
libraries instead.

   3) The power events get cleaned up without compatibility to
  former kernels versions.
 
 There are patches for 2. and 3., for 1. there obviously are no
 needed.
 For 2., the patches (mine or Jeans), need some polishing. IMO
 these double events inside of general code aren't that bad.
 I trust Jean, that it's not that easy with all the include magic
 and macros, partly realized that myself already and it's not worth
 it to dig further for a temporary solution.
 
 Votes so far:
 1. Arjan
 2. Myself, Jean
 3. Peter Zijlstra and Mathieu Desnoyers
 
 Jean's work got successfully blocked for weeks now.
 If there would be a final decision by a maintainer who is going to
 merge Jean's work, that would be great and it would finally be worth
 to send updated patches again which hopefully some day find their way into
 a linux-next kernel...

Yes, sadly this debate running in circles hurts contributors.

Thanks for the summary!

Mathieu

 
 Thanks,
 
   Thomas

-- 
Mathieu Desnoyers
Operating System Efficiency RD Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Pierre Tardy
On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers
mathieu.desnoy...@efficios.com wrote:
 [ Adding a few more CCs, since this discussion is about a tracepoint
  userspace ABI policy, which is a topic of general interest. ]

To add a little more comment, this is not the first time that
tracepoints ABI changes. You can look at pytimechart sourcecode:
http://gitorious.org/pytimechart/pytimechart/blobs/master/timechart/ftrace.py

from 2.6.31 which is the first kernel I support,

sched_switch:  'task %s:%d [%d] == %s:%d [%d]',
changed to:
sched_switch:  'prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s
== next_comm=%s next_pid=%d next_prio=%d',

workqueue_execution: 'thread=%s
func=%s\\+%s/%s','thread','func','func_offset','func_size'),
changed to:
workqueue_execution: 'thread=%s func=%s','thread','func'),


actually, over all the events pytimechart supports, only power traces
are stable...

Regards,
-- 
Pierre
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Steven Rostedt
On Thu, 2010-10-07 at 17:23 +0200, Pierre Tardy wrote:

 
 actually, over all the events pytimechart supports, only power traces
 are stable...

Let me rephrase that for you...

actually, over all the events pytimechart supports, only power traces
are inflexible...


-- Steve


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Jean Pihet
Hi,

On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers
mathieu.desnoy...@efficios.com wrote:
 [ Adding a few more CCs, since this discussion is about a tracepoint
  userspace ABI policy, which is a topic of general interest. ]

 * Thomas Renninger (tr...@suse.de) wrote:
 Hi,

 On Monday 04 October 2010 17:20:57 Jean Pihet wrote:
  Here is a re-spin of the patches after discussion.

 what is going to happen here now?

 Is this supposed to go through Ingo's tree?

 Ingo: do you mind commenting on this.

 Meanwhile, here are some ideas...


 I see 3 possibilities:
   1) Power (or all) perf events are never going to change.

 Persisting with bad interfaces (which were never meant to be stable, and were
 actually explicitely said to be non-stable) for the sake of poorly written
 proprietary userspace apps does not seem like viable to me. Since when did we
 start designing kernel code for broken proprierary apps ? (see below for
 solutions on how to fix the apps)

 The only reason we have these tracepoints in there is because they can follow
 kernel code changes, thanks to their flexible nature. Being stucked with
 badly named tracepoints because of some monolithic analyzer app is just 
 insane.

   If they are going to change, then now is the right time and
   2) Backward compatibility is provided in some way for some time.

 I've looked at the resulting code, and, honestly, it's ugly and it 
 complexifies
 the test matrix. I would really prefer to move this compatibility crap out of
 the kernel out into userspace libraries, where it belongs. It should have got
 there in the first place when the developers of these propritary
 tracepoint-consumers got the hint that those were going to change. Then you 
 have
 a sane design:

 1) The kernel, providing a tracepoint ABI that *can change over time*, because
   tracepoints are too tied to kernel code to afford not being changed.
 2) Adapdation libraries, some which could be provided with the perf userspace
   libraries, some which could be provided along with the tracepoint consumer
   application, so the proprietary application can link on an open-source
   library that can be upgraded when needed.
 3) The trace analyzer. So if the analyzer is open source, then it's fine, it
   could follow the rare ABI breakups that are needed by a simple upgrade.
   Ideally we might want to keep backward compatibility code in there too, but
   it's OK to require users to upgrade their tools if the kernel is upgraded.
   If the analyzer is closed-source, then it should interact with an open 
 source
   library rather than with the kernel tracepoints ABI.
Totally agree here! The real solution is to provide such a library.
Anyone interested?

 So, given that I don't want to uglify kernel code based on some badly written
 proprietary userspace tools, and given we've given all possible warnings 
 telling
 that the tracepoint ABI might change, I really don't see why we should bother
 bloating the kernel with this. The analyzers should be changed to use 
 adaptation
 libraries instead.

   3) The power events get cleaned up without compatibility to
      former kernels versions.

 There are patches for 2. and 3., for 1. there obviously are no
 needed.
 For 2., the patches (mine or Jeans), need some polishing. IMO
 these double events inside of general code aren't that bad.
 I trust Jean, that it's not that easy with all the include magic
 and macros, partly realized that myself already and it's not worth
 it to dig further for a temporary solution.

 Votes so far:
 1. Arjan
 2. Myself, Jean
 3. Peter Zijlstra and Mathieu Desnoyers
I am for 3 but I do not mind to provide the code for 2.


 Jean's work got successfully blocked for weeks now.
 If there would be a final decision by a maintainer who is going to
 merge Jean's work, that would be great and it would finally be worth
 to send updated patches again which hopefully some day find their way into
 a linux-next kernel...

 Yes, sadly this debate running in circles hurts contributors.

Indeed.

Honestly I do not know what to do next. Here are some facts:
- Thomas and myself did completely rework the patches a couple of
times now and most of them got acked
- A transition API has been provided along with a Kconfig option.
Special care has been taken to provide an easy to maintain solution
(just remove the option from Kconfig and a few lines of code in
kernel/trace)
- I am willing to provide the according patches to pytimechart, for
the new API only. Some pytimechart fixes from myself are already in
the tree thanks to the maintainer (Pierre).

Please let us move this forward. Some more on-going work is depending
on those changes.

Thanks,
Jean


 Thanks for the summary!

 Mathieu


 Thanks,

       Thomas

 --
 Mathieu Desnoyers
 Operating System Efficiency RD Consultant
 EfficiOS Inc.
 http://www.efficios.com

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org

Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Thomas Renninger
On Thursday 07 October 2010 17:08:25 Mathieu Desnoyers wrote:
 [ Adding a few more CCs, since this discussion is about a tracepoint
   userspace ABI policy, which is a topic of general interest. ]
 
...
 Yes, sadly this debate running in circles hurts contributors.
 
 Thanks for the summary!
Thanks for yours!

So you (and Peter Zijlstra and some others) prefer the solution I
posted with these two patches:
[PATCH 1/2] PERF(kernel): Cleanup power events
[PATCH 2/2] PERF(userspace): Adjust perf timechart to the new power events

Those should be fine.
Ingo, can you merge them, so that Jean can finally put his ARM
specific implementation on top.

Thanks,

Thomas
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Jean Pihet
Thomas,

On Thu, Oct 7, 2010 at 5:49 PM, Thomas Renninger tr...@suse.de wrote:
 On Thursday 07 October 2010 17:08:25 Mathieu Desnoyers wrote:
 [ Adding a few more CCs, since this discussion is about a tracepoint
   userspace ABI policy, which is a topic of general interest. ]

 ...
 Yes, sadly this debate running in circles hurts contributors.

 Thanks for the summary!
 Thanks for yours!

 So you (and Peter Zijlstra and some others) prefer the solution I
 posted with these two patches:
 [PATCH 1/2] PERF(kernel): Cleanup power events
My latest patch [1/4] is a rework of your patch. It adds:
- adaptation to linux-2.6-tip
- change of a wrong permission intel_idle.c
- light changes in the API (correction of trace printks ...)

I would prefer to use that version if you are ok.

 [PATCH 2/2] PERF(userspace): Adjust perf timechart to the new power events

 Those should be fine.
 Ingo, can you merge them, so that Jean can finally put his ARM
 specific implementation on top.

 Thanks,

    Thomas


Jean
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Frederic Weisbecker
On Thu, Oct 07, 2010 at 05:23:43PM +0200, Pierre Tardy wrote:
 On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers
 mathieu.desnoy...@efficios.com wrote:
  [ Adding a few more CCs, since this discussion is about a tracepoint
   userspace ABI policy, which is a topic of general interest. ]
 
 To add a little more comment, this is not the first time that
 tracepoints ABI changes. You can look at pytimechart sourcecode:
 http://gitorious.org/pytimechart/pytimechart/blobs/master/timechart/ftrace.py
 
 from 2.6.31 which is the first kernel I support,
 
 sched_switch:  'task %s:%d [%d] == %s:%d [%d]',
 changed to:
 sched_switch:  'prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s
 == next_comm=%s next_pid=%d next_prio=%d',
 
 workqueue_execution: 'thread=%s
 func=%s\\+%s/%s','thread','func','func_offset','func_size'),
 changed to:
 workqueue_execution: 'thread=%s func=%s','thread','func'),
 



Seems to be only formatting changes, but no field has been removed and
no tracepoint has been renamed, etc...

So these are no stable ABI changes because the formatting can be changed
anytime. We want that flexibility and it stands on top of the per event
format files.

Tools are not supposed to read ascii formatted traces from trace/trace_pipe
files. Instead they need to read binary traces from trace_pipe_raw files
and look at the format file to know how to format this.

This is why we have these format files: to let tools adapt with changes
like format change or fields added.

And we have a library in perf and trace-cmd that let you

- request a field value in a raw trace, by its name. So the field doesn't
  need to have a stable offset in the trace.
- request ascii format info, so that if ascii format changes, the tool
  adapt.
- record binary traces, much more leightweight for the writer (kernel)
  and for the reader (user).


I did told you that it would be better you make PyTimeChart use the perf
scripting facilities, it handles all the above things + it would
avoid you to handle a lot of things.

Now it's up to you, but don't count on us to make the ascii formatting
a stable ABI.

 
 actually, over all the events pytimechart supports, only power traces
 are stable...


Now one problem is that we have really broken the workqueue tracepoints
in this release. I thought nobody was using them so we could
refactor this tracepoint subsystem, my bad.

workqueue_execution has become workqueue_execution_start and
workqueue_execution_end. workqueue_insertion is going to
suffer a similar split.

workqueue_creation and workqueue_destruction have disappear but
I can probably restore them, but for the rest, what should we do?

I really feel uncomfortable with this tracepoint/ABI problem
Mathieu suggested we start a user library that could handle these
changes when they are really necessary.

Thoughts?

(Adding Tejun in Cc).

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-07 Thread Pierre Tardy
 I did told you that it would be better you make PyTimeChart use the perf
 scripting facilities, it handles all the above things + it would
 avoid you to handle a lot of things.

Actually, perf scripting  facility is already supported by pytimechart
but does not make it that easier to maintain.
event name changes = must update, event fields added/removed = must update


 Now it's up to you, but don't count on us to make the ascii formatting
 a stable ABI.
I'm not against adding 1 line in pytimechart each time there is some
change in ascii formatting



 actually, over all the events pytimechart supports, only power traces
 are stable...


 Now one problem is that we have really broken the workqueue tracepoints
 in this release. I thought nobody was using them so we could
 refactor this tracepoint subsystem, my bad.
No problem. I'll update pytimechart whenever someone sends me traces
that does not work (I'm okay with pre 2.6.31 traces too...)


Regards,
Pierre
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH [0/4] perf: clean-up of power events API

2010-10-06 Thread Thomas Renninger
Hi,

On Monday 04 October 2010 17:20:57 Jean Pihet wrote:
 Here is a re-spin of the patches after discussion.

what is going to happen here now?

Is this supposed to go through Ingo's tree?

Ingo: do you mind commenting on this.

I see 3 possibilities:
  1) Power (or all) perf events are never going to change.
  
  If they are going to change, then now is the right time and
  2) Backward compatibility is provided in some way for some time.

  3) The power events get cleaned up without compatibility to
 former kernels versions.

There are patches for 2. and 3., for 1. there obviously are no
needed.
For 2., the patches (mine or Jeans), need some polishing. IMO
these double events inside of general code aren't that bad.
I trust Jean, that it's not that easy with all the include magic
and macros, partly realized that myself already and it's not worth
it to dig further for a temporary solution.

Votes so far:
1. Arjan
2. Myself, Jean
3. Peter Zijlstra and Mathieu Desnoyers

Jean's work got successfully blocked for weeks now.
If there would be a final decision by a maintainer who is going to
merge Jean's work, that would be great and it would finally be worth
to send updated patches again which hopefully some day find their way into
a linux-next kernel...

Thanks,

  Thomas
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html