Re: PATCH [0/4] perf: clean-up of power events API
On Sunday 10 October 2010 14:19:28 Ingo Molnar wrote:
> * Arjan van de Ven ar...@linux.intel.com wrote:
> ...
> > also I have to say that some events are more likely to change than others: "function foo in the kernel called" is more likely to change than "the processor went to THIS frequency".
> > The concept of CPU frequencies has been with us for a long time and is going to be there for a long time as well ..

Right, it's a frequency and a CPU that should get passed along with the event. The X86/ACPI specific X-state data (even there unused and never will get used) should vanish before ARM starts to make use of it. The idle (power_start/power_end) state definition is worse...

> Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -

Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...

> and then we can deal with it intelligently, without breaking stuff unnecessarily.

Can we get this defined a bit clearer so that a patch can be created? Compatibility can only be achieved by still firing the old events for some kernel rounds. I'll send some patches in a new thread with these people in CC. It would be great to see a decision (in a way that a patch can be created) on what an event change can/should look like if there is urgent need.

Thanks,
Thomas
--
To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PATCH [0/4] perf: clean-up of power events API
* Thomas Renninger tr...@suse.de wrote:
> > Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -
> Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...

The discussion seems to have died down somewhat. Please re-send to lkml the latest patches you have to remind everyone of the latest state of things - the merge window is getting near.

My only compatibility/ABI point is basically that it shouldn't break _existing_ tracepoints (and users thereof). If your latest bits meet that then it ought to be a good first step.

You are free to (and encouraged to) introduce more complete sets of events.

Thanks,
Ingo
Re: PATCH [0/4] perf: clean-up of power events API
* Peter Zijlstra pet...@infradead.org wrote:
> On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> > * Thomas Renninger tr...@suse.de wrote:
> > > > Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -
> > > Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...
> > The discussion seems to have died down somewhat. Please re-send to lkml the latest patches you have to remind everyone of the latest state of things - the merge window is getting near.
> > My only compatibility/ABI point is basically that it shouldn't break _existing_ tracepoints (and users thereof). If your latest bits meet that then it ought to be a good first step.
> > You are free to (and encouraged to) introduce more complete sets of events.
> Can we deprecate and eventually remove the old ones, or will we be forever obliged to carry the old ones too?

We most definitely want to deprecate and remove the old ones - but we want to give instrumentation software some migration time for that.

Jean, Arjan, what would be a feasible and practical deprecation period for that? One kernel cycle?

Thanks,
Ingo
Re: PATCH [0/4] perf: clean-up of power events API
On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> * Thomas Renninger tr...@suse.de wrote:
> > > Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -
> > Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...
> The discussion seems to have died down somewhat. Please re-send to lkml the latest patches you have to remind everyone of the latest state of things - the merge window is getting near.
> My only compatibility/ABI point is basically that it shouldn't break _existing_ tracepoints (and users thereof). If your latest bits meet that then it ought to be a good first step.
> You are free to (and encouraged to) introduce more complete sets of events.

Can we deprecate and eventually remove the old ones, or will we be forever obliged to carry the old ones too?
Re: PATCH [0/4] perf: clean-up of power events API
On 10/19/2010 4:52 AM, Ingo Molnar wrote:
> * Peter Zijlstra pet...@infradead.org wrote:
> > On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> > > * Thomas Renninger tr...@suse.de wrote:
> > > > > Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -
> > > > Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...
> > > The discussion seems to have died down somewhat. Please re-send to lkml the latest patches you have to remind everyone of the latest state of things - the merge window is getting near.
> > > My only compatibility/ABI point is basically that it shouldn't break _existing_ tracepoints (and users thereof). If your latest bits meet that then it ought to be a good first step.
> > > You are free to (and encouraged to) introduce more complete sets of events.
> > Can we deprecate and eventually remove the old ones, or will we be forever obliged to carry the old ones too?
> We most definitely want to deprecate and remove the old ones - but we want to give instrumentation software some migration time for that.
> Jean, Arjan, what would be a feasible and practical deprecation period for that? One kernel cycle?

more like a year

for some time software needs to support both, especially if popular distros stick to an older kernel like *cough* RHEL6
Re: PATCH [0/4] perf: clean-up of power events API
* Arjan van de Ven ar...@linux.intel.com wrote:
> On 10/19/2010 4:52 AM, Ingo Molnar wrote:
> > * Peter Zijlstra pet...@infradead.org wrote:
> > > On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> > > > * Thomas Renninger tr...@suse.de wrote:
> > > > > > Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -
> > > > > Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...
> > > > The discussion seems to have died down somewhat. Please re-send to lkml the latest patches you have to remind everyone of the latest state of things - the merge window is getting near.
> > > > My only compatibility/ABI point is basically that it shouldn't break _existing_ tracepoints (and users thereof). If your latest bits meet that then it ought to be a good first step.
> > > > You are free to (and encouraged to) introduce more complete sets of events.
> > > Can we deprecate and eventually remove the old ones, or will we be forever obliged to carry the old ones too?
> > We most definitely want to deprecate and remove the old ones - but we want to give instrumentation software some migration time for that.
> > Jean, Arjan, what would be a feasible and practical deprecation period for that? One kernel cycle?
> more like a year
> for some time software needs to support both, especially if popular distros stick to an older kernel like *cough* RHEL6

Sure, you can support both. But as long as support for the _new_ events is included in PowerTop there's no need to keep the duality upstream. Running ancient PowerTop on fresh kernels is not common.

An old RHEL kernel will still keep on working as you can keep support for old events in PowerTop as long as you wish to.

The new kernel also won't 'overwrite' old events with new definitions in the future, so PowerTop will keep working for as long as you want to support older kernels.

Does that sound good?

Thanks,
Ingo
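Supporting both generations of events in a tool, as suggested above, mostly comes down to checking which events the running kernel offers and preferring the newest one. The following is only a minimal sketch in C of that selection step. It assumes the tool has already collected the names of the available events (for example by listing the tracing events directory); `pick_event` is a hypothetical helper and not an existing PowerTop function, and the event names used with it are taken from this thread rather than from any released kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the first entry of `preferred` (ordered newest first) that also
 * appears in `available`, or NULL when the kernel offers none of them.
 * pick_event() is a hypothetical helper used only for illustration.
 */
const char *pick_event(const char *const *preferred, size_t n_preferred,
                       const char *const *available, size_t n_available)
{
    size_t i, j;

    assert(preferred != NULL || n_preferred == 0);
    assert(available != NULL || n_available == 0);

    for (i = 0; i < n_preferred; i++)
        for (j = 0; j < n_available; j++)
            if (strcmp(preferred[i], available[j]) == 0)
                return preferred[i];

    return NULL; /* neither the new nor the old event exists */
}
```

A tool would call this with a preference list such as `{ "power_switch_state", "power_frequency" }`: on an older kernel that only exposes `power_frequency` it falls back to the old event, and on a kernel exposing both it picks the new one.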
Re: PATCH [0/4] perf: clean-up of power events API
On 10/19/2010 6:50 AM, Ingo Molnar wrote:
> * Arjan van de Ven ar...@linux.intel.com wrote:
> > On 10/19/2010 4:52 AM, Ingo Molnar wrote:
> > > * Peter Zijlstra pet...@infradead.org wrote:
> > > > On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> > > > > * Thomas Renninger tr...@suse.de wrote:
> > > > > > > Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence -
> > > > > > Do you agree that this occurrence happened now and these events should get cleaned up before ARM and other archs make use of the broken interface? If not, discussing this further is a big waste of time... and Jean would have to try to adapt his ARM code on the broken ABI...
> > > > > The discussion seems to have died down somewhat. Please re-send to lkml the latest patches you have to remind everyone of the latest state of things - the merge window is getting near.
> > > > > My only compatibility/ABI point is basically that it shouldn't break _existing_ tracepoints (and users thereof). If your latest bits meet that then it ought to be a good first step.
> > > > > You are free to (and encouraged to) introduce more complete sets of events.
> > > > Can we deprecate and eventually remove the old ones, or will we be forever obliged to carry the old ones too?
> > > We most definitely want to deprecate and remove the old ones - but we want to give instrumentation software some migration time for that.
> > > Jean, Arjan, what would be a feasible and practical deprecation period for that? One kernel cycle?
> > more like a year
> > for some time software needs to support both, especially if popular distros stick to an older kernel like *cough* RHEL6
> Sure, you can support both. But as long as support for the _new_ events is included in PowerTop there's no need to keep the duality upstream. Running ancient PowerTop on fresh kernels is not common.
> An old RHEL kernel will still keep on working as you can keep support for old events in PowerTop as long as you wish to.
> The new kernel also won't 'overwrite' old events with new definitions in the future, so PowerTop will keep working for as long as you want to support older kernels.
> Does that sound good?

this does not scale much long term, e.g. this only works if this is only done once, and these points are stable afterwards. otherwise we get 25 of those different workarounds for kernel ABI breakage into all different projects, and it becomes untestable for all the poor software writers...
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, Oct 9, 2010 at 8:36 PM, Linus Torvalds torva...@linux-foundation.org wrote:
> On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy tar...@gmail.com wrote:
> > On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:
> > > The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.
> > Yes. sounds like overengineering.
> I also want to remind people that backwards compatibility should always absolutely be the #1 priority. Using libraries to hide differences is a totally moronic thing to do, because if you can do a compatibility library with good interfaces, then damn it, the kernel interface should already _be_ that good interface.

Agree on that. The idea is to have the kernel interfaces cleaned up and so have better user space apps in the end.

> And no, even if you interact purely with open source programs, the backwards compatibility requirement doesn't go away. It's a damn pain in the ass to have to recompile, and it means that you have a much harder time mixing and matching, and just updating the kernel on top of a standard distribution.
> So changing kernel interfaces that get exported to user space is always a disaster. Anybody who _designs_ for that kind of disaster shouldn't be participating in kernel development, because they've shown themselves to be unable to understand the pain and suffering.
> Yes, we do it. Sometimes we change interfaces because not changing them is too damn painful. But it should absolutely not be the design model.

So what is the best way to have the power tracing events elegantly cleaned up?

The proposed patch 4/4 [1] introduces a new Kconfig option, CONFIG_DEPRECATED_POWER_EVENT_TRACING, which selects whether the trace points map to the old _OR_ to the new events API, only for the already existing events. This gives some time for the adaptation of the user space apps.

I understand this could require a kernel re-compilation in order to use the old events API, but I really want to avoid duplicating the trace points in the code to instrument, e.g.:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 199dcb9..013c274 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -354,5 +354,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
 		dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new, (unsigned long)freqs->cpu);
 		trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
+		trace_power_switch_state(POWER_PSTATE, freqs->new, freqs->cpu,
+					 smp_processor_id());

The proposed patch only changes the power tracing events API definition files, not the code to instrument.

I am OK to re-spin the patches, but only if a compromise is agreed on. As said before, the kernel Documentation and pytimechart user space tools patches will be provided as well.

> Linus

Thanks,
Jean

[1] http://marc.info/?l=linux-omap&m=128620575900689&w=2
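The compile-time mapping described in this message can be illustrated outside the kernel. The sketch below is a user-space model, not the actual patch 4/4: the two `trace_*` functions are stand-ins that format a string instead of writing to the trace buffer, `current_cpu()` is a stub for `smp_processor_id()`, and the wrapper name `trace_pstate_change` and the value of `POWER_PSTATE` are assumptions made for illustration. Only the option name `CONFIG_DEPRECATED_POWER_EVENT_TRACING` comes from the message.

```c
#include <assert.h>
#include <stdio.h>

#define POWER_PSTATE 2U /* illustrative value; the kernel defines its own enum */

/* Last formatted event, standing in for the kernel trace buffer. */
char last_event[128];

/* Stand-in for the old tracepoint: type, state, target CPU. */
void trace_power_frequency(unsigned int type, unsigned int state,
                           unsigned int cpu)
{
    snprintf(last_event, sizeof(last_event),
             "power_frequency: type=%u state=%u cpu_id=%u", type, state, cpu);
}

/* Stand-in for the new tracepoint, which also records the calling CPU. */
void trace_power_switch_state(unsigned int type, unsigned int state,
                              unsigned int cpu, unsigned int caller_cpu)
{
    snprintf(last_event, sizeof(last_event),
             "power_switch_state: type=%u state=%u cpu_id=%u caller_cpu=%u",
             type, state, cpu, caller_cpu);
}

/* Stub for smp_processor_id(); always reports CPU 0 in this sketch. */
unsigned int current_cpu(void)
{
    return 0;
}

/* The build option decides which event the single call site emits. */
#ifdef CONFIG_DEPRECATED_POWER_EVENT_TRACING
#define trace_pstate_change(freq, cpu) \
    trace_power_frequency(POWER_PSTATE, (freq), (cpu))
#else
#define trace_pstate_change(freq, cpu) \
    trace_power_switch_state(POWER_PSTATE, (freq), (cpu), current_cpu())
#endif

/* Instrumented code contains one call, whichever API is selected. */
void notify_frequency_change(unsigned int new_freq, unsigned int cpu)
{
    assert(new_freq > 0);
    trace_pstate_change(new_freq, cpu);
}
```

Building with `-DCONFIG_DEPRECATED_POWER_EVENT_TRACING` makes the same call site emit the old event. The trade-off raised elsewhere in the thread still applies: a given kernel binary offers one API or the other, never both at once.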
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, 2010-10-09 at 21:39 -0400, Steven Rostedt wrote:
> I've been hesitant in the past from doing the TRACE_EVENT_ABI() before, because Peter Zijlstra (who is currently MIA) has been strongly against it.

I see no point in the TRACE_EVENT_ABI() because if I need to change such a tracepoint to reflect changes in the kernel then I will freely do so.

Even seemingly stable points like sched_switch(), which we all agree will stay around forever (gotta have context switches on a multi-tasking OS) will not stay stable when we add/change scheduling policies.

Sure, the prev and next task thing will stay the same, but the meaning and interpretation of things like the prio field will not, esp when we go add something like a deadline scheduler that isn't priority based.

So one possibility is to simply remove all that information from the tracepoints, remove the prio and state fields, but how useful is that?

I guess what I'm saying is that even if we were to provide _ABI I see us getting into this very same argument over and over again, making me want to remove all this trace event muck right now before it gets worse.
Re: PATCH [0/4] perf: clean-up of power events API
* Arjan van de Ven ar...@linux.intel.com wrote:
> On 10/8/2010 11:28 PM, Ingo Molnar wrote:
> > * Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote:
> > > * Arjan van de Ven (ar...@linux.intel.com) wrote:
> > > > On 10/8/2010 1:38 AM, Ingo Molnar wrote:
> > > > > The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we don't want to tie ourselves down with excessive ABIs.
> > > > ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-(
> > > What's wrong with doing the compatibility layer in a LGPL library shipped with the kernel tree under tools/ ? Why does everything *have* to be done in kernel-space ? Why are you so focused on making your application interact directly with kernel ABIs ?
> > The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.
> > What I suggested in my mail was to _keep existing events_. I.e. do not break powertop. We are 100% happy that we _have_ such apps, and we should do reasonable things to not break them.
> > If we need to change events, we can add a new event. The old events will lose their relevance without us having to do much - and without us having to break powertop, pytimechart, etc. We can even have periods of overlap when both events are available - to give instrumentation apps time to learn the new events.
> > I.e. it's not an ABI in the classic sense - we do not (because we cannot) guarantee the infinite availability of these events. But we can guarantee that the fields do not change in some stupid, avoidable way.
> also I have to say that some events are more likely to change than others: "function foo in the kernel called" is more likely to change than "the processor went to THIS frequency".
> The concept of CPU frequencies has been with us for a long time and is going to be there for a long time as well ..

Most definitely. It's no accident that it took such a long time for this issue to be raised in the first place. It's a rare occurrence - and then we can deal with it intelligently, without breaking stuff unnecessarily.

Thanks,
Ingo
Re: PATCH [0/4] perf: clean-up of power events API
On Sun, 2010-10-10 at 08:41 +0200, Peter Zijlstra wrote:
> On Sat, 2010-10-09 at 21:39 -0400, Steven Rostedt wrote:
> > I've been hesitant in the past from doing the TRACE_EVENT_ABI() before, because Peter Zijlstra (who is currently MIA) has been strongly against it.
> I see no point in the TRACE_EVENT_ABI() because if I need to change such a tracepoint to reflect changes in the kernel then I will freely do so.
> Even seemingly stable points like sched_switch(), which we all agree will stay around forever (gotta have context switches on a multi-tasking OS) will not stay stable when we add/change scheduling policies.
> Sure, the prev and next task thing will stay the same, but the meaning and interpretation of things like the prio field will not, esp when we go add something like a deadline scheduler that isn't priority based.
> So one possibility is to simply remove all that information from the tracepoints, remove the prio and state fields, but how useful is that?
> I guess what I'm saying is that even if we were to provide _ABI I see us getting into this very same argument over and over again, making me want to remove all this trace event muck right now before it gets worse.

Then how's this as a compromise. We do not add a TRACE_EVENT_ABI(), but instead manually add the ABI interface to existing tracepoints.

Let's use the sched example you showed above. We can connect to the sched_switch() tracepoint manually in something perhaps called kernel/abi_trace.c or trace_abi.c (whatever). Here we create the directories manually there:

	/sys/kernel/event/sched/sched_switch/

But this sched_switch will only include the prev and next pids, comms, and perhaps even run state. But not the prio (since we see that changing).

It would then need the code to enable the trace point with:

	register_trace_sched_switch(sched_switch_abi_probe, NULL);

Where we have

	static void sched_switch_abi_probe(void *ignore,
			struct task_struct *prev, struct task_struct *next)
	{
		/* code to grab just the ABI stuff */
	}

And this code can then record to whatever is hooked to it.

Making this a manual effort will make it easier to control what becomes an ABI. We can have long discussions and flames over what goes here. But that's good since debates before an ABI is created are much better than debates after one is created.

I'm afraid that an easy macro called TRACE_EVENT_ABI() would have the same issue. ABIs may be created too quickly before they are thought through.

Thoughts?

-- Steve
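What "grab just the ABI stuff" could mean can be modelled in user space. The sketch below is an assumption-laden illustration, not kernel code: `struct task` is a simplified stand-in for the kernel's `struct task_struct`, and `struct sched_switch_abi` and `sched_switch_abi_fill()` are hypothetical names. It copies only the fields the message above calls stable (pids, comms, run state) and deliberately leaves out `prio`.

```c
#include <assert.h>
#include <string.h>

#define COMM_LEN 16

/* Simplified, hypothetical stand-in for the kernel's struct task_struct. */
struct task {
    int  pid;
    char comm[COMM_LEN];
    long state;
    int  prio; /* scheduler-internal; deliberately not exported below */
};

/* The record an ABI probe would expose: only fields expected to stay stable. */
struct sched_switch_abi {
    int  prev_pid;
    char prev_comm[COMM_LEN];
    long prev_state;
    int  next_pid;
    char next_comm[COMM_LEN];
};

/* Copy the stable subset of a context switch into `out`. */
void sched_switch_abi_fill(struct sched_switch_abi *out,
                           const struct task *prev, const struct task *next)
{
    assert(out != NULL && prev != NULL && next != NULL);

    memset(out, 0, sizeof(*out));

    out->prev_pid = prev->pid;
    memcpy(out->prev_comm, prev->comm, COMM_LEN);
    out->prev_comm[COMM_LEN - 1] = '\0'; /* always terminate the name */
    out->prev_state = prev->state;

    out->next_pid = next->pid;
    memcpy(out->next_comm, next->comm, COMM_LEN);
    out->next_comm[COMM_LEN - 1] = '\0';
}
```

Because the exported record is a separate structure, the scheduler could change or drop `prio` without altering what consumers of the hand-written probe see.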
Re: PATCH [0/4] perf: clean-up of power events API
* Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote:
> * Arjan van de Ven (ar...@linux.intel.com) wrote:
> > On 10/8/2010 1:38 AM, Ingo Molnar wrote:
> > > The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we don't want to tie ourselves down with excessive ABIs.
> > ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-(
> What's wrong with doing the compatibility layer in a LGPL library shipped with the kernel tree under tools/ ? Why does everything *have* to be done in kernel-space ? Why are you so focused on making your application interact directly with kernel ABIs ?

The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.

What I suggested in my mail was to _keep existing events_. I.e. do not break powertop. We are 100% happy that we _have_ such apps, and we should do reasonable things to not break them.

If we need to change events, we can add a new event. The old events will lose their relevance without us having to do much - and without us having to break powertop, pytimechart, etc. We can even have periods of overlap when both events are available - to give instrumentation apps time to learn the new events.

I.e. it's not an ABI in the classic sense - we do not (because we cannot) guarantee the infinite availability of these events. But we can guarantee that the fields do not change in some stupid, avoidable way.

Changing an existing event in some non-append way is just sloppy and we can do better.

Arjan, Pierre, does that sound OK to you?

Ingo
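Why append-only changes are harmless to consumers can be shown with a small parser. The sketch below assumes events are rendered as space-separated `name=value` pairs with integer values; that layout and the helper name `event_field` are illustrative assumptions, not something specified in this thread. A consumer that looks fields up by name keeps working when new fields are appended, and breaks only when an existing field is renamed or removed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "name=<integer>" in a space-separated event line.
 * Returns 0 and stores the value on success, -1 if the field is absent
 * or has no numeric value. event_field() is a hypothetical helper.
 */
int event_field(const char *line, const char *name, long *value)
{
    size_t len;
    const char *p = line;

    assert(line != NULL && name != NULL && value != NULL);
    len = strlen(name);

    while (*p != '\0') {
        while (*p == ' ')          /* skip separators between tokens */
            p++;

        /* Match only at the start of a token, and only the full name. */
        if (strncmp(p, name, len) == 0 && p[len] == '=') {
            char *end;
            long v = strtol(p + len + 1, &end, 10);

            if (end == p + len + 1)
                return -1;         /* "name=" with no digits after it */
            *value = v;
            return 0;
        }

        while (*p != '\0' && *p != ' ') /* advance to the end of this token */
            p++;
    }

    return -1; /* field not present in this event */
}
```

With this approach a line such as `type=2 state=800000 cpu_id=1` and a later, extended form `type=2 state=800000 cpu_id=1 caller_cpu=3` both yield the same `state` and `cpu_id` values.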
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:
> The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.

Yes. sounds like overengineering.

> If we need to change events, we can add a new event. The old events will lose their relevance without us having to do much - and without us having to break powertop, pytimechart, etc. We can even have periods of overlap when both events are available - to give instrumentation apps time to learn the new events.
> I.e. it's not an ABI in the classic sense - we do not (because we cannot) guarantee the infinite availability of these events. But we can guarantee that the fields do not change in some stupid, avoidable way.
> Changing an existing event in some non-append way is just sloppy and we can do better.
> Arjan, Pierre, does that sound OK to you?

Yes. It's reasonable. Note that the pytimechart patch for the new power API is already ready and just waiting for this issue to be decided.

Regards,
Pierre
Re: PATCH [0/4] perf: clean-up of power events API
On 10/8/2010 11:28 PM, Ingo Molnar wrote:
> * Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote:
> > * Arjan van de Ven (ar...@linux.intel.com) wrote:
> > > On 10/8/2010 1:38 AM, Ingo Molnar wrote:
> > > > The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we don't want to tie ourselves down with excessive ABIs.
> > > ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-(
> > What's wrong with doing the compatibility layer in a LGPL library shipped with the kernel tree under tools/ ? Why does everything *have* to be done in kernel-space ? Why are you so focused on making your application interact directly with kernel ABIs ?
> The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.
> What I suggested in my mail was to _keep existing events_. I.e. do not break powertop. We are 100% happy that we _have_ such apps, and we should do reasonable things to not break them.
> If we need to change events, we can add a new event. The old events will lose their relevance without us having to do much - and without us having to break powertop, pytimechart, etc. We can even have periods of overlap when both events are available - to give instrumentation apps time to learn the new events.
> I.e. it's not an ABI in the classic sense - we do not (because we cannot) guarantee the infinite availability of these events. But we can guarantee that the fields do not change in some stupid, avoidable way.

also I have to say that some events are more likely to change than others: "function foo in the kernel called" is more likely to change than "the processor went to THIS frequency".

The concept of CPU frequencies has been with us for a long time and is going to be there for a long time as well ..
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy tar...@gmail.com wrote:
> On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:
> > The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.
> Yes. sounds like overengineering.

I also want to remind people that backwards compatibility should always absolutely be the #1 priority. Using libraries to hide differences is a totally moronic thing to do, because if you can do a compatibility library with good interfaces, then damn it, the kernel interface should already _be_ that good interface.

And no, even if you interact purely with open source programs, the backwards compatibility requirement doesn't go away. It's a damn pain in the ass to have to recompile, and it means that you have a much harder time mixing and matching, and just updating the kernel on top of a standard distribution.

So changing kernel interfaces that get exported to user space is always a disaster. Anybody who _designs_ for that kind of disaster shouldn't be participating in kernel development, because they've shown themselves to be unable to understand the pain and suffering.

Yes, we do it. Sometimes we change interfaces because not changing them is too damn painful. But it should absolutely not be the design model.

Linus
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, 2010-10-09 at 11:36 -0700, Linus Torvalds wrote:
> On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy tar...@gmail.com wrote:
> > On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar mi...@elte.hu wrote:
> > > The thing is, Arjan is 100% right that a library for this is not a 'solution', it's an unnecessary complication.
> > Yes. sounds like overengineering.
> I also want to remind people that backwards compatibility should always absolutely be the #1 priority. Using libraries to hide differences is a totally moronic thing to do, because if you can do a compatibility library with good interfaces, then damn it, the kernel interface should already _be_ that good interface.
> And no, even if you interact purely with open source programs, the backwards compatibility requirement doesn't go away. It's a damn pain in the ass to have to recompile, and it means that you have a much harder time mixing and matching, and just updating the kernel on top of a standard distribution.
> So changing kernel interfaces that get exported to user space is always a disaster. Anybody who _designs_ for that kind of disaster shouldn't be participating in kernel development, because they've shown themselves to be unable to understand the pain and suffering.
> Yes, we do it. Sometimes we change interfaces because not changing them is too damn painful. But it should absolutely not be the design model.

The difference here compared to all other user interfaces is that this interface has the sole purpose of showing what is happening inside the kernel. Saying "we expose this to userspace, so it too must be stable" is saying that all kernel internals that use trace events must never change.

The big push against tracepoints/trace-markers/trace-events in the beginning was the fear that they would hinder kernel development because they become interfaces for users to see what is happening inside the kernel.

When I wrote the interface, I put it in the debugfs system so people would know that this is a debug interface and can change without notice. Trace-events, unlike syscalls, may change depending on how you compiled the kernel. There's no guarantee that they will even exist on a system.

If all trace-events are now stable ABI, then I suggest we stop adding any more events, and only add new ones to places that we do not expect to develop the kernel on anymore. Not sure what other solution there is.

Trace points have been added way too freely, because any maintainer could add them to their system any way they felt like it. Now if they are frozen in stone, then the code that they expose must also be frozen.

-- Steve
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, 2010-10-09 at 09:19 -0700, Arjan van de Ven wrote: I.e. it's not an ABI in the classic sense - we do not (because we cannot) guarantee the infinite availability of these events. But we can guarantee that the fields do not change in some stupid, avoidable way. also I have to say that some events are more likely to change than others: "function foo in the kernel called" is more likely to change than "the processor went to THIS frequency". The concept of CPU frequencies has been with us for a long time and is going to be there for a long time as well .. Perhaps for basic concepts, we need a standard trace-event. Are people willing to have a TRACE_EVENT_ABI() (it's trivial to write), and we can mark those events with that macro that we know tools depend on. These events can be exposed in a /sys/kernel/events/... directory, to let tools know what events they can rely on. We've talked about doing this before, I've just been waiting to hear a consensus on whether we should. I know Peter Zijlstra was against the idea, and too bad he's off gallivanting and not here to share his input now. -- Steve
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, Oct 9, 2010 at 2:15 PM, Steven Rostedt rost...@goodmis.org wrote:
> The difference here compared to all other user interfaces, is that this interface has the sole purpose of showing what is happening inside the kernel.

Bogus and dishonest argument. Listen to yourself, and read this thread again. The thread was about doing some kind of open-source library to allow non-open-source access to these events, and keeping backwards compatibility in user space. In fact, that is what you yourself said. So you claimed it could be backwards-compatible. If that's the case, then there is no excuse for not being so in the kernel. You can't have it both ways. Stop the f*cking waffling.

Linus
Re: PATCH [0/4] perf: clean-up of power events API
On Sat, 2010-10-09 at 16:20 -0700, Linus Torvalds wrote: On Sat, Oct 9, 2010 at 2:15 PM, Steven Rostedt rost...@goodmis.org wrote: The difference here compared to all other user interfaces, is that this interface has the sole purpose of showing what is happening inside the kernel. Bogus and dishonest argument. Listen to yourself, and read this thread again. The thread was about doing some kind of open-source library to allow non-open-source access to these events, and keeping backwards compatibility in user space. In fact, that is what you yourself said. So you claimed it could be backwards-compatible. If that's the case, then there is no excuse for not being so in the kernel. You can't have it both ways. Stop the f*cking waffling. Let me rephrase it then, and lets forget about the library. I was just brain storming ideas. I'm all for labeling specific trace points as ABI, such that, these trace points have had sufficient thought and are not expected to change in the near future. But I'm against the idea that any tracepoint that has been shown to userspace can be considered stable. With or without libraries, I'm for two kinds of interfaces: One that is stable and has been thoroughly thought through, and one that is free for the maintainers to have an interface to let them see what is happening in the kernel, even on a production system, but be able to change them whenever they feel the need. That's the basis of my idea. A stable backward-compatible interface, and an interface that is unstable for developers. Whether we put the stable interface into a library (to keep the ugliness from developers, which you obviously do not like), or two, distinctly label the tracepoint as ABI, to let the developers and everyone else know what tracepoints an application can count on and what ones they should not. Thus my waffling is really wanting both, a stable ABI and an unstable one. 
I've been hesitant in the past to do the TRACE_EVENT_ABI(), because Peter Zijlstra (who is currently MIA) has been strongly against it. -- Steve
Re: PATCH [0/4] perf: clean-up of power events API
Hello, On 10/07/2010 05:58 PM, Frederic Weisbecker wrote: I really feel uncomfortable with this tracepoint/ABI problem. Mathieu suggested we start a user library that could handle these changes when they are really necessary. Thoughts? (Adding Tejun in Cc). Given that tracepoints are supposed to make internal operation visible, I don't think it's a good idea to make it part of fixed ABI. Maybe some core part can be put in stone but I think things like internal workqueue implementation should be changeable without worrying about ABI issues. Thanks. -- tejun
Re: PATCH [0/4] perf: clean-up of power events API
* Tejun Heo t...@kernel.org wrote: Hello, On 10/07/2010 05:58 PM, Frederic Weisbecker wrote: I really feel uncomfortable with this tracepoint/ABI problem. Mathieu suggested we start a user library that could handle these changes when they are really necessary. Thoughts? (Adding Tejun in Cc). Given that tracepoints are supposed to make internal operation visible, I don't think it's a good idea to make it part of fixed ABI. Yep, exactly. OTOH since it exports information we can do disciplined versioning and extensions only - i.e. leave the old power events around, add the new ones with new distinct names, and phase out the old ones in a kernel cycle or two. It's not hard to do. That way apps can support old kernels too (if they want to), but new events as well - and all in a controlled, non-disruptive manner. More importantly, the kernel won't have cruft and will have no ABI restrictions - the only 'restriction' is to treat information in an append-only manner (i.e. change the event name if you change it materially) - and that's not a big deal here. The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we don't want to tie ourselves down with excessive ABIs. Maybe some core part can be put in stone but I think things like internal workqueue implementation should be changeable without worrying about ABI issues. That's most definitely so! There is and will be zero back-coupling from workqueue tracepoints to workqueue internals. Don't worry about this. Ingo
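A minimal sketch of how a userspace tool might cope with the append-only naming scheme Ingo describes, assuming the tool already knows which event names the running kernel exposes. The alias table and the `pick_event` helper are hypothetical illustrations, not part of perf or ftrace; the workqueue names are the ones mentioned elsewhere in this thread.

```python
# Hypothetical preference table: newest event name first, older names after.
# The workqueue names come from this thread; the table itself is only an
# illustration and does not correspond to any existing kernel or perf API.
EVENT_ALIASES = {
    "workqueue_execution": [
        "workqueue_execution_start",  # name after the split
        "workqueue_execution",        # name on older kernels
    ],
}


def pick_event(logical_name, available):
    """Return the first known name for logical_name that the kernel exposes.

    `available` is any container of event names found on the running kernel.
    Returns None when no known variant is present.
    """
    for candidate in EVENT_ALIASES.get(logical_name, [logical_name]):
        if candidate in available:
            return candidate
    return None
```

Because a materially changed event gets a new name rather than a new meaning, the tool only needs to extend its alias list; old entries keep working on old kernels.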
Re: PATCH [0/4] perf: clean-up of power events API
On 10/8/2010 1:38 AM, Ingo Molnar wrote:
> The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we dont want to tie ourselves down with excessive ABIs.

ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-(
Re: PATCH [0/4] perf: clean-up of power events API
* Arjan van de Ven (ar...@linux.intel.com) wrote: On 10/8/2010 1:38 AM, Ingo Molnar wrote: The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we don't want to tie ourselves down with excessive ABIs. ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-( What's wrong with doing the compatibility layer in a LGPL library shipped with the kernel tree under tools/ ? Why does everything *have* to be done in kernel-space ? Why are you so focused on making your application interact directly with kernel ABIs ? I'm being direct because there are trivial solutions to your problem that you are rejecting without due consideration. (and also I just had one coffee too many) ;-) Regards, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com
Re: PATCH [0/4] perf: clean-up of power events API
On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote: * Arjan van de Ven (ar...@linux.intel.com) wrote: On 10/8/2010 1:38 AM, Ingo Molnar wrote: The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we dont want to tie ourselves down with excessive ABIs. ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-( What's wrong with doing the compatibility layer in a LGPL library shipped with the kernel tree under tools/ ? because that is not workable... at least nobody has shown to be able to make this work. libraries (after compilation) live in /lib or /usr/lib (or lib64 I suppose). what mechanism ensures that a user who compiles his kernel gets a library compatible with that kernel in /usr/lib? and can said library deal with older kernels too? And distro kernels? Why does everything *have* to be done in kernel-space it doesn't. but the alternative must be workable. Why are you so focused on making your application interact directly with kernel ABIs ? I'm being direct because there are trivial solutions to your problem that you are rejecting without due consideration. (and also I just had one coffee too many) ;-) since you seem to think that dealing with such a library is trivial... how about you do it for one function even, to show that the deployment/use-in-an-app is workable. I'd be more than happy to use it if it's workable and the API is at least halfway sane. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PATCH [0/4] perf: clean-up of power events API
On Fri, 2010-10-08 at 09:22 -0700, Arjan van de Ven wrote: On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote: because that is not workable... at least nobody has shown to be able to make this work. libraries (after compilation) live in /lib or /usr/lib (or lib64 I suppose). what mechanism ensures that a user who compiles his kernel gets a library compatible with that kernel in /usr/lib? and can said library deal with older kernels too? And distro kernels? Perhaps we should have make install of a kernel also install this library? Have two libraries? One that is linked to the app, the other that can search for another library to link on load too (like a kernel.ld.so) Then we could see the kernel version, and search for a library that is compatible, and load that one. The app only needs to worry about loading the generic library. The generic library can test for compatible libraries for the kernel. Could just be.. libkernel.ld.so which then loads.. /lib/modules/2.6.36/libkernel.so Just a little brainstorming. -- Steve
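A rough sketch of the lookup step in the two-library idea above, assuming the kernel-specific library lives under /lib/modules/<release>/ as in Steve's example. No such library exists; `versioned_lib_path` and `find_kernel_lib` are hypothetical helpers, and actually loading the library is left out.

```python
import os
import posixpath


def versioned_lib_path(release, root="/lib/modules", name="libkernel.so"):
    """Build the path of the kernel-specific library for one kernel release.

    The directory layout follows the example in the thread; it is an
    assumption, not an existing convention.
    """
    return posixpath.join(root, release, name)


def find_kernel_lib(release=None, exists=os.path.exists):
    """Return the path of the library matching the kernel, or None.

    `release` defaults to the running kernel's release string, and `exists`
    can be replaced to probe a different filesystem view.
    """
    if release is None:
        release = os.uname().release
    path = versioned_lib_path(release)
    return path if exists(path) else None
```

The generic library the application links against would perform this lookup once at load time, so the application itself never needs to know the kernel version.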
Re: PATCH [0/4] perf: clean-up of power events API
* Arjan van de Ven (ar...@linux.intel.com) wrote: On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote: * Arjan van de Ven (ar...@linux.intel.com) wrote: On 10/8/2010 1:38 AM, Ingo Molnar wrote: The fundamental thing about tracing/instrumentation is that there are no deep ABI needs: it's all about analyzing development kernels (and a few select versions that get the enterprise treatment) but otherwise the half-life of this kind of information is very short. So we dont want to tie ourselves down with excessive ABIs. ok I'll start working on a second mechanism then to export information that applications need ;-( it'll look a lot like tracing I suppose ;-( What's wrong with doing the compatibility layer in a LGPL library shipped with the kernel tree under tools/ ? because that is not workable... at least nobody has shown to be able to make this work. libraries (after compilation) live in /lib or /usr/lib (or lib64 I suppose). what mechanism ensures that a user who compiles his kernel gets a library compatible with that kernel in /usr/lib? I don't think the perf tools/ do it right at the moment, but here is my proposal: Currently, we need to do make install from the tools/ directory. Since kernel developers are lazy, I would propose a CONFIG_INSTALL_TOOLS default N config option that would let make install from the root of the kernel tree install the tools too. (looking at my inbox..) As Steven just beat me to it, see his lib versioning proposal. ;) The library would present an API to the application that would let apps consume specific events of interest. Translation of fixed event names/fields into the current kernel version tracepoint names/fields would be performed by the lib, and the library would also deal with reading the perf events through the perf ABI and would act as a middle-man to make sure they are always perceived by the application in the same way. and can said library deal with older kernels too? And distro kernels? Steven's proposal should work. 
Why does everything *have* to be done in kernel-space it doesn't. but the alternative must be workable. Why are you so focused on making your application interact directly with kernel ABIs ? I'm being direct because there are trivial solutions to your problem that you are rejecting without due consideration. (and also I just had one coffee too many) ;-) since you seem to think that dealing with such a library is trivial... how about you do it for one function even, to show that the deployment/use-in-an-app is workable. I'd be more than happy to use it if it's workable and the API is at least halfway sane. I currently have a lot on my plate with trace format and ring buffer, but if anyone is interested in trying to implement this, I can look at it and provide feedback/hints. Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com
Re: PATCH [0/4] perf: clean-up of power events API
Hi - On Fri, Oct 08, 2010 at 01:21:35PM -0400, Steven Rostedt wrote: [...] Perhaps we should have make install of a kernel also install this library? [...] The app only needs to worry about loading the generic library. The generic library can test for compatible libraries for the kernel. [...] If this library were to be distributed with the kernel, what would make the generic side of the interface any less permanent than a kernel ABI? That is, if there is a libkernel-internals.so built from kernel sources, wouldn't its ABI become necessarily as fixed as any old syscall or procfs file? One can have some backward compatibility with symbol versioning et al., but would that be sufficiently powerful to avoid handcuffing kernel developers' inclinations to make random future changes? - FChE
Re: PATCH [0/4] perf: clean-up of power events API
On Fri, 2010-10-08 at 13:49 -0400, Frank Ch. Eigler wrote: Hi - On Fri, Oct 08, 2010 at 01:21:35PM -0400, Steven Rostedt wrote: [...] Perhaps we should have make install of a kernel also install this library? [...] The app only needs to worry about loading the generic library. The generic library can test for compatible libraries for the kernel. [...] If this library were to be distributed with the kernel, what would make the generic side of the interface any less permanent than a kernel ABI? That is, if there is a libkernel-internals.so built from kernel sources, wouldn't its ABI become necessarily as fixed as any old syscall or procfs file? One thing, the backwards compatibility would reside in user space. The big advantage to that than for this to be in kernel space is that it is only there when used. When we have backward compatibility in the kernel, it's there in memory for everyone, whether you want it or not. One can have some backward compatibility with symbol versioning et al., but would that be sufficiently powerful to avoid handcuffing kernel developers' inclinations to make random future changes? Sure, also note, that this is a two lib design. We still have a /usr/lib/libkernel.so that the apps will interface with. This will need to load in the other kernel versions. When we change interfaces, we can make the /usr/lib/libkernel.so.1 .2 etc. Also doing this dynamically from a library, we can check if the kernel versions work. It can test if the used function is compatible or not, use an older version, or just tell the user sorry, please update your libkernel.so for this kernel. Doing this in userspace will allow a lot more flexibility. We just need to think hard how these transactions will work, and make it flexible for future enhancements. But the kernel is free to do whatever it wants. 
The libraries will need to worry about keeping the applications happy ;-) -- Steve
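A small sketch of the compatibility check Steve describes: use the adapter written for the running kernel, fall back to an older one, or tell the user to update. The adapter table, the version ranges and the function names are all hypothetical.

```python
import re


def parse_release(release):
    """Turn a release string such as '2.6.36-rc7' into a tuple like (2, 6, 36)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part or 0) for part in m.groups())


def select_adapter(release, adapters):
    """Pick the adapter whose version range covers the given kernel release.

    `adapters` is a list of (lowest_version, highest_version, name) entries,
    with versions given as tuples. Raises LookupError when nothing matches.
    """
    version = parse_release(release)
    for low, high, name in adapters:
        if low <= version <= high:
            return name
    raise LookupError(
        "sorry, please update your libkernel.so for kernel %s" % release
    )
```

Keeping the table in userspace means that supporting a new kernel is a library update, which matches the point that the compatibility code is only present when something uses it.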
Re: PATCH [0/4] perf: clean-up of power events API
[ Adding a few more CCs, since this discussion is about a tracepoint userspace ABI policy, which is a topic of general interest. ] * Thomas Renninger (tr...@suse.de) wrote: Hi, On Monday 04 October 2010 17:20:57 Jean Pihet wrote: Here is a re-spin of the patches after discussion. what is going to happen here now? Is this supposed to go through Ingo's tree? Ingo: do you mind commenting on this. Meanwhile, here are some ideas... I see 3 possibilities: 1) Power (or all) perf events are never going to change. Persisting with bad interfaces (which were never meant to be stable, and were actually explicitly said to be non-stable) for the sake of poorly written proprietary userspace apps does not seem viable to me. Since when did we start designing kernel code for broken proprietary apps ? (see below for solutions on how to fix the apps) The only reason we have these tracepoints in there is because they can follow kernel code changes, thanks to their flexible nature. Being stuck with badly named tracepoints because of some monolithic analyzer app is just insane. If they are going to change, then now is the right time and 2) Backward compatibility is provided in some way for some time. I've looked at the resulting code, and, honestly, it's ugly and it complexifies the test matrix. I would really prefer to move this compatibility crap out of the kernel out into userspace libraries, where it belongs. It should have got there in the first place when the developers of these proprietary tracepoint-consumers got the hint that those were going to change. Then you have a sane design: 1) The kernel, providing a tracepoint ABI that *can change over time*, because tracepoints are too tied to kernel code to afford not being changed.
2) Adaptation libraries, some of which could be provided with the perf userspace libraries, some of which could be provided along with the tracepoint consumer application, so the proprietary application can link to an open-source library that can be upgraded when needed. 3) The trace analyzer. So if the analyzer is open source, then it's fine, it could follow the rare ABI breakups that are needed by a simple upgrade. Ideally we might want to keep backward compatibility code in there too, but it's OK to require users to upgrade their tools if the kernel is upgraded. If the analyzer is closed-source, then it should interact with an open source library rather than with the kernel tracepoints ABI. So, given that I don't want to uglify kernel code based on some badly written proprietary userspace tools, and given we've given all possible warnings telling that the tracepoint ABI might change, I really don't see why we should bother bloating the kernel with this. The analyzers should be changed to use adaptation libraries instead. 3) The power events get cleaned up without compatibility to former kernel versions. There are patches for 2. and 3., for 1. there obviously are none needed. For 2., the patches (mine or Jean's), need some polishing. IMO these double events inside of general code aren't that bad. I trust Jean, that it's not that easy with all the include magic and macros, partly realized that myself already and it's not worth it to dig further for a temporary solution. Votes so far: 1. Arjan 2. Myself, Jean 3. Peter Zijlstra and Mathieu Desnoyers Jean's work got successfully blocked for weeks now. If there would be a final decision by a maintainer who is going to merge Jean's work, that would be great and it would finally be worth to send updated patches again which hopefully some day find their way into a linux-next kernel... Yes, sadly this debate running in circles hurts contributors. Thanks for the summary!
Mathieu Thanks, Thomas -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com
Re: PATCH [0/4] perf: clean-up of power events API
On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: [ Adding a few more CCs, since this discussion is about a tracepoint userspace ABI policy, which is a topic of general interest. ] To add a little more comment, this is not the first time that the tracepoint ABI has changed. You can look at the pytimechart source code: http://gitorious.org/pytimechart/pytimechart/blobs/master/timechart/ftrace.py from 2.6.31 which is the first kernel I support, sched_switch: 'task %s:%d [%d] == %s:%d [%d]', changed to: sched_switch: 'prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s == next_comm=%s next_pid=%d next_prio=%d', workqueue_execution: 'thread=%s func=%s\\+%s/%s','thread','func','func_offset','func_size'), changed to: workqueue_execution: 'thread=%s func=%s','thread','func'), actually, over all the events pytimechart supports, only power traces are stable... Regards, -- Pierre
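A sketch of how a tool could accept both sched_switch text layouts quoted above. This is not pytimechart's actual code. The patterns are written by hand from the two format strings in the message, and the arrow between the two tasks is matched as either `==` (as it appears here) or `==>`, since the archive may have dropped a character; treat both details as assumptions.

```python
import re

# Newer layout: prev_comm=... prev_pid=... prev_prio=... prev_state=... == next_comm=...
NEW_FORMAT = re.compile(
    r"prev_comm=(?P<prev_comm>.+) prev_pid=(?P<prev_pid>\d+) "
    r"prev_prio=(?P<prev_prio>\d+) prev_state=(?P<prev_state>\S+) ==>? "
    r"next_comm=(?P<next_comm>.+) next_pid=(?P<next_pid>\d+) "
    r"next_prio=(?P<next_prio>\d+)"
)

# Older layout: task comm:pid [prio] == comm:pid [prio]
OLD_FORMAT = re.compile(
    r"task (?P<prev_comm>.+):(?P<prev_pid>\d+) \[(?P<prev_prio>\d+)\] ==>? "
    r"(?P<next_comm>.+):(?P<next_pid>\d+) \[(?P<next_prio>\d+)\]"
)

INT_FIELDS = ("prev_pid", "prev_prio", "next_pid", "next_prio")


def parse_sched_switch(payload):
    """Parse the text after 'sched_switch:' in either layout.

    Returns a dict of fields with numeric fields converted to int, or None
    if the payload matches neither layout.
    """
    text = payload.strip()
    for pattern in (NEW_FORMAT, OLD_FORMAT):
        match = pattern.fullmatch(text)
        if match:
            fields = match.groupdict()
            for key in INT_FIELDS:
                fields[key] = int(fields[key])
            return fields
    return None
```

Each further change to the text layout needs another pattern, so this approach only scales while such changes stay rare.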
Re: PATCH [0/4] perf: clean-up of power events API
On Thu, 2010-10-07 at 17:23 +0200, Pierre Tardy wrote:
> actually, over all the events pytimechart supports, only power traces are stable...

Let me rephrase that for you...

actually, over all the events pytimechart supports, only power traces are inflexible...

-- Steve
Re: PATCH [0/4] perf: clean-up of power events API
Hi, On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: [ Adding a few more CCs, since this discussion is about a tracepoint userspace ABI policy, which is a topic of general interest. ] * Thomas Renninger (tr...@suse.de) wrote: Hi, On Monday 04 October 2010 17:20:57 Jean Pihet wrote: Here is a re-spin of the patches after discussion. what is going to happen here now? Is this supposed to go through Ingo's tree? Ingo: do you mind commenting on this. Meanwhile, here are some ideas... I see 3 possibilities: 1) Power (or all) perf events are never going to change. Persisting with bad interfaces (which were never meant to be stable, and were actually explicitly said to be non-stable) for the sake of poorly written proprietary userspace apps does not seem viable to me. Since when did we start designing kernel code for broken proprietary apps ? (see below for solutions on how to fix the apps) The only reason we have these tracepoints in there is because they can follow kernel code changes, thanks to their flexible nature. Being stuck with badly named tracepoints because of some monolithic analyzer app is just insane. If they are going to change, then now is the right time and 2) Backward compatibility is provided in some way for some time. I've looked at the resulting code, and, honestly, it's ugly and it complexifies the test matrix. I would really prefer to move this compatibility crap out of the kernel out into userspace libraries, where it belongs. It should have got there in the first place when the developers of these proprietary tracepoint-consumers got the hint that those were going to change. Then you have a sane design: 1) The kernel, providing a tracepoint ABI that *can change over time*, because tracepoints are too tied to kernel code to afford not being changed.
2) Adaptation libraries, some of which could be provided with the perf userspace libraries, some of which could be provided along with the tracepoint consumer application, so the proprietary application can link to an open-source library that can be upgraded when needed. 3) The trace analyzer. So if the analyzer is open source, then it's fine, it could follow the rare ABI breakups that are needed by a simple upgrade. Ideally we might want to keep backward compatibility code in there too, but it's OK to require users to upgrade their tools if the kernel is upgraded. If the analyzer is closed-source, then it should interact with an open source library rather than with the kernel tracepoints ABI. Totally agree here! The real solution is to provide such a library. Anyone interested? So, given that I don't want to uglify kernel code based on some badly written proprietary userspace tools, and given we've given all possible warnings telling that the tracepoint ABI might change, I really don't see why we should bother bloating the kernel with this. The analyzers should be changed to use adaptation libraries instead. 3) The power events get cleaned up without compatibility to former kernel versions. There are patches for 2. and 3., for 1. there obviously are none needed. For 2., the patches (mine or Jean's), need some polishing. IMO these double events inside of general code aren't that bad. I trust Jean, that it's not that easy with all the include magic and macros, partly realized that myself already and it's not worth it to dig further for a temporary solution. Votes so far: 1. Arjan 2. Myself, Jean 3. Peter Zijlstra and Mathieu Desnoyers I am for 3 but I do not mind providing the code for 2. Jean's work got successfully blocked for weeks now. If there would be a final decision by a maintainer who is going to merge Jean's work, that would be great and it would finally be worth to send updated patches again which hopefully some day find their way into a linux-next kernel...
Yes, sadly this debate running in circles hurts contributors. Indeed. Honestly I do not know what to do next. Here are some facts:
- Thomas and myself did completely rework the patches a couple of times now and most of them got acked
- A transition API has been provided along with a Kconfig option. Special care has been taken to provide an easy to maintain solution (just remove the option from Kconfig and a few lines of code in kernel/trace)
- I am willing to provide the corresponding patches to pytimechart, for the new API only. Some pytimechart fixes from myself are already in the tree thanks to the maintainer (Pierre).
Please let us move this forward. Some more on-going work is depending on those changes. Thanks, Jean Thanks for the summary! Mathieu Thanks, Thomas -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com
Re: PATCH [0/4] perf: clean-up of power events API
On Thursday 07 October 2010 17:08:25 Mathieu Desnoyers wrote: [ Adding a few more CCs, since this discussion is about a tracepoint userspace ABI policy, which is a topic of general interest. ] ... Yes, sadly this debate running in circles hurts contributors. Thanks for the summary! Thanks for yours! So you (and Peter Zijlstra and some others) prefer the solution I posted with these two patches: [PATCH 1/2] PERF(kernel): Cleanup power events [PATCH 2/2] PERF(userspace): Adjust perf timechart to the new power events Those should be fine. Ingo, can you merge them, so that Jean can finally put his ARM specific implementation on top. Thanks, Thomas
Re: PATCH [0/4] perf: clean-up of power events API
Thomas, On Thu, Oct 7, 2010 at 5:49 PM, Thomas Renninger tr...@suse.de wrote: On Thursday 07 October 2010 17:08:25 Mathieu Desnoyers wrote: [ Adding a few more CCs, since this discussion is about a tracepoint userspace ABI policy, which is a topic of general interest. ] ... Yes, sadly this debate running in circles hurts contributors. Thanks for the summary! Thanks for yours! So you (and Peter Zijlstra and some others) prefer the solution I posted with these two patches: [PATCH 1/2] PERF(kernel): Cleanup power events My latest patch [1/4] is a rework of your patch. It adds: - adaptation to linux-2.6-tip - change of a wrong permission in intel_idle.c - light changes in the API (correction of trace printks ...) I would prefer to use that version if you are ok with it. [PATCH 2/2] PERF(userspace): Adjust perf timechart to the new power events Those should be fine. Ingo, can you merge them, so that Jean can finally put his ARM specific implementation on top. Thanks, Thomas Jean
Re: PATCH [0/4] perf: clean-up of power events API
On Thu, Oct 07, 2010 at 05:23:43PM +0200, Pierre Tardy wrote: On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: [ Adding a few more CCs, since this discussion is about a tracepoint userspace ABI policy, which is a topic of general interest. ] To add a little more comment, this is not the first time that the tracepoint ABI has changed. You can look at the pytimechart source code: http://gitorious.org/pytimechart/pytimechart/blobs/master/timechart/ftrace.py from 2.6.31 which is the first kernel I support, sched_switch: 'task %s:%d [%d] == %s:%d [%d]', changed to: sched_switch: 'prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s == next_comm=%s next_pid=%d next_prio=%d', workqueue_execution: 'thread=%s func=%s\\+%s/%s','thread','func','func_offset','func_size'), changed to: workqueue_execution: 'thread=%s func=%s','thread','func'), Seems to be only formatting changes, but no field has been removed and no tracepoint has been renamed, etc... So these are not stable ABI changes because the formatting can be changed anytime. We want that flexibility and it stands on top of the per event format files. Tools are not supposed to read ascii formatted traces from trace/trace_pipe files. Instead they need to read binary traces from trace_pipe_raw files and look at the format file to know how to format this. This is why we have these format files: to let tools adapt to changes like a format change or added fields. And we have a library in perf and trace-cmd that lets you
- request a field value in a raw trace, by its name. So the field doesn't need to have a stable offset in the trace.
- request ascii format info, so that if the ascii format changes, the tool adapts.
- record binary traces, much more lightweight for the writer (kernel) and for the reader (user).
I did tell you that it would be better if you made PyTimeChart use the perf scripting facilities, it handles all the above things + it would save you from handling a lot of things.
Now it's up to you, but don't count on us to make the ASCII formatting a stable ABI.

Actually, of all the events pytimechart supports, only power traces are stable...

Now one problem is that we have really broken the workqueue tracepoints in this release. I thought nobody was using them so we could refactor this tracepoint subsystem, my bad.

workqueue_execution has become workqueue_execution_start and workqueue_execution_end. workqueue_insertion is going to suffer a similar split. workqueue_creation and workqueue_destruction have disappeared; I can probably restore them, but for the rest, what should we do?

I really feel uncomfortable with this tracepoint/ABI problem.

Mathieu suggested we start a user library that could handle these changes when they are really necessary.

Thoughts? (Adding Tejun in Cc).
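One way to picture the user library idea mentioned above is a small translation layer between kernel event names and the names a tool uses internally. The sketch below is only an assumption about what such a layer could look like: the kernel event names come from this thread, while the canonical names, the "phase" values and the `normalize`/`dispatch` helpers are hypothetical.

```python
# Kernel event name -> (tool-internal name, phase).
# Only the keys come from the thread; the values are hypothetical.
EVENT_ALIASES = {
    "workqueue_execution": ("workqueue_execution", "point"),
    "workqueue_execution_start": ("workqueue_execution", "start"),
    "workqueue_execution_end": ("workqueue_execution", "end"),
}


def normalize(event_name):
    """Return (canonical_name, phase); unknown events pass through."""
    return EVENT_ALIASES.get(event_name, (event_name, "point"))


def dispatch(events, handlers):
    """Route (name, payload) pairs to handlers keyed by canonical name.

    Returns the names of events that had no handler, so the tool can
    report them instead of silently dropping them.
    """
    unhandled = []
    for name, payload in events:
        canonical, phase = normalize(name)
        handler = handlers.get(canonical)
        if handler is None:
            unhandled.append(name)
        else:
            handler(phase, payload)
    return unhandled
```

With a table like this, a rename or split in the kernel becomes one new entry in the alias table rather than a change in every handler.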
Re: PATCH [0/4] perf: clean-up of power events API
I did tell you that it would be better to make PyTimeChart use the perf scripting facilities: they handle all the above things, and they would save you from handling a lot of things yourself.

Actually, the perf scripting facility is already supported by pytimechart, but it does not make it that much easier to maintain:
event name changes => must update,
event fields added/removed => must update.

Now it's up to you, but don't count on us to make the ASCII formatting a stable ABI.

I'm not against adding 1 line in pytimechart each time there is some change in the ASCII formatting.

Actually, of all the events pytimechart supports, only power traces are stable...

Now one problem is that we have really broken the workqueue tracepoints in this release. I thought nobody was using them so we could refactor this tracepoint subsystem, my bad.

No problem. I'll update pytimechart whenever someone sends me traces that do not work (I'm okay with pre 2.6.31 traces too...)

Regards,
Pierre
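The "one line per formatting change" approach described above can be sketched as a list of known ASCII layouts that are tried in turn. This is a minimal sketch, not pytimechart's actual code: the helper names are hypothetical, and the `==>` separator in the format strings is an assumption about the original strings.

```python
import re

# Known ASCII layouts for sched_switch, oldest first. Each new kernel
# formatting would add one entry here (the "1 line" per change).
SCHED_SWITCH_FORMATS = [
    ("task %s:%d [%d] ==> %s:%d [%d]",
     ["prev_comm", "prev_pid", "prev_prio",
      "next_comm", "next_pid", "next_prio"]),
    ("prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s ==> "
     "next_comm=%s next_pid=%d next_prio=%d",
     ["prev_comm", "prev_pid", "prev_prio", "prev_state",
      "next_comm", "next_pid", "next_prio"]),
]


def compile_format(fmt):
    """Turn a scanf-like format into (regex, converters)."""
    pattern, converters = "", []
    for part in re.split(r"(%[sd])", fmt):
        if part == "%s":
            pattern += r"(.+?)"      # lazy, so names may contain ':' or spaces
            converters.append(str)
        elif part == "%d":
            pattern += r"(-?\d+)"
            converters.append(int)
        else:
            pattern += re.escape(part)
    return re.compile(pattern), converters


COMPILED = [(compile_format(fmt), names) for fmt, names in SCHED_SWITCH_FORMATS]


def parse_sched_switch(text):
    """Return a dict of fields, or None if no known layout matches."""
    for (regex, converters), names in COMPILED:
        m = regex.fullmatch(text)
        if m:
            return {n: conv(v)
                    for n, conv, v in zip(names, converters, m.groups())}
    return None
```

A line in an unknown layout simply returns None, which is the point where such a tool has to be updated by hand; that is the maintenance cost the thread is arguing about.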
Re: PATCH [0/4] perf: clean-up of power events API
Hi,

On Monday 04 October 2010 17:20:57 Jean Pihet wrote:

Here is a re-spin of the patches after discussion.

What is going to happen here now? Is this supposed to go through Ingo's tree?

Ingo: do you mind commenting on this? I see 3 possibilities:

1) Power (or all) perf events are never going to change.

If they are going to change, then now is the right time, and:

2) Backward compatibility is provided in some way for some time.
3) The power events get cleaned up without compatibility with former kernel versions.

There are patches for 2. and 3.; for 1. there obviously are none needed. For 2., the patches (mine or Jean's) need some polishing. IMO these double events inside of general code aren't that bad. I trust Jean that it's not that easy with all the include magic and macros; I partly realized that myself already, and it's not worth digging further for a temporary solution.

Votes so far:

1. Arjan
2. Myself, Jean
3. Peter Zijlstra and Mathieu Desnoyers

Jean's work has been successfully blocked for weeks now. If there were a final decision by a maintainer who is going to merge Jean's work, that would be great, and it would finally be worth sending updated patches again, which hopefully some day find their way into a linux-next kernel...

Thanks,
Thomas