Victor, On Thu, Oct 14, 2010 at 3:05 PM, Victor Jimenez <victor.jav...@bsc.es> wrote: > Hello, > > I am interested in being able to sample the performance counters along > the time (for instance, once every second). This is exactly what "task" > example from libpfm4 does when the -p option is used. > > However, I am getting some strange results when monitoring programs that > call the sleep function. Directly using the perf tool included in the > kernel seems to provide correct results. > > Just for reference, I am using Ubuntu 10.10 (kernel version is > 2.6.35-22) on an x86 system (please, let me know if you need more > information). > > If I execute the following command: > perf_2.6.35-22 stat -e cycles:uk,instructions:uk sleep 10 > > this is what I get: > > Performance counter stats for 'sleep 10': > > 1,119,863 cycles > 564,811 instructions # 0.504 IPC > > 10.001751746 seconds time elapsed > > The number of cycles and instructions is really low, as expected, since > the process is sleeping most of the time. > That's correct.
> However, the output from "task" in libpfm4 gives me: > > task -p -e PERF_COUNT_HW_CPU_CYCLES,PERF_COUNT_HW_INSTRUCTIONS sleep 10 > > 1,121,784 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 480,523 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling, > ena=615,705, run=615,705) > 0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling, > ena=615,705, run=615,705) > > 14,724,919,451 PERF_COUNT_HW_CPU_CYCLES (99.99% scaling, > ena=9,001,429,971, run=785,445) > 6,363,156,689 PERF_COUNT_HW_INSTRUCTIONS (99.99% scaling, > ena=9,001,429,971, run=785,445) > > The first sample approximately matches the result from "perf". However, > with libpfm4 there is a sample at the end with a significant number of With -p, you print the deltas every second. It's thus normal to get zero given the process is blocked in sleep(). > cycles and instructions that does not appear with "perf". Initially, I > was thinking I could just drop the last sample, but if the sleep is > called in the middle of the execution, I would definitely get "wrong" > results, as there will be more samples after that. > what you are seeing in the end is the result of the counter scaling. The tool is scaling the count based on time_enabled vs time_running. Time_enabled captures duration from start to stop whereas time_running captures duration of when each event was actually on the PMU HW. Now, I think there is a kernel bug with accounting of time_running because if you run task without -p it works (this is what perf does). But if in between you read() counts then the value of time_running at the end is very different. I will investigate. Thanks for pointing this out to me. > Do you have any idea on what could be the problem? > > Thank you very much, > Victor > > WARNING / LEGAL TEXT: This message is intended only for the use of the > individual or entity to which it is addressed and may contain > information which is privileged, confidential, proprietary, or exempt > from disclosure under applicable law. If you are not the intended > recipient or the person responsible for delivering the message to the > intended recipient, you are strictly prohibited from disclosing, > distributing, copying, or in any way using this message. If you have > received this communication in error, please notify the sender and > destroy and delete any copies you may have received. > > http://www.bsc.es/disclaimer.htm > > ------------------------------------------------------------------------------ > Beautiful is writing same markup. Internet Explorer 9 supports > standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3. > Spend less time writing and rewriting code and more time creating great > experiences on the web. Be a part of the beta today. > http://p.sf.net/sfu/beautyoftheweb > _______________________________________________ > perfmon2-libpfm4-commits mailing list > perfmon2-libpfm4-comm...@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/perfmon2-libpfm4-commits > ------------------------------------------------------------------------------ Beautiful is writing same markup. Internet Explorer 9 supports standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3. Spend less time writing and rewriting code and more time creating great experiences on the web. Be a part of the beta today. http://p.sf.net/sfu/beautyoftheweb _______________________________________________ perfmon2-devel mailing list perfmon2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/perfmon2-devel