Victor,

On Thu, Oct 14, 2010 at 3:05 PM, Victor Jimenez <victor.jav...@bsc.es> wrote:
> Hello,
>
> I am interested in being able to sample the performance counters along
> the time (for instance, once every second). This is exactly what "task"
> example from libpfm4 does when the -p option is used.
>
> However, I am getting some strange results when monitoring programs that
> call the sleep function. Directly using the perf tool included in the
> kernel seems to provide correct results.
>
> Just for reference, I am using Ubuntu 10.10 (kernel version is
> 2.6.35-22) on an x86 system (please, let me know if you need more
> information).
>
> If I execute the following command:
> perf_2.6.35-22 stat -e cycles:uk,instructions:uk sleep 10
>
> this is what I get:
>
> Performance counter stats for 'sleep 10':
>
>             1,119,863  cycles
>              564,811  instructions             #      0.504 IPC
>
>        10.001751746  seconds time elapsed
>
> The number of cycles and instructions is really low, as expected, since
> the process is sleeping most of the time.
>
That's correct.

> However, the output from "task" in libpfm4 gives me:
>
> task -p -e PERF_COUNT_HW_CPU_CYCLES,PERF_COUNT_HW_INSTRUCTIONS sleep 10
>
>            1,121,784 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>              480,523 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>                    0 PERF_COUNT_HW_CPU_CYCLES (0.00% scaling,
> ena=615,705, run=615,705)
>                    0 PERF_COUNT_HW_INSTRUCTIONS (0.00% scaling,
> ena=615,705, run=615,705)
>
>       14,724,919,451 PERF_COUNT_HW_CPU_CYCLES (99.99% scaling,
> ena=9,001,429,971, run=785,445)
>        6,363,156,689 PERF_COUNT_HW_INSTRUCTIONS (99.99% scaling,
> ena=9,001,429,971, run=785,445)
>
> The first sample approximately matches the result from "perf". However,
> with libpfm4 there is a sample at the end with a significant number of

With -p, you print the deltas every second. It's thus normal to get zero
given the process is blocked in sleep().

> cycles and instructions that does not appear with "perf". Initially, I
> was thinking I could just drop the last sample, but if the sleep is
> called in the middle of the execution, I would definitely get "wrong"
> results, as there will be more samples after that.
>
what you are seeing in the end is the result of the counter scaling.
The tool is scaling the count based on time_enabled vs time_running.
Time_enabled captures duration from start to stop whereas time_running
captures duration of when each event was actually on the PMU HW.

Now, I think there is a kernel bug with accounting of time_running because
if you run task without -p it works (this is what perf does). But if in between
you read() counts then the value of time_running at the end is very different.

I will investigate.
Thanks for pointing this out to me.

> Do you have any idea on what could be the problem?
>
> Thank you very much,
> Victor
>
> WARNING / LEGAL TEXT: This message is intended only for the use of the
> individual or entity to which it is addressed and may contain
> information which is privileged, confidential, proprietary, or exempt
> from disclosure under applicable law. If you are not the intended
> recipient or the person responsible for delivering the message to the
> intended recipient, you are strictly prohibited from disclosing,
> distributing, copying, or in any way using this message. If you have
> received this communication in error, please notify the sender and
> destroy and delete any copies you may have received.
>
> http://www.bsc.es/disclaimer.htm
>
> ------------------------------------------------------------------------------
> Beautiful is writing same markup. Internet Explorer 9 supports
> standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
> Spend less time writing and  rewriting code and more time creating great
> experiences on the web. Be a part of the beta today.
> http://p.sf.net/sfu/beautyoftheweb
> _______________________________________________
> perfmon2-libpfm4-commits mailing list
> perfmon2-libpfm4-comm...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-libpfm4-commits
>

------------------------------------------------------------------------------
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today.
http://p.sf.net/sfu/beautyoftheweb
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel

Reply via email to