On Tue, 9 Sep 2014, Gary Mohr wrote:

> --- ls output removed ---
> 
>  Performance counter stats for '/bin/ls':
> 
>              5,625 uncore_cbox_0/event=0x35,umask=0xa/                  [26.27%]
>    <not supported> uncore_cbox_0/event=0x35,umask=0x4a,filter_nid=0x1/
> 
>        0.002038929 seconds time elapsed
> 
> 
> So this behaved similarly to PAPI/libpfm4: the first event returned a
> count and the second event got an error.
> Just for fun, I used the same events in the opposite order:
> 
> 
> perf stat -a -e 
> \{"uncore_cbox_0/event=0x35,umask=0x4a,filter_nid=0x1/","uncore_cbox_0/event=0x35,umask=0xa/"\}
>  /bin/ls
> 
> --- ls output removed ---
> 
>  Performance counter stats for '/bin/ls':
> 
>      <not counted> uncore_cbox_0/event=0x35,umask=0x4a,filter_nid=0x1/
>    <not supported> uncore_cbox_0/event=0x35,umask=0xa/
> 
>        0.002003219 seconds time elapsed
> 
> 
> This caused both events to report an error.  This looks to me like a
> kernel problem.  I also tried using each event by itself, and each
> returned a count.  With PAPI/libpfm4 I believe this test would return a
> count for the first event and an error on the second.
> You implied that the { }'s may influence whether or how events are
> grouped, so I tried the command again in the original order without the
> { } characters and got this:
> 
> 
> perf stat -a -e 
> "uncore_cbox_0/event=0x35,umask=0xa/","uncore_cbox_0/event=0x35,umask=0x4a,filter_nid=0x1/"
>  /bin/ls
> 
> --- ls output removed ---
> 
>  Performance counter stats for '/bin/ls':
> 
>             57,288 uncore_cbox_0/event=0x35,umask=0xa/                   [18.05%]
>            158,292 uncore_cbox_0/event=0x35,umask=0x4a,filter_nid=0x1/   [ 3.07%]
> 
>        0.001963151 seconds time elapsed
> 
> 
> Both events give a count.  I have never seen this result with
> PAPI/libpfm4, but then I have never tried these events with grouping
> enabled when calling the kernel.
> 
> In PAPI we turned grouping off so that the kernel would allow us to use
> events from different uncore PMUs at the same time.  I can try turning
> it back on and running these two events to see what happens.  If they
> work, maybe a better solution is a hybrid form of grouping: create a
> separate group for each uncore PMU, put all the events associated with
> a given PMU into that PMU's group, and then call the kernel once per
> group rather than once per event as we do now.
> 
> Any idea if the kernel will let us play the game this way ??
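
At the perf_event_open() level I'd expect that to look roughly like the
sketch below.  I haven't tried this against the uncore PMUs myself, so
treat it as a sketch: the PMU type number really comes from
/sys/bus/event_source/devices/uncore_cbox_0/type, the event/umask bit
layout from its format/ files, and I believe filter_nid would be encoded
into attr.config1.

    /* sketch: one perf_event_open() group per uncore PMU */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>

    static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
                               int cpu, int group_fd, unsigned long flags)
    {
            return syscall(__NR_perf_event_open, attr, pid, cpu,
                           group_fd, flags);
    }

    int main(void)
    {
            struct perf_event_attr attr;
            int leader, sibling;

            memset(&attr, 0, sizeof(attr));
            attr.size = sizeof(attr);
            attr.type = 17;   /* placeholder -- read the real value from
                                 .../uncore_cbox_0/type */
            attr.config = 0x35 | (0xa << 8);  /* event=0x35,umask=0xa,
                                                 assuming the usual
                                                 event[7:0] umask[15:8]
                                                 layout */
            attr.disabled = 1;

            /* first event on this PMU starts the group (group_fd = -1);
               uncore events are system-wide, hence pid = -1, cpu = 0 */
            leader = perf_event_open(&attr, -1, 0, -1, 0);
            if (leader < 0) { perror("leader"); exit(1); }

            /* second event on the *same* PMU joins the leader's group */
            attr.config = 0x35 | (0x4a << 8);
            attr.disabled = 0;
            sibling = perf_event_open(&attr, -1, 0, leader, 0);
            if (sibling < 0) { perror("sibling"); exit(1); }

            /* an event on a *different* uncore PMU would start its own
               group, i.e. another call with group_fd = -1; enable each
               group with ioctl(leader, PERF_EVENT_IOC_ENABLE, 0) */
            return 0;
    }

As far as I know the kernel is fine with that arrangement; the
restriction on mixing PMUs only applies to events within a single group,
not across separate groups.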

Interesting, I'll have to run some more tests on my Sandybridge-EP 
machine.

What kernel are you running again?  I'm testing on a machine running 3.14,
so possibly there were scheduling bugs in older kernels that were fixed
at some point.

When running both with and without {} I get something like:

Performance counter stats for 'system wide':

               606      uncore_cbox_0/event=0x35,umask=0xa/                  [99.61%]
               247      uncore_cbox_0/event=0x35,umask=0x4a,filter_nid=0x1/

       0.000851895 seconds time elapsed

Which makes it look like it's multiplexing the events in a way I'm not
really following, maybe to avoid a scheduling conflict.
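
One way to check would be to open the events with the enabled/running
times in the read format; the running/enabled ratio is what perf stat
prints as that [NN.NN%] figure.  A minimal sketch, assuming the fd was
opened with

    attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
                       PERF_FORMAT_TOTAL_TIME_RUNNING;

set in the attr:

    #include <stdio.h>
    #include <stdint.h>
    #include <unistd.h>

    static void print_scaling(int fd)
    {
            uint64_t buf[3]; /* [0]=count [1]=time_enabled [2]=time_running */

            if (read(fd, buf, sizeof(buf)) != sizeof(buf) || buf[1] == 0)
                    return;

            /* running < enabled means the kernel rotated the event off
               the hardware counter, i.e. it was multiplexed */
            printf("count=%llu  on-counter %.2f%% of the time\n",
                   (unsigned long long)buf[0],
                   100.0 * buf[2] / buf[1]);
    }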

Vince
