On 01/09/2011 6:45 AM, stephane eranian wrote:
> On Thu, Sep 1, 2011 at 3:29 PM, stephane eranian<eran...@googlemail.com>
> wrote:
>> On Thu, Sep 1, 2011 at 3:06 PM, Ryan Johnson
>> <ryan.john...@cs.utoronto.ca> wrote:
>>> On 01/09/2011 1:55 AM, stephane eranian wrote:
>>>> On Thu, Sep 1, 2011 at 1:07 AM, Corey Ashford
>>>> <cjash...@linux.vnet.ibm.com> wrote:
>>>>> On 08/25/2011 07:19 AM, stephane eranian wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Sorry for late reply.
>>>>>>
>>>>>> The current support for mmaped count is broken on perf_event x86.
>>>>>> It simply does not work. I think it only works on PPC at this point.
>>>>> Just as an aside, you can access the counter registers from user space
>>>>> on Power (aka PPC) machines, but because the kernel is free to schedule
>>>>> the events onto whatever counters that meet the resource constraints,
>>>>> it's not at all clear which hardware counter to read from user space,
>>>>> and in fact, with event rotation, the counter being used can change from
>>>>> one system tick till the next.
>>>>>
>>>>> If you program a single event, you can be guaranteed that it won't move
>>>>> around, but you still will have to guess or somehow determine which
>>>>> hardware counter is being used by the kernel.
>>>>>
>>>> Yes, and that's why they have this 'lock' field in there. It's not really
>>>> a lock but rather a generation counter. You need to read it before you
>>>> attempt to read and you need to check it when you're done reading. If the
>>>> two values don't match then the counter changed and you need to retry.
>>>> And changes means it may have moved to a different counter.
>>> This protocol is actually documented pretty well in
>>> <linux/perf_event.h>, too. Read the lock, read the index, read hw
>>> counter[index-1], read lock again to verify.
>>>
>>>> But the key problem here is the time scaling. In case you are multiplex
>>>> you need to be able to retrieve time_enabled and time_running to scale
>>>> the count.
>>>> But that's not exposed, thus it does not work as soon as you
>>>> have multiplexing. Well, unless you only care about deltas and not the
>>>> absolute values.
>>> Doesn't perf_event_mmap_page expose both those, also protected by the
>>> generation counter? Or are you saying the kernel doesn't actually update
>>> those fields right now?
>>>
>> Yes, it does. I am not sure they're updated correctly, though.
>> I have not tried that in a very long time.
>>
> Did you manage to make libpfm4's self_count program work correctly?
> Even by just looking at the raw count coming out of rdpmc?
>
> I think there are issues with hdr->offset, i.e., the 64-bit sw-maintained
> base for the counter.

I only did limited testing because other things took priority the last couple of weeks, but I'll be back into it in the next couple of weeks. Meanwhile, here's what I know:

The machine is a Westmere EX (which is why I can't just use an older kernel+perfctr) running kernel 2.6.38. I've got the cvs head for papi, wired up with git version 9fc1bc1e of libpfm4. self_count seg faults by default because rdpmc is privileged, and papi's unit tests cause the machine to hard-lock (I have to use the hypervisor to reboot). One definite culprit is ctests/overflow_allcounters, but I haven't done a bisection search in 2.6.38 to see if there are any others.

After I upgraded to kernel 2.6.39, ctests/overflow_allcounters was the only unit test failure, but it "only" hard-locks the perf events infrastructure rather than the whole machine. The unit test's process hangs with 0% CPU utilization and becomes unkillable, and any later process attempting to use perf events suffers the same fate. The mmap+rdpmc support is apparently disabled in 2.6.39, in that index=0 for all time. The self_count test runs without errors and reports monotonically increasing values, but I never attempted to verify that the starting count was meaningful. For now I've rolled back to 2.6.38, since the later version is a step backwards for my needs.

With the kernel module I mentioned before, user-level rdpmc seems to stay enabled indefinitely and self_count runs without errors. I've extended the test slightly to run fib with n={30,35,40}, to track which counter number it used directly (if any), and to report the deltas between measurements.

Here's the output I get:

> $ ./self_count
> raw=0xcd73 offset=0x0, ena=36278 run=36278 idx=-1 direct=0
> 52595 PERF_COUNT_HW_CPU_CYCLES (delta= cd73)
> raw=0xffff811d738b offset=0x7fffffff, ena=36278 run=36278 idx=0 direct=1
> 281474995417994 PERF_COUNT_HW_CPU_CYCLES (delta= 10000011ca617)
> raw=0xffff8d588633 offset=0x7fffffff, ena=36278 run=36278 idx=0 direct=1
> 281475200615986 PERF_COUNT_HW_CPU_CYCLES (delta= c3b12a8)
> raw=0xffff94aa8789 offset=0xfffffffe, ena=36278 run=36278 idx=0 direct=1
> 281477470914439 PERF_COUNT_HW_CPU_CYCLES (delta= 87520155)
> raw=0xffff95c33ede offset=0xfffffffe, ena=36278 run=36278 idx=0 direct=1
> 281477489311452 PERF_COUNT_HW_CPU_CYCLES (delta= 118b755)
> raw=0xffffa1ea0c92 offset=0xfffffffe, ena=36278 run=36278 idx=0 direct=1
> 281477693181072 PERF_COUNT_HW_CPU_CYCLES (delta= c26cdb4)
> raw=0xffffa8e995ed offset=0x17ffffffd, ena=36278 run=36278 idx=0 direct=1
> 281479958074858 PERF_COUNT_HW_CPU_CYCLES (delta= 86ff895a)
> raw=0xffffaa0262b9 offset=0x17ffffffd, ena=36278 run=36278 idx=0 direct=1
> 281479976477366 PERF_COUNT_HW_CPU_CYCLES (delta= 118cccc)
> raw=0xffffb6284921 offset=0x17ffffffd, ena=36278 run=36278 idx=0 direct=1
> 281480180287774 PERF_COUNT_HW_CPU_CYCLES (delta= c25e668)

Judging from the above, the offset does seem to be broken, truncated to 32 bits, perhaps?
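
A quick arithmetic check on the numbers above may help narrow this down. The sketch below is not part of self_count; the helper names (sign_extend, naive_count, signed_count) are made up for illustration, and the 48-bit counter width is an assumption that nothing in this thread confirms. It shows that the printed totals are exactly raw + offset, and that the offsets step by 0x7fffffff (0x7fffffff, 0xfffffffe, 0x17ffffffd), so the last one already exceeds 32 bits. It then tries one alternative reading: that raw is a 48-bit value that should be sign-extended before the offset is added.

```python
# Assumption (not confirmed in this thread): the hardware counter is 48 bits
# wide and the raw rdpmc value should be treated as a signed 48-bit number.
COUNTER_BITS = 48


def sign_extend(raw, bits=COUNTER_BITS):
    """Interpret the low `bits` bits of raw as a two's-complement value."""
    sign_bit = 1 << (bits - 1)
    return (raw ^ sign_bit) - sign_bit


def naive_count(raw, offset):
    """What the output above appears to print: offset plus unsigned raw."""
    return offset + raw


def signed_count(raw, offset, bits=COUNTER_BITS):
    """Hypothetical alternative: offset plus sign-extended raw."""
    return offset + sign_extend(raw, bits)


# (raw, offset, reported total) copied from the first three idx=0 samples.
samples = [
    (0xffff811d738b, 0x7fffffff, 281474995417994),
    (0xffff8d588633, 0x7fffffff, 281475200615986),
    (0xffff94aa8789, 0xfffffffe, 281477470914439),
]

for raw, offset, reported in samples:
    # The reported totals are reproduced exactly by the unsigned sum.
    assert naive_count(raw, offset) == reported
    print(reported, "->", signed_count(raw, offset))
```

Under that assumption the three samples come out as 18707338, 223905330 and 2494203783, which is the same order of magnitude as the counts reported when read() is forced. That is only suggestive; it does not establish what the kernel actually intends for the offset field.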

If I force it to always call read(), the results make more sense:

> $ ./self_count
> raw=0xda66 offset=0x0, ena=39065 run=39065 idx=-1 direct=0
> 55910 PERF_COUNT_HW_CPU_CYCLES (delta= da66)
> raw=0x11dc60e offset=0x0, ena=10052007 run=10052007 idx=-1 direct=0
> 18728462 PERF_COUNT_HW_CPU_CYCLES (delta= 11ceba8)
> raw=0xd590016 offset=0x0, ena=120077612 run=120077612 idx=-1 direct=0
> 223936534 PERF_COUNT_HW_CPU_CYCLES (delta= c3b3a08)
> raw=0x9466a0de offset=0x0, ena=1334882738 run=1334882738 idx=-1 direct=0
> 2489753822 PERF_COUNT_HW_CPU_CYCLES (delta= 870da0c8)
> raw=0x957f95c4 offset=0x0, ena=1344755931 run=1344755931 idx=-1 direct=0
> 2508166596 PERF_COUNT_HW_CPU_CYCLES (delta= 118f4e6)
> raw=0xa1b53a47 offset=0x0, ena=1454582523 run=1454582523 idx=-1 direct=0
> 2713008711 PERF_COUNT_HW_CPU_CYCLES (delta= c35a483)

The counter itself seems to work fine, though, and I'd only be using it for deltas anyway.

Ryan

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel