On Sat, May 1, 2010 at 3:37 AM, Corey Ashford
<cjash...@linux.vnet.ibm.com> wrote:
> Hi Stephane,
>
> When I posted the patch, I hadn't really tested it much.  Now that I'm 
> testing it, I'm seeing some very strange behavior.  After I worked through a 
> number of oversights I had made, I have come to this point where I find that 
> grouping and "enable_on_exec" don't appear to work together.  At first I 
> thought it was my code, but I reverted back to the stock code for "task", and 
> am seeing this on Linux kernel version 2.6.33.3:
>
> % ./task -e PM_RUN_CYC,PM_INST_CMPL sleep 3
>             438,780 PM_RUN_CYC (963,728 : 963,728)
>             139,211 PM_INST_CMPL (963,728 : 963,728)
>
> This looks correct, but if I add the -g switch:
>
> % ./task -g -e PM_RUN_CYC,PM_INST_CMPL sleep 3
> task: could not read event0 ret=32         <<<<<<<< Notice this
>             437,055 PM_RUN_CYC (930,392 : 930,392)
>                   0 PM_INST_CMPL (930,392 : 930,392)
>
> Does this look familiar to you?
>

Yes. I get the same on x86. This is a problem I reported on LKML a couple of
weeks ago.

There is a serious bug in the management of groups. The issue is related to
groups and attach/detach. In the case of the task example, you read the counts
once the monitored task has died, i.e., after the events are detached. It turns
out that group reading, i.e., a single read to extract all the values of a
group, is broken. It works fine as long as the events are attached to the task,
but it fails once they are detached. After detachment, all you can read out via
the leader's fd is the leader's count. To read the "slave" counts, you have to
use their own file descriptors. This is a bug. If events are created as a group,
then they must be managed as a group all along, at least as long as the leader
exists.
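
For illustration, the short read can be detected roughly as follows. This is
only a sketch, not the code from the task example: read_group_counts(),
group_read_size() and MAX_GROUP_EVENTS are made-up names. It assumes the leader
was opened with PERF_FORMAT_GROUP plus both PERF_FORMAT_TOTAL_TIME_ENABLED and
PERF_FORMAT_TOTAL_TIME_RUNNING, and without PERF_FORMAT_ID. Under that
assumption a full read for two events is 8 * (3 + 2) = 40 bytes, while a read
that stops after the leader is 8 * (3 + 1) = 32 bytes, which matches the
ret=32 in your output.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

#define MAX_GROUP_EVENTS 16	/* arbitrary cap chosen for this sketch */

/*
 * Assumed buffer layout for a group read with both time fields enabled
 * and no per-event id: nr, time_enabled, time_running, then one value
 * per event, leader first.
 */
struct group_counts {
	uint64_t nr;
	uint64_t time_enabled;
	uint64_t time_running;
	uint64_t values[MAX_GROUP_EVENTS];
};

/* Bytes a complete group read should return for nr_events events. */
static size_t group_read_size(int nr_events)
{
	return sizeof(uint64_t) * (3 + (size_t)nr_events);
}

/*
 * Hypothetical helper: issue a single group read on the leader's fd.
 * Returns the number of counter values actually delivered, which may be
 * smaller than nr_events, or -1 if the read failed or did not even
 * return the three header words.
 */
static int read_group_counts(int leader_fd, int nr_events,
			     struct group_counts *out)
{
	ssize_t got;

	assert(nr_events > 0 && nr_events <= MAX_GROUP_EVENTS);

	got = read(leader_fd, out, group_read_size(nr_events));
	if (got < (ssize_t)group_read_size(0))
		return -1;

	return (int)(((size_t)got - group_read_size(0)) / sizeof(uint64_t));
}
```

If the return value is smaller than the number of events in the group, the
caller knows it hit the short read and can fall back to reading the remaining
events through their own file descriptors.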

I have not heard back from LKML since I posted the problem.

> I hacked at the code so that it would do only self-monitoring, and found that 
> reading up the values works fine in that case.  Only when monitoring another 
> pid do we always read fewer bytes than what we ask for (in this case 32 
> instead of 40).  I have read up the "nr" value from the 
> "PERF_EVENT_FORMAT_GROUP" record, and the value is correct (it contains the 
> number of counters in the group), so it seems to be that we simply cannot 
> read past the first counter value.
>
> I'm starting to suspect a kernel bug here.  I've thought about trying the 
> -tip kernel (I started with 2.6.32.11 where I had the same issue), but I 
> thought I'd run this by you first.
>
> Regards,
>
> - Corey
>
>

------------------------------------------------------------------------------
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
