Re: [perfmon2] libpfm4 progress

stephane eranian Fri, 19 Jun 2009 12:21:55 -0700

On Fri, Jun 19, 2009 at 9:03 PM, Corey J Ashford<cjash...@us.ibm.com> wrote:
> Hi Stephane,
>
> For this sort of event, which requires multiple PMDs, how does libpfm
> describe to the caller the formula for combining the values from the PMDs?
> Or is it expected that the caller knows how?


It is expected the caller knows.
With libpfm-3.x that was explicit because the caller had to pass a PMU-specific
data structure.

>
> - Corey
>
>
> stephane eranian <eran...@googlemail.com> wrote on 06/19/2009 10:24:29 AM:
>
>> Dan,
>>
>> Here is another libpfm4-related question.
>>
>> One thing that is quite bad with the current libpfm-3.x is that
>> you pass events and get back PMCs and PMDs but it is not
>> obvious which register corresponds to which event. The mapping
>> is not always 1-to-1. Take Intel Core: you can pass unhalted_core_cycles
>> and instructions_retired, and they would go into a single PMC and two
>> PMDs. In general, tools do not really care about PMCs; they get them
>> back and pass them to the kernel (perfmon does not allow reading back
>> of the PMCs). But you need to know which data register to read, so
>> you can collect your data.
>>
>> There is an implicit rule inside libpfm which says that the PMDs for
>> events are returned IN THE ORDER of the events. And it assumes a one-to-one
>> mapping: 1 event = 1 data register. This is pretty basic and has worked
>> okay until now.
>>
>> With libpfm4, everything is represented as an event, not just the actual
>> events but also AMD64 IBS, Intel LBR. Thus, there needs to be a more
>> robust way of mapping events -> registers.
>>
>> Ideally, you'd want libpfm to return the list of PMCs and PMDs that
>> correspond to each event. For instance:
>>     struct {
>>        pfmlib_reg_t pmcs[];
>>        pfmlib_reg_t pmds[];
>>     };
>>
>> In some cases, the PMCs of two events could be identical, e.g., Intel
>> Core fixed counters. But the PMDs would usually be distinct. The above
>> proposal is overkill, though, and consumes quite a bit of memory, unless
>> all of this is dynamically allocated.
>>
>> The current alternative I am experimenting with is that for each
>> register returned, the index of the event is stored. Remember that
>> pfm_dispatch_events() is replaced with
>>     pfm_assign_events(char **events_argv, pfmlib_assign_in_t *in,
>>                       pfmlib_assign_out_t *out);
>>
>> If you pass:
>>    events_argv[0] = unhalted_core_cycles
>>    events_argv[1] = instructions_retired
>>
>> You will get back:
>>     out->pmds[0].reg_num = 17;
>>     out->pmds[0].reg_eventid = 0;
>>
>>     out->pmds[1].reg_num = 16;
>>     out->pmds[1].reg_eventid = 1;
>>
>> Thus, to find out what register(s) you need to read for
>> unhalted_core_cycles, you have to scan out->pmds[] once, looking
>> for reg_eventid == 0.
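
A rough sketch of that scan, for illustration only and not actual libpfm4
code. Only the field names reg_num and reg_eventid come from the message
above; the types sketch_reg_t and sketch_assign_out_t, the pmd_count field,
the MAX_PMDS limit, and the helper pmds_for_event() are all hypothetical
stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the proposed libpfm4 output types.
 * Only reg_num and reg_eventid are taken from the message. */
#define MAX_PMDS 8

typedef struct {
    unsigned int reg_num;     /* PMD register to read */
    unsigned int reg_eventid; /* index of the event in events_argv[] */
} sketch_reg_t;

typedef struct {
    sketch_reg_t pmds[MAX_PMDS];
    size_t       pmd_count;   /* assumed: number of valid pmds[] entries */
} sketch_assign_out_t;

/* Scan out->pmds[] once and collect the register numbers tagged with
 * eventid. Returns how many were stored in regs[] (at most max_regs). */
size_t pmds_for_event(const sketch_assign_out_t *out, unsigned int eventid,
                      unsigned int *regs, size_t max_regs)
{
    size_t n = 0;

    assert(out != NULL && regs != NULL);
    for (size_t i = 0; i < out->pmd_count && n < max_regs; i++) {
        if (out->pmds[i].reg_eventid == eventid)
            regs[n++] = out->pmds[i].reg_num;
    }
    return n;
}
```

With the values shown above (PMD 17 tagged with event 0, PMD 16 tagged with
event 1), a lookup for event 0 would store the single register 17. A
multi-PMD event would simply produce one entry per tagged register.
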
>>
>> I know it is not ideal. But this works for multi-PMD events or
>> pseudo-events such as AMD IBS. For instance, IBSOPFETCH: 1 PMC + 3 PMDs.
>>
>> Note that this scheme does not work too well for PMCs if they are
>> shared by multiple events, such as on Intel Core for fixed counters.
>> But my earlier point was that tools do not really care which PMCs
>> correspond to which event.
>>
>> There are some alternatives, such as returning a pair of bitmasks per
>> event, one for the PMCs, the other for the PMDs.
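
One way the bitmask alternative could look, again only as a sketch under
stated assumptions. The type sketch_event_regs_t and the helpers mask_set()
and mask_to_regs() are hypothetical names, and a 64-bit mask assumes register
numbers below 64, which may not hold on every PMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-event result: bit n set means register n is used. */
typedef struct {
    uint64_t pmc_mask; /* config registers used by this event */
    uint64_t pmd_mask; /* data registers to read for this event */
} sketch_event_regs_t;

/* Mark register reg as used in *mask (assumes reg < 64). */
static inline void mask_set(uint64_t *mask, unsigned int reg)
{
    assert(reg < 64);
    *mask |= UINT64_C(1) << reg;
}

/* Expand a mask into register numbers, lowest first.
 * Returns how many were stored in regs[] (at most max_regs). */
size_t mask_to_regs(uint64_t mask, unsigned int *regs, size_t max_regs)
{
    size_t n = 0;

    for (unsigned int reg = 0; reg < 64 && n < max_regs; reg++) {
        if (mask & (UINT64_C(1) << reg))
            regs[n++] = reg;
    }
    return n;
}
```

In this layout a PMC shared by two events just has the same bit set in both
events' pmc_mask, which a single reg_eventid per register cannot express.
The cost is one pair of masks per event and a fixed upper bound on register
numbers.
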
>>
>> I am open to suggestions on how to solve this better.
>
>
