Dan,

Here is another libpfm4-related question.

One thing that is quite bad with the current libpfm-3.x is that
you pass events and get back PMCs and PMDs, but it is not
obvious which register corresponds to which event. The mapping
is not always 1-to-1. Take Intel Core: you can pass unhalted_core_cycles
and instructions_retired, and they go into a single PMC and two
PMDs. In general, tools do not really care about PMCs; they get them
back and pass them to the kernel (perfmon does not allow reading the
PMCs back). But you do need to know which data register to read so
that you can collect your data.

There is an implicit rule inside libpfm which says that the PMDs for
events are returned IN THE ORDER of the events. It also assumes a
one-to-one mapping: 1 event = 1 data register. This is pretty basic and
has worked okay until now.

With libpfm4, everything is represented as an event: not just the actual
events but also AMD64 IBS and Intel LBR. Thus, there needs to be a more
robust way of mapping events -> registers.

Ideally, you'd want libpfm to return the list of PMCs and PMDs that correspond
to each event. For instance:
    struct {
       pfmlib_reg_t pmcs[];
       pfmlib_reg_t pmds[];
    };

In some cases, the PMCs of two events could be identical, e.g., Intel Core
fixed counters, whereas the PMDs would normally be distinct. However, the
above proposal is overkill and consumes quite a bit of memory, unless all
of this is dynamically allocated.

The current alternative I am experimenting with is to store, for each
register returned, the index of the event it belongs to. Remember that
pfm_dispatch_events() is replaced with:

    pfm_assign_events(char **events_argv, pfmlib_assign_in_t *in,
                      pfmlib_assign_out_t *out);

If you pass:
   events_argv[0] = "unhalted_core_cycles";
   events_argv[1] = "instructions_retired";

You will get back:
    out->pmds[0].reg_num = 17;
    out->pmds[0].reg_eventid = 0;

    out->pmds[1].reg_num = 16;
    out->pmds[1].reg_eventid = 1;

Thus, to find out which register(s) you need to read for
unhalted_core_cycles, you have to scan out->pmds[] once, looking for
reg_eventid == 0.
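
To make the scan concrete, here is a rough sketch in C. The type
sketch_reg_t and the helper pmds_for_event() are made up for this
illustration; only the reg_num and reg_eventid fields come from the
example above, and the real libpfm4 structures may well look different.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry of out->pmds[]; the real
 * libpfm4 register type may have more fields or different types. */
typedef struct {
    unsigned int reg_num;     /* index of the data register (PMD) */
    unsigned int reg_eventid; /* index of the owning event in events_argv[] */
} sketch_reg_t;

/* Scan pmds[] once and copy into regs[] the register numbers tagged
 * with eventid. At most max_regs entries are written. The return value
 * is the total number of matches, which may exceed max_regs. */
static size_t pmds_for_event(const sketch_reg_t *pmds, size_t npmds,
                             unsigned int eventid,
                             unsigned int *regs, size_t max_regs)
{
    size_t found = 0;

    for (size_t i = 0; i < npmds; i++) {
        if (pmds[i].reg_eventid != eventid)
            continue;
        if (found < max_regs)
            regs[found] = pmds[i].reg_num;
        found++;
    }
    return found;
}
```

With the values shown above, asking for event 0 would yield one
register (17), and a multi-PMD event would simply yield several.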

I know it is not ideal, but it works for multi-PMD events and for
pseudo-events such as AMD IBS.
For instance, IBSOPFETCH: 1 PMC + 3 PMDs.

Note that this scheme does not work too well for PMCs if they are
shared by multiple events, such as the fixed counters on Intel Core.
But my earlier point was that tools do not really care which PMCs
correspond to which event.

There are some alternatives, such as returning a pair of bitmasks per
event, one for the PMCs and the other for the PMDs.
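
A rough sketch of what the bitmask alternative could look like in C.
Everything here (sketch_event_regs_t, mask_set, mask_test, mask_count)
is hypothetical, and it assumes register indexes below 64 so that one
64-bit word per mask is enough; a real interface might need wider
masks.

```c
#include <stdint.h>

/* Hypothetical per-event result: bit n set means register n is used
 * by this event. Assumes register indexes are below 64. */
typedef struct {
    uint64_t pmc_mask; /* config registers programmed for the event */
    uint64_t pmd_mask; /* data registers to read for the event */
} sketch_event_regs_t;

/* Mark register reg as used in mask. */
static void mask_set(uint64_t *mask, unsigned int reg)
{
    *mask |= UINT64_C(1) << reg;
}

/* Return 1 if register reg is set in mask, 0 otherwise. */
static int mask_test(uint64_t mask, unsigned int reg)
{
    return (int)((mask >> reg) & 1);
}

/* Count how many registers are set in mask. */
static unsigned int mask_count(uint64_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

One property of this layout is that a PMC shared by two events just
shows up as the same bit in both pmc_mask values, so the shared case
needs no special handling.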

I am open to suggestions on how to solve this better.
