On Fri, Jun 19, 2009 at 9:03 PM, Corey J Ashford<cjash...@us.ibm.com> wrote:
> Hi Stephane,
>
> For these sort of events which require multiple pmds, how does libpfm
> describe to the caller the formula for combining the values from the pmds?
> Or is it expected that the caller knows how?

It is expected the caller knows. With libpfm-3.x that was explicit because
the caller had to pass a PMU-specific data structure.

>
> - Corey
>
>
> stephane eranian <eran...@googlemail.com> wrote on 06/19/2009 10:24:29 AM:
>
>> Dan,
>>
>> Here is another libpfm4-related question.
>>
>> One thing that is quite bad with the current libpfm-3.x is that
>> you pass events and get back PMCs and PMDs but it is not
>> obvious which register corresponds to which event. The mapping
>> is not always 1-to-1. Take Intel Core, you can pass unhalted_core_cycles
>> and instructions_retired, that would go into a single PMC and two
>> PMDs. In general, tools do not really care about PMCs, they get them
>> back and pass them to the kernel (perfmon does not allow reading back
>> of the PMCs). But you need to know which data register to read, so
>> you can collect your data.
>>
>> There is an implicit rule inside libpfm which says that PMDs for events
>> are returned IN THE ORDER of the events. And it assumes a one-to-one
>> mapping: 1 event = 1 data register. This is pretty basic and has worked
>> okay until now.
>>
>> With libpfm4, everything is represented as an event, not just the actual
>> events but also AMD64 IBS, Intel LBR. Thus, there needs to be a more
>> robust way of mapping events -> registers.
>>
>> Ideally, you'd want libpfm to return the list of PMCs and PMDs that
>> correspond to each event. For instance:
>>    struct {
>>        pfmlib_reg_t pmcs[];
>>        pfmlib_reg_t pmds[];
>>    };
>>
>> In some cases, the PMCs of two events could be identical, e.g.,
>> Intel Core fixed counters. But the PMDs would usually be distinct.
>> But the above proposal is overkill and consumes quite a bit of memory,
>> unless all of this is dynamically allocated.
>>
>> The current alternative I am experimenting with is that for each
>> register returned, the index of the event is stored.
>> Remember that pfm_dispatch_events() is replaced with
>> pfm_assign_events(char **events_argv, pfmlib_assign_in_t *in,
>>                   pfmlib_assign_out_t *out);
>>
>> If you pass:
>>    events_argv[0] = unhalted_core_cycles
>>    events_argv[1] = instructions_retired
>>
>> You will get back:
>>    out->pmds[0].reg_num = 17;
>>    out->pmds[0].reg_eventid = 0;
>>
>>    out->pmds[1].reg_num = 16;
>>    out->pmds[1].reg_eventid = 1;
>>
>> Thus, to find out what register(s) you need to read for
>> unhalted_core_cycles, you have to scan
>> out->pmds[] once, looking for eventid == 0.
>>
>> I know it is not ideal. But this works for multi-pmd events or
>> pseudo-events such as AMD IBS.
>> For instance, IBSOPFETCH: 1 PMC + 3 PMDs
>>
>> Note that this scheme does not work too well for PMCs if they are
>> shared by multiple events, such as on Intel Core for fixed counters.
>> But my earlier point was that tools do not really care as to which
>> PMC corresponds to which event.
>>
>> There are some alternatives such as returning a pair of bitmasks per
>> event, one for the PMCs, the other for the PMDs.
>>
>> I am open to suggestions on how to solve this better.
>
>

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
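
For reference, the lookup described in the quoted message (scan out->pmds[] for entries whose reg_eventid matches the index of the event) could be sketched roughly as follows. This is only an illustration under assumptions: pmd_desc_t and pmds_for_event() are hypothetical names, not part of libpfm4, and only the field names reg_num and reg_eventid are taken from the message.

```c
#include <stddef.h>

/* Hypothetical stand-in for a libpfm4 data-register descriptor.
 * Only the field names reg_num and reg_eventid come from the thread;
 * the type itself is an assumption made for this sketch. */
typedef struct {
    unsigned int reg_num;     /* PMD register the tool has to read */
    unsigned int reg_eventid; /* index of the owning event in events_argv[] */
} pmd_desc_t;

/*
 * Scan pmds[] once and collect the register numbers tagged with event_idx.
 * At most max_out register numbers are written to out_regs.
 * Returns the total number of matching PMDs, which may exceed max_out,
 * so the caller can detect a too-small output buffer.
 */
size_t pmds_for_event(const pmd_desc_t *pmds, size_t n_pmds,
                      unsigned int event_idx,
                      unsigned int *out_regs, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < n_pmds; i++) {
        if (pmds[i].reg_eventid != event_idx)
            continue;
        if (found < max_out)
            out_regs[found] = pmds[i].reg_num;
        found++;
    }
    return found;
}
```

With the two-event example from the message, calling pmds_for_event() with event_idx 0 would yield the single register 17, and event_idx 1 would yield register 16. A multi-PMD pseudo-event would simply produce several matches for the same index, which is why the function returns a count rather than a single register.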