Hi Stephane,

On Thu June 1 2006 2:21 pm, Stephane Eranian wrote:
> On Thu, Jun 01, 2006 at 10:52:50AM -0500, Kevin Corry wrote:
> > And while we're talking about libpfm, I have some questions about the
> > existing APIs.
> >
> > First, I don't quite understand the purpose of the following APIs:
> > pfm_find_event_by_code()
> > pfm_find_event_by_code_next()
> > pfm_get_event_code()
>
> $ pfmon -icpu_clk_unhalted
> Name   : CPU_CLK_UNHALTED
> VCode  : 0x79
> Code   : 0x79
> counter: [ 0 1 ]
> Desc   : Number cycles during which the processor is not halted and not in
> a thermal trip
>
> $ pfmon -i0x79
> Name   : CPU_CLK_UNHALTED
> VCode  : 0x79
> Code   : 0x79
> counter: [ ]
> Desc   : Number cycles during which the processor is not halted and not in
> a thermal trip
>
> A user may want to see information about a particular event using its name.
> Using the event code may be useful if the name in the document and the name
> in libpfm are not quite the same.

Ok, that seems like a decent reason.

> I think we can get rid of the _next() stuff. I would also like to get rid of
> the pfm_print_event_info*() interface as well.

Sounds fine to me.

> > What's the difference between get_impl_pmds() and get_impl_counters()? On
> > i386, there are apparently 2 PMDs, but 4 counters? Are there counters
> > that don't have corresponding PMDs? If so, how do they count anything?
>
> Ah, that's a good question. On some PMU models, Itanium for instance, not
> all PMD registers are counters, some are buffers (to store addresses,
> latencies and so on). So it is important for tools to make the distinction.
> Furthermore, the number of PMC is not necessarily equal to the number of
> PMD. Again Itanium being the perfect example. But P4 is also like this, a
> counting PMD is controlled by two PMC (CCCR and ESCR).

Ok, this makes sense. But this implies that num-counters should be less than
or equal to num-PMDs. But on i386, it looks like it has 2 PMDs but 4
"counters". Actually, now that I look through more of the code, there seem to
be some discrepancies in the i386 code. pfm_i386_p6_num_counters() returns 2,
but pfm_i386_p6_get_impl_counters() sets bits for 4 counters. Is this
intentional, or maybe just a bug?

> > Why is there a get_num_counters() API, instead of simply a numerical
> > field in the pfm_pmu_support_t structure like there is for pmc_count and
> > pmd_count? Again, what's the distinction between PMCs, PMDs, and
> > counters?
>
> An omission on my part. That reminds me of something nasty about P4 and
> HyperThreading that has to transpire somehow to libpfm.

Oh, man, don't get me started. Every time I start thinking about
HyperThreading support in libpfm I wind up banging my head against the
keyboard. :)

> When HT is enabled,
> the kernel will divide the number of counters in half. As such each thread
> only has access to 8 counters.

Nine, actually.

> Take a look at
> /sys/kernel/perfmon/pmu_desc/mappings.
> To make libpfm work in this setup,
> it needs to know some of the PMCs are not available. It does not detect
> this by itself. In recent libpfm, there is a new field passed to
> dispatch_events() that indicates unavailable PMC registers. The examples in
> libpfm, and also pfmon, query /sys/kernel/perfmon/pmu_desc/mappings and
> update the pfp_unavailable_pmc bitmask accordingly. I think we could
> leverage this mechanism to let libpfm know that the P4 operates in HT mode.

That's one possibility, or at least part of the solution. I think there's
going to be more to HT support than just having half of the available
counters, though. Certain bits in the ESCRs and CCCRs are only meaningful with
HT enabled, and I think there are some PMU functions/events that may not be
available with HT enabled.

> In general for libpfm, I tend to
> have the return value be the error code, additional information is returned
> by pointers.

Ok, that's kind of what I expected.

> > pointer to the event name string stored in the private array of events,
> > but get_event_desc() strdup's the string and returns it by pointer. There
> > are
>
> Yes, I think this was motivated by the fact that it is more likely that a
> tool will want to parse the descriptor for formatting than it is to parse
> the event name. If you return the string directly, it can be modified by
> the tool and mess up the table. But again maybe that's overkill on my side.

If anything, I'd suggest changing the get_event_name() routines to strdup()
the name the way that get_event_desc() does. That way they're consistent, and
we won't have to worry about what the caller does with the strings. We just
need to note that the caller has to free() the string when done with it.

> > some routines that check that the event index passed in is valid, and
> > some routines that just assume it is valid. Is there a reason for all the
> > differences?
>
> The validity of the index should be checked in the common section of
> libpfm.
> Arch specific can assume it is valid, at least that was the intent.
> I may have missed that in some arch. Please provide functions that have the
> problem.

It looks like all of the pfm_*_print_info() routines check that the event
number is within range. I thought I had seen others, but I guess those must
have been some of the "arch-private" APIs. And since you said that the
_print_info() routines might be going away, then we probably don't need to
worry about them.

Thanks for the info!
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/
http://evms.sourceforge.net/
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
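
For reference, here is a minimal sketch of the API conventions discussed in this thread: the event index is validated once in the common layer, the return value carries the error code, and the event name is handed back through a pointer as a strdup()'d copy that the caller must free(). Every name below (demo_get_event_name(), demo_arch_event_name(), the DEMO_* codes, and the two-entry event table) is hypothetical and is not part of libpfm; it only illustrates the pattern.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes, for illustration only. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_INVAL = -1, DEMO_ERR_NOMEM = -2 };

/* Hypothetical private event table owned by the library. */
static const char *const demo_event_names[] = {
    "CPU_CLK_UNHALTED",
    "INST_RETIRED",
};
#define DEMO_EVENT_COUNT \
    (sizeof(demo_event_names) / sizeof(demo_event_names[0]))

/*
 * "Arch-specific" accessor: assumes idx has already been validated
 * by the common layer, so it does no range check of its own.
 */
static const char *demo_arch_event_name(unsigned int idx)
{
    return demo_event_names[idx];
}

/*
 * "Common" entry point: validates the arguments, returns an error code,
 * and stores a heap-allocated copy of the name in *name. The caller owns
 * the copy and must free() it, so modifying it cannot corrupt the table.
 */
int demo_get_event_name(unsigned int idx, char **name)
{
    char *copy;

    if (name == NULL || idx >= DEMO_EVENT_COUNT)
        return DEMO_ERR_INVAL;

    copy = strdup(demo_arch_event_name(idx));
    if (copy == NULL)
        return DEMO_ERR_NOMEM;

    *name = copy;
    return DEMO_SUCCESS;
}
```

A caller would pass the address of a char pointer, check the return value against DEMO_SUCCESS before touching the string, and free() the string afterwards. The trade-off is one allocation per lookup in exchange for a table that callers cannot modify.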
