Kevin,

On Thu, Jun 01, 2006 at 10:52:50AM -0500, Kevin Corry wrote:
> On Wed May 31 2006 9:58 am, Stephane Eranian wrote:
> > On Mon, May 08, 2006 at 10:11:29AM -0500, Kevin Corry wrote:
> > > Right now libpfm has the pfm_find_event_* and pfm_get_event_* APIs. My
> > > initial impression is that we could add APIs along the lines of
> > > pfm_find_event_mask_* and pfm_get_event_mask_*, which would search within
> > > one specified "parent" event for the specified mask information. And
> > > pfm_dispatch_events() and its input parameters would have to be updated
> > > to account for these event-masks. And obviously if a PMU doesn't have the
> > > notion of event-masks (like POWER4/5), then it simply doesn't need to
> > > provide any information for these new APIs. Any thoughts?
> >
> > I have been thinking about that new API too.
>
> And while we're talking about libpfm, I have some questions about the
> existing APIs.
>
> First, I don't quite understand the purpose of the following APIs:
> pfm_find_event_by_code()
> pfm_find_event_by_code_next()
> pfm_get_event_code()
>

$ pfmon -icpu_clk_unhalted
Name   : CPU_CLK_UNHALTED
VCode  : 0x79
Code   : 0x79
counter: [ 0 1 ]
Desc   : Number cycles during which the processor is not halted and not in a thermal trip
$ pfmon -i0x79
Name   : CPU_CLK_UNHALTED
VCode  : 0x79
Code   : 0x79
counter: [ ]
Desc   : Number cycles during which the processor is not halted and not in a thermal trip

A user may want to see information about a particular event using its
name. Using the event code may be useful if the name in the
documentation and the name in libpfm are not quite the same.

In any case, those examples use the pfm_print_event_info() routines. I
don't really like them, yet they were easy to implement. I would rather
see libpfm have enough interfaces to extract that information and let
the tool decide the format of the output.

The intent of pfm_find_event_by_code_next() was to allow a tool to
iterate over the events, i.e., you find the first match and then you
continue on to the next possible match. The _next() call was used to
hide from the application how event descriptors are organized, i.e.,
not necessarily contiguously. But for 3.2, I decided this was overkill,
and pfmon-3.2 now exploits the fact that the event descriptor is
simply the event index in the table. Thus it can figure out the total
number of events and then call pfm_get_event_name() or
pfm_print_event_info_byindex().

I think we can get rid of the _next() stuff. I would also like to get
rid of the pfm_print_event_info*() interface.

> Why would the caller care about the actual code for the event? That seems
> like it would be a fairly meaningless number unless you also knew exactly
> which PMC and which bit-position within the PMC to put it in. But the whole
> point of libpfm is to hide that level of detail from the caller.
>

This is mostly for information purposes and to allow users to match
events with the documentation.

> Also, the source for those APIs seems to treat the "code" as a unique number
> that can be used to identify an event. On Pentium4, that definitely isn't the
> case. Several events have the same "event-select" code, because those events
> cannot be counted on the same set of PMCs/PMDs.

The code is not unique. Look at this example on McKinley:

$ pfmon -iinst_retired
...
Name   : IA64_INST_RETIRED
VCode  : 0x8
Code   : 0x8
PMD/PMC: [ ]
Umask  : 0000
EAR    : No (N/A)
BTB    : No
MaxIncr: 6 (Threshold [0-5])
Qual   : [Instruction Address Range] [OpCode Match]
Group  : None
Set    : None
Desc   : Retired IA-64 Instructions, alias to IA64 ...

Name   : IA64_TAGGED_INST_RETIRED_IBRP2_PMC8
VCode  : 0x20008
Code   : 0x8
PMD/PMC: [ ]
Umask  : 0010
EAR    : No (N/A)
BTB    : No
MaxIncr: 6 (Threshold [0-5])
Qual   : [Instruction Address Range] [OpCode Match]
Group  : None
Set    : None
Desc   : Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 2 and opcode matcher PMC8. Code executed with PSR.is=1 is not included.

>
> Similarly, in the arch-specific module APIs, what is the difference between
> get_event_code() and get_event_vcode()? From the i386 files, it looks like
> get_event_code() returns just the "event mask" (similar to the "event select"
> value for Pentium4), and get_event_vcode() returns the opaque value for the
> whole PMC. On Pentium4, that value is going to be incomplete until you've
> chosen which pair of PMCs to use for an event and which unit-masks to use.
> Also, Pentium4 uses two PMCs for each event (one ESCR and one CCCR). Which
> PMC value should be returned for the "vcode" API?
>

The vcode (virtual code) encapsulates the actual event code (event
select), the unit mask, and other attributes of the event. For
instance, take the IA64_INST_RETIRED event above: you see Qual
(qualifier), Group, Set, EAR, and BTB. Those are attributes encoded in
the vcode. It is just a convenient way to pack additional information
about an event, and it is purely internal. As such, I think we can
easily get rid of the pfm_find_event_by_vcode() interface.

> What's the difference between get_impl_pmds() and get_impl_counters()? On
> i386, there are apparently 2 PMDs, but 4 counters? Are there counters that
> don't have corresponding PMDs? If so, how do they count anything?

Ah, that's a good question. On some PMU models, Itanium for instance,
not all PMD registers are counters; some are buffers (to store
addresses, latencies, and so on). So it is important for tools to make
the distinction. Furthermore, the number of PMCs is not necessarily
equal to the number of PMDs, Itanium again being the perfect example.
But P4 is also like this: a counting PMD is controlled by two PMCs
(CCCR and ESCR).

>
> Why is there a get_num_counters() API, instead of simply a numerical field in
> the pfm_pmu_support_t structure like there is for pmc_count and pmd_count?
> Again, what's the distinction between PMCs, PMDs, and counters?
>

An omission on my part.

That reminds me of something nasty about P4 and HyperThreading that has
to be exposed to libpfm somehow. When HT is enabled, the kernel divides
the number of counters in half. As such, each thread only has access to
8 counters. Take a look at /sys/kernel/perfmon/pmu_desc/mappings. To
make libpfm work in this setup, it needs to know that some of the PMCs
are not available. It does not detect this by itself. In recent libpfm,
there is a new field passed to dispatch_events() that indicates
unavailable PMC registers. The examples in libpfm, and also pfmon,
query /sys/kernel/perfmon/pmu_desc/mappings and update the
pfp_unavailable_pmc bitmask accordingly. I think we could leverage this
mechanism to let libpfm know that the P4 operates in HT mode.

> Another note about the arch-specific module APIs - there seems to be a lot of
> inconsistency in how parameters and return values are passed. For instance,
> get_event_code() returns the code via pointer, but get_event_vcode() simply
> returns the vcode directly. Also, get_event_code() now takes a PMD number but
> get_event_vcode() doesn't. The get_event_name() routine simply returns a

As I said, I think we can forget about the vcode() API, so you should
not worry about it; I am going to remove it.
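
To make the unavailable-PMC idea above a bit more concrete, here is a
minimal sketch of how a bitmask of unavailable registers could steer
register assignment. All names in it (sketch_pmc_mask_t,
sketch_mark_unavailable(), sketch_pick_pmc()) are hypothetical and are
not part of libpfm; the 64-register limit is an assumption made only to
keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout: this is NOT the libpfm API. */
#define SKETCH_MAX_PMCS 64u   /* assumed limit, for illustration only */

/* One bit per PMC register; a set bit means "not available to us". */
typedef struct {
    uint64_t bits;
} sketch_pmc_mask_t;

/* Mark one PMC as unavailable, e.g. after parsing the kernel's mappings. */
static void sketch_mark_unavailable(sketch_pmc_mask_t *mask, unsigned int pmc)
{
    assert(pmc < SKETCH_MAX_PMCS);
    mask->bits |= (uint64_t)1 << pmc;
}

/* Return nonzero if the PMC is in range and its bit is clear. */
static int sketch_is_available(const sketch_pmc_mask_t *mask, unsigned int pmc)
{
    return pmc < SKETCH_MAX_PMCS && !(mask->bits & ((uint64_t)1 << pmc));
}

/*
 * Pick the first candidate PMC that is still available.
 * Returns 0 and stores the register number in *chosen on success,
 * or -1 (leaving *chosen untouched) if every candidate is unavailable.
 */
static int sketch_pick_pmc(const sketch_pmc_mask_t *mask,
                           const unsigned int *candidates, unsigned int n,
                           unsigned int *chosen)
{
    for (unsigned int i = 0; i < n; i++) {
        if (sketch_is_available(mask, candidates[i])) {
            *chosen = candidates[i];
            return 0;
        }
    }
    return -1;
}
```

A tool would fill the mask once, from whatever the kernel reports, and
the assignment code would then skip the masked registers without
needing to know why they are unavailable (HT or anything else).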

In general for libpfm, I tend to have the return value be the error
code; additional information is returned through pointers.

> pointer to the event name string stored in the private array of events, but
> get_event_desc() strdup's the string and returns it by pointer. There are

Yes, I think this was motivated by the fact that a tool is more likely
to want to parse the description for formatting than to parse the event
name. If you return the string directly, it can be modified by the tool
and mess up the table. But again, maybe that's overkill on my side.

> some routines that check that the event index passed in is valid, and some
> routines that just assume it is valid. Is there a reason for all the
> differences?

The validity of the index should be checked in the common section of
libpfm. Arch-specific code can assume it is valid; at least that was
the intent. I may have missed that in some arch. Please point me to the
functions that have the problem.

Hope this helps.

--
-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
