Will,

On Fri, Oct 27, 2006 at 04:10:12PM -0400, William Cohen wrote:
> >by libpfm using those indexes. The reason for 256 is that the IA-64
> >architecture allows up to 256 PMC and 256 PMD registers to be implemented.
> >That does not necessarily translate into 256 actual registers because
> >there can be holes. If you look at the Montecito PMU you will see there
> >is a hole between PMC16-PMC32.
>
> It's okay to leave the size the same on the Itanium PFMLIB_MAX_PMCS (256)
> as on the old version of libpfm.
You mean for your backport? If so, the answer is YES.

> >>working on backporting the montecito support to the earlier version of
> >>libpfm and pfmon in RHEL4. I would like to avoid changing the data
> >>structures used by the shared libraries if possible.
> >>
> >
> >Interesting.
>
> Not sure how interesting a backport of this is. :) It is just one of the
> things on my list to do.

I think it is good to have this backported, as the earlier version is what
most people use today.

> >>Other changes noticed in data structures during the backport were:
> >>
> >>-pfmlib_input_param_t includes pfp_unavail_pmcs field
> >
> >
> >Yes, this is a new feature introduced with 3.2. The library can dispatch
> >events when you only have a partial PMU available. It will try and use
> >other registers than those with their bit set in pfp_unavail_pmcs.
> >You can probably ignore this for your backport.
>
> I took a look at the code and I see that this may be used for things that
> use the perfmon hardware for a watchdog timer.
>
Exactly! In fact, on AMD Opteron this is what happens with the latest kernel
patch when nmi_watchdog!=0. The NMI watchdog takes one counter, we use the
others. A PMU interrupt generates an NMI. If it is not for the watchdog, we
re-emit it at a lower vector for perfmon. If you try pfmon and any libpfm
examples, you will see that they automatically ignore PERFEVTSEL0.

> Does this allow a combination of systemwide and per-process counting on the
> system for perfmon itself? For example someone sets up global sampling on
> the entire system (ala OProfile) and a later program is started that uses a
> perfmon counter (or two) on the process.
>
Yes, the mechanism is a required piece of the puzzle to enable sharing of
the PMU between conflicting sessions. Another piece is a PMU register
allocator, but that is still missing at this point.

> How is something like this going to work with powerpc perfmon hardware that
> has a selector? The requestor gets all or nothing for the perfmon hardware?
>
If this is all or nothing, then someone will get it, the other won't.
Another approach would be to set some priorities between per-thread and
system-wide sessions, or between users, for accessing the PMU.

> >>-unit_masks and num_mask in pfmlib_event_t struct
> >>
> >
> >Yes, that was introduced to expose unit masks to tools. This was required
> >to cut down on complexity for the event table on PMUs such as P4 and
> >others where the number of combinations of unit masks is fairly large
> >for some events.
> >
> >This is not used on IA-64, including Montecito.
>
> There seems to be some access of unit_masks and num_mask in
> pfm_mont_dispatch_counters().
>
Ah, yes, I added this lately to support MESI qualifications without creating
too many events. I think in your case you could convert this to events,
i.e., one event per combination, or drop the MESI, i.e., force all MESI bits
to on.

--
-Stephane
_______________________________________________
perfmon mailing list
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
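
To make the pfp_unavail_pmcs idea above concrete, here is a minimal sketch in C of how a dispatcher could skip registers whose bit is set in an "unavailable" mask. This is not the libpfm code: the type sketch_regmask_t and the functions sketch_set(), sketch_isset() and sketch_dispatch() are hypothetical names made up for illustration, and the real library's assignment logic also has to honor per-event counter constraints, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PMCS 256                 /* same upper bound as discussed for IA-64 */
#define SKETCH_WORDS    (SKETCH_MAX_PMCS / 64)

/* Hypothetical bitmask, one bit per PMC index. Not libpfm's own type. */
typedef struct {
    uint64_t bits[SKETCH_WORDS];
} sketch_regmask_t;

/* Mark register r as unavailable (e.g. owned by the NMI watchdog). */
static void sketch_set(sketch_regmask_t *m, unsigned r)
{
    assert(r < SKETCH_MAX_PMCS);
    m->bits[r / 64] |= UINT64_C(1) << (r % 64);
}

/* Return 1 if register r is marked unavailable, 0 otherwise. */
static int sketch_isset(const sketch_regmask_t *m, unsigned r)
{
    assert(r < SKETCH_MAX_PMCS);
    return (int)((m->bits[r / 64] >> (r % 64)) & 1);
}

/*
 * Assign n_events events to counters drawn from `counters`, in order,
 * skipping any counter whose bit is set in `unavail`. On success the
 * chosen indexes are written to `assigned` and 0 is returned; -1 means
 * there were not enough usable counters left.
 */
static int sketch_dispatch(const unsigned *counters, unsigned n_counters,
                           const sketch_regmask_t *unavail,
                           unsigned n_events, unsigned *assigned)
{
    unsigned used = 0;

    for (unsigned i = 0; i < n_counters && used < n_events; i++) {
        if (sketch_isset(unavail, counters[i]))
            continue;                       /* taken by someone else */
        assigned[used++] = counters[i];
    }
    return used == n_events ? 0 : -1;
}
```

With four counters numbered 0 to 3 and bit 0 set in the mask (the watchdog case described above), a request for three events would land on counters 1, 2 and 3, and a request for four events would fail. The same mask could in principle describe holes in the register numbering, though that is an assumption of this sketch and not something the thread states.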
