Will,

On Fri, Oct 27, 2006 at 04:10:12PM -0400, William Cohen wrote:
> >by libpfm using those indexes. The reason for 256 is that the IA-64
> >architecture allows up to 256 PMC and 256 PMD registers to be implemented.
> >That does not necessarily translate into 256 actual registers because there
> >can be holes. If you look at the Montecito PMU you will see there is a hole
> >between PMC16-PMC32.
> 
> It's okay to leave the size the same on the Itanium PFMLIB_MAX_PMCS (256) 
> as on the old version of libpfm.

You mean for your backport? If so, the answer is YES.
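
To illustrate why the limit can stay at 256 even when far fewer registers
exist, here is a minimal sketch of a fixed-size "implemented" bitmask with
holes. This is not the actual libpfm code: the sk_* names and the
SK_MAX_PMCS constant are hypothetical, and the register ranges are supplied
by the caller rather than taken from any real PMU description.

```c
#include <stdint.h>

#define SK_MAX_PMCS 256   /* hypothetical stand-in for PFMLIB_MAX_PMCS */

/* One bit per possible PMC index; a set bit means "implemented". */
typedef struct {
    uint64_t bits[SK_MAX_PMCS / 64];
} sk_impl_mask_t;

/* Mark the inclusive range [first, last] as implemented. */
static void sk_impl_add_range(sk_impl_mask_t *m, unsigned int first,
                              unsigned int last)
{
    unsigned int r;

    for (r = first; r <= last && r < SK_MAX_PMCS; r++)
        m->bits[r / 64] |= (uint64_t)1 << (r % 64);
}

/* Return 1 if register r is implemented, 0 if it is a hole or out of range. */
static int sk_is_implemented(const sk_impl_mask_t *m, unsigned int r)
{
    return r < SK_MAX_PMCS && ((m->bits[r / 64] >> (r % 64)) & 1);
}

/* Count the registers that actually exist. */
static unsigned int sk_count_implemented(const sk_impl_mask_t *m)
{
    unsigned int r, n = 0;

    for (r = 0; r < SK_MAX_PMCS; r++)
        n += (unsigned int)sk_is_implemented(m, r);
    return n;
}
```

The structure size depends only on SK_MAX_PMCS, so adding a PMU model with a
different set of holes does not change the layout seen by users of the shared
library.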

> 
> >>working on backporting the montecito support to the earlier version of 
> >>libpfm and pfmon in RHEL4. I would like to avoid changing the data 
> >>structures used by the shared libraries if possible.
> >>
> >
> >Interesting.
> 
> Not sure how interesting a backport of this is. :) It is just one of the 
> things on my list to do.

I think it is good to have this backported as this is what most people use 
today.

> 
> >>Other changes noticed in data structures during the backport were:
> >>
> >>-pfmlib_input_param_t include pfp_unavail_pmcs field
> >
> >
> >Yes, this is a new feature introduced with 3.2. The library can dispatch
> >events when you only have a partial PMU available. It will try and use
> >other registers than those with their bit set in pfp_unavail_pmcs.
> >You can probably ignore this for your backport.
> 
> I took a look at the code and I see that this may be used for things that 
> use the perfmon hardware for a watchdog timer.
> 
Exactly! In fact, on AMD Opteron this is what happens with the latest kernel
patch when nmi_watchdog!=0. The NMI watchdog takes one counter, we use the
others. A PMU interrupt generates an NMI. If it is not for the watchdog, we
re-emit it at a lower vector for perfmon. If you try pfmon and any libpfm
examples, you will see that they automatically ignore PERFEVTSEL0.
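
The idea behind pfp_unavail_pmcs can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the library's real
dispatch code: the sketch_* names and the mask type are hypothetical (libpfm
has its own register mask type), and the policy shown is simply "take the
lowest free counter".

```c
#include <stdint.h>

#define SKETCH_MAX_PMCS 256                     /* hypothetical upper bound */
#define SKETCH_WORDS    (SKETCH_MAX_PMCS / 64)

/* Hypothetical bitmask: bit n set means PMC n must not be used. */
typedef struct {
    uint64_t bits[SKETCH_WORDS];
} sketch_regmask_t;

static void sketch_mask_set(sketch_regmask_t *m, unsigned int reg)
{
    m->bits[reg / 64] |= (uint64_t)1 << (reg % 64);
}

static int sketch_mask_isset(const sketch_regmask_t *m, unsigned int reg)
{
    return (int)((m->bits[reg / 64] >> (reg % 64)) & 1);
}

/*
 * Assign `wanted` events to the lowest-numbered counters in
 * [0, num_counters) whose bit is clear in `unavail`. The chosen register
 * indexes are stored in out[]. Returns 0 on success, -1 if there are not
 * enough free counters.
 */
static int sketch_pick_counters(const sketch_regmask_t *unavail,
                                unsigned int num_counters,
                                unsigned int wanted,
                                unsigned int *out)
{
    unsigned int reg, found = 0;

    for (reg = 0; reg < num_counters && found < wanted; reg++) {
        if (sketch_mask_isset(unavail, reg))
            continue;   /* e.g. the counter held by the NMI watchdog */
        out[found++] = reg;
    }
    return found == wanted ? 0 : -1;
}
```

With counter 0 marked unavailable on a 4-counter PMU, a request for three
events lands on counters 1, 2 and 3, and a request for four events fails.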


> Does this allow a combination of systemwide and per-process counting on the
> system for perfmon itself? For example, someone sets up global sampling on
> the entire system (a la OProfile) and later a program is started that uses a
> perfmon counter (or two) on the process.
> 
Yes, the mechanism is a required piece of the puzzle to enable sharing of the
PMU between conflicting sessions. Another piece is a PMU register allocator
but that is still missing at this point.

> How is something like this going to work with powerpc perfmon hardware that 
> has a selector? The requestor gets all or nothing for the perfmon hardware?
> 
If this is all or nothing, then someone will get it, the other won't. Another
approach would be to set some priorities between per-thread and system-wide
or between users for accessing the PMU.

> >>-unit_masks and num_mask in pfmlib_event_t struct
> >>
> >
> >Yes, that was introduced to expose unit masks to tools. This was required
> >to cut down on complexity for the event table on PMUs such as P4 and others
> >where the number of combinations of unit masks is fairly large for some
> >events.
> >
> >This is not used on IA-64, including Montecito.
> 
> There seems to be some access of unit_masks and num_mask in 
> pfm_mont_dispatch_counters().
> 
Ah, yes, I added this lately to support MESI qualifications without creating
too many events. I think in your case you could convert this to events, i.e.,
one event per combination, or drop the MESI qualification, i.e., force all
MESI bits to on.
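
The two options could look roughly like this. It is a sketch only: the
SK_MESI_* bit positions, the sketch_* helpers and the "<base>_<letters>"
naming scheme are all hypothetical and do not reflect the actual Montecito
event encoding or the libpfm event table layout.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bit assignment for the four MESI qualifier bits. */
#define SK_MESI_I   0x1u
#define SK_MESI_S   0x2u
#define SK_MESI_E   0x4u
#define SK_MESI_M   0x8u
#define SK_MESI_ALL 0xfu

/* Format "<base>_<letters>" for one MESI combination, e.g. "EV_ME". */
static void sketch_mesi_event_name(const char *base, unsigned int mesi,
                                   char *buf, size_t len)
{
    char suffix[5];
    size_t n = 0;

    if (mesi & SK_MESI_M) suffix[n++] = 'M';
    if (mesi & SK_MESI_E) suffix[n++] = 'E';
    if (mesi & SK_MESI_S) suffix[n++] = 'S';
    if (mesi & SK_MESI_I) suffix[n++] = 'I';
    suffix[n] = '\0';
    snprintf(buf, len, "%s_%s", base, suffix);
}

/*
 * Option 1: expand one base event into one table entry per non-empty
 * MESI combination. Fills names[] and masks[]; returns the entry count.
 */
static unsigned int sketch_mesi_expand(const char *base,
                                       char names[][64],
                                       unsigned int masks[])
{
    unsigned int mesi, n = 0;

    for (mesi = 1; mesi <= SK_MESI_ALL; mesi++) {
        sketch_mesi_event_name(base, mesi, names[n], 64);
        masks[n++] = mesi;
    }
    return n;
}

/* Option 2: drop the qualifier and always count all four states. */
static unsigned int sketch_mesi_force_all(unsigned int requested)
{
    (void)requested;    /* the caller's choice is deliberately ignored */
    return SK_MESI_ALL;
}
```

Option 1 costs 15 table entries per MESI-capable event but keeps the old
data structures untouched. Option 2 costs nothing in the table but loses the
ability to filter by cache line state.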

-- 

-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
