Stephane Eranian wrote:
Will,

On Wed, Oct 25, 2006 at 02:10:39PM -0400, William Cohen wrote:

Stephane Eranian wrote:

On Wed, Oct 25, 2006 at 11:56:34AM -0400, William Cohen wrote:


I am looking through the libpfm code and notice that PFMLIB_MAX_PMCS has increased from 256 (libpfm-3.0) to 512 (libpfm-3.2). Is there a specific case where the 256 wasn't large enough? Or is this assuming future perfmon hardware will need the additional space?



Yes, this is justified by the fact that the kernel interface currently
supports up to 320 registers. So 256 was too small to cover for it.
I decided to double it to be okay for a while. We could reduce it
to 320 as well.


So none of the processors currently use more than 256 registers? I am

None of them yet. On Itanium 2, we use PMC256-PMC263 to host the code and data
debug registers used with range restrictions. Those are not manipulated
by libpfm using those indexes. The reason for 256 is that the IA-64
architecture allows up to 256 PMC and 256 PMD registers to be implemented.
That does not necessarily translate into 256 actual registers because there
can be holes. If you look at the Montecito PMU you will see there is a hole between
PMC16-PMC32.
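
To make the sizing point concrete, here is a minimal sketch of a register bitmask sized for the 256-register architectural limit, with a hole in the implemented registers. The names (EX_MAX_REGS, ex_regmask_t, the ex_* helpers) are made up for illustration and are not the libpfm definitions, and the register ranges used with it are not the real Montecito map.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the libpfm definitions. */
#define EX_MAX_REGS   256   /* architectural limit discussed above */
#define EX_WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define EX_MASK_WORDS ((EX_MAX_REGS + EX_WORD_BITS - 1) / EX_WORD_BITS)

/* One bit per possible register index; the size depends only on EX_MAX_REGS. */
typedef struct {
    unsigned long bits[EX_MASK_WORDS];
} ex_regmask_t;

/* Mark register r as implemented. */
static void ex_regmask_set(ex_regmask_t *m, unsigned int r)
{
    assert(r < EX_MAX_REGS);
    m->bits[r / EX_WORD_BITS] |= 1UL << (r % EX_WORD_BITS);
}

/* Return 1 if register r is implemented, 0 otherwise. */
static int ex_regmask_isset(const ex_regmask_t *m, unsigned int r)
{
    assert(r < EX_MAX_REGS);
    return (int)((m->bits[r / EX_WORD_BITS] >> (r % EX_WORD_BITS)) & 1UL);
}

/* Mark the inclusive range first..last as implemented. */
static void ex_mark_range(ex_regmask_t *m, unsigned int first, unsigned int last)
{
    unsigned int r;
    for (r = first; r <= last; r++)
        ex_regmask_set(m, r);
}

/* Count implemented registers; with holes this is less than the highest index + 1. */
static unsigned int ex_regmask_weight(const ex_regmask_t *m)
{
    unsigned int r, n = 0;
    for (r = 0; r < EX_MAX_REGS; r++)
        n += (unsigned int)ex_regmask_isset(m, r);
    return n;
}
```

Marking 0-15 and 33-40 as implemented gives 24 registers even though the highest index is 40. The struct is 32 bytes for a 256-register limit and would double if the limit became 512, which is why changing the constant changes the layout of any structure that embeds such a mask.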

So it's okay to leave PFMLIB_MAX_PMCS on Itanium at the same size (256) as in the old version of libpfm.

working on backporting the Montecito support to the earlier version of libpfm and pfmon in RHEL4. I would like to avoid changing the data structures used by the shared libraries if possible.


Interesting.

Not sure how interesting a backport of this is. :) It is just one of the things on my list to do.

Other changes noticed in data structures during the backport were:

-pfmlib_input_param_t includes a pfp_unavail_pmcs field


Yes, this is a new feature introduced with 3.2. The library can dispatch
events when you only have a partial PMU available. It will try to use
registers other than those whose bit is set in pfp_unavail_pmcs.
You can probably ignore this for your backport.
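
As a rough sketch of the idea (not the libpfm implementation), assignment with a partial PMU amounts to skipping every register whose bit is set in an "unavailable" mask. The function name ex_pick_counters and the single 64-bit mask are assumptions made to keep the example short; the real library uses its own register mask type and per-PMU constraints.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * unavail:  bit r set means register r must not be used
 * num_regs: number of registers on this (made-up) PMU, at most 64
 * needed:   number of registers the caller wants
 * out:      receives the chosen register indexes, in ascending order
 * Returns 0 on success, -1 if too few registers remain available.
 */
static int ex_pick_counters(unsigned long long unavail, unsigned int num_regs,
                            unsigned int needed, unsigned int *out)
{
    unsigned int r, found = 0;

    assert(out != NULL);
    assert(num_regs <= 64);

    for (r = 0; r < num_regs && found < needed; r++) {
        if (unavail & (1ULL << r))
            continue;           /* register held by someone else: skip it */
        out[found++] = r;
    }
    return found == needed ? 0 : -1;
}
```

With four registers and register 0 marked unavailable, a request for two registers gets registers 1 and 2, while a request for four fails because only three remain.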

I took a look at the code and I see that this may be used for things that use the perfmon hardware for a watchdog timer.

Does this allow a combination of systemwide and per-process counting on the system for perfmon itself? For example, someone sets up global sampling on the entire system (a la OProfile) and a later program is started that uses a perfmon counter (or two) on the process.

How is something like this going to work with powerpc perfmon hardware that has a selector? The requestor gets all or nothing for the perfmon hardware?

-unit_masks and num_mask in pfmlib_event_t struct


Yes, that was introduced to expose unit masks to tools. This was required to
cut down on complexity for the event table on PMUs such as P4 and others where
the number of combinations of unit masks is fairly large for some events.

This is not used on IA-64, including Montecito.
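
To illustrate why separate unit masks shrink the event table, here is a small sketch in which an event lists its unit masks once and a request selects them by index. All type and function names (ex_event_desc_t, ex_event_req_t, ex_combine_umasks, ex_flat_entries) are hypothetical and only loosely modeled on the unit_masks field mentioned above; the mask values are invented.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_UMASKS 8   /* hypothetical per-event limit */

/* One selectable unit mask of an event. */
typedef struct {
    const char  *name;
    unsigned int code;    /* bits OR-ed into the event's unit-mask field */
} ex_umask_t;

/* Event table entry: the unit masks are listed once, not per combination. */
typedef struct {
    const char  *name;
    unsigned int code;
    unsigned int num_umasks;
    ex_umask_t   umasks[EX_MAX_UMASKS];
} ex_event_desc_t;

/* What a tool passes in: indexes into the event's umasks[] array. */
typedef struct {
    unsigned int unit_masks[EX_MAX_UMASKS];
    unsigned int num_masks;
} ex_event_req_t;

/* OR together the selected unit masks; returns -1 on a bad index or count. */
static int ex_combine_umasks(const ex_event_desc_t *d, const ex_event_req_t *r,
                             unsigned int *out)
{
    unsigned int i, v = 0;

    assert(d != NULL && r != NULL && out != NULL);
    if (r->num_masks > EX_MAX_UMASKS)
        return -1;
    for (i = 0; i < r->num_masks; i++) {
        if (r->unit_masks[i] >= d->num_umasks)
            return -1;
        v |= d->umasks[r->unit_masks[i]].code;
    }
    *out = v;
    return 0;
}

/* Entries needed if every non-empty combination were its own table entry. */
static unsigned long ex_flat_entries(unsigned int num_umasks)
{
    assert(num_umasks < sizeof(unsigned long) * 8);
    return (1UL << num_umasks) - 1;
}
```

An event with four unit masks needs one table entry plus four mask descriptions this way, against 15 entries if every combination were spelled out as a separate event name.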

There seems to be some access of unit_masks and num_mask in pfm_mont_dispatch_counters().

-Will

_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/