Phil,
OK, lots of things in this e-mail.
On Tue, Aug 29, 2006 at 08:58:59PM +0200, Philip Mucci wrote:
> Hi Stephane,
>
> I'm not sure what you mean about lists or switching on the fly. PAPI
> eventsets contain everything needed to start and stop PMU state. For a
> given thread, only 1 PAPI eventset can be running at a given time. A
> user is free to create as many eventsets as he wants and then start/stop
> them whenever subject to the above limitation. Each PAPI eventset may
> map to many counters requiring more than 1 PerfMon2 eventset which means
> I need to do a create_eventset, load_context upon every start.
>
Ok, so you are saying we have two distinct layers here, the PAPI eventsets
and the perfmon eventsets. At the PAPI level, the user wants to multiplex
PAPI sets such that there is only one active at a time.
Perfmon receives a PAPI event set, and each set may have many more events
than the machine can actually measure at once. It seems the limit
is set by PAPI and not the underlying HW. In other words, you probably
set it to the same constant value for all architectures. That's nice
for your users but then you have to deal with the mapping.
Anyway, at the perfmon layer you have to take each PAPI set and figure out
if it can be mapped onto a single perfmon set. The number of counters in
a perfmon set is determined by the underlying PMU because each set
encapsulates the full PMU state. On P6, you have 2 counters per set and
on Montecito, you'll have 12. You have to solve the following issue:
- how to distribute the events inside the PAPI set into possibly
multiple perfmon sets?
The answer is: it depends on the PMU, the events and their number.
The libpfm library can help you somewhat but it does not solve the whole
issue for you. But I agree that would be a nice thing to have and it does
not have to be tied to the perfmon interface. You could give a list of
events and it would return events in multiple "groups" if they could not
be measured together. I need to think about this.
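
To make the grouping idea concrete, here is a minimal sketch. It assumes
the simplest possible case: any event can go on any counter, so the only
limit is the number of counters per set. Real PMUs have event and counter
constraints, so this is not what libpfm does; the function name
split_into_groups is hypothetical.

```python
def split_into_groups(events, num_counters):
    """Greedy split: fill each group with up to num_counters events, in order.

    Assumption: no per-event counter constraints, only a counter count limit.
    """
    if num_counters < 1:
        raise ValueError("num_counters must be at least 1")
    return [events[i:i + num_counters]
            for i in range(0, len(events), num_counters)]


# Four events on a PMU with two counters end up in two groups.
groups = split_into_groups(["ev1", "ev2", "ev3", "ev4"], 2)
```

With real constraints the split would have to be checked group by group
against what the PMU accepts, which is the part that depends on the PMU.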
But back to your problem. Suppose you have to use multiple event sets
to support a single PAPI set. Then you create your context, create
the sets, and program them. Those sets will be multiplexed by perfmon
in a round-robin fashion. But, if I understand the problem correctly,
you are saying that this is all well and good for one PAPI set. But what
about the other PAPI sets?
First of all I don't think you need to create multiple contexts, i.e.,
one per PAPI set, to support what you want to do. You only need one
context.
Second, the PAPI interface lets a user multiplex PAPI sets. The mapping
from PAPI sets to perfmon sets is invisible to PAPI users. In practice,
you could build a perfmon event set list that would contain all the sets
to support the PAPI sets. I think it is best if I give an example.
PAPI set1 (paset1): ev1 ev2 ev3 ev4
PAPI set2 (paset2): ev5 ev6 ev7 ev8
Suppose you run on P6 with two counters and that a valid assignment to perfmon
sets would be:
Perfmon set0 (peset0) : ev1 ev2
Perfmon set1 (peset1) : ev3 ev4
Perfmon set2 (peset2) : ev5 ev6
Perfmon set3 (peset3) : ev7 ev8
All 4 perfmon sets would be multiplexed all together. Internally PAPI would
maintain the mapping, so it would know which registers and sets to query to
retrieve the counters for each PAPI set.
The sequence would be prepared all at once, then you load and you start.
There is no need to stop, change the setup, and restart.
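
The bookkeeping for the example above could look roughly like the sketch
below. It only models the mapping table and makes no perfmon or PAPI
calls. The name build_mapping and the (set id, counter slot) layout are
assumptions for illustration, and events are assigned to sets in order
with no PMU constraints taken into account.

```python
def build_mapping(papi_sets, num_counters):
    """Return (perfmon_sets, mapping) for all PAPI sets at once.

    perfmon_sets: list of event lists; the list index is the perfmon set id.
    mapping: PAPI set name -> {event: (perfmon set id, counter slot)}.
    Assumption: events are packed in order, num_counters per perfmon set.
    """
    perfmon_sets = []
    mapping = {}
    for name, events in papi_sets.items():
        locations = {}
        for i in range(0, len(events), num_counters):
            set_id = len(perfmon_sets)          # next free perfmon set id
            chunk = events[i:i + num_counters]
            perfmon_sets.append(chunk)
            for slot, ev in enumerate(chunk):
                locations[ev] = (set_id, slot)  # where to read this event
        mapping[name] = locations
    return perfmon_sets, mapping


# The two PAPI sets from the example, on a PMU with two counters (P6).
papi_sets = {
    "paset1": ["ev1", "ev2", "ev3", "ev4"],
    "paset2": ["ev5", "ev6", "ev7", "ev8"],
}
perfmon_sets, mapping = build_mapping(papi_sets, num_counters=2)
```

Here perfmon_sets holds the four sets peset0 to peset3 in order, and
mapping["paset2"] tells PAPI which sets and counter slots to query for
ev5 to ev8.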
Would that be a satisfactory solution?
I'll wait for your answer before I respond to the rest to make the
message more manageable in size.
--
-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/