Re: [perfmon] Re: Multiplexing and load_context in PerfMon2 and PAPI

Philip Mucci Wed, 30 Aug 2006 04:18:50 -0700

Ok, this seems to make a bit of sense. Basically, I'll be doing a
create_eventsets for as many eventsets as I think I'll ever use. After
that I can do a load_context. The only time I need to redo it is when I
create a new context, i.e. for the attach/detach case.


BTW, I just checked a working perfmon2 snapshot into the papi-3-2-0
tree. If you download it, run ./configure, change the flags to -g
-DDEBUG in the makefile, build it, and then run the profile case with
the 'PAPI_DEBUG=SUBSTRATE' env variable set, you'll see the
out-of-memory error from create_context. There's lots of output...

Phil

On Tue, 2006-08-29 at 14:53 -0700, Stephane Eranian wrote:
> Phil,
> 
> Ok lots of things in this E-mail.
> 
> On Tue, Aug 29, 2006 at 08:58:59PM +0200, Philip Mucci wrote:
> > Hi Stephane,
> > 
> > I'm not sure what you mean about lists or switching on the fly. PAPI
> > eventsets contain everything needed to start and stop PMU state. For a
> > given thread, only 1 PAPI eventset can be running at a given time. A
> > user is free to create as many eventsets as he wants and then start/stop
> > them whenever subject to the above limitation. Each PAPI eventset may
> > map to many counters requiring more than 1 PerfMon2 eventset which means
> > I need to do a create_eventset, load_context upon every start. 
> > 
> Ok, so you are saying we have two distinct layers here, the PAPI eventsets
> and the perfmon eventsets.  At the PAPI level, the user wants to multiplex
> PAPI sets such that there is only one active at a time.
> 
> Perfmon receives a PAPI event set, each set may have way more events
> than what the machine can actually measure at once.  It seems the limit
> is set by PAPI and not the underlying HW. In other words, you probably
> set it to the same constant value for all architectures. That's nice
> to your users but then you have to deal with the mapping.
> 
> Anyway, at the perfmon layer you have to take each PAPI set and figure out
> if it can be mapped onto a single perfmon set. The number of counters in
> a perfmon set is determined by the underlying PMU because each set
> encapsulates the full PMU state. On P6, you have 2 counters per set and
> on Montecito, you'll have 12. You have to solve the following issue:
> 
>       - how to distribute the events inside the PAPI set into possibly
>         multiple perfmon sets?
> 
> The answer is : it depends on the PMU, the events and their number.
> 
> The libpfm library can help you somewhat but it does not solve the whole
> issue for you. But I agree that would be a nice thing to have and it does
> not have to be tied to the perfmon interface. You could give a list of
> events and it would return events in multiple "groups" if they could not
> be measured together. I need to think about this.
> 
> But back to your problem. Supposing you have to use multiple event sets
> to support a single PAPI set. Then you create your context, create
> the sets, and program them. Those sets will be multiplexed by perfmon
> in a round-robin fashion. But, if I understand the problem correctly,
> you are saying that this is all well and good for one PAPI set. But what
> about the other PAPI sets?
> 
> 
> First of all I don't think you need to create multiple contexts, i.e.,
> one per PAPI set, to support what you want to do. You only need one
> context. 
> 
> Second, the PAPI interface lets a user multiplex PAPI sets. The mapping
> from PAPI sets to perfmon sets is invisible to PAPI users. In practice,
> you could build a perfmon event set list that would contain all the sets
> to support the PAPI sets. I think it is best if I give an example.
> 
> PAPI set1 (paset1): ev1 ev2 ev3 ev4
> PAPI set2 (paset2): ev5 ev6 ev7 ev8
> 
> Suppose you run on P6 with two counters and that a valid assignment to perfmon
> sets would be:
> 
> Perfmon set0 (peset0) : ev1 ev2
> Perfmon set1 (peset1) : ev3 ev4
> Perfmon set2 (peset2) : ev5 ev6
> Perfmon set3 (peset3) : ev7 ev8
> 
> All 4 perfmon sets would be multiplexed all together. Internally PAPI would
> maintain the mapping, so it would know which registers and sets to query to
> retrieve the counters for each PAPI set.
> 
> The sequence would be prepared all at once, then you load and you start.
> There is no need to stop, change the setup, and restart.
> 
> Would that be a satisfactory solution?
> 
> 
> I'll wait for your answer before I respond to the rest to make the
> message more manageable in size.
> 
> --
> -Stephane
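
The paset/peset bookkeeping in the quoted example can be sketched roughly
as follows. This is only an illustration under a strong assumption: events
are split in order into chunks of counters_per_set, as if any events could
share a set. As the quoted message says, the real assignment depends on the
PMU, the events and their number, and this sketch ignores those constraints.
The function name split_into_perfmon_sets and the (set, slot, event) tuple
layout are hypothetical; they are not part of PAPI, libpfm, or perfmon2.

```python
from typing import Dict, List, Tuple


def split_into_perfmon_sets(
    papi_sets: Dict[str, List[str]], counters_per_set: int
) -> Tuple[List[List[str]], Dict[str, List[Tuple[int, int, str]]]]:
    """Split each PAPI set into consecutive perfmon sets (hypothetical helper).

    ASSUMPTION: any events may be placed in the same perfmon set. Real PMUs
    impose constraints that this naive in-order chunking does not model.
    """
    if counters_per_set < 1:
        raise ValueError("counters_per_set must be at least 1")

    perfmon_sets: List[List[str]] = []  # index = perfmon set id
    mapping: Dict[str, List[Tuple[int, int, str]]] = {}

    for name, events in papi_sets.items():
        mapping[name] = []
        # Walk the PAPI set in chunks that fit the number of counters.
        for start in range(0, len(events), counters_per_set):
            chunk = events[start:start + counters_per_set]
            set_id = len(perfmon_sets)
            perfmon_sets.append(chunk)
            # Remember where each event landed so counts can be read back
            # per PAPI set later.
            for slot, event in enumerate(chunk):
                mapping[name].append((set_id, slot, event))

    return perfmon_sets, mapping


# The two PAPI sets from the example, on a PMU with two counters per set.
papi_sets = {
    "paset1": ["ev1", "ev2", "ev3", "ev4"],
    "paset2": ["ev5", "ev6", "ev7", "ev8"],
}
perfmon_sets, mapping = split_into_perfmon_sets(papi_sets, counters_per_set=2)
```

With two counters per set this reproduces the peset0..peset3 layout from the
example, and mapping["paset2"] lists which perfmon set and slot to query for
each of ev5..ev8. All sets are built once, before any load or start.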

_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
