Hi Stephane,

I'm not sure what you mean about lists or switching on the fly. PAPI eventsets contain everything needed to start and stop PMU state. For a given thread, only 1 PAPI eventset can be running at a given time. A user is free to create as many eventsets as he wants and then start/stop them whenever, subject to the above limitation. Each PAPI eventset may map to many counters, requiring more than 1 PerfMon2 eventset, which means I need to do a create_eventset and a load_context upon every start.
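To make the start/stop path concrete, here is a toy model of what I mean. It is only a sketch in Python, not PAPI or perfmon2 code: the strings are just labels for the steps, ToyEventSet and ToyThread are made-up names, COUNTERS_PER_SET is an invented limit, and the unload step on stop is my assumption (it follows from not being able to create eventsets while the context is loaded).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption for illustration: how many events fit in one perfmon2 event set.
COUNTERS_PER_SET = 4


@dataclass
class ToyEventSet:
    """Hypothetical stand-in for a PAPI eventset."""
    events: List[str]
    sampling: bool = False


@dataclass
class ToyThread:
    """Hypothetical per-thread state: at most one eventset runs at a time."""
    running: Optional[ToyEventSet] = None
    calls: List[str] = field(default_factory=list)  # log of simulated steps

    def start(self, es: ToyEventSet) -> None:
        if self.running is not None:
            raise RuntimeError("only one eventset may run per thread")
        if es.sampling:
            # Sampling is configured when the context is created,
            # so a sampling eventset needs a fresh context on every start.
            self.calls.append("create_context")
        # One PAPI eventset may need several perfmon2 event sets (ceiling division).
        n_sets = -(-len(es.events) // COUNTERS_PER_SET)
        self.calls.extend(["create_evtsets"] * n_sets)
        self.calls.extend(["load_context", "start"])
        self.running = es

    def stop(self) -> None:
        if self.running is None:
            raise RuntimeError("nothing is running")
        # Assumed: the context is unloaded so the next start can create event sets.
        self.calls.extend(["stop", "unload_context"])
        if self.running.sampling:
            self.calls.append("close")  # close(fd) on the sampling context
        self.running = None


if __name__ == "__main__":
    thread = ToyThread()
    thread.start(ToyEventSet(["e1", "e2", "e3", "e4", "e5"]))
    thread.stop()
    print(thread.calls)
```

The point of the model is only that every start replays the setup steps, and that a second start on the same thread is refused until the first eventset is stopped.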
Today I discovered something even "worse" (as far as PAPI goes). I implemented sampling for PAPI profiling, and I discovered that sampling has to be set up at create_context time. This means that for PAPI eventsets that sample, I have to do a create_context inside start and a close(fd) inside stop. Basically, for PAPI start and stop to work, I have to run through the entire Perfmon API. ;-)

PAPI_start and PAPI_stop do not have to be fast... we already say they are slow because they usually mean a system call. Users are supposed to use read(). However, in the perfmon2 case, it will be the slowest of all implementations. I'm not dogging perfmon2 here... I'm just saying that it doesn't fit the semantics of PAPI (PAPI has been this way for a while; I make no claims that it's good or bad). But clearly in this case, it doesn't match the usage model of PFM.

Is there any way to provide ioctl()s on the FD to do the things that create_context and create_eventsets do, so I don't always have to jump through all the hoops? I don't think this is super important, just something to be aware of. PAPI can survive as it is and be optimized later.

Now for some other points about sampling.

1) It would be great if, when sampling, the PFM_OVFL_MSG contained a pointer to the sample header. When self-sampling multiple threads, there's no such thing as global variables (which are used all throughout the test suite).

2) It would also be great if the sample header contained the number of sampled pmds (also a global in the test cases).

The above two would remove the need for a hash lookup function and would make the signal handler fully self-contained when processing sample entries.

Now for the bad news (really, I'm sorry about this one...)

- BAD: Sampling works great in PAPI up until the 3rd cycle of create_context() with 4*getpagesize() sample entries. The fourth one always returns the dreaded 'not supported' error message, and errno is set to ENOMEM.
  I have munmap()'d the buffer and close()d the context file descriptor between each incantation. This is in PAPI's profile test case, which runs profiling a bunch of times (and inits and shuts down PAPI each time). Would you like to see the DEBUG log? (It's big.)

- WORSE: I can hard lock my i386 2.7.17.10 kernel by running task_smpl_user on emacs and hitting Ctrl-C, about 1 out of every 5 times. There is no oops, nothing... just a hard lock.

The good news:

- PAPI for Perfmon is well on its way to having full support. I believe it will break horribly on the PIV due to the pmc/pmd mapping issues (and trying to figure out the final offset into the pmd structure with multiplexing is even worse...)

Thanks for listening. We're almost home...

Phil

P.S. Have you had any requests for pfm_dispatch_events to be able to dispatch events with multiplexing enabled? That would simplify things greatly. I am not confident in the ability of the code to get the resulting offsets of the final PD structure correct... especially in light of PIV-like beasties.

On Tue, 2006-08-29 at 07:45 -0700, Stephane Eranian wrote:
> Phil,
>
> On Mon, Aug 28, 2006 at 10:07:11PM +0000, Philip Mucci wrote:
> >
> > Today I got kernel multiplexing with Perfmon2 working in PAPI. All tests
> > in PAPI are passing at this juncture. However, I must say that
> > implementing multiplexing was somewhat painful. Before I get on the
> > soapbox about that, there is a more serious issue (I think).
> >
> > You can't do anything related to eventsets while the context is loaded.
> > Every time I tried to do a create_evtsets after a load_context I would
> > get a 'not supported' error from Perfmon.
> >
> Yes, this is the expected behavior.
>
> > PAPI has a few function at the low level, in short they can be referred
> > to as:
> >
> > init_control_state
> > update_control_state
> > start
> > read
> > stop
> >
> > and of course, init/tear down/option handling routines.
> > PAPI can have multiple eventsets, even though only 1 can be running at
> > any given time (unless you are attached to another process, which I
> > have also implemented)
> >
>
> There is something confusing about your PAPI description of eventsets.
> Before I can comment, you need to describe this a bit more.
>
> Are you saying that PAPI can manage multiple lists of distinct events sets?
> For instance:
> L1 = set1, set2, set3
> L2 = set1, set2, set3, set4
>
> Where setX encapsulates the full PMU state (i.e., all accessible registers).
> And you want to start with L1 and then switch to L2 on the fly?
>
> Am I getting this right?
>
> --
> -Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
