Hi Phil,

You make some very good points about native events and masks being both hardware and software platform specific. Initially I was completely sold on this argument. But as I said before, I also believe Vince's point is a desirable goal: PAPI should provide a consistent way to do "something" across all components where that "something" makes sense.

The new cpu mask I am working on adding clearly appears to the user as a hardware event mask. So if this mask exists and is supported by some components and not others, that does not bother me very much. But if there are multiple components that all provide the ability to count events for a specific cpu or package, it should be a goal of PAPI to provide a consistent way for the user to specify that information.

This is an area that could have been handled better in previous versions of PAPI (isn't hindsight wonderful). In current releases the core and uncore components require that the cpu/package be identified using the setopt API call (eventset level control). But the RAPL component encodes the package to be counted in the event names. So the RAPL component has already moved the package specification to the event level, but did it in a way that does not scale very well to components with a large set of events and masks (like uncore).

I think that we all agree that it is an improvement to be able to specify the cpu/package information as a mask on an event, especially for uncore, where this feature provides capabilities which the user does not currently have. I think that we also mostly agree that if we can make this work for the perf_events and perf_events_uncore components (linux core and uncore events), then we will have addressed the issue for the vast majority of existing PAPI users. We will also have created an approach which other components can follow when collecting cpu/package information for events.

In my opinion this does not replace the setopt API interface for cpu attach. It just provides an additional way for users to provide cpu/package information at a finer level of control. In fact the code could easily be made to support the use of both interfaces at the same time. If a setopt cpu attach was done, then any events which did not specify a cpu mask would get the eventset's cpu number.
But any event that had a cpu mask would use the cpu number provided with the mask.

So I think that consistency should be the goal, and whenever we make changes to PAPI we should try to get closer to a consistent user interface. But I completely agree that we cannot stop improving some parts of PAPI just because we do not have the resources to go back and rework some other, less used parts.

BTW: I have a modified version of PAPI which uses the libpfm4 extended events recommended by Stephane. This version seems to be working correctly when adding and counting events (both with and without a cpu mask). I have also made changes to the PAPI list code to separate it from the code that adds events. The list code is now about 80% working. It seems to walk event lists properly and shows the event names and descriptions properly. Showing event masks and groups is not yet working. For both adding events and listing events, the debug traces are much easier to follow because there is far less code being executed than before.

Gary

> -----Original Message-----
> From: Philip Mucci [mailto:mu...@icl.utk.edu]
> Sent: Friday, April 25, 2014 11:23 AM
> To: Gary Mohr
> Cc: Vince Weaver; Stephane Eranian; Heike McCraw; <perfapi-de...@eecs.utk.edu>; perfmon2-devel
> Subject: Re: [perfmon2] [Perfapi-devel] FW: Proposed enhancement to libpfm4.
>
> Hi guys,
>
> I disagree - let me try to make my case.
>
> Native events are exactly that - native. The definition of a native event is that they are BOUND to both a hardware + software platform. There has never been a guarantee that a native event works on two different OSes for the same hardware. The cpu=N umask is no different than any other native event qualifier or umask - it has no meaning on BG, BSD or anywhere else. There are other umasks like this already present in libpfm. So my claim is that cpu=N is a platform specific umask and should be exposed as such.
>
> Next, let's talk about the implication that there is a conflict between setopt and umasks. First, set opt is an operation that works on the entire event set, i.e. all events. Second, it blindly assumes that all events support that operation - or assumes the substrate can tell the difference and return a valid error. Most often it does not, but if you are using this function, it's because you know something about the platform and the events you are counting. However, these functions are compatible - consider the set opt implementation that dispatches to a libpfm/perf substrate, which in turn just removes all the events from the event set, appends the "platform specific" umask to each event string, and adds the events back into the event set. This same function also sets an event set flag, marking the umask such that any subsequent event gets the umask appended to it (cpu, domain, etc...)
>
> All the above technical arguments aside, my opinion is that Linux+perf+libpfm forms 98% of our user base - and to deny a piece of functionality on the basis of that 2% is being pedantic at best.
>
> Anyways, thanks for the debate guys - whatever we end up doing, I hope it'll be more useful than what we have now. At a minimum, let me suggest that we develop a patch guarded by a configure flag and #ifdef that can get this working for people that need it. We do not need to make it the default.
>
> Phil
>
>
> On Apr 25, 2014, at 11:25 AM, Gary Mohr <gary.m...@bull.com> wrote:
>
> > Vince, Phil,
> >
> > I kind of agree with what Vince is saying in this discussion. One of the principles that papi was built to support is the idea that the same approach can be used for all platforms supported by papi. And this idea has value.
> >
> > The existing setopt interface will continue to exist for all supported platforms. The setopt interface specifies the cpu to be used at the event set level (all events in an event set).
> > I added the support for cpu attach to setopt a couple of years ago and at the time it was supported only by what is now the perf_events component. I do not know if it was ever extended to also work with BlueGene and I am pretty sure it was not used to specify the package used in the rapl component (rapl replicated event names for each package). So it seems to me like papi is already in a position where the setopt version of cpu attach only works with some components (probably only perf_events and perf_events_uncore). To get the consistency mentioned above some work would have to be done in the components that currently do not use the cpu number specified with setopt.
> >
> > The work I am doing now to add a "cpu=x" mask is creating an additional way to specify the cpu number. But it is also allowing users to specify the cpu number to use at the event level rather than the event set level. So this new approach provides a capability that does not currently exist in the setopt interface for cpu attach. The new approach is very useful in the uncore component but as pointed out would also be nice to have in other components. Just like the setopt interface, to get consistency across all supported platforms, it will require work in all of the components where it makes sense to support it.
> >
> > So I think Vince makes a good point but I think we already have the situation he is concerned about. The new cpu mask will also have this limitation once it is working on the perf_events and perf_events_uncore components. But the solution for this situation is not to stop building things that are useful on the linux platform. The solution should be to add support to the other components to support the same capabilities. I know this is easier said than done but please do not limit what is provided for one component because other components will not support it.
> > Instead I view these changes (both the setopt and cpu mask features) as a blueprint of what would be nice to have and an example of how to do it. To get it done on the other components, someone will have to step up and make the changes in the other components to make it happen.
> >
> > For my part right now, I am willing to do the work to support cpu masks in both the perf_event and perf_event_uncore components. I could envision that possibly in the future we may be willing to convert the rapl component to use this model. But I have no use for BlueGene so someone else will need to implement the changes being added for the linux platform if it is to become consistent. It may require contributions from the papi community users who have access to BlueGene and care about being able to use these capabilities.
> >
> > Well that is my two cents.
> > Gary
> >
> >
> >> -----Original Message-----
> >> From: Vince Weaver [mailto:vincent.wea...@maine.edu]
> >> Sent: Friday, April 25, 2014 8:03 AM
> >> To: Philip Mucci
> >> Cc: Vince Weaver; Gary Mohr; Stephane Eranian; Heike McCraw; <perfapi-de...@eecs.utk.edu>; perfmon2-devel
> >> Subject: Re: [perfmon2] [Perfapi-devel] FW: Proposed enhancement to libpfm4.
> >>
> >> On Thu, 24 Apr 2014, Philip Mucci wrote:
> >>
> >>> But Vince, are you implying that we should kinda 'discourage' using these events? I think we want people (experts like Gary) to be able to measure an uncore event with all information. As it stands, understanding events at that level falls on the user. Stephane shot down the idea of libpfm changing - unfortunately, that leaves us to find a workaround in PAPI. I don't think we can throw out a valuable solution just because it covers 99% (rather than 100) of the installed user base. I want to understand your proposal for the solution, what are your thoughts?
> >>
> >> I'm just saying maintenance-wise it's going to be a pain if we have two different ways of doing the same thing. Just looking at the bigger picture.
> >>
> >> Ideally we should be able to use the CPU= flag on any event, not just uncore. But to do that properly PAPI should be handling the flag, not libpfm4.
> >>
> >> Otherwise you end up with the problem where Linux/libpfm4 users can just do something like
> >>     UNHALTED_CORE_CYCLES:cpu=0
> >>
> >> but the interface is completely different if you're, say, running on BlueGene, etc., and you have to go through the whole setopt interface.
> >>
> >> It means the whole cross-platform idea of PAPI falls apart because you might as well be using two different libraries, PAPI and PAPI-libpfm4, which have different ways of doing things and will require different #ifdefs in your code depending on what OS you're running your code on.
> >>
> >> So yes, we can throw together some sort of short-term hack that gets uncore working a little easier right now (it's perfectly possible to get uncore results even without the cpu=0 patches, you just need to use CPU attach like the example code shows). It's just going to add a long-term maintenance pain to the rest of the library.
> >>
> >> Vince
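
For readers following the thread, here is a minimal, language-neutral sketch of the two behaviours being debated, written in Python purely for illustration. It is not PAPI or libpfm4 code. The function names (split_cpu_qualifier, effective_cpu, attach_cpu_to_all) are hypothetical, and the parsing assumes the "cpu=N" qualifier is a colon-separated field as in the UNHALTED_CORE_CYCLES:cpu=0 example above. The first two functions model the precedence Gary describes (a per-event cpu mask wins, otherwise the event set's cpu from setopt applies). The third models the setopt-as-umask rewrite Phil outlines, with the added assumption that events already carrying a cpu mask are left alone.

```python
from typing import List, Optional, Tuple


def split_cpu_qualifier(event: str) -> Tuple[str, Optional[int]]:
    """Split 'NAME:mask:cpu=N' into ('NAME:mask', N).

    N is None when the event string carries no cpu qualifier.
    (Hypothetical helper; the field syntax is an assumption.)
    """
    kept = []
    cpu = None
    for part in event.split(":"):
        if part.lower().startswith("cpu="):
            cpu = int(part[4:])
        else:
            kept.append(part)
    return ":".join(kept), cpu


def effective_cpu(event: str, eventset_cpu: Optional[int]) -> Optional[int]:
    """Per-event cpu mask wins; otherwise fall back to the event set's cpu.

    eventset_cpu stands in for a cpu number set at event set level
    (None means no cpu attach was done on the event set).
    """
    _, cpu = split_cpu_qualifier(event)
    return cpu if cpu is not None else eventset_cpu


def attach_cpu_to_all(events: List[str], cpu: int) -> List[str]:
    """Model of an event-set-wide attach expressed as a umask rewrite.

    Appends 'cpu=N' to every event string that does not already have
    a cpu qualifier (leaving existing qualifiers alone is an assumption).
    """
    rewritten = []
    for ev in events:
        _, own_cpu = split_cpu_qualifier(ev)
        rewritten.append(ev if own_cpu is not None else f"{ev}:cpu={cpu}")
    return rewritten


events = ["UNHALTED_CORE_CYCLES:cpu=0", "UNHALTED_CORE_CYCLES"]
print([effective_cpu(e, 2) for e in events])  # [0, 2]
print(attach_cpu_to_all(events, 2))
```

Under these assumptions the two interfaces do not conflict: the event set value acts only as a default, so a platform without cpu masks could still honor the event set level setting.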