Also sending this to the perfmon mailing list.

From: Gary Mohr
Sent: Wednesday, April 09, 2014 4:56 PM
To: Stephane Eranian
Cc: Vince Weaver; Philip Mucci; Heike McCraw; Michel Brown
Subject: Proposed enhancement to libpfm4.

Hi Stephane,

There has been quite a bit of discussion in the PAPI community lately regarding 
ways to make the PAPI uncore component useful to existing PAPI applications.

A short description of the problem:

The kernel requires a cpu number to be provided on the open call when setting
up to count uncore events (the kernel uses it to pick the package/socket to
count).
PAPI currently provides a way to set a cpu number, but it requires a call to
PAPI_set_opt, which existing PAPI applications written for core events almost
never make.
This means that existing PAPI applications cannot use uncore events without
coding changes.

Possible solution:

Change the uncore event string to include information that specifies the core
number that should be passed to the kernel for this event.
PAPI applications normally get the events to use from a user or config file, so
they would have access to uncore events if the user just adds a little extra
information to the event string.

Two approaches were considered:

1 -- The event name could be extended to include a package component.  This 
would result in the event names being replicated once for each package on the 
system.
2 -- A new event mask could be added to provide the number of the core which
should be passed to the kernel for the event.

Since the SNBEP system already has 315 uncore events, replicating them for each 
package could lead to over 1200 different event names.  The current list output 
for uncore events on this system produces 6,000+ lines of output.  Replicating 
each event could drive that to about 24,000 lines of output.  This makes the 
first approach less than desirable.

A new mask for the uncore events could be added to identify which core number 
should be passed to the kernel.  But this information is needed by PAPI and 
does not end up in the attribute structure built by libpfm4 and passed to the 
kernel by PAPI.  This means that we would be introducing a mask that should be 
processed by PAPI and not libpfm4.  The new mask approach would have no effect 
on the number of events and little or no effect on the list output.  So it 
seemed to be the preferred approach.

In addition, during these discussions it was felt that a small number of other
PAPI attributes could also be handled with PAPI-specific event masks rather
than through independent API calls (as is required today).  This encouraged
looking for a general solution.

Two different approaches for adding a mask have been considered:

1 -- Modify PAPI to prescan the event strings to remove and process the new
mask.
2 -- Enhance libpfm4 to allow event strings which contain masks it does not
know about.

The first approach can probably be done, but there is some concern that if PAPI
prescans and removes some of the event masks, it may remove masks that would
have been meaningful to libpfm4.  This would be undesirable but could be
avoided with carefully chosen PAPI mask names.

The idea behind the second approach is to add a feature to libpfm4 which would
allow PAPI to pass an event string which contains some masks which libpfm4 may
not understand.  When this is done, libpfm4 would be able to return a table to
the caller which contains the masks libpfm4 did not recognize.  When using
this new feature, libpfm4 would not consider an unknown mask as an error.  It
would just return unprocessed masks to the caller and let the caller decide if
those masks were valid.  This provides PAPI with an easy way to extend the set
of event masks an application can use.  Of course when this new feature is not
being used, libpfm4 would continue to behave exactly as it has in the past.
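
The caller's half of this scheme could look roughly like the sketch below:
given the table of masks that were handed back unprocessed, the caller accepts
the ones it owns and rejects everything else. The names struct extra_attrs and
check_unprocessed_masks, the array-of-strings table layout, and the "cpu=N"
spelling are assumptions made for illustration; they are not taken from PAPI
or from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for attributes the caller (e.g. PAPI) owns. */
struct extra_attrs {
    int cpu;                 /* -1 means "not specified" */
};

/* Hypothetical caller-side check of the masks the library did not process.
 * Returns 0 if every mask is one the caller understands, -1 otherwise. */
int check_unprocessed_masks(const char *const masks[], size_t n,
                            struct extra_attrs *out)
{
    assert(out != NULL && (masks != NULL || n == 0));
    out->cpu = -1;

    for (size_t i = 0; i < n; i++) {
        const char *m = masks[i];
        if (strncmp(m, "cpu=", 4) == 0 && isdigit((unsigned char)m[4])) {
            char *end;
            long value = strtol(m + 4, &end, 10);
            if (*end != '\0' || value > INT_MAX)
                return -1;   /* trailing junk or out of range */
            out->cpu = (int)value;
        } else {
            return -1;       /* unknown to the library and to the caller */
        }
    }
    return 0;
}
```

With this split, a misspelled mask is still reported as an error; the report
just comes from the caller rather than from the library.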

I spent some time adding this feature to libpfm4 and now have it working.  The 
end result is that I can now use papi_command_line to count uncore events 
without any changes to the application.

A high level summary of what I did to libpfm4:

I created two new libpfm4 functions which provide the same service as two
existing functions but accept an additional calling argument.  The additional
calling argument is a pointer to a table where libpfm4 can store any
unprocessed masks.  The new functions are pfm_find_event_mask and
pfm_get_os_event_encoding_mask.  The existing functions (pfm_find_event and
pfm_get_os_event_encoding) are still there; they now just call the new
functions, passing a NULL pointer for the unprocessed masks table.  The code in
the new functions was then changed so that, when it finds an unrecognized mask,
it behaves as described above.
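
The wrapper pattern described here can be illustrated with a small
self-contained toy. It does not reproduce the patch or the real libpfm4
signatures: every toy_ name, the table layout, the fixed list of "known" masks,
and the event strings in the usage note are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_UNPROCESSED 8
#define TOY_MAX_MASK_LEN    32

/* Hypothetical table that receives masks the parser does not recognize. */
struct toy_unprocessed {
    size_t count;
    char   mask[TOY_MAX_UNPROCESSED][TOY_MAX_MASK_LEN];
};

/* Stand-in for the masks a real library would know for an event. */
static const char *const toy_known[] = { "RD", "WR", "u", "k" };

static int toy_is_known(const char *tok, size_t len)
{
    const char *eq = memchr(tok, '=', len);  /* compare the name before '=' */
    size_t name_len = eq ? (size_t)(eq - tok) : len;
    for (size_t i = 0; i < sizeof toy_known / sizeof toy_known[0]; i++)
        if (strlen(toy_known[i]) == name_len &&
            strncmp(toy_known[i], tok, name_len) == 0)
            return 1;
    return 0;
}

/* New-style entry point: with a table, unknown masks are collected;
 * with unk == NULL, an unknown mask is an error, as before. */
int toy_find_event_mask(const char *str, struct toy_unprocessed *unk)
{
    assert(str != NULL);
    if (unk != NULL)
        unk->count = 0;

    const char *p = strstr(str, "::");       /* skip optional "pmu::" */
    p = strchr(p ? p + 2 : str, ':');        /* skip the event name */
    while (p != NULL) {
        const char *tok = p + 1;
        const char *next = strchr(tok, ':');
        size_t len = next ? (size_t)(next - tok) : strlen(tok);
        if (!toy_is_known(tok, len)) {
            if (unk == NULL)
                return -1;                   /* legacy behavior */
            if (unk->count == TOY_MAX_UNPROCESSED || len >= TOY_MAX_MASK_LEN)
                return -1;                   /* table full or mask too long */
            memcpy(unk->mask[unk->count], tok, len);
            unk->mask[unk->count][len] = '\0';
            unk->count++;
        }
        p = next;
    }
    return 0;
}

/* Old-style entry point keeps its signature and forwards with NULL. */
int toy_find_event(const char *str)
{
    return toy_find_event_mask(str, NULL);
}
```

In this toy, toy_find_event("imc0::CAS_COUNT:RD:cpu=8") fails as an old caller
would expect, while toy_find_event_mask with a table succeeds and hands back
"cpu=8" for the caller to judge.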

Attached you will find a patch file that contains the libpfm4 changes that I 
made (code is always more interesting than descriptions).

I am hoping to persuade you that this code is worth putting into libpfm4, but
either way I am interested in your views on the topic.

There are still a few things in these patches that I think should be changed to
make them more robust, but if you are in agreement with this approach, I will
gladly adjust them to meet expectations.

I hope I did not bore you with too much detail, but I thought it was important
to give some background on why something in this area is needed.

Thanks
Gary


Attachment: AddPapiSpecificMaskSupport_libpfm4.patch

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
