I agree with Stephane on this one. I was hopping back and forth about
different ways to do this. Perhaps the best way is to extend
the PAPI API with new calls, such as,

PAPI_cpu_eventset

or similar to assign a given cpu to the events in the event set. PAPI could
enforce that such calls only apply to system-wide or per-numa-node or per-socket
events (the calls would return failure if applied to per CPU events). I am
still a bit out of my comfort zone here, so I still might not understand
the full ramifications of what is going on exactly ;-[]

Given this type of approach, there MUST be a logical default for these event
sets if the above call or similar is NOT specified. I don't know if this
is CPU 0 or whatever, but there could also be a top-level call, made before
PAPI_library_init, that sets that default CPU number. A combination of
calls may be warranted to give full flexibility.

Steve

On Thu, 17 Apr 2014, Stephane Eranian wrote:

Date: Thu, 17 Apr 2014 09:59:53 -0500
From: Stephane Eranian <eran...@googlemail.com>
Reply-To: eran...@gmail.com
To: Gary Mohr <gary.m...@bull.com>
Cc: perfmon2-devel <perfmon2-devel@lists.sourceforge.net>
Subject: Re: [perfmon2] FW: Proposed enhancement to libpfm4.

Gary,


I am trying to understand the underpinning here. You are saying there is no
way to pass a CPU to the PAPI call to pin an uncore event to a particular
socket.

First, uncore events are system-wide only events. This is why you need to
pass a CPU number (as a substitute for a socket number). Second, the kernel
always exports a list of CPUs to monitor for each uncore PMU. It is located
in /sys/devices/uncore_xxx/cpumask.
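
As a rough illustration of the second point, here is a minimal Python sketch that reads such a cpumask file and expands it into CPU numbers. It assumes the file holds a comma-separated CPU list (for example "0,8" on a two-socket machine, possibly with ranges) and lives under the usual sysfs location /sys/devices/<pmu>/cpumask. The helper names are made up for this example, and the PMU name passed in is up to the caller.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0,8" or "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty file or a trailing comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_uncore_cpus(pmu_name, sysfs_root="/sys/devices"):
    """Read the CPUs the kernel suggests for one uncore PMU (path layout assumed)."""
    with open(f"{sysfs_root}/{pmu_name}/cpumask") as f:
        return parse_cpu_list(f.read())
```

Each CPU in the returned list would stand in for one socket when opening a system-wide uncore event.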

I don't really like the libpfm4 changes you are proposing. They do not make
sense to me because you are trying to work around a limitation of PAPI by
modifying libpfm4.

My understanding is that PAPI is not designed to handle system-wide events.
System-wide events require a CPU number. So why not extend PAPI to handle
this instead so it would work with or without libpfm4? I understand it
would break existing tools, but then those tools are not ready to cope with
CPU or socket-level measurements, maybe.


On Thu, Apr 10, 2014 at 5:40 PM, Gary Mohr <gary.m...@bull.com> wrote:

      Also sending this to the perfmon mailing list.

       

      From: Gary Mohr
      Sent: Wednesday, April 09, 2014 4:56 PM
      To: Stephane Eranian
      Cc: Vince Weaver; Philip Mucci; Heike McCraw; Michel Brown
      Subject: Proposed enhancement to libpfm4.

 

Hi Stephane,

 

There has been quite a bit of discussion in the PAPI community lately
regarding ways to make the PAPI uncore component useful to existing
PAPI applications. 

 

A short description of the problem:

 

When an uncore event is opened for counting, the kernel requires a
cpu number, which it uses to pick the package/socket to count.

PAPI currently provides a way to set a cpu number, but it requires a
call to PAPI_set_opt, which existing PAPI applications written for
core events almost never make.

This means that existing PAPI applications cannot use uncore events
without coding changes.

 

Possible solution:

 

Change the uncore event string to include information to specify the
core number that should be passed to the kernel for this event.

PAPI applications normally get the event to use from a user or config
file, so they would have access to uncore events if the user just adds
a little extra information to the event string.

 

Two approaches were considered:

 

1 -- The event name could be extended to include a package component. 
This would result in the event names being replicated once for each
package on the system.

2 -- A new event mask could be added to provide the number of the core
which should be passed to the kernel for the event.

 

Since the SNBEP system already has 315 uncore events, replicating them
for each package could lead to over 1200 different event names.  The
current list output for uncore events on this system produces 6,000+
lines of output.  Replicating each event could drive that to about
24,000 lines of output.  This makes the first approach less than
desirable.
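
The figures above can be checked with a short calculation. The package count is not stated in the message; the sketch below assumes 4 packages, which is the value that makes the quoted numbers line up.

```python
UNCORE_EVENTS = 315   # SNBEP uncore events, from the message
LIST_LINES = 6000     # "6,000+" lines of list output, taken as a lower bound
PACKAGES = 4          # assumption: not given in the message


def replicated(count, packages):
    """Size of a list after copying every entry once per package."""
    return count * packages


event_names = replicated(UNCORE_EVENTS, PACKAGES)
list_lines = replicated(LIST_LINES, PACKAGES)
print(event_names, list_lines)  # prints: 1260 24000
```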

 

A new mask for the uncore events could be added to identify which core
number should be passed to the kernel.  But this information is needed
by PAPI and does not end up in the attribute structure built by
libpfm4 and passed to the kernel by PAPI.  This means that we would be
introducing a mask that should be processed by PAPI and not libpfm4. 
The new mask approach would have no effect on the number of events and
little or no effect on the list output.  So it seemed to be the
preferred approach.

 

In addition, during these discussions, it was felt that a small number
of other PAPI attributes could also be handled with PAPI specific
event masks rather than through independent API calls (as is required
today).  This encouraged looking for a general solution.

 

Two different approaches for adding a mask have been considered:

 

1 -- Modify PAPI to prescan the event strings to remove and process
the new mask.

2 -- Enhance libpfm4 to allow event strings which contain masks it does
not know about.

 

The first approach probably can be done but there is some concern that
if PAPI prescans and removes some of the event masks, it may remove
masks that would have been meaningful to libpfm4.  This would be
undesirable but could be avoided with careful PAPI mask names.
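
To make the first approach concrete, here is a minimal Python sketch of a prescan step. It is not PAPI code. The mask name "cpu" and the function name are hypothetical, and the event string in the usage note is only an illustration of the "pmu::EVENT:MASK" layout. Only masks whose names are explicitly listed are removed, which is how careful naming avoids stripping a mask that libpfm4 would have understood.

```python
def prescan_event_string(event, papi_masks=("cpu",)):
    """Strip PAPI-only 'name=value' masks; return (remaining string, {name: value})."""
    # Keep an optional "pmu::" prefix intact so its "::" is not split as masks.
    prefix, sep, rest = event.rpartition("::")
    fields = rest.split(":")
    kept = [fields[0]]  # the first field is the event name itself
    extracted = {}
    for field in fields[1:]:
        name, eq, value = field.partition("=")
        if eq and name in papi_masks:
            extracted[name] = value  # consumed here, never passed on
        else:
            kept.append(field)       # left in place for the event library
    return prefix + sep + ":".join(kept), extracted
```

For example, prescan_event_string("snbep_unc_imc0::UNC_M_CAS_COUNT:RD:cpu=8") would hand back the string without the "cpu=8" part, plus a dictionary holding the cpu value for the caller to act on.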

 

The idea behind the second approach is to add a feature to libpfm4
which would allow PAPI to pass an event string which contains some
masks which libpfm4 may not understand.  When this is done, libpfm4
would be able to return a table to the caller which contains the
masks libpfm4 did not recognize.  When using this new feature,
libpfm4 would not consider an unknown mask as an error.  It would just
return unprocessed masks to the caller and let the caller decide if
those masks were valid.  This provides PAPI with an easy way to extend
the set of event masks an application can use.  Of course when this
new feature is not being used, libpfm4 would continue to behave
exactly as it has in the past.

 

I spent some time adding this feature to libpfm4 and now have it
working.  The end result is that I can now use papi_command_line to
count uncore events without any changes to the application.

 

A high level summary of what I did to libpfm4:

 

I created two new libpfm4 functions which provide the same service as
two existing functions but accept an additional calling argument.  The
additional calling argument is a pointer to a table where libpfm4 can
store any unprocessed masks.  The new functions are
pfm_find_event_mask and pfm_get_os_event_encoding_mask.  The current
function names also still exist and just call the new functions
passing a NULL pointer for the unprocessed masks table.  Then the code
in these new functions was changed to handle the case where it finds
an unrecognized mask so it now behaves as described above.
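
The pattern described above can be modelled in a few lines of Python. This is only a sketch of the idea, not the patch and not the libpfm4 API: the function names loosely mirror the ones mentioned, the signatures are invented, and KNOWN_MASKS is a stand-in for whatever masks the library really knows.

```python
KNOWN_MASKS = {"RD", "WR"}  # illustrative stand-in for the library's mask table


def find_event_mask(event, unprocessed=None):
    """Parse 'EVENT:mask:...'. Unknown masks are appended to `unprocessed`
    when a list is supplied; otherwise they are an error, as before."""
    name, *masks = event.split(":")
    recognized = []
    for mask in masks:
        if mask.split("=", 1)[0] in KNOWN_MASKS:
            recognized.append(mask)
        elif unprocessed is not None:
            unprocessed.append(mask)  # handed back for the caller to judge
        else:
            raise ValueError(f"unknown mask: {mask}")
    return name, recognized


def find_event(event):
    """Existing entry point: forwards with no table, so behavior is unchanged."""
    return find_event_mask(event, None)
```

A caller that wants the new behavior passes an empty list and inspects it afterwards; callers of the old entry point still see an error for any mask the library does not know.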

 

Attached you will find a patch file that contains the libpfm4 changes
that I made (code is always more interesting than descriptions).

 

I am hoping to persuade you that this code is worth putting into
libpfm4 but in either case, I am interested in your views on the
topic.

 

There are still a few things in these patches that I think should be
changed to make them more robust but if you are in agreement with this
approach, I will gladly adjust them to meet expectations.

 

I hope I did not bore you too much with details but I thought some of
the background to explain why something in this area is needed was
important.

 

Thanks

Gary

 

 


_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel


