Re: [perfmon2] libpfm redesign

Corey J Ashford Mon, 01 Jun 2009 15:54:37 -0700

Thanks for the reply, Stephane,

stephane eranian <[email protected]> wrote on 06/01/2009 09:34:39 AM:


> Corey,
> 
> On Sat, May 30, 2009 at 12:01 AM, Corey Ashford
> <[email protected]> wrote:
> >
> >
> > Philip Mucci wrote:
> >>
> >> Hi Stephane,
> >>
> >> Yeah, we talked about this on the phone. I personally think an attribute
> >> versus a umask is an unnecessary semantic distinction. Both could be
> >> handled by the umask mechanism as far as I can see.
> >> i.e. When you parse a umask, you see an equal sign, you treat it as
> >> such... Further, umasks can simply be seen as shorthand for the attribute
> >> umask=true or something to that effect. In this way, we also don't need any
> >> new API calls specific to attributes...
> >>
> >> Let's see what other folks think..
> >>
> >
> > I agree with this.  I think it makes it easier to understand the interface
> > too... "What's the difference between a umask and an attribute, exactly?", I
> > would want to know.  We could call them all attributes, or all umasks.
> > Personally, I prefer the term attribute, as it gets at the heart of what
> > they are for.
> >
> The main difference is: umasks have no user-controllable values.
> The value of a umask is hardcoded in the event table.
> 
> An attribute has a value which can be specified by the user at runtime: k=1
> 
> You could convert a umask, e.g., ANY_P, into an attribute by defining a
> generic attribute: umask=0x345.
> 
> I admit this is somewhat confusing. We could call everything an attribute.
> A umask would be an attribute without the possibility of assigning a value.
> 
> The following would be a valid specification:
>         INST_RETIRED:ANY_P:k=1:u=1
> 
> When parsing, the = sign would help libpfm figure out what each element is.
> if (strchr(str, '='))
>     lookup_attr(str);
> else
>     lookup_umask(str);
> 
> The reason I am splitting the two here is because of the way attributes would
> be encoded compared to umasks. Umasks would remain as they are. Attributes
> would be defined per event. But they would not necessarily need to be
> hardcoded in the event table. For most PMUs, all events would have the same
> list of attributes, therefore no need to encode them in the table.

Makes sense.
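
To make the split concrete, here is a minimal sketch of how an event string could be broken into a name, plain umasks and attribute=value pairs. This is only my illustration of the idea, not existing libpfm code: parse_event_spec(), struct event_spec, struct spec_elem and the size limits are all hypothetical names and values, and the real lookup_attr()/lookup_umask() steps are left out.

```c
#include <assert.h>
#include <string.h>

#define MAX_ELEMS 16   /* hypothetical limit on ':'-separated elements */
#define MAX_NAME  32   /* hypothetical limit on name/value length */

/* One ':'-separated element following the event name. */
struct spec_elem {
    char name[MAX_NAME];   /* "ANY_P" for a umask, "k" for an attribute */
    char value[MAX_NAME];  /* "1" for k=1; empty for a plain umask */
    int  is_attr;          /* 1 if the element contained '=' */
};

struct event_spec {
    char event[MAX_NAME];
    struct spec_elem elems[MAX_ELEMS];
    int nelems;
};

/* Copy len bytes of src into dst as a string; -1 if empty or too long. */
static int copy_token(char *dst, size_t dstsz, const char *src, size_t len)
{
    if (len == 0 || len >= dstsz)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/* Split "EVENT:UMASK:attr=val:..."; returns 0 on success, -1 if malformed. */
int parse_event_spec(const char *str, struct event_spec *out)
{
    assert(str != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    const char *p = str;
    size_t len = strcspn(p, ":");
    if (copy_token(out->event, sizeof(out->event), p, len) < 0)
        return -1;
    p += len;

    while (*p == ':') {
        p++;                                  /* skip the separator */
        len = strcspn(p, ":");
        if (out->nelems == MAX_ELEMS)
            return -1;
        struct spec_elem *e = &out->elems[out->nelems];
        const char *eq = memchr(p, '=', len); /* '=' marks an attribute */
        if (eq != NULL) {
            size_t nlen = (size_t)(eq - p);
            if (copy_token(e->name, sizeof(e->name), p, nlen) < 0 ||
                copy_token(e->value, sizeof(e->value), eq + 1, len - nlen - 1) < 0)
                return -1;
            e->is_attr = 1;
        } else if (copy_token(e->name, sizeof(e->name), p, len) < 0) {
            return -1;
        }
        out->nelems++;
        p += len;
    }
    return 0;
}
```

With "INST_RETIRED:ANY_P:k=1:u=1" this yields one umask element (ANY_P) and two attribute elements (k and u), which a caller could then hand to the umask and attribute lookups respectively.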

> 
> Would we still need a binary representation of events via pfmlib_event_t?
> 
> Not for PCL, not even for perfmon because the pfm_dispatch_events() could
> take the event specification strings directly, via an argv-style argument.

True.  Just a quick anecdote here about that:  My colleague Carl Love had 
never worked on libpfm before, and when he looked at the pfm_dispatch 
call, he was a little confused not to find a list of event name strings 
passed in.  So I think changing the dispatch function to accept the full 
event strings would be a natural thing to do, and it would spare many 
programmers from having to call the individual event and umask lookup 
functions first.

> 
> There are several more issues to address however:
>  1 - listing of events and umasks
>  2 - querying of event special features, e.g., pfm_nhm_event_is_pebs()
> 
> Today, for listing, we iterate over the opaque event identifiers. A simple
> for loop is enough because there is an assumption that identifiers are
> contiguous starting from 0. This logic would remain. No need for
> pfmlib_event_t here.
> 
> In terms of querying, we are currently relying on the event
> and umask identifiers. Those identifiers are retrieved via specific
> calls such as pfm_find_full_event(). Identifiers are returned as
> part of pfmlib_event_t. Without that, all query calls would have
> to operate on the event specification directly. For instance, to
> query the list of unit masks for an event:
> 
>      pfm_get_event_mask_name("INST_RETIRED", 0,  name);
> 
> Would return the name of unit mask 0 for INST_RETIRED.
> 
> How would attributes with values be exposed to users for querying?
> 
> It is useful for tools to list attributes with values, maybe to
> print more customized error messages. Pfmon does some of that
> today.
> 
> Attributes would be logically added to the list of existing umasks.
> Let's take an example:
> 
>         { .pme_name = "INST_RETIRED",
>           .pme_code = 0xc0,
>           .pme_desc =  "Instructions retired",
>           .pme_umasks = {
>                 { .pme_uname = "ANY_P",
>                   .pme_udesc = "Instructions retired (precise event)",
>                   .pme_ucode = 0x0,
>                   .pme_flags = PFMLIB_CORE_PEBS
>                 },
>                 { .pme_uname = "LOADS",
>                   .pme_udesc = "Instructions retired, which contain a load",
>                   .pme_ucode = 0x1
>                 },
>                 { .pme_uname = "STORES",
>                   .pme_udesc = "Instructions retired, which contain a store",
>                   .pme_ucode = 0x2
>                 },
>                 { .pme_uname = "OTHER",
>                   .pme_udesc = "Instructions retired, with no load or store operation",
>                   .pme_ucode = 0x4
>                 }
>            },
>            .pme_numasks = 4
>         },
> 
> Assuming this event supports attributes: i, e, c, u, k
> 
> That means that it exposes 4 + 5 = 9 attributes.
> 4 are actually encoded in the event table. 5 are dynamically added.
> So you can actually iterate from 0-8 with pfm_get_event_mask_name().
> 
> One thing which is still missing here is the ability to query valid values
> for an attribute. For instance, the c attribute (counter-mask) is only 8 bits
> wide. Libpfm will reject any value > 255. But it would be nice for tools
> to have a way of knowing this limit. Of course, the difficulty is the type
> of the value. In the current scheme, the value may not necessarily be an
> integer.

Yes, that's a tough problem because it's difficult to foresee every 
possible attribute value that might be needed.
I suppose you could define a few common types that could be used:
1) an enum: e.g.  LEVEL, EDGE
2) a decimal range m .. n
3) a hexadecimal range h .. i

A specific architecture implementation could use one of the above, and 
libpfm would know how to describe and check those.  Or it could implement 
its own type, by supplying callbacks for libpfm to call.  Two callbacks 
might be:

/* returns a string describing acceptable values of the supplied attribute */
char *get_acceptable_value_string(char *attribute);

/* returns -1 if the string supplied doesn't pass the test of an acceptable value */
int validate_value(char *attribute, char *value);
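
As a rough sketch of how the three common types plus a custom callback might fit together (purely illustrative: struct attr_desc, enum attr_value_kind and validate_attr_value() are names I made up here, not anything in libpfm today):

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value kinds an attribute description could declare. */
enum attr_value_kind { ATTR_ENUM, ATTR_DEC_RANGE, ATTR_HEX_RANGE, ATTR_CUSTOM };

struct attr_desc {
    const char *name;              /* e.g. "c" for the counter-mask */
    enum attr_value_kind kind;
    const char *const *choices;    /* ATTR_ENUM: NULL-terminated list */
    unsigned long min, max;        /* range kinds: inclusive bounds */
    /* ATTR_CUSTOM: arch-specific check, same contract as validate_value() */
    int (*validate)(const char *attribute, const char *value);
};

/* Parse value in the given base and check it against [min, max]. */
static int in_range(const char *value, int base,
                    unsigned long min, unsigned long max)
{
    char *end;
    unsigned long v;

    /* reject empty strings, signs and leading whitespace up front */
    if (!isalnum((unsigned char)*value))
        return -1;
    errno = 0;
    v = strtoul(value, &end, base);
    if (errno != 0 || end == value || *end != '\0')
        return -1;
    return (v >= min && v <= max) ? 0 : -1;
}

/* Returns 0 if value is acceptable for the attribute, -1 otherwise. */
int validate_attr_value(const struct attr_desc *d, const char *value)
{
    assert(d != NULL && value != NULL);
    switch (d->kind) {
    case ATTR_ENUM:
        for (const char *const *c = d->choices; c != NULL && *c != NULL; c++)
            if (strcmp(*c, value) == 0)
                return 0;
        return -1;
    case ATTR_DEC_RANGE:
        return in_range(value, 10, d->min, d->max);
    case ATTR_HEX_RANGE:
        return in_range(value, 16, d->min, d->max);  /* "0x" prefix allowed */
    case ATTR_CUSTOM:
        return d->validate ? d->validate(d->name, value) : -1;
    }
    return -1;
}
```

For the counter-mask example, a description of kind ATTR_DEC_RANGE with min 0 and max 255 would accept "255" and reject "256", and a tool could read the same min and max fields to report the limit to the user.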

> 
> > As for the encoding of PCL events, I like the idea of unifying the
> > user/kernel/supervisor as attributes/umasks and then processing that into an
> > entire pcl perf_hw_counter struct.  However, libpfm has attempted to
> > disconnect itself from the kernel implementation, and adding this sort of
> > kernel-specific function is taking a rather long drive down that road.  Plus
> > it makes libpfm more closely tied to the version of PCL which is in the
> > kernel.  In that light, at least until PCL settles down very well, I think
> > the "simply provide raw encoding (no PCL encoding)" option would be
> > preferable.  (by the way, is the "uint64_t vals" param supposed to be a
> > pointer "uint64_t *vals" ?)
> >
> I like the idea of keeping libpfm internals disconnected from a kernel
> API. But I am wondering about the level of support provided by a libpfm
> which only returns uint64_t *vals. Don't get me wrong, though, I think
> this call is useful. But is it useful enough? Don't tools want more help?

Well, this level of support was enough for the PAPI PCL substrate, but 
that's not everyone, I know.

> 
> Should libpfm only be concerned with actual PMU hardware events?
> In other words, should it not deal with PCL generic HW events, such
> as PERF_COUNT_CPU_CYCLES, and SW events, such as
> PERF_COUNT_PAGE_FAULTS?  If it does not, then it means each
> tool needs to have another layer which intercepts those events and
> only calls libpfm when the event is not recognized. Similarly, the type
> bitfield inside the config field should be written by the caller of libpfm
> and not libpfm.

Since there are not a huge number of predefined PCL events, I think it 
would make sense for libpfm to handle them also.  If you can generalize 
the support in libpfm so that it potentially could be extended to 
generalized events using other kernel implementations (say on BSD), that 
would be good (and I'm guessing not too difficult).

More along those lines, you might want to consider partitioning libpfm 
into a generic piece, and a kernel-specific piece, sort of like how PAPI 
has its substrates.

[snip]
> > I think there's little motivation to go with anything other than XML for
> > this, if you want to do it.  There are some decent XML parsing libraries out
> > there that should suit the needs of libpfm (expat and libxml2).  You'd
> > probably want to distribute an XML schema or DTD file for each architecture
> > too, so that the libpfm could validate that the syntax of the event file is
> > correct for that arch.  Functions for both are in libxml2.  I should be
> > clear to say that I have never used either library, but I have used similar
> > libraries in Java.
> >
> Yes, XML-like syntax would be used. I am no expert in this. What I care
> about is the extensibility of the description. As you know, each PMU may
> have to define model-specific attributes in the event table. Yet we don't
> want to patch the lexical analyzer and parser for each new PMU model.

Yeah, with XML, if you are reasonably careful with designing the schema 
(for example, don't use attributes too much), you can make the description 
for each PMU very extensible, and add new features later on without 
breaking the previous ones.

Since the parser and lexical analyzer are built into the libxml2 and expat 
libraries, there's really very little work to do with that part.  They do 
all of the heavy lifting and error detection.

> 
> > Once you read in the XML file and convert it to an in-memory structure, you
> > wouldn't need to access the file system anymore.  So I'm not sure where the
> > "bursty accesses on shared storage" problem comes in.
> >
> The worry is about accessing the XML file. Imagine you launch a measurement
> over a 10,000-node cluster that uses shared storage for the file. All
> 10,000 nodes would go read the XML file pretty much at the same time.

Good point!  Thanks for the explanation.

> 
> > If you accessed the XML tables via the DOM, the potential is there to use
> > more memory than would be required by having loaded the hard-coded event
> > tables for all arches.  However, if you use SAX instead, the memory overhead
> > should be minimal, and both expat and libxml2 support a SAX interface.
> >
> I don't know what DOM and SAX are ;-<

Very simply, with the DOM (Document Object Model), you can randomly access 
the XML tree using paths to get to particular nodes, iterate through them, 
etc.  With SAX, you provide callbacks for the XML parser to call as it 
parses each XML element in the tree.  Once it has parsed the whole XML 
file, you no longer use the XML library, and simply deal with whatever 
data you've stored as a result of the parsing callbacks.
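
To illustrate just the callback model, here is a toy, dependency-free sketch. It is deliberately NOT the expat or libxml2 API and not a real XML parser (it ignores attributes, text, comments and declarations); toy_sax_parse(), the callback types, and the "event"/"umask" element names are all made up for the example. A real implementation would register equivalent handlers with one of those libraries instead.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Callback types, in the spirit of SAX start/end element handlers. */
typedef void (*start_cb)(void *user, const char *name);
typedef void (*end_cb)(void *user, const char *name);

/*
 * Toy event-driven scanner: recognizes <name ...>, </name> and <name .../>
 * and reports them through callbacks as it goes. No tree is ever built.
 * Returns 0 on success, -1 on a malformed tag.
 */
int toy_sax_parse(const char *doc, start_cb on_start, end_cb on_end, void *user)
{
    assert(doc != NULL);
    for (const char *p = strchr(doc, '<'); p != NULL; p = strchr(p, '<')) {
        char name[64];
        size_t n = 0;
        int closing;

        p++;                              /* skip '<' */
        closing = (*p == '/');
        if (closing)
            p++;
        while ((isalnum((unsigned char)*p) || *p == '_') && n < sizeof(name) - 1)
            name[n++] = *p++;
        name[n] = '\0';

        const char *gt = strchr(p, '>');
        if (n == 0 || gt == NULL)
            return -1;

        if (closing) {
            if (on_end)
                on_end(user, name);
        } else {
            if (on_start)
                on_start(user, name);
            if (gt[-1] == '/' && on_end)  /* <name/> opens and closes */
                on_end(user, name);
        }
        p = gt;
    }
    return 0;
}

/* Example consumer: keep only what we need (here, two counters). */
struct table_stats { int events; int umasks; };

void count_start(void *user, const char *name)
{
    struct table_stats *st = user;
    if (strcmp(name, "event") == 0)
        st->events++;
    else if (strcmp(name, "umask") == 0)
        st->umasks++;
}
```

The point is that the consumer's state (struct table_stats here, the in-memory event table in libpfm's case) is the only thing that survives the parse, which is why the memory overhead stays small compared with holding a whole DOM tree.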

> 
> I think the issue about hardcoded vs. XML event table is secondary compared
> to the issues related to umasks and attributes. Let's solve that part first.

I agree.  Just thought I'd throw that XML stuff in there since I do have 
some experience with it.  Overall, XML is very nice to work with.

- Corey
