On Fri, Jun 19, 2009 at 5:09 PM, Dan Terpstra<terps...@eecs.utk.edu> wrote:
> In how many instances were these values (edge, invert, counter-mask, etc.)
> non-zero in the past? Since we typically didn't need to worry about them, I
> didn't :(

To enable those features, you had to pass a PMU-specific data structure
to pfm_dispatch_events(). Without that structure, they would be set to
zero UNLESS hardcoded in the event table, e.g., some events on Core and
Nehalem.

> Sounds like I'll need to modify any non-zero cases in the PAPI PRESET
> definition tables to support libpfm4. Is that a safe assumption?

I doubt you had any of those enabled in your PAPI event table, unless you
were already passing the model-specific input parameter to
pfm_dispatch_events().

> - d
>
>> -----Original Message-----
>> From: stephane eranian [mailto:eran...@googlemail.com]
>> Sent: Friday, June 19, 2009 11:03 AM
>> To: Dan Terpstra
>> Cc: perfmon2-devel
>> Subject: Re: [perfmon2] libpfm4 progress
>>
>> On Fri, Jun 19, 2009 at 5:01 PM, Dan Terpstra<terps...@eecs.utk.edu> wrote:
>> > Stephane -
>> > This looks good. Am I correct in assuming that it's backward
>> > compatible with earlier libpfm syntax? In other words, if I *don't*
>> > specify the new attributes, will they default to the values used in
>> > earlier versions?
>>
>> The event specification is backward compatible.
>> The API is not.
>> If you don't specify the new attributes, they default to 0.
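
To make the "default to 0" behavior concrete, here is a minimal sketch
of a parser for the event string syntax used in the announcement quoted
below (DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2). It is not libpfm code:
struct event_spec, copy_token() and parse_event_spec() are hypothetical
names made up for illustration, only the i, e and c attributes are
handled, and only the last unit mask named is kept.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed pieces; not a libpfm type. */
struct event_spec {
    char name[64];      /* event name, e.g. DISPATCHED_FPU */
    char umask[64];     /* last unit mask named, e.g. OPS_ADD */
    unsigned inv;       /* i= attribute */
    unsigned edge;      /* e= attribute */
    unsigned cnt_mask;  /* c= attribute */
};

/* Copy src into a fixed-size field; reject empty or oversized tokens. */
static int copy_token(char *dst, size_t dstlen, const char *src)
{
    if (*src == '\0' || strlen(src) >= dstlen)
        return -1;
    strcpy(dst, src);
    return 0;
}

/* Split "NAME:UMASK:i=1:e=1:c=2" on ':'. Returns 0 on success, -1 on error. */
int parse_event_spec(const char *str, struct event_spec *out)
{
    char buf[256];
    char *tok, *next;

    assert(str != NULL && out != NULL);
    if (strlen(str) >= sizeof(buf))
        return -1;
    strcpy(buf, str);
    memset(out, 0, sizeof(*out));   /* attributes not named stay 0 */

    for (tok = buf; tok != NULL; tok = next) {
        next = strchr(tok, ':');
        if (next != NULL)
            *next++ = '\0';         /* terminate this token, step past ':' */

        if (tok == buf) {           /* first token is the event name */
            if (copy_token(out->name, sizeof(out->name), tok) != 0)
                return -1;
        } else if (strncmp(tok, "i=", 2) == 0) {
            out->inv = (unsigned)strtoul(tok + 2, NULL, 0);
        } else if (strncmp(tok, "e=", 2) == 0) {
            out->edge = (unsigned)strtoul(tok + 2, NULL, 0);
        } else if (strncmp(tok, "c=", 2) == 0) {
            out->cnt_mask = (unsigned)strtoul(tok + 2, NULL, 0);
        } else {                    /* anything else is taken as a unit mask */
            if (copy_token(out->umask, sizeof(out->umask), tok) != 0)
                return -1;
        }
    }
    return 0;
}
```

Parsing "DISPATCHED_FPU:OPS_ADD" with this sketch leaves inv, edge and
cnt_mask at 0, which is the behavior described above for an old-style
event string.
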
>>
>> > - d
>> >
>> >> -----Original Message-----
>> >> From: stephane eranian [mailto:eran...@googlemail.com]
>> >> Sent: Thursday, June 18, 2009 6:39 PM
>> >> To: perfmon2-devel
>> >> Subject: [perfmon2] libpfm4 progress
>> >>
>> >> Hi,
>> >>
>> >> As discussed earlier on this list, I have been working on the
>> >> next-generation libpfm, a version that will handle both perfmon
>> >> and PCL and will also make it much simpler for tool writers to
>> >> enable advanced features.
>> >>
>> >> There will be a new event naming scheme. All features of
>> >> an event or counter will be controlled in the event name
>> >> specification.
>> >>
>> >> The PCL support covers both the PCL generic HW & SW
>> >> events and the usual raw PMU events. There is a dedicated
>> >> PCL call to encode the key fields on struct perf_counter_attr:
>> >>
>> >>    int pfm_get_pcl_event_encoding(const char *str,
>> >>                                   struct perf_counter_attr *hw);
>> >>
>> >> You can also retrieve just the raw event encoding:
>> >>
>> >>    int pfm_get_event_encoding(const char *str, uint64_t *codes,
>> >>                               int *count, int *plm);
>> >>
>> >> Here are some examples (screenshots) on AMD64 and Intel Core.
>> >>
>> >> $ showeventinfo | head -20
>> >> PMU model: AMD64 (Family 10h RevB, Barcelona)
>> >> #-----------------------------
>> >> Name     : DISPATCHED_FPU
>> >> Desc     : Dispatched FPU Operations
>> >> Code     : 0x0
>> >> Counters : [ 0 1 2 3 ]
>> >> Attr-00 : 0x01 : [OPS_ADD] : Add pipe ops excluding load ops and SSE move ops
>> >> Attr-01 : 0x02 : [OPS_MULTIPLY] : Multiply pipe ops excluding load ops and SSE move ops
>> >> Attr-02 : 0x04 : [OPS_STORE] : Store pipe ops excluding load ops and SSE move ops
>> >> Attr-03 : 0x08 : [OPS_ADD_PIPE_LOAD_OPS] : Add pipe load ops and SSE move ops
>> >> Attr-04 : 0x10 : [OPS_MULTIPLY_PIPE_LOAD_OPS] : Multiply pipe load ops and SSE move ops
>> >> Attr-05 : 0x20 : [OPS_STORE_PIPE_LOAD_OPS] : Store pipe load ops and SSE move ops
>> >> Attr-06 : 0x3f : [ALL] : All sub-events selected
>> >> Attr-07 : 0x07 : [i] : invert (0 or 1)
>> >> Attr-08 : 0x08 : [e] : edge level (0 or 1)
>> >> Attr-09 : 0x09 : [c] : counter-mask=[0-255]
>> >> Attr-10 : 0x0a : [u] : measure at priv level 1, 2, 3 (0 or 1)
>> >> Attr-11 : 0x0b : [k] : measure at priv level 0 (0 or 1)
>> >> Attr-12 : 0x0c : [g] : measure at guest level (0 or 1)
>> >> Attr-13 : 0x0d : [h] : measure at hypervisor level (0 or 1)
>> >>
>> >>
>> >> You will notice that the new attributes are now merged with the
>> >> regular unit masks. To enable invert + edge + counter-mask on this
>> >> event for OPS_ADD, you simply need to pass:
>> >>        DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>> >>
>> >> This counts every cycle in which fewer than 2 FPU add ops are
>> >> dispatched. The key value added for tools is that there is no need
>> >> to pass AMD64-specific structures to enable AMD-specific features.
>> >> As proof, here is the libpfm self example running on top of PCL:
>> >>
>> >> $ self DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>> >> [0x2d40100 event_sel=0x0 event_sel2=0x0 umask=0x1 os=0 usr=0 en=1 int=1 inv=1 edge=1 cnt_mask=2 guest=0 host=0]DISPATCHED_FPU
>> >> [type=4 val=0x2d40100 e_u=0 e_k=0 e_hv=0 plm=0x0] DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>> >>                    0 DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>> >>
>> >> The 3rd line shows the PCL encoding for this event.
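
The value 0x2d40100 in the self output above can be rebuilt by hand
from the fields printed next to it. The sketch below does that, using
the AMD64 event-select bit layout as documented for Family 10h. The
struct and function names are made up for illustration and are not
part of libpfm; "intr" stands in for the "int" field because int is a
C keyword.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct mirroring the fields printed by self. */
struct amd64_evtsel {
    unsigned event_sel;   /* event select bits 7:0 */
    unsigned event_sel2;  /* event select bits 11:8 */
    unsigned umask;       /* unit mask */
    unsigned usr, os;     /* count at priv levels 1-3 / priv level 0 */
    unsigned edge;        /* edge detect */
    unsigned intr;        /* "int" in the self output: interrupt on overflow */
    unsigned en;          /* counter enable */
    unsigned inv;         /* invert the counter-mask comparison */
    unsigned cnt_mask;    /* counter mask, 0-255 */
    unsigned guest, host; /* guest-only / host-only */
};

/* Pack the fields into the 64-bit event-select register value. */
uint64_t amd64_encode(const struct amd64_evtsel *f)
{
    uint64_t v = 0;

    assert(f->event_sel <= 0xff && f->event_sel2 <= 0xf);
    assert(f->umask <= 0xff && f->cnt_mask <= 0xff);

    v |= (uint64_t)f->event_sel;                /* bits 7:0   */
    v |= (uint64_t)f->umask << 8;               /* bits 15:8  */
    v |= (uint64_t)(f->usr != 0) << 16;
    v |= (uint64_t)(f->os != 0) << 17;
    v |= (uint64_t)(f->edge != 0) << 18;
    v |= (uint64_t)(f->intr != 0) << 20;
    v |= (uint64_t)(f->en != 0) << 22;
    v |= (uint64_t)(f->inv != 0) << 23;
    v |= (uint64_t)f->cnt_mask << 24;           /* bits 31:24 */
    v |= (uint64_t)f->event_sel2 << 32;         /* bits 35:32 */
    v |= (uint64_t)(f->guest != 0) << 40;
    v |= (uint64_t)(f->host != 0) << 41;
    return v;
}
```

With umask=0x1, en=1, intr=1, inv=1, edge=1, cnt_mask=2 and everything
else 0, the bits are 0x100 + 0x40000 + 0x100000 + 0x400000 + 0x800000
+ 0x2000000 = 0x2d40100, matching the output.
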
>> >>
>> >> As I said, PCL events are automatically added if PCL is detected
>> >> on the host:
>> >> $ showeventinfo
>> >> ...
>> >> #-----------------------------
>> >> Name     : PERF_COUNT_CPU_CYCLES
>> >> Desc     : PERF_COUNT_CPU_CYCLES
>> >> Code     : 0x0
>> >> Counters : [ ]
>> >> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>> >> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>> >> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>> >> #-----------------------------
>> >> Name     : PERF_COUNT_INSTRUCTIONS
>> >> Desc     : PERF_COUNT_INSTRUCTIONS
>> >> Code     : 0x1
>> >> Counters : [ ]
>> >> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>> >> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>> >> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>> >> ...
>> >> Name     : PERF_COUNT_CONTEXT_SWITCHES
>> >> Desc     : PERF_COUNT_CONTEXT_SWITCHES
>> >> Code     : 0x100000003
>> >> Counters : [ ]
>> >> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>> >> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>> >> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>> >> #-----------------------------
>> >> Name     : PERF_COUNT_CPU_MIGRATIONS
>> >> Desc     : PERF_COUNT_CPU_MIGRATIONS
>> >> Code     : 0x100000004
>> >> Counters : [ ]
>> >> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>> >> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>> >> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>> >>
>> >> In the same way, you can measure those with an unmodified program:
>> >> $ self perf_count_context_switches
>> >> [type=1 val=0x3 e_u=0 e_k=0 e_hv=0 plm=0x0] perf_count_context_switches
>> >>                 1002 perf_count_context_switches
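
Comparing the listing with the self output above suggests that the
64-bit Code packs the PCL event type in the upper 32 bits and the
per-type event id in the lower 32 bits: Code 0x100000003 shows up as
type=1 val=0x3. The sketch below only illustrates that split. It is an
inference from the output shown here, not a documented libpfm
interface, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair; not a libpfm or kernel type. */
struct pcl_code {
    uint32_t type;    /* upper 32 bits of Code, printed as type= */
    uint32_t config;  /* lower 32 bits of Code, printed as val=  */
};

/* Split a 64-bit Code from showeventinfo into (type, config). */
struct pcl_code split_pcl_code(uint64_t code)
{
    struct pcl_code c;

    c.type = (uint32_t)(code >> 32);
    c.config = (uint32_t)(code & 0xffffffffULL);
    return c;
}

/* Self-check against the values in the listing and self output above. */
void check_context_switches(void)
{
    struct pcl_code c = split_pcl_code(0x100000003ULL);

    assert(c.type == 1);      /* type=1 in the self output */
    assert(c.config == 0x3);  /* val=0x3 in the self output */
}
```

Under the same assumption, PERF_COUNT_CPU_CYCLES (Code 0x0) and
PERF_COUNT_INSTRUCTIONS (Code 0x1) both split to type 0.
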
>> >>
>> >> On AMD64 Family 10h, IBS will be enabled using the same mechanism.
>> >>
>> >> ------------------------------------------------------------------------------
>> >> Crystal Reports - New Free Runtime and 30 Day Trial
>> >> Check out the new simplified licensing option that enables unlimited
>> >> royalty-free distribution of the report engine for externally facing
>> >> server and web deployment.
>> >> http://p.sf.net/sfu/businessobjects
>> >> _______________________________________________
>> >> perfmon2-devel mailing list
>> >> perfmon2-devel@lists.sourceforge.net
>> >> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
>> >
>> >
>
>

