On Fri, Jun 19, 2009 at 5:03 PM, stephane eranian <eran...@googlemail.com> wrote:
> On Fri, Jun 19, 2009 at 5:01 PM, Dan Terpstra <terps...@eecs.utk.edu> wrote:
>> Stephane -
>> This looks good. Am I correct in assuming that it's backward compatible with
>> earlier libpfm syntax? In other words, if I *don't* specify the new
>> attributes, will they default to the values used in earlier versions?
>
> The event specification is backward compatible.
> The API is not.
> If you don't specify the new attributes, they default to 0.
> The case of the priv level is special, though.

You realize that if you were to not pass u=1 or k=1, then nothing would be
measured.
For perfmon this is handled:
  - like with libpfm3.x, there is a default plm that applications MUST pass.
    It is applied if no priv level attribute is detected.

For PCL:
  - this is handled via the exclude_* fields. Without attributes, all
    exclude_* are cleared, i.e., measure at all levels.

>> - d
>>
>>> -----Original Message-----
>>> From: stephane eranian [mailto:eran...@googlemail.com]
>>> Sent: Thursday, June 18, 2009 6:39 PM
>>> To: perfmon2-devel
>>> Subject: [perfmon2] libpfm4 progress
>>>
>>> Hi,
>>>
>>> As discussed earlier on this list, I have been working on the
>>> next generation libpfm, a version that will handle both
>>> perfmon and PCL and which will also make it much simpler
>>> for tool writers to enable advanced features.
>>>
>>> There will be a new event naming scheme. All features of
>>> an event or counter will be controlled in the event name
>>> specification.
>>>
>>> The PCL support covers both the PCL generic HW & SW
>>> events and the usual raw PMU events. There is a dedicated
>>> PCL call to encode the key fields on struct perf_counter_attr:
>>>
>>> int pfm_get_pcl_event_encoding(const char *str, struct perf_counter_attr *hw)
>>>
>>> You can also retrieve just the raw event encoding:
>>> int pfm_get_event_encoding(const char *str, uint64_t *codes, int *count, int *plm);
>>>
>>> Here are some examples (screenshots) on AMD64 and Intel Core.
>>>
>>> $ showeventinfo | head -20
>>> PMU model: AMD64 (Family 10h RevB, Barcelona)
>>> #-----------------------------
>>> Name : DISPATCHED_FPU
>>> Desc : Dispatched FPU Operations
>>> Code : 0x0
>>> Counters : [ 0 1 2 3 ]
>>> Attr-00 : 0x01 : [OPS_ADD] : Add pipe ops excluding load ops and SSE move ops
>>> Attr-01 : 0x02 : [OPS_MULTIPLY] : Multiply pipe ops excluding load ops and SSE move ops
>>> Attr-02 : 0x04 : [OPS_STORE] : Store pipe ops excluding load ops and SSE move ops
>>> Attr-03 : 0x08 : [OPS_ADD_PIPE_LOAD_OPS] : Add pipe load ops and SSE move ops
>>> Attr-04 : 0x10 : [OPS_MULTIPLY_PIPE_LOAD_OPS] : Multiply pipe load ops and SSE move ops
>>> Attr-05 : 0x20 : [OPS_STORE_PIPE_LOAD_OPS] : Store pipe load ops and SSE move ops
>>> Attr-06 : 0x3f : [ALL] : All sub-events selected
>>> Attr-07 : 0x07 : [i] : invert (0 or 1)
>>> Attr-08 : 0x08 : [e] : edge level (0 or 1)
>>> Attr-09 : 0x09 : [c] : counter-mask=[0-255]
>>> Attr-10 : 0x0a : [u] : measure at priv level 1, 2, 3 (0 or 1)
>>> Attr-11 : 0x0b : [k] : measure at priv level 0 (0 or 1)
>>> Attr-12 : 0x0c : [g] : measure at guest level (0 or 1)
>>> Attr-13 : 0x0d : [h] : measure at hypervisor level (0 or 1)
>>>
>>>
>>> You notice the new attributes now merged with the regular unit masks.
>>> To enable invert + edge + counter-mask on this event for OPS_ADD, you
>>> simply need to pass:
>>> DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>>>
>>> This counts every cycle in which less than 2 FPU add ops are
>>> dispatched. The key value add for tools is that there is no need to
>>> pass AMD64-specific structures to enable AMD-specific features.
>>> Proof with the libpfm self examples shown here on top of PCL:
>>>
>>> $ self DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>>> [0x2d40100 event_sel=0x0 event_sel2=0x0 umask=0x1 os=0 usr=0 en=1
>>> int=1 inv=1 edge=1 cnt_mask=2 guest=0 host=0]DISPATCHED_FPU
>>> [type=4 val=0x2d40100 e_u=0 e_k=0 e_hv=0 plm=0x0]
>>> DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>>> 0 DISPATCHED_FPU:OPS_ADD:i=1:e=1:c=2
>>>
>>> The 3rd line shows the PCL encoding for this event.
>>>
>>> As I said, PCL events are automatically added if PCL is detected on the
>>> host:
>>> $ showeventinfo
>>> ...
>>> #-----------------------------
>>> Name : PERF_COUNT_CPU_CYCLES
>>> Desc : PERF_COUNT_CPU_CYCLES
>>> Code : 0x0
>>> Counters : [ ]
>>> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>>> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>>> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>>> #-----------------------------
>>> Name : PERF_COUNT_INSTRUCTIONS
>>> Desc : PERF_COUNT_INSTRUCTIONS
>>> Code : 0x1
>>> Counters : [ ]
>>> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>>> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>>> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>>> ...
>>> Name : PERF_COUNT_CONTEXT_SWITCHES
>>> Desc : PERF_COUNT_CONTEXT_SWITCHES
>>> Code : 0x100000003
>>> Counters : [ ]
>>> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>>> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>>> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>>> #-----------------------------
>>> Name : PERF_COUNT_CPU_MIGRATIONS
>>> Desc : PERF_COUNT_CPU_MIGRATIONS
>>> Code : 0x100000004
>>> Counters : [ ]
>>> Attr-00 : 0x00 : [u] : measure at priv level 1, 2, 3, (0 or 1)
>>> Attr-01 : 0x01 : [k] : measure at priv level 0 (0 or 1)
>>> Attr-02 : 0x02 : [hv] : measure at hypervisor level (0 or 1)
>>>
>>> And same thing, you can measure those with an unmodified program:
>>> $ self perf_count_context_switches
>>> [type=1 val=0x3 e_u=0 e_k=0 e_hv=0 plm=0x0] perf_count_context_switches
>>> 1002 perf_count_context_switches
>>>
>>> On AMD64 Family 10h, IBS will be enabled using the same mechanism.
>>>
>>> ------------------------------------------------------------------------------
>>> Crystal Reports - New Free Runtime and 30 Day Trial
>>> Check out the new simplified licensing option that enables unlimited
>>> royalty-free distribution of the report engine for externally facing
>>> server and web deployment.
>>> http://p.sf.net/sfu/businessobjects
>>> _______________________________________________
>>> perfmon2-devel mailing list
>>> perfmon2-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
>>
>>
>