[perfmon2] Understanding libpfm4 perf_events attribute translation
All,

I want to profile using the MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD counter on Intel SandyBridge. Using libpfm4's examples/check_events, I can extract the perf_events config as *0x5301cd* and *0x3*:

$ ./check_events snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD
Requested Event: snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD
ActualEvent: snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3
PMU: Intel Sandy Bridge
IDX: 142606390
Codes : *0x5301cd* *0x3*

However, looking at the output of the examples/showevtinfo program, I believe the config number of MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD should be *0x1cd* (*0x01* for the umask and *0xcd* for the code):

IDX : 142606390
PMU name : snb (Intel Sandy Bridge)
Name : MEM_TRANS_RETIRED
Equiv: None
Flags: [precise]
Desc : Memory transactions retired
Code : *0xcd*
Umask-00 : *0x01* : PMU : [LATENCY_ABOVE_THRESHOLD] : [precise] : Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)
Umask-01 : 0x02 : PMU : [PRECISE_STORE] : [precise] : Capture where stores occur, must use with PEBS (Precise Event required)
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
Modif-02 : 0x02 : PMU : [e] : edge level (may require counter-mask >= 1) (boolean)
Modif-03 : 0x03 : PMU : [i] : invert (boolean)
Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)
Modif-05 : 0x05 : PMU : [t] : measure any thread (boolean)
Modif-06 : 0x06 : PMU : [ldlat] : load latency threshold (cycles, [3-65535]) (integer)

The question is: how does pfm_get_os_event_encoding() translate MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD into *0x5301cd* and *0x3* instead of *0x1cd*?

Thanks

Laksono Adhianto

___
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
Re: [perfmon2] Understanding libpfm4 perf_events attribute translation
Stephane,

Thanks for the info! It's clear to me now.

Laksono Adhianto

On Wed, May 16, 2018 at 12:31 PM, Stephane Eranian wrote:
> Hi,
>
> On Tue, May 15, 2018 at 11:48 AM, laksono wrote:
>> The question is how pfm_get_os_event_encoding() translates
>> MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD into *0x5301cd* and *0x3*
>> instead of *0x1cd* ?
>
> You can ignore the 0x5 in 0x5301cd; it is the enable bit, which gets
> overwritten by the kernel.
>
> The case of 0x3 is more interesting. This event is special: it requires a
> latency filter. As you can see from the description, the event increments
> when the load execution latency is above a certain threshold, so you need
> to specify that threshold. This is done using the ldlat= modifier. If you
> don't specify one, the library assumes you want the smallest possible
> latency, which is 3.
>
> Hope this clarifies how to use this event.
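
For readers following along, the first code can be split into its bit fields. The sketch below is only illustrative: the field layout is my assumption of the IA32_PERFEVTSELx register format as described in the Intel SDM, and decode_evtsel is a hypothetical helper, not a libpfm4 function. Under that assumption, the low 16 bits are exactly the expected 0x1cd, the 0x3 nibble is the USR and OS bits (matching u=1:k=1), and the 0x5 nibble is the enable bit together with the interrupt bit.

```python
# Assumed bit layout of an Intel IA32_PERFEVTSELx register:
# (field name, shift, width in bits)
FIELDS = [
    ("event_select", 0, 8),
    ("umask", 8, 8),
    ("usr", 16, 1),
    ("os", 17, 1),
    ("edge", 18, 1),
    ("pin_control", 19, 1),
    ("interrupt", 20, 1),
    ("any_thread", 21, 1),
    ("enable", 22, 1),
    ("invert", 23, 1),
    ("cmask", 24, 8),
]


def decode_evtsel(config):
    """Split a raw event-select value into its named bit fields."""
    return {
        name: (config >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


# First code printed by check_events in this thread.
fields = decode_evtsel(0x5301CD)
for name, value in fields.items():
    print(f"{name:13s} = {value:#x}")
```

The second code, 0x3, is not part of this register value; per the reply above it is the default ldlat threshold.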
[perfmon2] spr::TOPDOWN:SLOTS and spr::TOPDOWN:BAD_SPEC_SLOTS events have the same codes?
Hi,

I found that pfm_get_os_event_encoding() returns the same code for both the spr::TOPDOWN:SLOTS and spr::TOPDOWN:BAD_SPEC_SLOTS events. If I run the check_events example:

$ ./check_events TOPDOWN:SLOTS
...
Requested Event: TOPDOWN:SLOTS
ActualEvent: spr::TOPDOWN:SLOTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0
PMU: Intel SapphireRapid
IDX: 1073741888
Codes : *0x530400*

$ ./check_events TOPDOWN:BAD_SPEC_SLOTS
Requested Event: TOPDOWN:BAD_SPEC_SLOTS
ActualEvent: spr::TOPDOWN:BAD_SPEC_SLOTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0
PMU: Intel SapphireRapid
IDX: 1073741888
Codes : *0x530400*

Other sub-events such as TOPDOWN:BACKEND_BOUND_SLOTS, TOPDOWN:BR_MISPREDICT_SLOTS and TOPDOWN:MEMORY_BOUND_SLOTS correctly have unique codes.

Interestingly, the ./showevtinfo program shows that both TOPDOWN:SLOTS and TOPDOWN:BAD_SPEC_SLOTS have the same umask:

IDX : 1073741888
PMU name : spr (Intel SapphireRapid)
Name : TOPDOWN
Equiv: None
Flags: [hw_smpl] [speculative]
Desc : Topdown events.
Code : 0x0
Umask-00 : 0x02 : PMU : [BACKEND_BOUND_SLOTS] : [hw_smpl] [speculative] : TMA slots where no uops were being issued due to lack of back-end resources.
Umask-01 : *0x04* : PMU : [BAD_SPEC_SLOTS] : [hw_smpl] [speculative] : TMA slots wasted due to incorrect speculations.
Umask-02 : 0x08 : PMU : [BR_MISPREDICT_SLOTS] : [hw_smpl] [speculative] : TMA slots wasted due to incorrect speculation by branch mispredictions
Umask-03 : 0x10 : PMU : [MEMORY_BOUND_SLOTS] : [hw_smpl] [speculative] : TBD
Umask-04 : *0x04* : PMU : [SLOTS] : [hw_smpl] [speculative] : TMA slots available for an unhalted logical processor. Fixed counter - architectural event
Umask-05 : 0x1a4 : PMU : [SLOTS_P] : [hw_smpl] [speculative] : TMA slots available for an unhalted logical processor. General counter - architectural event
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
Modif-02 : 0x02 : PMU : [e] : edge level (may require counter-mask >= 1) (boolean)
Modif-03 : 0x03 : PMU : [i] : invert (boolean)
Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)
Modif-05 : 0x07 : PMU : [intx] : monitor only inside transactional memory region (boolean)
Modif-06 : 0x08 : PMU : [intxcp] : do not count occurrences inside aborted transactional memory region (boolean)

This was tested on an Intel Sapphire Rapids CPU running Linux 4.18, using the latest libpfm4 from the git repository.

Laksono Adhianto
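
A collision like this can be spotted mechanically by grouping the umask table by value. The following is a minimal sketch, assuming the umask values are transcribed by hand from the showevtinfo output in this message; find_collisions is a hypothetical helper and does not call into libpfm4.

```python
from collections import defaultdict

# Umask values copied from the showevtinfo output for spr::TOPDOWN.
UMASKS = {
    "BACKEND_BOUND_SLOTS": 0x02,
    "BAD_SPEC_SLOTS": 0x04,
    "BR_MISPREDICT_SLOTS": 0x08,
    "MEMORY_BOUND_SLOTS": 0x10,
    "SLOTS": 0x04,
    "SLOTS_P": 0x1A4,
}


def find_collisions(umasks):
    """Return {umask value: [names]} for every value used more than once."""
    by_value = defaultdict(list)
    for name, value in umasks.items():
        by_value[value].append(name)
    return {v: names for v, names in by_value.items() if len(names) > 1}


collisions = find_collisions(UMASKS)
for value, names in collisions.items():
    print(f"umask {value:#x} shared by: {', '.join(names)}")
```

With the table above, the only shared value is 0x04, which is consistent with both events encoding to 0x530400 when the event code is 0x0.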
[perfmon2] Issues with some Intel Sapphire Rapids TOPDOWN.* codes
Hi,

I think pfm_get_os_event_encoding() returns incorrect codes for some Intel TOPDOWN sub-events. If I run the ./check_events example program with some TOPDOWN events:

$ ./check_events TOPDOWN:BACKEND_BOUND_SLOTS TOPDOWN:BR_MISPREDICT_SLOTS TOPDOWN:MEMORY_BOUND_SLOTS
...
Requested Event: TOPDOWN:BACKEND_BOUND_SLOTS
ActualEvent: spr::TOPDOWN:BACKEND_BOUND_SLOTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0
PMU: Intel SapphireRapid
IDX: 1073741890
Codes : *0x530200*

Requested Event: TOPDOWN:BR_MISPREDICT_SLOTS
ActualEvent: spr::TOPDOWN:BR_MISPREDICT_SLOTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0
PMU: Intel SapphireRapid
IDX: 1073741890
Codes : *0x530800*

Requested Event: TOPDOWN:MEMORY_BOUND_SLOTS
ActualEvent: spr::TOPDOWN:MEMORY_BOUND_SLOTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0
PMU: Intel SapphireRapid
IDX: 1073741890
Codes : *0x531000*

According to the Intel perfmon JSON file at https://github.com/intel/perfmon/blob/main/SPR/events/sapphirerapids_core.json, if I understand correctly, the above codes should be *0x5302a4*, *0x5308a4*, and *0x5310a4* respectively:

"EventCode": "*0xa4*", "UMask": "*0x02*", "EventName": "TOPDOWN.BACKEND_BOUND_SLOTS",
"EventCode": "*0xa4*", "UMask": "*0x08*", "EventName": "TOPDOWN.BR_MISPREDICT_SLOTS",
"EventCode": "*0xa4*", "UMask": "*0x10*", "EventName": "TOPDOWN.MEMORY_BOUND_SLOTS",

Can someone confirm whether this is correct?

Laksono Adhianto
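
The expected values can be reproduced with a little arithmetic. This is a sketch under two assumptions: the event code occupies bits 0-7 and the umask bits 8-15 of the config, and the 0x53 flag byte seen in the check_events output sits in bits 16-23. raw_config and the EVENTS table are hypothetical names for illustration, not part of libpfm4.

```python
# Flag byte taken from the codes printed by check_events (0x53 in bits 16-23).
FLAGS = 0x53 << 16


def raw_config(event_code, umask, flags=FLAGS):
    """Combine flags, umask (bits 8-15) and event code (bits 0-7)."""
    return flags | (umask << 8) | event_code


# name: (EventCode and UMask from the Intel JSON, code reported by check_events)
EVENTS = {
    "TOPDOWN.BACKEND_BOUND_SLOTS": (0xA4, 0x02, 0x530200),
    "TOPDOWN.BR_MISPREDICT_SLOTS": (0xA4, 0x08, 0x530800),
    "TOPDOWN.MEMORY_BOUND_SLOTS": (0xA4, 0x10, 0x531000),
}

for name, (code, umask, reported) in EVENTS.items():
    expected = raw_config(code, umask)
    # XOR shows which bits differ between the two encodings.
    print(f"{name}: expected {expected:#x}, reported {reported:#x}, "
          f"differing bits {expected ^ reported:#x}")
```

In each case the differing bits are 0xa4, that is, the reported codes carry the umask but an event code of 0x00.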