Hi,

On Tue, May 15, 2018 at 11:48 AM, laksono <laks...@gmail.com> wrote:

> All,
>
>
> I want to profile using MEM_TRANS_RETIRED::LATENCY_ABOVE_THRESHOLD
> counter on Intel SandyBridge. Using libpfm4's examples/check_events, I can
> extract the perf_events config into *0x5301cd* and *0x3* :
>
>
>  $ ./check_events snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD
>
>
> Requested Event: snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD
>
> Actual    Event: snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3
>
> PMU            : Intel Sandy Bridge
>
> IDX            : 142606390
>
> Codes          : *0x5301cd* *0x3*
>
> However, looking at the result of examples/showevtinfo program, I believe
> the config number of MEM_TRANS_RETIRED::LATENCY_ABOVE_THRESHOLD should be
> *0x1cd* (*0x01* for the umask and *0xcd* for the code):
>
>
> IDX      : 142606390
>
> PMU name : snb (Intel Sandy Bridge)
>
> Name     : MEM_TRANS_RETIRED
>
> Equiv    : None
>
> Flags    : [precise]
>
> Desc     : Memory transactions retired
>
> Code     : *0xcd*
>
> Umask-00 : *0x01* : PMU : [LATENCY_ABOVE_THRESHOLD] : [precise] : Memory
> load instructions retired above programmed clocks, minimum threshold value
> is 3 (Precise Event and ldlat required)
>
> Umask-01 : 0x02 : PMU : [PRECISE_STORE] : [precise] : Capture where stores
> occur, must use with PEBS (Precise Event required)
>
> Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
>
> Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
>
> Modif-02 : 0x02 : PMU : [e] : edge level (may require counter-mask >= 1)
> (boolean)
>
> Modif-03 : 0x03 : PMU : [i] : invert (boolean)
>
> Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)
>
> Modif-05 : 0x05 : PMU : [t] : measure any thread (boolean)
>
> Modif-06 : 0x06 : PMU : [ldlat] : load latency threshold (cycles,
> [3-65535]) (integer)
>
> The question is how pfm_get_os_event_encoding() translates
> MEM_TRANS_RETIRED::LATENCY_ABOVE_THRESHOLD into *0x5301cd* and *0x3*
> instead of *0x1cd* ?
>
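You can ignore the 0x5 in 0x5301cd: it is the enable bit (together with
the interrupt bit), which gets overwritten by the kernel. The 3 right after
it (the low half of that 0x53 byte) holds the u=1 and k=1 privilege bits, so
the event code and umask are indeed 0x1cd.
The second code, 0x3, is more interesting. This event is special: it
requires a latency filter. As you can see from the description, the event
increments when the load execution latency is above a certain threshold, so
you need to specify that threshold. This is done with the ldlat= modifier,
appended to the event string the same way it appears in the "Actual Event"
line above (:ldlat=3). If you don't specify one, the library assumes you want
the smallest latency possible, which is 3.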
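
To make the first value concrete, here is a small Python sketch that splits
0x5301cd into its bit fields. It is not part of libpfm4: the decode_evtsel
helper and the field names are my own, and the layout assumed is the usual
Intel event-select register layout (event select in bits 0-7, umask in bits
8-15, control bits in 16-23, counter-mask in 24-31).

```python
# Assumed layout: Intel x86 event-select register (name, shift, width).
# Names are illustrative, not libpfm4 identifiers.
FIELDS = [
    ("event_select", 0, 8),
    ("umask", 8, 8),
    ("usr", 16, 1),          # u modifier
    ("os", 17, 1),           # k modifier
    ("edge", 18, 1),         # e modifier
    ("pin_control", 19, 1),
    ("interrupt", 20, 1),
    ("any_thread", 21, 1),   # t modifier
    ("enable", 22, 1),
    ("invert", 23, 1),       # i modifier
    ("counter_mask", 24, 8), # c modifier
]


def decode_evtsel(value):
    """Hypothetical helper: split a raw event-select value into named fields."""
    return {
        name: (value >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


fields = decode_evtsel(0x5301CD)
for name, val in fields.items():
    print(f"{name:13s} = {val:#x}")

# Event code plus umask only, i.e. what showevtinfo reports.
event_and_umask = 0x5301CD & 0xFFFF
print(f"event+umask   = {event_and_umask:#x}")
```

The low 16 bits give the 0x1cd you expected; everything above that comes from
the default modifiers (k=1:u=1) and the enable/interrupt bits.
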
Hope this clarifies how to use this event.


> Thanks
>
> Laksono Adhianto
>
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel