On Sun, Nov 1, 2009 at 4:37 AM, Carole Wu <cwu...@gmail.com> wrote:
> Hello,
>
> pfmon --smpl-module=pebs-ll -e MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD \
>          --ld-lat-threshold=4 --long-smpl-periods=2000 --smpl-compact \
>          --with-header ...
>
> This should generate a trace of samples, where each sample represents
> every 2000th long-latency (>4 cycles) memory operation. 2000 x (the number
> of samples in the trace) should equal the number of LAST_LEVEL_CACHE_MISSES
> collected with perfmon2's "counting" capability, e.g. pfmon -e
> LAST_LEVEL_CACHE_MISSES. Is this right?

No.

First of all, this is not tracing but statistical sampling, so you are
going to lose some misses.

There are shadowing effects. To complete a sample, the hardware needs the
load's latency. Until that latency is collected, no other load can be
sampled, even if it exceeds the threshold; i.e., PEBS can only track one
load at a time. In addition, when the PEBS buffer fills up, monitoring
stops but execution continues, so you miss some more loads there.

Note that this discrepancy is not specific to PEBS; you get the same
behavior with AMD IBS or Itanium D-EAR. The idea is that if you run for
long enough, you will eventually get enough representative samples to
approximate a trace.
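
To make the undercount concrete, here is a toy simulation in Python. It is
only a sketch and does not model the real hardware: the three load sites
A/B/C, their 60/30/10 split, and the `shadow`, `buffer_entries` and `drain`
parameters are all made-up knobs standing in for the blind windows described
above. Only `period` corresponds to the sampling period given to pfmon.

```python
import random


def simulate(n_loads, period, shadow, buffer_entries, drain, seed=1):
    """Toy model of count-down sampling with blind windows.

    n_loads        qualifying loads executed during the run
    period         sampling period (one sample every `period` counted loads)
    shadow         loads missed while one sample is completed (assumed)
    buffer_entries samples held before the buffer must be drained (assumed)
    drain          loads missed while the buffer is drained (assumed)
    """
    rng = random.Random(seed)
    true_hits = {"A": 0, "B": 0, "C": 0}   # what a full trace would show
    sampled = {"A": 0, "B": 0, "C": 0}     # what the sampler records
    countdown = period
    blind = 0
    in_buffer = 0
    for _ in range(n_loads):
        r = rng.random()
        site = "A" if r < 0.6 else ("B" if r < 0.9 else "C")
        true_hits[site] += 1
        if blind > 0:
            # The load executes, but the sampler neither counts nor sees it.
            blind -= 1
            continue
        countdown -= 1
        if countdown == 0:
            sampled[site] += 1
            countdown = period
            blind = shadow
            in_buffer += 1
            if in_buffer == buffer_entries:
                # Buffer full: monitoring stops while execution continues.
                in_buffer = 0
                blind += drain
    return true_hits, sampled


if __name__ == "__main__":
    n = 1_000_000
    true_hits, sampled = simulate(n, period=2000, shadow=50,
                                  buffer_entries=100, drain=5000)
    n_samples = sum(sampled.values())
    print("true count      :", n)
    print("samples x period:", n_samples * 2000)
    for site in true_hits:
        print(site,
              "true share %.2f" % (true_hits[site] / n),
              "sampled share %.2f" % (sampled[site] / n_samples))
```

With any non-zero blind window, samples x period comes out below the true
count, while the per-site shares of the samples still track the true shares.
Setting `shadow` and `drain` to 0 makes the two counts agree, which is the
idealized case the question assumed.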


> I am seeing mismatching numbers for counts collected with PEBS and the
> counting feature in perfmon2 for my Nehalem machine.
> Your help is appreciated. Thanks,
> Carole
> On Wed, Oct 28, 2009 at 4:36 AM, stephane eranian <eran...@googlemail.com>
> wrote:
>>
>> Hi,
>>
>> I am happy to report that I have now uploaded all the code necessary to use
>> PEBS on Intel Core, Atom, and Nehalem. That includes PEBS-LL on Nehalem,
>> which is used to sample where cache misses occur.
>>
>> What you need:
>>    - latest libpfm sources from CVS
>>
>>    - latest pfmon sources from CVS
>>
>>    - perfmon2 2.6.30 from GIT
>>
>>      git clone git://git.kernel.org/pub/scm/linux/kernel/git/eranian/linux-2.6.git
>>
>>      Make sure you enable 'Unified PEBS'.
>>
>>
>> This kernel includes a unified PEBS sampling format which supports Netburst,
>> Core, Atom, and Nehalem. You must insert the module perfmon_pebs_smpl
>> (or compile in the code).
>>
>> Next, to use PEBS, you can simply do:
>>
>>   pfmon --smpl-module=pebs --smpl-compact --with-header -einst_retired:any_p \
>>             --long-smpl-period=2400000 ...
>>
>>   Not all events support PEBS. In --smpl-compact mode, each line contains a
>>   PEBS sample.
>>
>> To collect cache misses on Nehalem, you can do:
>>
>> pfmon --smpl-module=pebs-ll -e MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD \
>>          --ld-lat-threshold=4 --long-smpl-periods=2000 --smpl-compact \
>>          --with-header ...
>>
>>  You must use the MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD event to activate
>>  this HW feature.
>>
>> Each line contains a PEBS record, including the cache miss information.
>> The ld-lat parameter is the minimum threshold for the miss latency: only
>> misses with latency >= threshold are captured. It must be at least 4;
>> 4 cycles is the L1D hit latency. For each captured miss, you get an
>> instruction address, a data address, the miss latency, and the source of
>> the data (where it came from; refer to the Intel documentation).
>>
>> It is important to understand that the instruction address does NOT point
>> to the load instruction but ALWAYS to the next dynamic instruction, i.e.,
>> the whole state is recorded at retirement of the load.
>>
>>
>> On Mon, Oct 5, 2009 at 1:55 AM, Carole Wu <cwu...@gmail.com> wrote:
>> > Hello,
>> >
>> > I'd like to collect information about my workload, running on Nehalem,
>> > using PEBS, so I use the following command.
>> >
>> >>> pfmon -e MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD --ld-lat-threshold=1
>> >>> --long-smpl-periods=2000 --short-smpl-periods=200 ./mcf_base inp.in
>> >>> load latency threshold not yet supported
>> > However, the response seems to suggest that my machine does not currently
>> > support PEBS? Is it true, or am I not setting parameters correctly?
>> >
>> > Any help is greatly appreciated.
>> >
>> > Carole
>> >
>> >
>> >
>
>

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
