On Wed, Feb 3, 2010 at 3:40 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Wed, 2010-02-03 at 15:30 +0100, Stephane Eranian wrote:
>> On Wed, Feb 3, 2010 at 3:19 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>> > On Wed, 2010-02-03 at 15:07 +0100, Stephane Eranian wrote:
>> >> >> The only improvement that PEBS provides is that you get an IP and the
>> >> >> machine state at retirement of an instruction that caused the event to
>> >> >> increment. Thus, the IP points to the next dynamic instruction. The
>> >> >> instruction is not the one that caused the P-th occurrence of the
>> >> >> event, if you set the period to P. It is at P+N, where N cannot be
>> >> >> predicted and varies depending on the event and executed code. This
>> >> >> introduces some bias in the samples.
>> >> >
>> >> > I'm not sure I follow, it records the next event after overflow, doesn't
>> >> > that make it P+1?
>> >> >
>> >> That is not what I wrote. I did not say it records at P+1. I said it
>> >> records at P+N, where N varies from sample to sample and cannot be
>> >> predicted. N is expressed in the unit of the sampling event.
>> >
>> > OK, so I'm confused.
>> >
>> > The manual says it arms the PEBS assist on overflow, and the PEBS thing
>> > will then record the next event. Which to me reads like P+1.
>> >
>> You are assuming arming is instantaneous.
>
> Yes I was, ok that stinks.
>
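To make the P+N point concrete, here is a rough Python sketch. It is a
toy model, not a description of the hardware: the function name, the
max_arm_delay parameter and the uniform distribution of N are all
assumptions made for illustration. The only things taken from this
thread are that the counter is reloaded when the record is written, not
on overflow, and that N varies from sample to sample.

```python
import random


def sample_gaps(period, max_arm_delay, num_samples, seed=0):
    """Return the number of events between consecutive PEBS records.

    Hypothetical model: after the counter overflows at `period` events,
    the PEBS assist needs N more events before a record is written.
    N is drawn uniformly from 1..max_arm_delay (an assumption; the real
    distribution depends on the event and the executed code).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(num_samples):
        n = rng.randint(1, max_arm_delay)
        # The counter restarts only once the record is written, so the
        # distance between two samples is P+N rather than P.
        gaps.append(period + n)
    return gaps


if __name__ == "__main__":
    # Instantaneous arming: every sample lands at P+1.
    print(sample_gaps(period=1000, max_arm_delay=1, num_samples=5))
    # Non-instantaneous arming: the gap varies from sample to sample.
    print(sample_gaps(period=1000, max_arm_delay=50, num_samples=5))
```

With max_arm_delay set to 1 the model collapses to the P+1 reading of
the manual; anything larger gives the varying gap described above.
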
PEBS is still very useful because it guarantees the state you capture
is at retirement of an instruction which caused the event.

PEBS also gets way more interesting on Nehalem because of the
ability to capture where cache misses occur. That's the load latency
feature. You need to support that.

I believe you would need to abstract this in a generic fashion so it
could be used on other architectures, such as AMD with IBS.

On Nehalem, it requires the following:

- It only works if you sample on MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD.

- The threshold must be programmed into a dedicated MSR. The extra
  difficulty is that this MSR is shared between the logical CPUs of a
  core when HT is on.
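
One possible way to deal with the shared MSR, sketched as a small
Python model rather than kernel code. The class and method names are
hypothetical, and the policy (accept a second user only if it asks for
the same threshold) is just one option, not something the hardware
mandates:

```python
class SharedThresholdMSR:
    """Toy model of one threshold register shared by the HT siblings of a core."""

    def __init__(self):
        self.threshold = None  # currently programmed value, None when unused
        self.users = 0         # number of active events relying on it

    def acquire(self, threshold):
        """Claim the register for an event.

        Succeeds if the register is free or already holds the same
        threshold; fails if a sibling is using a different value.
        """
        if self.users and self.threshold != threshold:
            return False
        self.threshold = threshold
        self.users += 1
        return True

    def release(self):
        """Drop one user; the register becomes free when the last one leaves."""
        if self.users == 0:
            raise RuntimeError("release without matching acquire")
        self.users -= 1
        if self.users == 0:
            self.threshold = None


if __name__ == "__main__":
    msr = SharedThresholdMSR()
    print(msr.acquire(32))  # first sibling programs the threshold
    print(msr.acquire(32))  # second sibling, same threshold
    print(msr.acquire(64))  # conflicting threshold while the register is in use
```

The point of the reference count is that the second thread never
silently changes the threshold under the first one.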


> If only they would reset the counter on overflow instead of on record,
> that would solve quite a few issues I imagine.
>
> Then add IP to the actual instruction and you've got yourself a useful
> tool :-)
>
>

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
