On Jul 15, 2009, at 03:26 , stephane eranian wrote:

Ken,

On Fri, Jul 3, 2009 at 10:01 PM, Kenneth Hoste <kenneth.ho...@ugent.be> wrote:

On Jul 1, 2009, at 07:03 , stephane eranian wrote:

Kenneth,

Let me check on this with Intel.

Thanks! Any news yet?


I can confirm that the following events do indeed overcount on Intel Core i7:
  - L1D_CACHE_LOCK : overcounts by 3x
  - L1D_CACHE_LD   : overcounts because "This event counts load uops at
                     dispatch. Consequently loads which are blocked and then
                     re-dispatched will be counted multiple times."
  - L1D_CACHE_ST   : overcounts for the same reason as L1D_CACHE_LD

OK, that could explain what we are seeing. Glad to hear we weren't just missing
something obvious.

For D-cache misses, it is recommended you use MEM_LOAD_RETIRED which
is what you did.

Fortunately there's this option to dance around the buggy event above for counting
L1-D load misses. Thanks for confirming this.

As for your approximation using this event, you may also want to add HIT_LFB.

Yes, you're right. We tried that, and the counts we obtain now are a lot closer to
the ones we were getting on the Core 2 (see graph in attachment).

Attachment: L1-D_misses_core2_vs_corei7.pdf
Description: Adobe PDF document
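
For readers following along, the approximation discussed above can be sketched roughly as follows. This is only an illustration, not the exact script we used: it assumes the L1-D load miss estimate is the sum of the MEM_LOAD_RETIRED sub-events that correspond to a load being served from beyond the L1-D, plus HIT_LFB as suggested. The unit mask names are assumed to match the Core i7 (Nehalem) event table, and the function name and the counts in the usage note are hypothetical.

```python
# Hypothetical helper: estimate retired L1-D load misses on Core i7
# from MEM_LOAD_RETIRED sub-event counts, instead of relying on the
# overcounting L1D_CACHE_LD event.
#
# Assumption: these unit mask names match the Nehalem event table;
# check them against your tool's event list before use.
MISS_SUBEVENTS = (
    "L2_HIT",
    "LLC_UNSHARED_HIT",
    "OTHER_CORE_L2_HIT_HITM",
    "LLC_MISS",
)


def estimate_l1d_load_misses(counts, include_hit_lfb=True):
    """Sum the MEM_LOAD_RETIRED sub-events for loads that missed the L1-D.

    counts: dict mapping unit mask name -> measured count.
    include_hit_lfb: also count loads that hit a line fill buffer, i.e.
    loads to a line whose miss is already outstanding.
    """
    subevents = MISS_SUBEVENTS + (("HIT_LFB",) if include_hit_lfb else ())
    missing = [name for name in subevents if name not in counts]
    if missing:
        raise KeyError("missing sub-event counts: " + ", ".join(missing))
    return sum(counts[name] for name in subevents)
```

With made-up counts such as L2_HIT=50, LLC_UNSHARED_HIT=20, OTHER_CORE_L2_HIT_HITM=5, LLC_MISS=10 and HIT_LFB=15, the estimate is 100 with HIT_LFB included and 85 without it.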



I have not yet received confirmation about your initial posting about
L1I:MISSES, but it is likely that it also overcounts.

OK. Are you still expecting "official" word on this?

Any idea if there are alternatives for dancing around this one too?
We couldn't figure out an alternative, but maybe we missed it...

Those issues are known but the SDM Vol3b has not yet been updated to
reflect them.

Alternatively, I think there is also an erratum that seems to describe the
problem with L1D_CACHE_*.

Look in http://download.intel.com/design/processor/specupdt/320836.pdf
at erratum AAJ105.

I don't think that erratum matches with the bug mentioned above.
It just states that non-retired loads may also be counted, not that a
single (retired or not) load might be counted several times due to
re-dispatching. Right?

We always assumed the 40h event counted both retired and non-retired
loads, but even then the counts didn't make sense (overcount is too high).

Thanks for your patience.

Thank *you* for helping us out with this, we were pretty much in the dark here.

Kenneth

--

Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.ho...@elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
