Thanks, that helped!
Currently I am trying to understand how I can count L1 cache misses.
Can I get it simply with MEM_LOAD_RETIRED:L1D_HIT:i:c=1?
How do I know when the counter mask can be used, and in which cases does it
make sense?
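
For reference, here is a minimal sketch of how the counter mask (c=) and
invert (i) modifiers are usually described for Intel PMUs: with a nonzero
counter mask the counter stops counting occurrences and instead counts
cycles in which at least c events occurred, and with invert set it counts
cycles in which fewer than c events occurred. The function name and the
per-cycle trace below are hypothetical and only model that behaviour;
nothing here is measured on real hardware or calls libpfm4.

```python
def count_event(per_cycle, cmask=0, invert=False):
    """Toy model of one counter fed a trace of per-cycle event counts.

    Assumption: with cmask == 0 the invert flag has no effect and the
    counter simply accumulates occurrences (a simplification).
    """
    if cmask == 0:
        # No counter mask: every occurrence is counted.
        return sum(per_cycle)
    if invert:
        # Invert: count cycles with fewer than cmask events.
        return sum(1 for n in per_cycle if n < cmask)
    # Plain counter mask: count cycles with at least cmask events.
    return sum(1 for n in per_cycle if n >= cmask)


# Hypothetical trace: retired loads that hit L1D in each of 10 cycles.
l1d_hits_per_cycle = [2, 0, 1, 0, 0, 3, 1, 0, 0, 0]

occurrences = count_event(l1d_hits_per_cycle)                       # like EVENT
busy_cycles = count_event(l1d_hits_per_cycle, cmask=1)              # like EVENT:c=1
idle_cycles = count_event(l1d_hits_per_cycle, cmask=1, invert=True) # like EVENT:i:c=1

print(occurrences, busy_cycles, idle_cycles)
```

Under this model, :i:c=1 on a hit event gives the number of cycles in which
no L1D-hit load retired, which is a different quantity from the number of
L1 misses. The counter mask seems to make sense when the question is about
cycles (for example, how many cycles had no activity of some kind) rather
than about occurrences.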
2010/11/25 stephane eranian
Hi,
I think the section of the manual you're referring to talks
about drill-down on Intel Core (Core 2) processors. There
you do have RS_UOPS_DISPATCHED.
If you want to do something similar on Nehalem, I recommend
you look at the presentations here:
https://openlab-mu-internal.web.cern.ch/op
Hi,
What machine are you trying to do this on?
On Thu, Nov 25, 2010 at 4:39 PM, Eduard Diner
wrote:
Hi guys,
I have instrumented my code on Nehalem with libpfm4 and now I am a little
bit confused.
I am reading Intel's "IA64-Optimization-Reference". In Appendix B3.4 they
describe a drill-down technique for performance analysis. I think this is a
good starting point for analysis.
Basically I want t