Hello,

Stijn (in CC), a colleague of mine, and I have been seeing some odd counts on a Core i7 machine for the SPEC CPU2000 and CPU2006 workloads, specifically for L1 instruction cache misses.
Comparing the counts on the Core i7 with those obtained on a Core 2, Stijn noticed unexpected differences, namely large overcounts on the Core i7. This is strange, because the L1 instruction caches on both processors are the same size (32 KB), and the more recent Core i7 has additional features such as a victim cache and a stream buffer. So the counts should be (slightly?) lower, not higher.

I'm using the perfex tool that comes with the perfctr kernel patch on both systems, and also the pfmon tool on the Core i7 system to validate the counts. On the Core 2 I'm using the L1I_MISSES event (event code 81h); on the Core i7 I'm using the L1I.MISSES event (event code 80h with mask 02h). More specifically:

*) Core 2:
   perfex -e 0x410081 ./gcc 200.i -o 200.s

*) Core i7:
   perfex -e 0x410280 ./gcc 200.i -o 200.s
   and
   pfmon -e L1I:MISSES ./gcc 200.i -o 200.s

One example is CPU2000's gcc with the 200.i reference input. On the Core 2 we counted ~76M (million) L1-I misses. Also counting the cycles during which the instruction decoder is stalled due to these misses gives an estimated penalty of roughly 19 cycles per L1-I miss, which makes perfect sense, because the latency of the L2 cache is about 19 cycles.

On the Core i7 system we counted ~292M L1-I misses, far more than on the Core 2 system with the same L1-I cache size. Counting the cycles during which the decoder is stalled yields a penalty of ~2.1 cycles/miss, a surprisingly low number given that the L2 cache latency is significantly higher than that.

So our conclusion is that the L1-I misses event on the Core i7 isn't counting what it claims to count. The documentation says that the L1I.MISSES event also includes streaming buffer and victim cache misses, but to our knowledge those structures are only consulted if the request has already missed in the L1-I cache. The documentation also says explicitly that every L1-I miss is counted only once.

Does anyone have suggestions on what we might be seeing here?
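For reference, here is a minimal Python sketch of how the two raw perfex codes decompose and how the numbers above relate to each other. It assumes that perfex passes the value through as the contents of the event-select register with the usual IA32_PERFEVTSELx layout (event select in bits 0-7, unit mask in bits 8-15, USR in bit 16, OS in bit 17, EN in bit 22). The function names are made up for this illustration, and the stall-cycle totals are back-computed from the approximate figures quoted above, not separately measured values.

```python
# Illustrative sketch only: decode_evtsel and implied_stall_cycles are
# hypothetical helper names, not part of perfex, perfctr or pfmon.

def decode_evtsel(raw):
    """Split a raw event-select value into its main fields
    (assumes the architectural IA32_PERFEVTSELx bit layout)."""
    return {
        "event": raw & 0xFF,               # bits 0-7: event select
        "umask": (raw >> 8) & 0xFF,        # bits 8-15: unit mask
        "usr": bool(raw & (1 << 16)),      # bit 16: count in user mode
        "os": bool(raw & (1 << 17)),       # bit 17: count in kernel mode
        "enable": bool(raw & (1 << 22)),   # bit 22: counter enable
    }

def implied_stall_cycles(misses, cycles_per_miss):
    """Back-compute total decoder stall cycles from a miss count
    and an estimated per-miss penalty."""
    return misses * cycles_per_miss

core2 = decode_evtsel(0x410081)    # L1I_MISSES on Core 2
corei7 = decode_evtsel(0x410280)   # L1I.MISSES on Core i7

# Approximate counts reported above for gcc.
core2_misses, core2_penalty = 76e6, 19.0
corei7_misses, corei7_penalty = 292e6, 2.1

miss_ratio = corei7_misses / core2_misses
core2_stalls = implied_stall_cycles(core2_misses, core2_penalty)
corei7_stalls = implied_stall_cycles(corei7_misses, corei7_penalty)

print("Core 2 :", {k: hex(v) if isinstance(v, int) and not isinstance(v, bool) else v
                   for k, v in core2.items()})
print("Core i7:", {k: hex(v) if isinstance(v, int) and not isinstance(v, bool) else v
                   for k, v in corei7.items()})
print("miss ratio (i7 / Core 2): %.2f" % miss_ratio)
print("implied stall cycles: Core 2 %.0fM, Core i7 %.0fM"
      % (core2_stalls / 1e6, corei7_stalls / 1e6))
```

Under that assumption both codes count user-mode events only (USR set, OS clear), so the two configurations differ only in event select and unit mask. The Core i7 reports roughly 3.8 times as many misses while the implied total stall time is lower, which is why the per-miss penalty looks implausibly small.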
Is it a problem with the event, or are we misinterpreting what the event is actually counting?

Any comments/suggestions are highly appreciated.

greetings,

Kenneth

--
Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.ho...@elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be

------------------------------------------------------------------------------
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel