On Thu, 6 Mar 2014, Alen Stojanov wrote:

> However, whenever I try to run matrices of bigger size, the reported flops are
> not even close to the flops that I am supposed to obtain (anticipated results:
> 600 * 600 * 600 * 2 = 432'000'000):
>
> perf stat -e r538010 ./mmmtest 600
>
>  Performance counter stats for './mmmtest 600':
>
>      2,348,148,851 r538010
>
>        0.955511968 seconds time elapsed
>
> ...
> CPU: Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz, 8 cores
> Linux Kernel: 3.11.0-12-generic
> GCC Version: gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu8)
> Monitored events: FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE - Raw event: 0x538010
> (converted using libpfm4)
...
> Do you know why does this happens ? How can I instruct perf to obtain accurate
> results ?
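
As a rough sanity check on the figures quoted above, here is a small Python sketch that compares the anticipated flop count with the reported counter value (the counter comes out at about 5.4 times the expected count) and splits the raw event code into its fields. It assumes the usual Intel IA32_PERFEVTSELx layout (event select in bits 0-7, unit mask in bits 8-15, USR in bit 16, OS in bit 17, INT in bit 20, EN in bit 22); the name decode_raw_event is only illustrative and is not part of perf or libpfm4.

```python
# Figures copied from the quoted perf stat report.
N = 600
expected_flops = 2 * N**3      # one multiply and one add per inner iteration
measured = 2_348_148_851       # reported r538010 count

ratio = measured / expected_flops
print(f"expected: {expected_flops:,}")
print(f"measured: {measured:,}")
print(f"ratio:    {ratio:.2f}x")


def decode_raw_event(raw):
    """Split a raw code into fields, ASSUMING the IA32_PERFEVTSELx layout."""
    return {
        "event_select": raw & 0xFF,       # bits 0-7
        "umask": (raw >> 8) & 0xFF,       # bits 8-15
        "usr": bool((raw >> 16) & 1),     # count at user privilege level
        "os": bool((raw >> 17) & 1),      # count at kernel privilege level
        "int": bool((raw >> 20) & 1),     # interrupt on overflow
        "en": bool((raw >> 22) & 1),      # counter enable
    }


fields = decode_raw_event(0x538010)
for name, value in fields.items():
    shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
    print(f"{name:>12}: {shown}")
```

Under that assumed layout the encoding carries both the USR and the OS flag. As far as I know, perf decides what to count at each privilege level from the event modifiers rather than from these bits, so treat the decoded flags as descriptive only.
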
One thing you might want to do is put :u on your event name, so that you
measure only user-space activity and not the kernel's as well.

Floating-point events are notoriously unreliable on modern Intel processors.
The event might also be counting speculative events or uops, and it gets more
complicated with AVX in the mix.

What does the Intel documentation say about this event for your architecture?

Vince