This patch fixes a bug in the 2.6.33 x86 event scheduling code whereby all counts are bogus as soon as events need to be multiplexed because the PMU is overcommitted.

The code in hw_perf_enable() was causing multiplexed events to
accumulate collected counts twice causing bogus results. This
is demonstrated on AMD Barcelona with the example below. First
run, no conflict, you obtain the actual counts. Second run, PMU
overcommitted, multiplexing, all events are over-counted. Third
run, patch applied, you obtain the correct count through scaling.
Intel processors would be affected in the same way.

# perf stat -e instructions,cycles ./noploop 10
noploop for 10 seconds

 Performance counter stats for './noploop 10':

    10884992991  instructions             #      0.495 IPC
    21976457932  cycles

   10.000906311  seconds time elapsed

# perf stat -e instructions,instructions,instructions,instructions,cycles ./noploop 10
noploop for 10 seconds

 Performance counter stats for './noploop 10':

    16342703033  instructions             #      1.000 IPC    (scaled from 80.00%)
    16337667144  instructions             #      0.999 IPC    (scaled from 80.00%)
    16342494809  instructions             #      1.000 IPC    (scaled from 80.00%)
    16344432632  instructions             #      1.000 IPC    (scaled from 80.00%)
    16346620711  cycles                                       (scaled from 80.00%)

   10.015941304  seconds time elapsed

# perf stat -e instructions,instructions,instructions,instructions,cycles ./noploop 10
noploop for 10 seconds

 Performance counter stats for './noploop 10':

    10865832804  instructions             #      0.495 IPC    (scaled from 80.00%)
    10866436957  instructions             #      0.495 IPC    (scaled from 80.00%)
    10866172153  instructions             #      0.495 IPC    (scaled from 80.00%)
    10866276672  instructions             #      0.495 IPC    (scaled from 80.00%)
    21944300714  cycles                                       (scaled from 80.00%)

   10.000686860  seconds time elapsed

Signed-off-by: Stephane Eranian <eran...@google.com>
--
 perf_event.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 97cddbf..ef5d63f 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -818,8 +818,6 @@ void hw_perf_enable(void)
 			    match_prev_assignment(hwc, cpuc, i))
 				continue;
 
-			x86_pmu_stop(event);
-
 			hwc->idx = -1;
 		}
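
For readers unfamiliar with the "(scaled from 80.00%)" annotation in the runs above: when an event is on a counter for only part of the measurement, perf stat extrapolates the raw count by the ratio of enabled time to running time. The Python sketch below only illustrates that arithmetic and compares the three runs; it is not kernel or perf code. The helper names scale_count and relative_error are hypothetical, and the input figures are the first "instructions" line of each run.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Extrapolate a multiplexed raw count to the whole enabled window.

    Mirrors the usual count * time_enabled / time_running estimate;
    returns 0 when the event never got onto a counter.
    """
    if time_running == 0:
        return 0
    return raw_count * time_enabled // time_running


def relative_error(measured, reference):
    """Signed error of `measured` relative to `reference`."""
    return (measured - reference) / reference


# First "instructions" line of each run quoted in this mail.
baseline = 10884992991   # run 1: no multiplexing
unpatched = 16342703033  # run 2: multiplexed, before the patch
patched = 10865832804    # run 3: multiplexed, after the patch

if __name__ == "__main__":
    # An event that ran 8 of 10 time units (80%) is scaled up by 10/8.
    print(scale_count(8_000_000_000, 10, 8))
    print(f"unpatched vs baseline: {relative_error(unpatched, baseline):+.1%}")
    print(f"patched vs baseline:   {relative_error(patched, baseline):+.1%}")
```

With these inputs the unpatched run lands roughly 50% above the unmultiplexed baseline, while the patched run agrees with it to within a fraction of a percent, which is the residual error expected from scaling.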