This patch fixes a bug in 2.6.33 x86 event scheduling: all counts
become bogus as soon as events need to be multiplexed because the
PMU is overcommitted.

The code in hw_perf_enable() caused multiplexed events to
accumulate their collected counts twice, producing bogus results.

This is demonstrated on AMD Barcelona with the example below.
In the first run there is no conflict and you obtain the actual
counts. In the second run the PMU is overcommitted, events are
multiplexed, and all events are over-counted. In the third run,
with the patch applied, you obtain the correct counts through
scaling.

Intel processors would be affected in the same way.
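
The scaling referred to above can be sketched as follows. When an event
is multiplexed it only occupies a counter for part of the time it is
enabled, and the "(scaled from NN%)" annotation printed by perf stat is
that fraction. The helper below is a minimal illustration, not kernel or
perf tool code: the name scale_count() and its parameters are
hypothetical, and the inputs are assumed to be a raw count plus the
enabled and running times reported for the event.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the kernel or the perf tool):
 * estimate the count over the whole enabled window from a count that
 * was collected only while the event actually owned a counter.
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
                            uint64_t time_running)
{
        /* The event never ran: there is nothing to extrapolate from. */
        if (time_running == 0)
                return 0;

        /* The event ran the whole time: no multiplexing, no scaling. */
        if (time_running >= time_enabled)
                return raw;

        /*
         * Use floating point because raw * time_enabled can overflow
         * 64 bits for long runs; round to the nearest integer.
         */
        return (uint64_t)((double)raw *
                          ((double)time_enabled / (double)time_running) + 0.5);
}
```

For an event that ran 80% of the time, scale_count(800, 10, 8) yields
1000. The estimate is only meaningful if the raw count is itself
correct, which is why accumulating counts twice before scaling gives
bogus totals.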

        # perf stat -e instructions,cycles ./noploop 10
        noploop for 10 seconds
        
         Performance counter stats for './noploop 10':
        
                10884992991  instructions             #      0.495 IPC  
                21976457932  cycles                  
        
               10.000906311  seconds time elapsed
        
        # perf stat -e instructions,instructions,instructions,instructions,cycles ./noploop 10
        noploop for 10 seconds
        
         Performance counter stats for './noploop 10':
        
                16342703033  instructions             #      1.000 IPC  (scaled from 80.00%)
                16337667144  instructions             #      0.999 IPC  (scaled from 80.00%)
                16342494809  instructions             #      1.000 IPC  (scaled from 80.00%)
                16344432632  instructions             #      1.000 IPC  (scaled from 80.00%)
                16346620711  cycles                    (scaled from 80.00%)
        
               10.015941304  seconds time elapsed
        
        # perf stat -e instructions,instructions,instructions,instructions,cycles ./noploop 10
        noploop for 10 seconds
        
         Performance counter stats for './noploop 10':
        
                10865832804  instructions             #      0.495 IPC  (scaled from 80.00%)
                10866436957  instructions             #      0.495 IPC  (scaled from 80.00%)
                10866172153  instructions             #      0.495 IPC  (scaled from 80.00%)
                10866276672  instructions             #      0.495 IPC  (scaled from 80.00%)
                21944300714  cycles                    (scaled from 80.00%)
        
               10.000686860  seconds time elapsed
        
Signed-off-by: Stephane Eranian <eran...@google.com>
--
 perf_event.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 97cddbf..ef5d63f 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -818,8 +818,6 @@ void hw_perf_enable(void)
                            match_prev_assignment(hwc, cpuc, i))
                                continue;
 
-                       x86_pmu_stop(event);
-
                        hwc->idx = -1;
                }
 
