* Aubrey Li <aubrey.in...@gmail.com> wrote:

> > But what we are really interested in are throughput numbers under 
> > these three kernel variants, right?
> 
> These are sysbench events per second number, higher is better.
> 
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 1/1       508.5( 0.2%)    504.7( 1.1%) -0.8%     509.0( 0.2%)  0.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 2/2      1000.2( 1.4%)   1004.1( 1.6%)  0.4%     997.6( 1.2%) -0.3%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 4/4      1912.1( 1.0%)   1904.2( 1.1%) -0.4%    1914.9( 1.3%)  0.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 8/8      3753.5( 0.3%)   3748.2( 0.3%) -0.1%    3751.3( 0.4%) -0.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 16/16    7139.3( 2.4%)   7137.9( 1.8%) -0.0%    7049.2( 2.4%) -1.3%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 32/32   10899.0( 4.2%)  10780.3( 4.4%) -1.1%    10339.2( 9.6%) -5.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 64/64   15086.1(11.5%)  14262.0( 8.2%) -5.5%    11168.7(22.2%) -26.0%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 128/128 15371.9(22.0%)  14675.8(14.4%) -4.5%    10963.9(18.5%) -28.7%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 256/256 15990.8(22.0%)  12227.9(10.3%) -23.5%   10469.9(19.6%) -34.5%

So because I'm a big fan of presenting data in a readable fashion, here 
are your results, tabulated:

 #
 # Sysbench throughput comparison of 3 different kernels at different 
 # load levels, higher numbers are better:
 #

 
 .--------------------------------------|----------------------------------------------------------------.
 |  NA/AVX     vanilla-SMT    [stddev%] |coresched-SMT   [stddev%]   +/-  |   no-SMT    [stddev%]   +/-  |
 |--------------------------------------|----------------------------------------------------------------|
 |   1/1             508.5    [  0.2% ] |        504.7   [  1.1% ]  -0.8% |    509.0    [  0.2% ]   0.1% |
 |   2/2            1000.2    [  1.4% ] |       1004.1   [  1.6% ]   0.4% |    997.6    [  1.2% ]  -0.3% |
 |   4/4            1912.1    [  1.0% ] |       1904.2   [  1.1% ]  -0.4% |   1914.9    [  1.3% ]   0.1% |
 |   8/8            3753.5    [  0.3% ] |       3748.2   [  0.3% ]  -0.1% |   3751.3    [  0.4% ]  -0.1% |
 |  16/16           7139.3    [  2.4% ] |       7137.9   [  1.8% ]  -0.0% |   7049.2    [  2.4% ]  -1.3% |
 |  32/32          10899.0    [  4.2% ] |      10780.3   [  4.4% ]  -1.1% |  10339.2    [  9.6% ]  -5.1% |
 |  64/64          15086.1    [ 11.5% ] |      14262.0   [  8.2% ]  -5.5% |  11168.7    [ 22.2% ] -26.0% |
 | 128/128         15371.9    [ 22.0% ] |      14675.8   [ 14.4% ]  -4.5% |  10963.9    [ 18.5% ] -28.7% |
 | 256/256         15990.8    [ 22.0% ] |      12227.9   [ 10.3% ] -23.5% |  10469.9    [ 19.6% ] -34.5% |
 '--------------------------------------|----------------------------------------------------------------'

One major thing that sticks out is that if we compare the stddev numbers 
to the +/- comparisons then it's pretty clear that the benchmarks are 
very noisy: for coresched-SMT, in all but the last row stddev is actually 
higher than the measured effect, and for no-SMT the same holds up to the 
32/32 row.
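
To make that stddev-vs-effect comparison concrete, here is a minimal Python
sketch. It recomputes the +/- column from the throughput numbers quoted above
and flags the rows where the change is larger than the reported stddev. The
threshold used (the larger of the two kernels' stddev% values) is only an
assumption for illustration, not a proper significance test, and the names
ROWS, delta_pct and rows_above_noise are hypothetical.

```python
# Each row: (load, vanilla, vanilla_std%, coresched, coresched_std%,
#            nosmt, nosmt_std%), copied from the table above.
ROWS = [
    ("1/1",       508.5,  0.2,   504.7,  1.1,   509.0,  0.2),
    ("2/2",      1000.2,  1.4,  1004.1,  1.6,   997.6,  1.2),
    ("4/4",      1912.1,  1.0,  1904.2,  1.1,  1914.9,  1.3),
    ("8/8",      3753.5,  0.3,  3748.2,  0.3,  3751.3,  0.4),
    ("16/16",    7139.3,  2.4,  7137.9,  1.8,  7049.2,  2.4),
    ("32/32",   10899.0,  4.2, 10780.3,  4.4, 10339.2,  9.6),
    ("64/64",   15086.1, 11.5, 14262.0,  8.2, 11168.7, 22.2),
    ("128/128", 15371.9, 22.0, 14675.8, 14.4, 10963.9, 18.5),
    ("256/256", 15990.8, 22.0, 12227.9, 10.3, 10469.9, 19.6),
]


def delta_pct(base, other):
    """Relative throughput change versus the vanilla kernel, in percent."""
    return (other - base) / base * 100.0


def rows_above_noise(kernel):
    """Return the load levels where |+/-| exceeds the larger stddev%.

    Assumption: "noise" is taken as max(vanilla stddev%, variant stddev%).
    """
    value_idx, std_idx = {"coresched": (3, 4), "nosmt": (5, 6)}[kernel]
    flagged = []
    for row in ROWS:
        load, base, base_std = row[0], row[1], row[2]
        effect = abs(delta_pct(base, row[value_idx]))
        if effect > max(base_std, row[std_idx]):
            flagged.append(load)
    return flagged


if __name__ == "__main__":
    for kernel in ("coresched", "nosmt"):
        print(kernel, rows_above_noise(kernel))
```

Under that assumed threshold, only the 256/256 row stands out for coresched,
while no-SMT stands out from 64/64 upwards.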

So what does 'stddev' mean here, exactly? The stddev of multiple runs, 
i.e. measured run-to-run variance? Or is it some internal metric of the 
benchmark?

Thanks,

        Ingo
