Re: performance changes on d4b4c2cd: 37.6% fsmark.files_per_sec, -15.9% fsmark.files_per_sec, and a few more

2015-03-24 Thread NeilBrown
On Wed, 18 Mar 2015 13:03:19 +0800 Yuanhan Liu wrote:

> Hi,
> 
> FYI, we noticed performance changes on `fsmark.files_per_sec' by 
> d4b4c2cdffab86f5c7594c44635286a6d277d5c6:
> 
> > commit d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> > Author: s...@kernel.org 
> > AuthorDate: Mon Dec 15 12:57:03 2014 +1100
> > Commit: NeilBrown 
> > CommitDate: Wed Mar 4 13:40:17 2015 +1100
> > 
> > RAID5: batch adjacent full stripe write

Thanks a lot for this one too!
Generally positive, with the only regressions on NoSync tests.  Maybe the
same cause?

Again, 
>          7 ±  5%     +37.6%         10 ±  6%  fsmark.time.percent_of_cpu_this_job_got
and
>          9 ±  0%     -14.8%          7 ±  6%  fsmark.time.percent_of_cpu_this_job_got

are a bit confusing - really less than 10% of a CPU ??

Thanks,
NeilBrown


> 
> c1dfe87e41d9c2926fe92f803f02c733ddbccf0b   d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> ----------------------------------------   ----------------------------------------
> run  time(m)  metric_value  ±stddev   run  time(m)  metric_value  ±stddev   change  testbox/benchmark/sub-testcase
> ---  -------  ------------  -------   ---  -------  ------------  -------   ------  ------------------------------
>   4     15.3        33.525    ±3.0%     6     11.1        46.133    ±5.0%    37.6%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
>   3      0.5       262.800    ±1.5%     3      0.4       307.367    ±1.2%    17.0%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
>   3      0.5       289.900    ±0.3%     3      0.4       323.367    ±2.4%    11.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
>   3      0.5       325.667    ±2.2%     3      0.5       358.800    ±1.8%    10.2%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
>   3      0.6       216.100    ±0.4%     3      0.6       230.100    ±0.4%     6.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
>   3      0.5       309.900    ±0.3%     3      0.5       328.500    ±1.1%     6.0%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-NoSync
> 
>   3     13.8        37.000    ±0.2%     3     16.5        31.100    ±0.3%   -15.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
> 
> NOTE: here is some more info about those test parameters, to help you
>   understand the testcase better:
> 
>   1x : 'x' means iterations (loops), corresponding to the '-L' option of fsmark
>   64t: 't' means the number of threads
>   4M : the size of a single file, corresponding to the '-s' option of fsmark
>   120G, 30G: the total test size
> 
>   4BRD_12G: BRD is the ramdisk driver; '4' means 4 ramdisks and '12G' is the
>     size of each ramdisk, so 48G in total.  The RAID array is built on
>     those ramdisks.
> 
> 
> And FYI, here I have listed the more detailed changes for the largest positive
> and negative changes.
> 
> 
> more detailed changes about ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> ----------------------------------------------------------------------------
> 
> c1dfe87e41d9c292            d4b4c2cdffab86f5c7594c4463
> ----------------            --------------------------
>          %stddev    %change          %stddev
>              \          |                \
>      33.53 ±  3%     +37.6%      46.13 ±  4%  fsmark.files_per_sec
>        916 ±  3%     -27.2%        667 ±  5%  fsmark.time.elapsed_time.max
>        916 ±  3%     -27.2%        667 ±  5%  fsmark.time.elapsed_time
>          7 ±  5%     +37.6%         10 ±  6%  fsmark.time.percent_of_cpu_this_job_got
>      92097 ±  2%     -23.1%      70865 ±  4%  fsmark.time.voluntary_context_switches
>       0.04 ± 42%    +681.0%       0.27 ± 22%  turbostat.Pkg%pc3
>     716062 ±  3%     -82.7%     124210 ± 21%  cpuidle.C1-IVT.usage
>  6.883e+08 ±  2%     -86.8%   91146705 ± 34%  cpuidle.C1-IVT.time
>       0.04 ± 30%    +145.8%       0.10 ± 25%  turbostat.CPU%c3
>        404 ± 16%     -58.4%        168 ± 14%  cpuidle.POLL.usage
>        159 ± 47%    +179.5%        444 ± 23%  proc-vmstat.kswapd_low_wmark_hit_quickly
>      11133 ± 23%    +100.3%      22298 ± 30%  cpuidle.C3-IVT.usage
>   10286681 ± 27%     +95.6%   20116924 ± 27%  cpuidle.C3-IVT.time
>       7.92 ± 16%     +77.4%      14.05 ±  6%  turbostat.Pkg%pc6
>       4.93 ±  3%     -38.6%       3.03 ±  2%  turbostat.CPU%c1
>        916 ±  3%     -27.2%        667 ±  5%  time.elapsed_time.max
>        916 ±  3%     -27.2%        667 ±  5%  time.elapsed_time
>    2137390 ±  3%     -26.7%    1566752 ±  5%  proc-vmstat.pgfault
>          7 ±  5%     +37.6%         10 ±  6%  time.percent_of_cpu_this_job_got
>  4.309e+10 ±  3%     -26.3%  3.176e+10 ±  5%  cpuidle.C6-IVT.time
>      49038 ±  2%     -23.9%      37334 ±  4%  uptime.idle
>       1047 ±  2%     -23.8%        797 ±  4%  uptime.boot
>      92097 ±  2%     -23.1%      70865 ±  4%  time.voluntary_context_switches

performance changes on d4b4c2cd: 37.6% fsmark.files_per_sec, -15.9% fsmark.files_per_sec, and a few more

2015-03-17 Thread Yuanhan Liu
Hi,

FYI, we noticed performance changes on `fsmark.files_per_sec' by 
d4b4c2cdffab86f5c7594c44635286a6d277d5c6:

> commit d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> Author: s...@kernel.org 
> AuthorDate: Mon Dec 15 12:57:03 2014 +1100
> Commit: NeilBrown 
> CommitDate: Wed Mar 4 13:40:17 2015 +1100
> 
> RAID5: batch adjacent full stripe write

c1dfe87e41d9c2926fe92f803f02c733ddbccf0b   d4b4c2cdffab86f5c7594c44635286a6d277d5c6
----------------------------------------   ----------------------------------------
run  time(m)  metric_value  ±stddev   run  time(m)  metric_value  ±stddev   change  testbox/benchmark/sub-testcase
---  -------  ------------  -------   ---  -------  ------------  -------   ------  ------------------------------
  4     15.3        33.525    ±3.0%     6     11.1        46.133    ±5.0%    37.6%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
  3      0.5       262.800    ±1.5%     3      0.4       307.367    ±1.2%    17.0%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
  3      0.5       289.900    ±0.3%     3      0.4       323.367    ±2.4%    11.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
  3      0.5       325.667    ±2.2%     3      0.5       358.800    ±1.8%    10.2%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
  3      0.6       216.100    ±0.4%     3      0.6       230.100    ±0.4%     6.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
  3      0.5       309.900    ±0.3%     3      0.5       328.500    ±1.1%     6.0%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-NoSync

  3     13.8        37.000    ±0.2%     3     16.5        31.100    ±0.3%   -15.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
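
[Editor's note] The "change" column above is simply the relative difference
between the two metric_value columns.  A minimal check in Python, using the
numbers from the first and last rows of the table:

    old, new = 33.525, 46.133
    print("%+.1f%%" % ((new - old) / old * 100))   # -> +37.6%

    old, new = 37.000, 31.100
    print("%+.1f%%" % ((new - old) / old * 100))   # -> -15.9%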

NOTE: here is some more info about those test parameters, to help you
  understand the testcase better (a small sketch that maps these fields onto
  an fs_mark command line follows below):

  1x : 'x' means iterations (loops), corresponding to the '-L' option of fsmark
  64t: 't' means the number of threads
  4M : the size of a single file, corresponding to the '-s' option of fsmark
  120G, 30G: the total test size

  4BRD_12G: BRD is the ramdisk driver; '4' means 4 ramdisks and '12G' is the
    size of each ramdisk, so 48G in total.  The RAID array is built on
    those ramdisks.
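
[Editor's note] Here is the sketch referred to above: an illustrative mapping
from a sub-testcase name to the corresponding fs_mark invocation, not part of
the LKP tooling itself.  The -L/-t/-s/-n/-S flags are standard fs_mark
options; the mount point "/fs/md0", the even per-thread file split, and the
sync-method codes (0 = NoSync, 1 = fsyncBeforeClose) are assumptions made for
the example.

    #!/usr/bin/env python3
    # Illustrative sketch: derive an fs_mark command line from an LKP fsmark
    # sub-testcase name such as "1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync".

    SIZE = {"K": 2 ** 10, "M": 2 ** 20, "G": 2 ** 30}

    def parse_size(text):
        # "4M" -> 4194304, "30G" -> 32212254720
        return int(text[:-1]) * SIZE[text[-1]]

    def fsmark_cmdline(testcase, mountpoint="/fs/md0"):
        loops, threads, _disks, _raid, _fs, fsize, total, sync = testcase.split("-")
        loops = int(loops.rstrip("x"))             # "1x"  -> -L 1
        threads = int(threads.rstrip("t"))         # "64t" -> -t 64
        file_size = parse_size(fsize)              # "4M"  -> -s 4194304
        files_per_thread = parse_size(total) // file_size // threads
        sync_method = {"NoSync": 0, "fsyncBeforeClose": 1}[sync]  # assumed mapping
        return ("fs_mark -d %s -L %d -t %d -s %d -n %d -S %d"
                % (mountpoint, loops, threads, file_size, files_per_thread, sync_method))

    if __name__ == "__main__":
        # prints: fs_mark -d /fs/md0 -L 1 -t 64 -s 4194304 -n 120 -S 0
        print(fsmark_cmdline("1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync"))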


And FYI, here I have listed the more detailed changes for the largest positive
and negative changes.


more detailed changes about ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
----------------------------------------------------------------------------

c1dfe87e41d9c292            d4b4c2cdffab86f5c7594c4463
----------------            --------------------------
         %stddev    %change          %stddev
             \          |                \
     33.53 ±  3%     +37.6%      46.13 ±  4%  fsmark.files_per_sec
       916 ±  3%     -27.2%        667 ±  5%  fsmark.time.elapsed_time.max
       916 ±  3%     -27.2%        667 ±  5%  fsmark.time.elapsed_time
         7 ±  5%     +37.6%         10 ±  6%  fsmark.time.percent_of_cpu_this_job_got
     92097 ±  2%     -23.1%      70865 ±  4%  fsmark.time.voluntary_context_switches
      0.04 ± 42%    +681.0%       0.27 ± 22%  turbostat.Pkg%pc3
    716062 ±  3%     -82.7%     124210 ± 21%  cpuidle.C1-IVT.usage
 6.883e+08 ±  2%     -86.8%   91146705 ± 34%  cpuidle.C1-IVT.time
      0.04 ± 30%    +145.8%       0.10 ± 25%  turbostat.CPU%c3
       404 ± 16%     -58.4%        168 ± 14%  cpuidle.POLL.usage
       159 ± 47%    +179.5%        444 ± 23%  proc-vmstat.kswapd_low_wmark_hit_quickly
     11133 ± 23%    +100.3%      22298 ± 30%  cpuidle.C3-IVT.usage
  10286681 ± 27%     +95.6%   20116924 ± 27%  cpuidle.C3-IVT.time
      7.92 ± 16%     +77.4%      14.05 ±  6%  turbostat.Pkg%pc6
      4.93 ±  3%     -38.6%       3.03 ±  2%  turbostat.CPU%c1
       916 ±  3%     -27.2%        667 ±  5%  time.elapsed_time.max
       916 ±  3%     -27.2%        667 ±  5%  time.elapsed_time
   2137390 ±  3%     -26.7%    1566752 ±  5%  proc-vmstat.pgfault
         7 ±  5%     +37.6%         10 ±  6%  time.percent_of_cpu_this_job_got
 4.309e+10 ±  3%     -26.3%  3.176e+10 ±  5%  cpuidle.C6-IVT.time
     49038 ±  2%     -23.9%      37334 ±  4%  uptime.idle
      1047 ±  2%     -23.8%        797 ±  4%  uptime.boot
     92097 ±  2%     -23.1%      70865 ±  4%  time.voluntary_context_switches
   4005888 ±  0%     +13.3%    4537685 ± 11%  meminfo.DirectMap2M
      3917 ±  2%     -16.3%       3278 ±  5%  proc-vmstat.pageoutrun
    213737 ±  1%     -13.9%     183969 ±  3%  softirqs.SCHED
     46.86 ±  1%     +16.5%      54.59 ±  1%  turbostat.Pkg%pc2
     32603 ±  3%     -11.7%      28781 ±  5%  numa-vmstat.node1.nr_unevictable
    130415 ±  3%     -11.7%     115127 ±  5%  numa-meminfo.node1.Unevictable
    256781 ±  2%      -8.8%     234146 ±  3%  softirqs.TASKLET
    253606 ±  2%      -8.9%     231108 ±  3%  softirqs.BLOCK
    119.10 ±  2%
