performance changes on d4b4c2cd: 37.6% fsmark.files_per_sec, -15.9% fsmark.files_per_sec, and a few more

2015-03-17 Thread Yuanhan Liu
Hi,

FYI, we noticed performance changes on `fsmark.files_per_sec' by 
d4b4c2cdffab86f5c7594c44635286a6d277d5c6:

> commit d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> Author: s...@kernel.org 
> AuthorDate: Mon Dec 15 12:57:03 2014 +1100
> Commit: NeilBrown 
> CommitDate: Wed Mar 4 13:40:17 2015 +1100
> 
> RAID5: batch adjacent full stripe write

c1dfe87e41d9c2926fe92f803f02c733ddbccf0b  d4b4c2cdffab86f5c7594c44635286a6d277d5c6
----------------------------------------  ----------------------------------------

run  time(m)  metric_value  ±stddev    run  time(m)  metric_value  ±stddev    change  testbox/benchmark/sub-testcase
---  -------  ------------  -------    ---  -------  ------------  -------    ------  ------------------------------
4    15.3       33.525      ±3.0%      6    11.1       46.133      ±5.0%       37.6%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
3     0.5      262.800      ±1.5%      3     0.4      307.367      ±1.2%       17.0%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
3     0.5      289.900      ±0.3%      3     0.4      323.367      ±2.4%       11.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
3     0.5      325.667      ±2.2%      3     0.5      358.800      ±1.8%       10.2%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
3     0.6      216.100      ±0.4%      3     0.6      230.100      ±0.4%        6.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
3     0.5      309.900      ±0.3%      3     0.5      328.500      ±1.1%        6.0%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-NoSync

3    13.8       37.000      ±0.2%      3    16.5       31.100      ±0.3%      -15.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
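
For reference, the "change" column is simply the relative difference between
the two metric_value columns. A quick sketch of the arithmetic in Python,
using the first row of the table:

    # Relative change of fsmark.files_per_sec, parent commit vs. this commit.
    old, new = 33.525, 46.133             # metric_value columns, first row
    change_pct = (new - old) / old * 100
    print(f"{change_pct:+.1f}%")          # +37.6%, matching the table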

NOTE: here is some more information about the test parameters, to help you
      understand the testcase names better (a decoding sketch follows below):

      1x : 'x' means iterations or loops, corresponding to the '-L' option of fsmark
      64t: 't' means threads, corresponding to the '-t' option of fsmark
      4M : the size of a single file, corresponding to the '-s' option of fsmark
      120G, 30G: the total test size

      4BRD_12G: BRD is the ramdisk driver; '4' means 4 ramdisks, and '12G' is
                the size of each ramdisk, so 48G in total. The RAID5 array is
                built on those ramdisks.
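
To make the naming scheme concrete, here is a small Python sketch that
decodes a sub-testcase name into the parameters just described. The field
order is read off the names in the table above; the derived file count
(total size / file size) is implied by the parameters rather than printed
anywhere in this report:

    # Decode e.g. "1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync" into its parameters.
    def decode(subcase):
        loops, threads, disks, level, fs, fsize, total, sync = subcase.split("-")
        file_bytes  = int(fsize.rstrip("M")) << 20   # '-s' option of fsmark
        total_bytes = int(total.rstrip("G")) << 30   # total test size
        return {
            "loops":   int(loops.rstrip("x")),       # '-L' option of fsmark
            "threads": int(threads.rstrip("t")),     # '-t' option of fsmark
            "disks": disks, "raid": level, "fs": fs,
            "files": total_bytes // file_bytes,      # e.g. 120G / 4M = 30720
            "sync":  sync,                           # NoSync / fsyncBeforeClose
        }

    print(decode("1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync")["files"])   # 30720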


And FYI, below I list the detailed changes for the maximal positive and
negative changes.


more detailed changes about ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
----------------------------------------------------------------------------

c1dfe87e41d9c292  d4b4c2cdffab86f5c7594c4463
----------------  --------------------------
         %stddev      %change          %stddev
             \            |                \
     33.53 ±  3%      +37.6%       46.13 ±  4%  fsmark.files_per_sec
       916 ±  3%      -27.2%         667 ±  5%  fsmark.time.elapsed_time.max
       916 ±  3%      -27.2%         667 ±  5%  fsmark.time.elapsed_time
         7 ±  5%      +37.6%          10 ±  6%  fsmark.time.percent_of_cpu_this_job_got
     92097 ±  2%      -23.1%       70865 ±  4%  fsmark.time.voluntary_context_switches
      0.04 ± 42%     +681.0%        0.27 ± 22%  turbostat.Pkg%pc3
    716062 ±  3%      -82.7%      124210 ± 21%  cpuidle.C1-IVT.usage
 6.883e+08 ±  2%      -86.8%    91146705 ± 34%  cpuidle.C1-IVT.time
      0.04 ± 30%     +145.8%        0.10 ± 25%  turbostat.CPU%c3
       404 ± 16%      -58.4%         168 ± 14%  cpuidle.POLL.usage
       159 ± 47%     +179.5%         444 ± 23%  proc-vmstat.kswapd_low_wmark_hit_quickly
     11133 ± 23%     +100.3%       22298 ± 30%  cpuidle.C3-IVT.usage
  10286681 ± 27%      +95.6%    20116924 ± 27%  cpuidle.C3-IVT.time
      7.92 ± 16%      +77.4%       14.05 ±  6%  turbostat.Pkg%pc6
      4.93 ±  3%      -38.6%        3.03 ±  2%  turbostat.CPU%c1
       916 ±  3%      -27.2%         667 ±  5%  time.elapsed_time.max
       916 ±  3%      -27.2%         667 ±  5%  time.elapsed_time
   2137390 ±  3%      -26.7%     1566752 ±  5%  proc-vmstat.pgfault
         7 ±  5%      +37.6%          10 ±  6%  time.percent_of_cpu_this_job_got
 4.309e+10 ±  3%      -26.3%   3.176e+10 ±  5%  cpuidle.C6-IVT.time
     49038 ±  2%      -23.9%       37334 ±  4%  uptime.idle
      1047 ±  2%      -23.8%         797 ±  4%  uptime.boot
     92097 ±  2%      -23.1%       70865 ±  4%  time.voluntary_context_switches
   4005888 ±  0%      +13.3%     4537685 ± 11%  meminfo.DirectMap2M
      3917 ±  2%      -16.3%        3278 ±  5%  proc-vmstat.pageoutrun
    213737 ±  1%      -13.9%      183969 ±  3%  softirqs.SCHED
     46.86 ±  1%      +16.5%       54.59 ±  1%  turbostat.Pkg%pc2
     32603 ±  3%      -11.7%       28781 ±  5%  numa-vmstat.node1.nr_unevictable
    130415 ±  3%      -11.7%      115127 ±  5%  numa-meminfo.node1.Unevictable
    256781 ±  2%       -8.8%      234146 ±  3%  softirqs.TASKLET
    253606 ±  2%       -8.9%      231108 ±  3%  softirqs.BLOCK
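
As a consistency check, the throughput and elapsed-time rows above agree with
the derived file count: 120G of 4M files is 30720 files, and 30720 files at
the measured files_per_sec reproduces the measured elapsed_time:

    # elapsed_time should be roughly (number of files) / files_per_sec.
    files = (120 << 30) // (4 << 20)      # 30720 files
    for fps, reported in [(33.53, 916), (46.13, 667)]:
        print(f"{files / fps:.0f}s (reported: {reported}s)")
    # -> 916s (reported: 916s), 666s (reported: 667s): consistent within rounding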

performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and a few more

2015-03-17 Thread Yuanhan Liu
Hi,

FYI, we noticed performance changes on `fsmark.files_per_sec' by 
4400755e356f9a2b0b7ceaa02f57b1c7546c3765:

> commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> Author: NeilBrown 
> AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> Commit: NeilBrown 
> CommitDate: Wed Mar 4 13:40:19 2015 +1100
> 
> md/raid5: allow the stripe_cache to grow and shrink.

26089f4902595a2f64c512066af07af6e82eb096  4400755e356f9a2b0b7ceaa02f57b1c7546c3765
----------------------------------------  ----------------------------------------

run  time(m)  metric_value  ±stddev    run  time(m)  metric_value  ±stddev    change  testbox/benchmark/sub-testcase
---  -------  ------------  -------    ---  -------  ------------  -------    ------  ------------------------------
3    18.6       6.400       ±0.0%      5     9.2      19.200       ±0.0%      200.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
3    24.7       6.400       ±0.0%      3    13.7      12.800       ±0.0%      100.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
3    17.5      28.267       ±9.6%      3    12.3      42.833       ±6.5%       51.5%  ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
3    16.7      30.700       ±1.5%      3    12.6      40.733       ±2.4%       32.7%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
3    29.0       5.867       ±0.8%      5    23.6       7.240       ±0.7%       23.4%  ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
3    28.5       6.000       ±0.0%      3    23.2       7.367       ±0.6%       22.8%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
5    11.7      14.600       ±0.0%      5     9.7      17.500       ±0.4%       19.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
3    22.4      25.600       ±0.0%      5    17.9      30.120       ±4.1%       17.7%  ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
5    10.8      47.320       ±0.6%      5     9.3      54.820       ±0.2%       15.8%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
1     0.5     252.400       ±0.0%      1     0.5     263.300       ±0.0%        4.3%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync

3     0.5     273.100       ±4.3%      3     0.6     223.567       ±6.5%      -18.1%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
3     8.1      63.133       ±0.5%      3     9.2      55.633       ±0.2%      -11.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
3     8.2      64.000       ±0.0%      3     9.2      57.600       ±0.0%      -10.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync


NOTE: here is some more information about the test parameters, to help you
      understand the testcase names better:

      1x: 'x' means iterations or loops, corresponding to the '-L' option of fsmark
      1t, 64t: 't' means threads, corresponding to the '-t' option of fsmark
      4M: the size of a single file, corresponding to the '-s' option of fsmark
      40G, 30G, 120G: the total test size

      4BRD_12G: BRD is the ramdisk driver; '4' means 4 ramdisks, and '12G' is
                the size of each ramdisk, so 48G in total. The RAID5 array is
                built on those ramdisks.
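
The NoSync / fsyncBeforeClose suffix selects fsmark's sync policy. As a rough
illustration only -- the exact command line used by the harness is not shown
in this report, so the '-n' value, target directory, and '-S' mapping below
are assumptions based on fs_mark's usage text:

    # Hypothetical reconstruction of an fs_mark invocation for one sub-testcase.
    def fs_mark_argv(threads, file_mb, total_gb, fsync):
        # files per thread, derived from the total test size (assumption)
        files = (total_gb << 30) // (file_mb << 20) // threads
        return ["fs_mark",
                "-L", "1",                    # 1x  -> one loop
                "-t", str(threads),           # 64t -> 64 threads
                "-s", str(file_mb << 20),     # 4M  -> file size in bytes
                "-n", str(files),             # files per thread
                "-S", "1" if fsync else "0",  # 0: NoSync, 1: fsyncBeforeClose
                "-d", "/mnt/md0"]             # hypothetical mount point

    print(" ".join(fs_mark_argv(64, 4, 40, True)))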


As you can see from the above data, interestingly, all of the performance
regressions come from the btrfs tests. That's why Chris is also on the Cc
list, just FYI.


FYI, below I list the detailed changes for the maximal positive and negative
changes.

more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
----------------------------------------------------------------------------------------

26089f4902595a2f  4400755e356f9a2b0b7ceaa02f
----------------  --------------------------
         %stddev      %change          %stddev
             \            |                \
      6.40 ±  0%     +200.0%       19.20 ±  0%  fsmark.files_per_sec
 1.015e+08 ±  1%      -73.6%    26767355 ±  3%  fsmark.time.voluntary_context_switches
     13793 ±  1%      -73.9%        3603 ±  5%  fsmark.time.system_time
     78473 ±  6%      -64.3%       28016 ±  7%  fsmark.time.involuntary_context_switches
  15789555 ±  9%      -54.7%     7159485 ± 13%  fsmark.app_overhead
      1115 ±  0%      -50.3%         554 ±  1%  fsmark.time.elapsed_time.max
      1115 ±  0%      -50.3%         554 ±  1%  fsmark.time.elapsed_time
      1235 ±  2%      -47.5%         649 ±  3%  fsmark.time.percent_of_cpu_this_job_got
    456465 ±  1%      -26.7%      334594 ±  4%  fsmark.time.minor_page_faults
       275 ±  0%    +1257.7%        3733 ±  2%  slabinfo.raid5-md0.num_objs
       275 ±  0%    +1257.7%        3733 ±  2%  slabinfo.raid5-md0.active_objs
        11 ±  0%    +1250.9%         148 ±  2%  slabinfo.raid5-md0.active_slabs
        11 ±  0%    +1250.9%         148 ±  2%  slabinfo.raid5-md0.num_slabs
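
The slabinfo.raid5-md0 rows are direct evidence of the patch at work: with
4400755e the stripe cache is allowed to grow on demand, so the number of
allocated stripe heads jumps by an order of magnitude. On a live system the
same effect can be watched through the md sysfs files; a minimal sketch,
assuming the array is md0:

    # Poll the raid5 stripe cache while the workload runs. Growth shows up
    # both in slabinfo (raid5-md0 objects, as in the table above) and here.
    import time

    def read_int(path):
        with open(path) as f:
            return int(f.read())

    for _ in range(10):
        size   = read_int("/sys/block/md0/md/stripe_cache_size")    # cache size
        active = read_int("/sys/block/md0/md/stripe_cache_active")  # stripes in use
        print(f"stripe_cache_size={size} stripe_cache_active={active}")
        time.sleep(1)

Both numbers rising together during the heavy fsyncBeforeClose runs would
match the slabinfo growth reported above.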