performance changes on d4b4c2cd: 37.6% fsmark.files_per_sec, -15.9% fsmark.files_per_sec, and a few more
Hi,

FYI, we noticed performance changes on `fsmark.files_per_sec' by d4b4c2cdffab86f5c7594c44635286a6d277d5c6:

> commit d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> Author:     s...@kernel.org
> AuthorDate: Mon Dec 15 12:57:03 2014 +1100
> Commit:     NeilBrown <ne...@suse.de>
> CommitDate: Wed Mar 4 13:40:17 2015 +1100
>
>     RAID5: batch adjacent full stripe write

c1dfe87e41d9c2926fe92f803f02c733ddbccf0b  d4b4c2cdffab86f5c7594c44635286a6d277d5c6
----------------------------------------  ----------------------------------------
run  time(m)  metric_value  ±stddev       run  time(m)  metric_value  ±stddev       change  testbox/benchmark/sub-testcase
---  -------  ------------  -------       ---  -------  ------------  -------       ------  ------------------------------
  4     15.3        33.525    ±3.0%         6     11.1        46.133    ±5.0%        37.6%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
  3      0.5       262.800    ±1.5%         3      0.4       307.367    ±1.2%        17.0%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
  3      0.5       289.900    ±0.3%         3      0.4       323.367    ±2.4%        11.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
  3      0.5       325.667    ±2.2%         3      0.5       358.800    ±1.8%        10.2%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
  3      0.6       216.100    ±0.4%         3      0.6       230.100    ±0.4%         6.5%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
  3      0.5       309.900    ±0.3%         3      0.5       328.500    ±1.1%         6.0%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-NoSync
  3     13.8        37.000    ±0.2%         3     16.5        31.100    ±0.3%       -15.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync

NOTE: here is some more info about the test parameters, to help you understand the testcase better (a small decoding sketch follows the list):

1x        : 'x' means iterations or loops, corresponding to the '-L' option of fsmark
64t       : 't' means threads
4M        : the size of a single file, corresponding to the '-s' option of fsmark
120G, 30G : the total test size
4BRD_12G  : BRD is the ramdisk; '4' means 4 ramdisks, and '12G' means the size of one ramdisk, so 48G in total. We made a RAID on those ramdisks.
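To make the naming scheme concrete, here is a minimal Python sketch that splits a sub-testcase name into the fields described above. Only the '-L' and '-s' mappings are confirmed by the NOTE; the field order and the remaining names are our reading of the sub-testcase strings in the table, not taken from the LKP job definitions.

    UNITS = {"M": 1 << 20, "G": 1 << 30}

    def decode_subtest(name):
        """Split e.g. '1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync' into its fields."""
        loops, threads, disks, raid, fs, fsize, total, sync = name.split("-")
        return {
            "loops":            int(loops.rstrip("x")),               # fsmark '-L'
            "threads":          int(threads.rstrip("t")),
            "file_size_bytes":  int(fsize[:-1]) * UNITS[fsize[-1]],   # fsmark '-s'
            "total_size_bytes": int(total[:-1]) * UNITS[total[-1]],
            "storage":          disks,      # e.g. 3HDD or 4BRD_12G
            "raid_level":       raid,
            "filesystem":       fs,
            "sync_mode":        sync,
        }

    if __name__ == "__main__":
        info = decode_subtest("1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync")
        print(info)
        # 120G total / 4M per file => 30720 files written per loop
        print("files per loop:", info["total_size_bytes"] // info["file_size_bytes"])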
And FYI, here are more detailed changes for the maximal positive and negative changes.

more detailed changes about ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync

c1dfe87e41d9c292  d4b4c2cdffab86f5c7594c4463
----------------  --------------------------
         %stddev     %change         %stddev
             \          |                \
     33.53 ±  3%      +37.6%      46.13 ±  4%  fsmark.files_per_sec
       916 ±  3%      -27.2%        667 ±  5%  fsmark.time.elapsed_time.max
       916 ±  3%      -27.2%        667 ±  5%  fsmark.time.elapsed_time
         7 ±  5%      +37.6%         10 ±  6%  fsmark.time.percent_of_cpu_this_job_got
     92097 ±  2%      -23.1%      70865 ±  4%  fsmark.time.voluntary_context_switches
      0.04 ± 42%     +681.0%       0.27 ± 22%  turbostat.Pkg%pc3
    716062 ±  3%      -82.7%     124210 ± 21%  cpuidle.C1-IVT.usage
 6.883e+08 ±  2%      -86.8%   91146705 ± 34%  cpuidle.C1-IVT.time
      0.04 ± 30%     +145.8%       0.10 ± 25%  turbostat.CPU%c3
       404 ± 16%      -58.4%        168 ± 14%  cpuidle.POLL.usage
       159 ± 47%     +179.5%        444 ± 23%  proc-vmstat.kswapd_low_wmark_hit_quickly
     11133 ± 23%     +100.3%      22298 ± 30%  cpuidle.C3-IVT.usage
  10286681 ± 27%      +95.6%   20116924 ± 27%  cpuidle.C3-IVT.time
      7.92 ± 16%      +77.4%      14.05 ±  6%  turbostat.Pkg%pc6
      4.93 ±  3%      -38.6%       3.03 ±  2%  turbostat.CPU%c1
       916 ±  3%      -27.2%        667 ±  5%  time.elapsed_time.max
       916 ±  3%      -27.2%        667 ±  5%  time.elapsed_time
   2137390 ±  3%      -26.7%    1566752 ±  5%  proc-vmstat.pgfault
         7 ±  5%      +37.6%         10 ±  6%  time.percent_of_cpu_this_job_got
 4.309e+10 ±  3%      -26.3%  3.176e+10 ±  5%  cpuidle.C6-IVT.time
     49038 ±  2%      -23.9%      37334 ±  4%  uptime.idle
      1047 ±  2%      -23.8%        797 ±  4%  uptime.boot
     92097 ±  2%      -23.1%      70865 ±  4%  time.voluntary_context_switches
   4005888 ±  0%      +13.3%    4537685 ± 11%  meminfo.DirectMap2M
      3917 ±  2%      -16.3%       3278 ±  5%  proc-vmstat.pageoutrun
    213737 ±  1%      -13.9%     183969 ±  3%  softirqs.SCHED
     46.86 ±  1%      +16.5%      54.59 ±  1%  turbostat.Pkg%pc2
     32603 ±  3%      -11.7%      28781 ±  5%  numa-vmstat.node1.nr_unevictable
    130415 ±  3%      -11.7%     115127 ±  5%  numa-meminfo.node1.Unevictable
    256781 ±  2%       -8.8%     234146 ±  3%  softirqs.TASKLET
    253606 ±  2%       -8.9%     231108 ±  3%  softirqs.BLOCK
    119.10 ±  2%
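A quick way to sanity-check the comparison columns: %change is just (new - old) / old, and because each run writes a fixed amount of data, the elapsed-time ratio should be close to the reciprocal of the throughput ratio. A small Python check using values copied from the tables above:

    old_fps, new_fps = 33.525, 46.133   # fsmark.files_per_sec, parent vs. patched
    old_sec, new_sec = 916, 667         # fsmark.time.elapsed_time

    print((new_fps - old_fps) / old_fps * 100)   # ~ +37.6, matching the table
    print((new_sec - old_sec) / old_sec * 100)   # ~ -27.2, matching the table
    print((1 / (1 + 0.376) - 1) * 100)           # ~ -27.3: reciprocal of +37.6%

So the throughput gain and the elapsed-time drop are two views of the same change, which is a useful consistency check when reading these reports.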
performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and a few more
Hi,

FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:

> commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> Author:     NeilBrown <ne...@suse.de>
> AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> Commit:     NeilBrown <ne...@suse.de>
> CommitDate: Wed Mar 4 13:40:19 2015 +1100
>
>     md/raid5: allow the stripe_cache to grow and shrink.

26089f4902595a2f64c512066af07af6e82eb096  4400755e356f9a2b0b7ceaa02f57b1c7546c3765
----------------------------------------  ----------------------------------------
run  time(m)  metric_value  ±stddev       run  time(m)  metric_value  ±stddev       change  testbox/benchmark/sub-testcase
---  -------  ------------  -------       ---  -------  ------------  -------       ------  ------------------------------
  3     18.6         6.400    ±0.0%         5      9.2        19.200    ±0.0%       200.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
  3     24.7         6.400    ±0.0%         3     13.7        12.800    ±0.0%       100.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
  3     17.5        28.267    ±9.6%         3     12.3        42.833    ±6.5%        51.5%  ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
  3     16.7        30.700    ±1.5%         3     12.6        40.733    ±2.4%        32.7%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
  3     29.0         5.867    ±0.8%         5     23.6         7.240    ±0.7%        23.4%  ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
  3     28.5         6.000    ±0.0%         3     23.2         7.367    ±0.6%        22.8%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
  5     11.7        14.600    ±0.0%         5      9.7        17.500    ±0.4%        19.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
  3     22.4        25.600    ±0.0%         5     17.9        30.120    ±4.1%        17.7%  ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
  5     10.8        47.320    ±0.6%         5      9.3        54.820    ±0.2%        15.8%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
  1      0.5       252.400    ±0.0%         1      0.5       263.300    ±0.0%         4.3%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
  3      0.5       273.100    ±4.3%         3      0.6       223.567    ±6.5%       -18.1%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
  3      8.1        63.133    ±0.5%         3      9.2        55.633    ±0.2%       -11.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
  3      8.2        64.000    ±0.0%         3      9.2        57.600    ±0.0%       -10.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync

NOTE: here is some more info about the test parameters, to help you understand the testcase better:

1x             : 'x' means iterations or loops, corresponding to the '-L' option of fsmark
1t, 64t        : 't' means threads
4M             : the size of a single file, corresponding to the '-s' option of fsmark
40G, 30G, 120G : the total test size
4BRD_12G       : BRD is the ramdisk; '4' means 4 ramdisks, and '12G' means the size of one ramdisk, so 48G in total. We made a RAID on those ramdisks.

As you can see from the above data, interestingly, all performance regressions come from the btrfs tests. That's why Chris is also on the cc list, just FYI.

FYI, here are more detailed changes for the maximal positive and negative changes.

more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose

26089f4902595a2f  4400755e356f9a2b0b7ceaa02f
----------------  --------------------------
         %stddev     %change         %stddev
             \          |                \
      6.40 ±  0%     +200.0%      19.20 ±  0%  fsmark.files_per_sec
 1.015e+08 ±  1%      -73.6%   26767355 ±  3%  fsmark.time.voluntary_context_switches
     13793 ±  1%      -73.9%       3603 ±  5%  fsmark.time.system_time
     78473 ±  6%      -64.3%      28016 ±  7%  fsmark.time.involuntary_context_switches
  15789555 ±  9%      -54.7%    7159485 ± 13%  fsmark.app_overhead
      1115 ±  0%      -50.3%        554 ±  1%  fsmark.time.elapsed_time.max
      1115 ±  0%      -50.3%        554 ±  1%  fsmark.time.elapsed_time
      1235 ±  2%      -47.5%        649 ±  3%  fsmark.time.percent_of_cpu_this_job_got
    456465 ±  1%      -26.7%     334594 ±  4%  fsmark.time.minor_page_faults
       275 ±  0%    +1257.7%       3733 ±  2%  slabinfo.raid5-md0.num_objs
       275 ±  0%    +1257.7%       3733 ±  2%  slabinfo.raid5-md0.active_objs
        11 ±  0%    +1250.9%        148 ±  2%  slabinfo.raid5-md0.active_slabs
        11 ±  0%    +1250.9%        148 ±  2%  slabinfo.raid5-md0.num_slabs
      2407 ±  4%     +293.4%       9471 ± 26%
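The slabinfo.raid5-md0 rows are the most direct signature of this commit: with the stripe cache allowed to grow under write load and shrink again under memory pressure, the raid5-md0 slab goes from 275 to 3733 objects. If you want to watch this behaviour on a live array, md exposes the cache through the standard stripe_cache_size and stripe_cache_active sysfs attributes; below is a minimal Python sketch (assuming the array is md0, as the slabinfo rows here suggest):

    import time
    from pathlib import Path

    md = Path("/sys/block/md0/md")   # adjust to your md device

    # Poll the raid5 stripe cache a few times while a write load runs.
    for _ in range(5):
        size   = (md / "stripe_cache_size").read_text().strip()
        active = (md / "stripe_cache_active").read_text().strip()
        print(f"stripe_cache_size={size} stripe_cache_active={active}")
        time.sleep(1)

Before this commit, stripe_cache_size stayed at whatever was configured; with it, you should see the value climb during heavy full-stripe writes and fall back afterwards.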