Re: [PATCH 2/2] md/raid5: exclusive wait_for_stripe

2015-04-26 Thread Yuanhan Liu
On Mon, Apr 27, 2015 at 10:24:05AM +1000, NeilBrown wrote:
> On Fri, 24 Apr 2015 21:39:04 +0800 Yuanhan Liu wrote:
> 
> > I noticed heavy spin lock contention at get_active_stripe() with fsmark
> > multi-threaded write workloads.
> > 
> > Here is where this hot contention comes from. We have a limited number of
> > stripes, and it's a multi-threaded write workload, so those stripes are
> > taken up quickly and later processes are put to sleep to wait for free
> > stripes. When enough stripes (> 1/4 of the total) are released, all of
> > those processes are woken and try to grab the lock. But only one process
> > can hold each hash lock at a time, leaving the rest spinning while they
> > wait to acquire it.
> > 
> > Thus, it is ineffective to wake up all processes and let them battle for
> > a lock that only one of them can hold at a time. Instead, we could make
> > it an exclusive wake-up: wake up one process only. That naturally avoids
> > the heavy spin lock contention.
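For reference, a minimal sketch of the exclusive-wait pattern being described, written against the generic prepare_to_wait_exclusive()/finish_wait() kernel API. The conf->hash_locks and conf->wait_for_stripe names follow raid5's r5conf, but try_get_free_stripe() is an illustrative stand-in; this is an outline of the idea, not the actual patch:

    struct stripe_head *sh;
    DEFINE_WAIT(wait);

    spin_lock_irq(conf->hash_locks + hash);
    for (;;) {
            /* Register as an *exclusive* waiter before re-checking. */
            prepare_to_wait_exclusive(&conf->wait_for_stripe, &wait,
                                      TASK_UNINTERRUPTIBLE);
            sh = try_get_free_stripe(conf, hash);   /* illustrative helper */
            if (sh)
                    break;
            spin_unlock_irq(conf->hash_locks + hash);
            schedule();                             /* sleep until woken */
            spin_lock_irq(conf->hash_locks + hash);
    }
    finish_wait(&conf->wait_for_stripe, &wait);
    spin_unlock_irq(conf->hash_locks + hash);

On the release side, a plain wake_up(&conf->wait_for_stripe) then wakes a single exclusive waiter instead of the whole queue, which is what removes the thundering herd on the hash locks.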
> > 
> > Here are some test results I got with this patch applied (each test was
> > run 3 times):
> > 
> > `fsmark.files_per_sec'
> > ======================
> >
> >    next-20150317             this patch
> >    -------------             ----------
> > metric_value ±stddev    metric_value ±stddev    change
> > testbox/benchmark/testcase-params
> > -------------------------------------------------------
> >   25.600 ±0.0  92.700 ±2.5  262.1% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-fsyncBeforeClose
> >   25.600 ±0.0  77.800 ±0.6  203.9% 
> > ivb44/fsmark/1x-64t-9BRD_6G-RAID5-btrfs-4M-30G-fsyncBeforeClose
> >   32.000 ±0.0  93.800 ±1.7  193.1% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-fsyncBeforeClose
> >   32.000 ±0.0  81.233 ±1.7  153.9% 
> > ivb44/fsmark/1x-64t-9BRD_6G-RAID5-ext4-4M-30G-fsyncBeforeClose
> >   48.800 ±14.5 99.667 ±2.0  104.2% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-fsyncBeforeClose
> >6.400 ±0.0  12.800 ±0.0  100.0% 
> > ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> >   63.133 ±8.2  82.800 ±0.7   31.2% 
> > ivb44/fsmark/1x-64t-9BRD_6G-RAID5-xfs-4M-30G-fsyncBeforeClose
> >  245.067 ±0.7 306.567 ±7.9   25.1% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
> >   17.533 ±0.3  21.000 ±0.8   19.8% 
> > ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> >  188.167 ±1.9 215.033 ±3.1   14.3% 
> > ivb44/fsmark/1x-1t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> >  254.500 ±1.8 290.733 ±2.4   14.2% 
> > ivb44/fsmark/1x-1t-9BRD_6G-RAID5-btrfs-4M-30G-NoSync
> > 
> > `time.system_time'
> > ==================
> >
> >    next-20150317             this patch
> >    -------------             ----------
> > metric_value ±stddev    metric_value ±stddev    change
> > testbox/benchmark/testcase-params
> > -------------------------------------------------------
> > 7235.603 ±1.2 185.163 ±1.9  -97.4% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-fsyncBeforeClose
> > 7666.883 ±2.9 202.750 ±1.0  -97.4% 
> > ivb44/fsmark/1x-64t-9BRD_6G-RAID5-btrfs-4M-30G-fsyncBeforeClose
> >14567.893 ±0.7 421.230 ±0.4  -97.1% 
> > ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> > 3697.667 ±14.0   148.190 ±1.7  -96.0% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-fsyncBeforeClose
> > 5572.867 ±3.8 310.717 ±1.4  -94.4% 
> > ivb44/fsmark/1x-64t-9BRD_6G-RAID5-ext4-4M-30G-fsyncBeforeClose
> > 5565.050 ±0.5 313.277 ±1.5  -94.4% 
> > ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-fsyncBeforeClose
> > 2420.707 ±17.1   171.043 ±2.7  -92.9% 
> > ivb44/fsmark/1x-64t-9BRD_6G-RAID5-xfs-4M-30G-fsyncBeforeClose
> > 3743.300 ±4.6 379.827 ±3.5  -89.9% 
> > ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> > 3308.687 ±6.3 363.050 ±2.0  -89.0% 
> > ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> > 
> > Where,
> > 
> >  1x: 'x' means the number of iterations (loops), corresponding to the
> > '-L' option of fsmark
> > 
> >  1t, 64t: 't' means the number of threads
> > 
> >  4M: the size of a single file, corresponding to the '-s' option of
> > fsmark
> > 
> >  40G, 30G, 120G: the total test size
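To illustrate how those tokens map onto an fs_mark invocation, the run named 1x-64t-...-4M-30G-fsyncBeforeClose corresponds roughly to the command below. The target directory, the per-thread file count, and the -S sync flag are assumptions for illustration, not the exact 0day/LKP job definition:

    # 1x -> -L 1 (loops), 64t -> -t 64 (threads), 4M -> -s <bytes> (file size)
    # 30G total / 4M per file / 64 threads = 120 files per thread (-n)
    # -d (target dir) and -S 1 (assumed fsync-before-close) are illustrative
    fs_mark -d /mnt/raid5 -t 64 -L 1 -s $((4 * 1024 * 1024)) -n 120 -S 1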

Re: [PATCH 2/2] md/raid5: exclusive wait_for_stripe

2015-04-26 Thread NeilBrown
On Fri, 24 Apr 2015 21:39:04 +0800 Yuanhan Liu wrote:

> I noticed heavy spin lock contention at get_active_stripe() with fsmark
> multi-threaded write workloads.
> 
> Here is where this hot contention comes from. We have a limited number of
> stripes, and it's a multi-threaded write workload, so those stripes are
> taken up quickly and later processes are put to sleep to wait for free
> stripes. When enough stripes (> 1/4 of the total) are released, all of
> those processes are woken and try to grab the lock. But only one process
> can hold each hash lock at a time, leaving the rest spinning while they
> wait to acquire it.
> 
> Thus, it is ineffective to wake up all processes and let them battle for
> a lock that only one of them can hold at a time. Instead, we could make
> it an exclusive wake-up: wake up one process only. That naturally avoids
> the heavy spin lock contention.
> 
> Here are some test results I got with this patch applied (each test was
> run 3 times):
> 
> `fsmark.files_per_sec'
> ======================
>
>    next-20150317             this patch
>    -------------             ----------
> metric_value ±stddev    metric_value ±stddev    change
> testbox/benchmark/testcase-params
> -------------------------------------------------------
>   25.600 ±0.0  92.700 ±2.5  262.1% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-fsyncBeforeClose
>   25.600 ±0.0  77.800 ±0.6  203.9% 
> ivb44/fsmark/1x-64t-9BRD_6G-RAID5-btrfs-4M-30G-fsyncBeforeClose
>   32.000 ±0.0  93.800 ±1.7  193.1% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-fsyncBeforeClose
>   32.000 ±0.0  81.233 ±1.7  153.9% 
> ivb44/fsmark/1x-64t-9BRD_6G-RAID5-ext4-4M-30G-fsyncBeforeClose
>   48.800 ±14.5 99.667 ±2.0  104.2% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-fsyncBeforeClose
>6.400 ±0.0  12.800 ±0.0  100.0% 
> ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
>   63.133 ±8.2  82.800 ±0.7   31.2% 
> ivb44/fsmark/1x-64t-9BRD_6G-RAID5-xfs-4M-30G-fsyncBeforeClose
>  245.067 ±0.7 306.567 ±7.9   25.1% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
>   17.533 ±0.3  21.000 ±0.8   19.8% 
> ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
>  188.167 ±1.9 215.033 ±3.1   14.3% 
> ivb44/fsmark/1x-1t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
>  254.500 ±1.8 290.733 ±2.4   14.2% 
> ivb44/fsmark/1x-1t-9BRD_6G-RAID5-btrfs-4M-30G-NoSync
> 
> `time.system_time'
> ==================
>
>    next-20150317             this patch
>    -------------             ----------
> metric_value ±stddev    metric_value ±stddev    change
> testbox/benchmark/testcase-params
> -------------------------------------------------------
> 7235.603 ±1.2 185.163 ±1.9  -97.4% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-fsyncBeforeClose
> 7666.883 ±2.9 202.750 ±1.0  -97.4% 
> ivb44/fsmark/1x-64t-9BRD_6G-RAID5-btrfs-4M-30G-fsyncBeforeClose
>14567.893 ±0.7 421.230 ±0.4  -97.1% 
> ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> 3697.667 ±14.0   148.190 ±1.7  -96.0% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-fsyncBeforeClose
> 5572.867 ±3.8 310.717 ±1.4  -94.4% 
> ivb44/fsmark/1x-64t-9BRD_6G-RAID5-ext4-4M-30G-fsyncBeforeClose
> 5565.050 ±0.5 313.277 ±1.5  -94.4% 
> ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-fsyncBeforeClose
> 2420.707 ±17.1   171.043 ±2.7  -92.9% 
> ivb44/fsmark/1x-64t-9BRD_6G-RAID5-xfs-4M-30G-fsyncBeforeClose
> 3743.300 ±4.6 379.827 ±3.5  -89.9% 
> ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> 3308.687 ±6.3 363.050 ±2.0  -89.0% 
> ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> 
> Where,
> 
>  1x: 'x' means the number of iterations (loops), corresponding to the
> '-L' option of fsmark
> 
>  1t, 64t: 't' means the number of threads
> 
>  4M: the size of a single file, corresponding to the '-s' option of
> fsmark
> 
>  40G, 30G, 120G: the total test size
> 
>  4BRD_12G: BRD is the ramdisk driver; '4' means 4 ramdisks and '12G' is
> the size of each ramdisk, so 48G in total. The RAID5 array was built on
> those ramdisks.
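For concreteness, one way to build such a test device (device names are illustrative; the exact 0day/LKP setup is not shown in this mail):

    # 4 ramdisks of 12G each; brd's rd_size is in KiB (12 * 1024 * 1024)
    modprobe brd rd_nr=4 rd_size=12582912
    # assemble them into a 4-device RAID5 array
    mdadm --create /dev/md0 --level=5 --raid-devices=4 \
          /dev/ram0 /dev/ram1 /dev/ram2 /dev/ram3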
> 
> As you can see, though there is not much performance gain for hard disk
