Re: [patch 0/3 v2] raid5: make stripe handling multi-threading

2013-08-27 Thread NeilBrown
On Mon, 12 Aug 2013 10:18:03 +0800 Shaohua Li <s...@kernel.org> wrote:

> Neil,
> 
> This is another attempt to make raid5 stripe handling multi-threaded.
> The recent workqueue improvements for unbound workqueues look very
> promising for raid5's usage; details are in the first patch.
> 
> The patches are against your tree with 'raid5: make release_stripe
> lockless' and 'raid5: fix stripe release order' applied, but without
> 'raid5: create multiple threads to handle stripes'.
> 
> My test setup has 7 PCIe SSDs, chunksize 8k, stripe_cache_size 2048.
> If multi-threading is enabled, group_thread_cnt is set to 8.
> 
> randwrite   throughput (ratio)    request size (sectors)
>             unpatched/patched     unpatched/patched
> 4k          1/5.9                 8/8
> 8k          1/5.5                 16/13
> 16k         1/4.8                 16/13
> 32k         1/4.6                 18/14
> 64k         1/4.7                 17/13
> 128k        1/4.2                 23/16
> 256k        1/3.5                 41/21
> 512k        1/3                   75/28
> 1M          1/2.6                 134/34
> 
> For example, in the 1M randwrite test the patched kernel is 2.6x faster,
> but the average request size sent to each disk drops from 134 sectors to
> 34 sectors.
> 
> Currently the biggest problem is that the request size drops, because
> multiple threads are dispatching requests. This suggests multi-threading
> may not suit every setup, so it is disabled by default in this version.
> But since throughput is largely improved, I don't think this is a
> blocking issue. I'm still working on improving it, perhaps by scheduling
> the stripes from one block plug as a whole.
> 
> Thanks,
> Shaohua


Thanks.  I like this a lot more than the previous version.

It doesn't seem to apply exactly to my current 'for-next' - probably because
I have moved things around and have a different set of patches applied :-(
If you could rebase it on my current for-next, I'll apply it and probably
submit it for the next merge window.
A couple of little changes I'd like made:

1/ alloc_thread_groups needs to use GFP_NOIO, at least when called from
raid5_store_group_thread_cnt.  At that point IO to the RAID5 is stalled,
so if the allocation needs to free memory it might wait for writeout to
the RAID5 and deadlock.  GFP_NOIO prevents that.
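
Concretely, something like this (just a sketch -- the struct and field
names are my guesses from this thread, not the actual patch; the important
part is the GFP_NOIO):

	static int alloc_thread_groups(struct r5conf *conf, int cnt)
	{
		/* Hypothetical shape of the allocator.  GFP_NOIO matters:
		 * a GFP_KERNEL allocation here could trigger writeback to
		 * the very RAID5 array whose IO is stalled, and deadlock.
		 */
		struct r5worker_group *groups;

		if (cnt == 0)
			return 0;
		groups = kcalloc(cnt, sizeof(*groups), GFP_NOIO);
		if (!groups)
			return -ENOMEM;
		conf->worker_groups = groups;
		conf->worker_cnt_per_group = cnt;
		return 0;
	}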

2/ could we move the

+   if (!cpu_online(cpu)) {
+   cpu = cpumask_any(cpu_online_mask);
+   sh->cpu = cpu;
+   }

inside raid5_wakeup_stripe_thread() ?

It isn't a perfect fit, but I think it is a better place for it.
It could check list_empty(&sh->lru) and if it is empty, add to
the appropriate group->handle_list.
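
Something along these lines, perhaps (again just a sketch: cpu_to_group(),
raid5_wq, and the worker array are assumptions, not code from your patches):

	static void raid5_wakeup_stripe_thread(struct stripe_head *sh)
	{
		struct r5conf *conf = sh->raid_conf;
		struct r5worker_group *group;
		int cpu = sh->cpu;

		/* Pick a live CPU if the stripe's CPU went offline. */
		if (!cpu_online(cpu)) {
			cpu = cpumask_any(cpu_online_mask);
			sh->cpu = cpu;
		}

		group = conf->worker_groups + cpu_to_group(cpu);
		/* Queue the stripe for its group unless it is on a list. */
		if (list_empty(&sh->lru))
			list_add_tail(&sh->lru, &group->handle_list);
		queue_work_on(cpu, raid5_wq, &group->workers[0].work);
	}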

The code in do_release_stripe would become
	else {
		clear_bit(STRIPE_DELAYED, &sh->state);
		clear_bit(STRIPE_BIT_DELAY, &sh->state);
+		if (conf->worker_cnt_per_group == 0) {
+			list_add_tail(&sh->lru, &conf->handle_list);
+		} else {
+			raid5_wakeup_stripe_thread(sh);
+			return;
+		}
	}
	md_wakeup_thread(conf->mddev->thread);

??

NeilBrown


[patch 0/3 v2] raid5: make stripe handling multi-threading

2013-08-11 Thread Shaohua Li
Neil,

This is another attempt to make raid5 stripe handling multi-threaded.
The recent workqueue improvements for unbound workqueues look very
promising for raid5's usage; details are in the first patch.
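
Roughly speaking, an unbound workqueue is created along these lines (the
queue name and exact flags here are illustrative, not necessarily what the
patch uses):

	/* Illustrative only: an unbound workqueue lets the scheduler place
	 * worker threads on any CPU; WQ_MEM_RECLAIM guarantees forward
	 * progress under memory pressure, which IO paths need.
	 */
	struct workqueue_struct *raid5_wq;

	raid5_wq = alloc_workqueue("raid5wq", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	if (!raid5_wq)
		return -ENOMEM;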

The patches are against your tree with 'raid5: make release_stripe
lockless' and 'raid5: fix stripe release order' applied, but without
'raid5: create multiple threads to handle stripes'.

My test setup has 7 PCIe SSDs, chunksize 8k, stripe_cache_size 2048.
If multi-threading is enabled, group_thread_cnt is set to 8.
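(group_thread_cnt is set through sysfs, e.g.
echo 8 > /sys/block/md0/md/group_thread_cnt, assuming the array is md0.)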

randwrite   throughput (ratio)    request size (sectors)
            unpatched/patched     unpatched/patched
4k          1/5.9                 8/8
8k          1/5.5                 16/13
16k         1/4.8                 16/13
32k         1/4.6                 18/14
64k         1/4.7                 17/13
128k        1/4.2                 23/16
256k        1/3.5                 41/21
512k        1/3                   75/28
1M          1/2.6                 134/34

For example, in the 1M randwrite test the patched kernel is 2.6x faster,
but the average request size sent to each disk drops from 134 sectors to
34 sectors.

Currently the biggest problem is that the request size drops, because
multiple threads are dispatching requests. This suggests multi-threading
may not suit every setup, so it is disabled by default in this version.
But since throughput is largely improved, I don't think this is a
blocking issue. I'm still working on improving it, perhaps by scheduling
the stripes from one block plug as a whole.

Thanks,
Shaohua