Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
On 10/10/2017 14:45, Ming Lei wrote:
> Hi John,
>
> All changes in V6.2 are blk-mq/scsi-mq only, which shouldn't affect non-SCSI_MQ,
> so I suggest you compare the performance between deadline and mq-deadline, as
> Johannes mentioned.
>
> > V6.2 series with default SCSI_MQ
> > read, rw, write IOPS
> > 700K, 130K/128K, 640K
>
> If possible, could you provide your fio script and log for both non-SCSI_MQ
> (deadline) and SCSI_MQ (mq-deadline)? Maybe some clues can be figured out.
>
> Also, I just put another patch on the V6.2 branch, which may improve things a
> bit too. You may try that in your test.
>
> https://github.com/ming1/linux/commit/e31e2eec46c9b5ae7cfa181e9b77adad2c6a97ce
>
> --
> Ming

Hi Ming Lei,

OK, I have tested deadline vs mq-deadline for your V6.2 branch and 4.14-rc2.
Unfortunately I don't have time now to test your experimental patches.

4.14-rc2 without default SCSI_MQ, deadline scheduler
read, rw, write IOPS
920K, 115K/115K, 806K

4.14-rc2 with default SCSI_MQ, mq-deadline scheduler
read, rw, write IOPS
280K, 99K/99K, 300K

V6.2 series without default SCSI_MQ, deadline scheduler
read, rw, write IOPS
919K, 117K/117K, 806K

V6.2 series with default SCSI_MQ, mq-deadline scheduler
read, rw, write IOPS
688K, 128K/128K, 630K

I think that the non-mq results look a bit more sensible - that is, they are
consistent.

Here's my script sample:

[global]
rw=rW
direct=1
ioengine=libaio
iodepth=2048
numjobs=1
bs=4k
;size=1024m
;zero_buffers=1
group_reporting=1
group_reporting=1
;ioscheduler=noop
cpumask=0xff
;cpus_allowed=0-3
;gtod_reduce=1
;iodepth_batch=2
;iodepth_batch_complete=2
runtime=1
;thread
loops = 1

[job1]
filename=/dev/sdb:
[job1]
filename=/dev/sdc:
[job1]
filename=/dev/sdd:
[job1]
filename=/dev/sde:
[job1]
filename=/dev/sdf:
[job1]
filename=/dev/sdg:

John
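For reference, here is a rough sketch of how the two configurations compared
above could be selected - purely illustrative, assuming the same /dev/sd[b-g]
disks as in the fio job; the exact defaults depend on the kernel config:

# legacy block layer + deadline: boot with scsi_mod.use_blk_mq=0, then
for d in sdb sdc sdd sde sdf sdg; do
    echo deadline > /sys/block/$d/queue/scheduler
done

# scsi-mq + mq-deadline: boot with scsi_mod.use_blk_mq=1, then
for d in sdb sdc sdd sde sdf sdg; do
    echo mq-deadline > /sys/block/$d/queue/scheduler
done

# verify which scheduler is active on each disk
grep . /sys/block/sd[b-g]/queue/scheduler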
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
On Tue, Oct 10, 2017 at 01:24:52PM +0100, John Garry wrote: > On 10/10/2017 02:46, Ming Lei wrote: > > > > > > I tested this series for the SAS controller on HiSilicon hip07 > > > > > > platform as I > > > > > > am interested in enabling MQ for this driver. Driver is > > > > > > ./drivers/scsi/hisi_sas/. > > > > > > > > > > > > So I found that that performance is improved when enabling default > > > > > > SCSI_MQ > > > > > > with this series vs baseline. However, it is still not as a good as > > > > > > when > > > > > > default SCSI_MQ is disabled. > > > > > > > > > > > > Here are some figures I got with fio: > > > > > > 4.14-rc2 without default SCSI_MQ > > > > > > read, rw, write IOPS > > > > > > 952K, 133K/133K, 800K > > > > > > > > > > > > 4.14-rc2 with default SCSI_MQ > > > > > > read, rw, write IOPS > > > > > > 311K, 117K/117K, 320K > > > > > > > > > > > > This series* without default SCSI_MQ > > > > > > read, rw, write IOPS > > > > > > 975K, 132K/132K, 790K > > > > > > > > > > > > This series* with default SCSI_MQ > > > > > > read, rw, write IOPS > > > > > > 770K, 164K/164K, 594K > > > > > > > > Thanks for testing this patchset! > > > > > > > > Looks there is big improvement, but the gap compared with > > > > block legacy is not small too. > > > > > > > > > > > > > > > > Please note that hisi_sas driver does not enable mq by exposing > > > > > > multiple > > > > > > queues to upper layer (even though it has multiple queues). I have > > > > > > been > > > > > > playing with enabling it, but my performance is always worse... > > > > > > > > > > > > * I'm using > > > > > > https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V5.1, > > > > > > as advised by Ming Lei. > > > > > > > > Could you test on the following branch and see if it makes a > > > > difference? > > > > > > > > > > > > https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V6.1_test > > Hi John, > > > > Please test the following branch directly: > > > > https://github.com/ming1/linux/tree/blk_mq_improve_scsi_mpath_perf_V6.2_test > > > > And code is simplified and cleaned up much in V6.2, then only two extra > > patches(top 2) are needed against V6 which was posted yesterday. > > > > Please test SCSI_MQ with mq-deadline, which should be the default > > mq scheduler on your HiSilicon SAS. > > Hi Ming Lei, > > It's using cfq (for non-mq) and mq-deadline (obviously for mq). > > root@(none)$ pwd > /sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:7/end_device-0:0:7 > root@(none)$ more ./target0:0:3/0:0:3:0/block/sdd/queue/scheduler > noop [cfq] > > and > > root@(none)$ more ./target0:0:3/0:0:3:0/block/sdd/queue/scheduler > [mq-deadline] kyber none > > Unfortunately my setup has changed since yeterday, and the absolute figures > are not the exact same (I retested 4.14-rc2). However, we still see that > drop when mq is enabled. > > Here's the results: > 4.14-rc4 without default SCSI_MQ > read, rw, write IOPS > 860K, 112K/112K, 800K > > 4.14-rc2 without default SCSI_MQ > read, rw, write IOPS > 880K, 113K/113K, 808K > > V6.2 series without default SCSI_MQ > read, rw, write IOPS > 820K, 114/114K, 790K Hi John, All change in V6.2 is blk-mq/scsi-mq only, which shouldn't affect non SCSI_MQ, so I suggest you to compare the perf between deadline and mq-deadline, like Johannes mentioned. > > V6.2 series with default SCSI_MQ > read, rw, write IOPS > 700K, 130K/128K, 640K If possible, could you provide your fio script and log on both non SCSI_MQ(deadline) and SCSI_MQ(mq_deadline)? 
Maybe some clues can be figured out.

Also, I just put another patch on the V6.2 branch, which may improve things a
bit too. You may try that in your test.

https://github.com/ming1/linux/commit/e31e2eec46c9b5ae7cfa181e9b77adad2c6a97ce

--
Ming
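For anyone following along, one hypothetical way to pick up that updated branch
for testing (branch name taken from the earlier mails; adjust as needed):

git fetch https://github.com/ming1/linux.git blk_mq_improve_scsi_mpath_perf_V6.2_test
git checkout -b v6.2-test FETCH_HEAD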
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
> On 10 Oct 2017, at 14:34, Johannes Thumshirn wrote:
>
> Hi John,
>
> On Tue, Oct 10, 2017 at 01:24:52PM +0100, John Garry wrote:
>> It's using cfq (for non-mq) and mq-deadline (obviously for mq).
>
> Please be aware that cfq and mq-deadline are _not_ comparable; for a realistic
> comparison please use deadline and mq-deadline, or cfq and bfq.

Please set low_latency=0 for bfq if yours is just a maximum-throughput test.

Thanks,
Paolo

>> root@(none)$ pwd
>> /sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:7/end_device-0:0:7
>> root@(none)$ more ./target0:0:3/0:0:3:0/block/sdd/queue/scheduler
>> noop [cfq]
>
> Maybe missing CONFIG_IOSCHED_DEADLINE?
>
> Thanks,
> Johannes
>
> --
> Johannes Thumshirn                    Storage
> jthumsh...@suse.de                    +49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
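To illustrate Paolo's low_latency suggestion, a minimal sketch - it assumes bfq
is built into the test kernel and that /dev/sdd (from the output above) is
running on blk-mq:

echo bfq > /sys/block/sdd/queue/scheduler
echo 0 > /sys/block/sdd/queue/iosched/low_latency
cat /sys/block/sdd/queue/scheduler     # should show [bfq]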
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
Hi John,

On Tue, Oct 10, 2017 at 01:24:52PM +0100, John Garry wrote:
> It's using cfq (for non-mq) and mq-deadline (obviously for mq).

Please be aware that cfq and mq-deadline are _not_ comparable; for a realistic
comparison please use deadline and mq-deadline, or cfq and bfq.

> root@(none)$ pwd
> /sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:7/end_device-0:0:7
> root@(none)$ more ./target0:0:3/0:0:3:0/block/sdd/queue/scheduler
> noop [cfq]

Maybe missing CONFIG_IOSCHED_DEADLINE?

Thanks,
Johannes

--
Johannes Thumshirn                    Storage
jthumsh...@suse.de                    +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
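A quick sketch for checking Johannes's CONFIG_IOSCHED_DEADLINE hint; which
config file is available depends on how the kernel was built:

grep IOSCHED /boot/config-$(uname -r)
# or, if CONFIG_IKCONFIG_PROC is enabled on the board:
zcat /proc/config.gz | grep IOSCHED_DEADLINE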
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
On 10/10/2017 02:46, Ming Lei wrote:
> > I tested this series for the SAS controller on the HiSilicon hip07 platform,
> > as I am interested in enabling MQ for this driver. The driver is
> > ./drivers/scsi/hisi_sas/.
> >
> > So I found that performance is improved when enabling default SCSI_MQ with
> > this series vs baseline. However, it is still not as good as when default
> > SCSI_MQ is disabled.
> >
> > Here are some figures I got with fio:
> > 4.14-rc2 without default SCSI_MQ
> > read, rw, write IOPS
> > 952K, 133K/133K, 800K
> >
> > 4.14-rc2 with default SCSI_MQ
> > read, rw, write IOPS
> > 311K, 117K/117K, 320K
> >
> > This series* without default SCSI_MQ
> > read, rw, write IOPS
> > 975K, 132K/132K, 790K
> >
> > This series* with default SCSI_MQ
> > read, rw, write IOPS
> > 770K, 164K/164K, 594K
>
> Thanks for testing this patchset!
>
> Looks like there is a big improvement, but the gap compared with block legacy
> is not small either.
>
> > Please note that the hisi_sas driver does not enable mq by exposing multiple
> > queues to the upper layer (even though it has multiple queues). I have been
> > playing with enabling it, but my performance is always worse...
> >
> > * I'm using
> > https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V5.1,
> > as advised by Ming Lei.
>
> Could you test on the following branch and see if it makes a difference?
>
> https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V6.1_test
>
> Hi John,
>
> Please test the following branch directly:
>
> https://github.com/ming1/linux/tree/blk_mq_improve_scsi_mpath_perf_V6.2_test
>
> The code is simplified and cleaned up a lot in V6.2; only two extra patches
> (top 2) are needed against V6, which was posted yesterday.
>
> Please test SCSI_MQ with mq-deadline, which should be the default mq scheduler
> on your HiSilicon SAS.
>
> --
> Ming

Hi Ming Lei,

It's using cfq (for non-mq) and mq-deadline (obviously for mq).

root@(none)$ pwd
/sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:7/end_device-0:0:7
root@(none)$ more ./target0:0:3/0:0:3:0/block/sdd/queue/scheduler
noop [cfq]

and

root@(none)$ more ./target0:0:3/0:0:3:0/block/sdd/queue/scheduler
[mq-deadline] kyber none

Unfortunately my setup has changed since yesterday, and the absolute figures
are not exactly the same (I retested 4.14-rc2). However, we still see that drop
when mq is enabled.

Here are the results:
4.14-rc4 without default SCSI_MQ
read, rw, write IOPS
860K, 112K/112K, 800K

4.14-rc2 without default SCSI_MQ
read, rw, write IOPS
880K, 113K/113K, 808K

V6.2 series without default SCSI_MQ
read, rw, write IOPS
820K, 114K/114K, 790K

V6.2 series with default SCSI_MQ
read, rw, write IOPS
700K, 130K/128K, 640K

Cheers,
John
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
On Mon, Oct 09, 2017 at 11:04:39PM +0800, Ming Lei wrote: > Hi John, > > On Mon, Oct 09, 2017 at 01:09:22PM +0100, John Garry wrote: > > On 30/09/2017 11:27, Ming Lei wrote: > > > Hi Jens, > > > > > > In Red Hat internal storage test wrt. blk-mq scheduler, we > > > found that I/O performance is much bad with mq-deadline, especially > > > about sequential I/O on some multi-queue SCSI devcies(lpfc, qla2xxx, > > > SRP...) > > > > > > Turns out one big issue causes the performance regression: requests > > > are still dequeued from sw queue/scheduler queue even when ldd's > > > queue is busy, so I/O merge becomes quite difficult to make, then > > > sequential IO degrades a lot. > > > > > > This issue becomes one of mains reasons for reverting default SCSI_MQ > > > in V4.13. > > > > > > The 1st patch takes direct issue in blk_mq_request_bypass_insert(), > > > then we can improve dm-mpath's performance in part 2, which will > > > be posted out soon. > > > > > > The 2nd six patches improve this situation, and brings back > > > some performance loss. > > > > > > With this change, SCSI-MQ sequential I/O performance is > > > improved much, Paolo reported that mq-deadline performance > > > improved much[2] in his dbench test wrt V2. Also performanc > > > improvement on lpfc/qla2xx was observed with V1.[1] > > > > > > Please consider it for V4.15. > > > > > > [1] http://marc.info/?l=linux-block=150151989915776=2 > > > [2] https://marc.info/?l=linux-block=150217980602843=2 > > > > > > > I tested this series for the SAS controller on HiSilicon hip07 platform as I > > am interested in enabling MQ for this driver. Driver is > > ./drivers/scsi/hisi_sas/. > > > > So I found that that performance is improved when enabling default SCSI_MQ > > with this series vs baseline. However, it is still not as a good as when > > default SCSI_MQ is disabled. > > > > Here are some figures I got with fio: > > 4.14-rc2 without default SCSI_MQ > > read, rw, write IOPS > > 952K, 133K/133K, 800K > > > > 4.14-rc2 with default SCSI_MQ > > read, rw, write IOPS > > 311K, 117K/117K, 320K > > > > This series* without default SCSI_MQ > > read, rw, write IOPS > > 975K, 132K/132K, 790K > > > > This series* with default SCSI_MQ > > read, rw, write IOPS > > 770K, 164K/164K, 594K > > Thanks for testing this patchset! > > Looks there is big improvement, but the gap compared with > block legacy is not small too. > > > > > Please note that hisi_sas driver does not enable mq by exposing multiple > > queues to upper layer (even though it has multiple queues). I have been > > playing with enabling it, but my performance is always worse... > > > > * I'm using > > https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V5.1, > > as advised by Ming Lei. > > Could you test on the following branch and see if it makes a > difference? > > > https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V6.1_test Hi John, Please test the following branch directly: https://github.com/ming1/linux/tree/blk_mq_improve_scsi_mpath_perf_V6.2_test And code is simplified and cleaned up much in V6.2, then only two extra patches(top 2) are needed against V6 which was posted yesterday. Please test SCSI_MQ with mq-deadline, which should be the default mq scheduler on your HiSilicon SAS. -- Ming
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
Hi John, On Mon, Oct 09, 2017 at 01:09:22PM +0100, John Garry wrote: > On 30/09/2017 11:27, Ming Lei wrote: > > Hi Jens, > > > > In Red Hat internal storage test wrt. blk-mq scheduler, we > > found that I/O performance is much bad with mq-deadline, especially > > about sequential I/O on some multi-queue SCSI devcies(lpfc, qla2xxx, > > SRP...) > > > > Turns out one big issue causes the performance regression: requests > > are still dequeued from sw queue/scheduler queue even when ldd's > > queue is busy, so I/O merge becomes quite difficult to make, then > > sequential IO degrades a lot. > > > > This issue becomes one of mains reasons for reverting default SCSI_MQ > > in V4.13. > > > > The 1st patch takes direct issue in blk_mq_request_bypass_insert(), > > then we can improve dm-mpath's performance in part 2, which will > > be posted out soon. > > > > The 2nd six patches improve this situation, and brings back > > some performance loss. > > > > With this change, SCSI-MQ sequential I/O performance is > > improved much, Paolo reported that mq-deadline performance > > improved much[2] in his dbench test wrt V2. Also performanc > > improvement on lpfc/qla2xx was observed with V1.[1] > > > > Please consider it for V4.15. > > > > [1] http://marc.info/?l=linux-block=150151989915776=2 > > [2] https://marc.info/?l=linux-block=150217980602843=2 > > > > I tested this series for the SAS controller on HiSilicon hip07 platform as I > am interested in enabling MQ for this driver. Driver is > ./drivers/scsi/hisi_sas/. > > So I found that that performance is improved when enabling default SCSI_MQ > with this series vs baseline. However, it is still not as a good as when > default SCSI_MQ is disabled. > > Here are some figures I got with fio: > 4.14-rc2 without default SCSI_MQ > read, rw, write IOPS > 952K, 133K/133K, 800K > > 4.14-rc2 with default SCSI_MQ > read, rw, write IOPS > 311K, 117K/117K, 320K > > This series* without default SCSI_MQ > read, rw, write IOPS > 975K, 132K/132K, 790K > > This series* with default SCSI_MQ > read, rw, write IOPS > 770K, 164K/164K, 594K Thanks for testing this patchset! Looks there is big improvement, but the gap compared with block legacy is not small too. > > Please note that hisi_sas driver does not enable mq by exposing multiple > queues to upper layer (even though it has multiple queues). I have been > playing with enabling it, but my performance is always worse... > > * I'm using > https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V5.1, > as advised by Ming Lei. Could you test on the following branch and see if it makes a difference? https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V6.1_test BTW, one big change is in the following commit, which just takes block legacy's policy to dequeue request, and I can observe some improvement on virtio-scsi too, and this commit is just for verification/debug purpose, which is never posted out before. https://github.com/ming1/linux/commit/94a117fdd9cfc1291445e5a35f04464c89c9ce70 Thanks, Ming
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
On 30/09/2017 11:27, Ming Lei wrote: Hi Jens, In Red Hat internal storage test wrt. blk-mq scheduler, we found that I/O performance is much bad with mq-deadline, especially about sequential I/O on some multi-queue SCSI devcies(lpfc, qla2xxx, SRP...) Turns out one big issue causes the performance regression: requests are still dequeued from sw queue/scheduler queue even when ldd's queue is busy, so I/O merge becomes quite difficult to make, then sequential IO degrades a lot. This issue becomes one of mains reasons for reverting default SCSI_MQ in V4.13. The 1st patch takes direct issue in blk_mq_request_bypass_insert(), then we can improve dm-mpath's performance in part 2, which will be posted out soon. The 2nd six patches improve this situation, and brings back some performance loss. With this change, SCSI-MQ sequential I/O performance is improved much, Paolo reported that mq-deadline performance improved much[2] in his dbench test wrt V2. Also performanc improvement on lpfc/qla2xx was observed with V1.[1] Please consider it for V4.15. [1] http://marc.info/?l=linux-block=150151989915776=2 [2] https://marc.info/?l=linux-block=150217980602843=2 I tested this series for the SAS controller on HiSilicon hip07 platform as I am interested in enabling MQ for this driver. Driver is ./drivers/scsi/hisi_sas/. So I found that that performance is improved when enabling default SCSI_MQ with this series vs baseline. However, it is still not as a good as when default SCSI_MQ is disabled. Here are some figures I got with fio: 4.14-rc2 without default SCSI_MQ read, rw, write IOPS 952K, 133K/133K, 800K 4.14-rc2 with default SCSI_MQ read, rw, write IOPS 311K, 117K/117K, 320K This series* without default SCSI_MQ read, rw, write IOPS 975K, 132K/132K, 790K This series* with default SCSI_MQ read, rw, write IOPS 770K, 164K/164K, 594K Please note that hisi_sas driver does not enable mq by exposing multiple queues to upper layer (even though it has multiple queues). I have been playing with enabling it, but my performance is always worse... * I'm using https://github.com/ming1/linux/commits/blk_mq_improve_scsi_mpath_perf_V5.1, as advised by Ming Lei. Thanks, John V5: - address some comments from Omar - add Tested-by & Reveiewed-by tag - use direct issue for blk_mq_request_bypass_insert(), and start to consider to improve sequential I/O for dm-mpath - only include part 1(the original patch 1 ~ 6), as suggested by Omar V4: - add Reviewed-by tag - some trival change: typo fix in commit log or comment, variable name, no actual functional change V3: - totally round robin for picking req from ctx, as suggested by Bart - remove one local variable in __sbitmap_for_each_set() - drop patches of single dispatch list, which can improve performance on mq-deadline, but cause a bit degrade on none because all hctxs need to be checked after ->dispatch is flushed. Will post it again once it is mature. 
- rebase on v4.13-rc6 with block for-next V2: - dequeue request from sw queues in round roubin's style as suggested by Bart, and introduces one helper in sbitmap for this purpose - improve bio merge via hash table from sw queue - add comments about using DISPATCH_BUSY state in lockless way, simplifying handling on busy state, - hold ctx->lock when clearing ctx busy bit as suggested by Bart Ming Lei (7): blk-mq: issue rq directly in blk_mq_request_bypass_insert() blk-mq-sched: fix scheduler bad performance sbitmap: introduce __sbitmap_for_each_set() blk-mq: introduce blk_mq_dequeue_from_ctx() blk-mq-sched: move actual dispatching into one helper blk-mq-sched: improve dispatching from sw queue blk-mq-sched: don't dequeue request until all in ->dispatch are flushed block/blk-core.c| 3 +- block/blk-mq-debugfs.c | 1 + block/blk-mq-sched.c| 104 --- block/blk-mq.c | 114 +++- block/blk-mq.h | 4 +- drivers/md/dm-rq.c | 2 +- include/linux/blk-mq.h | 3 ++ include/linux/sbitmap.h | 64 +++ 8 files changed, 238 insertions(+), 57 deletions(-)
Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
On Sat, Sep 30, 2017 at 06:27:13PM +0800, Ming Lei wrote: > Hi Jens, > > In Red Hat internal storage test wrt. blk-mq scheduler, we > found that I/O performance is much bad with mq-deadline, especially > about sequential I/O on some multi-queue SCSI devcies(lpfc, qla2xxx, > SRP...) > > Turns out one big issue causes the performance regression: requests > are still dequeued from sw queue/scheduler queue even when ldd's > queue is busy, so I/O merge becomes quite difficult to make, then > sequential IO degrades a lot. > > This issue becomes one of mains reasons for reverting default SCSI_MQ > in V4.13. > > The 1st patch takes direct issue in blk_mq_request_bypass_insert(), > then we can improve dm-mpath's performance in part 2, which will > be posted out soon. > > The 2nd six patches improve this situation, and brings back > some performance loss. > > With this change, SCSI-MQ sequential I/O performance is > improved much, Paolo reported that mq-deadline performance > improved much[2] in his dbench test wrt V2. Also performanc > improvement on lpfc/qla2xx was observed with V1.[1] > > Please consider it for V4.15. > > [1] http://marc.info/?l=linux-block=150151989915776=2 > [2] https://marc.info/?l=linux-block=150217980602843=2 > > V5: > - address some comments from Omar > - add Tested-by & Reveiewed-by tag > - use direct issue for blk_mq_request_bypass_insert(), and > start to consider to improve sequential I/O for dm-mpath > - only include part 1(the original patch 1 ~ 6), as suggested > by Omar > > V4: > - add Reviewed-by tag > - some trival change: typo fix in commit log or comment, > variable name, no actual functional change > > V3: > - totally round robin for picking req from ctx, as suggested > by Bart > - remove one local variable in __sbitmap_for_each_set() > - drop patches of single dispatch list, which can improve > performance on mq-deadline, but cause a bit degrade on > none because all hctxs need to be checked after ->dispatch > is flushed. Will post it again once it is mature. > - rebase on v4.13-rc6 with block for-next > > V2: > - dequeue request from sw queues in round roubin's style > as suggested by Bart, and introduces one helper in sbitmap > for this purpose > - improve bio merge via hash table from sw queue > - add comments about using DISPATCH_BUSY state in lockless way, > simplifying handling on busy state, > - hold ctx->lock when clearing ctx busy bit as suggested > by Bart > > > Ming Lei (7): > blk-mq: issue rq directly in blk_mq_request_bypass_insert() > blk-mq-sched: fix scheduler bad performance > sbitmap: introduce __sbitmap_for_each_set() > blk-mq: introduce blk_mq_dequeue_from_ctx() > blk-mq-sched: move actual dispatching into one helper > blk-mq-sched: improve dispatching from sw queue > blk-mq-sched: don't dequeue request until all in ->dispatch are > flushed > > block/blk-core.c| 3 +- > block/blk-mq-debugfs.c | 1 + > block/blk-mq-sched.c| 104 --- > block/blk-mq.c | 114 > +++- > block/blk-mq.h | 4 +- > drivers/md/dm-rq.c | 2 +- > include/linux/blk-mq.h | 3 ++ > include/linux/sbitmap.h | 64 +++ > 8 files changed, 238 insertions(+), 57 deletions(-) Oops, the title should have been: [PATCH V5 0/7] blk-mq-sched: improve sequential I/O performance(part 1) Sorry for that. -- Ming
[PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)
Hi Jens,

In Red Hat internal storage tests of the blk-mq scheduler, we found that I/O
performance is quite bad with mq-deadline, especially for sequential I/O on
some multi-queue SCSI devices (lpfc, qla2xxx, SRP...).

It turns out one big issue causes the performance regression: requests are
still dequeued from the sw queue/scheduler queue even when the LLD's queue is
busy, so I/O merging becomes quite difficult, and sequential I/O degrades a
lot. This issue became one of the main reasons for reverting default SCSI_MQ
in v4.13.

The 1st patch uses direct issue in blk_mq_request_bypass_insert(), so that we
can improve dm-mpath's performance in part 2, which will be posted out soon.

The next six patches improve this situation and bring back some of the
performance loss.

With this change, SCSI_MQ sequential I/O performance is improved a lot; Paolo
reported that mq-deadline performance improved much[2] in his dbench test wrt
V2. Performance improvement on lpfc/qla2xxx was also observed with V1.[1]

Please consider it for v4.15.

[1] http://marc.info/?l=linux-block&m=150151989915776&w=2
[2] https://marc.info/?l=linux-block&m=150217980602843&w=2

V5:
	- address some comments from Omar
	- add Tested-by & Reviewed-by tags
	- use direct issue for blk_mq_request_bypass_insert(), and start to
	  consider improving sequential I/O for dm-mpath
	- only include part 1 (the original patches 1 ~ 6), as suggested by Omar

V4:
	- add Reviewed-by tags
	- some trivial changes: typo fixes in commit log or comments, variable
	  names, no actual functional change

V3:
	- totally round-robin for picking req from ctx, as suggested by Bart
	- remove one local variable in __sbitmap_for_each_set()
	- drop the patches for a single dispatch list, which can improve
	  performance on mq-deadline but cause a bit of degradation on none
	  because all hctxs need to be checked after ->dispatch is flushed.
	  Will post them again once they are mature.
	- rebase on v4.13-rc6 with block for-next

V2:
	- dequeue requests from sw queues in round-robin style as suggested by
	  Bart, and introduce one helper in sbitmap for this purpose
	- improve bio merging via a hash table from the sw queue
	- add comments about using the DISPATCH_BUSY state in a lockless way,
	  simplifying handling of the busy state
	- hold ctx->lock when clearing the ctx busy bit, as suggested by Bart

Ming Lei (7):
  blk-mq: issue rq directly in blk_mq_request_bypass_insert()
  blk-mq-sched: fix scheduler bad performance
  sbitmap: introduce __sbitmap_for_each_set()
  blk-mq: introduce blk_mq_dequeue_from_ctx()
  blk-mq-sched: move actual dispatching into one helper
  blk-mq-sched: improve dispatching from sw queue
  blk-mq-sched: don't dequeue request until all in ->dispatch are flushed

 block/blk-core.c        |   3 +-
 block/blk-mq-debugfs.c  |   1 +
 block/blk-mq-sched.c    | 104 ---
 block/blk-mq.c          | 114 +++-
 block/blk-mq.h          |   4 +-
 drivers/md/dm-rq.c      |   2 +-
 include/linux/blk-mq.h  |   3 ++
 include/linux/sbitmap.h |  64 +++
 8 files changed, 238 insertions(+), 57 deletions(-)

--
2.9.5
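As an aside for anyone reproducing these numbers: the I/O merging behaviour the
cover letter refers to can be watched from standard block statistics while the
sequential fio job runs - a sketch, assuming /dev/sdb is one of the test disks:

# fields 2 and 6 of /sys/block/<dev>/stat are the merged read/write counts
cat /sys/block/sdb/stat
# or watch the merge rates directly (rrqm/s and wrqm/s columns)
iostat -x 1 /dev/sdb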