This patch uses a hash table to do bio merge from the sw queue, so that we align with the way bio merge is done by blk-mq schedulers and the legacy block path.
It turns out that bio merge via a hash table is more efficient than a simple merge over the last 8 requests in the sw queue. On SCSI SRP, an IOPS increase of ~10% is observed.
Prepare for supporting bio merge to the sw queue if no blk-mq I/O scheduler is used.
Signed-off-by: Ming Lei
---
block/blk-mq.h | 4
block/blk.h | 3 +++
block/elevator.c | 22 +++---
3 files changed, 26 insertions(+), 3 deletions(-)
This patch introduces one function, __blk_mq_try_merge(), which will be reused for bio merge to the sw queue in a following patch.
No functional change.
Reviewed-by: Bart Van Assche
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 18
So that we can reuse __elv_merge() to merge bios into requests from the sw queue in the following patches.
Signed-off-by: Ming Lei
---
block/elevator.c | 19 +--
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/block/elevator.c b/block/elevator.c
blk_mq_sched_try_merge() will be reused in following patches to support bio merge to the blk-mq sw queue, so add checks to related functions that are called from blk_mq_sched_try_merge().
Signed-off-by: Ming Lei
---
block/elevator.c | 16
1 file changed, 16
We need these helpers to support using a hash table to improve bio merge from the sw queue in the following patches.
No functional change.
Signed-off-by: Ming Lei
---
block/blk.h | 52
block/elevator.c | 36
SCSI sets q->queue_depth from shost->cmd_per_lun. q->queue_depth is per request_queue, and is more related to the scheduler queue than to the hw queue depth, which can be shared among queues, such as with TAG_SHARED.
This patch tries to use q->queue_depth as a hint for computing q->nr_requests.
This function is introduced for dequeuing requests from the sw queue so that we can dispatch them in the scheduler's way.
More importantly, some SCSI devices may set q->queue_depth, which is a per-request_queue limit applied to pending I/O across all hctxs. This function is introduced to avoid
During dispatch, we move all requests from hctx->dispatch to a temporary list, then dispatch them one by one from that list. Unfortunately, during this period a queue run from another context may see the queue as idle, then start to dequeue from the sw/scheduler queue and still try to dispatch
The following patch will use one hint to figure out the default queue depth for the scheduler queue, so introduce the helper blk_mq_sched_queue_depth() for this purpose.
Reviewed-by: Christoph Hellwig
Reviewed-by: Bart Van Assche
Signed-off-by: Ming Lei
So that it becomes easy to support dispatching from the sw queue in the following patch.
No functional change.
Reviewed-by: Bart Van Assche
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c | 28 ++--
1 file changed, 18
SCSI devices use a host-wide tagset, and the shared driver tag space is often quite big. Meanwhile, there is also a queue depth for each LUN (.cmd_per_lun), which is often small.
So lots of requests may stay in the sw queue, and we always flush all of those belonging to the same hw queue and dispatch them all to the driver,
We need to iterate over the ctxs starting from any ctx in a round-robin way, so introduce this helper.
Cc: Omar Sandoval
Signed-off-by: Ming Lei
---
include/linux/sbitmap.h | 54 -
1 file changed, 40 insertions(+), 14
When the hw queue is busy, we shouldn't take requests from the scheduler queue any more, otherwise it is difficult to do IO merge.
This patch fixes the awful IO performance on some SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber is used, by not taking requests if the hw queue is busy.
Tested-by: Oleksandr Natalenko
On Saturday, 2 September 2017 at 15:08:32 CEST, Ming Lei wrote:
> Hi,
>
> The current SCSI quiesce isn't safe, and it is easy to trigger an I/O deadlock.
>
> Once the SCSI device is put into QUIESCE, no new request except for RQF_PREEMPT
> can be dispatched to SCSI successfully.
On Sat, Sep 02, 2017 at 09:08:39PM +0800, Ming Lei wrote:
> RQF_PREEMPT is a bit special because the request is required
> to be dispatched to the lld even when the SCSI device is quiesced.
>
> So this patch introduces __blk_get_request() to allow the block
> layer to allocate a request when the queue is preempt frozen
The two APIs are required to allow request allocation with RQF_PREEMPT when the queue is preempt frozen.
The following two points have to be guaranteed for one queue:
1) preempt freezing can be started only after all in-progress
normal & preempt freezings are completed
2) normal freezing can be
Simply quiescing the SCSI device and waiting for completion of the I/O dispatched to the SCSI queue isn't safe; it is easy to use up requests, because all of these allocated requests can't be dispatched when the device is put into QUIESCE. Then no request can be allocated for RQF_PREEMPT, and the system may hang somewhere,
This usage is basically the same as blk-mq's, so that we can support freezing the queue easily.
Signed-off-by: Ming Lei
---
block/blk-core.c | 8
1 file changed, 8 insertions(+)
diff --git a/block/blk-core.c b/block/blk-core.c
index ce2d3b6f6c62..85b15833a7a5 100644
---
The only change on the legacy path is that blk_drain_queue() is run from blk_freeze_queue(), which is called in blk_cleanup_queue(). So this patch removes the explicit __blk_drain_queue() call in blk_cleanup_queue().
Signed-off-by: Ming Lei
---
block/blk-core.c | 17
RQF_PREEMPT is a bit special because the request is required to be dispatched to the lld even when the SCSI device is quiesced.
So this patch introduces __blk_get_request() to allow the block layer to allocate a request when the queue is preempt frozen, since we will preempt freeze the queue before quiescing the SCSI device.
This patch just makes it explicit.
Reviewed-by: Johannes Thumshirn
Signed-off-by: Ming Lei
---
block/blk-mq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8cf1f7cbef2b..4c532d8612e1
Hi,
The current SCSI quiesce isn't safe, and it is easy to trigger an I/O deadlock.
Once the SCSI device is put into QUIESCE, no new request except for RQF_PREEMPT can be dispatched to SCSI successfully, and scsi_device_quiesce() just simply waits for completion of the I/Os dispatched to the SCSI stack. It isn't
We will support freezing the queue on the legacy block path too.
Signed-off-by: Ming Lei
---
block/blk-cgroup.c | 4 ++--
block/blk-mq.c | 10 +-
block/elevator.c | 2 +-
drivers/block/loop.c | 8
drivers/nvme/host/core.c | 4 ++--
These APIs will be used by the legacy path too.
Signed-off-by: Ming Lei
---
block/bfq-iosched.c | 2 +-
block/blk-cgroup.c | 4 ++--
block/blk-mq.c | 17 -
block/blk-mq.h | 1 -
block/elevator.c | 2 +-
With regard to the suspend/resume cycle:
Tested-by: Oleksandr Natalenko
On Friday, 1 September 2017 at 20:49:49 CEST, Ming Lei wrote:
> Hi,
>
> The current SCSI quiesce isn't safe, and it is easy to trigger an I/O deadlock.
>
> Once the SCSI device is put into QUIESCE, no new request except for