Re: Issues about the merge_bvec_fn callback in 3.10 series
Hi Neil Thanks so much for your suggestion. On 2019/8/23 9:04, NeilBrown wrote: >> I have checked >> v3.10.108 >> v3.18.140 >> v4.1.49 >> but there seems not fix for it. >> >> And maybe it would be fixed until >> 8ae126660fddbeebb9251a174e6fa45b6ad8f932 >> block: kill merge_bvec_fn() completely >> >> Would anyone please give some suggestion on this ? > > One option would be to make sure that ->bi_rw is set before > bio_add_page is called. > There are about 80 calls, so that isn't trivial, but you might not care > about several of them. > > You could backport the 'kill merge_bvec_fn' patch if you like, but I > wouldn't. The change of introducing a bug is much higher. > I have killed the raid5_mergeable_bvec and backport the patches that could make the chunk_aligned_read be able to split bio by its own. Then I just need to modify the raid5 code and needn't to touch other part of the system, especially the block core. It seems work well till now. Thanks again Jianchao
Re: Issues about the merge_bvec_fn callback in 3.10 series
Would anyone please give some comment here ? Should we discard the merge_bvec_fn for raid5 and backport the bio split code there ? Thanks in advance. Jianchao On 2019/8/21 19:42, Jianchao Wang wrote: > Hi dear all > > This is a question in older kernel versions. > > We are using 3.10 series kernel in our production. And we encountered issue > as below, > > When add a page into a bio, .merge_bvec_fn will be invoked down to the bottom, > and the bio->bi_rw would be saved into bvec_merge_data.bi_rw as the following > code, > > __bio_add_page > --- > if (q->merge_bvec_fn) { > struct bvm = { > .bi_bdev = bio->bi_bdev, > .bi_sector = bio->bi_iter.bi_sector, > .bi_size = bio->bi_iter.bi_size, > .bi_rw = bio->bi_rw, > }; > > /* >* merge_bvec_fn() returns number of bytes it can accept >* at this offset >*/ > if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len) { > bvec->bv_page = NULL; > bvec->bv_len = 0; > bvec->bv_offset = 0; > return 0; > } > } > --- > > However, it seems that the bio->bi_rw has not been set at the moment (set by > submit_bio), > so it is always zero. > > We have a raid5 and the raid5_mergeable_bvec would always handle the write as > read and then > we always get a write bio with a stripe chunk size which is not expected and > would degrade the > performance. This is code, > > raid5_mergeable_bvec > --- > if ((bvm->bi_rw & 1) == WRITE) > return biovec->bv_len; /* always allow writes to be mergeable */ > > if (mddev->new_chunk_sectors < mddev->chunk_sectors) > chunk_sectors = mddev->new_chunk_sectors; > max = (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) > << 9; > if (max < 0) max = 0; > if (max <= biovec->bv_len && bio_sectors == 0) > return biovec->bv_len; > else > return max; > > --- > > I have checked > v3.10.108 > v3.18.140 > v4.1.49 > but there seems not fix for it. > > And maybe it would be fixed until > 8ae126660fddbeebb9251a174e6fa45b6ad8f932 > block: kill merge_bvec_fn() completely > > Would anyone please give some suggestion on this ? > Any comment will be welcomed. > > Thanks in advance > Jianchao >
Issues about the merge_bvec_fn callback in 3.10 series
Hi dear all This is a question in older kernel versions. We are using 3.10 series kernel in our production. And we encountered issue as below, When add a page into a bio, .merge_bvec_fn will be invoked down to the bottom, and the bio->bi_rw would be saved into bvec_merge_data.bi_rw as the following code, __bio_add_page --- if (q->merge_bvec_fn) { struct bvm = { .bi_bdev = bio->bi_bdev, .bi_sector = bio->bi_iter.bi_sector, .bi_size = bio->bi_iter.bi_size, .bi_rw = bio->bi_rw, }; /* * merge_bvec_fn() returns number of bytes it can accept * at this offset */ if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len) { bvec->bv_page = NULL; bvec->bv_len = 0; bvec->bv_offset = 0; return 0; } } --- However, it seems that the bio->bi_rw has not been set at the moment (set by submit_bio), so it is always zero. We have a raid5 and the raid5_mergeable_bvec would always handle the write as read and then we always get a write bio with a stripe chunk size which is not expected and would degrade the performance. This is code, raid5_mergeable_bvec --- if ((bvm->bi_rw & 1) == WRITE) return biovec->bv_len; /* always allow writes to be mergeable */ if (mddev->new_chunk_sectors < mddev->chunk_sectors) chunk_sectors = mddev->new_chunk_sectors; max = (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) << 9; if (max < 0) max = 0; if (max <= biovec->bv_len && bio_sectors == 0) return biovec->bv_len; else return max; --- I have checked v3.10.108 v3.18.140 v4.1.49 but there seems not fix for it. And maybe it would be fixed until 8ae126660fddbeebb9251a174e6fa45b6ad8f932 block: kill merge_bvec_fn() completely Would anyone please give some suggestion on this ? Any comment will be welcomed. Thanks in advance Jianchao
[PATCH V2] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
When requeue, if RQF_DONTPREP, rq has contained some driver specific data, so insert it to hctx dispatch list to avoid any merge. Take scsi as example, here is the trace event log (no io scheduler, because RQF_STARTED would prevent merging), kworker/0:1H-339 [000] ...1 2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H] scsi_inert_test-1987 [000] 2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test] scsi_inert_test-1987 [000] ...2 2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test] kworker/0:1H-339 [000] 2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H] scsi_inert_test-1996 [000] ..s1 2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0] scsi_inert_test-1996 [000] .Ns1 2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0] kworker/0:1H-339 [000] ...1 2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H] kworker/0:1H-339 [000] ...1 2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H] scsi_inert_test-1986 [000] ..s1 2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0] (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP. Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP, the sdb only contained the part of (32768 + 8), then only that part was completed. The lucky thing was that scsi_io_completion detected it and requeued the remaining part. So we didn't get corrupted data. However, the requeue of (32776 + 8) is not expected. Suggested-by: Jens Axboe Signed-off-by: Jianchao Wang --- V2: - refactor the code based on Jens' suggestion block/blk-mq.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 8f5b533..9437a5e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -737,12 +737,20 @@ static void blk_mq_requeue_work(struct work_struct *work) spin_unlock_irq(&q->requeue_lock); list_for_each_entry_safe(rq, next, &rq_list, queuelist) { - if (!(rq->rq_flags & RQF_SOFTBARRIER)) + if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP))) continue; rq->rq_flags &= ~RQF_SOFTBARRIER; list_del_init(&rq->queuelist); - blk_mq_sched_insert_request(rq, true, false, false); + /* +* If RQF_DONTPREP, rq has contained some driver specific +* data, so insert it to hctx dispatch list to avoid any +* merge. +*/ + if (rq->rq_flags & RQF_DONTPREP) + blk_mq_request_bypass_insert(rq, false); + else + blk_mq_sched_insert_request(rq, true, false, false); } while (!list_empty(&rq_list)) { -- 2.7.4
[PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
When requeue, if RQF_DONTPREP, rq has contained some driver specific data, so insert it to hctx dispatch list to avoid any merge. Take scsi as example, here is the trace event log (no io scheduler, because RQF_STARTED would prevent merging), kworker/0:1H-339 [000] ...1 2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H] scsi_inert_test-1987 [000] 2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test] scsi_inert_test-1987 [000] ...2 2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test] kworker/0:1H-339 [000] 2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H] scsi_inert_test-1996 [000] ..s1 2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0] scsi_inert_test-1996 [000] .Ns1 2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0] kworker/0:1H-339 [000] ...1 2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H] kworker/0:1H-339 [000] ...1 2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H] scsi_inert_test-1986 [000] ..s1 2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0] (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP. Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP, the sdb only contained the part of (32768 + 8), then only that part was completed. The lucky thing was that scsi_io_completion detected it and requeued the remaining part. So we didn't get corrupted data. However, the requeue of (32776 + 8) is not expected. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 12 1 file changed, 12 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 8f5b533..2d93eb5 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -737,6 +737,18 @@ static void blk_mq_requeue_work(struct work_struct *work) spin_unlock_irq(&q->requeue_lock); list_for_each_entry_safe(rq, next, &rq_list, queuelist) { + /* +* If RQF_DONTPREP, rq has contained some driver specific +* data, so insert it to hctx dispatch list to avoid any +* merge. +*/ + if (rq->rq_flags & RQF_DONTPREP) { + rq->rq_flags &= ~RQF_SOFTBARRIER; + list_del_init(&rq->queuelist); + blk_mq_request_bypass_insert(rq, false); + continue; + } + if (!(rq->rq_flags & RQF_SOFTBARRIER)) continue; -- 2.7.4
[PATCH] blk-mq: fix a hung issue when fsync
Florian reported a io hung issue when fsync(). It should be triggered by following race condition. data + post flush a flush blk_flush_complete_seq case REQ_FSEQ_DATA blk_flush_queue_rq issued to driver blk_mq_dispatch_rq_list try to issue a flush req failed due to NON-NCQ command .queue_rq return BLK_STS_DEV_RESOURCE request completion req->end_io // doesn't check RESTART mq_flush_data_end_io case REQ_FSEQ_POSTFLUSH blk_kick_flush do nothing because previous flush has not been completed blk_mq_run_hw_queue insert rq to hctx->dispatch due to RESTART is still set, do nothing To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io with blk_mq_sched_restart to check and clear the RESTART flag. Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers) Reported-by: Florian Stecker Tested-by: Florian Stecker Signed-off-by: Jianchao Wang --- block/blk-flush.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index a3fc7191..6e0f2d9 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -335,7 +335,7 @@ static void mq_flush_data_end_io(struct request *rq, blk_status_t error) blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error); spin_unlock_irqrestore(&fq->mq_flush_lock, flags); - blk_mq_run_hw_queue(hctx, true); + blk_mq_sched_restart(hctx); } /** -- 2.7.4
[PATCH] blk-mq: fix the cmd_flag_name array
Swap REQ_NOWAIT and REQ_NOUNMAP and add REQ_HIPRI. Signed-off-by: Jianchao Wang --- block/blk-mq-debugfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index 90d6876..f812083 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -308,8 +308,9 @@ static const char *const cmd_flag_name[] = { CMD_FLAG_NAME(PREFLUSH), CMD_FLAG_NAME(RAHEAD), CMD_FLAG_NAME(BACKGROUND), - CMD_FLAG_NAME(NOUNMAP), CMD_FLAG_NAME(NOWAIT), + CMD_FLAG_NAME(NOUNMAP), + CMD_FLAG_NAME(HIPRI), }; #undef CMD_FLAG_NAME -- 2.7.4
[PATCH 1/2] blk-mq: save queue mapping result into ctx directly
Currelty, the queue mapping result is saved in a two-dimensional array. In hot path, to get a hctx, we need do following, q->queue_hw_ctx[q->tag_set->map[type].mq_map[cpu]] This looks not efficient. Actually, we could save the queue mapping result into ctx directly with different hctx type, like, ctx->hctxs[type] Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 2 +- block/blk-mq-tag.c | 2 +- block/blk-mq.c | 4 ++-- block/blk-mq.h | 7 --- block/blk.h | 2 +- 5 files changed, 9 insertions(+), 8 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 140933e..4090553 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -321,7 +321,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio) { struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = blk_mq_get_ctx(q); - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx); bool ret = false; enum hctx_type type; diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 2089c6c..a4931fc 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -170,7 +170,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data) data->ctx = blk_mq_get_ctx(data->q); data->hctx = blk_mq_map_queue(data->q, data->cmd_flags, - data->ctx->cpu); + data->ctx); tags = blk_mq_tags_from_data(data); if (data->flags & BLK_MQ_REQ_RESERVED) bt = &tags->breserved_tags; diff --git a/block/blk-mq.c b/block/blk-mq.c index 8f5b533..445d0a2 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -364,7 +364,7 @@ static struct request *blk_mq_get_request(struct request_queue *q, } if (likely(!data->hctx)) data->hctx = blk_mq_map_queue(q, data->cmd_flags, - data->ctx->cpu); + data->ctx); if (data->cmd_flags & REQ_NOWAIT) data->flags |= BLK_MQ_REQ_NOWAIT; @@ -2435,7 +2435,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) continue; hctx = blk_mq_map_queue_type(q, j, i); - + ctx->hctxs[j] = hctx; /* * If the CPU is already set in the mask, then we've * mapped this one already. This can happen if diff --git a/block/blk-mq.h b/block/blk-mq.h index d943d46..9fb0626 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -23,6 +23,7 @@ struct blk_mq_ctx { unsigned intcpu; unsigned short index_hw[HCTX_MAX_TYPES]; + struct blk_mq_hw_ctx*hctxs[HCTX_MAX_TYPES]; /* incremented at dispatch time */ unsigned long rq_dispatched[2]; @@ -97,11 +98,11 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue * * blk_mq_map_queue() - map (cmd_flags,type) to hardware queue * @q: request queue * @flags: request command flags - * @cpu: CPU + * @cpu: cpu ctx */ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, unsigned int flags, -unsigned int cpu) +struct blk_mq_ctx *ctx) { enum hctx_type type = HCTX_TYPE_DEFAULT; @@ -116,7 +117,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, q->tag_set->map[HCTX_TYPE_READ].nr_queues) type = HCTX_TYPE_READ; - return blk_mq_map_queue_type(q, type, cpu); + return ctx->hctxs[type]; } /* diff --git a/block/blk.h b/block/blk.h index 848278c..5d636ee 100644 --- a/block/blk.h +++ b/block/blk.h @@ -38,7 +38,7 @@ extern struct ida blk_queue_ida; static inline struct blk_flush_queue * blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx) { - return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx->cpu)->fq; + return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx)->fq; } static inline void __blk_get_queue(struct request_queue *q) -- 2.7.4
[PATCH 2/2] blk-mq: save default hctx into ctx->hctxs for not-supported type
Currently, we check whether the hctx type is supported every time in hot path. Actually, this is not necessary, we could save the default hctx into ctx->hctxs if the type is not supported when map swqueues and use it directly with ctx->hctxs[type]. We also needn't check whether the poll is enabled or not, because the caller would clear the REQ_HIPRI in that case. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 9 - block/blk-mq.h | 13 + 2 files changed, 13 insertions(+), 9 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 445d0a2..8a825ae 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2431,8 +2431,11 @@ static void blk_mq_map_swqueue(struct request_queue *q) ctx = per_cpu_ptr(q->queue_ctx, i); for (j = 0; j < set->nr_maps; j++) { - if (!set->map[j].nr_queues) + if (!set->map[j].nr_queues) { + ctx->hctxs[j] = blk_mq_map_queue_type(q, + HCTX_TYPE_DEFAULT, i); continue; + } hctx = blk_mq_map_queue_type(q, j, i); ctx->hctxs[j] = hctx; @@ -2455,6 +2458,10 @@ static void blk_mq_map_swqueue(struct request_queue *q) */ BUG_ON(!hctx->nr_ctx); } + + for (; j < HCTX_MAX_TYPES; j++) + ctx->hctxs[j] = blk_mq_map_queue_type(q, + HCTX_TYPE_DEFAULT, i); } mutex_unlock(&q->sysfs_lock); diff --git a/block/blk-mq.h b/block/blk-mq.h index 9fb0626..14b7efb9 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -106,15 +106,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, { enum hctx_type type = HCTX_TYPE_DEFAULT; - if ((flags & REQ_HIPRI) && - q->tag_set->nr_maps > HCTX_TYPE_POLL && - q->tag_set->map[HCTX_TYPE_POLL].nr_queues && - test_bit(QUEUE_FLAG_POLL, &q->queue_flags)) + /* +* The caller ensure that if REQ_HIPRI, poll must be enabled. +*/ + if (flags & REQ_HIPRI) type = HCTX_TYPE_POLL; - - else if (((flags & REQ_OP_MASK) == REQ_OP_READ) && -q->tag_set->nr_maps > HCTX_TYPE_READ && -q->tag_set->map[HCTX_TYPE_READ].nr_queues) + else if ((flags & REQ_OP_MASK) == REQ_OP_READ) type = HCTX_TYPE_READ; return ctx->hctxs[type]; -- 2.7.4
[PATCH 0/2] small optimization for accessing queue map
Hi Jens These two patches are small optimization for accessing the queue mapping in hot path. It saves the queue mapping results into blk_mq_ctx directly, then we needn't do the complicated bounce on queue_hw_ctx[] map[] and mq_map[]. Jianchao Wang (2) blk-mq: save queue mapping result into ctx directly blk-mq: save default hctx into ctx->hctxs block/blk-mq-sched.c | 2 +- block/blk-mq-tag.c | 2 +- block/blk-mq.c | 13 ++--- block/blk-mq.h | 20 +--- block/blk.h | 2 +- 5 files changed, 22 insertions(+), 17 deletions(-) Thanks Jianchao
[PATCH 1/2] blk-mq: save queue mapping result into ctx directly
Currelty, the queue mapping result is saved in a two-dimensional array. In hot path, to get a hctx, we need do following, q->queue_hw_ctx[q->tag_set->map[type].mq_map[cpu]] This looks not efficient. Actually, we could save the queue mapping result into ctx directly with different hctx type, like, ctx->hctxs[type] Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 2 +- block/blk-mq-tag.c | 2 +- block/blk-mq.c | 4 ++-- block/blk-mq.h | 5 +++-- block/blk.h | 2 +- 5 files changed, 8 insertions(+), 7 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 140933e..4090553 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -321,7 +321,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio) { struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = blk_mq_get_ctx(q); - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx); bool ret = false; enum hctx_type type; diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 2089c6c..a4931fc 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -170,7 +170,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data) data->ctx = blk_mq_get_ctx(data->q); data->hctx = blk_mq_map_queue(data->q, data->cmd_flags, - data->ctx->cpu); + data->ctx); tags = blk_mq_tags_from_data(data); if (data->flags & BLK_MQ_REQ_RESERVED) bt = &tags->breserved_tags; diff --git a/block/blk-mq.c b/block/blk-mq.c index 3ba37b9..6898d24 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -364,7 +364,7 @@ static struct request *blk_mq_get_request(struct request_queue *q, } if (likely(!data->hctx)) data->hctx = blk_mq_map_queue(q, data->cmd_flags, - data->ctx->cpu); + data->ctx); if (data->cmd_flags & REQ_NOWAIT) data->flags |= BLK_MQ_REQ_NOWAIT; @@ -2434,7 +2434,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) continue; hctx = blk_mq_map_queue_type(q, j, i); - + ctx->hctxs[j] = hctx; /* * If the CPU is already set in the mask, then we've * mapped this one already. This can happen if diff --git a/block/blk-mq.h b/block/blk-mq.h index d943d46..998f5cf 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -23,6 +23,7 @@ struct blk_mq_ctx { unsigned intcpu; unsigned short index_hw[HCTX_MAX_TYPES]; + struct blk_mq_hw_ctx *hctxs[HCTX_MAX_TYPES]; /* incremented at dispatch time */ unsigned long rq_dispatched[2]; @@ -101,7 +102,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue * */ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, unsigned int flags, -unsigned int cpu) +struct blk_mq_ctx *ctx) { enum hctx_type type = HCTX_TYPE_DEFAULT; @@ -116,7 +117,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, q->tag_set->map[HCTX_TYPE_READ].nr_queues) type = HCTX_TYPE_READ; - return blk_mq_map_queue_type(q, type, cpu); + return ctx->hctxs[type]; } /* diff --git a/block/blk.h b/block/blk.h index 848278c..5d636ee 100644 --- a/block/blk.h +++ b/block/blk.h @@ -38,7 +38,7 @@ extern struct ida blk_queue_ida; static inline struct blk_flush_queue * blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx) { - return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx->cpu)->fq; + return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx)->fq; } static inline void __blk_get_queue(struct request_queue *q) -- 2.7.4
[PATCH 2/2] blk-mq: save default hctx into ctx->hctxs for not-supported type
Currently, we check whether the hctx type is supported every time in hot path. Actually, this is not necessary, we could save the default hctx into ctx->hctxs if the type is not supported when map swqueues and use it directly with ctx->hctxs[type]. We also needn't check whether the poll is enabled or not, because the caller would clear the REQ_HIPRI in that case. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 9 - block/blk-mq.h | 15 ++- 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 6898d24..1dab467 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2430,8 +2430,11 @@ static void blk_mq_map_swqueue(struct request_queue *q) ctx = per_cpu_ptr(q->queue_ctx, i); for (j = 0; j < set->nr_maps; j++) { - if (!set->map[j].nr_queues) + if (!set->map[j].nr_queues) { + ctx->hctxs[j] = blk_mq_map_queue_type(q, + HCTX_TYPE_DEFAULT, i); continue; + } hctx = blk_mq_map_queue_type(q, j, i); ctx->hctxs[j] = hctx; @@ -2454,6 +2457,10 @@ static void blk_mq_map_swqueue(struct request_queue *q) */ BUG_ON(!hctx->nr_ctx); } + + for (; j < HCTX_MAX_TYPES; j++) + ctx->hctxs[j] = blk_mq_map_queue_type(q, + HCTX_TYPE_DEFAULT, i); } mutex_unlock(&q->sysfs_lock); diff --git a/block/blk-mq.h b/block/blk-mq.h index 998f5cf..ed1ed45 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -98,7 +98,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue * * blk_mq_map_queue() - map (cmd_flags,type) to hardware queue * @q: request queue * @flags: request command flags - * @cpu: CPU + * @ctx: mq cpu ctx */ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, unsigned int flags, @@ -106,15 +106,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, { enum hctx_type type = HCTX_TYPE_DEFAULT; - if ((flags & REQ_HIPRI) && - q->tag_set->nr_maps > HCTX_TYPE_POLL && - q->tag_set->map[HCTX_TYPE_POLL].nr_queues && - test_bit(QUEUE_FLAG_POLL, &q->queue_flags)) + /* +* The caller ensure that if REQ_HIPRI, poll must be enabled. +*/ + if (flags & REQ_HIPRI) type = HCTX_TYPE_POLL; - - else if (((flags & REQ_OP_MASK) == REQ_OP_READ) && -q->tag_set->nr_maps > HCTX_TYPE_READ && -q->tag_set->map[HCTX_TYPE_READ].nr_queues) + else if ((flags & REQ_OP_MASK) == REQ_OP_READ) type = HCTX_TYPE_READ; return ctx->hctxs[type]; -- 2.7.4
[PATCH 0/2] blk-mq: small optimization for accessing of queue map
Hi Jens These two patches are small optimization for accessing the queue mapping in hot path. It saves the queue mapping results into blk_mq_ctx directly, then we needn't do the complicated bounce on queue_hw_ctx[] map[] and mq_map[]. Jianchao Wang(2) blk-mq: save queue mapping result into ctx directly blk-mq: save default hctx into ctx->hctxs for not-supported type block/blk-mq-sched.c | 2 +- block/blk-mq-tag.c | 2 +- block/blk-mq.c | 13 ++--- block/blk-mq.h | 20 +--- block/blk.h | 2 +- 5 files changed, 22 insertions(+), 17 deletions(-) Thanks Jianchao
[PATCH V14 3/3] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 6 -- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 268d2b8..fa661ba 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1240,6 +1240,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1255,7 +1257,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq, true); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index 90361fe..2d3a29e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1792,7 +1792,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass, bool last) @@ -1864,13 +1864,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) -{ - blk_qc_t unused; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index 0c9c9ea..b63a0de 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass, bool last); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V14 0/3] blk-mq: refactor and fix the code of issue directly
Hi Jens After commit c616cbee ( blk-mq: punt failed direct issue to dispatch list ), we always insert request to hctx dispatch list whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this is overkill and will harm the merging. We just need to do that for the requests that has been through .queue_rq. This patch set fixes the issue above and refactors the code of issue request directly to unify the interface and make the code clearer and more readable. Please consider this patchset for 4.21. The 1st patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. And also it does more reasonable decision about how to insert the requests. The 2nd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 3rd patch replace and kill the blk_mq_request_issue_directly. V14: - change some comment V13: - remove the unused tag 'out' (1/3) V12: - remove the original 1st patch/ - rebase other 3 patches on newest for-4.21/block - add some comment V11: - insert request to dispatch list when .queue_rq return BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4) - stop to attempt direct-issue and insert the reset when meet non-ok case (3/4). V10: - address Jen's comment. - let blk_mq_try_issue_directly return actual result for case 'bypass == false'. (2/4) - use return value of blk_mq_try_issue_directly to identify whether the request is direct-issued successfully. (3/4) V9: - rebase on recent for-4.21/block - add 1st patch V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang (3) blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +++- block/blk-mq-sched.c | 8 +++ block/blk-mq.c | 122 --- block/blk-mq.h | 6 +++-- 4 files changed, 69 insertions(+), 71 deletions(-) Thanks Jianchao
[PATCH V14 1/3] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more and then code could be cleaned up. And also the commit c616cbee ( blk-mq: punt failed direct issue to dispatch list ) always inserts requests to hctx dispatch list whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this is overkill and will harm the merging. We just need to do that for the requests that has been through .queue_rq. This patch also could fix this. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 103 ++--- 1 file changed, 54 insertions(+), 49 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9690f4f..af4dc82 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1792,78 +1792,83 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert, bool last) + bool bypass, bool last) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + bool force = false; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie, last); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_request_bypass_insert(rq, run_queue); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); - - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_request_bypass_insert(rq, true); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); - + /* +* Always add a request that has been through +*.queue_rq() to the hardware dispatch list. +*/ + force = true; + ret = __blk_mq_issue_directly(hctx, rq, cookie, last); +out_unlock: hctx_unlock(hctx, srcu_idx); + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + /* +* We have to return BLK_STS_OK for the DM +* to avoid livelock. Otherwise, we return +* the real result to indicate whether the +* request is direct-issued successfully. +*/ + ret = bypass ? BLK_STS_OK : ret; + } else if (!bypass) { + blk_mq_sched_insert_request(rq, false, + run_queue, false); +
[PATCH V14 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Remove the blk_rq_can_direct_dispatch check, because blk_mq_try_issue_directly can handle it well.If request is direct-issued unsuccessfully, insert the reset. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 20 +--- 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f096d898..5b4d52d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index af4dc82..90361fe 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1874,22 +1874,20 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused; + blk_status_t ret = BLK_STS_OK; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq, list_empty(list)); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - blk_mq_request_bypass_insert(rq, + if (ret == BLK_STS_OK) + ret = blk_mq_try_issue_directly(hctx, rq, &unused, + false, list_empty(list)); - break; - } - blk_mq_end_request(rq, ret); - } + else + blk_mq_sched_insert_request(rq, false, true, false); } /* @@ -1897,7 +1895,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, * the driver there was more coming, but that turned out to * be a lie. */ - if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs) + if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs) hctx->queue->mq_ops->commit_rqs(hctx); } -- 2.7.4
[PATCH V13 0/3] blk-mq: refactor code of issue directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable. The 1st patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 2nd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 3rd patch replace and kill the blk_mq_request_issue_directly. V13: - remove the unused tag 'out' (1/3) V12: - remove the original 1st patch/ - rebase other 3 patches on newest for-4.21/block - add some comment V11: - insert request to dispatch list when .queue_rq return BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4) - stop to attempt direct-issue and insert the reset when meet non-ok case (3/4). V10: - address Jen's comment. - let blk_mq_try_issue_directly return actual result for case 'bypass == false'. (2/4) - use return value of blk_mq_try_issue_directly to identify whether the request is direct-issued successfully. (3/4) V9: - rebase on recent for-4.21/block - add 1st patch V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang (3) blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 123 +-- block/blk-mq.h | 6 ++- 4 files changed, 70 insertions(+), 71 deletions(-) Thanks Jianchao
[PATCH V13 3/3] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 6 -- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ad59102..c92d866 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq, true); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index 8f554d1..890d074 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1794,7 +1794,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass, bool last) @@ -1866,13 +1866,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) -{ - blk_qc_t unused; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index a664ea4..b81e619 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass, bool last); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V13 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Remove the blk_rq_can_direct_dispatch check, because blk_mq_try_issue_directly can handle it well.If request is direct-issued unsuccessfully, insert the reset. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 20 +--- 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f096d898..5b4d52d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 8dc34d2..8f554d1 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1876,22 +1876,20 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused; + blk_status_t ret = BLK_STS_OK; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq, list_empty(list)); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - blk_mq_request_bypass_insert(rq, + if (ret == BLK_STS_OK) + ret = blk_mq_try_issue_directly(hctx, rq, &unused, + false, list_empty(list)); - break; - } - blk_mq_end_request(rq, ret); - } + else + blk_mq_sched_insert_request(rq, false, true, false); } /* @@ -1899,7 +1897,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, * the driver there was more coming, but that turned out to * be a lie. */ - if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs) + if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs) hctx->queue->mq_ops->commit_rqs(hctx); } -- 2.7.4
[PATCH V13 1/3] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more and then code could be cleaned up. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 103 ++--- 1 file changed, 54 insertions(+), 49 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index b645275..8dc34d2 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1794,78 +1794,83 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert, bool last) + bool bypass, bool last) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + bool force = false; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie, last); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_request_bypass_insert(rq, run_queue); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); - - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_request_bypass_insert(rq, true); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); - + /* +* Always add a request that has been through +*.queue_rq() to the hardware dispatch list. +*/ + force = true; + ret = __blk_mq_issue_directly(hctx, rq, cookie, last); +out_unlock: hctx_unlock(hctx, srcu_idx); + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + /* +* We have to return BLK_STS_OK for the DM +* to avoid livelock. Otherwise, we return +* the real result to indicate whether the +* request is direct-issued successfully. +*/ + ret = bypass ? BLK_STS_OK : ret; + } else if (!bypass) { + blk_mq_sched_insert_request(rq, false, + run_queue, false); + } + break; + default: + if (!bypass) + blk_mq_end_request(rq, ret); + break; + } + + return ret; } blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) { - blk_status_t ret; - int srcu_idx; - blk_qc_t unused_cookie
[PATCH V12 0/3] blk-mq: refactor code of issue directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable. The 1st patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 2nd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 3rd patch replace and kill the blk_mq_request_issue_directly. V12: - remove the original 1st patch/ - rebase other 3 patches on newest for-4.21/block - add some comment V11: - insert request to dispatch list when .queue_rq return BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4) - stop to attempt direct-issue and insert the reset when meet non-ok case (3/4). V10: - address Jen's comment. - let blk_mq_try_issue_directly return actual result for case 'bypass == false'. (2/4) - use return value of blk_mq_try_issue_directly to identify whether the request is direct-issued successfully. (3/4) V9: - rebase on recent for-4.21/block - add 1st patch V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang (3) blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 123 +-- block/blk-mq.h | 6 ++- 4 files changed, 70 insertions(+), 71 deletions(-) Thanks Jianchao
[PATCH V12 1/3] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more and then code could be cleaned up. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 104 ++--- 1 file changed, 55 insertions(+), 49 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index b645275..261ff6d 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1794,78 +1794,84 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert, bool last) + bool bypass, bool last) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + bool force = false; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie, last); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_request_bypass_insert(rq, run_queue); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); - - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_request_bypass_insert(rq, true); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); - + /* +* Always add a request that has been through +*.queue_rq() to the hardware dispatch list. +*/ + force = true; + ret = __blk_mq_issue_directly(hctx, rq, cookie, last); +out_unlock: hctx_unlock(hctx, srcu_idx); +out: + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + /* +* We have to return BLK_STS_OK for the DM +* to avoid livelock. Otherwise, we return +* the real result to indicate whether the +* request is direct-issued successfully. +*/ + ret = bypass ? BLK_STS_OK : ret; + } else if (!bypass) { + blk_mq_sched_insert_request(rq, false, + run_queue, false); + } + break; + default: + if (!bypass) + blk_mq_end_request(rq, ret); + break; + } + + return ret; } blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) { - blk_status_t ret; - int srcu_idx; - blk_qc_t unused_cookie
[PATCH V12 3/3] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 6 -- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ad59102..c92d866 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq, true); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index c663102..999e811 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1794,7 +1794,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass, bool last) @@ -1867,13 +1867,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) -{ - blk_qc_t unused; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index a664ea4..b81e619 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass, bool last); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V12 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Remove the blk_rq_can_direct_dispatch check, because blk_mq_try_issue_directly can handle it well.If request is direct-issued unsuccessfully, insert the reset. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 20 +--- 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f096d898..5b4d52d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 261ff6d..c663102 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1877,22 +1877,20 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused; + blk_status_t ret = BLK_STS_OK; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq, list_empty(list)); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - blk_mq_request_bypass_insert(rq, + if (ret == BLK_STS_OK) + ret = blk_mq_try_issue_directly(hctx, rq, &unused, + false, list_empty(list)); - break; - } - blk_mq_end_request(rq, ret); - } + else + blk_mq_sched_insert_request(rq, false, true, false); } /* @@ -1900,7 +1898,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, * the driver there was more coming, but that turned out to * be a lie. */ - if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs) + if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs) hctx->queue->mq_ops->commit_rqs(hctx); } -- 2.7.4
[PATCH V11 4/4] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 6 -- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ad59102..c92d866 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq, true); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index 3265e30..32102d0 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1815,7 +1815,7 @@ static bool blk_rq_can_direct_dispatch(struct request *rq) return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass, bool last) @@ -1894,13 +1894,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) -{ - blk_qc_t unused; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index a664ea4..b81e619 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass, bool last); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V11 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Remove the blk_rq_can_direct_dispatch check, because blk_mq_try_issue_directly can handle it well.If request is direct-issued unsuccessfully, insert the reset. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 29 ++--- 2 files changed, 13 insertions(+), 24 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f096d898..5b4d52d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 88ee447..3265e30 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1904,32 +1904,23 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused; + blk_status_t ret = BLK_STS_OK; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); - if (!blk_rq_can_direct_dispatch(rq)) - break; - list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq, list_empty(list)); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + if (ret == BLK_STS_OK) + ret = blk_mq_try_issue_directly(hctx, rq, &unused, + false, + list_empty(list)); + else + blk_mq_sched_insert_request(rq, false, true, false); } - /* -* If we didn't flush the entire list, we could have told -* the driver there was more coming, but that turned out to -* be a lie. -*/ - if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs) + if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs) hctx->queue->mq_ops->commit_rqs(hctx); } -- 2.7.4
[PATCH V11 2/4] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more and then code could be cleaned up. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 120 - 1 file changed, 59 insertions(+), 61 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 01802bf..88ee447 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1815,92 +1815,90 @@ static bool blk_rq_can_direct_dispatch(struct request *rq) return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert, bool last) + bool bypass, bool last) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; bool force = false; + if (!blk_rq_can_direct_dispatch(rq)) { + /* +* Insert request to hctx dispatch list and return +* BLK_STS_OK for 'bypass == true' case, otherwise, +* the caller will fail forever. +*/ + force = bypass; + goto out; + } + + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; - - if (!blk_rq_can_direct_dispatch(rq)) { - /* -* For 'bypass_insert == true' case, insert request into hctx -* dispatch list. -*/ - force = bypass_insert; - goto insert; - } + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie, last); -insert: - if (force) { - blk_mq_request_bypass_insert(rq, run_queue); - return BLK_STS_OK; - } else if (bypass_insert) { - return BLK_STS_RESOURCE; + /* +* If the request is issued unsuccessfully with +* BLK_STS_DEV_RESOURCE or BLK_STS_RESOURCE, insert +* the request to hctx dispatch list due to attached +* lldd resource. +*/ + force = true; + ret = __blk_mq_issue_directly(hctx, rq, cookie, last); +out_unlock: + hctx_unlock(hctx, srcu_idx); +out: + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + ret = bypass ? BLK_STS_OK : ret; + } else if (!bypass) { + blk_mq_sched_insert_request(rq, false, + run_queue, false); + } + break; + default: + if (!bypass) + blk_mq_end_request(rq, ret); + break; } - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu
[PATCH V11 0/4] blk-mq: refactor code of issue directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable. This patch set is rebased on the recent for-4.21/block and add the 1st patch which inserts the non-read-write request to hctx dispatch list to avoid to involve merge and io scheduler when bypass_insert is true, otherwise, inserting is ignored, BLK_STS_RESOURCE is returned and the caller will fail forever. The 2nd patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 3rd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 4th patch replace and kill the blk_mq_request_issue_directly. V11: - insert request to dispatch list when .queue_rq return BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4) - stop to attempt direct-issue and insert the reset when meet non-ok case (3/4). V10: - address Jen's comment. - let blk_mq_try_issue_directly return actual result for case 'bypass == false'. (2/4) - use return value of blk_mq_try_issue_directly to identify whether the request is direct-issued successfully. (3/4) V9: - rebase on recent for-4.21/block - add 1st patch V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang (4) blk-mq: insert to hctx dispatch list when blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++- block/blk-mq.c | 137 +-- block/blk-mq.h | 6 ++- 4 files changed, 76 insertions(+), 79 deletions(-) Thanks Jianchao
[PATCH V11 1/4] blk-mq: insert to hctx dispatch list when bypass_insert is true
We don't allow direct dispatch of anything but regular reads/writes and insert all of non-read-write requests. However, this is not correct for 'bypass_insert == true' case where inserting is ignored and BLK_STS_RESOURCE is returned. The caller will fail forever. Fix it with inserting the non-read-write request to hctx dispatch list to avoid to involve merge and io scheduler when bypass_insert is true. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9005505..01802bf 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1822,6 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, { struct request_queue *q = rq->q; bool run_queue = true; + bool force = false; /* * RCU or SRCU read lock is needed before checking quiesced flag. @@ -1836,9 +1837,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, goto insert; } - if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert)) + if (q->elevator && !bypass_insert) goto insert; + if (!blk_rq_can_direct_dispatch(rq)) { + /* +* For 'bypass_insert == true' case, insert request into hctx +* dispatch list. +*/ + force = bypass_insert; + goto insert; + } + if (!blk_mq_get_dispatch_budget(hctx)) goto insert; @@ -1849,8 +1859,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return __blk_mq_issue_directly(hctx, rq, cookie, last); insert: - if (bypass_insert) + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + return BLK_STS_OK; + } else if (bypass_insert) { return BLK_STS_RESOURCE; + } blk_mq_sched_insert_request(rq, false, run_queue, false); return BLK_STS_OK; -- 2.7.4
[PATCH V10 4/4] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 6 -- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ad59102..c92d866 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq, true); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index dd07fe1..3ae5095 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1815,7 +1815,7 @@ static bool blk_rq_can_direct_dispatch(struct request *rq) return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass, bool last) @@ -1886,13 +1886,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) -{ - blk_qc_t unused; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index a664ea4..b81e619 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass, bool last); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V10 1/4] blk-mq: insert to hctx dispatch list when bypass_insert is true
We don't allow direct dispatch of anything but regular reads/writes and insert all of non-read-write requests. However, this is not correct for 'bypass_insert == true' case where inserting is ignored and BLK_STS_RESOURCE is returned. The caller will fail forever. Fix it with inserting the non-read-write request to hctx dispatch list to avoid to involve merge and io scheduler when bypass_insert is true. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9005505..01802bf 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1822,6 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, { struct request_queue *q = rq->q; bool run_queue = true; + bool force = false; /* * RCU or SRCU read lock is needed before checking quiesced flag. @@ -1836,9 +1837,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, goto insert; } - if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert)) + if (q->elevator && !bypass_insert) goto insert; + if (!blk_rq_can_direct_dispatch(rq)) { + /* +* For 'bypass_insert == true' case, insert request into hctx +* dispatch list. +*/ + force = bypass_insert; + goto insert; + } + if (!blk_mq_get_dispatch_budget(hctx)) goto insert; @@ -1849,8 +1859,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return __blk_mq_issue_directly(hctx, rq, cookie, last); insert: - if (bypass_insert) + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + return BLK_STS_OK; + } else if (bypass_insert) { return BLK_STS_RESOURCE; + } blk_mq_sched_insert_request(rq, false, run_queue, false); return BLK_STS_OK; -- 2.7.4
[PATCH V10 0/4] blk-mq: refactor code of issue directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable. This patch set is rebased on the recent for-4.21/block and add the 1st patch which inserts the non-read-write request to hctx dispatch list to avoid to involve merge and io scheduler when bypass_insert is true, otherwise, inserting is ignored, BLK_STS_RESOURCE is returned and the caller will fail forever. The 2nd patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 3rd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 4th patch replace and kill the blk_mq_request_issue_directly. V10: - address Jen's comment. - let blk_mq_try_issue_directly return actual result for case 'bypass == false'. (2/4) - use return value of blk_mq_try_issue_directly to identify whether the request is direct-issued successfully. (3/4) V9: - rebase on recent for-4.21/block - add 1st patch V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang (4) blk-mq: insert to hctx dispatch list when blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 127 +++ block/blk-mq.h | 6 ++- 4 files changed, 68 insertions(+), 77 deletions(-) Thanks Jianchao
[PATCH V10 2/4] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more and then code could be cleaned up. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 112 ++--- 1 file changed, 51 insertions(+), 61 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 01802bf..a1cccdd 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1815,92 +1815,82 @@ static bool blk_rq_can_direct_dispatch(struct request *rq) return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert, bool last) + bool bypass, bool last) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; bool force = false; + if (!blk_rq_can_direct_dispatch(rq)) { + /* +* Insert request to hctx dispatch list and return +* BLK_STS_OK for 'bypass == true' case, otherwise, +* the caller will fail forever. +*/ + force = bypass; + goto out; + } + + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; - - if (!blk_rq_can_direct_dispatch(rq)) { - /* -* For 'bypass_insert == true' case, insert request into hctx -* dispatch list. -*/ - force = bypass_insert; - goto insert; - } + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie, last); -insert: - if (force) { - blk_mq_request_bypass_insert(rq, run_queue); - return BLK_STS_OK; - } else if (bypass_insert) { - return BLK_STS_RESOURCE; + ret = __blk_mq_issue_directly(hctx, rq, cookie, last); +out_unlock: + hctx_unlock(hctx, srcu_idx); +out: + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + ret = BLK_STS_OK; + } else if (!bypass) { + blk_mq_sched_insert_request(rq, false, run_queue, false); + } + break; + default: + if (!bypass) + blk_mq_end_request(rq, ret); + break; } - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); - - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, fals
[PATCH V10 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Remove the blk_rq_can_direct_dispatch check, because blk_mq_try_issue_directly can handle it well. With respect to commit_rqs hook, we only need to care about the last request's result. If it is inserted, invoke commit_rqs. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 26 +- 2 files changed, 12 insertions(+), 22 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f096d898..5b4d52d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index a1cccdd..dd07fe1 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1896,32 +1896,24 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused; + blk_status_t ret = BLK_STS_OK; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); - if (!blk_rq_can_direct_dispatch(rq)) - break; - list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq, list_empty(list)); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + ret = blk_mq_try_issue_directly(hctx, rq, &unused, false, + list_empty(list)); } /* -* If we didn't flush the entire list, we could have told -* the driver there was more coming, but that turned out to -* be a lie. +* We only need to care about the last request's result, +* if it is inserted, kick the hardware with commit_rqs hook. */ - if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs) + if ((ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) && + hctx->queue->mq_ops->commit_rqs) hctx->queue->mq_ops->commit_rqs(hctx); } -- 2.7.4
[PATCH V9 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Remove the blk_rq_can_direct_dispatch check, because blk_mq_try_issue_directly can handle it well. With respect to commit_rqs hook, we only need to care about the last request's result. If it is inserted, invoke commit_rqs. We identify the actual result of blk_mq_try_issue_directly with outputed cookie. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c| 25 - include/linux/blk_types.h | 1 + 3 files changed, 12 insertions(+), 22 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f096d898..5b4d52d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index fe92e52..0dfa269 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1899,32 +1899,23 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t cookie = BLK_QC_T_INVALID; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); - if (!blk_rq_can_direct_dispatch(rq)) - break; - list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq, list_empty(list)); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + blk_mq_try_issue_directly(hctx, rq, &cookie, false, + list_empty(list)); } /* -* If we didn't flush the entire list, we could have told -* the driver there was more coming, but that turned out to -* be a lie. +* cookie is set to a valid value only when reqeust is issued successfully. +* We only need to care about the last request's result, if it is inserted, +* kick the hardware with commit_rqs hook. */ - if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs) + if ((cookie == BLK_QC_T_INVALID) && hctx->queue->mq_ops->commit_rqs) hctx->queue->mq_ops->commit_rqs(hctx); } diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index c0ba1a0..a0a467a41 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -414,6 +414,7 @@ static inline int op_stat_group(unsigned int op) } typedef unsigned int blk_qc_t; +#define BLK_QC_T_INVALID -2U #define BLK_QC_T_NONE -1U #define BLK_QC_T_SHIFT 16 #define BLK_QC_T_INTERNAL (1U << 31) -- 2.7.4
[PATCH V9 0/4] blk-mq: refactor code of issue directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable. This patch set is rebased on the recent for-4.21/block and add the 1st patch which inserts the non-read-write request to hctx dispatch list to avoid to involve merge and io scheduler when bypass_insert is true, otherwise, inserting is ignored, BLK_STS_RESOURCE is returned and the caller will fail forever. The 2nd patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 3rd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 4th patch replace and kill the blk_mq_request_issue_directly. V9: - rebase on recent for-4.21/block - add 1st patch V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang (4) blk-mq: insert to hctx dispatch list when blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++- block/blk-mq.c| 129 +- block/blk-mq.h| 6 ++- include/linux/blk_types.h | 1 + 5 files changed, 71 insertions(+), 77 deletions(-) Thanks Jianchao
[PATCH V9 4/4] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 6 -- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ad59102..c92d866 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq, true); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index 0dfa269..9d5c949 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1815,7 +1815,7 @@ static bool blk_rq_can_direct_dispatch(struct request *rq) return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass, bool last) @@ -1889,13 +1889,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last) -{ - blk_qc_t unused; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index a664ea4..b81e619 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass, bool last); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V9 2/4] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more and then code could be cleaned up. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 116 +++-- 1 file changed, 54 insertions(+), 62 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 153af90..fe92e52 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1815,93 +1815,85 @@ static bool blk_rq_can_direct_dispatch(struct request *rq) return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert, bool last) + bool bypass, bool last) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; bool force = false; + if (!blk_rq_can_direct_dispatch(rq)) { + /* +* Insert request to hctx dispatch list for 'bypass == true' +* case, otherwise, the caller will fail forever. +*/ + if (bypass) + force = true; + goto out; + } + + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; - - if (!blk_rq_can_direct_dispatch(rq)) { - /* -* For 'bypass_insert == true' case, insert request into hctx -* dispatch list. -*/ - if (bypass_insert) - force = true; - goto insert; - } + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie, last); -insert: - if (force) { - blk_mq_request_bypass_insert(rq, run_queue); - return BLK_STS_OK; - } else if (bypass_insert) { - return BLK_STS_RESOURCE; + ret = __blk_mq_issue_directly(hctx, rq, cookie, last); +out_unlock: + hctx_unlock(hctx, srcu_idx); +out: + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + ret = BLK_STS_OK; + } else if (!bypass) { + blk_mq_sched_insert_request(rq, false, run_queue, false); + ret = BLK_STS_OK; + } + break; + default: + if (!bypass) { + blk_mq_end_request(rq, ret); + ret = BLK_STS_OK; + } + break; } - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); - - ret = __blk_mq_try_issue_directly(hct
[PATCH V9 1/4] blk-mq: insert to hctx dispatch list when bypass_insert is true
We don't allow direct dispatch of anything but regular reads/writes and insert all of non-read-write requests. However, this is not correct for 'bypass_insert == true' case where inserting is ignored and BLK_STS_RESOURCE is returned. The caller will fail forever. Fix it with inserting the non-read-write request to hctx dispatch list to avoid to involve merge and io scheduler when bypass_insert is true. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9005505..153af90 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1822,6 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, { struct request_queue *q = rq->q; bool run_queue = true; + bool force = false; /* * RCU or SRCU read lock is needed before checking quiesced flag. @@ -1836,9 +1837,19 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, goto insert; } - if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert)) + if (q->elevator && !bypass_insert) goto insert; + if (!blk_rq_can_direct_dispatch(rq)) { + /* +* For 'bypass_insert == true' case, insert request into hctx +* dispatch list. +*/ + if (bypass_insert) + force = true; + goto insert; + } + if (!blk_mq_get_dispatch_budget(hctx)) goto insert; @@ -1849,8 +1860,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return __blk_mq_issue_directly(hctx, rq, cookie, last); insert: - if (bypass_insert) + if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + return BLK_STS_OK; + } else if (bypass_insert) { return BLK_STS_RESOURCE; + } blk_mq_sched_insert_request(rq, false, run_queue, false); return BLK_STS_OK; -- 2.7.4
[PATCH V8 0/3] blk-mq: refactor code of issue request directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable. The 1st patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 2nd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 3rd patch replace and kill the blk_mq_request_issue_directly. V8: - drop the original 2nd patch which try to insert requests into hctx->dispatch if quiesced or stopped. - remove two wrong 'unlikely' V7: - drop the original 3rd patch which try to ensure hctx to be ran on its mapped cpu in issueing directly path. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang(3) blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 103 +-- block/blk-mq.h | 7 ++-- 4 files changed, 53 insertions(+), 69 deletions(-) Thanks Jianchao
[PATCH V8 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 13 +++-- 2 files changed, 6 insertions(+), 15 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 66fda19..9af57c8 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 1b57449..c8566f9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1840,21 +1840,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused_cookie; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false); } } -- 2.7.4
[PATCH V8 1/3] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 89 -- 1 file changed, 43 insertions(+), 46 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 411be60..1b57449 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert) + bool bypass) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + /* +* Bypass the potential scheduler on the bottom device. +*/ + if (unlikely(q->elevator && !bypass)) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); + ret = __blk_mq_issue_directly(hctx, rq, cookie); - hctx_lock(hctx, &srcu_idx); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (!bypass) { + blk_mq_sched_insert_request(rq, false, run_queue, false); + ret = BLK_STS_OK; + } + break; + default: + if (!bypass) { + blk_mq_end_request(rq, ret); + ret = BLK_STS_OK; + } + break; + } - hctx_unlock(hctx, srcu_idx); + return ret; } blk_status_t blk_mq_request_issue_directly(struct request *rq) { - blk_status_t ret; - int srcu_idx; blk_qc_t unused_cookie; - struct blk_mq_hw_ctx *hctx = rq->mq_hctx; - hctx_lock(hctx, &srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); - hctx_unlock(hctx, srcu_idx); - - return ret; + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); } void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, @@ -1958,13 +1955,13 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) if (same_queue_rq) { da
[PATCH V8 3/3] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 7 --- 3 files changed, 8 insertions(+), 12 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index fdc0ad2..e4eedc7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused_cookie; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index c8566f9..784ca6b 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass) @@ -1830,13 +1830,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq) -{ - blk_qc_t unused_cookie; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index facb6e9..f18c27c 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); - -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V7 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 13 +++-- 2 files changed, 6 insertions(+), 15 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 66fda19..9af57c8 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 11c52bb..049fd47 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1843,21 +1843,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused_cookie; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false); } } -- 2.7.4
[PATCH V7 1/4] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 93 -- 1 file changed, 45 insertions(+), 48 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 411be60..14b4d06 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert) + bool bypass) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + /* +* Bypass the potential scheduler on the bottom device. +*/ + if (unlikely(q->elevator && !bypass)) + goto out_unlock; - if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + if (unlikely(!blk_mq_get_dispatch_budget(hctx))) + goto out_unlock; - if (!blk_mq_get_driver_tag(rq)) { + if (unlikely(!blk_mq_get_driver_tag(rq))) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); + ret = __blk_mq_issue_directly(hctx, rq, cookie); - hctx_lock(hctx, &srcu_idx); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (!bypass) { + blk_mq_sched_insert_request(rq, false, run_queue, false); + ret = BLK_STS_OK; + } + break; + default: + if (!bypass) { + blk_mq_end_request(rq, ret); + ret = BLK_STS_OK; + } + break; + } - hctx_unlock(hctx, srcu_idx); + return ret; } blk_status_t blk_mq_request_issue_directly(struct request *rq) { - blk_status_t ret; - int srcu_idx; blk_qc_t unused_cookie; - struct blk_mq_hw_ctx *hctx = rq->mq_hctx; - hctx_lock(hctx, &srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); - hctx_unlock(hctx, srcu_idx); - - return ret; + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); } void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, @@ -1958,13 +1955,13 @@ static blk_qc_t blk
[PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced
When try to issue request directly, if the queue is stopped or quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller to avoid it dispatch request again. Then the request will be inserted with blk_mq_sched_insert_request. This is not correct for dm-rq case where we should avoid to pass through the underlying path's io scheduler. To fix it, use blk_mq_request_bypass_insert to insert the request to hctx->dispatch when we cannot pass through io scheduler but have to insert. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 14b4d06..11c52bb 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1772,7 +1772,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, bool bypass) { struct request_queue *q = rq->q; - bool run_queue = true; + bool run_queue = true, force = false; blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; @@ -1786,7 +1786,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass = false; + force = true; goto out_unlock; } @@ -1817,6 +1817,9 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, if (!bypass) { blk_mq_sched_insert_request(rq, false, run_queue, false); ret = BLK_STS_OK; + } else if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + ret = BLK_STS_OK; } break; default: -- 2.7.4
[PATCH V7 4/4] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 7 --- 3 files changed, 8 insertions(+), 12 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index fdc0ad2..e4eedc7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused_cookie; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index 049fd47..3fcf2f9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass) @@ -1833,13 +1833,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq) -{ - blk_qc_t unused_cookie; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index facb6e9..f18c27c 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); - -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V7 0/4] blk-mq: refactor and fix on issue request directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable, and also fixes a defects there. The 1st patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 2nd patch fix the issue that when queue is stopped or quiesced request may pass through bottom device's potential io scheduler. The 3rd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 4th patch replace and kill the blk_mq_request_issue_directly. V7: - drop the original 3rd patch which try to ensure hctx to be ran on mapped cpu. As it add get/put_cpu and cpumask test in hot path and it is not necessary for drivers to do the guarantee. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang(4) blk-mq: refactor the code of issue request directly blk-mq: fix issue directly case when q is stopped or quiesced blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 112 ++- block/blk-mq.h | 7 ++-- 4 files changed, 59 insertions(+), 72 deletions(-) Thanks Jianchao
[PATCH V6 1/5] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 93 -- 1 file changed, 45 insertions(+), 48 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 411be60..14b4d06 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert) + bool bypass) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + /* +* Bypass the potential scheduler on the bottom device. +*/ + if (unlikely(q->elevator && !bypass)) + goto out_unlock; - if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + if (unlikely(!blk_mq_get_dispatch_budget(hctx))) + goto out_unlock; - if (!blk_mq_get_driver_tag(rq)) { + if (unlikely(!blk_mq_get_driver_tag(rq))) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); + ret = __blk_mq_issue_directly(hctx, rq, cookie); - hctx_lock(hctx, &srcu_idx); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); + switch (ret) { + case BLK_STS_OK: + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + if (!bypass) { + blk_mq_sched_insert_request(rq, false, run_queue, false); + ret = BLK_STS_OK; + } + break; + default: + if (!bypass) { + blk_mq_end_request(rq, ret); + ret = BLK_STS_OK; + } + break; + } - hctx_unlock(hctx, srcu_idx); + return ret; } blk_status_t blk_mq_request_issue_directly(struct request *rq) { - blk_status_t ret; - int srcu_idx; blk_qc_t unused_cookie; - struct blk_mq_hw_ctx *hctx = rq->mq_hctx; - hctx_lock(hctx, &srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); - hctx_unlock(hctx, srcu_idx); - - return ret; + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); } void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, @@ -1958,13 +1955,13 @@ static blk_qc_t blk
[PATCH V6 4/5] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and handle the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 13 +++-- 2 files changed, 6 insertions(+), 15 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 66fda19..9af57c8 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 58f15cc..f41a815 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1855,21 +1855,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { + blk_qc_t unused_cookie; + while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false); } } -- 2.7.4
[PATCH V6 3/5] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, - insert the request forcibly if BLK_MQ_F_BLOCKING is set. - check whether the current is mapped to the hctx, if not, insert forcibly. - invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 11c52bb..58f15cc 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1776,6 +1776,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; + if (hctx->flags & BLK_MQ_F_BLOCKING) { + force = true; + goto out; + } + + if (unlikely(!cpumask_test_cpu(get_cpu(), hctx->cpumask))) { + put_cpu(); + force = true; + goto out; + } + hctx_lock(hctx, &srcu_idx); /* * hctx_lock is needed before checking quiesced flag. @@ -1808,7 +1819,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - + put_cpu(); +out: switch (ret) { case BLK_STS_OK: break; -- 2.7.4
[PATCH V6 2/5] blk-mq: fix issue directly case when q is stopped or quiesced
When try to issue request directly, if the queue is stopped or quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller to avoid it dispatch request again. Then the request will be inserted with blk_mq_sched_insert_request. This is not correct for dm-rq case where we should avoid to pass through the underlying path's io scheduler. To fix it, use blk_mq_request_bypass_insert to insert the request to hctx->dispatch when we cannot pass through io scheduler but have to insert. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 14b4d06..11c52bb 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1772,7 +1772,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, bool bypass) { struct request_queue *q = rq->q; - bool run_queue = true; + bool run_queue = true, force = false; blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; @@ -1786,7 +1786,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false; - bypass = false; + force = true; goto out_unlock; } @@ -1817,6 +1817,9 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, if (!bypass) { blk_mq_sched_insert_request(rq, false, run_queue, false); ret = BLK_STS_OK; + } else if (force) { + blk_mq_request_bypass_insert(rq, run_queue); + ret = BLK_STS_OK; } break; default: -- 2.7.4
[PATCH V6 5/5] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and kill it as nobody uses it any more. Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 +++- block/blk-mq.c | 9 + block/blk-mq.h | 7 --- 3 files changed, 8 insertions(+), 12 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index fdc0ad2..e4eedc7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q, */ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq) { + blk_qc_t unused_cookie; + if (blk_cloned_rq_check_limits(q, rq)) return BLK_STS_IOERR; @@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index f41a815..b5316c78 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass) @@ -1845,13 +1845,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -blk_status_t blk_mq_request_issue_directly(struct request *rq) -{ - blk_qc_t unused_cookie; - - return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index facb6e9..f18c27c 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); - -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V6 0/5] blk-mq: refactor and fix on issue request directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable, and also fixes two defects there. The 1st patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 2nd patch fix the issue that when queue is stopped or quiesced request may pass through bottom device's potential io scheduler. The 3rd patch fix the issue that hctx maybe ran on a cpu where it is not mapped in issue directly path. The 4th patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 5th patch replace and kill the blk_mq_request_issue_directly. V6: - drop original 1st patch to address Jen's comment - discard the enum mq_issue_decision and blk_mq_make_decision and use BLK_STS_* return values directly to address Jen's comment. (1/5) - add 'unlikely' in blk_mq_try_issue_directly (1/5) - refactor the 2nd and 3rd patch based on the new 1st patch. - reserve the unused_cookie in 4th and 5th patch V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang(6) blk-mq: refactor the code of issue request directly blk-mq: fix issue directly case when q is stopped or quiesced blk-mq: ensure hctx to be ran on mapped cpu when issue directly blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 4 +- block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 126 +-- block/blk-mq.h | 7 +-- 4 files changed, 72 insertions(+), 73 deletions(-) Thanks Jianchao
[PATCH V5 3/6] blk-mq: fix issue directly case when q is stopped or quiesced
When try to issue request directly, if the queue is stopped or quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller to avoid it dispatch request again. Then the request will be inserted with blk_mq_sched_insert_request. This is not correct for dm-rq case where we should avoid to pass through the underlying path's io scheduler. To fix it, add new mq_issue_decision entry MQ_ISSUE_INSERT_DISPATCH for above case where the request need to be inserted forcibly. And use blk_mq_request_bypass_insert to insert the request into hctx->dispatch directly. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 48b7a7c..f54c092 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1768,12 +1768,13 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, enum mq_issue_decision { MQ_ISSUE_INSERT_QUEUE, + MQ_ISSUE_INSERT_DISPATCH, MQ_ISSUE_END_REQUEST, MQ_ISSUE_DO_NOTHING, }; static inline enum mq_issue_decision - blk_mq_make_dicision(blk_status_t ret, bool bypass) + blk_mq_make_dicision(blk_status_t ret, bool bypass, bool force) { enum mq_issue_decision dec; @@ -1783,7 +1784,10 @@ static inline enum mq_issue_decision break; case BLK_STS_DEV_RESOURCE: case BLK_STS_RESOURCE: - dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE; + if (force) + dec = bypass ? MQ_ISSUE_INSERT_DISPATCH : MQ_ISSUE_INSERT_QUEUE; + else + dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE; break; default: dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST; @@ -1799,7 +1803,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, bool bypass) { struct request_queue *q = rq->q; - bool run_queue = true; + bool run_queue = true, force = false; blk_status_t ret = BLK_STS_RESOURCE; enum mq_issue_decision dec; int srcu_idx; @@ -1814,7 +1818,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; - bypass = false; + force = true; goto out_unlock; } @@ -1837,11 +1841,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - dec = blk_mq_make_dicision(ret, bypass); + dec = blk_mq_make_dicision(ret, bypass, force); switch(dec) { case MQ_ISSUE_INSERT_QUEUE: blk_mq_sched_insert_request(rq, false, run_queue, false); break; + case MQ_ISSUE_INSERT_DISPATCH: + blk_mq_request_bypass_insert(rq, run_queue); + break; case MQ_ISSUE_END_REQUEST: blk_mq_end_request(rq, ret); break; -- 2.7.4
[PATCH V5 2/6] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface to unify the interfaces to issue requests directly. The merged interface takes over the requests totally, it could insert, end or do nothing based on the return value of .queue_rq and 'bypass' parameter. Then caller needn't any other handling any more. To make code clearer, introduce new helpers enum mq_issue_decision and blk_mq_make_decision to decide how to handle the non-issued requests. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 108 + 1 file changed, 63 insertions(+), 45 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 364a53f..48b7a7c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1766,77 +1766,95 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +enum mq_issue_decision { + MQ_ISSUE_INSERT_QUEUE, + MQ_ISSUE_END_REQUEST, + MQ_ISSUE_DO_NOTHING, +}; + +static inline enum mq_issue_decision + blk_mq_make_dicision(blk_status_t ret, bool bypass) +{ + enum mq_issue_decision dec; + + switch(ret) { + case BLK_STS_OK: + dec = MQ_ISSUE_DO_NOTHING; + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE; + break; + default: + dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST; + break; + } + + return dec; +} + +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert) + bool bypass) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + enum mq_issue_decision dec; + int srcu_idx; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + /* +* Bypass the potential scheduler on the bottom device. +*/ + if (q->elevator && !bypass) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; + ret = __blk_mq_issue_directly(hctx, rq, cookie); - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) + dec = blk_mq_make_dicision(ret, bypass); + switch(dec) { + case MQ_ISSUE_INSERT_QUEUE: + blk_mq_sched_insert_request(rq, false, run_queue, false); + break; + case MQ_ISSUE_END_REQUEST: blk_mq_end_request(rq, ret); + break; + default: + return ret; + } - hctx_unlock(hctx, srcu_idx); + return BLK_STS_OK; } blk_status_t blk_mq_request_issue_directly(struct request *rq) { - blk_status_t ret; - int srcu_idx; - st
[PATCH V5 6/6] blk-mq: replace and kill blk_mq_request_issue_directly
Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly in blk_insert_cloned_request and remove blk_mq_request_issue_directly as nobody uses it. Signed-off-by: Jianchao Wang --- block/blk-core.c | 2 +- block/blk-mq.c | 7 +-- block/blk-mq.h | 6 -- 3 files changed, 6 insertions(+), 9 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index fdc0ad2..5cae9ef 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1436,7 +1436,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - return blk_mq_request_issue_directly(rq); + return blk_mq_try_issue_directly(rq->mq_hctx, rq, NULL, true); } EXPORT_SYMBOL_GPL(blk_insert_cloned_request); diff --git a/block/blk-mq.c b/block/blk-mq.c index f27d547..f016028 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1797,7 +1797,7 @@ static inline enum mq_issue_decision return dec; } -static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass) @@ -1871,11 +1871,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return BLK_STS_OK; } -blk_status_t blk_mq_request_issue_directly(struct request *rq) -{ - return blk_mq_try_issue_directly(rq->mq_hctx, rq, NULL, true); -} - void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { diff --git a/block/blk-mq.h b/block/blk-mq.h index facb6e9..deee3a4 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -62,8 +62,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); -/* Used by blk_insert_cloned_request() to issue request directly */ -blk_status_t blk_mq_request_issue_directly(struct request *rq); +blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie, + bool bypass); void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list); -- 2.7.4
[PATCH V5 1/6] blk-mq: make __blk_mq_issue_directly be able to accept NULL cookie pointer
Make __blk_mq_issue_directly be able to accept a NULL cookie pointer and remove the dummy unused_cookie in blk_mq_request_issue_directly. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 411be60..364a53f 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1736,11 +1736,9 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, .rq = rq, .last = true, }; - blk_qc_t new_cookie; + blk_qc_t new_cookie = BLK_QC_T_NONE; blk_status_t ret; - new_cookie = request_to_qc_t(hctx, rq); - /* * For OK queue, we are done. For error, caller may kill it. * Any other error (busy), just add it to our list as we @@ -1750,7 +1748,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, switch (ret) { case BLK_STS_OK: blk_mq_update_dispatch_busy(hctx, false); - *cookie = new_cookie; + new_cookie = request_to_qc_t(hctx, rq); break; case BLK_STS_RESOURCE: case BLK_STS_DEV_RESOURCE: @@ -1759,10 +1757,12 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, break; default: blk_mq_update_dispatch_busy(hctx, false); - *cookie = BLK_QC_T_NONE; break; } + if (cookie) + *cookie = new_cookie; + return ret; } @@ -1830,11 +1830,10 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) { blk_status_t ret; int srcu_idx; - blk_qc_t unused_cookie; struct blk_mq_hw_ctx *hctx = rq->mq_hctx; hctx_lock(hctx, &srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); + ret = __blk_mq_try_issue_directly(hctx, rq, NULL, true); hctx_unlock(hctx, srcu_idx); return ret; -- 2.7.4
[PATCH V5 4/6] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, - insert the request forcibly if BLK_MQ_F_BLOCKING is set. - check whether the current is mapped to the hctx, if not, insert forcibly. - invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index f54c092..7915f44 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1808,6 +1808,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, enum mq_issue_decision dec; int srcu_idx; + if (hctx->flags & BLK_MQ_F_BLOCKING) { + force = true; + goto out; + } + + if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) { + put_cpu(); + force = true; + goto out; + } + hctx_lock(hctx, &srcu_idx); /* * hctx_lock is needed before checking quiesced flag. @@ -1840,7 +1851,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - + put_cpu(); +out: dec = blk_mq_make_dicision(ret, bypass, force); switch(dec) { case MQ_ISSUE_INSERT_QUEUE: -- 2.7.4
[PATCH V5 5/6] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and insert the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 11 +-- 2 files changed, 4 insertions(+), 15 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 66fda19..9af57c8 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index 7915f44..f27d547 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1880,20 +1880,11 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + blk_mq_try_issue_directly(hctx, rq, NULL, false); } } -- 2.7.4
[PATCH V5 0/6] blk-mq: refactor and fix on issue request directly
Hi Jens Please consider this patchset for 4.21. It refactors the code of issue request directly to unify the interface and make the code clearer and more readable, and also fixes two defects there. The 1st patch make __blk_mq_issue_directly be able to accept NULL cookie pointer then we needn't the dummy unused_cookie. The 2nd patch refactors the code of issue request directly to unify the helper interface which could handle all the cases. The 3rd patch fix the issue that when queue is stopped or quiesced request may pass through bottom device's potential io scheduler. The 4th patch fix the issue that hctx maybe ran on a cpu where it is not mapped in issue directly path. The 5th patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 6th patch replace and kill the one line blk_mq_request_issue_directly. V5: - rebase against Jens' for-4.21/block branch - adjust the order of patch04 and patch05 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert - comment changes V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch to refactor the code. Jianchao Wang(6) blk-mq: make __blk_mq_issue_directly be able to accept NULL cookie pointer blk-mq: refactor the code of issue request directly blk-mq: fix issue directly case when q is stopped or quiesced blk-mq: ensure hctx to be ran on mapped cpu when issue directly blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests blk-mq: replace and kill blk_mq_request_issue_directly block/blk-core.c | 2 +- block/blk-mq-sched.c | 8 +-- block/blk-mq.c | 156 +-- block/blk-mq.h | 6 +- 4 files changed, 97 insertions(+), 75 deletions(-) Thanks Jianchao
[PATCH V4 4/5] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and insert the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them totally. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 11 +-- 2 files changed, 4 insertions(+), 15 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 29bfe80..23cd97e 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index a0b9b6c..bf8b144 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1832,20 +1832,11 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, struct list_head *list) { while (!list_empty(list)) { - blk_status_t ret; struct request *rq = list_first_entry(list, struct request, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + blk_mq_try_issue_directly(hctx, rq, NULL, false); } } -- 2.7.4
[PATCH V4 2/5] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface which is able to handle the return value from .queue_rq callback. To make the code clearer, introduce new helpers enum mq_issue_decision and blk_mq_make_decision to decide how to handle the non-issued requests. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 104 + 1 file changed, 61 insertions(+), 43 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index af5b591..962fdfc 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1729,78 +1729,96 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +enum mq_issue_decision { + MQ_ISSUE_INSERT_QUEUE, + MQ_ISSUE_END_REQUEST, + MQ_ISSUE_DO_NOTHING, +}; + +static inline enum mq_issue_decision + blk_mq_make_dicision(blk_status_t ret, bool bypass) +{ + enum mq_issue_decision dec; + + switch(ret) { + case BLK_STS_OK: + dec = MQ_ISSUE_DO_NOTHING; + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE; + break; + default: + dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST; + break; + } + + return dec; +} + +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert) + bool bypass) { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret = BLK_STS_RESOURCE; + enum mq_issue_decision dec; + int srcu_idx; + + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; - bypass_insert = false; - goto insert; + bypass = false; + goto out_unlock; } - if (q->elevator && !bypass_insert) - goto insert; + if (q->elevator && !bypass) + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; + ret = __blk_mq_issue_directly(hctx, rq, cookie); - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) + dec = blk_mq_make_dicision(ret, bypass); + switch(dec) { + case MQ_ISSUE_INSERT_QUEUE: + blk_mq_sched_insert_request(rq, false, run_queue, false); + break; + case MQ_ISSUE_END_REQUEST: blk_mq_end_request(rq, ret); + break; + default: + return ret; + } - hctx_unlock(hctx, srcu_idx); + return BLK_STS_OK; } blk_status_t blk_mq_request_issue_directly(struct request *rq) { - blk_status_t ret; - int srcu_idx; struct blk_mq_ctx *ctx = rq->mq_ctx; struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); - hctx_lock(hctx, &srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, NULL, true); - hctx_unlock(hctx, srcu_idx); - - return ret; +
[PATCH V4 1/5] blk-mq: make __blk_mq_issue_directly be able to accept NULL cookie pointer
Make __blk_mq_issue_directly be able to accept a NULL cookie pointer and remove the dummy unused_cookie in blk_mq_request_issue_directly. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index dcf10e3..af5b591 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1700,8 +1700,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, blk_qc_t new_cookie; blk_status_t ret; - new_cookie = request_to_qc_t(hctx, rq); - /* * For OK queue, we are done. For error, caller may kill it. * Any other error (busy), just add it to our list as we @@ -1711,19 +1709,23 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, switch (ret) { case BLK_STS_OK: blk_mq_update_dispatch_busy(hctx, false); - *cookie = new_cookie; + new_cookie = request_to_qc_t(hctx, rq); break; case BLK_STS_RESOURCE: case BLK_STS_DEV_RESOURCE: blk_mq_update_dispatch_busy(hctx, true); __blk_mq_requeue_request(rq); + new_cookie = BLK_QC_T_NONE; break; default: blk_mq_update_dispatch_busy(hctx, false); - *cookie = BLK_QC_T_NONE; + new_cookie = BLK_QC_T_NONE; break; } + if (cookie) + *cookie = new_cookie; + return ret; } @@ -1791,12 +1793,11 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) { blk_status_t ret; int srcu_idx; - blk_qc_t unused_cookie; struct blk_mq_ctx *ctx = rq->mq_ctx; struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); hctx_lock(hctx, &srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); + ret = __blk_mq_try_issue_directly(hctx, rq, NULL, true); hctx_unlock(hctx, srcu_idx); return ret; -- 2.7.4
[PATCH V4 3/5] blk-mq: fix issue directly case when q is stopped or quiesced
When try to issue request directly, if the queue is stopped or quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller to avoid it dispatch request again. Then the request will be inserted with blk_mq_sched_insert_request. This is not correct for dm-rq case where we should avoid to pass through the underlying path's io scheduler. To fix it, add new mq_issue_decision entry MQ_ISSUE_INSERT_DISPATCH for above case where the request need to be inserted forcibly. And use blk_mq_request_bypass_insert to insert the request into hctx->dispatch directly. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 962fdfc..a0b9b6c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1731,12 +1731,13 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, enum mq_issue_decision { MQ_ISSUE_INSERT_QUEUE, + MQ_ISSUE_INSERT_DISPATCH, MQ_ISSUE_END_REQUEST, MQ_ISSUE_DO_NOTHING, }; static inline enum mq_issue_decision - blk_mq_make_dicision(blk_status_t ret, bool bypass) + blk_mq_make_dicision(blk_status_t ret, bool bypass, bool force) { enum mq_issue_decision dec; @@ -1746,7 +1747,10 @@ static inline enum mq_issue_decision break; case BLK_STS_DEV_RESOURCE: case BLK_STS_RESOURCE: - dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE; + if (force) + dec = bypass ? MQ_ISSUE_INSERT_DISPATCH : MQ_ISSUE_INSERT_QUEUE; + else + dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE; break; default: dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST; @@ -1762,7 +1766,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, bool bypass) { struct request_queue *q = rq->q; - bool run_queue = true; + bool run_queue = true, force = false; blk_status_t ret = BLK_STS_RESOURCE; enum mq_issue_decision dec; int srcu_idx; @@ -1778,7 +1782,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; - bypass = false; + force = true; goto out_unlock; } @@ -1798,11 +1802,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - dec = blk_mq_make_dicision(ret, bypass); + dec = blk_mq_make_dicision(ret, bypass, force); switch(dec) { case MQ_ISSUE_INSERT_QUEUE: blk_mq_sched_insert_request(rq, false, run_queue, false); break; + case MQ_ISSUE_INSERT_DISPATCH: + blk_mq_request_bypass_insert(rq, run_queue); + break; case MQ_ISSUE_END_REQUEST: blk_mq_end_request(rq, ret); break; -- 2.7.4
[PATCH V4 0/5] blk-mq: refactor and fix on issue request directly
Hi Jens These patch set refactors the code of issueing request driectly and fix some defects. The 1st patch make __blk_mq_issue_directly be able to accept NULL cookie pointer. The 2nd patch refactors the code of issue request directly. It merges the blk_mq_try_issue_directly and __blk_mq_try_issue_directly and make it handle the return value of .queue_rq itself. The 3rd patch let the requests be inserted into hctx->dispatch when the queue is stopped or quiesced if bypass is true. The 4th patch make blk_mq_sched_insert_requests issue requests directly with 'bypass' false, then it needn't to handle the non-issued requests any more. The 5th patch ensures the hctx to be ran on its mapped cpu in issue directly path. V4: - split the original patch 1 into two patches, 1st and 2nd patch currently - rename the mq_decision to mq_issue_decision - comment changes V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch Jianchao Wang (5) blk-mq: make __blk_mq_issue_directly be able to accept blk-mq: refactor the code of issue request directly blk-mq: fix issue directly case when q is stopped or blk-mq: issue directly with bypass 'false' in blk-mq: ensure hctx to be ran on mapped cpu when issue block/blk-mq-sched.c | 8 ++- block/blk-mq.c | 149 ++- 2 files changed, 92 insertions(+), 65 deletions(-) Thanks Jianchao
[PATCH V4 5/5] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, insert the request if BLK_MQ_F_BLOCKING is set, check whether the current is mapped to the hctx and invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index bf8b144..4450eb6 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1771,6 +1771,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, enum mq_issue_decision dec; int srcu_idx; + if (hctx->flags & BLK_MQ_F_BLOCKING) { + force = true; + goto out; + } + + if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) { + put_cpu(); + force = true; + goto out; + } + hctx_lock(hctx, &srcu_idx); /* @@ -1801,7 +1812,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - + put_cpu(); +out: dec = blk_mq_make_dicision(ret, bypass, force); switch(dec) { case MQ_ISSUE_INSERT_QUEUE: -- 2.7.4
[PATCH RESENT V3 0/4] blk-mq: refactor and fix on issue request directly
Hi Jens The 1st patch refactors the code of issue request directly. It merges the blk_mq_try_issue_directly and __blk_mq_try_issue_directly and make it handle the return value of .queue_rq itself. The 2nd patch let the requests be inserted into hctx->dispatch when the queue is stopped or quiesced if bypass_insert is true. The 3rd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass_insert' false, then it needn't to handle the non-issued requests any more. The 4th patch ensures the hctx to be ran on its mapped cpu in issue directly path. V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. blk_mq_make_request is introduced to decide insert, end or just return based on the return value of .queue_rq and bypass_insert (1/4) - Add the 2nd patch. It introduce a new decision result which indicates to insert request with blk_mq_request_bypass_insert. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch Jianchao Wang(4) blk-mq: refactor the code of issue request directly blk-mq: insert request without involving any io blk-mq: issue directly with bypass_insert 'false' in blk-mq: ensure hctx to be ran on mapped cpu when issue block/blk-mq-sched.c | 8 ++- block/blk-mq.c | 139 ++- 2 files changed, 86 insertions(+), 61 deletions(-) Thanks Jianchao
[PATCH V3 1/4] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface which is able to handle the return value from .queue_rq callback. To make the code clearer, introduce new helpers blk_mq_make_decision and enum mq_decision. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 104 + 1 file changed, 60 insertions(+), 44 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index dcf10e3..0f6328b 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, switch (ret) { case BLK_STS_OK: blk_mq_update_dispatch_busy(hctx, false); - *cookie = new_cookie; break; case BLK_STS_RESOURCE: case BLK_STS_DEV_RESOURCE: @@ -1720,86 +1719,103 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, break; default: blk_mq_update_dispatch_busy(hctx, false); - *cookie = BLK_QC_T_NONE; + new_cookie = BLK_QC_T_NONE; break; } + if (cookie) + *cookie = new_cookie; return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +enum mq_decision { + MQ_INSERT_QUEUE, + MQ_END_REQUEST, + MQ_DO_NOTHING, +}; + +static inline enum mq_decision + blk_mq_make_dicision(blk_status_t ret, bool bypass_insert) +{ + enum mq_decision dec; + + switch(ret) { + case BLK_STS_OK: + dec = MQ_DO_NOTHING; + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE; + break; + default: + dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST; + break; + } + + return dec; +} + +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass_insert) { struct request_queue *q = rq->q; bool run_queue = true; + enum mq_decision dec; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass_insert', +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; bypass_insert = false; - goto insert; + goto out_unlock; } if (q->elevator && !bypass_insert) - goto insert; + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); + ret = __blk_mq_issue_directly(hctx, rq, cookie); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) + dec = blk_mq_make_dicision(ret, bypass_insert); + switch(dec) { + case MQ_INSERT_QUEUE: + blk_mq_sched_insert_request(rq, false, run_queue, false); + break; + case MQ_END_REQUEST: blk_mq_end_request(rq, ret); + break; + default: + return ret; + } - hctx_unlock(hctx, srcu_idx); + return BLK_STS_OK; } blk_status_t blk_mq_request_issue_directly(struct request *rq)
[PATCH V3 4/4] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, insert the request if BLK_MQ_F_BLOCKING is set, check whether the current is mapped to the hctx and invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9c6c858..ced3346 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1770,6 +1770,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; + if (hctx->flags & BLK_MQ_F_BLOCKING) { + force = true; + goto out; + } + + if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) { + put_cpu(); + force = true; + goto out; + } + hctx_lock(hctx, &srcu_idx); /* * hctx_lock is needed before checking quiesced flag. @@ -1798,7 +1809,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, ret = __blk_mq_issue_directly(hctx, rq, cookie); out_unlock: hctx_unlock(hctx, srcu_idx); - + put_cpu(); +out: dec = blk_mq_make_dicision(ret, bypass_insert, force); switch(dec) { case MQ_INSERT_DISPATCH: -- 2.7.4
[PATCH V3 3/4] blk-mq: issue directly with bypass_insert 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass_insert 'true' in blk_mq_sched_insert_requests and insert the non-issued requests itself. Just set bypass_insert to 'false' and let blk_mq_try_issue_directly handle them. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 10 +- 2 files changed, 4 insertions(+), 14 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 29bfe80..23cd97e 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index e0aa068..9c6c858 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1834,15 +1834,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + ret = blk_mq_try_issue_directly(hctx, rq, NULL, false); } } -- 2.7.4
[PATCH V3 2/4] blk-mq: insert request without involving any io scheduler
When try to issue request directly, if the queue is stopped or quiesced, 'bypass_insert' will be ignored and return BLK_STS_OK to caller to avoid it dispatch request again. Then the request will be inserted with blk_mq_sched_insert_request. This is not correct for dm-rq case where we should avoid to pass through the underlying paths' io scheduler. To fix it, use blk_mq_request_bypass_insert to insert the request into hctx->dispatch directly. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 0f6328b..e0aa068 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1729,13 +1729,14 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, } enum mq_decision { + MQ_INSERT_DISPATCH, MQ_INSERT_QUEUE, MQ_END_REQUEST, MQ_DO_NOTHING, }; static inline enum mq_decision - blk_mq_make_dicision(blk_status_t ret, bool bypass_insert) + blk_mq_make_dicision(blk_status_t ret, bool bypass_insert, bool force) { enum mq_decision dec; @@ -1745,7 +1746,10 @@ static inline enum mq_decision break; case BLK_STS_DEV_RESOURCE: case BLK_STS_RESOURCE: - dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE; + if (force) + dec = bypass_insert ? MQ_INSERT_DISPATCH : MQ_INSERT_QUEUE; + else + dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE; break; default: dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST; @@ -1761,7 +1765,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, bool bypass_insert) { struct request_queue *q = rq->q; - bool run_queue = true; + bool run_queue = true, force = false; enum mq_decision dec; blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; @@ -1776,7 +1780,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; - bypass_insert = false; + force = true; goto out_unlock; } @@ -1795,8 +1799,11 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - dec = blk_mq_make_dicision(ret, bypass_insert); + dec = blk_mq_make_dicision(ret, bypass_insert, force); switch(dec) { + case MQ_INSERT_DISPATCH: + blk_mq_request_bypass_insert(rq, run_queue); + break; case MQ_INSERT_QUEUE: blk_mq_sched_insert_request(rq, false, run_queue, false); break; -- 2.7.4
[PATCH V3 3/4] blk-mq: issue directly with bypass_insert 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass_insert 'true' in blk_mq_sched_insert_requests and insert the non-issued requests itself. Just set bypass_insert to 'false' and let blk_mq_try_issue_directly handle them. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 10 +- 2 files changed, 4 insertions(+), 14 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 29bfe80..23cd97e 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index e0aa068..9c6c858 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1834,15 +1834,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + ret = blk_mq_try_issue_directly(hctx, rq, NULL, false); } } -- 2.7.4
[PATCH V3 2/4] blk-mq: insert request without involving any io scheduler
When try to issue request directly, if the queue is stopped or quiesced, 'bypass_insert' will be ignored and return BLK_STS_OK to caller to avoid it dispatch request again. Then the request will be inserted with blk_mq_sched_insert_request. This is not correct for dm-rq case where we should avoid to pass through the underlying paths' io scheduler. To fix it, use blk_mq_request_bypass_insert to insert the request into hctx->dispatch directly. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 0f6328b..e0aa068 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1729,13 +1729,14 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, } enum mq_decision { + MQ_INSERT_DISPATCH, MQ_INSERT_QUEUE, MQ_END_REQUEST, MQ_DO_NOTHING, }; static inline enum mq_decision - blk_mq_make_dicision(blk_status_t ret, bool bypass_insert) + blk_mq_make_dicision(blk_status_t ret, bool bypass_insert, bool force) { enum mq_decision dec; @@ -1745,7 +1746,10 @@ static inline enum mq_decision break; case BLK_STS_DEV_RESOURCE: case BLK_STS_RESOURCE: - dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE; + if (force) + dec = bypass_insert ? MQ_INSERT_DISPATCH : MQ_INSERT_QUEUE; + else + dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE; break; default: dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST; @@ -1761,7 +1765,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, bool bypass_insert) { struct request_queue *q = rq->q; - bool run_queue = true; + bool run_queue = true, force = false; enum mq_decision dec; blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; @@ -1776,7 +1780,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; - bypass_insert = false; + force = true; goto out_unlock; } @@ -1795,8 +1799,11 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); - dec = blk_mq_make_dicision(ret, bypass_insert); + dec = blk_mq_make_dicision(ret, bypass_insert, force); switch(dec) { + case MQ_INSERT_DISPATCH: + blk_mq_request_bypass_insert(rq, run_queue); + break; case MQ_INSERT_QUEUE: blk_mq_sched_insert_request(rq, false, run_queue, false); break; -- 2.7.4
[PATCH V3 4/4] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, insert the request if BLK_MQ_F_BLOCKING is set, check whether the current is mapped to the hctx and invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9c6c858..ced3346 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1770,6 +1770,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, blk_status_t ret = BLK_STS_RESOURCE; int srcu_idx; + if (hctx->flags & BLK_MQ_F_BLOCKING) { + force = true; + goto out; + } + + if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) { + put_cpu(); + force = true; + goto out; + } + hctx_lock(hctx, &srcu_idx); /* * hctx_lock is needed before checking quiesced flag. @@ -1798,7 +1809,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, ret = __blk_mq_issue_directly(hctx, rq, cookie); out_unlock: hctx_unlock(hctx, srcu_idx); - + put_cpu(); +out: dec = blk_mq_make_dicision(ret, bypass_insert, force); switch(dec) { case MQ_INSERT_DISPATCH: -- 2.7.4
[PATCH V3 1/4] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface which is able to handle the return value from .queue_rq callback. To make the code clearer, introduce new helpers blk_mq_make_decision and enum mq_decision. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 104 ++--- block/blk-mq.c.rej | 60 +++ 2 files changed, 120 insertions(+), 44 deletions(-) create mode 100644 block/blk-mq.c.rej diff --git a/block/blk-mq.c b/block/blk-mq.c index dcf10e3..0f6328b 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, switch (ret) { case BLK_STS_OK: blk_mq_update_dispatch_busy(hctx, false); - *cookie = new_cookie; break; case BLK_STS_RESOURCE: case BLK_STS_DEV_RESOURCE: @@ -1720,86 +1719,103 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, break; default: blk_mq_update_dispatch_busy(hctx, false); - *cookie = BLK_QC_T_NONE; + new_cookie = BLK_QC_T_NONE; break; } + if (cookie) + *cookie = new_cookie; return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +enum mq_decision { + MQ_INSERT_QUEUE, + MQ_END_REQUEST, + MQ_DO_NOTHING, +}; + +static inline enum mq_decision + blk_mq_make_dicision(blk_status_t ret, bool bypass_insert) +{ + enum mq_decision dec; + + switch(ret) { + case BLK_STS_OK: + dec = MQ_DO_NOTHING; + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE; + break; + default: + dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST; + break; + } + + return dec; +} + +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, bool bypass_insert) { struct request_queue *q = rq->q; bool run_queue = true; + enum mq_decision dec; + blk_status_t ret = BLK_STS_RESOURCE; + int srcu_idx; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass_insert', +* and return BLK_STS_OK to caller, and avoid driver to try to +* dispatch again. */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { run_queue = false; bypass_insert = false; - goto insert; + goto out_unlock; } if (q->elevator && !bypass_insert) - goto insert; + goto out_unlock; if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; + goto out_unlock; if (!blk_mq_get_driver_tag(rq)) { blk_mq_put_dispatch_budget(hctx); - goto insert; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); + ret = __blk_mq_issue_directly(hctx, rq, cookie); +out_unlock: + hctx_unlock(hctx, srcu_idx); - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) + dec = blk_mq_make_dicision(ret, bypass_insert); + switch(dec) { + case MQ_INSERT_QUEUE: + blk_mq_sched_insert_request(rq, false, run_queue, false); + break; + case MQ_END_REQUEST: blk_mq_end_request(rq, ret); + break; + default: + return ret; + } - hctx_unlock(hctx, srcu_id
[PATCH V3 0/4] blk-mq: refactor and fix on issue request directly
Hi Jens The 1st patch refactors the code of issue request directly. It merges the blk_mq_try_issue_directly and __blk_mq_try_issue_directly and make it handle the return value of .queue_rq itself. The 2nd patch let the requests be inserted into hctx->dispatch when the queue is stopped or quiesced if bypass_insert is true. The 3rd patch make blk_mq_sched_insert_requests issue requests directly with 'bypass_insert' false, then it needn't to handle the non-issued requests any more. The 4th patch ensures the hctx to be ran on its mapped cpu in issue directly path. V3: - Correct the code about the case bypass_insert is true and io scheduler attached. The request still need to be issued in case above. (1/4) - Refactor the code to make code clearer. (1/4) - Add the 2nd patch. - Modify the code to adapt the new patch 1. V2: - Add 1st and 2nd patch Jianchao Wang(4) blk-mq: refactor the code of issue request directly blk-mq: insert request without involving any io blk-mq: issue directly with bypass_insert 'false' in blk-mq: ensure hctx to be ran on mapped cpu when issue block/blk-mq-sched.c | 8 ++- block/blk-mq.c | 139 ++- block/blk-mq.c.rej | 60 ++ 3 files changed, 146 insertions(+), 61 deletions(-)
[PATCH V5] block: fix the DISCARD request merge
There are two cases when handle DISCARD merge. If max_discard_segments == 1, the bios/requests need to be contiguous to merge. If max_discard_segments > 1, it takes every bio as a range and different range needn't to be contiguous. But now, attempt_merge screws this up. It always consider contiguity for DISCARD for the case max_discard_segments > 1 and cannot merge contiguous DISCARD for the case max_discard_segments == 1, because rq_attempt_discard_merge always returns false in this case. This patch fixes both of the two cases above. Signed-off-by: Jianchao Wang --- V5: - get rid of the redundant 'else' in blk_discard_mergable V4: - introduce blk_try_req_merge as suggestion of Christoph. V3: - Introduce blk_discard_mergable into attempt_merge and blk_try_merge. - Some comment changes. V2: - Add max_discard_segments > 1 checking in attempt_merge. - Change patch title and comment. - Add more comment in attempt_merge block/blk-merge.c | 46 -- 1 file changed, 36 insertions(+), 10 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index 42a4674..6b5ad27 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -714,6 +714,31 @@ static void blk_account_io_merge(struct request *req) part_stat_unlock(); } } +/* + * Two cases of handling DISCARD merge: + * If max_discard_segments > 1, the driver takes every bio + * as a range and send them to controller together. The ranges + * needn't to be contiguous. + * Otherwise, the bios/requests will be handled as same as + * others which should be contiguous. + */ +static inline bool blk_discard_mergable(struct request *req) +{ + if (req_op(req) == REQ_OP_DISCARD && + queue_max_discard_segments(req->q) > 1) + return true; + return false; +} + +enum elv_merge blk_try_req_merge(struct request *req, struct request *next) +{ + if (blk_discard_mergable(req)) + return ELEVATOR_DISCARD_MERGE; + else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next)) + return ELEVATOR_BACK_MERGE; + + return ELEVATOR_NO_MERGE; +} /* * For non-mq, this has to be called with the request spinlock acquired. @@ -731,12 +756,6 @@ static struct request *attempt_merge(struct request_queue *q, if (req_op(req) != req_op(next)) return NULL; - /* -* not contiguous -*/ - if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) - return NULL; - if (rq_data_dir(req) != rq_data_dir(next) || req->rq_disk != next->rq_disk || req_no_special_merge(next)) @@ -760,11 +779,19 @@ static struct request *attempt_merge(struct request_queue *q, * counts here. Handle DISCARDs separately, as they * have separate settings. */ - if (req_op(req) == REQ_OP_DISCARD) { + + switch (blk_try_req_merge(req, next)) { + case ELEVATOR_DISCARD_MERGE: if (!req_attempt_discard_merge(q, req, next)) return NULL; - } else if (!ll_merge_requests_fn(q, req, next)) + break; + case ELEVATOR_BACK_MERGE: + if (!ll_merge_requests_fn(q, req, next)) + return NULL; + break; + default: return NULL; + } /* * If failfast settings disagree or any of the two is already @@ -888,8 +915,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio) enum elv_merge blk_try_merge(struct request *rq, struct bio *bio) { - if (req_op(rq) == REQ_OP_DISCARD && - queue_max_discard_segments(rq->q) > 1) + if (blk_discard_mergable(rq)) return ELEVATOR_DISCARD_MERGE; else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector) return ELEVATOR_BACK_MERGE; -- 2.7.4
[PATCH V4] block: fix the DISCARD request merge
There are two cases when handle DISCARD merge. If max_discard_segments == 1, the bios/requests need to be contiguous to merge. If max_discard_segments > 1, it takes every bio as a range and different range needn't to be contiguous. But now, attempt_merge screws this up. It always consider contiguity for DISCARD for the case max_discard_segments > 1 and cannot merge contiguous DISCARD for the case max_discard_segments == 1, because rq_attempt_discard_merge always returns false in this case. This patch fixes both of the two cases above. Signed-off-by: Jianchao Wang --- V4: - introduce blk_try_req_merge as suggestion of Christoph. V3: - Introduce blk_discard_mergable into attempt_merge and blk_try_merge. - Some comment changes. V2: - Add max_discard_segments > 1 checking in attempt_merge. - Change patch title and comment. - Add more comment in attempt_merge block/blk-merge.c | 47 +-- 1 file changed, 37 insertions(+), 10 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index 42a4674..cf817c7a 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -714,6 +714,32 @@ static void blk_account_io_merge(struct request *req) part_stat_unlock(); } } +/* + * Two cases of handling DISCARD merge: + * If max_discard_segments > 1, the driver takes every bio + * as a range and send them to controller together. The ranges + * needn't to be contiguous. + * Otherwise, the bios/requests will be handled as same as + * others which should be contiguous. + */ +static inline bool blk_discard_mergable(struct request *req) +{ + if (req_op(req) == REQ_OP_DISCARD && + queue_max_discard_segments(req->q) > 1) + return true; + else + return false; +} + +enum elv_merge blk_try_req_merge(struct request *req, struct request *next) +{ + if (blk_discard_mergable(req)) + return ELEVATOR_DISCARD_MERGE; + else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next)) + return ELEVATOR_BACK_MERGE; + + return ELEVATOR_NO_MERGE; +} /* * For non-mq, this has to be called with the request spinlock acquired. @@ -731,12 +757,6 @@ static struct request *attempt_merge(struct request_queue *q, if (req_op(req) != req_op(next)) return NULL; - /* -* not contiguous -*/ - if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) - return NULL; - if (rq_data_dir(req) != rq_data_dir(next) || req->rq_disk != next->rq_disk || req_no_special_merge(next)) @@ -760,11 +780,19 @@ static struct request *attempt_merge(struct request_queue *q, * counts here. Handle DISCARDs separately, as they * have separate settings. */ - if (req_op(req) == REQ_OP_DISCARD) { + + switch (blk_try_req_merge(req, next)) { + case ELEVATOR_DISCARD_MERGE: if (!req_attempt_discard_merge(q, req, next)) return NULL; - } else if (!ll_merge_requests_fn(q, req, next)) + break; + case ELEVATOR_BACK_MERGE: + if (!ll_merge_requests_fn(q, req, next)) + return NULL; + break; + default: return NULL; + } /* * If failfast settings disagree or any of the two is already @@ -888,8 +916,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio) enum elv_merge blk_try_merge(struct request *rq, struct bio *bio) { - if (req_op(rq) == REQ_OP_DISCARD && - queue_max_discard_segments(rq->q) > 1) + if (blk_discard_mergable(rq)) return ELEVATOR_DISCARD_MERGE; else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector) return ELEVATOR_BACK_MERGE; -- 2.7.4
[PATCH V2 1/3] blk-mq: refactor the code of issue request directly
Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly into one interface which is able to handle the return value from .queue_rq callback. Due to we can only issue directly w/o io scheduler, so remove the blk_mq_get_driver_tag. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 109 ++--- 1 file changed, 50 insertions(+), 59 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index dcf10e3..a81d2ca 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1700,8 +1700,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, blk_qc_t new_cookie; blk_status_t ret; - new_cookie = request_to_qc_t(hctx, rq); - /* * For OK queue, we are done. For error, caller may kill it. * Any other error (busy), just add it to our list as we @@ -1711,7 +1709,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, switch (ret) { case BLK_STS_OK: blk_mq_update_dispatch_busy(hctx, false); - *cookie = new_cookie; + new_cookie = request_to_qc_t(hctx, rq); break; case BLK_STS_RESOURCE: case BLK_STS_DEV_RESOURCE: @@ -1720,86 +1718,79 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, break; default: blk_mq_update_dispatch_busy(hctx, false); - *cookie = BLK_QC_T_NONE; + new_cookie = BLK_QC_T_NONE; break; } + if (cookie) + *cookie = new_cookie; return ret; } -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, +/* + * When the bypass is true, the caller is responsible for handling the + * request if it is not issued. The only exception is that io scheduler + * is set. + */ +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_qc_t *cookie, - bool bypass_insert) + bool bypass) { struct request_queue *q = rq->q; - bool run_queue = true; + blk_status_t ret = BLK_STS_OK; + bool insert = true; + int srcu_idx; + + if (q->elevator) + goto out; + hctx_lock(hctx, &srcu_idx); /* -* RCU or SRCU read lock is needed before checking quiesced flag. +* hctx_lock is needed before checking quiesced flag. * -* When queue is stopped or quiesced, ignore 'bypass_insert' from -* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller, -* and avoid driver to try to dispatch again. +* When queue is stopped or quiesced, ignore 'bypass', insert and return +* BLK_STS_OK to caller, and avoid driver to try to dispatch again. */ - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { - run_queue = false; - bypass_insert = false; - goto insert; - } - - if (q->elevator && !bypass_insert) - goto insert; + if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) + goto out_unlock; - if (!blk_mq_get_dispatch_budget(hctx)) - goto insert; - - if (!blk_mq_get_driver_tag(rq)) { - blk_mq_put_dispatch_budget(hctx); - goto insert; + if (!blk_mq_get_dispatch_budget(hctx)) { + insert = !bypass; + ret = bypass ? BLK_STS_RESOURCE : BLK_STS_OK; + goto out_unlock; } - return __blk_mq_issue_directly(hctx, rq, cookie); -insert: - if (bypass_insert) - return BLK_STS_RESOURCE; - - blk_mq_sched_insert_request(rq, false, run_queue, false); - return BLK_STS_OK; -} - -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) -{ - blk_status_t ret; - int srcu_idx; - - might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); - - hctx_lock(hctx, &srcu_idx); - - ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false); - if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_sched_insert_request(rq, false, true, false); - else if (ret != BLK_STS_OK) - blk_mq_end_request(rq, ret); + ret = __blk_mq_issue_directly(hctx, rq, cookie); + switch(ret) { + case BLK_STS_OK: + insert = false; + break; + case BLK_STS_DEV_RESOURCE: + case BLK_STS_RESOURCE: + insert = !bypass; + break; + default: + if (!bypass) + blk_mq_end_request(rq, ret); +
[PATCH V2 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
It is not necessary to issue request directly with bypass 'true' in blk_mq_sched_insert_requests and insert the non-issued requests itself. Just set bypass to 'false' and let blk_mq_try_issue_directly handle them. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 +++- block/blk-mq.c | 10 +- 2 files changed, 4 insertions(+), 14 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 29bfe80..23cd97e 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q, * busy in case of 'none' scheduler, and this way may save * us one extra enqueue & dequeue to sw queue. */ - if (!hctx->dispatch_busy && !e && !run_queue_async) { + if (!hctx->dispatch_busy && !e && !run_queue_async) blk_mq_try_issue_list_directly(hctx, list); - if (list_empty(list)) - return; - } - blk_mq_insert_requests(hctx, ctx, list); + else + blk_mq_insert_requests(hctx, ctx, list); } blk_mq_run_hw_queue(hctx, run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index a81d2ca..71b829c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1802,15 +1802,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, queuelist); list_del_init(&rq->queuelist); - ret = blk_mq_request_issue_directly(rq); - if (ret != BLK_STS_OK) { - if (ret == BLK_STS_RESOURCE || - ret == BLK_STS_DEV_RESOURCE) { - list_add(&rq->queuelist, list); - break; - } - blk_mq_end_request(rq, ret); - } + ret = blk_mq_try_issue_directly(hctx, rq, NULL, false); } } -- 2.7.4
[PATCH V2 3/3] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, insert the request if BLK_MQ_F_BLOCKING is set, check whether the current is mapped to the hctx and invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 9 + 1 file changed, 9 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 71b829c..4f1dedb 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1745,6 +1745,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, if (q->elevator) goto out; + if (hctx->flags & BLK_MQ_F_BLOCKING) + goto out; + + if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) { + put_cpu(); + goto out; + } + hctx_lock(hctx, &srcu_idx); /* * hctx_lock is needed before checking quiesced flag. @@ -1779,6 +1787,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, out_unlock: hctx_unlock(hctx, srcu_idx); + put_cpu(); out: if (insert) blk_mq_sched_insert_request(rq, false, true, false); -- 2.7.4
[PATCH V2 0/3] blk-mq: refactor and fix on issue request directly
Hi Jens The 1st patch refactors the code of issue request directly. It merges the blk_mq_try_issue_directly and __blk_mq_try_issue_directly and make blk_mq_try_issue_directly handle return value of .queue_rq itself. The 2nd patch make blk_mq_sched_insert_requests issue request directly with bypass "false" instead of the "true", then needn't to handle the non-issued requests any more. The 3rd patch ensures the hctx to be ran on its mapped cpu in issue directly path. V2: - Add 1st and 2nd patch. Jianchao Wang(3) blk-mq: refactor the code of issue request directly blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests blk-mq: ensure hctx to be ran on mapped cpu when issue block/blk-mq-sched.c | 8 ++-- block/blk-mq.c | 128 --- 2 files changed, 63 insertions(+), 73 deletions(-)
[PATCH] blk-mq: ensure hctx to be ran on mapped cpu when issue directly
When issue request directly and the task is migrated out of the original cpu where it allocates request, hctx could be ran on the cpu where it is not mapped. To fix this, insert the request if BLK_MQ_F_BLOCKING is set, check whether the current is mapped to the hctx and invoke __blk_mq_issue_directly under preemption disabled. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index e3c39ea..0cdc306 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1717,6 +1717,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, { struct request_queue *q = rq->q; bool run_queue = true; + blk_status_t ret; + + if (hctx->flags & BLK_MQ_F_BLOCKING) { + bypass_insert = false; + goto insert; + } /* * RCU or SRCU read lock is needed before checking quiesced flag. @@ -1734,6 +1740,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, if (q->elevator && !bypass_insert) goto insert; + if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) { + bypass_insert = false; + goto insert; + } + if (!blk_mq_get_dispatch_budget(hctx)) goto insert; @@ -1742,8 +1753,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, goto insert; } - return __blk_mq_issue_directly(hctx, rq, cookie); + ret = __blk_mq_issue_directly(hctx, rq, cookie); + put_cpu(); + return ret; + insert: + put_cpu(); if (bypass_insert) return BLK_STS_RESOURCE; -- 2.7.4
[PATCH V3] block: fix the DISCARD request merge
There are two cases when handle DISCARD merge. If max_discard_segments == 1, the bios/requests need to be contiguous to merge. If max_discard_segments > 1, it takes every bio as a range and different range needn't to be contiguous. But now, attempt_merge screws this up. It always consider contiguity for DISCARD for the case max_discard_segments > 1 and cannot merge contiguous DISCARD for the case max_discard_segments == 1, because rq_attempt_discard_merge always returns false in this case. This patch fixes both of the two cases above. Signed-off-by: Jianchao Wang --- V3: - Introduce blk_discard_mergable into attempt_merge and blk_try_merge. - Some comment changes. V2: - Add max_discard_segments > 1 checking in attempt_merge. - Change patch title and comment. - Add more comment in attempt_merge block/blk-merge.c | 34 -- 1 file changed, 24 insertions(+), 10 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index 42a4674..b258de0 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -714,6 +714,22 @@ static void blk_account_io_merge(struct request *req) part_stat_unlock(); } } +/* + * Two cases of handling DISCARD merge: + * If max_discard_segments > 1, the driver takes every bio + * as a range and send them to controller together. The ranges + * needn't to be contiguous. + * Otherwise, the bios/requests will be handled as same as + * others which should be contiguous. + */ +static inline bool blk_discard_mergable(struct request *req) +{ + if (req_op(req) == REQ_OP_DISCARD && + queue_max_discard_segments(req->q) > 1) + return true; + else + return false; +} /* * For non-mq, this has to be called with the request spinlock acquired. @@ -731,12 +747,6 @@ static struct request *attempt_merge(struct request_queue *q, if (req_op(req) != req_op(next)) return NULL; - /* -* not contiguous -*/ - if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) - return NULL; - if (rq_data_dir(req) != rq_data_dir(next) || req->rq_disk != next->rq_disk || req_no_special_merge(next)) @@ -760,11 +770,16 @@ static struct request *attempt_merge(struct request_queue *q, * counts here. Handle DISCARDs separately, as they * have separate settings. */ - if (req_op(req) == REQ_OP_DISCARD) { + + if (blk_discard_mergable(req)) { if (!req_attempt_discard_merge(q, req, next)) return NULL; - } else if (!ll_merge_requests_fn(q, req, next)) + } else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next)) { + if (!ll_merge_requests_fn(q, req, next)) + return NULL; + } else { return NULL; + } /* * If failfast settings disagree or any of the two is already @@ -888,8 +903,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio) enum elv_merge blk_try_merge(struct request *rq, struct bio *bio) { - if (req_op(rq) == REQ_OP_DISCARD && - queue_max_discard_segments(rq->q) > 1) + if (blk_discard_mergable(rq)) return ELEVATOR_DISCARD_MERGE; else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector) return ELEVATOR_BACK_MERGE; -- 2.7.4
[PATCH V2] block: fix the DISCARD request merge
There are two cases when handle DISCARD merge - max_discard_segments == 1 bios need to be contiguous - max_discard_segments > 1 Only nvme right now. It takes every bio as a range and different range needn't to be contiguous. But now, attempt_merge screws this up. It always consider contiguity for DISCARD for the case max_discard_segments > 1 and cannot merge contiguous DISCARD for the case max_discard_segments == 1, because rq_attempt_discard_merge always returns false in this case. This patch fixes both of the two cases above. Signed-off-by: Jianchao Wang --- V2: - Add max_discard_segments > 1 checking in attempt_merge - Change patch title and comment - Add more comment in attempt_merge block/blk-merge.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index 42a4674..8f22374 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -734,8 +734,15 @@ static struct request *attempt_merge(struct request_queue *q, /* * not contiguous */ - if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) - return NULL; + if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) { + /* +* When max_discard_segments is bigger than 1 (only nvme right +* now), needn't consider the contiguity. +*/ + if (!(req_op(req) == REQ_OP_DISCARD && + queue_max_discard_segments(q) > 1)) + return NULL; + } if (rq_data_dir(req) != rq_data_dir(next) || req->rq_disk != next->rq_disk @@ -757,10 +764,17 @@ static struct request *attempt_merge(struct request_queue *q, * If we are allowed to merge, then append bio list * from next to rq and release next. merge_requests_fn * will have updated segment counts, update sector -* counts here. Handle DISCARDs separately, as they -* have separate settings. +* counts here. +* Two cases of Handling DISCARD: +* - max_discard_segments == 1 +*The bios need to be contiguous. +* - max_discard_segments > 1 +*Only nvme right now. It takes every bio as a +*range and send them to controller together. The ranges +*needn't to be contiguous. */ - if (req_op(req) == REQ_OP_DISCARD) { + if (req_op(req) == REQ_OP_DISCARD && + queue_max_discard_segments(q) > 1) { if (!req_attempt_discard_merge(q, req, next)) return NULL; } else if (!ll_merge_requests_fn(q, req, next)) -- 2.7.4
[PATCH] block: don't check position contiguity for DISCARD in attempt_merge
Discard command supports multiple ranges of blocks, so needn't checking position contiguity when merging. Let's do the same thing in attempt_merge as the blk_try_merge. Signed-off-by: Jianchao Wang --- block/blk-merge.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index 42a4674..c94749b 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -732,9 +732,10 @@ static struct request *attempt_merge(struct request_queue *q, return NULL; /* -* not contiguous +* not contiguous, except for DISCARD */ - if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) + if ((req_op(req) != REQ_OP_DISCARD) && + (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))) return NULL; if (rq_data_dir(req) != rq_data_dir(next) -- 2.7.4
[PATCH V2 3/3] nvme-pci: unquiesce queues after update hw queues
updating hw queues uses bio retrieve to drain the queues. unquiescing queues before that is not needed and will cause requests to be issued to dead hw queue. So move unquiescing queues after updating hw queues, as well as the wait freeze. Signed-off-by: Jianchao Wang --- drivers/nvme/host/pci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index d668682..dbf6904 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2327,11 +2327,11 @@ static void nvme_reset_work(struct work_struct *work) nvme_remove_namespaces(&dev->ctrl); new_state = NVME_CTRL_ADMIN_ONLY; } else { - nvme_start_queues(&dev->ctrl); - nvme_wait_freeze(&dev->ctrl); /* hit this only when allocate tagset fails */ if (nvme_dev_add(dev)) new_state = NVME_CTRL_ADMIN_ONLY; + nvme_start_queues(&dev->ctrl); + nvme_wait_freeze(&dev->ctrl); nvme_unfreeze(&dev->ctrl); } -- 2.7.4
[PATCH V2 1/3] blk-mq: introduce bio retrieve mechanism
Currently request requeue mechanism cannot work well with updating nr_hw_queues. Because the requests are highly bound with specific hw queue, requests on the dying hw queue have to be failed. And this could be fatal for filesystem. In addition, the request_queue need to be frozen and drained before updating nr_hw_queues, if IO timeout, we have to depend on the LLDD to do recovery. But the recovery path maybe sleeping to wait the the request_queue to be drained. IO hang comes up. To avoid the two case above, we introduce bio retrieve mechanism. The bio retrieving will do following things: - flush requests on hctx->dispatch, sw queue or io scheduler queue - take the bios down from the requests and end the requests - requeue this bios and submit them through generic_make_request again later. Then we could avoid to fail requests on dying hw queue and depend on storage device to drain request_queue. Signed-off-by: Jianchao Wang --- block/blk-core.c | 2 ++ block/blk-mq-sched.c | 88 ++ block/blk-mq.c | 42 include/linux/blk-mq.h | 4 +++ include/linux/blkdev.h | 2 ++ 5 files changed, 138 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index cdfabc5..f3c6fa8 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -807,6 +807,8 @@ void blk_cleanup_queue(struct request_queue *q) /* @q won't process any more request, flush async actions */ del_timer_sync(&q->backing_dev_info->laptop_mode_wb_timer); + if (q->mq_ops) + cancel_delayed_work_sync(&q->bio_requeue_work); blk_sync_queue(q); /* diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 29bfe80..9d0b2a2 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -422,6 +422,94 @@ void blk_mq_sched_insert_requests(struct request_queue *q, blk_mq_run_hw_queue(hctx, run_queue_async); } +static void blk_mq_sched_retrieve_one_req(struct request *rq, + struct bio_list *list) +{ + struct bio *bio; + + blk_steal_bios(list, rq); + blk_mq_end_request(rq, BLK_STS_OK); + + bio_list_for_each(bio, list) { + /* +* bio with BIO_QUEUE_ENTERED will enter queue with +* blk_queue_enter_live. +*/ + bio_clear_flag(bio, BIO_QUEUE_ENTERED); + } +} + +static void __blk_mq_sched_retrieve_bios(struct blk_mq_hw_ctx *hctx) +{ + struct request_queue *q = hctx->queue; + struct bio_list bio_list; + LIST_HEAD(rq_list); + struct request *rq; + + bio_list_init(&bio_list); + + if (!list_empty_careful(&hctx->dispatch)) { + spin_lock(&hctx->lock); + if (!list_empty(&hctx->dispatch)) + list_splice_tail_init(&hctx->dispatch, &rq_list); + spin_unlock(&hctx->lock); + } + + if (!q->elevator) + blk_mq_flush_busy_ctxs(hctx, &rq_list); + + while (!list_empty(&rq_list)) { + rq = list_first_entry(&rq_list, struct request, queuelist); + list_del_init(&rq->queuelist); + blk_mq_sched_retrieve_one_req(rq, &bio_list); + } + + if (q->elevator) { + struct elevator_queue *e = hctx->queue->elevator; + + while (e->type->ops.mq.has_work && + e->type->ops.mq.has_work(hctx)) { + rq = e->type->ops.mq.dispatch_request(hctx); + if (!rq) + continue; + + blk_mq_sched_retrieve_one_req(rq, &bio_list); + } + } + /* For the request with RQF_FLUSH_SEQ, blk_mq_end_request cannot end them +* but just push the flush sm. So there could still be rqs in flush queue, +* the caller will check q_usage_counter and come back again. +*/ + blk_mq_requeue_bios(q, &bio_list, false); +} + +/* + * When blk_mq_sched_retrieve_bios returns: + * - All the rqs are ended, q_usage_counter is zero + * - All the bios are queued to q->requeue_bios + */ +void blk_mq_sched_retrieve_bios(struct request_queue *q) +{ + struct blk_mq_hw_ctx *hctx; + int i; + + BUG_ON(!atomic_read(&q->mq_freeze_depth) || + !blk_queue_quiesced(q)); + + /* +* Kick the requeue_work to flush the reqs in requeue_list +*/ + blk_mq_kick_requeue_list(q); + + while (!percpu_ref_is_zero(&q->q_usage_counter)) { + queue_for_each_hw_ctx(q, hctx, i) + __blk_mq_sched_retrieve_bios(hctx); + } + + blk_mq_requeue_bios(q, NULL, true); +} +EXPORT_SYMBOL_GPL(blk_mq_sched_retrieve_bios); + static void blk_mq_sched_fre
[PATCH V2 2/3] blk-mq: retrieve bios before update nr_hw_queues
retrieve bios of all requests on queue to drain requests, then needn't depend on storage device to drain the queue any more. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index f75598b..6d89d3e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3085,7 +3085,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, return; list_for_each_entry(q, &set->tag_list, tag_set_list) - blk_mq_freeze_queue(q); + blk_freeze_queue_start(q); + + list_for_each_entry(q, &set->tag_list, tag_set_list) + blk_mq_sched_retrieve_bios(q); /* * Sync with blk_mq_queue_tag_busy_iter. */ -- 2.7.4
[PATCH V2 0/3] blk-mq: introduce bio retrieve mechanism
Hi Jenss This patchset introduces a new bio retrieve mechanism. It will flush all the requests, take the bios down from them and end them, then submit the bios again later. Then we could avoid: - fail requests on the dying hw queues, this could be fatal for filesystem. - depend on storage device to drain the queue before update hw queues. this could introduce hung into nvme_reset_work. The 1st patch introduce the bio retrieve mechanism The 2nd patch apply the bio retrieve mechanism in updating hw queues The 3rd patch unquiesces the queues after updating hw queues in nvme_reset_work V2: - clear BIO_QUEUE_ENTERED of requeued bios (1/3) - sync bio_requeue_work in blk_cleanup_queue (1/3) - discard the unnecessary synchronize_sched (2/3) - discard the 4th patch which is wrong - some misc comment changes Jianchao Wang(3) blk-mq: introduce bio retrieve mechanism blk-mq: retrieve bios before update nr_hw_queues nvme-pci: unquiesce queues after update hw queues block/blk-core.c| 2 ++ block/blk-mq-sched.c| 88 block/blk-mq.c | 47 ++- drivers/nvme/host/pci.c | 4 ++-- include/linux/blk-mq.h | 4 include/linux/blkdev.h | 2 ++ 6 files changed, 144 insertions(+), 3 deletions(-) Thanks Jianchao
[PATCH V2 0/2] block: kyber: make kyber more friendly with merging
Hi Jens This is the second version patchset to make the kyber io scheduler more firendly with merging. Most of time, kyber io scheduler will not leave any requests in ctx rq_list, this is because even if tokens of one domain is used up, kyber will try to dispatch requests from other domain and flush the rq_list there. So there is not any change to do merge. To improve this, setup kyber_ctx_queue (kcq) which is similar with blk_mq_ctx but has rq_list for every domain, and build mapping between kcq and khd as ctx and hctx. Then we could merge, insert and dispatch on rq_list of different domains separately. At the same time, only flush the rq_list of kcq after get domain token successfully, then the requests could be left in the rq_list of that domain and maybe merged with following io. In my local test on NVMe card, the sequential io performance could be improved a lot, especially on high workload. More details please refer to the second patch. Changes V1 to V2: - Add patch from Jens which abstract out the blk-mq-sched rq list iteration bio merge helper interface and rebase the patch based on it. - merge bio_sched_domain and rq_sched_domain to avoid the duplicated code. - allocate kcqs per khd Jens Axboe (1) 0001-blk-mq-abstract-out-blk-mq-sched-rq-list-iteration-b.patch Jianchao Wang (1) 0002-block-kyber-make-kyber-more-friendly-with-merging.patch block/blk-mq-sched.c | 34 ++--- block/kyber-iosched.c | 197 - include/linux/blk-mq.h | 3 +- 3 files changed, 188 insertions(+), 46 deletions(-) Thanks Jianchao
[PATCH V2 2/2] block: kyber: make kyber more friendly with merging
Currently, kyber is very unfriendly with merging. kyber depends on ctx rq_list to do merging, however, most of time, it will not leave any requests in ctx rq_list. This is because even if tokens of one domain is used up, kyber will try to dispatch requests from other domain and flush the rq_list there. To improve this, we setup kyber_ctx_queue (kcq) which is similar with ctx, but it has rq_lists for different domain and build same mapping between kcq and khd as the ctx & hctx. Then we could merge, insert and dispatch for different domains separately. At the same time, only flush the rq_list of kcq when get domain token successfully. Then if one domain token is used up, the requests could be left in the rq_list of that domain and maybe merged with following io. Following is my test result on machine with 8 cores and NVMe card INTEL SSDPEKKR128G7 fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8 seq/random +--+---+ |patch?| bw(MB/s) | iops| slat(usec) |clat(usec) | merge | +--+ | w/o | 606/612 | 151k/153k | 6.89/7.03 | 3349.21/3305.40 | 0/0 | +--+ | w/ | 1083/616 | 277k/154k | 4.93/6.95 | 1830.62/3279.95 | 223k/3k | +--+ When set numjobs to 16, the bw and iops could reach 1662MB/s and 425k on my platform. Signed-off-by: Jianchao Wang Tested-by: Holger Hoffstätte --- block/kyber-iosched.c | 197 +- 1 file changed, 162 insertions(+), 35 deletions(-) diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c index 0d6d25e3..7ca1364 100644 --- a/block/kyber-iosched.c +++ b/block/kyber-iosched.c @@ -72,6 +72,15 @@ static const unsigned int kyber_batch_size[] = { [KYBER_OTHER] = 8, }; +struct kyber_ctx_queue { + /* +* Copied from blk_mq_ctx->index_hw +*/ + unsigned int index; + spinlock_t lock; + struct list_head rq_list[KYBER_NUM_DOMAINS]; +} cacheline_aligned_in_smp; + struct kyber_queue_data { struct request_queue *q; @@ -99,6 +108,8 @@ struct kyber_hctx_data { struct list_head rqs[KYBER_NUM_DOMAINS]; unsigned int cur_domain; unsigned int batching; + struct kyber_ctx_queue *kcqs; + struct sbitmap kcq_map[KYBER_NUM_DOMAINS]; wait_queue_entry_t domain_wait[KYBER_NUM_DOMAINS]; struct sbq_wait_state *domain_ws[KYBER_NUM_DOMAINS]; atomic_t wait_index[KYBER_NUM_DOMAINS]; @@ -107,10 +118,8 @@ struct kyber_hctx_data { static int kyber_domain_wake(wait_queue_entry_t *wait, unsigned mode, int flags, void *key); -static int rq_sched_domain(const struct request *rq) +static int kyber_sched_domain(unsigned int op) { - unsigned int op = rq->cmd_flags; - if ((op & REQ_OP_MASK) == REQ_OP_READ) return KYBER_READ; else if ((op & REQ_OP_MASK) == REQ_OP_WRITE && op_is_sync(op)) @@ -284,6 +293,11 @@ static unsigned int kyber_sched_tags_shift(struct kyber_queue_data *kqd) return kqd->q->queue_hw_ctx[0]->sched_tags->bitmap_tags.sb.shift; } +static int kyber_bucket_fn(const struct request *rq) +{ + return kyber_sched_domain(rq->cmd_flags); +} + static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q) { struct kyber_queue_data *kqd; @@ -297,11 +311,10 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q) goto err; kqd->q = q; - kqd->cb = blk_stat_alloc_callback(kyber_stat_timer_fn, rq_sched_domain, + kqd->cb = blk_stat_alloc_callback(kyber_stat_timer_fn, kyber_bucket_fn, KYBER_NUM_DOMAINS, kqd); if (!kqd->cb) goto err_kqd; - /* * The maximum number of tokens for any scheduling domain is at least * the queue depth of a single hardware queue. If the hardware doesn't @@ -376,15 +389,45 @@ static void kyber_exit_sched(struct elevator_queue *e) kfree(kqd); } +static void kyber_ctx_queue_init(struct kyber_ctx_queue *kcq) +{ + unsigned int i; + + spin_lock_init(&kcq->lock); + for (i = 0; i < KYBER_NUM_DOMAINS; i++) + INIT_LIST_HEAD(&kcq->rq_list[i]); +} + static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx) { struct kyber_hctx_data *khd; - int i; + int i, sd; khd = kmalloc_node(sizeof(*khd), GFP_KERNEL, hctx->numa_node); if (!khd) return -ENOMEM; + khd->kcqs = kmalloc_array_node(hctx->nr_ctx, + sizeof(struct kyber_ctx_queue), +
[PATCH V2 1/2] blk-mq: abstract out blk-mq-sched rq list iteration bio merge helper
From: Jens Axboe No functional changes in this patch, just a prep patch for utilizing this in an IO scheduler. Signed-off-by: Jens Axboe --- block/blk-mq-sched.c | 34 -- include/linux/blk-mq.h | 3 ++- 2 files changed, 26 insertions(+), 11 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 25c14c5..b0f2c2a 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -268,19 +268,16 @@ bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio, EXPORT_SYMBOL_GPL(blk_mq_sched_try_merge); /* - * Reverse check our software queue for entries that we could potentially - * merge with. Currently includes a hand-wavy stop count of 8, to not spend - * too much time checking for merges. + * Iterate list of requests and see if we can merge this bio with any + * of them. */ -static bool blk_mq_attempt_merge(struct request_queue *q, -struct blk_mq_ctx *ctx, struct bio *bio) +bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list, + struct bio *bio) { struct request *rq; int checked = 8; - lockdep_assert_held(&ctx->lock); - - list_for_each_entry_reverse(rq, &ctx->rq_list, queuelist) { + list_for_each_entry_reverse(rq, list, queuelist) { bool merged = false; if (!checked--) @@ -305,13 +302,30 @@ static bool blk_mq_attempt_merge(struct request_queue *q, continue; } - if (merged) - ctx->rq_merged++; return merged; } return false; } +EXPORT_SYMBOL_GPL(blk_mq_bio_list_merge); + +/* + * Reverse check our software queue for entries that we could potentially + * merge with. Currently includes a hand-wavy stop count of 8, to not spend + * too much time checking for merges. + */ +static bool blk_mq_attempt_merge(struct request_queue *q, +struct blk_mq_ctx *ctx, struct bio *bio) +{ + lockdep_assert_held(&ctx->lock); + + if (blk_mq_bio_list_merge(q, &ctx->rq_list, bio)) { + ctx->rq_merged++; + return true; + } + + return false; +} bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio) { diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index ebc34a5..fb35517 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -259,7 +259,8 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head, void blk_mq_kick_requeue_list(struct request_queue *q); void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs); void blk_mq_complete_request(struct request *rq); - +bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list, + struct bio *bio); bool blk_mq_queue_stopped(struct request_queue *q); void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx); void blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx); -- 2.7.4
[PATCH] block: kyber: make kyber more friendly with merging
Currently, kyber is very unfriendly with merging. kyber depends on ctx rq_list to do merging, however, most of time, it will not leave any requests in ctx rq_list. This is because even if tokens of one domain is used up, kyber will try to dispatch requests from other domain and flush the rq_list there. To improve this, we setup kyber_ctx_queue (kcq) which is similar with ctx, but it has rq_lists for different domain and build same mapping between kcq and khd as the ctx & hctx. Then we could merge, insert and dispatch for different domains separately. If one domain token is used up, the requests could be left in the rq_list of that domain and maybe merged with following io. Following is my test result on machine with 8 cores and NVMe card INTEL SSDPEKKR128G7 fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8 seq/random +--+---+ |patch?| bw(MB/s) | iops| slat(usec) |clat(usec) | merge | +--+ | w/o | 606/612 | 151k/153k | 6.89/7.03 | 3349.21/3305.40 | 0/0 | +--+ | w/ | 1083/616 | 277k/154k | 4.93/6.95 | 1830.62/3279.95 | 223k/3k | +--+ When set numjobs to 16, the bw and iops could reach 1662MB/s and 425k on my platform. Signed-off-by: Jianchao Wang --- block/kyber-iosched.c | 240 -- 1 file changed, 212 insertions(+), 28 deletions(-) diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c index 0d6d25e3..04da05b 100644 --- a/block/kyber-iosched.c +++ b/block/kyber-iosched.c @@ -72,6 +72,15 @@ static const unsigned int kyber_batch_size[] = { [KYBER_OTHER] = 8, }; +struct kyber_ctx_queue { + /* +* Copied from blk_mq_ctx->index_hw +*/ + unsigned int index; + spinlock_t lock; + struct list_head rq_list[KYBER_NUM_DOMAINS]; +} cacheline_aligned_in_smp; + struct kyber_queue_data { struct request_queue *q; @@ -84,6 +93,7 @@ struct kyber_queue_data { */ struct sbitmap_queue domain_tokens[KYBER_NUM_DOMAINS]; + struct kyber_ctx_queue *ctx_queue; /* * Async request percentage, converted to per-word depth for * sbitmap_get_shallow(). @@ -99,6 +109,8 @@ struct kyber_hctx_data { struct list_head rqs[KYBER_NUM_DOMAINS]; unsigned int cur_domain; unsigned int batching; + struct kyber_ctx_queue **kcqs; + struct sbitmap kcq_map[KYBER_NUM_DOMAINS]; wait_queue_entry_t domain_wait[KYBER_NUM_DOMAINS]; struct sbq_wait_state *domain_ws[KYBER_NUM_DOMAINS]; atomic_t wait_index[KYBER_NUM_DOMAINS]; @@ -284,6 +296,19 @@ static unsigned int kyber_sched_tags_shift(struct kyber_queue_data *kqd) return kqd->q->queue_hw_ctx[0]->sched_tags->bitmap_tags.sb.shift; } +static void kyber_ctx_queue_init(struct kyber_queue_data *kqd) +{ + unsigned int i, j; + + for (i = 0; i < nr_cpu_ids; i++) { + struct kyber_ctx_queue *kcq = &kqd->ctx_queue[i]; + + spin_lock_init(&kcq->lock); + for (j = 0; j < KYBER_NUM_DOMAINS; j++) + INIT_LIST_HEAD(&kcq->rq_list[j]); + } +} + static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q) { struct kyber_queue_data *kqd; @@ -302,6 +327,13 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q) if (!kqd->cb) goto err_kqd; + kqd->ctx_queue = kmalloc_array_node(nr_cpu_ids, + sizeof(struct kyber_ctx_queue), GFP_KERNEL, -1); + if (!kqd->ctx_queue) + goto err_cb; + + kyber_ctx_queue_init(kqd); + /* * The maximum number of tokens for any scheduling domain is at least * the queue depth of a single hardware queue. If the hardware doesn't @@ -318,7 +350,7 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q) if (ret) { while (--i >= 0) sbitmap_queue_free(&kqd->domain_tokens[i]); - goto err_cb; + goto err_kcq; } sbitmap_queue_resize(&kqd->domain_tokens[i], kyber_depth[i]); } @@ -331,6 +363,8 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q) return kqd; +err_kcq: + kfree(kqd->ctx_queue); err_cb: blk_stat_free_callback(kqd->cb); err_kqd: @@ -372,6 +406,7 @@ static void kyber_exit_sched(struct elevator_queue *e) for (i = 0; i < KYBER_NUM_DOMAINS; i++) sbitmap_queue_free(&kqd->domain_tokens[i
[PATCH] blk-mq: add plug trace event for multiple hw queues case
There is no plug trace event for multiple hw queues. This is confusing when check block trace event log and find unplug one there. Add plug trace event when request is added to a empty plug list. Signed-off-by: Jianchao Wang --- block/blk-mq.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9ce9cac..ddf726a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1935,6 +1935,9 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) same_queue_rq = NULL; if (same_queue_rq) list_del_init(&same_queue_rq->queuelist); + else + trace_block_plug(q); + list_add_tail(&rq->queuelist, &plug->mq_list); blk_mq_put_ctx(data.ctx); -- 2.7.4
[PATCH] blk-mq: start request gstate with gen 1
rq->gstate and rq->aborted_gstate both are zero before rqs are allocated. If we have a small timeout, when the timer fires, there could be rqs that are never allocated, and also there could be rq that has been allocated but not initialized and started. At the moment, the rq->gstate and rq->aborted_gstate both are 0, thus the blk_mq_terminate_expired will identify the rq is timed out and invoke .timeout early. For scsi, this will cause scsi_times_out to be invoked before the scsi_cmnd is not initialized, scsi_cmnd->device is still NULL at the moment, then we will get crash. Cc: Bart Van Assche Cc: Tejun Heo Cc: Ming Lei Cc: Martin Steigerwald Cc: sta...@vger.kernel.org Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 block/blk-mq.c | 7 +++ 2 files changed, 11 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index abcb868..ce62681 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq) rq->part = NULL; seqcount_init(&rq->gstate_seq); u64_stats_init(&rq->aborted_gstate_sync); + /* +* See comment of blk_mq_init_request +*/ + WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC); } EXPORT_SYMBOL(blk_rq_init); diff --git a/block/blk-mq.c b/block/blk-mq.c index f5c7dbc..d62030a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq, seqcount_init(&rq->gstate_seq); u64_stats_init(&rq->aborted_gstate_sync); + /* +* start gstate with gen 1 instead of 0, otherwise it will be equal +* to aborted_gstate, and be identified timed out by +* blk_mq_terminate_expired. +*/ + WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC); + return 0; } -- 2.7.4
[PATCH] blk-mq: start request gstate with gen 1
rq->gstate and rq->aborted_gstate both are zero before rqs are allocated. If we have a small timeout, when the timer fires, there could be rqs that are never allocated, and also there could be rq that has been allocated but not initialized and started. At the moment, the rq->gstate and rq->aborted_gstate both are 0, thus the blk_mq_terminate_expired will identify the rq is timed out and invoke .timeout early. For scsi, this will cause scsi_times_out to be invoked before the scsi_cmnd is not initialized, scsi_cmnd->device is still NULL at the moment, then we will get crash. Cc: Bart Van Assche Cc: Tejun Heo Cc: Ming Lei Cc: Martin Steigerwald Cc: sta...@vger.kernel.org Signed-off-by: Jianchao Wang --- block/blk-core.c | 4 block/blk-mq.c | 7 +++ 2 files changed, 11 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index abcb868..ce62681 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq) rq->part = NULL; seqcount_init(&rq->gstate_seq); u64_stats_init(&rq->aborted_gstate_sync); + /* +* See comment of blk_mq_init_request +*/ + WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC); } EXPORT_SYMBOL(blk_rq_init); diff --git a/block/blk-mq.c b/block/blk-mq.c index f5c7dbc..d62030a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq, seqcount_init(&rq->gstate_seq); u64_stats_init(&rq->aborted_gstate_sync); + /* +* start gstate with gen 1 instead of 0, otherwise it will be equal +* to aborted_gstate, and be identified timed out by +* blk_mq_terminate_expired. +*/ + WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC); + return 0; } -- 2.7.4
[PATCH] blk-mq: mark hctx RESTART when get budget fails
When get budget fails, blk_mq_sched_dispatch_requests does not do anything to ensure the hctx to be restarted. We can survive from this, because only the scsi implements .get_budget and it always runs the hctx queues when request is completed. Signed-off-by: Jianchao Wang --- block/blk-mq-sched.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 25c14c5..49e7cd9 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -102,8 +102,10 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) !e->type->ops.mq.has_work(hctx)) break; - if (!blk_mq_get_dispatch_budget(hctx)) + if (!blk_mq_get_dispatch_budget(hctx)) { + blk_mq_sched_mark_restart_hctx(hctx); break; + } rq = e->type->ops.mq.dispatch_request(hctx); if (!rq) { @@ -148,8 +150,10 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx) if (!sbitmap_any_bit_set(&hctx->ctx_map)) break; - if (!blk_mq_get_dispatch_budget(hctx)) + if (!blk_mq_get_dispatch_budget(hctx)) { + blk_mq_sched_mark_restart_hctx(hctx); break; + } rq = blk_mq_dequeue_from_ctx(hctx, ctx); if (!rq) { -- 2.7.4