from:"Jianchao Wang"

Re: Issues about the merge_bvec_fn callback in 3.10 series

2019-08-22 Thread Jianchao Wang

Hi Neil

Thanks so much for your suggestion.

On 2019/8/23 9:04, NeilBrown wrote:

>> I have checked
>> v3.10.108
>> v3.18.140
>> v4.1.49
>> but there seems to be no fix for it.
>>
>> It may not have been fixed until
>> 8ae126660fddbeebb9251a174e6fa45b6ad8f932
>> block: kill merge_bvec_fn() completely
>>
>> Could anyone please give some suggestions on this?
> 
> One option would be to make sure that ->bi_rw is set before
> bio_add_page is called.
> There are about 80 calls, so that isn't trivial, but you might not care
> about several of them.
> 
> You could backport the 'kill merge_bvec_fn' patch if you like, but I
> wouldn't.  The chance of introducing a bug is much higher.
> 

I have removed raid5_mergeable_bvec and backported the patches that let
chunk_aligned_read split the bio on its own. That way I only need to
modify the raid5 code and do not need to touch other parts of the system,
especially the block core.

It seems to work well so far.

Thanks again
Jianchao

Re: Issues about the merge_bvec_fn callback in 3.10 series

2019-08-21 Thread Jianchao Wang

Could anyone please comment on this?

Should we drop merge_bvec_fn for raid5 and backport the bio splitting code
there instead?

Thanks in advance.
Jianchao


On 2019/8/21 19:42, Jianchao Wang wrote:
> Hi dear all
> 
> This is a question about older kernel versions.
> 
> We are using a 3.10 series kernel in production and have encountered the
> following issue.
> 
> When a page is added to a bio, .merge_bvec_fn is invoked all the way down the
> device stack, and bio->bi_rw is saved into bvec_merge_data.bi_rw, as in the
> following code:
> 
> __bio_add_page
> ---
>   if (q->merge_bvec_fn) {
>   struct bvec_merge_data bvm = {
>   .bi_bdev = bio->bi_bdev,
>   .bi_sector = bio->bi_iter.bi_sector,
>   .bi_size = bio->bi_iter.bi_size,
>   .bi_rw = bio->bi_rw,
>   };
> 
>   /*
>* merge_bvec_fn() returns number of bytes it can accept
>* at this offset
>*/
>   if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len) {
>   bvec->bv_page = NULL;
>   bvec->bv_len = 0;
>   bvec->bv_offset = 0;
>   return 0;
>   }
>   }
> ---
> 
> However, it seems that bio->bi_rw has not been set at that point (it is set by
> submit_bio), so it is always zero.
> 
> We have a raid5 array, and raid5_mergeable_bvec therefore always handles a
> write as a read. As a result, our write bios are always capped at the stripe
> chunk size, which is not expected and degrades performance. This is the code:
> 
> raid5_mergeable_bvec
> ---
>   if ((bvm->bi_rw & 1) == WRITE)
>   return biovec->bv_len; /* always allow writes to be mergeable */
> 
>   if (mddev->new_chunk_sectors < mddev->chunk_sectors)
>   chunk_sectors = mddev->new_chunk_sectors;
>   max =  (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) << 9;
>   if (max < 0) max = 0;
>   if (max <= biovec->bv_len && bio_sectors == 0)
>   return biovec->bv_len;
>   else
>   return max;
> 
> ---
> 
> I have checked
> v3.10.108
> v3.18.140
> v4.1.49
> but there seems to be no fix for it.
> 
> It may not have been fixed until
> 8ae126660fddbeebb9251a174e6fa45b6ad8f932
> block: kill merge_bvec_fn() completely
> 
> Could anyone please give some suggestions on this?
> Any comments are welcome.
> 
> Thanks in advance
> Jianchao
>

Issues about the merge_bvec_fn callback in 3.10 series

2019-08-21 Thread Jianchao Wang

Hi dear all

This is a question about older kernel versions.

We are using a 3.10 series kernel in production and have encountered the
following issue.

When a page is added to a bio, .merge_bvec_fn is invoked all the way down the
device stack, and bio->bi_rw is saved into bvec_merge_data.bi_rw, as in the
following code:

__bio_add_page
---
if (q->merge_bvec_fn) {
struct bvec_merge_data bvm = {
.bi_bdev = bio->bi_bdev,
.bi_sector = bio->bi_iter.bi_sector,
.bi_size = bio->bi_iter.bi_size,
.bi_rw = bio->bi_rw,
};

/*
 * merge_bvec_fn() returns number of bytes it can accept
 * at this offset
 */
if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len) {
bvec->bv_page = NULL;
bvec->bv_len = 0;
bvec->bv_offset = 0;
return 0;
}
}
---

However, it seems that bio->bi_rw has not been set at that point (it is set by
submit_bio), so it is always zero.

We have a raid5 array, and raid5_mergeable_bvec therefore always handles a
write as a read. As a result, our write bios are always capped at the stripe
chunk size, which is not expected and degrades performance. This is the code:

raid5_mergeable_bvec
---
if ((bvm->bi_rw & 1) == WRITE)
return biovec->bv_len; /* always allow writes to be mergeable */

if (mddev->new_chunk_sectors < mddev->chunk_sectors)
chunk_sectors = mddev->new_chunk_sectors;
max =  (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) << 9;
if (max < 0) max = 0;
if (max <= biovec->bv_len && bio_sectors == 0)
return biovec->bv_len;
else
return max;

---

I have checked
v3.10.108
v3.18.140
v4.1.49
but there seems to be no fix for it.

It may not have been fixed until
8ae126660fddbeebb9251a174e6fa45b6ad8f932
block: kill merge_bvec_fn() completely

Could anyone please give some suggestions on this?
Any comments are welcome.

Thanks in advance
Jianchao
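
To make the effect concrete, here is a minimal user-space sketch of the
quoted raid5_mergeable_bvec logic. It is not kernel code: struct bvm_model
and mergeable_bytes() are hypothetical stand-ins, and a power-of-two chunk
of 128 sectors (64 KiB) with 4 KiB pages is assumed for illustration only.

```c
#include <assert.h>

#define WRITE 1UL

/* Hypothetical stand-in for the bvec_merge_data fields used here. */
struct bvm_model {
	unsigned long long bi_sector;	/* start sector of the bio */
	unsigned int bi_size;		/* bytes already in the bio */
	unsigned long bi_rw;		/* stays 0 until the submitter sets it */
};

/*
 * Mirrors the quoted check: returns how many bytes of a bv_len-byte
 * page may still be added to the bio.
 */
static unsigned int mergeable_bytes(const struct bvm_model *bvm,
				    unsigned int bv_len,
				    unsigned int chunk_sectors)
{
	unsigned int bio_sectors = bvm->bi_size >> 9;
	long long left;
	unsigned int max;

	/* The mask below only works for power-of-two chunk sizes. */
	assert(chunk_sectors && (chunk_sectors & (chunk_sectors - 1)) == 0);

	if ((bvm->bi_rw & 1) == WRITE)
		return bv_len;	/* writes are always mergeable */

	/* Sectors left before the bio would cross a chunk boundary. */
	left = (long long)chunk_sectors -
	       (long long)((bvm->bi_sector & (chunk_sectors - 1)) + bio_sectors);
	max = left < 0 ? 0 : (unsigned int)left << 9;
	if (max <= bv_len && bio_sectors == 0)
		return bv_len;
	return max;
}
```

With bi_rw left at zero, a bio that already holds one full 64 KiB chunk
gets 0 back for the next 4 KiB page, so the page is rejected and the bio
stops at the chunk size. The same call with WRITE set in bi_rw returns the
full 4096.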

[PATCH V2] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue

2019-02-11 Thread Jianchao Wang

When a request is requeued with RQF_DONTPREP set, it already contains
driver specific data, so insert it into the hctx dispatch list to avoid
any merging. Taking scsi as an example, here is the trace event log (with
no io scheduler, because with one RQF_STARTED would prevent merging):

   kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
scsi_inert_test-1987  [000]   2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
   kworker/0:1H-339   [000]   2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
   kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
   kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]

(32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP set.
It was then merged with (32776 + 8) and issued. Because of RQF_DONTPREP,
the sdb only covered the (32768 + 8) part, so only that part was
completed. Luckily, scsi_io_completion detected this and requeued the
remaining part, so we didn't get corrupted data. However, the requeue of
(32776 + 8) is not expected.

Suggested-by: Jens Axboe 
Signed-off-by: Jianchao Wang 
---
V2:
 - refactor the code based on Jens' suggestion

 block/blk-mq.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533..9437a5e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,12 +737,20 @@ static void blk_mq_requeue_work(struct work_struct *work)
spin_unlock_irq(&q->requeue_lock);
 
list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
-   if (!(rq->rq_flags & RQF_SOFTBARRIER))
+   if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
continue;
 
rq->rq_flags &= ~RQF_SOFTBARRIER;
list_del_init(&rq->queuelist);
-   blk_mq_sched_insert_request(rq, true, false, false);
+   /*
+* If RQF_DONTPREP, rq has contained some driver specific
+* data, so insert it to hctx dispatch list to avoid any
+* merge.
+*/
+   if (rq->rq_flags & RQF_DONTPREP)
+   blk_mq_request_bypass_insert(rq, false);
+   else
+   blk_mq_sched_insert_request(rq, true, false, false);
}
 
while (!list_empty(&rq_list)) {
-- 
2.7.4
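
The per-request decision made by the first loop in the V2 hunk above can
be sketched in user space as follows. This is only an illustration: the
flag values, the enum and classify_requeue() are hypothetical names, and
only the branching is taken from the patch.

```c
#include <assert.h>

/* Illustrative flag bits; the real values are defined by the kernel. */
#define RQF_SOFTBARRIER	(1U << 0)
#define RQF_DONTPREP	(1U << 1)

enum requeue_action {
	REQUEUE_LATER,		/* skipped here, left on the list for the later loop */
	REQUEUE_SCHED_INSERT,	/* blk_mq_sched_insert_request(): may be merged */
	REQUEUE_BYPASS_INSERT,	/* blk_mq_request_bypass_insert(): hctx dispatch list */
};

/* Decide what the first loop does with one request, updating its flags. */
static enum requeue_action classify_requeue(unsigned int *rq_flags)
{
	if (!(*rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
		return REQUEUE_LATER;

	*rq_flags &= ~RQF_SOFTBARRIER;
	/* A prepared request carries driver data, so it must not be merged. */
	if (*rq_flags & RQF_DONTPREP)
		return REQUEUE_BYPASS_INSERT;
	return REQUEUE_SCHED_INSERT;
}
```

A request with RQF_DONTPREP always ends up on the dispatch list, whether
or not RQF_SOFTBARRIER is also set, and RQF_DONTPREP itself is left
untouched.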

[PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue

2019-02-10 Thread Jianchao Wang

When a request is requeued with RQF_DONTPREP set, it already contains
driver specific data, so insert it into the hctx dispatch list to avoid
any merging. Taking scsi as an example, here is the trace event log (with
no io scheduler, because with one RQF_STARTED would prevent merging):

   kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
scsi_inert_test-1987  [000]   2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
   kworker/0:1H-339   [000]   2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
   kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
   kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]

(32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP set.
It was then merged with (32776 + 8) and issued. Because of RQF_DONTPREP,
the sdb only covered the (32768 + 8) part, so only that part was
completed. Luckily, scsi_io_completion detected this and requeued the
remaining part, so we didn't get corrupted data. However, the requeue of
(32776 + 8) is not expected.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533..2d93eb5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,6 +737,18 @@ static void blk_mq_requeue_work(struct work_struct *work)
spin_unlock_irq(&q->requeue_lock);
 
list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
+   /*
+* If RQF_DONTPREP, rq has contained some driver specific
+* data, so insert it to hctx dispatch list to avoid any
+* merge.
+*/
+   if (rq->rq_flags & RQF_DONTPREP) {
+   rq->rq_flags &= ~RQF_SOFTBARRIER;
+   list_del_init(&rq->queuelist);
+   blk_mq_request_bypass_insert(rq, false);
+   continue;
+   }
+
if (!(rq->rq_flags & RQF_SOFTBARRIER))
continue;
 
-- 
2.7.4

[PATCH] blk-mq: fix a hung issue when fsync

2019-01-30 Thread Jianchao Wang

Florian reported an io hang during fsync(). It appears to be
triggered by the following race condition.

data + post flush                     a flush

blk_flush_complete_seq
  case REQ_FSEQ_DATA
    blk_flush_queue_rq
    issued to driver                  blk_mq_dispatch_rq_list
                                        try to issue a flush req
                                        failed due to NON-NCQ command
                                        .queue_rq return BLK_STS_DEV_RESOURCE

request completion
  req->end_io // doesn't check RESTART
  mq_flush_data_end_io
    case REQ_FSEQ_POSTFLUSH
      blk_kick_flush
        do nothing because previous flush
        has not been completed
    blk_mq_run_hw_queue
                                        insert rq to hctx->dispatch
                                        due to RESTART is still set, do nothing

To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io
with blk_mq_sched_restart to check and clear the RESTART flag.

Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers)
Reported-by: Florian Stecker 
Tested-by: Florian Stecker 
Signed-off-by: Jianchao Wang 
---
 block/blk-flush.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index a3fc7191..6e0f2d9 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -335,7 +335,7 @@ static void mq_flush_data_end_io(struct request *rq, blk_status_t error)
blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error);
spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
 
-   blk_mq_run_hw_queue(hctx, true);
+   blk_mq_sched_restart(hctx);
 }
 
 /**
-- 
2.7.4
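
The race in the fsync hang fix above can be replayed with a small
single-threaded model. Everything here is a hypothetical simplification:
struct hctx_model and the three helpers are illustrative names, and the
behaviour of park_on_dispatch() is inferred only from the "due to RESTART
is still set, do nothing" step in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one hardware queue. */
struct hctx_model {
	bool restart;		/* stands in for the RESTART flag */
	int queue_runs;		/* how many times the queue was (re)run */
};

static void run_hw_queue(struct hctx_model *h)
{
	h->queue_runs++;
}

/* Check and clear RESTART; rerun the queue only if it was set. */
static void sched_restart(struct hctx_model *h)
{
	if (!h->restart)
		return;
	h->restart = false;
	run_hw_queue(h);
}

/*
 * Dispatch side after .queue_rq returned BLK_STS_DEV_RESOURCE: the flush
 * request is parked on the dispatch list, and the queue is rerun only
 * when RESTART is clear (otherwise a completion is expected to do it).
 */
static void park_on_dispatch(struct hctx_model *h)
{
	if (!h->restart)
		run_hw_queue(h);
}
```

In this model, the old end_io path calls run_hw_queue() before the flush
request is parked, so RESTART stays set and nothing ever reruns the queue
afterwards. Calling sched_restart() instead clears the flag, and the
dispatch side then reruns the queue itself.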

[PATCH] blk-mq: fix the cmd_flag_name array

2019-01-24 Thread Jianchao Wang

Swap REQ_NOWAIT and REQ_NOUNMAP and add REQ_HIPRI.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-debugfs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 90d6876..f812083 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -308,8 +308,9 @@ static const char *const cmd_flag_name[] = {
CMD_FLAG_NAME(PREFLUSH),
CMD_FLAG_NAME(RAHEAD),
CMD_FLAG_NAME(BACKGROUND),
-   CMD_FLAG_NAME(NOUNMAP),
CMD_FLAG_NAME(NOWAIT),
+   CMD_FLAG_NAME(NOUNMAP),
+   CMD_FLAG_NAME(HIPRI),
 };
 #undef CMD_FLAG_NAME
 
-- 
2.7.4

[PATCH 1/2] blk-mq: save queue mapping result into ctx directly

2019-01-24 Thread Jianchao Wang

Currently, the queue mapping result is saved in a two-dimensional
array. In the hot path, getting a hctx requires the following lookup:

  q->queue_hw_ctx[q->tag_set->map[type].mq_map[cpu]]

This is not efficient. Instead, we could save the queue mapping
result directly in the ctx, indexed by hctx type:

  ctx->hctxs[type]

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c | 2 +-
 block/blk-mq-tag.c   | 2 +-
 block/blk-mq.c   | 4 ++--
 block/blk-mq.h   | 7 ---
 block/blk.h  | 2 +-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 140933e..4090553 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -321,7 +321,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 {
struct elevator_queue *e = q->elevator;
struct blk_mq_ctx *ctx = blk_mq_get_ctx(q);
-   struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx->cpu);
+   struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx);
bool ret = false;
enum hctx_type type;
 
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2089c6c..a4931fc 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -170,7 +170,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 
data->ctx = blk_mq_get_ctx(data->q);
data->hctx = blk_mq_map_queue(data->q, data->cmd_flags,
-   data->ctx->cpu);
+   data->ctx);
tags = blk_mq_tags_from_data(data);
if (data->flags & BLK_MQ_REQ_RESERVED)
bt = &tags->breserved_tags;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533..445d0a2 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -364,7 +364,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,
}
if (likely(!data->hctx))
data->hctx = blk_mq_map_queue(q, data->cmd_flags,
-   data->ctx->cpu);
+   data->ctx);
if (data->cmd_flags & REQ_NOWAIT)
data->flags |= BLK_MQ_REQ_NOWAIT;
 
@@ -2435,7 +2435,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
continue;
 
hctx = blk_mq_map_queue_type(q, j, i);
-
+   ctx->hctxs[j] = hctx;
/*
 * If the CPU is already set in the mask, then we've
 * mapped this one already. This can happen if
diff --git a/block/blk-mq.h b/block/blk-mq.h
index d943d46..9fb0626 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -23,6 +23,7 @@ struct blk_mq_ctx {
 
unsigned intcpu;
unsigned short  index_hw[HCTX_MAX_TYPES];
+   struct blk_mq_hw_ctx*hctxs[HCTX_MAX_TYPES];
 
/* incremented at dispatch time */
unsigned long   rq_dispatched[2];
@@ -97,11 +98,11 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *
  * blk_mq_map_queue() - map (cmd_flags,type) to hardware queue
  * @q: request queue
  * @flags: request command flags
- * @cpu: CPU
+ * @ctx: cpu ctx
  */
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 unsigned int flags,
-unsigned int cpu)
+struct blk_mq_ctx *ctx)
 {
enum hctx_type type = HCTX_TYPE_DEFAULT;
 
@@ -116,7 +117,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 q->tag_set->map[HCTX_TYPE_READ].nr_queues)
type = HCTX_TYPE_READ;

-   return blk_mq_map_queue_type(q, type, cpu);
+   return ctx->hctxs[type];
 }
 
 /*
diff --git a/block/blk.h b/block/blk.h
index 848278c..5d636ee 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -38,7 +38,7 @@ extern struct ida blk_queue_ida;
 static inline struct blk_flush_queue *
 blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
 {
-   return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx->cpu)->fq;
+   return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx)->fq;
 }
 
 static inline void __blk_get_queue(struct request_queue *q)
-- 
2.7.4
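
As a rough illustration of what patch 1/2 changes, the sketch below
models both lookups in user space. The struct layouts, sizes and function
names are hypothetical and much simpler than the kernel's; only the shape
of the two lookups follows the commit message.

```c
#include <assert.h>

#define NR_CPUS_MODEL	4
#define NR_TYPES_MODEL	3	/* stands in for HCTX_MAX_TYPES */
#define NR_HW_MODEL	8

struct hctx_model { int id; };

struct queue_map_model {
	unsigned int mq_map[NR_CPUS_MODEL];	/* cpu -> hw queue index */
};

struct ctx_model {
	unsigned int cpu;
	struct hctx_model *hctxs[NR_TYPES_MODEL];	/* cached per-type result */
};

struct queue_model {
	struct hctx_model *queue_hw_ctx[NR_HW_MODEL];
	struct queue_map_model map[NR_TYPES_MODEL];
	struct ctx_model ctx[NR_CPUS_MODEL];
};

/* Old hot path: dependent loads through map[], mq_map[] and queue_hw_ctx[]. */
static struct hctx_model *map_queue_slow(struct queue_model *q, int type,
					 unsigned int cpu)
{
	return q->queue_hw_ctx[q->map[type].mq_map[cpu]];
}

/* Done once at mapping time, in the spirit of blk_mq_map_swqueue. */
static void cache_mapping(struct queue_model *q)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		q->ctx[cpu].cpu = cpu;
		for (int type = 0; type < NR_TYPES_MODEL; type++)
			q->ctx[cpu].hctxs[type] = map_queue_slow(q, type, cpu);
	}
}

/* New hot path: a single load from the ctx. */
static struct hctx_model *map_queue_fast(struct ctx_model *ctx, int type)
{
	return ctx->hctxs[type];
}
```

The cached pointer has to be refreshed whenever the mapping changes, which
is why the patch fills it in at the point where the swqueues are mapped.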

[PATCH 2/2] blk-mq: save default hctx into ctx->hctxs for not-supported type

2019-01-24 Thread Jianchao Wang

Currently, we check whether the hctx type is supported every time in
the hot path. This is not necessary: when mapping the swqueues, we can
save the default hctx into ctx->hctxs for any type that is not
supported, and then use ctx->hctxs[type] directly.

We also needn't check whether polling is enabled, because the caller
clears REQ_HIPRI when it is not.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c |  9 -
 block/blk-mq.h | 13 +
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 445d0a2..8a825ae 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2431,8 +2431,11 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 
ctx = per_cpu_ptr(q->queue_ctx, i);
for (j = 0; j < set->nr_maps; j++) {
-   if (!set->map[j].nr_queues)
+   if (!set->map[j].nr_queues) {
+   ctx->hctxs[j] = blk_mq_map_queue_type(q,
+   HCTX_TYPE_DEFAULT, i);
continue;
+   }
 
hctx = blk_mq_map_queue_type(q, j, i);
ctx->hctxs[j] = hctx;
@@ -2455,6 +2458,10 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 */
BUG_ON(!hctx->nr_ctx);
}
+
+   for (; j < HCTX_MAX_TYPES; j++)
+   ctx->hctxs[j] = blk_mq_map_queue_type(q,
+   HCTX_TYPE_DEFAULT, i);
}
 
mutex_unlock(&q->sysfs_lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9fb0626..14b7efb9 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -106,15 +106,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 {
enum hctx_type type = HCTX_TYPE_DEFAULT;
 
-   if ((flags & REQ_HIPRI) &&
-   q->tag_set->nr_maps > HCTX_TYPE_POLL && 
-   q->tag_set->map[HCTX_TYPE_POLL].nr_queues &&
-   test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
+   /*
+* The caller ensure that if REQ_HIPRI, poll must be enabled.
+*/
+   if (flags & REQ_HIPRI)
type = HCTX_TYPE_POLL;
-
-   else if (((flags & REQ_OP_MASK) == REQ_OP_READ) &&
-q->tag_set->nr_maps > HCTX_TYPE_READ &&
-q->tag_set->map[HCTX_TYPE_READ].nr_queues)
+   else if ((flags & REQ_OP_MASK) == REQ_OP_READ)
type = HCTX_TYPE_READ;

return ctx->hctxs[type];
-- 
2.7.4
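
The idea of patch 2/2 can be sketched in user space like this. The names
are hypothetical and plain integers stand in for hctx pointers; the sketch
only shows the fallback fill at mapping time and the simplified type
selection in the hot path.

```c
#include <assert.h>

enum hctx_type_model { TYPE_DEFAULT, TYPE_READ, TYPE_POLL };
#define MAX_TYPES_MODEL 3u	/* stands in for HCTX_MAX_TYPES */

/*
 * Fill the per-ctx cache once. A supported type keeps its own queue;
 * a type with no queues, or beyond nr_maps, falls back to the default.
 */
static void fill_hctxs(int hctxs[MAX_TYPES_MODEL],
		       const int queue_for_type[MAX_TYPES_MODEL],
		       const unsigned int nr_queues[MAX_TYPES_MODEL],
		       unsigned int nr_maps)
{
	unsigned int j;

	assert(nr_maps <= MAX_TYPES_MODEL);

	for (j = 0; j < nr_maps; j++)
		hctxs[j] = nr_queues[j] ? queue_for_type[j]
					: queue_for_type[TYPE_DEFAULT];
	for (; j < MAX_TYPES_MODEL; j++)
		hctxs[j] = queue_for_type[TYPE_DEFAULT];
}

/* Hot path after the patch: pick the type from the flags alone. */
static int pick_hctx(const int hctxs[MAX_TYPES_MODEL], int is_hipri, int is_read)
{
	enum hctx_type_model type = TYPE_DEFAULT;

	if (is_hipri)
		type = TYPE_POLL;
	else if (is_read)
		type = TYPE_READ;
	return hctxs[type];
}
```

Because every slot is always populated, the hot path no longer needs the
nr_maps and nr_queues checks that the patch removes.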

[PATCH 0/2] small optimization for accessing queue map

2019-01-24 Thread Jianchao Wang

Hi Jens

These two patches are a small optimization for accessing the queue mapping
in the hot path. They save the queue mapping results into blk_mq_ctx
directly, so we no longer need the complicated indirection through
queue_hw_ctx[], map[] and mq_map[].

Jianchao Wang (2)
blk-mq: save queue mapping result into ctx directly
blk-mq: save default hctx into ctx->hctxs

 block/blk-mq-sched.c |  2 +-
 block/blk-mq-tag.c   |  2 +-
 block/blk-mq.c   | 13 ++---
 block/blk-mq.h   | 20 +---
 block/blk.h  |  2 +-
 5 files changed, 22 insertions(+), 17 deletions(-)

Thanks
Jianchao

[PATCH 1/2] blk-mq: save queue mapping result into ctx directly

2018-12-26 Thread Jianchao Wang

Currently, the queue mapping result is saved in a two-dimensional
array. In the hot path, getting a hctx requires the following lookup:

  q->queue_hw_ctx[q->tag_set->map[type].mq_map[cpu]]

This is not efficient. Instead, we could save the queue mapping
result directly in the ctx, indexed by hctx type:

  ctx->hctxs[type]

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c | 2 +-
 block/blk-mq-tag.c   | 2 +-
 block/blk-mq.c   | 4 ++--
 block/blk-mq.h   | 5 +++--
 block/blk.h  | 2 +-
 5 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 140933e..4090553 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -321,7 +321,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 {
struct elevator_queue *e = q->elevator;
struct blk_mq_ctx *ctx = blk_mq_get_ctx(q);
-   struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx->cpu);
+   struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx);
bool ret = false;
enum hctx_type type;
 
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2089c6c..a4931fc 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -170,7 +170,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 
data->ctx = blk_mq_get_ctx(data->q);
data->hctx = blk_mq_map_queue(data->q, data->cmd_flags,
-   data->ctx->cpu);
+   data->ctx);
tags = blk_mq_tags_from_data(data);
if (data->flags & BLK_MQ_REQ_RESERVED)
bt = &tags->breserved_tags;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3ba37b9..6898d24 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -364,7 +364,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,
}
if (likely(!data->hctx))
data->hctx = blk_mq_map_queue(q, data->cmd_flags,
-   data->ctx->cpu);
+   data->ctx);
if (data->cmd_flags & REQ_NOWAIT)
data->flags |= BLK_MQ_REQ_NOWAIT;
 
@@ -2434,7 +2434,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
continue;
 
hctx = blk_mq_map_queue_type(q, j, i);
-
+   ctx->hctxs[j] = hctx;
/*
 * If the CPU is already set in the mask, then we've
 * mapped this one already. This can happen if
diff --git a/block/blk-mq.h b/block/blk-mq.h
index d943d46..998f5cf 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -23,6 +23,7 @@ struct blk_mq_ctx {
 
unsigned intcpu;
unsigned short  index_hw[HCTX_MAX_TYPES];
+   struct blk_mq_hw_ctx *hctxs[HCTX_MAX_TYPES];
 
/* incremented at dispatch time */
unsigned long   rq_dispatched[2];
@@ -101,7 +102,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *
  */
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 unsigned int flags,
-unsigned int cpu)
+struct blk_mq_ctx *ctx)
 {
enum hctx_type type = HCTX_TYPE_DEFAULT;
 
@@ -116,7 +117,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 q->tag_set->map[HCTX_TYPE_READ].nr_queues)
type = HCTX_TYPE_READ;

-   return blk_mq_map_queue_type(q, type, cpu);
+   return ctx->hctxs[type];
 }
 
 /*
diff --git a/block/blk.h b/block/blk.h
index 848278c..5d636ee 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -38,7 +38,7 @@ extern struct ida blk_queue_ida;
 static inline struct blk_flush_queue *
 blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
 {
-   return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx->cpu)->fq;
+   return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx)->fq;
 }
 
 static inline void __blk_get_queue(struct request_queue *q)
-- 
2.7.4

[PATCH 2/2] blk-mq: save default hctx into ctx->hctxs for not-supported type

2018-12-26 Thread Jianchao Wang

Currently, we check whether the hctx type is supported every time in
the hot path. This is not necessary: when mapping the swqueues, we can
save the default hctx into ctx->hctxs for any type that is not
supported, and then use ctx->hctxs[type] directly.

We also needn't check whether polling is enabled, because the caller
clears REQ_HIPRI when it is not.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c |  9 -
 block/blk-mq.h | 15 ++-
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6898d24..1dab467 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2430,8 +2430,11 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 
ctx = per_cpu_ptr(q->queue_ctx, i);
for (j = 0; j < set->nr_maps; j++) {
-   if (!set->map[j].nr_queues)
+   if (!set->map[j].nr_queues) {
+   ctx->hctxs[j] = blk_mq_map_queue_type(q,
+   HCTX_TYPE_DEFAULT, i);
continue;
+   }
 
hctx = blk_mq_map_queue_type(q, j, i);
ctx->hctxs[j] = hctx;
@@ -2454,6 +2457,10 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 */
BUG_ON(!hctx->nr_ctx);
}
+
+   for (; j < HCTX_MAX_TYPES; j++)
+   ctx->hctxs[j] = blk_mq_map_queue_type(q,
+   HCTX_TYPE_DEFAULT, i);
}
 
mutex_unlock(&q->sysfs_lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 998f5cf..ed1ed45 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -98,7 +98,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *
  * blk_mq_map_queue() - map (cmd_flags,type) to hardware queue
  * @q: request queue
  * @flags: request command flags
- * @cpu: CPU
+ * @ctx: mq cpu ctx
  */
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 unsigned int flags,
@@ -106,15 +106,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 {
enum hctx_type type = HCTX_TYPE_DEFAULT;
 
-   if ((flags & REQ_HIPRI) &&
-   q->tag_set->nr_maps > HCTX_TYPE_POLL && 
-   q->tag_set->map[HCTX_TYPE_POLL].nr_queues &&
-   test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
+   /*
+* The caller ensure that if REQ_HIPRI, poll must be enabled.
+*/
+   if (flags & REQ_HIPRI)
type = HCTX_TYPE_POLL;
-
-   else if (((flags & REQ_OP_MASK) == REQ_OP_READ) &&
-q->tag_set->nr_maps > HCTX_TYPE_READ &&
-q->tag_set->map[HCTX_TYPE_READ].nr_queues)
+   else if ((flags & REQ_OP_MASK) == REQ_OP_READ)
type = HCTX_TYPE_READ;

return ctx->hctxs[type];
-- 
2.7.4

[PATCH 0/2] blk-mq: small optimization for accessing of queue map

2018-12-26 Thread Jianchao Wang

Hi Jens

These two patches are a small optimization for accessing the queue mapping
in the hot path. They save the queue mapping results into blk_mq_ctx
directly, so we no longer need the complicated indirection through
queue_hw_ctx[], map[] and mq_map[].

Jianchao Wang(2)
  blk-mq: save queue mapping result into ctx directly
  blk-mq: save default hctx into ctx->hctxs for not-supported type

block/blk-mq-sched.c |  2 +-
block/blk-mq-tag.c   |  2 +-
block/blk-mq.c   | 13 ++---
block/blk-mq.h   | 20 +---
block/blk.h  |  2 +-
5 files changed, 22 insertions(+), 17 deletions(-)

Thanks
Jianchao

[PATCH V14 3/3] blk-mq: replace and kill blk_mq_request_issue_directly

2018-12-13 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 6 --
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 268d2b8..fa661ba 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1240,6 +1240,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
+   blk_qc_t unused;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1255,7 +1257,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq, true);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 90361fe..2d3a29e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1792,7 +1792,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass, bool last)
@@ -1864,13 +1864,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
-{
-   blk_qc_t unused;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 0c9c9ea..b63a0de 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass, bool last);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V14 0/3] blk-mq: refactor and fix the code of issue directly

2018-12-13 Thread Jianchao Wang

Hi Jens

After commit c616cbee ( blk-mq: punt failed direct issue to dispatch
list ), we always insert the request into the hctx dispatch list whenever
we get BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. This is overkill and
harms merging. We only need to do that for requests that have been
through .queue_rq.

This patch set fixes the issue above and refactors the direct-issue code
to unify the interface and make the code clearer and more readable.

Please consider this patchset for 4.21.

The 1st patch refactors the direct-issue code into a single helper
interface that handles all the cases. It also makes a more reasonable
decision about how to insert the requests.

The 2nd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests.

The 3rd patch replaces and kills blk_mq_request_issue_directly.

V14:
 - change some comment

V13:
 - remove the unused tag 'out' (1/3)

V12:
 - remove the original 1st patch
 - rebase other 3 patches on newest for-4.21/block
 - add some comment

V11:
 - insert request to dispatch list when .queue_rq return
   BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4)
 - stop attempting direct issue and insert the rest when a
   non-ok case is met (3/4).

V10:
 - address Jens' comment.
 - let blk_mq_try_issue_directly return actual result for case
   'bypass == false'. (2/4)
 - use return value of blk_mq_try_issue_directly to identify
   whether the request is direct-issued successfully. (3/4)

V9:
 - rebase on recent for-4.21/block
 - add 1st patch

V8:
 - drop the original 2nd patch which try to insert requests into hctx->dispatch
   if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch which tries to ensure the hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - reserve the unused_cookie in 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code about the case bypass_insert is true and io scheduler
   attached. The request still need to be issued in case above. (1/4)
 - Refactor the code to make code clearer. blk_mq_make_request is introduced
   to decide insert, end or just return based on the return value of .queue_rq
   and bypass_insert (1/4) 
 - Add the 2nd patch. It introduce a new decision result which indicates to
   insert request with blk_mq_request_bypass_insert.
 - Modify the code to adapt the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.


Jianchao Wang (3)
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in
blk-mq: replace and kill blk_mq_request_issue_directly

 block/blk-core.c |   4 +++-
 block/blk-mq-sched.c |   8 +++
 block/blk-mq.c   | 122 
---
 block/blk-mq.h   |   6 +++--
 4 files changed, 69 insertions(+), 71 deletions(-)


Thanks
Jianchao

[PATCH V14 1/3] blk-mq: refactor the code of issue request directly

2018-12-13 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces used to issue requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling, so its code can be cleaned up.

Also, commit c616cbee ("blk-mq: punt failed direct issue to dispatch
list") always inserts requests into the hctx dispatch list whenever
it gets BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. This is overkill
and harms merging. We only need to do that for requests that have
been through .queue_rq. This patch fixes that as well.
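
To make the resulting behaviour easier to follow, here is a minimal
user-space sketch of the decision taken at the end of the merged helper,
modeled on the switch statement in the diff that follows. The enum names,
struct outcome, issue_outcome() and issue_outcome_selfcheck() are
hypothetical stand-ins, not kernel symbols. 'force' models whether the
request actually reached .queue_rq, and 'bypass' is assumed to be the value
after the stopped/quiesced check may have cleared it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's blk_status_t values. */
enum sts { STS_OK, STS_RESOURCE, STS_DEV_RESOURCE, STS_IOERR };

/* What the merged helper does with the request after the issue attempt. */
enum action { ACT_NONE, ACT_DISPATCH_INSERT, ACT_SCHED_INSERT, ACT_END };

struct outcome {
	enum action action;	/* side effect applied to the request */
	enum sts ret;		/* value handed back to the caller */
};

/*
 * Model of the final switch in the merged helper (an illustration only).
 * 'force' is true only when the request went through .queue_rq().
 */
static struct outcome issue_outcome(enum sts ret, bool bypass, bool force)
{
	struct outcome o = { ACT_NONE, ret };

	switch (ret) {
	case STS_OK:
		break;
	case STS_DEV_RESOURCE:
	case STS_RESOURCE:
		if (force) {
			/* The driver saw the request: use the dispatch list. */
			o.action = ACT_DISPATCH_INSERT;
			/* A bypass caller is told OK, others see the real result. */
			if (bypass)
				o.ret = STS_OK;
		} else if (!bypass) {
			/* Never reached the driver: normal insert, merging still works. */
			o.action = ACT_SCHED_INSERT;
		}
		break;
	default:
		/* Hard error: end the request unless the caller handles it. */
		if (!bypass)
			o.action = ACT_END;
		break;
	}
	return o;
}

/* Small self-check covering the two resource cases described above. */
static void issue_outcome_selfcheck(void)
{
	assert(issue_outcome(STS_RESOURCE, false, true).action == ACT_DISPATCH_INSERT);
	assert(issue_outcome(STS_RESOURCE, false, false).action == ACT_SCHED_INSERT);
}
```

In this model, issue_outcome(STS_RESOURCE, true, true) yields a
dispatch-list insert together with STS_OK, which corresponds to the
device-mapper case mentioned in the code comment of the patch.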

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 103 ++---
 1 file changed, 54 insertions(+), 49 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9690f4f..af4dc82 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1792,78 +1792,83 @@ static blk_status_t __blk_mq_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert, bool last)
+   bool bypass, bool last)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
+   bool force = false;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie, last);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_request_bypass_insert(rq, run_queue);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
-
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_request_bypass_insert(rq, true);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
-
+   /*
+* Always add a request that has been through
+*.queue_rq() to the hardware dispatch list.
+*/
+   force = true;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
+out_unlock:
hctx_unlock(hctx, srcu_idx);
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   /*
+* We have to return BLK_STS_OK for the DM
+* to avoid livelock. Otherwise, we return
+* the real result to indicate whether the
+* request is direct-issued successfully.
+*/
+   ret = bypass ? BLK_STS_OK : ret;
+   } else if (!bypass) {
+   blk_mq_sched_insert_request(rq, false,
+   run_queue, false);
+

[PATCH V14 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-13 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them completely. Remove the blk_rq_can_direct_dispatch check,
because blk_mq_try_issue_directly handles that case well. If a request
fails to be issued directly, insert the rest.
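
The "insert the rest" behaviour can be sketched in user space as follows.
This is only an illustration under assumptions: struct list_req,
list_try_issue() and list_issue_all() are hypothetical names, the list is
modeled as an array, and list_try_issue() stands in for
blk_mq_try_issue_directly() called with bypass 'false', which deals with a
failed request itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BLK_STS_OK and BLK_STS_RESOURCE. */
enum list_sts { LIST_STS_OK, LIST_STS_RESOURCE };

struct list_req {
	bool will_fail;		/* simulate a busy driver for this request */
	bool issued;		/* request was accepted by the driver */
	bool helper_handled;	/* failed request requeued by the helper */
	bool inserted;		/* request inserted by the loop, not issued */
};

/* Models a direct-issue attempt with bypass 'false'. */
static enum list_sts list_try_issue(struct list_req *rq)
{
	if (rq->will_fail) {
		rq->helper_handled = true;
		return LIST_STS_RESOURCE;
	}
	rq->issued = true;
	return LIST_STS_OK;
}

/*
 * Walk the whole list. After the first non-OK result, stop trying to
 * issue and insert every remaining request instead. Returns true when
 * the driver would need a commit_rqs() call, i.e. when ret != OK.
 */
static bool list_issue_all(struct list_req *reqs, size_t n)
{
	enum list_sts ret = LIST_STS_OK;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ret == LIST_STS_OK)
			ret = list_try_issue(&reqs[i]);
		else
			reqs[i].inserted = true;
	}
	return ret != LIST_STS_OK;
}
```

Because the loop always drains the list, the caller can no longer use
"list not empty" to detect a failure. That is why the commit_rqs condition
in the diff switches to checking the last return value.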

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 20 +---
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f096d898..5b4d52d 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx 
*hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index af4dc82..90361fe 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1874,22 +1874,20 @@ blk_status_t blk_mq_request_issue_directly(struct 
request *rq, bool last)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused;
+   blk_status_t ret = BLK_STS_OK;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq, list_empty(list));
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   blk_mq_request_bypass_insert(rq,
+   if (ret == BLK_STS_OK)
+   ret = blk_mq_try_issue_directly(hctx, rq, &unused,
+   false,
list_empty(list));
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   else
+   blk_mq_sched_insert_request(rq, false, true, false);
}
 
/*
@@ -1897,7 +1895,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx 
*hctx,
 * the driver there was more coming, but that turned out to
 * be a lie.
 */
-   if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
+   if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
-- 
2.7.4

[PATCH V13 0/3] blk-mq: refactor code of issue directly

2018-12-09 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.

It refactors the direct-issue code to unify the interface and make the
code clearer and more readable.

The 1st patch refactors the direct-issue code into a single helper
interface that handles all the cases.

The 2nd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 3rd patch replaces and kills blk_mq_request_issue_directly.

V13:
 - remove the unused tag 'out' (1/3)

V12:
 - remove the original 1st patch
 - rebase other 3 patches on newest for-4.21/block
 - add some comment

V11:
 - insert the request into the dispatch list when .queue_rq returns
   BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4)
 - stop attempting direct issue and insert the rest when a non-OK
   result is met (3/4).

V10:
 - address Jens' comment.
 - let blk_mq_try_issue_directly return actual result for case
   'bypass == false'. (2/4)
 - use return value of blk_mq_try_issue_directly to identify
   whether the request is direct-issued successfully. (3/4)

V9:
 - rebase on recent for-4.21/block
 - add 1st patch

V8:
 - drop the original 2nd patch, which tried to insert requests into
   hctx->dispatch if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch, which tried to ensure hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - keep the unused_cookie in 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return based on the return
   value of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.


Jianchao Wang (3)
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in
blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++--
block/blk-mq.c   | 123 +--
block/blk-mq.h   |   6 ++-
4 files changed, 70 insertions(+), 71 deletions(-)

Thanks
Jianchao

[PATCH V13 3/3] blk-mq: replace and kill blk_mq_request_issue_directly

2018-12-09 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 6 --
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad59102..c92d866 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct 
request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request 
*rq)
 {
+   blk_qc_t unused;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct 
request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq, true);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f554d1..890d074 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1794,7 +1794,7 @@ static blk_status_t __blk_mq_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass, bool last)
@@ -1866,13 +1866,6 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
-{
-   blk_qc_t unused;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a664ea4..b81e619 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool 
run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass, bool last);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V13 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-09 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them completely. Remove the blk_rq_can_direct_dispatch check,
because blk_mq_try_issue_directly handles that case well. If a request
fails to be issued directly, insert the rest.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 20 +---
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f096d898..5b4d52d 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx 
*hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8dc34d2..8f554d1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1876,22 +1876,20 @@ blk_status_t blk_mq_request_issue_directly(struct 
request *rq, bool last)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused;
+   blk_status_t ret = BLK_STS_OK;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq, list_empty(list));
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   blk_mq_request_bypass_insert(rq,
+   if (ret == BLK_STS_OK)
+   ret = blk_mq_try_issue_directly(hctx, rq, &unused,
+   false,
list_empty(list));
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   else
+   blk_mq_sched_insert_request(rq, false, true, false);
}
 
/*
@@ -1899,7 +1897,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx 
*hctx,
 * the driver there was more coming, but that turned out to
 * be a lie.
 */
-   if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
+   if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
-- 
2.7.4

[PATCH V13 1/3] blk-mq: refactor the code of issue request directly

2018-12-09 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces used to issue requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling, so its code can be cleaned up.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 103 ++---
 1 file changed, 54 insertions(+), 49 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b645275..8dc34d2 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1794,78 +1794,83 @@ static blk_status_t __blk_mq_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert, bool last)
+   bool bypass, bool last)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
+   bool force = false;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie, last);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_request_bypass_insert(rq, run_queue);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
-
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_request_bypass_insert(rq, true);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
-
+   /*
+* Always add a request that has been through
+*.queue_rq() to the hardware dispatch list.
+*/
+   force = true;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
+out_unlock:
hctx_unlock(hctx, srcu_idx);
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   /*
+* We have to return BLK_STS_OK for the DM
+* to avoid livelock. Otherwise, we return
+* the real result to indicate whether the
+* request is direct-issued successfully.
+*/
+   ret = bypass ? BLK_STS_OK : ret;
+   } else if (!bypass) {
+   blk_mq_sched_insert_request(rq, false,
+   run_queue, false);
+   }
+   break;
+   default:
+   if (!bypass)
+   blk_mq_end_request(rq, ret);
+   break;
+   }
+
+   return ret;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
 {
-   blk_status_t ret;
-   int srcu_idx;
-   blk_qc_t unused_cookie

[PATCH V12 0/3] blk-mq: refactor code of issue directly

2018-12-09 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.

It refactors the direct-issue code to unify the interface and make the
code clearer and more readable.

The 1st patch refactors the direct-issue code into a single helper
interface that handles all the cases.

The 2nd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 3rd patch replaces and kills blk_mq_request_issue_directly.

V12:
 - remove the original 1st patch
 - rebase other 3 patches on newest for-4.21/block
 - add some comment

V11:
 - insert the request into the dispatch list when .queue_rq returns
   BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4)
 - stop attempting direct issue and insert the rest when a non-OK
   result is met (3/4).

V10:
 - address Jens' comment.
 - let blk_mq_try_issue_directly return actual result for case
   'bypass == false'. (2/4)
 - use return value of blk_mq_try_issue_directly to identify
   whether the request is direct-issued successfully. (3/4)

V9:
 - rebase on recent for-4.21/block
 - add 1st patch

V8:
 - drop the original 2nd patch, which tried to insert requests into
   hctx->dispatch if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch, which tried to ensure hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - keep the unused_cookie in 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return based on the return
   value of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.


Jianchao Wang (3)
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in
blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++--
block/blk-mq.c   | 123 +--
block/blk-mq.h   |   6 ++-
4 files changed, 70 insertions(+), 71 deletions(-)

Thanks
Jianchao

[PATCH V12 1/3] blk-mq: refactor the code of issue request directly

2018-12-09 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces used to issue requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling, so its code can be cleaned up.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 104 ++---
 1 file changed, 55 insertions(+), 49 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b645275..261ff6d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1794,78 +1794,84 @@ static blk_status_t __blk_mq_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert, bool last)
+   bool bypass, bool last)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
+   bool force = false;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie, last);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_request_bypass_insert(rq, run_queue);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
-
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_request_bypass_insert(rq, true);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
-
+   /*
+* Always add a request that has been through
+*.queue_rq() to the hardware dispatch list.
+*/
+   force = true;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
+out_unlock:
hctx_unlock(hctx, srcu_idx);
+out:
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   /*
+* We have to return BLK_STS_OK for the DM
+* to avoid livelock. Otherwise, we return
+* the real result to indicate whether the
+* request is direct-issued successfully.
+*/
+   ret = bypass ? BLK_STS_OK : ret;
+   } else if (!bypass) {
+   blk_mq_sched_insert_request(rq, false,
+   run_queue, false);
+   }
+   break;
+   default:
+   if (!bypass)
+   blk_mq_end_request(rq, ret);
+   break;
+   }
+
+   return ret;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
 {
-   blk_status_t ret;
-   int srcu_idx;
-   blk_qc_t unused_cookie

[PATCH V12 3/3] blk-mq: replace and kill blk_mq_request_issue_directly

2018-12-09 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 6 --
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad59102..c92d866 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct 
request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request 
*rq)
 {
+   blk_qc_t unused;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct 
request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq, true);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c663102..999e811 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1794,7 +1794,7 @@ static blk_status_t __blk_mq_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass, bool last)
@@ -1867,13 +1867,6 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
-{
-   blk_qc_t unused;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a664ea4..b81e619 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool 
run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass, bool last);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V12 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-09 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them completely. Remove the blk_rq_can_direct_dispatch check,
because blk_mq_try_issue_directly handles that case well. If a request
fails to be issued directly, insert the rest.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 20 +---
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f096d898..5b4d52d 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx 
*hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 261ff6d..c663102 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1877,22 +1877,20 @@ blk_status_t blk_mq_request_issue_directly(struct 
request *rq, bool last)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused;
+   blk_status_t ret = BLK_STS_OK;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq, list_empty(list));
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   blk_mq_request_bypass_insert(rq,
+   if (ret == BLK_STS_OK)
+   ret = blk_mq_try_issue_directly(hctx, rq, &unused,
+   false,
list_empty(list));
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   else
+   blk_mq_sched_insert_request(rq, false, true, false);
}
 
/*
@@ -1900,7 +1898,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx 
*hctx,
 * the driver there was more coming, but that turned out to
 * be a lie.
 */
-   if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
+   if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
-- 
2.7.4

[PATCH V11 4/4] blk-mq: replace and kill blk_mq_request_issue_directly

2018-12-06 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 6 --
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad59102..c92d866 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct 
request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request 
*rq)
 {
+   blk_qc_t unused;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct 
request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq, true);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3265e30..32102d0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1815,7 +1815,7 @@ static bool blk_rq_can_direct_dispatch(struct request *rq)
return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass, bool last)
@@ -1894,13 +1894,6 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
-{
-   blk_qc_t unused;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a664ea4..b81e619 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool 
run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass, bool last);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V11 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-06 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them completely. Remove the blk_rq_can_direct_dispatch check,
because blk_mq_try_issue_directly handles that case itself. If a
request fails to be issued directly, insert the rest.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 29 ++---
 2 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f096d898..5b4d52d 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx 
*hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 88ee447..3265e30 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1904,32 +1904,23 @@ blk_status_t blk_mq_request_issue_directly(struct 
request *rq, bool last)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused;
+   blk_status_t ret = BLK_STS_OK;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
-   if (!blk_rq_can_direct_dispatch(rq))
-   break;
-
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq, list_empty(list));
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   if (ret == BLK_STS_OK)
+   ret = blk_mq_try_issue_directly(hctx, rq, &unused,
+   false,
+   list_empty(list));
+   else
+   blk_mq_sched_insert_request(rq, false, true, false);
}
 
-   /*
-* If we didn't flush the entire list, we could have told
-* the driver there was more coming, but that turned out to
-* be a lie.
-*/
-   if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
+   if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
-- 
2.7.4
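
The loop in this version of blk_mq_try_issue_list_directly can be
modelled in user space. The following is only a sketch under stated
assumptions: the sim_* names are hypothetical, driver_result[] stands in
for what .queue_rq would return for each request, only OK and RESOURCE
results are modelled, and the commit_rqs hook is assumed to exist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BLK_STS_OK and BLK_STS_RESOURCE. */
enum sim_status { SIM_STS_OK, SIM_STS_RESOURCE };

struct sim_list_stats {
	size_t issued;		/* requests the simulated driver accepted */
	size_t inserted;	/* requests that were queued instead */
	bool commit_rqs;	/* whether ->commit_rqs would be invoked */
};

/*
 * Model of the V11 loop: issue directly while every result is OK.
 * After the first failure, the remaining requests are only inserted.
 */
struct sim_list_stats sim_issue_list(const enum sim_status *driver_result,
				     size_t n)
{
	struct sim_list_stats st = { 0, 0, false };
	enum sim_status ret = SIM_STS_OK;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ret == SIM_STS_OK) {
			ret = driver_result[i];
			if (ret == SIM_STS_OK)
				st.issued++;
			else
				st.inserted++;	/* bypass == false: the helper inserts it */
		} else {
			st.inserted++;	/* models blk_mq_sched_insert_request() */
		}
	}

	/* The driver was told more was coming, but that did not happen. */
	st.commit_rqs = (ret != SIM_STS_OK);
	return st;
}
```

With results { OK, RESOURCE, OK, OK } the model issues one request,
inserts three and reports that commit_rqs would run. The third and
fourth requests are never offered to the driver.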

[PATCH V11 2/4] blk-mq: refactor the code of issue request directly

2018-12-06 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces for issuing requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling and the code can be cleaned up.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 120 -
 1 file changed, 59 insertions(+), 61 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 01802bf..88ee447 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1815,92 +1815,90 @@ static bool blk_rq_can_direct_dispatch(struct request 
*rq)
return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert, bool last)
+   bool bypass, bool last)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
bool force = false;
 
+   if (!blk_rq_can_direct_dispatch(rq)) {
+   /*
+* Insert request to hctx dispatch list and return
+* BLK_STS_OK for 'bypass == true' case, otherwise,
+* the caller will fail forever.
+*/
+   force = bypass;
+   goto out;
+   }
+
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
-
-   if (!blk_rq_can_direct_dispatch(rq)) {
-   /*
-* For 'bypass_insert == true' case, insert request into hctx
-* dispatch list.
-*/
-   force = bypass_insert;
-   goto insert;
-   }
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie, last);
-insert:
-   if (force) {
-   blk_mq_request_bypass_insert(rq, run_queue);
-   return BLK_STS_OK;
-   } else if (bypass_insert) {
-   return BLK_STS_RESOURCE;
+   /*
+* If the request is issued unsuccessfully with
+* BLK_STS_DEV_RESOURCE or BLK_STS_RESOURCE, insert
+* the request to hctx dispatch list due to attached
+* lldd resource.
+*/
+   force = true;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
+out:
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   ret = bypass ? BLK_STS_OK : ret;
+   } else if (!bypass) {
+   blk_mq_sched_insert_request(rq, false,
+   run_queue, false);
+   }
+   break;
+   default:
+   if (!bypass)
+   blk_mq_end_request(rq, ret);
+   break;
}
 
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu

[PATCH V11 0/4] blk-mq: refactor code of issue directly

2018-12-06 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.

It refactors the direct-issue code to unify the interface
and make the code clearer and more readable.

This patch set is rebased on the recent for-4.21/block and adds the 1st
patch, which inserts non-read-write requests into the hctx dispatch
list, bypassing merging and the io scheduler, when bypass_insert
is true. Otherwise the insert is skipped, BLK_STS_RESOURCE is returned
and the caller will fail forever.

The 2nd patch refactors the direct-issue code into a single
helper interface that handles all the cases.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 4th patch replaces and kills blk_mq_request_issue_directly.

V11:
 - insert request to dispatch list when .queue_rq returns
   BLK_STS_RESOURCE/BLK_STS_DEV_RESOURCE. (2/4)
 - stop attempting direct-issue and insert the rest when a
   non-OK result is met. (3/4)

V10:
 - address Jens' comment.
 - let blk_mq_try_issue_directly return the actual result for the
   'bypass == false' case. (2/4)
 - use the return value of blk_mq_try_issue_directly to identify
   whether the request was direct-issued successfully. (3/4)

V9:
 - rebase on recent for-4.21/block
 - add 1st patch

V8:
 - drop the original 2nd patch, which tried to insert requests into
   hctx->dispatch if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch, which tried to ensure the hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - keep the unused_cookie in the 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, currently the 1st and 2nd
 - rename mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return based on the return
   value of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.


Jianchao Wang (4)
blk-mq: insert to hctx dispatch list when
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in
blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++-
block/blk-mq.c   | 137 +--
block/blk-mq.h   |   6 ++-
4 files changed, 76 insertions(+), 79 deletions(-)

Thanks
Jianchao
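
The way the unified helper in the 2nd patch disposes of a request can be
summarised as a small decision function. This is a user-space sketch of
the final switch statement in the V11 2/4 diff, not kernel code: the
sim_* names are hypothetical, and 'force' and 'bypass' are taken as plain
inputs rather than derived from queue state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's blk_status_t values. */
enum sim_status {
	SIM_STS_OK,
	SIM_STS_RESOURCE,
	SIM_STS_DEV_RESOURCE,
	SIM_STS_IOERR,
};

/* What the helper does with the request before returning. */
enum sim_action {
	SIM_ACT_NONE,		/* request is left to the caller or driver */
	SIM_ACT_BYPASS_INSERT,	/* models blk_mq_request_bypass_insert() */
	SIM_ACT_SCHED_INSERT,	/* models blk_mq_sched_insert_request() */
	SIM_ACT_END,		/* models blk_mq_end_request() */
};

struct sim_outcome {
	enum sim_action action;
	enum sim_status status;	/* value returned to the caller */
};

/* Mirrors the switch (ret) block at the end of the merged helper. */
struct sim_outcome sim_finish_issue(enum sim_status ret, bool force,
				    bool bypass)
{
	struct sim_outcome out = { SIM_ACT_NONE, ret };

	switch (ret) {
	case SIM_STS_OK:
		break;
	case SIM_STS_DEV_RESOURCE:
	case SIM_STS_RESOURCE:
		if (force) {
			out.action = SIM_ACT_BYPASS_INSERT;
			if (bypass)
				out.status = SIM_STS_OK;
		} else if (!bypass) {
			out.action = SIM_ACT_SCHED_INSERT;
		}
		break;
	default:
		if (!bypass)
			out.action = SIM_ACT_END;
		break;
	}
	return out;
}
```

In this model a 'bypass == true' caller only ever sees the request taken
over (status OK) or handed back untouched, while a 'bypass == false'
caller never has to insert or end the request itself.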

[PATCH V11 1/4] blk-mq: insert to hctx dispatch list when bypass_insert is true

2018-12-06 Thread Jianchao Wang

We don't allow direct dispatch of anything but regular reads/writes
and insert all non-read-write requests. However, this is not
correct for the 'bypass_insert == true' case, where the insert is skipped
and BLK_STS_RESOURCE is returned. The caller will fail forever.

Fix it by inserting the non-read-write request into the hctx dispatch
list, which avoids involving merging and the io scheduler, when
bypass_insert is true.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9005505..01802bf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1822,6 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   bool force = false;
 
/*
 * RCU or SRCU read lock is needed before checking quiesced flag.
@@ -1836,9 +1837,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
goto insert;
}
 
-   if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert))
+   if (q->elevator && !bypass_insert)
goto insert;
 
+   if (!blk_rq_can_direct_dispatch(rq)) {
+   /*
+* For 'bypass_insert == true' case, insert request into hctx
+* dispatch list.
+*/
+   force = bypass_insert;
+   goto insert;
+   }
+
if (!blk_mq_get_dispatch_budget(hctx))
goto insert;
 
@@ -1849,8 +1859,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 
return __blk_mq_issue_directly(hctx, rq, cookie, last);
 insert:
-   if (bypass_insert)
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   return BLK_STS_OK;
+   } else if (bypass_insert) {
return BLK_STS_RESOURCE;
+   }
 
blk_mq_sched_insert_request(rq, false, run_queue, false);
return BLK_STS_OK;
-- 
2.7.4
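
The effect of the new 'force' flag on the 'insert:' path can be checked
with a small user-space model. It is a sketch only: the sim_* names are
hypothetical and the two booleans are passed in directly instead of being
derived from the request and queue.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BLK_STS_OK and BLK_STS_RESOURCE. */
enum sim_status { SIM_STS_OK, SIM_STS_RESOURCE };

/* What the fall-back path does with a request it did not issue. */
enum sim_action {
	SIM_BYPASS_INSERT,	/* models blk_mq_request_bypass_insert() */
	SIM_RETURN_TO_CALLER,	/* nothing inserted, caller keeps the request */
	SIM_SCHED_INSERT,	/* models blk_mq_sched_insert_request() */
};

struct sim_result {
	enum sim_action action;
	enum sim_status status;
};

/*
 * Mirrors the 'insert:' label after the patch: 'force' takes priority
 * over 'bypass_insert', and only the middle case reports a resource error.
 */
struct sim_result sim_insert_path(bool force, bool bypass_insert)
{
	struct sim_result res;

	if (force) {
		res.action = SIM_BYPASS_INSERT;
		res.status = SIM_STS_OK;
	} else if (bypass_insert) {
		res.action = SIM_RETURN_TO_CALLER;
		res.status = SIM_STS_RESOURCE;
	} else {
		res.action = SIM_SCHED_INSERT;
		res.status = SIM_STS_OK;
	}
	return res;
}

/* For a non-read/write request the patch sets force = bypass_insert. */
struct sim_result sim_non_rw_request(bool bypass_insert)
{
	return sim_insert_path(bypass_insert, bypass_insert);
}
```

Before the patch, a non-read/write request with bypass_insert set would
have taken the middle branch on every attempt. In the model it now lands
on the dispatch list and the caller sees OK.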

[PATCH V10 4/4] blk-mq: replace and kill blk_mq_request_issue_directly

2018-12-05 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 6 --
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad59102..c92d866 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct 
request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request 
*rq)
 {
+   blk_qc_t unused;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct 
request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq, true);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index dd07fe1..3ae5095 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1815,7 +1815,7 @@ static bool blk_rq_can_direct_dispatch(struct request *rq)
return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass, bool last)
@@ -1886,13 +1886,6 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
-{
-   blk_qc_t unused;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a664ea4..b81e619 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool 
run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass, bool last);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V10 1/4] blk-mq: insert to hctx dispatch list when bypass_insert is true

2018-12-05 Thread Jianchao Wang

We don't allow direct dispatch of anything but regular reads/writes
and insert all non-read-write requests. However, this is not
correct for the 'bypass_insert == true' case, where the insert is skipped
and BLK_STS_RESOURCE is returned. The caller will fail forever.

Fix it by inserting the non-read-write request into the hctx dispatch
list, which avoids involving merging and the io scheduler, when
bypass_insert is true.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9005505..01802bf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1822,6 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   bool force = false;
 
/*
 * RCU or SRCU read lock is needed before checking quiesced flag.
@@ -1836,9 +1837,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
goto insert;
}
 
-   if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert))
+   if (q->elevator && !bypass_insert)
goto insert;
 
+   if (!blk_rq_can_direct_dispatch(rq)) {
+   /*
+* For 'bypass_insert == true' case, insert request into hctx
+* dispatch list.
+*/
+   force = bypass_insert;
+   goto insert;
+   }
+
if (!blk_mq_get_dispatch_budget(hctx))
goto insert;
 
@@ -1849,8 +1859,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 
return __blk_mq_issue_directly(hctx, rq, cookie, last);
 insert:
-   if (bypass_insert)
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   return BLK_STS_OK;
+   } else if (bypass_insert) {
return BLK_STS_RESOURCE;
+   }
 
blk_mq_sched_insert_request(rq, false, run_queue, false);
return BLK_STS_OK;
-- 
2.7.4

[PATCH V10 0/4] blk-mq: refactor code of issue directly

2018-12-05 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.

It refactors the direct-issue code to unify the interface
and make the code clearer and more readable.

This patch set is rebased on the recent for-4.21/block and adds the 1st
patch, which inserts non-read-write requests into the hctx dispatch
list, bypassing merging and the io scheduler, when bypass_insert
is true. Otherwise the insert is skipped, BLK_STS_RESOURCE is returned
and the caller will fail forever.

The 2nd patch refactors the direct-issue code into a single
helper interface that handles all the cases.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 4th patch replaces and kills blk_mq_request_issue_directly.

V10:
 - address Jens' comment.
 - let blk_mq_try_issue_directly return the actual result for the
   'bypass == false' case. (2/4)
 - use the return value of blk_mq_try_issue_directly to identify
   whether the request was direct-issued successfully. (3/4)

V9:
 - rebase on recent for-4.21/block
 - add 1st patch

V8:
 - drop the original 2nd patch, which tried to insert requests into
   hctx->dispatch if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch, which tried to ensure the hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - keep the unused_cookie in the 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, currently the 1st and 2nd
 - rename mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return based on the return
   value of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.


Jianchao Wang (4)
blk-mq: insert to hctx dispatch list when
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in
blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++--
block/blk-mq.c   | 127 +++
block/blk-mq.h   |   6 ++-
4 files changed, 68 insertions(+), 77 deletions(-)


Thanks
Jianchao

[PATCH V10 2/4] blk-mq: refactor the code of issue request directly

2018-12-05 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces for issuing requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling and the code can be cleaned up.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 112 ++---
 1 file changed, 51 insertions(+), 61 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 01802bf..a1cccdd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1815,92 +1815,82 @@ static bool blk_rq_can_direct_dispatch(struct request 
*rq)
return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert, bool last)
+   bool bypass, bool last)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
bool force = false;
 
+   if (!blk_rq_can_direct_dispatch(rq)) {
+   /*
+* Insert request to hctx dispatch list and return
+* BLK_STS_OK for 'bypass == true' case, otherwise,
+* the caller will fail forever.
+*/
+   force = bypass;
+   goto out;
+   }
+
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
-
-   if (!blk_rq_can_direct_dispatch(rq)) {
-   /*
-* For 'bypass_insert == true' case, insert request into hctx
-* dispatch list.
-*/
-   force = bypass_insert;
-   goto insert;
-   }
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie, last);
-insert:
-   if (force) {
-   blk_mq_request_bypass_insert(rq, run_queue);
-   return BLK_STS_OK;
-   } else if (bypass_insert) {
-   return BLK_STS_RESOURCE;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
+out:
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   ret = BLK_STS_OK;
+   } else if (!bypass) {
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   }
+   break;
+   default:
+   if (!bypass)
+   blk_mq_end_request(rq, ret);
+   break;
}
 
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
-
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, fals

[PATCH V10 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-05 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them completely. Remove the blk_rq_can_direct_dispatch check,
because blk_mq_try_issue_directly handles that case itself.

With respect to the commit_rqs hook, we only need to care about the last
request's result. If it is inserted, invoke commit_rqs.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 26 +-
 2 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f096d898..5b4d52d 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx 
*hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a1cccdd..dd07fe1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1896,32 +1896,24 @@ blk_status_t blk_mq_request_issue_directly(struct 
request *rq, bool last)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused;
+   blk_status_t ret = BLK_STS_OK;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
-   if (!blk_rq_can_direct_dispatch(rq))
-   break;
-
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq, list_empty(list));
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   ret = blk_mq_try_issue_directly(hctx, rq, &unused, false,
+   list_empty(list));
}
 
/*
-* If we didn't flush the entire list, we could have told
-* the driver there was more coming, but that turned out to
-* be a lie.
+* We only need to care about the last request's result,
+* if it is inserted, kick the hardware with commit_rqs hook.
 */
-   if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
+   if ((ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) &&
+   hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
-- 
2.7.4
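
In this version every request on the list is still offered for direct
issue, and only the result of the last one decides whether commit_rqs
runs. A user-space sketch of that rule follows. The sim_* names are
hypothetical, driver_result[] stands in for the .queue_rq return values,
and the commit_rqs hook is assumed to exist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's blk_status_t values. */
enum sim_status {
	SIM_STS_OK,
	SIM_STS_RESOURCE,
	SIM_STS_DEV_RESOURCE,
	SIM_STS_IOERR,
};

struct sim_v10_stats {
	size_t issued;		/* accepted by the simulated driver */
	size_t inserted;	/* queued by the helper (bypass == false) */
	size_t ended;		/* completed with an error by the helper */
	bool commit_rqs;	/* whether ->commit_rqs would be invoked */
};

/* Model of the V10 loop: try each request, remember the last result. */
struct sim_v10_stats sim_issue_list_v10(const enum sim_status *driver_result,
					size_t n)
{
	struct sim_v10_stats st = { 0, 0, 0, false };
	enum sim_status ret = SIM_STS_OK;
	size_t i;

	for (i = 0; i < n; i++) {
		ret = driver_result[i];
		if (ret == SIM_STS_OK)
			st.issued++;
		else if (ret == SIM_STS_RESOURCE || ret == SIM_STS_DEV_RESOURCE)
			st.inserted++;
		else
			st.ended++;
	}

	/* Only the last request's result matters for the kick. */
	st.commit_rqs = (ret == SIM_STS_RESOURCE ||
			 ret == SIM_STS_DEV_RESOURCE);
	return st;
}
```

With results { RESOURCE, OK } the model inserts the first request, issues
the second and does not call commit_rqs. With { OK, RESOURCE } it does.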

[PATCH V9 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-04 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them completely. Remove the blk_rq_can_direct_dispatch check,
because blk_mq_try_issue_directly handles that case itself.

With respect to the commit_rqs hook, we only need to care about the last
request's result. If it is inserted, invoke commit_rqs. We identify
the actual result of blk_mq_try_issue_directly by the cookie it outputs.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c  |  8 +++-
 block/blk-mq.c| 25 -
 include/linux/blk_types.h |  1 +
 3 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index f096d898..5b4d52d 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -417,12 +417,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx 
*hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index fe92e52..0dfa269 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1899,32 +1899,23 @@ blk_status_t blk_mq_request_issue_directly(struct 
request *rq, bool last)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t cookie = BLK_QC_T_INVALID;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
-   if (!blk_rq_can_direct_dispatch(rq))
-   break;
-
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq, list_empty(list));
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   blk_mq_try_issue_directly(hctx, rq, &cookie, false,
+   list_empty(list));
}
 
/*
-* If we didn't flush the entire list, we could have told
-* the driver there was more coming, but that turned out to
-* be a lie.
+* cookie is set to a valid value only when request is issued successfully.
+* We only need to care about the last request's result, if it is inserted,
+* kick the hardware with commit_rqs hook.
 */
-   if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
+   if ((cookie == BLK_QC_T_INVALID) && hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index c0ba1a0..a0a467a41 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -414,6 +414,7 @@ static inline int op_stat_group(unsigned int op)
 }
 
 typedef unsigned int blk_qc_t;
+#define BLK_QC_T_INVALID   -2U
 #define BLK_QC_T_NONE  -1U
 #define BLK_QC_T_SHIFT 16
 #define BLK_QC_T_INTERNAL  (1U << 31)
-- 
2.7.4

[PATCH V9 0/4] blk-mq: refactor code of issue directly

2018-12-04 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.

It refactors the direct-issue code to unify the interface
and make the code clearer and more readable.

This patch set is rebased on the recent for-4.21/block and adds the 1st
patch, which inserts non-read-write requests into the hctx dispatch
list, bypassing merging and the io scheduler, when bypass_insert
is true. Otherwise the insert is skipped, BLK_STS_RESOURCE is returned
and the caller will fail forever.

The 2nd patch refactors the direct-issue code into a single
helper interface that handles all the cases.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 4th patch replaces and kills blk_mq_request_issue_directly.

V9:
 - rebase on recent for-4.21/block
 - add 1st patch

V8:
 - drop the original 2nd patch, which tried to insert requests into
   hctx->dispatch if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch, which tried to ensure the hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - reserve the unused_cookie in 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return, based on the return
   value of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.


Jianchao Wang (4)
blk-mq: insert to hctx dispatch list when
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in
blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c  |   4 +-
block/blk-mq-sched.c  |   8 ++-
block/blk-mq.c| 129 +-
block/blk-mq.h|   6 ++-
include/linux/blk_types.h |   1 +
5 files changed, 71 insertions(+), 77 deletions(-)

Thanks
Jianchao

[PATCH V9 4/4] blk-mq: replace and kill blk_mq_request_issue_directly

2018-12-04 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 6 --
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad59102..c92d866 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1297,6 +1297,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
+   blk_qc_t unused;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1312,7 +1314,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq, true);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0dfa269..9d5c949 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1815,7 +1815,7 @@ static bool blk_rq_can_direct_dispatch(struct request *rq)
return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass, bool last)
@@ -1889,13 +1889,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
-{
-   blk_qc_t unused;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, last);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a664ea4..b81e619 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -68,8 +68,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass, bool last);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V9 2/4] blk-mq: refactor the code of issue request directly

2018-12-04 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces that issue requests
directly. The merged interface takes full ownership of the request:
it inserts it, ends it or does nothing, based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling, and the code can be cleaned up.
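
As a rough illustration of the result handling described above, here is a
minimal user-space sketch in C. It is not kernel code: the enums,
struct outcome and finish_issue() are hypothetical names used only for
this model, and 'force' stands for the flag the diff below sets for
requests that must go to the hctx dispatch list.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's blk_status_t values. */
enum status { STS_OK, STS_RESOURCE, STS_DEV_RESOURCE, STS_IOERR };

/* What the merged helper does with the request after .queue_rq returns. */
enum action { ACT_NONE, ACT_DISPATCH_INSERT, ACT_SCHED_INSERT, ACT_END };

struct outcome {
    enum status ret;    /* status handed back to the caller */
    enum action action; /* what happened to the request */
};

/* Models only the final switch statement of the merged helper. */
struct outcome finish_issue(enum status ret, bool bypass, bool force)
{
    struct outcome out = { ret, ACT_NONE };

    switch (ret) {
    case STS_OK:
        break;
    case STS_DEV_RESOURCE:
    case STS_RESOURCE:
        if (force) {
            out.action = ACT_DISPATCH_INSERT; /* hctx dispatch list */
            out.ret = STS_OK;
        } else if (!bypass) {
            out.action = ACT_SCHED_INSERT;    /* via the scheduler path */
            out.ret = STS_OK;
        }
        break;
    default:
        if (!bypass) {
            out.action = ACT_END;             /* complete with the error */
            out.ret = STS_OK;
        }
        break;
    }
    return out;
}

/* Example checks for the model above. */
void check_finish_issue(void)
{
    assert(finish_issue(STS_RESOURCE, false, false).action == ACT_SCHED_INSERT);
    assert(finish_issue(STS_RESOURCE, true, false).ret == STS_RESOURCE);
    assert(finish_issue(STS_IOERR, false, false).action == ACT_END);
}
```

In this model a caller passing bypass false always gets STS_OK back,
because the helper has already inserted or ended the request. A caller
passing bypass true, without force, gets the original status and still
owns the request.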

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 116 +++--
 1 file changed, 54 insertions(+), 62 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 153af90..fe92e52 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1815,93 +1815,85 @@ static bool blk_rq_can_direct_dispatch(struct request *rq)
return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert, bool last)
+   bool bypass, bool last)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
bool force = false;
 
+   if (!blk_rq_can_direct_dispatch(rq)) {
+   /*
+* Insert request to hctx dispatch list for 'bypass == true'
+* case, otherwise, the caller will fail forever.
+*/
+   if (bypass)
+   force = true;
+   goto out;
+   }
+
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
-
-   if (!blk_rq_can_direct_dispatch(rq)) {
-   /*
-* For 'bypass_insert == true' case, insert request into hctx
-* dispatch list.
-*/
-   if (bypass_insert)
-   force = true;
-   goto insert;
-   }
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie, last);
-insert:
-   if (force) {
-   blk_mq_request_bypass_insert(rq, run_queue);
-   return BLK_STS_OK;
-   } else if (bypass_insert) {
-   return BLK_STS_RESOURCE;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
+out:
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   ret = BLK_STS_OK;
+   } else if (!bypass) {
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   ret = BLK_STS_OK;
+   }
+   break;
+   default:
+   if (!bypass) {
+   blk_mq_end_request(rq, ret);
+   ret = BLK_STS_OK;
+   }
+   break;
}
 
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
-
-   ret = __blk_mq_try_issue_directly(hct

[PATCH V9 1/4] blk-mq: insert to hctx dispatch list when bypass_insert is true

2018-12-04 Thread Jianchao Wang

We don't allow direct dispatch of anything but regular reads/writes,
and insert all non-read-write requests instead. However, this is not
correct for the 'bypass_insert == true' case, where the insert is
skipped and BLK_STS_RESOURCE is returned. The caller will fail forever.

Fix it by inserting the non-read-write request into the hctx dispatch
list when bypass_insert is true, which avoids involving merging and
the io scheduler.
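
To make the before/after behaviour concrete, here is a small user-space
model in C. It is only a sketch: the enum names and the issue_before()
and issue_after() helpers are hypothetical, and the model ignores the
stopped/quiesced, elevator, budget and driver-tag checks and assumes
the driver accepts every request that is issued.

```c
#include <assert.h>
#include <stdbool.h>

enum op { OP_READ, OP_WRITE, OP_FLUSH, OP_DISCARD };

enum result {
    R_ISSUED,            /* handed to the driver */
    R_SCHED_INSERTED,    /* inserted through the scheduler path */
    R_DISPATCH_INSERTED, /* inserted into the hctx dispatch list */
    R_RESOURCE           /* BLK_STS_RESOURCE returned, nothing inserted */
};

/* Mirrors blk_rq_can_direct_dispatch(): only plain reads and writes. */
bool can_direct_dispatch(enum op op)
{
    return op == OP_READ || op == OP_WRITE;
}

/* Before the patch: with bypass_insert the request is never queued. */
enum result issue_before(enum op op, bool bypass_insert)
{
    if (!can_direct_dispatch(op))
        return bypass_insert ? R_RESOURCE : R_SCHED_INSERTED;
    return R_ISSUED;
}

/* After the patch: with bypass_insert it goes to the dispatch list. */
enum result issue_after(enum op op, bool bypass_insert)
{
    if (!can_direct_dispatch(op))
        return bypass_insert ? R_DISPATCH_INSERTED : R_SCHED_INSERTED;
    return R_ISSUED;
}

/* Example checks for the model above. */
void check_issue_model(void)
{
    assert(issue_before(OP_FLUSH, true) == R_RESOURCE);
    assert(issue_after(OP_FLUSH, true) == R_DISPATCH_INSERTED);
}
```

In the 'before' model a flush issued with bypass_insert true returns
R_RESOURCE on every attempt, since nothing about the request changes
between attempts. That is the "fail forever" case the patch removes.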

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9005505..153af90 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1822,6 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   bool force = false;
 
/*
 * RCU or SRCU read lock is needed before checking quiesced flag.
@@ -1836,9 +1837,19 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
goto insert;
}
 
-   if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert))
+   if (q->elevator && !bypass_insert)
goto insert;
 
+   if (!blk_rq_can_direct_dispatch(rq)) {
+   /*
+* For 'bypass_insert == true' case, insert request into hctx
+* dispatch list.
+*/
+   if (bypass_insert)
+   force = true;
+   goto insert;
+   }
+
if (!blk_mq_get_dispatch_budget(hctx))
goto insert;
 
@@ -1849,8 +1860,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
return __blk_mq_issue_directly(hctx, rq, cookie, last);
 insert:
-   if (bypass_insert)
+   if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   return BLK_STS_OK;
+   } else if (bypass_insert) {
return BLK_STS_RESOURCE;
+   }
 
blk_mq_sched_insert_request(rq, false, run_queue, false);
return BLK_STS_OK;
-- 
2.7.4

[PATCH V8 0/3] blk-mq: refactor code of issue request directly

2018-11-14 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.
It refactors the code that issues requests directly, to unify the
interface and make the code clearer and more readable.

The 1st patch refactors the code that issues requests directly, to unify
the helper interface so that it can handle all the cases.

The 2nd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests.

The 3rd patch replaces and kills blk_mq_request_issue_directly.

V8:
 - drop the original 2nd patch, which tried to insert requests into
   hctx->dispatch if quiesced or stopped.
 - remove two wrong 'unlikely'

V7:
 - drop the original 3rd patch, which tried to ensure the hctx is run on
   its mapped cpu in the direct-issue path.

V6:
 - drop original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patch based on the new 1st patch.
 - reserve the unused_cookie in 4th and 5th patch

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one line blk_mq_request_bypass_insert
 - comment changes

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return, based on the return
   value of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.

Jianchao Wang(3)
  blk-mq: refactor the code of issue request directly
  blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
  blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++--
block/blk-mq.c   | 103 +--
block/blk-mq.h   |   7 ++--
4 files changed, 53 insertions(+), 69 deletions(-)

Thanks
Jianchao

[PATCH V8 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-11-14 Thread Jianchao Wang

It is not necessary for blk_mq_sched_insert_requests to issue requests
directly with bypass 'true' and handle the non-issued requests itself.
Just set bypass to 'false' and let blk_mq_try_issue_directly handle
them completely.
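
The effect on the list loop can be sketched with a small user-space
model in C. The names struct tally, drain_before() and drain_after()
are hypothetical; accepts[i] == false stands for the driver returning
BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE for request i, and hard errors
are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

struct tally {
    int issued;   /* requests the driver accepted */
    int inserted; /* requests queued for a later dispatch */
};

/* Old caller: stop at the first busy request, then insert the remainder. */
struct tally drain_before(const bool *accepts, int n)
{
    struct tally t = { 0, 0 };
    int i;

    for (i = 0; i < n; i++) {
        if (!accepts[i])
            break;      /* busy: the request goes back on the list */
        t.issued++;
    }
    t.inserted = n - i; /* the caller inserts whatever is left */
    return t;
}

/* New caller: the helper owns every request, so the loop never breaks. */
struct tally drain_after(const bool *accepts, int n)
{
    struct tally t = { 0, 0 };
    int i;

    for (i = 0; i < n; i++) {
        if (accepts[i])
            t.issued++;
        else
            t.inserted++; /* the helper inserted it on our behalf */
    }
    return t;
}

/* Example checks for the model above. */
void check_drain_model(void)
{
    const bool accepts[] = { true, false, true };

    assert(drain_before(accepts, 3).issued == 1);
    assert(drain_before(accepts, 3).inserted == 2);
    assert(drain_after(accepts, 3).issued == 2);
    assert(drain_after(accepts, 3).inserted == 1);
}
```

Under these assumptions the list is always empty when the new loop
finishes, which is why the caller no longer needs the list_empty()
check or the fallback insert. Note that the model also shows the loop
continuing to try later requests after one busy result.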

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 13 +++--
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 66fda19..9af57c8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1b57449..c8566f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1840,21 +1840,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused_cookie;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false);
}
 }
 
-- 
2.7.4

[PATCH V8 1/3] blk-mq: refactor the code of issue request directly

2018-11-14 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces that issue requests
directly. The merged interface takes full ownership of the request:
it inserts it, ends it or does nothing, based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 89 --
 1 file changed, 43 insertions(+), 46 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 411be60..1b57449 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert)
+   bool bypass)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   /*
+* Bypass the potential scheduler on the bottom device.
+*/
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
 
-   hctx_lock(hctx, &srcu_idx);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (!bypass) {
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   ret = BLK_STS_OK;
+   }
+   break;
+   default:
+   if (!bypass) {
+   blk_mq_end_request(rq, ret);
+   ret = BLK_STS_OK;
+   }
+   break;
+   }
 
-   hctx_unlock(hctx, srcu_idx);
+   return ret;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
-   blk_status_t ret;
-   int srcu_idx;
blk_qc_t unused_cookie;
-   struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
-   hctx_lock(hctx, &srcu_idx);
-   ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
-   hctx_unlock(hctx, srcu_idx);
-
-   return ret;
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
 }
 
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
@@ -1958,13 +1955,13 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
if (same_queue_rq) {
da

[PATCH V8 3/3] blk-mq: replace and kill blk_mq_request_issue_directly

2018-11-14 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 7 ---
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdc0ad2..e4eedc7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
+   blk_qc_t unused_cookie;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c8566f9..784ca6b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass)
@@ -1830,13 +1830,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq)
-{
-   blk_qc_t unused_cookie;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index facb6e9..f18c27c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
-
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V7 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-11-14 Thread Jianchao Wang

It is not necessary for blk_mq_sched_insert_requests to issue requests
directly with bypass 'true' and handle the non-issued requests itself.
Just set bypass to 'false' and let blk_mq_try_issue_directly handle
them completely.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 13 +++--
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 66fda19..9af57c8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 11c52bb..049fd47 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1843,21 +1843,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused_cookie;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false);
}
 }
 
-- 
2.7.4

[PATCH V7 1/4] blk-mq: refactor the code of issue request directly

2018-11-14 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces that issue requests
directly. The merged interface takes full ownership of the request:
it inserts it, ends it or does nothing, based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 93 --
 1 file changed, 45 insertions(+), 48 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 411be60..14b4d06 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert)
+   bool bypass)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   /*
+* Bypass the potential scheduler on the bottom device.
+*/
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
-   if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
+   goto out_unlock;
 
-   if (!blk_mq_get_driver_tag(rq)) {
+   if (unlikely(!blk_mq_get_driver_tag(rq))) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
 
-   hctx_lock(hctx, &srcu_idx);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (!bypass) {
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   ret = BLK_STS_OK;
+   }
+   break;
+   default:
+   if (!bypass) {
+   blk_mq_end_request(rq, ret);
+   ret = BLK_STS_OK;
+   }
+   break;
+   }
 
-   hctx_unlock(hctx, srcu_idx);
+   return ret;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
-   blk_status_t ret;
-   int srcu_idx;
blk_qc_t unused_cookie;
-   struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
-   hctx_lock(hctx, &srcu_idx);
-   ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
-   hctx_unlock(hctx, srcu_idx);
-
-   return ret;
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
 }
 
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
@@ -1958,13 +1955,13 @@ static blk_qc_t blk

[PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced

2018-11-14 Thread Jianchao Wang

When trying to issue a request directly, if the queue is stopped or
quiesced, 'bypass' is ignored and BLK_STS_OK is returned to the caller
so that it does not dispatch the request again. The request is then
inserted with blk_mq_sched_insert_request. This is not correct for
the dm-rq case, where we should avoid passing through the underlying
path's io scheduler.

To fix it, use blk_mq_request_bypass_insert to insert the request
into hctx->dispatch when we must not pass through the io scheduler but
have to insert.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 14b4d06..11c52bb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1772,7 +1772,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
bool bypass)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   bool run_queue = true, force = false;
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
 
@@ -1786,7 +1786,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 */
if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass = false;
+   force = true;
goto out_unlock;
}
 
@@ -1817,6 +1817,9 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
if (!bypass) {
blk_mq_sched_insert_request(rq, false, run_queue, false);
ret = BLK_STS_OK;
+   } else if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   ret = BLK_STS_OK;
}
break;
default:
-- 
2.7.4

[PATCH V7 4/4] blk-mq: replace and kill blk_mq_request_issue_directly

2018-11-14 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 7 ---
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdc0ad2..e4eedc7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
+   blk_qc_t unused_cookie;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 049fd47..3fcf2f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass)
@@ -1833,13 +1833,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq)
-{
-   blk_qc_t unused_cookie;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index facb6e9..f18c27c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
-
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V7 0/4] blk-mq: refactor and fix on issue request directly

2018-11-14 Thread Jianchao Wang



Hi Jens

Please consider this patchset for 4.21.
It refactors the direct request issue code to unify the interface
and make the code clearer and more readable, and also fixes a defect
there.

The 1st patch refactors the direct request issue code to unify the
helper interface so that it can handle all the cases.

The 2nd patch fixes the issue that, when the queue is stopped or quiesced,
a request may pass through the bottom device's potential io scheduler.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 4th patch replaces and kills blk_mq_request_issue_directly.

V7:
 - drop the original 3rd patch, which tried to ensure that hctx is run on a
   mapped cpu. It adds get/put_cpu and a cpumask test to the hot path, and
   the guarantee is not necessary for drivers.

V6:
 - drop the original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patches based on the new 1st patch.
 - retain the unused_cookie in the 4th and 5th patches

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one-line blk_mq_request_issue_directly
 - comment changes

V4:
 - split the original patch 1 into two patches, now the 1st and 2nd patches
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the handling of the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return based on the return value
   of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.

Jianchao Wang(4)
  blk-mq: refactor the code of issue request directly
  blk-mq: fix issue directly case when q is stopped or quiesced
  blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
  blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++--
block/blk-mq.c   | 112 ++-
block/blk-mq.h   |   7 ++--
4 files changed, 59 insertions(+), 72 deletions(-)

Thanks
Jianchao

[PATCH V6 1/5] blk-mq: refactor the code of issue request directly

2018-11-13 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces for issuing requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling.
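
As a rough user-space illustration of the behaviour described above (not
the kernel code itself), the tail of the merged helper can be modelled as
a pure function of the issue status and 'bypass'. The names enum sts,
enum action and finish_issue() are made up for this sketch and only stand
in for blk_status_t and the insert/end calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's blk_status_t values. */
enum sts { STS_OK, STS_RESOURCE, STS_DEV_RESOURCE, STS_IOERR };

/* What the merged helper does with the request after the issue attempt. */
enum action { ACT_NONE, ACT_INSERT, ACT_END };

/*
 * Model of the tail of the merged helper: given the status of the issue
 * attempt and 'bypass', record the action taken on the request and
 * return the status the caller sees.
 */
static enum sts finish_issue(enum sts ret, bool bypass, enum action *act)
{
	assert(act != NULL);
	*act = ACT_NONE;

	switch (ret) {
	case STS_OK:
		break;
	case STS_RESOURCE:
	case STS_DEV_RESOURCE:
		if (!bypass) {
			*act = ACT_INSERT;	/* insert the request instead */
			ret = STS_OK;
		}
		break;
	default:
		if (!bypass) {
			*act = ACT_END;		/* end the request with the error */
			ret = STS_OK;
		}
		break;
	}

	return ret;
}
```

With 'bypass' false the caller always sees the OK status, because the
helper has already inserted or ended the request. With 'bypass' true the
raw status is handed back and the request is left untouched.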

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 93 --
 1 file changed, 45 insertions(+), 48 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 411be60..14b4d06 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert)
+   bool bypass)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+   if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   /*
+* Bypass the potential scheduler on the bottom device.
+*/
+   if (unlikely(q->elevator && !bypass))
+   goto out_unlock;
 
-   if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
+   goto out_unlock;
 
-   if (!blk_mq_get_driver_tag(rq)) {
+   if (unlikely(!blk_mq_get_driver_tag(rq))) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
 
-   hctx_lock(hctx, &srcu_idx);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
+   switch (ret) {
+   case BLK_STS_OK:
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   if (!bypass) {
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   ret = BLK_STS_OK;
+   }
+   break;
+   default:
+   if (!bypass) {
+   blk_mq_end_request(rq, ret);
+   ret = BLK_STS_OK;
+   }
+   break;
+   }
 
-   hctx_unlock(hctx, srcu_idx);
+   return ret;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
-   blk_status_t ret;
-   int srcu_idx;
blk_qc_t unused_cookie;
-   struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
-   hctx_lock(hctx, &srcu_idx);
-   ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
-   hctx_unlock(hctx, srcu_idx);
-
-   return ret;
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
 }
 
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
@@ -1958,13 +1955,13 @@ static blk_qc_t blk

[PATCH V6 4/5] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-11-13 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and handle the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them entirely.
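
A small user-space model of why the caller no longer needs the
list_empty() check: with 'bypass' false every request taken off the list
is either issued or inserted by the helper, so the loop always drains the
list. struct req, struct counters, try_issue() and issue_list() are
hypothetical names for this sketch, and the singly linked list is only a
simplification of the kernel's list_head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy request: 'busy' models a driver that reports a resource shortage. */
struct req {
	int id;
	bool busy;
	struct req *next;
};

struct counters {
	int issued;
	int inserted;
};

/* Stand-in for the direct issue helper called with bypass == false. */
static void try_issue(struct req *rq, struct counters *c)
{
	assert(rq != NULL && c != NULL);
	if (rq->busy)
		c->inserted++;	/* the helper inserts the request itself */
	else
		c->issued++;
}

/* Drain the list; nothing is ever put back for the caller to handle. */
static void issue_list(struct req **list, struct counters *c)
{
	while (*list) {
		struct req *rq = *list;

		*list = rq->next;	/* unlink, like list_del_init() */
		rq->next = NULL;
		try_issue(rq, c);
	}
}
```

In this model the list is empty after issue_list() returns regardless of
how many requests hit the busy path.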

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 13 +++--
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 66fda19..9af57c8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 58f15cc..f41a815 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1855,21 +1855,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
+   blk_qc_t unused_cookie;
+
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false);
}
 }
 
-- 
2.7.4

[PATCH V6 3/5] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-11-13 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the cpu where it allocated the request, hctx could be run on
a cpu to which it is not mapped.
To fix this,
 - insert the request forcibly if BLK_MQ_F_BLOCKING is set.
 - check whether the current cpu is mapped to the hctx; if not, insert
   forcibly.
 - invoke __blk_mq_issue_directly with preemption disabled.
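
The first two checks in the list above can be sketched in user space as
follows. This is only a model: struct hctx_model, HCTX_F_BLOCKING and
must_force_insert() are invented for the example, the cpumask is reduced
to a 64-bit bitmap, and the preemption handling (get_cpu/put_cpu) is not
modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for BLK_MQ_F_BLOCKING. */
#define HCTX_F_BLOCKING 0x1u

/* Toy hardware queue context: flags plus a bitmap of mapped cpus. */
struct hctx_model {
	unsigned int flags;
	uint64_t cpumask;	/* bit n set means cpu n is mapped */
};

/*
 * Return true when the request must be inserted instead of being issued
 * directly: either the queue may block, or the current cpu is not one
 * of the cpus mapped to this hctx.
 */
static bool must_force_insert(const struct hctx_model *h, unsigned int cpu)
{
	assert(cpu < 64);

	if (h->flags & HCTX_F_BLOCKING)
		return true;

	return !(h->cpumask & (UINT64_C(1) << cpu));
}
```

For a queue mapped to cpus 0 and 2 (cpumask 0x5), a task now running on
cpu 1 would take the forced insert path in this model.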

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 11c52bb..58f15cc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1776,6 +1776,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
 
+   if (hctx->flags & BLK_MQ_F_BLOCKING) {
+   force = true;
+   goto out;
+   }
+
+   if (unlikely(!cpumask_test_cpu(get_cpu(), hctx->cpumask))) {
+   put_cpu();
+   force = true;
+   goto out;
+   }
+
hctx_lock(hctx, &srcu_idx);
/*
 * hctx_lock is needed before checking quiesced flag.
@@ -1808,7 +1819,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 out_unlock:
hctx_unlock(hctx, srcu_idx);
-
+   put_cpu();
+out:
switch (ret) {
case BLK_STS_OK:
break;
-- 
2.7.4

[PATCH V6 2/5] blk-mq: fix issue directly case when q is stopped or quiesced

2018-11-13 Thread Jianchao Wang

When trying to issue a request directly, if the queue is stopped or
quiesced, 'bypass' is ignored and BLK_STS_OK is returned to the caller
so that it does not dispatch the request again. The request is then
inserted with blk_mq_sched_insert_request. This is not correct
for the dm-rq case, where we should avoid passing through the underlying
path's io scheduler.

To fix it, use blk_mq_request_bypass_insert to insert the request
into hctx->dispatch when we cannot pass through the io scheduler but have
to insert.
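
For the resource-busy case, the choice between the two insert paths
described above can be summarised by a small user-space model. enum
insert_path and pick_insert_path() are hypothetical names for this
sketch. PATH_SCHED stands for blk_mq_sched_insert_request and
PATH_DISPATCH for blk_mq_request_bypass_insert.

```c
#include <assert.h>
#include <stdbool.h>

/* Where a non-issued request ends up. */
enum insert_path {
	PATH_NONE,	/* left to the caller */
	PATH_SCHED,	/* inserted through the io scheduler path */
	PATH_DISPATCH,	/* inserted straight into hctx->dispatch */
};

/*
 * 'bypass' means the caller wants to skip the io scheduler.
 * 'force' means the request has to be inserted anyway, for example
 * because the queue is stopped or quiesced.
 */
static enum insert_path pick_insert_path(bool bypass, bool force)
{
	if (!bypass)
		return PATH_SCHED;
	if (force)
		return PATH_DISPATCH;
	return PATH_NONE;
}
```

The interesting row is bypass == true with force == true: the request is
inserted, but without going through the scheduler.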

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 14b4d06..11c52bb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1772,7 +1772,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
bool bypass)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   bool run_queue = true, force = false;
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
 
@@ -1786,7 +1786,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 */
if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
-   bypass = false;
+   force = true;
goto out_unlock;
}
 
@@ -1817,6 +1817,9 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
if (!bypass) {
blk_mq_sched_insert_request(rq, false, run_queue, false);
ret = BLK_STS_OK;
+   } else if (force) {
+   blk_mq_request_bypass_insert(rq, run_queue);
+   ret = BLK_STS_OK;
}
break;
default:
-- 
2.7.4

[PATCH V6 5/5] blk-mq: replace and kill blk_mq_request_issue_directly

2018-11-13 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 +++-
 block/blk-mq.c   | 9 +
 block/blk-mq.h   | 7 ---
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdc0ad2..e4eedc7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
+   blk_qc_t unused_cookie;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
 
@@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f41a815..b5316c78 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass)
@@ -1845,13 +1845,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq)
-{
-   blk_qc_t unused_cookie;
-
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index facb6e9..f18c27c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
-
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+   struct request *rq,
+   blk_qc_t *cookie,
+   bool bypass);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V6 0/5] blk-mq: refactor and fix on issue request directly

2018-11-13 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.
It refactors the direct request issue code to unify the interface
and make the code clearer and more readable, and also fixes two defects
there.

The 1st patch refactors the direct request issue code to unify the
helper interface so that it can handle all the cases.

The 2nd patch fixes the issue that, when the queue is stopped or quiesced,
a request may pass through the bottom device's potential io scheduler.

The 3rd patch fixes the issue that hctx may be run on a cpu to which it
is not mapped in the direct issue path.

The 4th patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests itself.

The 5th patch replaces and kills blk_mq_request_issue_directly.

V6:
 - drop the original 1st patch to address Jens' comment
 - discard the enum mq_issue_decision and blk_mq_make_decision and use
   BLK_STS_* return values directly to address Jens' comment. (1/5)
 - add 'unlikely' in blk_mq_try_issue_directly (1/5)
 - refactor the 2nd and 3rd patches based on the new 1st patch.
 - retain the unused_cookie in the 4th and 5th patches

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one-line blk_mq_request_issue_directly
 - comment changes

V4:
 - split the original patch 1 into two patches, now the 1st and 2nd patches
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the handling of the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_request is introduced
   to decide whether to insert, end or just return based on the return value
   of .queue_rq and bypass_insert (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.

Jianchao Wang(5)
  blk-mq: refactor the code of issue request directly
  blk-mq: fix issue directly case when q is stopped or quiesced
  blk-mq: ensure hctx to be ran on mapped cpu when issue directly
  blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
  blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   4 +-
block/blk-mq-sched.c |   8 ++--
block/blk-mq.c   | 126 +--
block/blk-mq.h   |   7 +--
4 files changed, 72 insertions(+), 73 deletions(-)

Thanks
Jianchao

[PATCH V5 3/6] blk-mq: fix issue directly case when q is stopped or quiesced

2018-11-12 Thread Jianchao Wang

When trying to issue a request directly, if the queue is stopped or
quiesced, 'bypass' is ignored and BLK_STS_OK is returned to the caller
so that it does not dispatch the request again. The request is then
inserted with blk_mq_sched_insert_request. This is not correct
for the dm-rq case, where we should avoid passing through the underlying
path's io scheduler.

To fix it, add a new mq_issue_decision entry, MQ_ISSUE_INSERT_DISPATCH,
for the above case, where the request needs to be inserted forcibly,
and use blk_mq_request_bypass_insert to insert the request into
hctx->dispatch directly.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 48b7a7c..f54c092 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1768,12 +1768,13 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 enum mq_issue_decision {
MQ_ISSUE_INSERT_QUEUE,
+   MQ_ISSUE_INSERT_DISPATCH,
MQ_ISSUE_END_REQUEST,
MQ_ISSUE_DO_NOTHING,
 };
 
 static inline enum mq_issue_decision
-   blk_mq_make_dicision(blk_status_t ret, bool bypass)
+   blk_mq_make_dicision(blk_status_t ret, bool bypass, bool force)
 {
enum mq_issue_decision dec;
 
@@ -1783,7 +1784,10 @@ static inline enum mq_issue_decision
break;
case BLK_STS_DEV_RESOURCE:
case BLK_STS_RESOURCE:
-   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE;
+   if (force)
+   dec = bypass ? MQ_ISSUE_INSERT_DISPATCH : MQ_ISSUE_INSERT_QUEUE;
+   else
+   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE;
break;
default:
dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST;
@@ -1799,7 +1803,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
bool bypass)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   bool run_queue = true, force = false;
blk_status_t ret = BLK_STS_RESOURCE;
enum mq_issue_decision dec;
int srcu_idx;
@@ -1814,7 +1818,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
-   bypass = false;
+   force = true;
goto out_unlock;
}
 
@@ -1837,11 +1841,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 out_unlock:
hctx_unlock(hctx, srcu_idx);
 
-   dec = blk_mq_make_dicision(ret, bypass);
+   dec = blk_mq_make_dicision(ret, bypass, force);
switch(dec) {
case MQ_ISSUE_INSERT_QUEUE:
blk_mq_sched_insert_request(rq, false, run_queue, false);
break;
+   case MQ_ISSUE_INSERT_DISPATCH:
+   blk_mq_request_bypass_insert(rq, run_queue);
+   break;
case MQ_ISSUE_END_REQUEST:
blk_mq_end_request(rq, ret);
break;
-- 
2.7.4

[PATCH V5 2/6] blk-mq: refactor the code of issue request directly

2018-11-12 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces for issuing requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling.

To make the code clearer, introduce the new helpers enum mq_issue_decision
and blk_mq_make_decision to decide how to handle the non-issued
requests.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 108 +
 1 file changed, 63 insertions(+), 45 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 364a53f..48b7a7c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,77 +1766,95 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+enum mq_issue_decision {
+   MQ_ISSUE_INSERT_QUEUE,
+   MQ_ISSUE_END_REQUEST,
+   MQ_ISSUE_DO_NOTHING,
+};
+
+static inline enum mq_issue_decision
+   blk_mq_make_dicision(blk_status_t ret, bool bypass)
+{
+   enum mq_issue_decision dec;
+
+   switch(ret) {
+   case BLK_STS_OK:
+   dec = MQ_ISSUE_DO_NOTHING;
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE;
+   break;
+   default:
+   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST;
+   break;
+   }
+
+   return dec;
+}
+
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert)
+   bool bypass)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   enum mq_issue_decision dec;
+   int srcu_idx;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   /*
+* Bypass the potential scheduler on the bottom device.
+*/
+   if (q->elevator && !bypass)
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
 
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
+   dec = blk_mq_make_dicision(ret, bypass);
+   switch(dec) {
+   case MQ_ISSUE_INSERT_QUEUE:
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   break;
+   case MQ_ISSUE_END_REQUEST:
blk_mq_end_request(rq, ret);
+   break;
+   default:
+   return ret;
+   }
 
-   hctx_unlock(hctx, srcu_idx);
+   return BLK_STS_OK;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
-   blk_status_t ret;
-   int srcu_idx;
-   st

[PATCH V5 6/6] blk-mq: replace and kill blk_mq_request_issue_directly

2018-11-12 Thread Jianchao Wang

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and remove blk_mq_request_issue_directly
as nobody uses it.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 2 +-
 block/blk-mq.c   | 7 +--
 block/blk-mq.h   | 6 --
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdc0ad2..5cae9ef 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1436,7 +1436,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq);
+   return blk_mq_try_issue_directly(rq->mq_hctx, rq, NULL, true);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f27d547..f016028 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1797,7 +1797,7 @@ static inline enum mq_issue_decision
return dec;
 }
 
-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass)
@@ -1871,11 +1871,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return BLK_STS_OK;
 }
 
-blk_status_t blk_mq_request_issue_directly(struct request *rq)
-{
-   return blk_mq_try_issue_directly(rq->mq_hctx, rq, NULL, true);
-}
-
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index facb6e9..deee3a4 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -62,8 +62,10 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
 
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+  struct request *rq,
+  blk_qc_t *cookie,
+  bool bypass);
 void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
 
-- 
2.7.4

[PATCH V5 1/6] blk-mq: make __blk_mq_issue_directly be able to accept NULL cookie pointer

2018-11-12 Thread Jianchao Wang

Make __blk_mq_issue_directly accept a NULL cookie pointer
and remove the dummy unused_cookie in blk_mq_request_issue_directly.
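
The pattern is the usual optional out-parameter: compute the value in a
local variable and store it only when the caller supplied a pointer. A
minimal user-space sketch, with issue_model() and QC_NONE as made-up
stand-ins for __blk_mq_issue_directly and BLK_QC_T_NONE:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BLK_QC_T_NONE. */
#define QC_NONE (-1)

/*
 * Model of an issue function with an optional cookie out-parameter.
 * 'driver_ok' models the driver accepting the request, and 'tag' is
 * the value a successful issue would report as the cookie.
 */
static bool issue_model(bool driver_ok, int tag, int *cookie)
{
	int new_cookie = QC_NONE;

	if (driver_ok)
		new_cookie = tag;

	/* Callers that do not care about the cookie may pass NULL. */
	if (cookie)
		*cookie = new_cookie;

	return driver_ok;
}
```

Callers that need the cookie pass a pointer as before, and callers that
do not simply pass NULL instead of declaring a dummy variable.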

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 411be60..364a53f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1736,11 +1736,9 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
.rq = rq,
.last = true,
};
-   blk_qc_t new_cookie;
+   blk_qc_t new_cookie = BLK_QC_T_NONE;
blk_status_t ret;
 
-   new_cookie = request_to_qc_t(hctx, rq);
-
/*
 * For OK queue, we are done. For error, caller may kill it.
 * Any other error (busy), just add it to our list as we
@@ -1750,7 +1748,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
switch (ret) {
case BLK_STS_OK:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = new_cookie;
+   new_cookie = request_to_qc_t(hctx, rq);
break;
case BLK_STS_RESOURCE:
case BLK_STS_DEV_RESOURCE:
@@ -1759,10 +1757,12 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
break;
default:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = BLK_QC_T_NONE;
break;
}
 
+   if (cookie)
+   *cookie = new_cookie;
+
return ret;
 }
 
@@ -1830,11 +1830,10 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
blk_status_t ret;
int srcu_idx;
-   blk_qc_t unused_cookie;
struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
hctx_lock(hctx, &srcu_idx);
-   ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
+   ret = __blk_mq_try_issue_directly(hctx, rq, NULL, true);
hctx_unlock(hctx, srcu_idx);
 
return ret;
-- 
2.7.4

[PATCH V5 4/6] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-11-12 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the cpu where it allocated the request, hctx could be run on
a cpu to which it is not mapped.
To fix this,
 - insert the request forcibly if BLK_MQ_F_BLOCKING is set.
 - check whether the current cpu is mapped to the hctx; if not, insert
   forcibly.
 - invoke __blk_mq_issue_directly with preemption disabled.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f54c092..7915f44 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1808,6 +1808,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
enum mq_issue_decision dec;
int srcu_idx;
 
+   if (hctx->flags & BLK_MQ_F_BLOCKING) {
+   force = true;
+   goto out;
+   }
+
+   if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) {
+   put_cpu();
+   force = true;
+   goto out;
+   }
+
hctx_lock(hctx, &srcu_idx);
/*
 * hctx_lock is needed before checking quiesced flag.
@@ -1840,7 +1851,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 out_unlock:
hctx_unlock(hctx, srcu_idx);
-
+   put_cpu();
+out:
dec = blk_mq_make_dicision(ret, bypass, force);
switch(dec) {
case MQ_ISSUE_INSERT_QUEUE:
-- 
2.7.4

[PATCH V5 5/6] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-11-12 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and insert the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them entirely.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 11 +--
 2 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 66fda19..9af57c8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7915f44..f27d547 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1880,20 +1880,11 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   blk_mq_try_issue_directly(hctx, rq, NULL, false);
}
 }
 
-- 
2.7.4
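
Editor's sketch (not part of the patch): the loop change above can be modelled in user space. This is a minimal illustration that assumes the callee now disposes of every request itself (issued, queued for later, or ended), so the caller's loop always drains the list. The names `struct req`, `struct counters`, `try_issue` and `issue_list` are hypothetical and only stand in for the kernel structures and helpers.

```c
#include <stddef.h>

/* Hypothetical outcome a driver would report for one request. */
enum outcome { ISSUED, BUSY, FAILED };

/* Hypothetical singly linked request, standing in for struct request. */
struct req {
	enum outcome outcome;
	struct req *next;
};

/* Counts what the callee did with each request. */
struct counters {
	int issued;
	int queued;
	int ended;
};

/*
 * Models a callee invoked with bypass == false: it never hands a
 * request back, it issues it, queues it for later, or ends it.
 */
static void try_issue(struct req *rq, struct counters *c)
{
	switch (rq->outcome) {
	case ISSUED:
		c->issued++;
		break;
	case BUSY:
		c->queued++;	/* inserted for a later dispatch */
		break;
	default:
		c->ended++;	/* completed with an error */
		break;
	}
}

/* The caller only pops requests; there is no re-add and no early break. */
static void issue_list(struct req **list, struct counters *c)
{
	while (*list) {
		struct req *rq = *list;

		*list = rq->next;
		rq->next = NULL;
		try_issue(rq, c);
	}
}
```

Under these assumptions the list is always empty when `issue_list` returns, which is why the caller in the patch no longer needs the `list_empty()` check or the fallback insert.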

[PATCH V5 0/6] blk-mq: refactor and fix on issue request directly

2018-11-12 Thread Jianchao Wang

Hi Jens

Please consider this patchset for 4.21.
It refactors the code that issues requests directly to unify the interface
and make the code clearer and more readable, and it also fixes two defects
there.

The 1st patch makes __blk_mq_issue_directly accept a NULL cookie
pointer, so the dummy unused_cookie is no longer needed.

The 2nd patch refactors the code that issues requests directly into a
single helper interface that can handle all the cases.

The 3rd patch fixes the issue that, when the queue is stopped or quiesced,
a request may pass through the underlying device's io scheduler, if any.

The 4th patch fixes the issue that the hctx may be run on a cpu to which
it is not mapped in the direct-issue path.

The 5th patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests.

The 6th patch replaces and removes the one-line blk_mq_request_issue_directly.

V5:
 - rebase against Jens' for-4.21/block branch
 - adjust the order of patch04 and patch05
 - add patch06 to replace and kill the one-line blk_mq_request_issue_directly
 - comment changes

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_decision is introduced
   to decide whether to insert, end or just return, based on the return
   value of .queue_rq and bypass_insert. (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch to refactor the code.

Jianchao Wang(6)
  blk-mq: make __blk_mq_issue_directly be able to accept NULL cookie pointer
  blk-mq: refactor the code of issue request directly
  blk-mq: fix issue directly case when q is stopped or quiesced
  blk-mq: ensure hctx to be ran on mapped cpu when issue directly
  blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
  blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c |   2 +-
block/blk-mq-sched.c |   8 +--
block/blk-mq.c   | 156 +--
block/blk-mq.h   |   6 +-
4 files changed, 97 insertions(+), 75 deletions(-)

Thanks
Jianchao

[PATCH V4 4/5] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-11-02 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and then insert the non-issued requests
there. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them entirely.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 11 +--
 2 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 29bfe80..23cd97e 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a0b9b6c..bf8b144 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1832,20 +1832,11 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
 {
while (!list_empty(list)) {
-   blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   blk_mq_try_issue_directly(hctx, rq, NULL, false);
}
 }
 
-- 
2.7.4

[PATCH V4 2/5] blk-mq: refactor the code of issue request directly

2018-11-02 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface which is able to handle the return value from
.queue_rq callback. To make the code clearer, introduce new helpers
enum mq_issue_decision and blk_mq_make_decision to decide how to
handle the non-issued requests.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 104 +
 1 file changed, 61 insertions(+), 43 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index af5b591..962fdfc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1729,78 +1729,96 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+enum mq_issue_decision {
+   MQ_ISSUE_INSERT_QUEUE,
+   MQ_ISSUE_END_REQUEST,
+   MQ_ISSUE_DO_NOTHING,
+};
+
+static inline enum mq_issue_decision
+   blk_mq_make_dicision(blk_status_t ret, bool bypass)
+{
+   enum mq_issue_decision dec;
+
+   switch(ret) {
+   case BLK_STS_OK:
+   dec = MQ_ISSUE_DO_NOTHING;
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE;
+   break;
+   default:
+   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST;
+   break;
+   }
+
+   return dec;
+}
+
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert)
+   bool bypass)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   enum mq_issue_decision dec;
+   int srcu_idx;
+
+   hctx_lock(hctx, &srcu_idx);
 
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
-   bypass_insert = false;
-   goto insert;
+   bypass = false;
+   goto out_unlock;
}
 
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   if (q->elevator && !bypass)
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
 
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
+   dec = blk_mq_make_dicision(ret, bypass);
+   switch(dec) {
+   case MQ_ISSUE_INSERT_QUEUE:
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   break;
+   case MQ_ISSUE_END_REQUEST:
blk_mq_end_request(rq, ret);
+   break;
+   default:
+   return ret;
+   }
 
-   hctx_unlock(hctx, srcu_idx);
+   return BLK_STS_OK;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
-   blk_status_t ret;
-   int srcu_idx;
struct blk_mq_ctx *ctx = rq->mq_ctx;
struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
 
-   hctx_lock(hctx, &srcu_idx);
-   ret = __blk_mq_try_issue_directly(hctx, rq, NULL, true);
-   hctx_unlock(hctx, srcu_idx);
-
-   return ret;
+

[PATCH V4 1/5] blk-mq: make __blk_mq_issue_directly be able to accept NULL cookie pointer

2018-11-02 Thread Jianchao Wang

Make __blk_mq_issue_directly be able to accept a NULL cookie pointer
and remove the dummy unused_cookie in blk_mq_request_issue_directly.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index dcf10e3..af5b591 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1700,8 +1700,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
blk_qc_t new_cookie;
blk_status_t ret;
 
-   new_cookie = request_to_qc_t(hctx, rq);
-
/*
 * For OK queue, we are done. For error, caller may kill it.
 * Any other error (busy), just add it to our list as we
@@ -1711,19 +1709,23 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
switch (ret) {
case BLK_STS_OK:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = new_cookie;
+   new_cookie = request_to_qc_t(hctx, rq);
break;
case BLK_STS_RESOURCE:
case BLK_STS_DEV_RESOURCE:
blk_mq_update_dispatch_busy(hctx, true);
__blk_mq_requeue_request(rq);
+   new_cookie = BLK_QC_T_NONE;
break;
default:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = BLK_QC_T_NONE;
+   new_cookie = BLK_QC_T_NONE;
break;
}
 
+   if (cookie)
+   *cookie = new_cookie;
+
return ret;
 }
 
@@ -1791,12 +1793,11 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
blk_status_t ret;
int srcu_idx;
-   blk_qc_t unused_cookie;
struct blk_mq_ctx *ctx = rq->mq_ctx;
struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
 
hctx_lock(hctx, &srcu_idx);
-   ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
+   ret = __blk_mq_try_issue_directly(hctx, rq, NULL, true);
hctx_unlock(hctx, srcu_idx);
 
return ret;
-- 
2.7.4
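
Editor's sketch (not part of the patch): the change above follows a common C pattern for optional out-parameters. The value is computed in a local variable and stored through the caller's pointer only when that pointer is not NULL. The user-space model below is a minimal illustration. `cookie_t`, `COOKIE_NONE`, `enum issue_status` and `issue_with_optional_cookie` are hypothetical names, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for blk_qc_t and BLK_QC_T_NONE. */
typedef unsigned int cookie_t;
#define COOKIE_NONE ((cookie_t)-1)

/* Hypothetical stand-in for the driver's return status. */
enum issue_status { ISSUE_OK, ISSUE_BUSY, ISSUE_ERROR };

/*
 * Compute the cookie locally and write it back only if the caller
 * passed a non-NULL pointer. Callers that do not care about the
 * cookie can pass NULL instead of declaring a dummy variable.
 */
static enum issue_status issue_with_optional_cookie(enum issue_status driver_result,
						    cookie_t tag,
						    cookie_t *cookie)
{
	cookie_t new_cookie;

	switch (driver_result) {
	case ISSUE_OK:
		new_cookie = tag;	/* a real cookie only on success */
		break;
	default:
		new_cookie = COOKIE_NONE;	/* busy or error: no cookie */
		break;
	}

	if (cookie)
		*cookie = new_cookie;

	return driver_result;
}
```

A caller that needs the cookie passes `&c`; a caller that does not passes `NULL`, as blk_mq_request_issue_directly does after this patch.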

[PATCH V4 3/5] blk-mq: fix issue directly case when q is stopped or quiesced

2018-11-02 Thread Jianchao Wang

When trying to issue a request directly, if the queue is stopped or
quiesced, 'bypass' is ignored and BLK_STS_OK is returned to the caller
so that it does not try to dispatch the request again. The request is
then inserted with blk_mq_sched_insert_request. This is not correct
for the dm-rq case, where we should avoid passing through the underlying
path's io scheduler.

To fix it, add a new mq_issue_decision entry, MQ_ISSUE_INSERT_DISPATCH,
for the above case where the request needs to be inserted forcibly,
and use blk_mq_request_bypass_insert to insert the request into
hctx->dispatch directly.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 962fdfc..a0b9b6c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1731,12 +1731,13 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 enum mq_issue_decision {
MQ_ISSUE_INSERT_QUEUE,
+   MQ_ISSUE_INSERT_DISPATCH,
MQ_ISSUE_END_REQUEST,
MQ_ISSUE_DO_NOTHING,
 };
 
 static inline enum mq_issue_decision
-   blk_mq_make_dicision(blk_status_t ret, bool bypass)
+   blk_mq_make_dicision(blk_status_t ret, bool bypass, bool force)
 {
enum mq_issue_decision dec;
 
@@ -1746,7 +1747,10 @@ static inline enum mq_issue_decision
break;
case BLK_STS_DEV_RESOURCE:
case BLK_STS_RESOURCE:
-   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE;
+   if (force)
+   dec = bypass ? MQ_ISSUE_INSERT_DISPATCH : MQ_ISSUE_INSERT_QUEUE;
+   else
+   dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_INSERT_QUEUE;
break;
default:
dec = bypass ? MQ_ISSUE_DO_NOTHING : MQ_ISSUE_END_REQUEST;
@@ -1762,7 +1766,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
bool bypass)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   bool run_queue = true, force = false;
blk_status_t ret = BLK_STS_RESOURCE;
enum mq_issue_decision dec;
int srcu_idx;
@@ -1778,7 +1782,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
-   bypass = false;
+   force = true;
goto out_unlock;
}
 
@@ -1798,11 +1802,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 out_unlock:
hctx_unlock(hctx, srcu_idx);
 
-   dec = blk_mq_make_dicision(ret, bypass);
+   dec = blk_mq_make_dicision(ret, bypass, force);
switch(dec) {
case MQ_ISSUE_INSERT_QUEUE:
blk_mq_sched_insert_request(rq, false, run_queue, false);
break;
+   case MQ_ISSUE_INSERT_DISPATCH:
+   blk_mq_request_bypass_insert(rq, run_queue);
+   break;
case MQ_ISSUE_END_REQUEST:
blk_mq_end_request(rq, ret);
break;
-- 
2.7.4
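
Editor's sketch (not part of the patch): the decision logic after this patch can be read as a small truth table over the driver status, 'bypass' and 'force'. The stand-alone model below mirrors the switch statement in the diff so that the table can be checked outside the kernel. The enum and function names are hypothetical simplifications of blk_status_t, enum mq_issue_decision and blk_mq_make_dicision.

```c
#include <stdbool.h>

/* Hypothetical subset of blk_status_t values. */
enum status { STS_OK, STS_RESOURCE, STS_DEV_RESOURCE, STS_IOERR };

/* Mirrors enum mq_issue_decision from the patch. */
enum decision { INSERT_QUEUE, INSERT_DISPATCH, END_REQUEST, DO_NOTHING };

/*
 * ret:    what the driver (or an early exit) reported
 * bypass: the caller wants to handle non-issued requests itself
 * force:  the request must be inserted regardless of 'bypass'
 */
static enum decision make_decision(enum status ret, bool bypass, bool force)
{
	switch (ret) {
	case STS_OK:
		return DO_NOTHING;
	case STS_DEV_RESOURCE:
	case STS_RESOURCE:
		if (force)
			/* bypass callers skip the io scheduler entirely */
			return bypass ? INSERT_DISPATCH : INSERT_QUEUE;
		return bypass ? DO_NOTHING : INSERT_QUEUE;
	default:
		return bypass ? DO_NOTHING : END_REQUEST;
	}
}
```

In this model only the combination of a resource status with bypass and force both set yields INSERT_DISPATCH, which corresponds to the dm-rq case described in the commit message.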

[PATCH V4 0/5] blk-mq: refactor and fix on issue request directly

2018-11-02 Thread Jianchao Wang

Hi Jens

This patch set refactors the code that issues requests directly and
fixes some defects.

The 1st patch makes __blk_mq_issue_directly accept a NULL cookie
pointer.

The 2nd patch refactors the code that issues requests directly.
It merges blk_mq_try_issue_directly and __blk_mq_try_issue_directly
and makes the merged function handle the return value of .queue_rq itself.

The 3rd patch lets requests be inserted into hctx->dispatch when
the queue is stopped or quiesced and bypass is true.

The 4th patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests.

The 5th patch ensures that the hctx is run on its mapped cpu in the
direct-issue path.

V4:
 - split the original patch 1 into two patches, 1st and 2nd patch currently
 - rename the mq_decision to mq_issue_decision
 - comment changes

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_decision is introduced
   to decide whether to insert, end or just return, based on the return
   value of .queue_rq and bypass_insert. (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch

Jianchao Wang (5)
blk-mq: make __blk_mq_issue_directly be able to accept
blk-mq: refactor the code of issue request directly
blk-mq: fix issue directly case when q is stopped or
blk-mq: issue directly with bypass 'false' in
blk-mq: ensure hctx to be ran on mapped cpu when issue

 block/blk-mq-sched.c |   8 ++-
 block/blk-mq.c   | 149 ++-
 2 files changed, 92 insertions(+), 65 deletions(-)

Thanks
Jianchao

[PATCH V4 5/5] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-11-02 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the cpu where it allocated the request, the hctx could be run on
a cpu to which it is not mapped. To fix this, insert the request
if BLK_MQ_F_BLOCKING is set; otherwise check whether the current cpu
is mapped to the hctx and invoke __blk_mq_issue_directly with
preemption disabled.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index bf8b144..4450eb6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1771,6 +1771,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
enum mq_issue_decision dec;
int srcu_idx;
 
+   if (hctx->flags & BLK_MQ_F_BLOCKING) {
+   force = true;
+   goto out;
+   }
+
+   if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) {
+   put_cpu();
+   force = true;
+   goto out;
+   }
+
hctx_lock(hctx, &srcu_idx);
 
/*
@@ -1801,7 +1812,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 out_unlock:
hctx_unlock(hctx, srcu_idx);
-
+   put_cpu();
+out:
dec = blk_mq_make_dicision(ret, bypass, force);
switch(dec) {
case MQ_ISSUE_INSERT_QUEUE:
-- 
2.7.4
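
Editor's sketch (not part of the patch): the two early checks added above decide whether the direct-issue path may be taken at all. The user-space model below illustrates only that gating decision. It does not model preemption control, which the patch handles with get_cpu()/put_cpu(). `struct hw_queue_model` and `may_issue_directly` are hypothetical names, and the cpumask is simplified to a single unsigned long bitmask.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a hardware queue context. */
struct hw_queue_model {
	bool blocking;		/* models the BLK_MQ_F_BLOCKING flag */
	unsigned long cpumask;	/* bit n set: cpu n is mapped to this queue */
};

/*
 * Returns true only when the queue is non-blocking and the given cpu
 * is mapped to it. In every other case the request would be inserted
 * instead of being issued directly.
 */
static bool may_issue_directly(const struct hw_queue_model *hq, unsigned int cpu)
{
	if (hq->blocking)
		return false;

	/* Guard the shift: cpus beyond the mask width are never mapped. */
	if (cpu >= sizeof(hq->cpumask) * CHAR_BIT)
		return false;

	return ((hq->cpumask >> cpu) & 1UL) != 0;
}
```

In the kernel the cpu number is only meaningful while preemption is disabled, which is why the patch keeps the get_cpu() reference until after the issue attempt and drops it with put_cpu() on both exit paths.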

[PATCH RESENT V3 0/4] blk-mq: refactor and fix on issue request directly

2018-10-29 Thread Jianchao Wang

Hi Jens

The 1st patch refactors the code that issues requests directly.
It merges blk_mq_try_issue_directly and __blk_mq_try_issue_directly
and makes the merged function handle the return value of .queue_rq itself.

The 2nd patch lets requests be inserted into hctx->dispatch when
the queue is stopped or quiesced and bypass_insert is true.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass_insert' false, so it no longer needs to handle the non-issued
requests.

The 4th patch ensures that the hctx is run on its mapped cpu in the
direct-issue path.

V3:
 - Correct the code for the case where bypass_insert is true and an io
   scheduler is attached. The request still needs to be issued in that
   case. (1/4)
 - Refactor the code to make it clearer. blk_mq_make_decision is introduced
   to decide whether to insert, end or just return, based on the return
   value of .queue_rq and bypass_insert. (1/4)
 - Add the 2nd patch. It introduces a new decision result which indicates
   that the request should be inserted with blk_mq_request_bypass_insert.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch

Jianchao Wang(4)
blk-mq: refactor the code of issue request directly
blk-mq: insert request without involving any io
blk-mq: issue directly with bypass_insert 'false' in
blk-mq: ensure hctx to be ran on mapped cpu when issue

 block/blk-mq-sched.c |   8 ++-
 block/blk-mq.c   | 139 ++-
 2 files changed, 86 insertions(+), 61 deletions(-)

Thanks
Jianchao

[PATCH V3 1/4] blk-mq: refactor the code of issue request directly

2018-10-29 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface which is able to handle the return value from
.queue_rq callback. To make the code clearer, introduce new helpers
blk_mq_make_decision and enum mq_decision.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 104 +
 1 file changed, 60 insertions(+), 44 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index dcf10e3..0f6328b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
switch (ret) {
case BLK_STS_OK:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = new_cookie;
break;
case BLK_STS_RESOURCE:
case BLK_STS_DEV_RESOURCE:
@@ -1720,86 +1719,103 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
break;
default:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = BLK_QC_T_NONE;
+   new_cookie = BLK_QC_T_NONE;
break;
}
 
+   if (cookie)
+   *cookie = new_cookie;
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+enum mq_decision {
+   MQ_INSERT_QUEUE,
+   MQ_END_REQUEST,
+   MQ_DO_NOTHING,
+};
+
+static inline enum mq_decision
+   blk_mq_make_dicision(blk_status_t ret, bool bypass_insert)
+{
+   enum mq_decision dec;
+
+   switch(ret) {
+   case BLK_STS_OK:
+   dec = MQ_DO_NOTHING;
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE;
+   break;
+   default:
+   dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST;
+   break;
+   }
+
+   return dec;
+}
+
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass_insert)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   enum mq_decision dec;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass_insert',
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
bypass_insert = false;
-   goto insert;
+   goto out_unlock;
}
 
if (q->elevator && !bypass_insert)
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
+   dec = blk_mq_make_dicision(ret, bypass_insert);
+   switch(dec) {
+   case MQ_INSERT_QUEUE:
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   break;
+   case MQ_END_REQUEST:
blk_mq_end_request(rq, ret);
+   break;
+   default:
+   return ret;
+   }
 
-   hctx_unlock(hctx, srcu_idx);
+   return BLK_STS_OK;
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)

[PATCH V3 4/4] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-10-29 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the cpu where it allocated the request, the hctx could be run on
a cpu to which it is not mapped. To fix this, insert the request
if BLK_MQ_F_BLOCKING is set; otherwise check whether the current cpu
is mapped to the hctx and invoke __blk_mq_issue_directly with
preemption disabled.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9c6c858..ced3346 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1770,6 +1770,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
 
+   if (hctx->flags & BLK_MQ_F_BLOCKING) {
+   force = true;
+   goto out;
+   }
+
+   if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) {
+   put_cpu();
+   force = true;
+   goto out;
+   }
+
hctx_lock(hctx, &srcu_idx);
/*
 * hctx_lock is needed before checking quiesced flag.
@@ -1798,7 +1809,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
ret = __blk_mq_issue_directly(hctx, rq, cookie);
 out_unlock:
hctx_unlock(hctx, srcu_idx);
-
+   put_cpu();
+out:
dec = blk_mq_make_dicision(ret, bypass_insert, force);
switch(dec) {
case MQ_INSERT_DISPATCH:
-- 
2.7.4

[PATCH V3 3/4] blk-mq: issue directly with bypass_insert 'false' in blk_mq_sched_insert_requests

2018-10-29 Thread Jianchao Wang

It is not necessary to issue requests directly with bypass_insert
'true' in blk_mq_sched_insert_requests and then insert the non-issued
requests there. Just set bypass_insert to 'false' and let
blk_mq_try_issue_directly handle them.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 10 +-
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 29bfe80..23cd97e 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e0aa068..9c6c858 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1834,15 +1834,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   ret = blk_mq_try_issue_directly(hctx, rq, NULL, false);
}
 }
 
-- 
2.7.4

[PATCH V3 2/4] blk-mq: insert request without involving any io scheduler

2018-10-29 Thread Jianchao Wang

When trying to issue a request directly, if the queue is stopped or
quiesced, 'bypass_insert' is ignored and BLK_STS_OK is returned to the
caller so that it does not try to dispatch the request again. The
request is then inserted with blk_mq_sched_insert_request. This is not
correct for the dm-rq case, where we should avoid passing through the
underlying paths' io scheduler.

To fix it, use blk_mq_request_bypass_insert to insert the request
into hctx->dispatch directly.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0f6328b..e0aa068 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1729,13 +1729,14 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 }
 
 enum mq_decision {
+   MQ_INSERT_DISPATCH,
MQ_INSERT_QUEUE,
MQ_END_REQUEST,
MQ_DO_NOTHING,
 };
 
 static inline enum mq_decision
-   blk_mq_make_dicision(blk_status_t ret, bool bypass_insert)
+   blk_mq_make_dicision(blk_status_t ret, bool bypass_insert, bool force)
 {
enum mq_decision dec;
 
@@ -1745,7 +1746,10 @@ static inline enum mq_decision
break;
case BLK_STS_DEV_RESOURCE:
case BLK_STS_RESOURCE:
-   dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE;
+   if (force)
+   dec = bypass_insert ? MQ_INSERT_DISPATCH : MQ_INSERT_QUEUE;
+   else
+   dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE;
break;
default:
dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST;
@@ -1761,7 +1765,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
bool bypass_insert)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   bool run_queue = true, force = false;
enum mq_decision dec;
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
@@ -1776,7 +1780,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
-   bypass_insert = false;
+   force = true;
goto out_unlock;
}
 
@@ -1795,8 +1799,11 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 out_unlock:
hctx_unlock(hctx, srcu_idx);
 
-   dec = blk_mq_make_dicision(ret, bypass_insert);
+   dec = blk_mq_make_dicision(ret, bypass_insert, force);
switch(dec) {
+   case MQ_INSERT_DISPATCH:
+   blk_mq_request_bypass_insert(rq, run_queue);
+   break;
case MQ_INSERT_QUEUE:
blk_mq_sched_insert_request(rq, false, run_queue, false);
break;
-- 
2.7.4

[PATCH V3 3/4] blk-mq: issue directly with bypass_insert 'false' in blk_mq_sched_insert_requests

2018-10-29 Thread Jianchao Wang

It is not necessary to issue request directly with bypass_insert
'true' in blk_mq_sched_insert_requests and insert the non-issued
requests itself. Just set bypass_insert to 'false' and let
blk_mq_try_issue_directly handle them.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 10 +-
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 29bfe80..23cd97e 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e0aa068..9c6c858 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1834,15 +1834,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx 
*hctx,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   ret = blk_mq_try_issue_directly(hctx, rq, NULL, false);
}
 }
 
-- 
2.7.4

[PATCH V3 2/4] blk-mq: insert request without involving any io scheduler

2018-10-29 Thread Jianchao Wang

When try to issue request directly, if the queue is stopped or
quiesced, 'bypass_insert' will be ignored and return BLK_STS_OK to
caller to avoid it dispatch request again. Then the request will
be inserted with blk_mq_sched_insert_request. This is not correct
for dm-rq case where we should avoid to pass through the underlying
paths' io scheduler.

To fix it, use blk_mq_request_bypass_insert to insert the request
into hctx->dispatch directly.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0f6328b..e0aa068 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1729,13 +1729,14 @@ static blk_status_t __blk_mq_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 }
 
 enum mq_decision {
+   MQ_INSERT_DISPATCH,
MQ_INSERT_QUEUE,
MQ_END_REQUEST,
MQ_DO_NOTHING,
 };
 
 static inline enum mq_decision
-   blk_mq_make_dicision(blk_status_t ret, bool bypass_insert)
+   blk_mq_make_dicision(blk_status_t ret, bool bypass_insert, bool force)
 {
enum mq_decision dec;
 
@@ -1745,7 +1746,10 @@ static inline enum mq_decision
break;
case BLK_STS_DEV_RESOURCE:
case BLK_STS_RESOURCE:
-   dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE;
+   if (force)
+   dec = bypass_insert ? MQ_INSERT_DISPATCH : 
MQ_INSERT_QUEUE;
+   else
+   dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE;
break;
default:
dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST;
@@ -1761,7 +1765,7 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
bool bypass_insert)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   bool run_queue = true, force = false;
enum mq_decision dec;
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
@@ -1776,7 +1780,7 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
-   bypass_insert = false;
+   force = true;
goto out_unlock;
}
 
@@ -1795,8 +1799,11 @@ static blk_status_t blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
 out_unlock:
hctx_unlock(hctx, srcu_idx);
 
-   dec = blk_mq_make_dicision(ret, bypass_insert);
+   dec = blk_mq_make_dicision(ret, bypass_insert, force);
switch(dec) {
+   case MQ_INSERT_DISPATCH:
+   blk_mq_request_bypass_insert(rq, run_queue);
+   break;
case MQ_INSERT_QUEUE:
blk_mq_sched_insert_request(rq, false, run_queue, false);
break;
-- 
2.7.4

[PATCH V3 4/4] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-10-29 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the CPU on which it allocated the request, the hctx could be run
on a CPU to which it is not mapped. To fix this, insert the request
if BLK_MQ_F_BLOCKING is set; otherwise check whether the current CPU
is mapped to the hctx and invoke __blk_mq_issue_directly with
preemption disabled.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)
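
As an illustration of the check described above, here is a small user-space sketch. It assumes a hypothetical struct sketch_hctx whose cpumask is a plain 64-bit bitmap, and it does not model the preemption control that get_cpu()/put_cpu() provide in the kernel; it only shows when direct issue would be allowed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a hardware context. */
struct sketch_hctx {
	uint64_t cpumask;	/* bit n set: CPU n is mapped to this hctx */
	bool blocking;		/* stands in for BLK_MQ_F_BLOCKING */
};

/*
 * Returns true when the request may be issued directly from 'cpu'.
 * A blocking hctx, or a CPU that is not mapped to the hctx, means the
 * request should be inserted instead.
 */
static bool sketch_can_issue_directly(const struct sketch_hctx *hctx,
				      unsigned int cpu)
{
	if (hctx->blocking)
		return false;
	if (cpu >= 64)
		return false;	/* outside this sketch's bitmap */
	return (hctx->cpumask >> cpu) & 1u;
}
```

In the real patch the CPU number comes from get_cpu(), so the task cannot migrate between the check and the call to __blk_mq_issue_directly.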

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9c6c858..ced3346 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1770,6 +1770,17 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;
 
+   if (hctx->flags & BLK_MQ_F_BLOCKING) {
+   force = true;
+   goto out;
+   }
+
+   if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) {
+   put_cpu();
+   force = true;
+   goto out;
+   }
+
hctx_lock(hctx, &srcu_idx);
/*
 * hctx_lock is needed before checking quiesced flag.
@@ -1798,7 +1809,8 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
ret = __blk_mq_issue_directly(hctx, rq, cookie);
 out_unlock:
hctx_unlock(hctx, srcu_idx);
-
+   put_cpu();
+out:
dec = blk_mq_make_dicision(ret, bypass_insert, force);
switch(dec) {
case MQ_INSERT_DISPATCH:
-- 
2.7.4

[PATCH V3 1/4] blk-mq: refactor the code of issue request directly

2018-10-29 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface which is able to handle the return value from
the .queue_rq callback. To make the code clearer, introduce a new
helper, blk_mq_make_decision, and enum mq_decision.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 104 ++---
 block/blk-mq.c.rej |  60 +++
 2 files changed, 120 insertions(+), 44 deletions(-)
 create mode 100644 block/blk-mq.c.rej

diff --git a/block/blk-mq.c b/block/blk-mq.c
index dcf10e3..0f6328b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
switch (ret) {
case BLK_STS_OK:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = new_cookie;
break;
case BLK_STS_RESOURCE:
case BLK_STS_DEV_RESOURCE:
@@ -1720,86 +1719,103 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
break;
default:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = BLK_QC_T_NONE;
+   new_cookie = BLK_QC_T_NONE;
break;
}
 
+   if (cookie)
+   *cookie = new_cookie;
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+enum mq_decision {
+   MQ_INSERT_QUEUE,
+   MQ_END_REQUEST,
+   MQ_DO_NOTHING,
+};
+
+static inline enum mq_decision
+   blk_mq_make_dicision(blk_status_t ret, bool bypass_insert)
+{
+   enum mq_decision dec;
+
+   switch(ret) {
+   case BLK_STS_OK:
+   dec = MQ_DO_NOTHING;
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   dec = bypass_insert ? MQ_DO_NOTHING : MQ_INSERT_QUEUE;
+   break;
+   default:
+   dec = bypass_insert ? MQ_DO_NOTHING : MQ_END_REQUEST;
+   break;
+   }
+
+   return dec;
+}
+
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass_insert)
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   enum mq_decision dec;
+   blk_status_t ret = BLK_STS_RESOURCE;
+   int srcu_idx;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass_insert',
+* and return BLK_STS_OK to caller, and avoid driver to try to
+* dispatch again.
 */
if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
bypass_insert = false;
-   goto insert;
+   goto out_unlock;
}
 
if (q->elevator && !bypass_insert)
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
+   goto out_unlock;
 
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
+out_unlock:
+   hctx_unlock(hctx, srcu_idx);
 
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
+   dec = blk_mq_make_dicision(ret, bypass_insert);
+   switch(dec) {
+   case MQ_INSERT_QUEUE:
+   blk_mq_sched_insert_request(rq, false, run_queue, false);
+   break;
+   case MQ_END_REQUEST:
blk_mq_end_request(rq, ret);
+   break;
+   default:
+   return ret;
+   }
 
-   hctx_unlock(hctx, srcu_id

[PATCH V3 0/4] blk-mq: refactor and fix on issue request directly

2018-10-29 Thread Jianchao Wang

Hi Jens

The 1st patch refactors the direct-issue code. It merges
blk_mq_try_issue_directly and __blk_mq_try_issue_directly and makes
the merged function handle the return value of .queue_rq itself.

The 2nd patch inserts requests into hctx->dispatch when the queue is
stopped or quiesced and bypass_insert is true.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass_insert' false, so it no longer needs to handle the
non-issued requests itself.

The 4th patch ensures that the hctx is run on a CPU it is mapped to in
the direct-issue path.

V3:
 - Correct the handling of the case where bypass_insert is true and an
   io scheduler is attached. The request still needs to be issued in
   that case. (1/4)
 - Refactor the code to make it clearer. (1/4)
 - Add the 2nd patch.
 - Modify the code to adapt to the new patch 1.

V2:
 - Add 1st and 2nd patch

Jianchao Wang(4)
blk-mq: refactor the code of issue request directly
blk-mq: insert request without involving any io
blk-mq: issue directly with bypass_insert 'false' in
blk-mq: ensure hctx to be ran on mapped cpu when issue

 block/blk-mq-sched.c |   8 ++-
 block/blk-mq.c   | 139 ++-
 block/blk-mq.c.rej   |  60 ++
 3 files changed, 146 insertions(+), 61 deletions(-)

[PATCH V5] block: fix the DISCARD request merge

2018-10-27 Thread Jianchao Wang

There are two cases when handling a DISCARD merge.
If max_discard_segments == 1, the bios/requests need to be contiguous
to merge. If max_discard_segments > 1, every bio is taken as a range
and the ranges need not be contiguous.

But currently attempt_merge gets this wrong. It always requires
contiguity for DISCARD when max_discard_segments > 1, and it cannot
merge contiguous DISCARDs when max_discard_segments == 1, because
req_attempt_discard_merge always returns false in that case.
This patch fixes both cases.

Signed-off-by: Jianchao Wang 
---

V5:
 - get rid of the redundant 'else' in blk_discard_mergable

V4:
 - Introduce blk_try_req_merge, as suggested by Christoph.

V3:
 - Introduce blk_discard_mergable into attempt_merge and
   blk_try_merge.
 - Some comment changes.

V2:
 - Add max_discard_segments > 1 checking in attempt_merge.
 - Change patch title and comment.
 - Add more comment in attempt_merge

 block/blk-merge.c | 46 --
 1 file changed, 36 insertions(+), 10 deletions(-)
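
To make the two cases concrete, here is a minimal user-space sketch of the merge decision. The struct and the sketch_* names are hypothetical simplifications (the real code works on struct request and the queue limits); only the rule "multi-range DISCARD ignores contiguity, everything else must be contiguous" is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

enum sketch_merge { SK_NO_MERGE, SK_BACK_MERGE, SK_DISCARD_MERGE };

/* Hypothetical, flattened view of a request and its queue limit. */
struct sketch_req {
	bool is_discard;			/* req_op(req) == REQ_OP_DISCARD */
	uint64_t pos;				/* start sector */
	uint64_t sectors;			/* length in sectors */
	unsigned int max_discard_segments;	/* limit of the owning queue */
};

/* Multi-range DISCARD: each bio is its own range, so no contiguity check. */
static bool sketch_discard_mergable(const struct sketch_req *req)
{
	return req->is_discard && req->max_discard_segments > 1;
}

static enum sketch_merge
sketch_try_req_merge(const struct sketch_req *req, const struct sketch_req *next)
{
	if (sketch_discard_mergable(req))
		return SK_DISCARD_MERGE;
	/* Everything else, including single-segment DISCARD, must be contiguous. */
	if (req->pos + req->sectors == next->pos)
		return SK_BACK_MERGE;
	return SK_NO_MERGE;
}
```

A single-segment DISCARD therefore takes the same back-merge path as ordinary reads and writes, which is the case the old code could not merge.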

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 42a4674..6b5ad27 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -714,6 +714,31 @@ static void blk_account_io_merge(struct request *req)
part_stat_unlock();
}
 }
+/*
+ * Two cases of handling DISCARD merge:
+ * If max_discard_segments > 1, the driver takes every bio
+ * as a range and send them to controller together. The ranges
+ * needn't to be contiguous.
+ * Otherwise, the bios/requests will be handled as same as
+ * others which should be contiguous.
+ */
+static inline bool blk_discard_mergable(struct request *req)
+{
+   if (req_op(req) == REQ_OP_DISCARD &&
+   queue_max_discard_segments(req->q) > 1)
+   return true;
+   return false;
+}
+
+enum elv_merge blk_try_req_merge(struct request *req, struct request *next)
+{
+   if (blk_discard_mergable(req))
+   return ELEVATOR_DISCARD_MERGE;
+   else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next))
+   return ELEVATOR_BACK_MERGE;
+
+   return ELEVATOR_NO_MERGE;
+}
 
 /*
  * For non-mq, this has to be called with the request spinlock acquired.
@@ -731,12 +756,6 @@ static struct request *attempt_merge(struct request_queue *q,
if (req_op(req) != req_op(next))
return NULL;
 
-   /*
-* not contiguous
-*/
-   if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
-   return NULL;
-
if (rq_data_dir(req) != rq_data_dir(next)
|| req->rq_disk != next->rq_disk
|| req_no_special_merge(next))
@@ -760,11 +779,19 @@ static struct request *attempt_merge(struct request_queue *q,
 * counts here. Handle DISCARDs separately, as they
 * have separate settings.
 */
-   if (req_op(req) == REQ_OP_DISCARD) {
+
+   switch (blk_try_req_merge(req, next)) {
+   case ELEVATOR_DISCARD_MERGE:
if (!req_attempt_discard_merge(q, req, next))
return NULL;
-   } else if (!ll_merge_requests_fn(q, req, next))
+   break;
+   case ELEVATOR_BACK_MERGE:
+   if (!ll_merge_requests_fn(q, req, next))
+   return NULL;
+   break;
+   default:
return NULL;
+   }
 
/*
 * If failfast settings disagree or any of the two is already
@@ -888,8 +915,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 
 enum elv_merge blk_try_merge(struct request *rq, struct bio *bio)
 {
-   if (req_op(rq) == REQ_OP_DISCARD &&
-   queue_max_discard_segments(rq->q) > 1)
+   if (blk_discard_mergable(rq))
return ELEVATOR_DISCARD_MERGE;
else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector)
return ELEVATOR_BACK_MERGE;
-- 
2.7.4

[PATCH V4] block: fix the DISCARD request merge

2018-10-26 Thread Jianchao Wang

There are two cases when handling a DISCARD merge.
If max_discard_segments == 1, the bios/requests need to be contiguous
to merge. If max_discard_segments > 1, every bio is taken as a range
and the ranges need not be contiguous.

But currently attempt_merge gets this wrong. It always requires
contiguity for DISCARD when max_discard_segments > 1, and it cannot
merge contiguous DISCARDs when max_discard_segments == 1, because
req_attempt_discard_merge always returns false in that case.
This patch fixes both cases.

Signed-off-by: Jianchao Wang 
---

V4:
 - Introduce blk_try_req_merge, as suggested by Christoph.

V3:
 - Introduce blk_discard_mergable into attempt_merge and
   blk_try_merge.
 - Some comment changes.

V2:
 - Add max_discard_segments > 1 checking in attempt_merge.
 - Change patch title and comment.
 - Add more comment in attempt_merge

 block/blk-merge.c | 47 +--
 1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 42a4674..cf817c7a 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -714,6 +714,32 @@ static void blk_account_io_merge(struct request *req)
part_stat_unlock();
}
 }
+/*
+ * Two cases of handling DISCARD merge:
+ * If max_discard_segments > 1, the driver takes every bio
+ * as a range and send them to controller together. The ranges
+ * needn't to be contiguous.
+ * Otherwise, the bios/requests will be handled as same as
+ * others which should be contiguous.
+ */
+static inline bool blk_discard_mergable(struct request *req)
+{
+   if (req_op(req) == REQ_OP_DISCARD &&
+   queue_max_discard_segments(req->q) > 1)
+   return true;
+   else
+   return false;
+}
+
+enum elv_merge blk_try_req_merge(struct request *req, struct request *next)
+{
+   if (blk_discard_mergable(req))
+   return ELEVATOR_DISCARD_MERGE;
+   else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next))
+   return ELEVATOR_BACK_MERGE;
+
+   return ELEVATOR_NO_MERGE;
+}
 
 /*
  * For non-mq, this has to be called with the request spinlock acquired.
@@ -731,12 +757,6 @@ static struct request *attempt_merge(struct request_queue *q,
if (req_op(req) != req_op(next))
return NULL;
 
-   /*
-* not contiguous
-*/
-   if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
-   return NULL;
-
if (rq_data_dir(req) != rq_data_dir(next)
|| req->rq_disk != next->rq_disk
|| req_no_special_merge(next))
@@ -760,11 +780,19 @@ static struct request *attempt_merge(struct request_queue *q,
 * counts here. Handle DISCARDs separately, as they
 * have separate settings.
 */
-   if (req_op(req) == REQ_OP_DISCARD) {
+
+   switch (blk_try_req_merge(req, next)) {
+   case ELEVATOR_DISCARD_MERGE:
if (!req_attempt_discard_merge(q, req, next))
return NULL;
-   } else if (!ll_merge_requests_fn(q, req, next))
+   break;
+   case ELEVATOR_BACK_MERGE:
+   if (!ll_merge_requests_fn(q, req, next))
+   return NULL;
+   break;
+   default:
return NULL;
+   }
 
/*
 * If failfast settings disagree or any of the two is already
@@ -888,8 +916,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 
 enum elv_merge blk_try_merge(struct request *rq, struct bio *bio)
 {
-   if (req_op(rq) == REQ_OP_DISCARD &&
-   queue_max_discard_segments(rq->q) > 1)
+   if (blk_discard_mergable(rq))
return ELEVATOR_DISCARD_MERGE;
else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector)
return ELEVATOR_BACK_MERGE;
-- 
2.7.4

[PATCH V2 1/3] blk-mq: refactor the code of issue request directly

2018-10-26 Thread Jianchao Wang

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface which is able to handle the return value from
the .queue_rq callback. Because we can only issue directly without
an io scheduler, remove the blk_mq_get_driver_tag call.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 109 ++---
 1 file changed, 50 insertions(+), 59 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index dcf10e3..a81d2ca 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1700,8 +1700,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
blk_qc_t new_cookie;
blk_status_t ret;
 
-   new_cookie = request_to_qc_t(hctx, rq);
-
/*
 * For OK queue, we are done. For error, caller may kill it.
 * Any other error (busy), just add it to our list as we
@@ -1711,7 +1709,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
switch (ret) {
case BLK_STS_OK:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = new_cookie;
+   new_cookie = request_to_qc_t(hctx, rq);
break;
case BLK_STS_RESOURCE:
case BLK_STS_DEV_RESOURCE:
@@ -1720,86 +1718,79 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
break;
default:
blk_mq_update_dispatch_busy(hctx, false);
-   *cookie = BLK_QC_T_NONE;
+   new_cookie = BLK_QC_T_NONE;
break;
}
 
+   if (cookie)
+   *cookie = new_cookie;
return ret;
 }
 
-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+/*
+ * When the bypass is true, the caller is responsible for handling the
+ * request if it is not issued. The only exception is that io scheduler
+ * is set.
+ */
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
-   bool bypass_insert)
+   bool bypass)
 {
struct request_queue *q = rq->q;
-   bool run_queue = true;
+   blk_status_t ret = BLK_STS_OK;
+   bool insert = true;
+   int srcu_idx;
+
+   if (q->elevator)
+   goto out;
 
+   hctx_lock(hctx, &srcu_idx);
/*
-* RCU or SRCU read lock is needed before checking quiesced flag.
+* hctx_lock is needed before checking quiesced flag.
 *
-* When queue is stopped or quiesced, ignore 'bypass_insert' from
-* blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
-* and avoid driver to try to dispatch again.
+* When queue is stopped or quiesced, ignore 'bypass', insert and return
+* BLK_STS_OK to caller, and avoid driver to try to dispatch again.
 */
-   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
-   run_queue = false;
-   bypass_insert = false;
-   goto insert;
-   }
-
-   if (q->elevator && !bypass_insert)
-   goto insert;
+   if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
+   goto out_unlock;
 
-   if (!blk_mq_get_dispatch_budget(hctx))
-   goto insert;
-
-   if (!blk_mq_get_driver_tag(rq)) {
-   blk_mq_put_dispatch_budget(hctx);
-   goto insert;
+   if (!blk_mq_get_dispatch_budget(hctx)) {
+   insert = !bypass;
+   ret = bypass ? BLK_STS_RESOURCE : BLK_STS_OK;
+   goto out_unlock;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
-   if (bypass_insert)
-   return BLK_STS_RESOURCE;
-
-   blk_mq_sched_insert_request(rq, false, run_queue, false);
-   return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-   struct request *rq, blk_qc_t *cookie)
-{
-   blk_status_t ret;
-   int srcu_idx;
-
-   might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
-
-   hctx_lock(hctx, &srcu_idx);
-
-   ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
-   if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-   blk_mq_sched_insert_request(rq, false, true, false);
-   else if (ret != BLK_STS_OK)
-   blk_mq_end_request(rq, ret);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
+   switch(ret) {
+   case BLK_STS_OK:
+   insert = false;
+   break;
+   case BLK_STS_DEV_RESOURCE:
+   case BLK_STS_RESOURCE:
+   insert = !bypass;
+   break;
+   default:
+   if (!bypass)
+   blk_mq_end_request(rq, ret);
+

[PATCH V2 2/3] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-10-26 Thread Jianchao Wang

It is not necessary for blk_mq_sched_insert_requests to issue requests
directly with bypass 'true' and then insert the non-issued requests
itself. Just set bypass to 'false' and let blk_mq_try_issue_directly
handle them.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c |  8 +++-
 block/blk-mq.c   | 10 +-
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 29bfe80..23cd97e 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -411,12 +411,10 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 * busy in case of 'none' scheduler, and this way may save
 * us one extra enqueue & dequeue to sw queue.
 */
-   if (!hctx->dispatch_busy && !e && !run_queue_async) {
+   if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
-   if (list_empty(list))
-   return;
-   }
-   blk_mq_insert_requests(hctx, ctx, list);
+   else
+   blk_mq_insert_requests(hctx, ctx, list);
}
 
blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a81d2ca..71b829c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1802,15 +1802,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
queuelist);
 
list_del_init(&rq->queuelist);
-   ret = blk_mq_request_issue_directly(rq);
-   if (ret != BLK_STS_OK) {
-   if (ret == BLK_STS_RESOURCE ||
-   ret == BLK_STS_DEV_RESOURCE) {
-   list_add(&rq->queuelist, list);
-   break;
-   }
-   blk_mq_end_request(rq, ret);
-   }
+   ret = blk_mq_try_issue_directly(hctx, rq, NULL, false);
}
 }
 
-- 
2.7.4

[PATCH V2 3/3] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-10-26 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the CPU on which it allocated the request, the hctx could be run
on a CPU to which it is not mapped. To fix this, insert the request
if BLK_MQ_F_BLOCKING is set; otherwise check whether the current CPU
is mapped to the hctx and invoke __blk_mq_issue_directly with
preemption disabled.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 71b829c..4f1dedb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1745,6 +1745,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
if (q->elevator)
goto out;
 
+   if (hctx->flags & BLK_MQ_F_BLOCKING)
+   goto out;
+
+   if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) {
+   put_cpu();
+   goto out;
+   }
+
hctx_lock(hctx, &srcu_idx);
/*
 * hctx_lock is needed before checking quiesced flag.
@@ -1779,6 +1787,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 out_unlock:
hctx_unlock(hctx, srcu_idx);
+   put_cpu();
 out:
if (insert)
blk_mq_sched_insert_request(rq, false, true, false);
-- 
2.7.4

[PATCH V2 0/3] blk-mq: refactor and fix on issue request directly

2018-10-26 Thread Jianchao Wang

Hi Jens

The 1st patch refactors the direct-issue code. It merges
blk_mq_try_issue_directly and __blk_mq_try_issue_directly and makes
blk_mq_try_issue_directly handle the return value of .queue_rq itself.

The 2nd patch makes blk_mq_sched_insert_requests issue requests directly
with bypass "false" instead of "true", so it no longer needs to handle
the non-issued requests.

The 3rd patch ensures that the hctx is run on a CPU it is mapped to in
the direct-issue path.

V2:
 - Add 1st and 2nd patch.

Jianchao Wang(3)
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
blk-mq: ensure hctx to be ran on mapped cpu when issue

 block/blk-mq-sched.c |   8 ++--
 block/blk-mq.c   | 128 ---
 2 files changed, 63 insertions(+), 73 deletions(-)

[PATCH] blk-mq: ensure hctx to be ran on mapped cpu when issue directly

2018-10-24 Thread Jianchao Wang

When a request is issued directly and the task has been migrated away
from the CPU on which it allocated the request, the hctx could be run
on a CPU to which it is not mapped. To fix this, insert the request
if BLK_MQ_F_BLOCKING is set; otherwise check whether the current CPU
is mapped to the hctx and invoke __blk_mq_issue_directly with
preemption disabled.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c39ea..0cdc306 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1717,6 +1717,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 {
struct request_queue *q = rq->q;
bool run_queue = true;
+   blk_status_t ret;
+
+   if (hctx->flags & BLK_MQ_F_BLOCKING) {
+   bypass_insert = false;
+   goto insert;
+   }
 
/*
 * RCU or SRCU read lock is needed before checking quiesced flag.
@@ -1734,6 +1740,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
if (q->elevator && !bypass_insert)
goto insert;
 
+   if (!cpumask_test_cpu(get_cpu(), hctx->cpumask)) {
+   bypass_insert = false;
+   goto insert;
+   }
+
if (!blk_mq_get_dispatch_budget(hctx))
goto insert;
 
@@ -1742,8 +1753,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
goto insert;
}
 
-   return __blk_mq_issue_directly(hctx, rq, cookie);
+   ret = __blk_mq_issue_directly(hctx, rq, cookie);
+   put_cpu();
+   return ret;
+
 insert:
+   put_cpu();
if (bypass_insert)
return BLK_STS_RESOURCE;
 
-- 
2.7.4

[PATCH V3] block: fix the DISCARD request merge

2018-10-23 Thread Jianchao Wang

There are two cases when handling a DISCARD merge.
If max_discard_segments == 1, the bios/requests need to be contiguous
to merge. If max_discard_segments > 1, every bio is taken as a range
and the ranges need not be contiguous.

But currently attempt_merge gets this wrong. It always requires
contiguity for DISCARD when max_discard_segments > 1, and it cannot
merge contiguous DISCARDs when max_discard_segments == 1, because
req_attempt_discard_merge always returns false in that case.
This patch fixes both cases.

Signed-off-by: Jianchao Wang 
---

V3:
 - Introduce blk_discard_mergable into attempt_merge and
   blk_try_merge.
 - Some comment changes.

V2:
 - Add max_discard_segments > 1 checking in attempt_merge.
 - Change patch title and comment.
 - Add more comment in attempt_merge

 block/blk-merge.c | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 42a4674..b258de0 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -714,6 +714,22 @@ static void blk_account_io_merge(struct request *req)
part_stat_unlock();
}
 }
+/*
+ * Two cases of handling DISCARD merge:
+ * If max_discard_segments > 1, the driver takes every bio
+ * as a range and send them to controller together. The ranges
+ * needn't to be contiguous.
+ * Otherwise, the bios/requests will be handled as same as
+ * others which should be contiguous.
+ */
+static inline bool blk_discard_mergable(struct request *req)
+{
+   if (req_op(req) == REQ_OP_DISCARD &&
+   queue_max_discard_segments(req->q) > 1)
+   return true;
+   else
+   return false;
+}
 
 /*
  * For non-mq, this has to be called with the request spinlock acquired.
@@ -731,12 +747,6 @@ static struct request *attempt_merge(struct request_queue *q,
if (req_op(req) != req_op(next))
return NULL;
 
-   /*
-* not contiguous
-*/
-   if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
-   return NULL;
-
if (rq_data_dir(req) != rq_data_dir(next)
|| req->rq_disk != next->rq_disk
|| req_no_special_merge(next))
@@ -760,11 +770,16 @@ static struct request *attempt_merge(struct request_queue *q,
 * counts here. Handle DISCARDs separately, as they
 * have separate settings.
 */
-   if (req_op(req) == REQ_OP_DISCARD) {
+
+   if (blk_discard_mergable(req)) {
if (!req_attempt_discard_merge(q, req, next))
return NULL;
-   } else if (!ll_merge_requests_fn(q, req, next))
+   } else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next)) {
+   if (!ll_merge_requests_fn(q, req, next))
+   return NULL;
+   } else {
return NULL;
+   }
 
/*
 * If failfast settings disagree or any of the two is already
@@ -888,8 +903,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 
 enum elv_merge blk_try_merge(struct request *rq, struct bio *bio)
 {
-   if (req_op(rq) == REQ_OP_DISCARD &&
-   queue_max_discard_segments(rq->q) > 1)
+   if (blk_discard_mergable(rq))
return ELEVATOR_DISCARD_MERGE;
else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector)
return ELEVATOR_BACK_MERGE;
-- 
2.7.4

[PATCH V2] block: fix the DISCARD request merge

2018-10-20 Thread Jianchao Wang

There are two cases when handling a DISCARD merge:
 - max_discard_segments == 1
   The bios need to be contiguous.
 - max_discard_segments > 1
   Only nvme right now. It takes every bio as a range, and the
   ranges need not be contiguous.

But currently attempt_merge gets this wrong. It always requires
contiguity for DISCARD when max_discard_segments > 1, and it cannot
merge contiguous DISCARDs when max_discard_segments == 1, because
req_attempt_discard_merge always returns false in that case.
This patch fixes both cases.

Signed-off-by: Jianchao Wang 
---

V2:
  - Add max_discard_segments > 1 checking in attempt_merge
  - Change patch title and comment
  - Add more comment in attempt_merge

 block/blk-merge.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 42a4674..8f22374 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -734,8 +734,15 @@ static struct request *attempt_merge(struct request_queue *q,
/*
 * not contiguous
 */
-   if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
-   return NULL;
+   if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)) {
+   /*
+* When max_discard_segments is bigger than 1 (only nvme right
+* now), needn't consider the contiguity.
+*/
+   if (!(req_op(req) == REQ_OP_DISCARD &&
+ queue_max_discard_segments(q) > 1))
+   return NULL;
+   }
 
if (rq_data_dir(req) != rq_data_dir(next)
|| req->rq_disk != next->rq_disk
@@ -757,10 +764,17 @@ static struct request *attempt_merge(struct request_queue *q,
 * If we are allowed to merge, then append bio list
 * from next to rq and release next. merge_requests_fn
 * will have updated segment counts, update sector
-* counts here. Handle DISCARDs separately, as they
-* have separate settings.
+* counts here.
+* Two cases of Handling DISCARD:
+*  - max_discard_segments == 1
+*The bios need to be contiguous.
+*  - max_discard_segments > 1
+*Only nvme right now. It takes every bio as a
+*range and send them to controller together. The ranges
+*needn't to be contiguous.
 */
-   if (req_op(req) == REQ_OP_DISCARD) {
+   if (req_op(req) == REQ_OP_DISCARD &&
+   queue_max_discard_segments(q) > 1) {
if (!req_attempt_discard_merge(q, req, next))
return NULL;
} else if (!ll_merge_requests_fn(q, req, next))
-- 
2.7.4

[PATCH] block: don't check position contiguity for DISCARD in attempt_merge

2018-10-19 Thread Jianchao Wang

The discard command supports multiple ranges of blocks, so there is no
need to check position contiguity when merging. Do the same thing
in attempt_merge as in blk_try_merge.

Signed-off-by: Jianchao Wang 
---
 block/blk-merge.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 42a4674..c94749b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -732,9 +732,10 @@ static struct request *attempt_merge(struct request_queue *q,
return NULL;
 
/*
-* not contiguous
+* not contiguous, except for DISCARD
 */
-   if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
+   if ((req_op(req) != REQ_OP_DISCARD) &&
+   (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next)))
return NULL;
 
if (rq_data_dir(req) != rq_data_dir(next)
-- 
2.7.4

[PATCH V2 3/3] nvme-pci: unquiesce queues after update hw queues

2018-10-17 Thread Jianchao Wang

Updating the hw queues uses bio retrieve to drain the queues. Unquiescing
the queues before that is unnecessary and will cause requests to be
issued to a dead hw queue. So move the unquiescing of the queues, as
well as the wait freeze, to after the hw queues are updated.

Signed-off-by: Jianchao Wang 
---
 drivers/nvme/host/pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d668682..dbf6904 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2327,11 +2327,11 @@ static void nvme_reset_work(struct work_struct *work)
nvme_remove_namespaces(&dev->ctrl);
new_state = NVME_CTRL_ADMIN_ONLY;
} else {
-   nvme_start_queues(&dev->ctrl);
-   nvme_wait_freeze(&dev->ctrl);
/* hit this only when allocate tagset fails */
if (nvme_dev_add(dev))
new_state = NVME_CTRL_ADMIN_ONLY;
+   nvme_start_queues(&dev->ctrl);
+   nvme_wait_freeze(&dev->ctrl);
nvme_unfreeze(&dev->ctrl);
}
 
-- 
2.7.4

[PATCH V2 1/3] blk-mq: introduce bio retrieve mechanism

2018-10-17 Thread Jianchao Wang

Currently the request requeue mechanism does not work well with updating
nr_hw_queues. Because requests are tightly bound to a specific
hw queue, requests on a dying hw queue have to be failed, and
this could be fatal for a filesystem.

In addition, the request_queue needs to be frozen and drained before
updating nr_hw_queues. If an IO times out, we have to depend on the LLDD
to do recovery, but the recovery path may be sleeping while it waits for
the request_queue to be drained. The result is an IO hang.

To avoid the two cases above, we introduce a bio retrieve mechanism.
Bio retrieving does the following things:
 - flush requests on hctx->dispatch, the sw queue or the io scheduler queue
 - take the bios off the requests and end the requests
 - requeue these bios and submit them through generic_make_request
   again later.

Then we can avoid failing requests on a dying hw queue and avoid
depending on the storage device to drain the request_queue.

Signed-off-by: Jianchao Wang 
---
 block/blk-core.c   |  2 ++
 block/blk-mq-sched.c   | 88 ++
 block/blk-mq.c | 42 
 include/linux/blk-mq.h |  4 +++
 include/linux/blkdev.h |  2 ++
 5 files changed, 138 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index cdfabc5..f3c6fa8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -807,6 +807,8 @@ void blk_cleanup_queue(struct request_queue *q)
 
/* @q won't process any more request, flush async actions */
del_timer_sync(&q->backing_dev_info->laptop_mode_wb_timer);
+   if (q->mq_ops)
+   cancel_delayed_work_sync(&q->bio_requeue_work);
blk_sync_queue(q);
 
/*
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 29bfe80..9d0b2a2 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -422,6 +422,94 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
blk_mq_run_hw_queue(hctx, run_queue_async);
 }
 
+static void blk_mq_sched_retrieve_one_req(struct request *rq,
+   struct bio_list *list)
+{
+   struct bio *bio;
+
+   blk_steal_bios(list, rq);
+   blk_mq_end_request(rq, BLK_STS_OK);
+
+   bio_list_for_each(bio, list) {
+   /*
+* bio with BIO_QUEUE_ENTERED will enter queue with
+* blk_queue_enter_live.
+*/
+   bio_clear_flag(bio, BIO_QUEUE_ENTERED);
+   }
+}
+
+static void __blk_mq_sched_retrieve_bios(struct blk_mq_hw_ctx *hctx)
+{
+   struct request_queue *q = hctx->queue;
+   struct bio_list bio_list;
+   LIST_HEAD(rq_list);
+   struct request *rq;
+
+   bio_list_init(&bio_list);
+
+   if (!list_empty_careful(&hctx->dispatch)) {
+   spin_lock(&hctx->lock);
+   if (!list_empty(&hctx->dispatch))
+   list_splice_tail_init(&hctx->dispatch, &rq_list);
+   spin_unlock(&hctx->lock);
+   }
+
+   if (!q->elevator)
+   blk_mq_flush_busy_ctxs(hctx, &rq_list);
+
+   while (!list_empty(&rq_list)) {
+   rq = list_first_entry(&rq_list, struct request, queuelist);
+   list_del_init(&rq->queuelist);
+   blk_mq_sched_retrieve_one_req(rq, &bio_list);
+   }
+
+   if (q->elevator) {
+   struct elevator_queue *e = hctx->queue->elevator;
+
+   while (e->type->ops.mq.has_work &&
+   e->type->ops.mq.has_work(hctx)) {
+   rq = e->type->ops.mq.dispatch_request(hctx);
+   if (!rq)
+   continue;
+
+   blk_mq_sched_retrieve_one_req(rq, &bio_list);
+   }
+   }
+   /* For the request with RQF_FLUSH_SEQ, blk_mq_end_request cannot end them
+* but just push the flush sm. So there could still be rqs in flush queue,
+* the caller will check q_usage_counter and come back again.
+*/
+   blk_mq_requeue_bios(q, &bio_list, false);
+}
+
+/*
+ * When blk_mq_sched_retrieve_bios returns:
+ *  - All the rqs are ended, q_usage_counter is zero
+ *  - All the bios are queued to q->requeue_bios
+ */
+void blk_mq_sched_retrieve_bios(struct request_queue *q)
+{
+   struct blk_mq_hw_ctx *hctx;
+   int i;
+
+   BUG_ON(!atomic_read(&q->mq_freeze_depth) ||
+   !blk_queue_quiesced(q));
+
+   /*
+* Kick the requeue_work to flush the reqs in requeue_list
+*/
+   blk_mq_kick_requeue_list(q);
+
+   while (!percpu_ref_is_zero(&q->q_usage_counter)) {
+   queue_for_each_hw_ctx(q, hctx, i)
+   __blk_mq_sched_retrieve_bios(hctx);
+   }
+
+   blk_mq_requeue_bios(q, NULL, true);
+}
+EXPORT_SYMBOL_GPL(blk_mq_sched_retrieve_bios);
+
 static void blk_mq_sched_fre

[PATCH V2 2/3] blk-mq: retrieve bios before update nr_hw_queues

2018-10-17 Thread Jianchao Wang

Retrieve the bios of all requests on the queue to drain the requests,
so that we no longer depend on the storage device to drain the queue.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f75598b..6d89d3e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3085,7 +3085,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
return;
 
list_for_each_entry(q, &set->tag_list, tag_set_list)
-   blk_mq_freeze_queue(q);
+   blk_freeze_queue_start(q);
+
+   list_for_each_entry(q, &set->tag_list, tag_set_list)
+   blk_mq_sched_retrieve_bios(q);
/*
 * Sync with blk_mq_queue_tag_busy_iter.
 */
-- 
2.7.4

[PATCH V2 0/3] blk-mq: introduce bio retrieve mechanism

2018-10-17 Thread Jianchao Wang

Hi Jens

This patchset introduces a new bio retrieve mechanism. It flushes
all the requests, takes the bios down from them and ends them,
then submits the bios again later. This lets us avoid:
 - failing requests on the dying hw queues, which could be fatal for
   a filesystem.
 - depending on the storage device to drain the queue before updating
   the hw queues, which could introduce a hang in nvme_reset_work.

The 1st patch introduces the bio retrieve mechanism.
The 2nd patch applies the bio retrieve mechanism when updating hw queues.
The 3rd patch unquiesces the queues after updating hw queues in nvme_reset_work.

V2:
 - clear BIO_QUEUE_ENTERED of requeued bios (1/3)
 - sync bio_requeue_work in blk_cleanup_queue (1/3)
 - discard the unnecessary synchronize_sched (2/3)
 - discard the 4th patch which is wrong
 - some misc comment changes

Jianchao Wang(3)
blk-mq: introduce bio retrieve mechanism
blk-mq: retrieve bios before update nr_hw_queues
nvme-pci: unquiesce queues after update hw queues


 block/blk-core.c|  2 ++
 block/blk-mq-sched.c| 88 
 block/blk-mq.c  | 47 ++-
 drivers/nvme/host/pci.c |  4 ++--
 include/linux/blk-mq.h  |  4 
 include/linux/blkdev.h  |  2 ++
 6 files changed, 144 insertions(+), 3 deletions(-)

Thanks
Jianchao
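
As an illustration of the "take the bios down from the requests and end them" step described in this cover letter, here is a minimal userspace sketch in C. It is not the kernel code: every `toy_` name is hypothetical, the `entered` field merely stands in for the BIO_QUEUE_ENTERED flag, and the model assumes each request owns a NULL-terminated chain of bios.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bio, struct bio_list and struct request. */
struct toy_bio {
	struct toy_bio *next;
	int entered;		/* models the BIO_QUEUE_ENTERED flag */
};

struct toy_bio_list {
	struct toy_bio *head, *tail;
};

struct toy_rq {
	struct toy_bio *bio, *biotail;
	int ended;
};

static void toy_bio_list_init(struct toy_bio_list *list)
{
	list->head = list->tail = NULL;
}

/* Move the whole bio chain of rq to the tail of list; rq keeps no bios. */
static void toy_steal_bios(struct toy_bio_list *list, struct toy_rq *rq)
{
	if (!rq->bio)
		return;
	/* The chain must be properly terminated before it is spliced. */
	assert(rq->biotail && rq->biotail->next == NULL);

	if (list->tail)
		list->tail->next = rq->bio;
	else
		list->head = rq->bio;
	list->tail = rq->biotail;
	rq->bio = rq->biotail = NULL;
}

/* Retrieve one request: steal its bios, end it, clear the entered mark. */
static void toy_retrieve_one_req(struct toy_rq *rq, struct toy_bio_list *list)
{
	struct toy_bio *bio;

	toy_steal_bios(list, rq);
	rq->ended = 1;	/* the request itself completes without its bios */

	for (bio = list->head; bio; bio = bio->next)
		bio->entered = 0;
}

/* Count the bios waiting to be resubmitted. */
static size_t toy_bio_list_size(const struct toy_bio_list *list)
{
	const struct toy_bio *bio;
	size_t n = 0;

	for (bio = list->head; bio; bio = bio->next)
		n++;
	return n;
}
```

In this model the requests end immediately while their bios survive on a separate list, which is what allows the hw queues to be torn down without failing any IO. The resubmission step is left out of the sketch.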

[PATCH V2 0/2] block: kyber: make kyber more friendly with merging

2018-05-22 Thread Jianchao Wang

Hi Jens

This is the second version of the patchset to make the kyber io scheduler
more friendly with merging.

Most of the time, the kyber io scheduler does not leave any requests in the
ctx rq_list. This is because even if the tokens of one domain are used up,
kyber will try to dispatch requests from other domains and flush the rq_list
there. So there is no chance to merge.

To improve this, set up kyber_ctx_queue (kcq), which is similar to blk_mq_ctx
but has an rq_list for every domain, and build the same mapping between kcq and
khd as between ctx and hctx. Then we can merge, insert and dispatch on the
rq_lists of different domains separately. At the same time, only flush the
rq_list of a kcq after getting a domain token successfully, so the requests can
be left in the rq_list of that domain and may be merged with following io. In my
local test on an NVMe card, the sequential io performance improved a lot,
especially under high workload. For more details please refer to the second
patch.

Changes V1 to V2:
 - Add the patch from Jens which abstracts out the blk-mq-sched rq list
   iteration bio merge helper interface, and rebase this patch on top of it.
 - merge bio_sched_domain and rq_sched_domain to avoid the duplicated code.
 - allocate kcqs per khd

Jens Axboe (1)
0001-blk-mq-abstract-out-blk-mq-sched-rq-list-iteration-b.patch

Jianchao Wang (1)
0002-block-kyber-make-kyber-more-friendly-with-merging.patch

 block/blk-mq-sched.c   |  34 ++---
 block/kyber-iosched.c  | 197 -
 include/linux/blk-mq.h |   3 +-
 3 files changed, 188 insertions(+), 46 deletions(-)


Thanks
Jianchao
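
To make the per-domain queueing idea in this cover letter concrete, here is a small userspace sketch in C. It is only a model under stated assumptions: all `toy_` names are hypothetical, the op encoding is not the kernel's REQ_OP_* encoding, counters stand in for the rq_list[] lists, and a successful token grab simply flushes the whole per-domain list.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_domain { TOY_READ, TOY_SYNC_WRITE, TOY_OTHER, TOY_NUM_DOMAINS };

/* Hypothetical, simplified operation types. */
enum toy_op { TOY_OP_READ, TOY_OP_WRITE, TOY_OP_DISCARD };

/* Classify an operation into a scheduling domain. */
static enum toy_domain toy_sched_domain(enum toy_op op, bool sync)
{
	if (op == TOY_OP_READ)
		return TOY_READ;
	if (op == TOY_OP_WRITE && sync)
		return TOY_SYNC_WRITE;
	return TOY_OTHER;
}

/* One software queue; pending[d] stands in for rq_list[d]. */
struct toy_kcq {
	unsigned int pending[TOY_NUM_DOMAINS];
};

/* Tokens currently available per domain. */
struct toy_tokens {
	unsigned int avail[TOY_NUM_DOMAINS];
};

/* Queue a request on the list of its own domain only. */
static void toy_insert(struct toy_kcq *kcq, enum toy_op op, bool sync)
{
	kcq->pending[toy_sched_domain(op, sync)]++;
}

/*
 * Try to dispatch from domain d. The list is flushed only after a token
 * was obtained; otherwise the requests stay queued, where later bios can
 * still be merged into them. Returns the number of requests flushed.
 */
static unsigned int toy_dispatch_domain(struct toy_kcq *kcq,
					struct toy_tokens *tok,
					enum toy_domain d)
{
	unsigned int flushed;

	assert(d < TOY_NUM_DOMAINS);

	if (tok->avail[d] == 0)
		return 0;

	tok->avail[d]--;
	flushed = kcq->pending[d];
	kcq->pending[d] = 0;
	return flushed;
}
```

The point of the model is the early return: with a single shared rq_list, dispatching for any domain that still has tokens would also drain the requests of the exhausted domain, which removes the merge candidates.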

[PATCH V2 2/2] block: kyber: make kyber more friendly with merging

2018-05-22 Thread Jianchao Wang

Currently, kyber is very unfriendly with merging. kyber depends
on the ctx rq_list to do merging; however, most of the time it does
not leave any requests in the ctx rq_list. This is because even if
the tokens of one domain are used up, kyber will try to dispatch
requests from other domains and flush the rq_list there.

To improve this, we set up kyber_ctx_queue (kcq), which is similar
to ctx but has rq_lists for the different domains, and build the same
mapping between kcq and khd as between ctx and hctx. Then we can merge,
insert and dispatch for different domains separately. At the same
time, only flush the rq_list of a kcq when a domain token is obtained
successfully. Then if the tokens of one domain are used up, the requests
can be left in the rq_list of that domain and may be merged with
following io.

Following is my test result on a machine with 8 cores and an NVMe card,
INTEL SSDPEKKR128G7

fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8
seq/random
+------+----------+-----------+------------+-----------------+---------+
|patch?| bw(MB/s) |   iops    | slat(usec) |   clat(usec)    |  merge  |
+------+----------+-----------+------------+-----------------+---------+
| w/o  |  606/612 | 151k/153k |  6.89/7.03 | 3349.21/3305.40 |   0/0   |
+------+----------+-----------+------------+-----------------+---------+
| w/   | 1083/616 | 277k/154k |  4.93/6.95 | 1830.62/3279.95 | 223k/3k |
+------+----------+-----------+------------+-----------------+---------+
When numjobs is set to 16, the bw and iops reach 1662MB/s and 425k
on my platform.

Signed-off-by: Jianchao Wang 
Tested-by: Holger Hoffstätte 
---
 block/kyber-iosched.c | 197 +-
 1 file changed, 162 insertions(+), 35 deletions(-)

diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 0d6d25e3..7ca1364 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -72,6 +72,15 @@ static const unsigned int kyber_batch_size[] = {
[KYBER_OTHER] = 8,
 };
 
+struct kyber_ctx_queue {
+   /*
+* Copied from blk_mq_ctx->index_hw
+*/
+   unsigned int index;
+   spinlock_t lock;
+   struct list_head rq_list[KYBER_NUM_DOMAINS];
+} cacheline_aligned_in_smp;
+
 struct kyber_queue_data {
struct request_queue *q;
 
@@ -99,6 +108,8 @@ struct kyber_hctx_data {
struct list_head rqs[KYBER_NUM_DOMAINS];
unsigned int cur_domain;
unsigned int batching;
+   struct kyber_ctx_queue *kcqs;
+   struct sbitmap kcq_map[KYBER_NUM_DOMAINS];
wait_queue_entry_t domain_wait[KYBER_NUM_DOMAINS];
struct sbq_wait_state *domain_ws[KYBER_NUM_DOMAINS];
atomic_t wait_index[KYBER_NUM_DOMAINS];
@@ -107,10 +118,8 @@ struct kyber_hctx_data {
 static int kyber_domain_wake(wait_queue_entry_t *wait, unsigned mode, int flags,
 void *key);
 
-static int rq_sched_domain(const struct request *rq)
+static int kyber_sched_domain(unsigned int op)
 {
-   unsigned int op = rq->cmd_flags;
-
if ((op & REQ_OP_MASK) == REQ_OP_READ)
return KYBER_READ;
else if ((op & REQ_OP_MASK) == REQ_OP_WRITE && op_is_sync(op))
@@ -284,6 +293,11 @@ static unsigned int kyber_sched_tags_shift(struct kyber_queue_data *kqd)
return kqd->q->queue_hw_ctx[0]->sched_tags->bitmap_tags.sb.shift;
 }
 
+static int kyber_bucket_fn(const struct request *rq)
+{
+   return kyber_sched_domain(rq->cmd_flags);
+}
+
 static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
 {
struct kyber_queue_data *kqd;
@@ -297,11 +311,10 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
goto err;
kqd->q = q;
 
-   kqd->cb = blk_stat_alloc_callback(kyber_stat_timer_fn, rq_sched_domain,
+   kqd->cb = blk_stat_alloc_callback(kyber_stat_timer_fn, kyber_bucket_fn,
  KYBER_NUM_DOMAINS, kqd);
if (!kqd->cb)
goto err_kqd;
-
/*
 * The maximum number of tokens for any scheduling domain is at least
 * the queue depth of a single hardware queue. If the hardware doesn't
@@ -376,15 +389,45 @@ static void kyber_exit_sched(struct elevator_queue *e)
kfree(kqd);
 }
 
+static void kyber_ctx_queue_init(struct kyber_ctx_queue *kcq)
+{
+   unsigned int i;
+
+   spin_lock_init(&kcq->lock);
+   for (i = 0; i < KYBER_NUM_DOMAINS; i++)
+   INIT_LIST_HEAD(&kcq->rq_list[i]);
+}
+
 static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 {
struct kyber_hctx_data *khd;
-   int i;
+   int i, sd;
 
khd = kmalloc_node(sizeof(*khd), GFP_KERNEL, hctx->numa_node);
if (!khd)
return -ENOMEM;
 
+   khd->kcqs = kmalloc_array_node(hctx->nr_ctx,
+   sizeof(struct kyber_ctx_queue),
+

[PATCH V2 1/2] blk-mq: abstract out blk-mq-sched rq list iteration bio merge helper

2018-05-22 Thread Jianchao Wang

From: Jens Axboe 

No functional changes in this patch, just a prep patch for utilizing
this in an IO scheduler.

Signed-off-by: Jens Axboe 
---
 block/blk-mq-sched.c   | 34 --
 include/linux/blk-mq.h |  3 ++-
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 25c14c5..b0f2c2a 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -268,19 +268,16 @@ bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
 EXPORT_SYMBOL_GPL(blk_mq_sched_try_merge);
 
 /*
- * Reverse check our software queue for entries that we could potentially
- * merge with. Currently includes a hand-wavy stop count of 8, to not spend
- * too much time checking for merges.
+ * Iterate list of requests and see if we can merge this bio with any
+ * of them.
  */
-static bool blk_mq_attempt_merge(struct request_queue *q,
-struct blk_mq_ctx *ctx, struct bio *bio)
+bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
+  struct bio *bio)
 {
struct request *rq;
int checked = 8;
 
-   lockdep_assert_held(&ctx->lock);
-
-   list_for_each_entry_reverse(rq, &ctx->rq_list, queuelist) {
+   list_for_each_entry_reverse(rq, list, queuelist) {
bool merged = false;
 
if (!checked--)
@@ -305,13 +302,30 @@ static bool blk_mq_attempt_merge(struct request_queue *q,
continue;
}
 
-   if (merged)
-   ctx->rq_merged++;
return merged;
}
 
return false;
 }
+EXPORT_SYMBOL_GPL(blk_mq_bio_list_merge);
+
+/*
+ * Reverse check our software queue for entries that we could potentially
+ * merge with. Currently includes a hand-wavy stop count of 8, to not spend
+ * too much time checking for merges.
+ */
+static bool blk_mq_attempt_merge(struct request_queue *q,
+struct blk_mq_ctx *ctx, struct bio *bio)
+{
+   lockdep_assert_held(&ctx->lock);
+
+   if (blk_mq_bio_list_merge(q, &ctx->rq_list, bio)) {
+   ctx->rq_merged++;
+   return true;
+   }
+
+   return false;
+}
 
 bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index ebc34a5..fb35517 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -259,7 +259,8 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
 void blk_mq_complete_request(struct request *rq);
-
+bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
+  struct bio *bio);
 bool blk_mq_queue_stopped(struct request_queue *q);
 void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx);
 void blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx);
-- 
2.7.4
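
For readers unfamiliar with the helper being factored out here, the bounded reverse scan can be modelled in userspace C roughly as follows. This is a sketch under assumptions, not the kernel implementation: the `toy_` names are hypothetical, requests are plain sector ranges in an array instead of a linked list, and "can merge" is reduced to simple adjacency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: a contiguous range of sectors. */
struct toy_rq {
	unsigned long start;
	unsigned long len;
};

/*
 * Walk the pending requests from newest (end of array) to oldest and try
 * to merge the bio range into one of them. At most 8 candidates are
 * examined so that the search cost stays bounded.
 */
static bool toy_list_merge(struct toy_rq *rqs, size_t n,
			   unsigned long bio_start, unsigned long bio_len)
{
	int checked = 8;
	size_t i;

	assert(rqs || n == 0);

	for (i = n; i-- > 0; ) {
		struct toy_rq *rq = &rqs[i];

		if (!checked--)
			break;

		if (rq->start + rq->len == bio_start) {
			/* back merge: the bio continues the request */
			rq->len += bio_len;
			return true;
		}
		if (bio_start + bio_len == rq->start) {
			/* front merge: the bio precedes the request */
			rq->start = bio_start;
			rq->len += bio_len;
			return true;
		}
	}

	return false;
}
```

Because the function only takes a list of requests and a bio, the same walk can be reused on any queue, which is the property an IO scheduler with its own per-domain lists needs.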

[PATCH] block: kyber: make kyber more friendly with merging

2018-05-22 Thread Jianchao Wang

Currently, kyber is very unfriendly with merging. kyber depends
on the ctx rq_list to do merging; however, most of the time it does
not leave any requests in the ctx rq_list. This is because even if
the tokens of one domain are used up, kyber will try to dispatch
requests from other domains and flush the rq_list there.

To improve this, we set up kyber_ctx_queue (kcq), which is similar
to ctx but has rq_lists for the different domains, and build the same
mapping between kcq and khd as between ctx and hctx. Then we can merge,
insert and dispatch for different domains separately. If the tokens
of one domain are used up, the requests can be left in the rq_list of
that domain and may be merged with following io.

Following is my test result on a machine with 8 cores and an NVMe card,
INTEL SSDPEKKR128G7

fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8
seq/random
+------+----------+-----------+------------+-----------------+---------+
|patch?| bw(MB/s) |   iops    | slat(usec) |   clat(usec)    |  merge  |
+------+----------+-----------+------------+-----------------+---------+
| w/o  |  606/612 | 151k/153k |  6.89/7.03 | 3349.21/3305.40 |   0/0   |
+------+----------+-----------+------------+-----------------+---------+
| w/   | 1083/616 | 277k/154k |  4.93/6.95 | 1830.62/3279.95 | 223k/3k |
+------+----------+-----------+------------+-----------------+---------+
When numjobs is set to 16, the bw and iops reach 1662MB/s and 425k
on my platform.

Signed-off-by: Jianchao Wang 
---
 block/kyber-iosched.c | 240 --
 1 file changed, 212 insertions(+), 28 deletions(-)

diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 0d6d25e3..04da05b 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -72,6 +72,15 @@ static const unsigned int kyber_batch_size[] = {
[KYBER_OTHER] = 8,
 };
 
+struct kyber_ctx_queue {
+   /*
+* Copied from blk_mq_ctx->index_hw
+*/
+   unsigned int index;
+   spinlock_t lock;
+   struct list_head rq_list[KYBER_NUM_DOMAINS];
+} cacheline_aligned_in_smp;
+
 struct kyber_queue_data {
struct request_queue *q;
 
@@ -84,6 +93,7 @@ struct kyber_queue_data {
 */
struct sbitmap_queue domain_tokens[KYBER_NUM_DOMAINS];
 
+   struct kyber_ctx_queue *ctx_queue;
/*
 * Async request percentage, converted to per-word depth for
 * sbitmap_get_shallow().
@@ -99,6 +109,8 @@ struct kyber_hctx_data {
struct list_head rqs[KYBER_NUM_DOMAINS];
unsigned int cur_domain;
unsigned int batching;
+   struct kyber_ctx_queue **kcqs;
+   struct sbitmap kcq_map[KYBER_NUM_DOMAINS];
wait_queue_entry_t domain_wait[KYBER_NUM_DOMAINS];
struct sbq_wait_state *domain_ws[KYBER_NUM_DOMAINS];
atomic_t wait_index[KYBER_NUM_DOMAINS];
@@ -284,6 +296,19 @@ static unsigned int kyber_sched_tags_shift(struct kyber_queue_data *kqd)
return kqd->q->queue_hw_ctx[0]->sched_tags->bitmap_tags.sb.shift;
 }
 
+static void kyber_ctx_queue_init(struct kyber_queue_data *kqd)
+{
+   unsigned int i, j;
+
+   for (i = 0; i < nr_cpu_ids; i++) {
+   struct kyber_ctx_queue *kcq = &kqd->ctx_queue[i];
+
+   spin_lock_init(&kcq->lock);
+   for (j = 0; j < KYBER_NUM_DOMAINS; j++)
+   INIT_LIST_HEAD(&kcq->rq_list[j]);
+   }
+}
+
 static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
 {
struct kyber_queue_data *kqd;
@@ -302,6 +327,13 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
if (!kqd->cb)
goto err_kqd;
 
+   kqd->ctx_queue = kmalloc_array_node(nr_cpu_ids,
+   sizeof(struct kyber_ctx_queue), GFP_KERNEL, -1);
+   if (!kqd->ctx_queue)
+   goto err_cb;
+
+   kyber_ctx_queue_init(kqd);
+
/*
 * The maximum number of tokens for any scheduling domain is at least
 * the queue depth of a single hardware queue. If the hardware doesn't
@@ -318,7 +350,7 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
if (ret) {
while (--i >= 0)
sbitmap_queue_free(&kqd->domain_tokens[i]);
-   goto err_cb;
+   goto err_kcq;
}
sbitmap_queue_resize(&kqd->domain_tokens[i], kyber_depth[i]);
}
@@ -331,6 +363,8 @@ static struct kyber_queue_data *kyber_queue_data_alloc(struct request_queue *q)
 
return kqd;
 
+err_kcq:
+   kfree(kqd->ctx_queue);
 err_cb:
blk_stat_free_callback(kqd->cb);
 err_kqd:
@@ -372,6 +406,7 @@ static void kyber_exit_sched(struct elevator_queue *e)
 
for (i = 0; i < KYBER_NUM_DOMAINS; i++)
sbitmap_queue_free(&kqd->domain_tokens[i

[PATCH] blk-mq: add plug trace event for multiple hw queues case

2018-05-22 Thread Jianchao Wang

There is no plug trace event in the multiple hw queues case. This is
confusing when checking the block trace event log and finding only an
unplug event there. Add a plug trace event when a request is added to
an empty plug list.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9ce9cac..ddf726a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1935,6 +1935,9 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
same_queue_rq = NULL;
if (same_queue_rq)
list_del_init(&same_queue_rq->queuelist);
+   else
+   trace_block_plug(q);
+
list_add_tail(&rq->queuelist, &plug->mq_list);
 
blk_mq_put_ctx(data.ctx);
-- 
2.7.4

[PATCH] blk-mq: start request gstate with gen 1

2018-04-16 Thread Jianchao Wang

rq->gstate and rq->aborted_gstate are both zero before rqs are
allocated. If we have a small timeout, then when the timer fires
there could be rqs that have never been allocated, and also rqs
that have been allocated but not yet initialized and started. At
that moment rq->gstate and rq->aborted_gstate are both 0, so
blk_mq_terminate_expired will identify the rq as timed out and
invoke .timeout early.

For scsi, this causes scsi_times_out to be invoked before the
scsi_cmnd is initialized. scsi_cmnd->device is still NULL at that
moment, so we get a crash.

Cc: Bart Van Assche 
Cc: Tejun Heo 
Cc: Ming Lei 
Cc: Martin Steigerwald 
Cc: sta...@vger.kernel.org
Signed-off-by: Jianchao Wang 
---
 block/blk-core.c | 4 
 block/blk-mq.c   | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index abcb868..ce62681 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
rq->part = NULL;
seqcount_init(&rq->gstate_seq);
u64_stats_init(&rq->aborted_gstate_sync);
+   /*
+* See comment of blk_mq_init_request
+*/
+   WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f5c7dbc..d62030a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 
seqcount_init(&rq->gstate_seq);
u64_stats_init(&rq->aborted_gstate_sync);
+   /*
+* start gstate with gen 1 instead of 0, otherwise it will be equal
+* to aborted_gstate, and be identified timed out by
+* blk_mq_terminate_expired.
+*/
+   WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
+
return 0;
 }
 
-- 
2.7.4
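
The effect of starting gstate at generation 1 can be illustrated with a small userspace model in C. This is a sketch under assumptions rather than the kernel code: the `toy_` names are hypothetical, the two-bit state field below the generation counter is an assumed layout, and the timeout check is reduced to the equality test that the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: low bits hold the state, the bits above a generation. */
#define MQ_RQ_STATE_BITS 2
#define MQ_RQ_GEN_INC    (1ULL << MQ_RQ_STATE_BITS)

/* Hypothetical stand-in for the two request fields being compared. */
struct toy_rq {
	uint64_t gstate;
	uint64_t aborted_gstate;
};

/* A request is treated as timed out when gstate equals aborted_gstate. */
static bool toy_rq_looks_expired(const struct toy_rq *rq)
{
	assert(rq != NULL);
	return rq->gstate == rq->aborted_gstate;
}

/* Old behaviour: both fields start at zero. */
static struct toy_rq toy_rq_init_gen0(void)
{
	struct toy_rq rq = { .gstate = 0, .aborted_gstate = 0 };

	return rq;
}

/* Patched behaviour: gstate starts at generation 1. */
static struct toy_rq toy_rq_init_gen1(void)
{
	struct toy_rq rq = { .gstate = MQ_RQ_GEN_INC, .aborted_gstate = 0 };

	return rq;
}
```

With the old initialization a request that was never started is indistinguishable from one whose current generation was marked aborted. With the new one the two values differ until the timeout code deliberately copies gstate into aborted_gstate.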

[PATCH] blk-mq: mark hctx RESTART when get budget fails

2018-04-16 Thread Jianchao Wang

When getting a budget fails, blk_mq_sched_dispatch_requests does not
do anything to ensure that the hctx is restarted. We survive this
today only because scsi is the sole implementer of .get_budget and it
always runs the hctx queues when a request is completed. Mark the hctx
RESTART in this case instead of relying on that.

Signed-off-by: Jianchao Wang 
---
 block/blk-mq-sched.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 25c14c5..49e7cd9 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -102,8 +102,10 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
!e->type->ops.mq.has_work(hctx))
break;
 
-   if (!blk_mq_get_dispatch_budget(hctx))
+   if (!blk_mq_get_dispatch_budget(hctx)) {
+   blk_mq_sched_mark_restart_hctx(hctx);
break;
+   }
 
rq = e->type->ops.mq.dispatch_request(hctx);
if (!rq) {
@@ -148,8 +150,10 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
if (!sbitmap_any_bit_set(&hctx->ctx_map))
break;
 
-   if (!blk_mq_get_dispatch_budget(hctx))
+   if (!blk_mq_get_dispatch_budget(hctx)) {
+   blk_mq_sched_mark_restart_hctx(hctx);
break;
+   }
 
rq = blk_mq_dequeue_from_ctx(hctx, ctx);
if (!rq) {
-- 
2.7.4
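
The interaction between a failed budget grab and the restart mark can be sketched in userspace C as below. It is a simplified model, not the kernel logic: all `toy_` names are hypothetical, the budget is a plain counter, and the completion path reruns the dispatch loop directly instead of scheduling a queue run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hardware queue context. */
struct toy_hctx {
	unsigned int queued;	/* requests waiting to be dispatched */
	unsigned int budget;	/* device budget currently available */
	bool restart;		/* stands in for the RESTART mark */
	unsigned int inflight;	/* requests issued and not yet completed */
};

/* Dispatch until the queue is empty or the budget runs out. */
static void toy_dispatch(struct toy_hctx *h)
{
	while (h->queued) {
		if (h->budget == 0) {
			/* Leave a mark so that a completion reruns us. */
			h->restart = true;
			break;
		}
		h->budget--;
		h->queued--;
		h->inflight++;
	}
}

/* A completion returns its budget and reruns the queue only if marked. */
static void toy_complete_one(struct toy_hctx *h)
{
	assert(h->inflight > 0);

	h->inflight--;
	h->budget++;

	if (h->restart) {
		h->restart = false;
		toy_dispatch(h);
	}
}
```

In this model, if toy_dispatch() did not set the mark, the remaining requests would be dispatched only if some other code happened to rerun the queue, which is the dependency on driver behaviour that the commit message points out.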
