Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Ming Lei
On Thu, Jan 18, 2018 at 05:49:10PM -0700, Jens Axboe wrote:
> On 1/18/18 5:35 PM, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 05:24:29PM -0700, Jens Axboe wrote:
> >> On 1/18/18 5:18 PM, Ming Lei wrote:
> >>> On Fri, Jan 19, 2018 at 12:14:24AM +, Bart Van Assche wrote:
>  On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
> >> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> >> index f16096af879a..c59c59cfd2a5 100644
> >> --- a/drivers/md/dm-rq.c
> >> +++ b/drivers/md/dm-rq.c
> >> @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> >>/* Undo dm_start_request() before requeuing */
> >>rq_end_stats(md, rq);
> >>rq_completed(md, rq_data_dir(rq), false);
> >> +  blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
> >>return BLK_STS_RESOURCE;
> >>}
> >>  
> >
> > Nak.
> 
>  This patch fixes a regression that was introduced by you. You should know
>  that regressions are not acceptable. If you don't agree with this patch,
>  please fix the root cause.
> >>>
> >>> Yesterday I sent a patch, did you test that?
> >>
> >> That patch isn't going to be of much help. It might prevent you from
> >> completely stalling, but it might take you tens of seconds to get there.
> > 
> > Yeah, that is true, and that is why I marked it as RFC, but such a case
> > is rare.
> 
> You don't know that. If the resource is very limited, or someone else
> is gobbling up all of it, then it may trigger quite often. Granted,
> in that case, you need some way of signaling the freeing of those
> resources to make it efficient.
> 
> >> On top of that, it's a rolling timer that gets added when IO is queued,
> >> and re-added if IO is pending when it fires. If the latter case is not
> >> true, then it won't get armed again. So if IO fails to queue without
> >> any other IO pending, you're pretty much in the same situation as if
> > 
> > Detecting that no IO is pending may take a bit of time; we can do it
> > after BLK_STS_RESOURCE is returned and SCHED_RESTART is set. I have done
> > it this way in the following patch; what do you think of it?
> > 
> > https://github.com/ming1/linux/commit/f866ce0b97b0ae22d033881da7eb07706fd458b4
> 
> I think it's overly complicated. As I wrote in a lengthier email earlier
> today, just ensure that we have a unique return code from the driver for
> when it aborts queuing an IO due to a resource shortage that isn't tied
> to its own consumption of it. When we get that return, do a delayed
> queue run after X amount of time to ensure we always try again. If you
> want to get fancy (later on), you could track the frequency of such
> returns and complain if it's too high.

Suppose the newly introduced return code is BLK_STS_NO_DEV_RESOURCE.

This way may degrade performance. For example, DM-MPATH returns
BLK_STS_NO_DEV_RESOURCE when blk_get_request() returns NULL, and blk-mq
handles it with blk_mq_delay_run_hw_queue(10ms). blk_mq_sched_restart() is
then only triggered when one DM-MPATH request completes, which means one
request of the underlying queue has completed too, yet blk_mq_sched_restart()
still can't make progress during the 10ms.

That means we should only run blk_mq_delay_run_hw_queue(10ms/100ms) when
the queue is idle.
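
Roughly, the idea (untested sketch; blk_mq_hctx_is_idle() is a made-up
helper for illustration, the real check has to look at the in-flight tags):

        /*
         * Sketch only: arm the delayed run only when nothing is in flight,
         * because any in-flight request will trigger a SCHED_RESTART based
         * rerun from its completion path anyway.
         */
        if (ret == BLK_STS_NO_DEV_RESOURCE && blk_mq_hctx_is_idle(hctx))
                blk_mq_delay_run_hw_queue(hctx, 10 /* ms */);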

I will post the patch on github for discussion.

Thanks,
Ming


Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Jens Axboe
On 1/18/18 5:35 PM, Ming Lei wrote:
> On Thu, Jan 18, 2018 at 05:24:29PM -0700, Jens Axboe wrote:
>> On 1/18/18 5:18 PM, Ming Lei wrote:
>>> On Fri, Jan 19, 2018 at 12:14:24AM +, Bart Van Assche wrote:
 On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
> On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
>> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
>> index f16096af879a..c59c59cfd2a5 100644
>> --- a/drivers/md/dm-rq.c
>> +++ b/drivers/md/dm-rq.c
>> @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  /* Undo dm_start_request() before requeuing */
>>  rq_end_stats(md, rq);
>>  rq_completed(md, rq_data_dir(rq), false);
>> +blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
>>  return BLK_STS_RESOURCE;
>>  }
>>  
>
> Nak.

 This patch fixes a regression that was introduced by you. You should know
 that regressions are not acceptable. If you don't agree with this patch,
 please fix the root cause.
>>>
>>> Yesterday I sent a patch, did you test that?
>>
>> That patch isn't going to be of much help. It might prevent you from
>> completely stalling, but it might take you tens of seconds to get there.
> 
> Yeah, that is true, and that is why I marked it as RFC, but such a case
> is rare.

You don't know that. If the resource is very limited, or someone else
is gobbling up all of it, then it may trigger quite often. Granted,
in that case, you need some way of signaling the freeing of those
resources to make it efficient.

>> On top of that, it's a rolling timer that gets added when IO is queued,
>> and re-added if IO is pending when it fires. If the latter case is not
>> true, then it won't get armed again. So if IO fails to queue without
>> any other IO pending, you're pretty much in the same situation as if
> 
> Detecting that no IO is pending may take a bit of time; we can do it
> after BLK_STS_RESOURCE is returned and SCHED_RESTART is set. I have done
> it this way in the following patch; what do you think of it?
> 
> https://github.com/ming1/linux/commit/f866ce0b97b0ae22d033881da7eb07706fd458b4

I think it's overly complicated. As I wrote in a lengthier email earlier
today, just ensure that we have a unique return code from the driver for
when it aborts queuing an IO due to a resource shortage that isn't tied
to its own consumption of it. When we get that return, do a delayed
queue run after X amount of time to ensure we always try again. If you
want to get fancy (later on), you could track the frequency of such
returns and complain if it's too high.

There's no point in adding a lot of code to check whether we need to run
the queue or not. Just always do it. We won't be doing it more than 100
times per second for the worst case of the condition taking a while to
clear, if we stick to the 10ms re-run time.
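
Something like this in the dispatch path (hand-waving sketch; the
BLK_STS_NO_DEV_RESOURCE name and the 10ms constant are placeholders, not
actual kernel code):

        ret = q->mq_ops->queue_rq(hctx, &bd);
        if (ret == BLK_STS_NO_DEV_RESOURCE) {
                /*
                 * Resource shortage outside the driver's control: nobody
                 * will signal us when it clears, so park the request on
                 * the dispatch list and always re-run after a delay.
                 */
                list_add(&rq->queuelist, list);
                blk_mq_delay_run_hw_queue(hctx, 10 /* ms */);
                break;
        }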

-- 
Jens Axboe



Re: dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 08:26 +0800, Ming Lei wrote:
> No, this case won't return BLK_STS_RESOURCE.

It's possible that my attempt to reverse engineer the latest blk-mq changes
was wrong. But the queue stall is real and needs to be fixed.

Bart.

Re: dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Ming Lei
On Thu, Jan 18, 2018 at 05:13:53PM +, Bart Van Assche wrote:
> On Thu, 2018-01-18 at 11:50 -0500, Mike Snitzer wrote:
> > The issue you say it was originally intended to fix _should_ be
> > addressed with this change:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=4dd6edd23e7ea971efddc303f9e67eb79e95808e
> 
> Hello Mike,
> 
> Sorry but I'm not convinced that that patch is sufficient. That patch helps
> if .end_io() is called with status BLK_STS_RESOURCE and also if
> blk_insert_cloned_request() returns the .queue_rq() return value. It does not
> help if .queue_rq() returns BLK_STS_RESOURCE and that return value gets
> ignored.

The return value from .queue_rq() is handled by blk-mq; why do you think
it can be ignored? Please see blk_mq_dispatch_rq_list().

> I think that can happen as follows:
> - Request cloning in multipath_clone_and_map() succeeds and that function
>   returns DM_MAPIO_REMAPPED.
> - dm_dispatch_clone_request() calls blk_insert_cloned_request().
> - blk_insert_cloned_request() calls blk_mq_request_direct_issue(), which
>   results in a call of __blk_mq_try_issue_directly().
> - __blk_mq_try_issue_directly() calls blk_mq_sched_insert_request(). In this

This only happens iff the queue is stopped or quiesced, and then we return
BLK_STS_OK to blk-mq via .queue_rq(); please see __blk_mq_try_issue_directly().
How does this cause an IO hang?
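
The relevant branch looks roughly like this (paraphrased and simplified,
not the literal source):

        /* in __blk_mq_try_issue_directly(): */
        if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
                /* Fall back to normal insertion; the request is not lost. */
                blk_mq_sched_insert_request(rq, false, run_queue, false, false);
                return BLK_STS_OK;
        }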

>   case the BLK_STS_RESOURCE returned by the .queue_rq() implementation of the
>   underlying path will be ignored.

No, this case won't return BLK_STS_RESOURCE.

-- 
Ming


Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Jens Axboe
On 1/18/18 5:18 PM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 12:14:24AM +, Bart Van Assche wrote:
>> On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
>>> On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
 diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
 index f16096af879a..c59c59cfd2a5 100644
 --- a/drivers/md/dm-rq.c
 +++ b/drivers/md/dm-rq.c
 @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
/* Undo dm_start_request() before requeuing */
rq_end_stats(md, rq);
rq_completed(md, rq_data_dir(rq), false);
 +  blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
return BLK_STS_RESOURCE;
}
  
>>>
>>> Nak.
>>
>> This patch fixes a regression that was introduced by you. You should know
>> that regressions are not acceptable. If you don't agree with this patch,
>> please fix the root cause.
> 
> Yesterday I sent a patch, did you test that?

That patch isn't going to be of much help. It might prevent you from
completely stalling, but it might take you tens of seconds to get there.

On top of that, it's a rolling timer that gets added when IO is queued,
and re-added if IO is pending when it fires. If the latter case is not
true, then it won't get armed again. So if IO fails to queue without
any other IO pending, you're pretty much in the same situation as if
you marked the queue as restart. Nobody is going to notice
either of those conditions.

-- 
Jens Axboe



Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 08:18 +0800, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 12:14:24AM +, Bart Van Assche wrote:
> > On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
> > > On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
> > > > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > > > index f16096af879a..c59c59cfd2a5 100644
> > > > --- a/drivers/md/dm-rq.c
> > > > +++ b/drivers/md/dm-rq.c
> > > > @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> > > > /* Undo dm_start_request() before requeuing */
> > > > rq_end_stats(md, rq);
> > > > rq_completed(md, rq_data_dir(rq), false);
> > > > +   blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
> > > > return BLK_STS_RESOURCE;
> > > > }
> > > >  
> > > 
> > > Nak.
> > 
> > This patch fixes a regression that was introduced by you. You should know
> > that regressions are not acceptable. If you don't agree with this patch,
> > please fix the root cause.
> 
> Yesterday I sent a patch, did you test that?

Yes, I did. It caused queue stalls that were so bad that sending "kick" to the
debugfs "state" attribute was not sufficient to resolve the queue stall.

Bart.

Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Ming Lei
On Fri, Jan 19, 2018 at 12:14:24AM +, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
> > > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > > index f16096af879a..c59c59cfd2a5 100644
> > > --- a/drivers/md/dm-rq.c
> > > +++ b/drivers/md/dm-rq.c
> > > @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> > >   /* Undo dm_start_request() before requeuing */
> > >   rq_end_stats(md, rq);
> > >   rq_completed(md, rq_data_dir(rq), false);
> > > + blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
> > >   return BLK_STS_RESOURCE;
> > >   }
> > >  
> > 
> > Nak.
> 
> This patch fixes a regression that was introduced by you. You should know
> that regressions are not acceptable. If you don't agree with this patch,
> please fix the root cause.

Yesterday I sent a patch, did you test that?

-- 
Ming


Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote:
> On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
> > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > index f16096af879a..c59c59cfd2a5 100644
> > --- a/drivers/md/dm-rq.c
> > +++ b/drivers/md/dm-rq.c
> > @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> > /* Undo dm_start_request() before requeuing */
> > rq_end_stats(md, rq);
> > rq_completed(md, rq_data_dir(rq), false);
> > +   blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
> > return BLK_STS_RESOURCE;
> > }
> >  
> 
> Nak.

This patch fixes a regression that was introduced by you. You should know
that regressions are not acceptable. If you don't agree with this patch,
please fix the root cause.

Bart.

Re: [PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Ming Lei
On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote:
> If the .queue_rq() implementation of a block driver returns
> BLK_STS_RESOURCE then that block driver is responsible for
> rerunning the queue once the condition that caused it to return
> BLK_STS_RESOURCE has been cleared. The dm-mpath driver tells the
> dm core to requeue a request if e.g. not enough memory is
> available for cloning a request or if the underlying path is
> busy. Since the dm-mpath driver does not receive any kind of
> notification if the condition that caused it to return "requeue"
> is cleared, the only solution to avoid that dm-mpath request
> processing stalls is to call blk_mq_delay_run_hw_queue(). Hence
> this patch.
> 
> Fixes: ec3eaf9a6731 ("dm mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE")
> Signed-off-by: Bart Van Assche 
> Cc: Ming Lei 
> ---
>  drivers/md/dm-rq.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index f16096af879a..c59c59cfd2a5 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>   /* Undo dm_start_request() before requeuing */
>   rq_end_stats(md, rq);
>   rq_completed(md, rq_data_dir(rq), false);
> + blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
>   return BLK_STS_RESOURCE;
>   }
>  

Nak.

It still takes a bit of time to add this request to hctx->dispatch_list
after this point, so suppose that time is longer than 100ms because of an
interrupt, preemption or whatever: then this request can't be observed by
the scheduled queue run (__blk_mq_run_hw_queue).
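
In other words, the window is:

        /*
         * dm_mq_queue_rq()                      delayed work (at +100ms)
         * ------------------------------------  ------------------------
         * blk_mq_delay_run_hw_queue(hctx, 100)
         * return BLK_STS_RESOURCE
         * ...IRQ/preemption for > 100ms...      __blk_mq_run_hw_queue()
         *                                         hctx->dispatch is still
         *                                         empty, nothing to do
         * blk-mq adds rq to hctx->dispatch      (no further run scheduled)
         */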

Not to mention it is just an ugly workaround, which degrades performance
a lot.

-- 
Ming


Re: dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
On Thu, 2018-01-18 at 11:50 -0500, Mike Snitzer wrote:
> The issue you say it was originally intended to fix _should_ be
> addressed with this change:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=4dd6edd23e7ea971efddc303f9e67eb79e95808e

Hello Mike,

Sorry but I'm not convinced that that patch is sufficient. That patch helps
if .end_io() is called with status BLK_STS_RESOURCE and also if
blk_insert_cloned_request() returns the .queue_rq() return value. It does not
help if .queue_rq() returns BLK_STS_RESOURCE and that return value gets
ignored. I think that can happen as follows:
- Request cloning in multipath_clone_and_map() succeeds and that function
  returns DM_MAPIO_REMAPPED.
- dm_dispatch_clone_request() calls blk_insert_cloned_request().
- blk_insert_cloned_request() calls blk_mq_request_direct_issue(), which
  results in a call of __blk_mq_try_issue_directly().
- __blk_mq_try_issue_directly() calls blk_mq_sched_insert_request(). In this
  case the BLK_STS_RESOURCE returned by the .queue_rq() implementation of the
  underlying path will be ignored.
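
Condensed into a call tree (hand-drawn and simplified):

  blk_insert_cloned_request(q, rq)
    blk_mq_request_direct_issue(rq)
      __blk_mq_try_issue_directly(hctx, rq, ...)
        blk_mq_sched_insert_request(rq, ...)
          /* the BLK_STS_RESOURCE from the underlying .queue_rq()
             does not propagate back to dm-rq on this path */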

Please note that I have not tried to find all paths that ignore the
.queue_rq() return value.

Thanks,

Bart.

Re: dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Mike Snitzer
On Thu, Jan 18 2018 at 11:37am -0500,
Bart Van Assche  wrote:

> If the .queue_rq() implementation of a block driver returns
> BLK_STS_RESOURCE then that block driver is responsible for
> rerunning the queue once the condition that caused it to return
> BLK_STS_RESOURCE has been cleared. The dm-mpath driver tells the
> dm core to requeue a request if e.g. not enough memory is
> available for cloning a request or if the underlying path is
> busy. Since the dm-mpath driver does not receive any kind of
> notification if the condition that caused it to return "requeue"
> is cleared, the only solution to avoid that dm-mpath request
> processing stalls is to call blk_mq_delay_run_hw_queue(). Hence
> this patch.
> 
> Fixes: ec3eaf9a6731 ("dm mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE")
> Signed-off-by: Bart Van Assche 
> Cc: Ming Lei 
> ---
>  drivers/md/dm-rq.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index f16096af879a..c59c59cfd2a5 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>   /* Undo dm_start_request() before requeuing */
>   rq_end_stats(md, rq);
>   rq_completed(md, rq_data_dir(rq), false);
> + blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
>   return BLK_STS_RESOURCE;
>   }
>  
> -- 
> 2.15.1
> 

Sorry but we need to understand why you still need this.

The issue you say it was originally intended to fix _should_ be
addressed with this change:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=4dd6edd23e7ea971efddc303f9e67eb79e95808e

So if you still feel you need to blindly kick the queue then it is very
likely a bug in blk-mq (either its RESTART heuristics or whatever
internal implementation is lacking).

Did you try Ming's RFC patch to "fixup RESTART" before resorting to the
above again? See: https://patchwork.kernel.org/patch/10172315/

Thanks,
Mike


[PATCH] dm rq: Avoid that request processing stalls sporadically

2018-01-18 Thread Bart Van Assche
If the .queue_rq() implementation of a block driver returns
BLK_STS_RESOURCE then that block driver is responsible for
rerunning the queue once the condition that caused it to return
BLK_STS_RESOURCE has been cleared. The dm-mpath driver tells the
dm core to requeue a request if e.g. not enough memory is
available for cloning a request or if the underlying path is
busy. Since the dm-mpath driver does not receive any kind of
notification if the condition that caused it to return "requeue"
is cleared, the only solution to avoid that dm-mpath request
processing stalls is to call blk_mq_delay_run_hw_queue(). Hence
this patch.

Fixes: ec3eaf9a6731 ("dm mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE")
Signed-off-by: Bart Van Assche 
Cc: Ming Lei 
---
 drivers/md/dm-rq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index f16096af879a..c59c59cfd2a5 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
/* Undo dm_start_request() before requeuing */
rq_end_stats(md, rq);
rq_completed(md, rq_data_dir(rq), false);
+   blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
return BLK_STS_RESOURCE;
}
 
-- 
2.15.1



Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-16 Thread Ming Lei
On Fri, Apr 14, 2017 at 05:12:50PM +, Bart Van Assche wrote:
> On Fri, 2017-04-14 at 09:13 +0800, Ming Lei wrote:
> > On Thu, Apr 13, 2017 at 09:59:57AM -0700, Bart Van Assche wrote:
> > > On 04/12/17 19:20, Ming Lei wrote:
> > > > On Wed, Apr 12, 2017 at 06:38:07PM +, Bart Van Assche wrote:
> > > > > If the blk-mq core would always rerun a hardware queue if a block 
> > > > > driver
> > > > > returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single 
> > > > > CPU core
> > > > 
> > > > It won't cause 100% CPU utilization since we restart the queue in the
> > > > completion path, and at that time at least one tag is available, so
> > > > progress can be made.
> > > 
> > > Hello Ming,
> > > 
> > > Sorry but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
> > > then it's likely that calling .queue_rq() again after only a few
> > > microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
> > > don't believe me, change "if (!blk_mq_sched_needs_restart(hctx) &&
> > > !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state)) blk_mq_run_hw_queue(hctx,
> > > true);" into "blk_mq_run_hw_queue(hctx, true);", trigger a busy
> > 
> > Yes, that can be true, but I mean it is still OK to run the queue again
> > with
> > 
> > if (!blk_mq_sched_needs_restart(hctx) &&
> > !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> > blk_mq_run_hw_queue(hctx, true);
> > 
> > and restarting queue in __blk_mq_finish_request() when
> > BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(). And both are in current
> > blk-mq implementation.
> > 
> > Then why do we need blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in dm?
> 
> Because if dm_mq_queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY, there is no
> guarantee that __blk_mq_finish_request() will be called later on for the
> same queue. dm_mq_queue_rq() can e.g. return BLK_MQ_RQ_QUEUE_BUSY while no
> dm requests are in progress because the SCSI error handler is active for
> all underlying paths. See also scsi_lld_busy() and scsi_host_in_recovery().

OK, thanks Bart for the explanation.

Looks like a very interesting BLK_MQ_RQ_QUEUE_BUSY case which isn't caused
by too many pending I/Os; I will study this case more.


Thanks,
Ming


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-14 Thread Bart Van Assche
On Fri, 2017-04-14 at 09:13 +0800, Ming Lei wrote:
> On Thu, Apr 13, 2017 at 09:59:57AM -0700, Bart Van Assche wrote:
> > On 04/12/17 19:20, Ming Lei wrote:
> > > On Wed, Apr 12, 2017 at 06:38:07PM +, Bart Van Assche wrote:
> > > > If the blk-mq core would always rerun a hardware queue if a block driver
> > > > returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU 
> > > > core
> > > 
> > > It won't cause 100% CPU utilization since we restart the queue in the
> > > completion path, and at that time at least one tag is available, so
> > > progress can be made.
> > 
> > Hello Ming,
> > 
> > Sorry but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
> > then it's likely that calling .queue_rq() again after only a few
> > microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
> > don't believe me, change "if (!blk_mq_sched_needs_restart(hctx) &&
> > !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state)) blk_mq_run_hw_queue(hctx,
> > true);" into "blk_mq_run_hw_queue(hctx, true);", trigger a busy
> 
> Yes, that can be true, but I mean it is still OK to run the queue again
> with
> 
>   if (!blk_mq_sched_needs_restart(hctx) &&
>   !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
>   blk_mq_run_hw_queue(hctx, true);
> 
> and restarting queue in __blk_mq_finish_request() when
> BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(). And both are in current
> blk-mq implementation.
> 
> Then why do we need blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in dm?

Because if dm_mq_queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY, there is no
guarantee that __blk_mq_finish_request() will be called later on for the
same queue. dm_mq_queue_rq() can e.g. return BLK_MQ_RQ_QUEUE_BUSY while no
dm requests are in progress because the SCSI error handler is active for
all underlying paths. See also scsi_lld_busy() and scsi_host_in_recovery().
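
For reference, scsi_host_in_recovery() is just a host state check (quoted
from memory, see include/scsi/scsi_host.h):

static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
{
        return shost->shost_state == SHOST_RECOVERY ||
                shost->shost_state == SHOST_CANCEL_RECOVERY ||
                shost->shost_state == SHOST_DEL_RECOVERY;
}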

Bart.

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-13 Thread Ming Lei
On Thu, Apr 13, 2017 at 09:59:57AM -0700, Bart Van Assche wrote:
> On 04/12/17 19:20, Ming Lei wrote:
> > On Wed, Apr 12, 2017 at 06:38:07PM +, Bart Van Assche wrote:
> >> If the blk-mq core would always rerun a hardware queue if a block driver
> >> returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU 
> >> core
> > 
> > It won't cause 100% CPU utilization since we restart the queue in the
> > completion path, and at that time at least one tag is available, so
> > progress can be made.
> 
> Hello Ming,
> 
> Sorry but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
> then it's likely that calling .queue_rq() again after only a few
> microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
> don't believe me, change "if (!blk_mq_sched_needs_restart(hctx) &&
> !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state)) blk_mq_run_hw_queue(hctx,
> true);" into "blk_mq_run_hw_queue(hctx, true);", trigger a busy

Yes, that can be true, but I mean it is still OK to run the queue again
with

        if (!blk_mq_sched_needs_restart(hctx) &&
            !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
                blk_mq_run_hw_queue(hctx, true);

and restarting queue in __blk_mq_finish_request() when
BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(). And both are in current
blk-mq implementation.

Then why do we need blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in dm?

Thanks,
Ming


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-13 Thread Bart Van Assche
On 04/12/17 19:20, Ming Lei wrote:
> On Wed, Apr 12, 2017 at 06:38:07PM +, Bart Van Assche wrote:
>> If the blk-mq core would always rerun a hardware queue if a block driver
>> returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core
> 
> It won't cause 100% CPU utilization since we restart the queue in the
> completion path, and at that time at least one tag is available, so
> progress can be made.

Hello Ming,

Sorry but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
then it's likely that calling .queue_rq() again after only a few
microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
don't believe me, change "if (!blk_mq_sched_needs_restart(hctx) &&
!test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state)) blk_mq_run_hw_queue(hctx,
true);" into "blk_mq_run_hw_queue(hctx, true);", trigger a busy
condition for either a SCSI LLD or a dm-rq driver, run top and you will
see that the CPU usage of a kworker thread jumps up to 100%.

Bart.


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-12 Thread Ming Lei
On Wed, Apr 12, 2017 at 06:38:07PM +, Bart Van Assche wrote:
> On Wed, 2017-04-12 at 11:42 +0800, Ming Lei wrote:
> > On Tue, Apr 11, 2017 at 06:18:36PM +, Bart Van Assche wrote:
> > > On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > > > Rather than working so hard to use DM code against me, your argument
> > > > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > > > established pattern"
> > > > 
> > > > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does.  But that is
> > > > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > > > tree-wide.
> > > > 
> > > > Could be there are some others, but hardly a well-established pattern.
> > > 
> > > Hello Mike,
> > > 
> > > Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> > > .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> > > before returning "busy" and restart the queue after the busy condition has
> > > been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk 
> > > and
> > > xen-blkfront. However, this approach is not appropriate for the dm-mq core
> > > nor for the scsi core since both drivers already use the "stopped" state 
> > > for
> > > another purpose than tracking whether or not a hardware queue is busy. 
> > > Hence
> > > the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls in these 
> > > last
> > > two drivers to rerun a hardware queue after the busy state has been 
> > > cleared.
> > 
> > But it looks like this patch just reruns the hw queue after 100ms, which
> > isn't necessarily after the busy state has been cleared, right?
> 
> Hello Ming,
> 
> That patch can be considered as a first step that can be refined further, 
> namely
> by modifying the dm-rq code further such that dm-rq queues are only rerun 
> after
> the busy condition has been cleared. The patch at the start of this thread is
> easier to review and easier to test than any patch that would only rerun dm-rq
> queues after the busy condition has been cleared.

OK, got it. It would have been better to add a comment about this change,
since rerunning the queue after 100ms is actually a workaround instead of
a final solution.

> 
> > Actually if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq
> > will buffer this request into hctx->dispatch and run the hw queue again,
> > so at first impression it looks like blk_mq_delay_run_hw_queue() shouldn't
> > have been needed in this situation.
> 
> If the blk-mq core would always rerun a hardware queue if a block driver
> returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core

It won't cause 100% CPU utilization since we restart the queue in the
completion path, and at that time at least one tag is available, so
progress can be made.

> to be busy with polling a hardware queue until the "busy" condition has been
> cleared. One can see easily that that's not what the blk-mq core does. From
> blk_mq_sched_dispatch_requests():
> 
>   if (!list_empty(&rq_list)) {
>   blk_mq_sched_mark_restart_hctx(hctx);
>   did_work = blk_mq_dispatch_rq_list(q, &rq_list);
>   }
> 
> From the end of blk_mq_dispatch_rq_list():
> 
>   if (!list_empty(list)) {
>   [ ... ]
>   if (!blk_mq_sched_needs_restart(hctx) &&
>   !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
>   blk_mq_run_hw_queue(hctx, true);
>   }

That is exactly what I meant: blk-mq already provides this mechanism to
rerun the queue automatically in case of BLK_MQ_RQ_QUEUE_BUSY. If the
mechanism doesn't work well, we need to fix it there; why bother drivers
with working around it?

> 
> In other words, the BLK_MQ_S_SCHED_RESTART flag is set before the dispatch
> list is examined, and the dispatch list will only be rerun after a block
> driver returned BLK_MQ_RQ_QUEUE_BUSY if that flag got cleared by a
> concurrent blk_mq_sched_restart_hctx() call while blk_mq_dispatch_rq_list()
> was in progress.

Yes, the queue is rerun either from the completion path when
BLK_MQ_S_SCHED_RESTART is set, or directly after .queue_rq() returns _BUSY
if the flag is cleared at the same time from the completion path.
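
That is, the handshake is roughly:

        /*
         * dispatch side                       completion side
         * ----------------------------------  ---------------------------
         * blk_mq_sched_mark_restart_hctx()
         *   sets BLK_MQ_S_SCHED_RESTART
         * .queue_rq() returns _BUSY           blk_mq_sched_restart_hctx():
         * request parked in hctx->dispatch      if test_and_clear_bit(
         *                                           BLK_MQ_S_SCHED_RESTART)
         *                                         blk_mq_run_hw_queue(hctx,
         *                                                             true)
         */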

So in theory we can make sure the queue will be run again if _BUSY
happens; then what is the root cause that makes us have to add
blk_mq_delay_run_hw_queue(hctx, 100) in dm's .queue_rq()?

Thanks,
Ming


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-12 Thread Bart Van Assche
On Wed, 2017-04-12 at 11:42 +0800, Ming Lei wrote:
> On Tue, Apr 11, 2017 at 06:18:36PM +, Bart Van Assche wrote:
> > On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > > Rather than working so hard to use DM code against me, your argument
> > > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > > established pattern"
> > > 
> > > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does.  But that is
> > > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > > tree-wide.
> > > 
> > > Could be there are some others, but hardly a well-established pattern.
> > 
> > Hello Mike,
> > 
> > Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> > .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> > before returning "busy" and restart the queue after the busy condition has
> > been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
> > xen-blkfront. However, this approach is not appropriate for the dm-mq core
> > nor for the scsi core since both drivers already use the "stopped" state for
> > another purpose than tracking whether or not a hardware queue is busy. Hence
> > the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls in these 
> > last
> > two drivers to rerun a hardware queue after the busy state has been cleared.
> 
> But it looks like this patch just reruns the hw queue after 100ms, which
> isn't necessarily after the busy state has been cleared, right?

Hello Ming,

That patch can be considered as a first step that can be refined further, namely
by modifying the dm-rq code further such that dm-rq queues are only rerun after
the busy condition has been cleared. The patch at the start of this thread is
easier to review and easier to test than any patch that would only rerun dm-rq
queues after the busy condition has been cleared.

> Actually if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq
> will buffer this request into hctx->dispatch and run the hw queue again,
> so at first impression it looks like blk_mq_delay_run_hw_queue() shouldn't
> have been needed in this situation.

If the blk-mq core would always rerun a hardware queue if a block driver
returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core
to be busy with polling a hardware queue until the "busy" condition has been
cleared. One can see easily that that's not what the blk-mq core does. From
blk_mq_sched_dispatch_requests():

        if (!list_empty(&rq_list)) {
                blk_mq_sched_mark_restart_hctx(hctx);
                did_work = blk_mq_dispatch_rq_list(q, &rq_list);
        }

From the end of blk_mq_dispatch_rq_list():

        if (!list_empty(list)) {
                [ ... ]
                if (!blk_mq_sched_needs_restart(hctx) &&
                    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
                        blk_mq_run_hw_queue(hctx, true);
        }

In other words, the BLK_MQ_S_SCHED_RESTART flag is set before the dispatch
list is examined, and the dispatch list will only be rerun after a block
driver returned BLK_MQ_RQ_QUEUE_BUSY if that flag got cleared by a
concurrent blk_mq_sched_restart_hctx() call while blk_mq_dispatch_rq_list()
was in progress.

Bart.

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Ming Lei
On Tue, Apr 11, 2017 at 06:18:36PM +, Bart Van Assche wrote:
> On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > Rather than working so hard to use DM code against me, your argument
> > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > established pattern"
> > 
> > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does.  But that is
> > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > tree-wide.
> > 
> > Could be there are some others, but hardly a well-established pattern.
> 
> Hello Mike,
> 
> Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> before returning "busy" and restart the queue after the busy condition has
> been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
> xen-blkfront. However, this approach is not appropriate for the dm-mq core
> nor for the scsi core since both drivers already use the "stopped" state for
> another purpose than tracking whether or not a hardware queue is busy. Hence
> the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls in these last
> two drivers to rerun a hardware queue after the busy state has been cleared.

But it looks like this patch just reruns the hw queue after 100ms, which
isn't necessarily after the busy state has been cleared, right?

Actually if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq
will buffer this request into hctx->dispatch and run the hw queue again,
so at first impression it looks like blk_mq_delay_run_hw_queue() shouldn't
have been needed in this situation. Or maybe Bart has more background on
this usage; better to add a comment about it?

Thanks,
Ming


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> Rather than working so hard to use DM code against me, your argument
> should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> established pattern"
> 
> I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does.  But that is
> only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> tree-wide.
> 
> Could be there are some others, but hardly a well-established pattern.

Hello Mike,

Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
.queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
before returning "busy" and restart the queue after the busy condition has
been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
xen-blkfront. However, this approach is not appropriate for the dm-mq core
nor for the scsi core since both drivers already use the "stopped" state for
another purpose than tracking whether or not a hardware queue is busy. Hence
the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls in these last
two drivers to rerun a hardware queue after the busy state has been cleared.
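
In outline the pattern is as follows (a sketch modeled loosely on
virtio_blk; the foo_* names are hypothetical):

static int foo_queue_rq(struct blk_mq_hw_ctx *hctx,
                        const struct blk_mq_queue_data *bd)
{
        struct foo_dev *dev = hctx->queue->queuedata;

        if (!foo_dev_has_resources(dev)) {
                /* Stop the queue; nothing dispatches until it is restarted. */
                blk_mq_stop_hw_queue(hctx);
                return BLK_MQ_RQ_QUEUE_BUSY;
        }
        return foo_issue(dev, bd->rq);
}

/* In the completion/interrupt path, once resources have been freed: */
static void foo_resources_freed(struct foo_dev *dev)
{
        blk_mq_start_stopped_hw_queues(dev->queue, true);
}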

Bart.

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Mike Snitzer
On Tue, Apr 11 2017 at  1:51pm -0400,
Bart Van Assche  wrote:

> On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote:
> > Other drivers will very likely be caught out by
> > this blk-mq quirk in the future.
> 
> Hello Mike,
> 
> Are you aware that the requirement that blk-mq drivers rerun the queue after
> having returned BLK_MQ_RQ_QUEUE_BUSY is a requirement that is shared with
> traditional block drivers? From dm_old_request_fn():
> 
>   if (... || (ti->type->busy && ti->type->busy(ti))) {
>   blk_delay_queue(q, 10);
>   return;
>   }

No, and pointing to DM code that does something with the old .request_fn
case to justify why blk-mq requires the same is pretty specious.

Rather than working so hard to use DM code against me, your argument
should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
established pattern"

I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does.  But that is
only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
tree-wide.

Could be there are some others, but hardly a well-established pattern.


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote:
> Other drivers will very likely be caught out by
> this blk-mq quirk in the future.

Hello Mike,

Are you aware that the requirement that blk-mq drivers rerun the queue after
having returned BLK_MQ_RQ_QUEUE_BUSY is a requirement that is shared with
traditional block drivers? From dm_old_request_fn():

if (... || (ti->type->busy && ti->type->busy(ti))) {
blk_delay_queue(q, 10);
return;
}

Bart.

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Mike Snitzer
On Tue, Apr 11 2017 at 12:26pm -0400,
Bart Van Assche  wrote:

> On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote:
> > This has no place in dm-mq (or any blk-mq
> > driver).  If it is needed it should be elevated to blk-mq core to
> > trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is
> > returned from blk_mq_ops' .queue_rq.
> 
> Hello Mike,
> 
> If the blk-mq core would have to figure out whether or not a queue is no
> longer busy without any cooperation from the blk-mq driver all the blk-mq
> core could do is to attempt to rerun that queue from time to time. But at
> which intervals should the blk-mq core check whether or not a queue is still
> busy? Would it be possible to choose intervals at which to check the queue
> state that work well for all block drivers? Consider e.g. the dm-mpath
> driver. multipath_busy() returns true as long as path initialization is in
> progress. Path initialization can take a long time. The (indirect) call to
> blk_mq_run_queue() from pg_init_done() resumes request processing immediately
> after path initialization has finished. Sorry but I don't think it is possible
> to invent an algorithm for the blk-mq core that guarantees not only that a
> queue is rerun as soon as it is no longer busy but also that avoids that
> plenty of CPU cycles are wasted by the blk-mq core for checking whether a
> queue is no longer busy.

Sorry but that isn't a very strong argument for what you've done.

I mean I do appreciate your point that the 2 BLK_MQ_RQ_QUEUE_BUSY
returns in dm_mq_queue_rq() are not equal but that could easily be
conveyed using a new return value.

Anyway, the point is that no blk-mq driver should need to be concerned
about whether its request will get resubmitted (and the associated hw
queue re-run) if it returns BLK_MQ_RQ_QUEUE_BUSY.

Your change is a means to an end but it just solves the problem in a
very hackish way.  Other drivers will very likely be caught out by
this blk-mq quirk in the future.


Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote:
> This has no place in dm-mq (or any blk-mq
> driver).  If it is needed it should be elevated to blk-mq core to
> trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is
> returned from blk_mq_ops' .queue_rq.

Hello Mike,

If the blk-mq core would have to figure out whether or not a queue is no
longer busy without any cooperation from the blk-mq driver all the blk-mq
core could do is to attempt to rerun that queue from time to time. But at
which intervals should the blk-mq core check whether or not a queue is still
busy? Would it be possible to choose intervals at which to check the queue
state that work well for all block drivers? Consider e.g. the dm-mpath
driver. multipath_busy() returns true as long as path initialization is in
progress. Path initialization can take a long time. The (indirect) call to
blk_mq_run_queue() from pg_init_done() resumes request processing immediately
after path initialization has finished. Sorry but I don't think it is possible
to invent an algorithm for the blk-mq core that guarantees not only that a
queue is rerun as soon as it is no longer busy but also that avoids that
plenty of CPU cycles are wasted by the blk-mq core for checking whether a
queue is no longer busy.
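
The dm-mpath side of this looks as follows (simplified excerpt, from
memory, so not the literal source):

static int multipath_busy(struct dm_target *ti)
{
        struct multipath *m = ti->private;

        /* Stay "busy" for as long as path initialization is in progress. */
        if (atomic_read(&m->pg_init_in_progress))
                return true;
        /* [ ... ] */
}

When path initialization finishes, pg_init_done() ends by (indirectly)
rerunning the dm queue, so request processing resumes without any polling.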

Bart.

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Mike Snitzer
On Fri, Apr 07 2017 at  2:16pm -0400,
Bart Van Assche  wrote:

> While running the srp-test software I noticed that request
> processing stalls sporadically at the beginning of a test, namely
> when mkfs is run against a dm-mpath device. Every time when that
> happened the following command was sufficient to resume request
> processing:
> 
> echo run >/sys/kernel/debug/block/dm-0/state
> 
> This patch avoids that such request processing stalls occur. The
> test I ran is as follows:
> 
> while srp-test/run_tests -d -r 30 -t 02-mq; do :; done
> 
> Signed-off-by: Bart Van Assche 
> Cc: Mike Snitzer 
> Cc: dm-de...@redhat.com
> ---
>  drivers/md/dm-rq.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 6886bf160fb2..d19af1d21f4c 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -755,6 +755,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>   /* Undo dm_start_request() before requeuing */
>   rq_end_stats(md, rq);
>   rq_completed(md, rq_data_dir(rq), false);
> + blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
>   return BLK_MQ_RQ_QUEUE_BUSY;
>   }
>  
> -- 
> 2.12.0
> 

I really appreciate your hard work Bart but this looks like a cheap
hack.

I'm clearly too late to stop this from going in (given Jens got it
merged for -rc6) but: this has no place in dm-mq (or any blk-mq
driver).  If it is needed it should be elevated to blk-mq core to
trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is
returned from blk_mq_ops' .queue_rq.

If this dm-mq specific commit is justified, the case certainly should be
spelled out in the commit header.


[PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-07 Thread Bart Van Assche
While running the srp-test software I noticed that request
processing stalls sporadically at the beginning of a test, namely
when mkfs is run against a dm-mpath device. Every time when that
happened the following command was sufficient to resume request
processing:

echo run >/sys/kernel/debug/block/dm-0/state

This patch avoids that such request processing stalls occur. The
test I ran is as follows:

while srp-test/run_tests -d -r 30 -t 02-mq; do :; done

Signed-off-by: Bart Van Assche 
Cc: Mike Snitzer 
Cc: dm-de...@redhat.com
---
 drivers/md/dm-rq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 6886bf160fb2..d19af1d21f4c 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -755,6 +755,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
/* Undo dm_start_request() before requeuing */
rq_end_stats(md, rq);
rq_completed(md, rq_data_dir(rq), false);
+   blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
return BLK_MQ_RQ_QUEUE_BUSY;
}
 
-- 
2.12.0