Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 04:49:15PM +, Bart Van Assche wrote:
> On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote:
> > Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART
> > to do that already.
> 
> Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core to
> reexamine the software queues and the hctx dispatch list but not the requeue
> list. If a block driver returns BLK_STS_RESOURCE then requests end up on the
> requeue list. Hence the following code in scsi_end_request():
> 
>   if (scsi_target(sdev)->single_lun || 
> !list_empty(>host->starved_list))
>   kblockd_schedule_work(>requeue_work);
>   else
>   blk_mq_run_hw_queues(q, true);

Let's focus on dm-rq.

I have explained before, dm-rq is different with SCSI's, we don't need
to requeue request any more in dm-rq if blk_get_request() returns NULL
in multipath_clone_and_map(), since SCHED_RESTART can cover that
definitely.

Not mention dm_mq_delay_requeue_request() will run the queue explicitly
if it has to be done. That needn't SCHED_RESTART to cover, right?

-- 
Ming


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
Hi Mike,

On Tue, Sep 19, 2017 at 07:50:06PM -0400, Mike Snitzer wrote:
> On Tue, Sep 19 2017 at  7:25pm -0400,
> Bart Van Assche  wrote:
> 
> > On Wed, 2017-09-20 at 06:44 +0800, Ming Lei wrote:
> > > For this issue, it isn't same between SCSI and dm-rq.
> > > 
> > > We don't need to run queue in .end_io of dm, and the theory is
> > > simple, otherwise it isn't performance issue, and should be I/O hang.
> > > 
> > > 1) every dm-rq's request is 1:1 mapped to SCSI's request
> > > 
> > > 2) if there is any mapped SCSI request not finished, either
> > > in-flight or in requeue list or whatever, there will be one
> > > corresponding dm-rq's request in-flight
> > > 
> > > 3) once the mapped SCSI request is completed, dm-rq's completion
> > > path will be triggered and dm-rq's queue will be rerun because of
> > > SCHED_RESTART in dm-rq
> > > 
> > > So the hw queue of dm-rq has been run in dm-rq's completion path
> > > already, right? Why do we need to do it again in the hot path?
> > 
> > The measurement data in the description of patch 5/5 shows a significant
> > performance regression for an important workload, namely random I/O.
> > Additionally, the performance improvement for sequential I/O was achieved
> > for an unrealistically low queue depth.
> 
> So you've ignored Ming's question entirely and instead decided to focus
> on previous questions you raised to Ming that he ignored.  This is
> getting tedious.

Sorry for not making it clear, I mentioned I will post a new version
to address the random I/O regression.

> 
> Especially given that Ming said the first patch that all this fighting
> has been over isn't even required to attain the improvements.
> 
> Ming, please retest both your baseline and patched setup with a
> queue_depth of >= 32.  Also, please do 3 - 5 runs to get a avg and std
> dev across the runs.

Taking a bigger queue_depth won't be helpful on this issue,
and it can make the situation worse, because .cmd_per_lun won't
be changed, and queue often becomes busy when number of in-flight
requests is bigger than .cmd_per_lun.

I will post one new version, which will use another simple way to
figure out if underlying queue is busy, so that random I/O perf
won't be affected, but this new version need to depend on the
following patchset:

https://marc.info/?t=15043655572=1=2

So it may take a while since that patchset is still under review.

I will post them all together in 'blk-mq-sched: improve SCSI-MQ 
performance(V5)'.

The approach taken in patch 5 depends on q->queue_depth, but some SCSI
host's .cmd_per_lun is different with q->queue_depth, so causes
the random I/O regression.

-- 
Ming


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at  7:25pm -0400,
Bart Van Assche  wrote:

> On Wed, 2017-09-20 at 06:44 +0800, Ming Lei wrote:
> > For this issue, it isn't same between SCSI and dm-rq.
> > 
> > We don't need to run queue in .end_io of dm, and the theory is
> > simple, otherwise it isn't performance issue, and should be I/O hang.
> > 
> > 1) every dm-rq's request is 1:1 mapped to SCSI's request
> > 
> > 2) if there is any mapped SCSI request not finished, either
> > in-flight or in requeue list or whatever, there will be one
> > corresponding dm-rq's request in-flight
> > 
> > 3) once the mapped SCSI request is completed, dm-rq's completion
> > path will be triggered and dm-rq's queue will be rerun because of
> > SCHED_RESTART in dm-rq
> > 
> > So the hw queue of dm-rq has been run in dm-rq's completion path
> > already, right? Why do we need to do it again in the hot path?
> 
> The measurement data in the description of patch 5/5 shows a significant
> performance regression for an important workload, namely random I/O.
> Additionally, the performance improvement for sequential I/O was achieved
> for an unrealistically low queue depth.

So you've ignored Ming's question entirely and instead decided to focus
on previous questions you raised to Ming that he ignored.  This is
getting tedious.

Especially given that Ming said the first patch that all this fighting
has been over isn't even required to attain the improvements.

Ming, please retest both your baseline and patched setup with a
queue_depth of >= 32.  Also, please do 3 - 5 runs to get a avg and std
dev across the runs.

> Sorry but given these measurement results I don't see why I should
> spend more time on this patch series. 

Bart, I've historically genuinely always appreciated your review and
insight.  But if your future "review" on this patchset would take the
form shown in this thread then please don't spend more time on it.

Mike


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Wed, 2017-09-20 at 06:44 +0800, Ming Lei wrote:
> For this issue, it isn't same between SCSI and dm-rq.
> 
> We don't need to run queue in .end_io of dm, and the theory is
> simple, otherwise it isn't performance issue, and should be I/O hang.
> 
> 1) every dm-rq's request is 1:1 mapped to SCSI's request
> 
> 2) if there is any mapped SCSI request not finished, either
> in-flight or in requeue list or whatever, there will be one
> corresponding dm-rq's request in-flight
> 
> 3) once the mapped SCSI request is completed, dm-rq's completion
> path will be triggered and dm-rq's queue will be rerun because of
> SCHED_RESTART in dm-rq
> 
> So the hw queue of dm-rq has been run in dm-rq's completion path
> already, right? Why do we need to do it again in the hot path?

The measurement data in the description of patch 5/5 shows a significant
performance regression for an important workload, namely random I/O.
Additionally, the performance improvement for sequential I/O was achieved
for an unrealistically low queue depth. Sorry but given these measurement
results I don't see why I should spend more time on this patch series.

Bart.

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 06:42:30PM +, Bart Van Assche wrote:
> On Wed, 2017-09-20 at 00:55 +0800, Ming Lei wrote:
> > On Wed, Sep 20, 2017 at 12:49 AM, Bart Van Assche
> >  wrote:
> > > On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote:
> > > > Run queue at end_io is definitely wrong, because blk-mq has 
> > > > SCHED_RESTART
> > > > to do that already.
> > > 
> > > Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core 
> > > to
> > > reexamine the software queues and the hctx dispatch list but not the 
> > > requeue
> > > list. If a block driver returns BLK_STS_RESOURCE then requests end up on 
> > > the
> > > requeue list. Hence the following code in scsi_end_request():
> > 
> > That doesn't need SCHED_RESTART, because it is requeue's
> > responsibility to do that,
> > see blk_mq_requeue_work(), which will run hw queue at the end of this func.
> 
> That's not what I was trying to explain. What I was trying to explain is that
> every block driver that can cause a request to end up on the requeue list is
> responsible for kicking the requeue list at a later time. Hence the
> kblockd_schedule_work(>requeue_work) call in the SCSI core and the
> blk_mq_kick_requeue_list() and blk_mq_delay_kick_requeue_list() calls in the
> dm code. What I would like to see is measurement results for dm-mpath without
> this patch series and a call to kick the requeue list added to the dm-mpath
> end_io code.

For this issue, it isn't same between SCSI and dm-rq.

We don't need to run queue in .end_io of dm, and the theory is
simple, otherwise it isn't performance issue, and should be I/O hang.

1) every dm-rq's request is 1:1 mapped to SCSI's request

2) if there is any mapped SCSI request not finished, either
in-flight or in requeue list or whatever, there will be one
corresponding dm-rq's request in-flight

3) once the mapped SCSI request is completed, dm-rq's completion
path will be triggered and dm-rq's queue will be rerun because of
SCHED_RESTART in dm-rq

So the hw queue of dm-rq has been run in dm-rq's completion path
already, right? Why do we need to do it again in the hot path?

-- 
Ming


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Wed, 2017-09-20 at 00:55 +0800, Ming Lei wrote:
> On Wed, Sep 20, 2017 at 12:49 AM, Bart Van Assche
>  wrote:
> > On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote:
> > > Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART
> > > to do that already.
> > 
> > Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core to
> > reexamine the software queues and the hctx dispatch list but not the requeue
> > list. If a block driver returns BLK_STS_RESOURCE then requests end up on the
> > requeue list. Hence the following code in scsi_end_request():
> 
> That doesn't need SCHED_RESTART, because it is requeue's
> responsibility to do that,
> see blk_mq_requeue_work(), which will run hw queue at the end of this func.

That's not what I was trying to explain. What I was trying to explain is that
every block driver that can cause a request to end up on the requeue list is
responsible for kicking the requeue list at a later time. Hence the
kblockd_schedule_work(>requeue_work) call in the SCSI core and the
blk_mq_kick_requeue_list() and blk_mq_delay_kick_requeue_list() calls in the
dm code. What I would like to see is measurement results for dm-mpath without
this patch series and a call to kick the requeue list added to the dm-mpath
end_io code.

Bart.

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Wed, Sep 20, 2017 at 12:49 AM, Bart Van Assche
 wrote:
> On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote:
>> Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART
>> to do that already.
>
> Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core to
> reexamine the software queues and the hctx dispatch list but not the requeue
> list. If a block driver returns BLK_STS_RESOURCE then requests end up on the
> requeue list. Hence the following code in scsi_end_request():

That doesn't need SCHED_RESTART, because it is requeue's
responsibility to do that,
see blk_mq_requeue_work(), which will run hw queue at the end of this func.


-- 
Ming Lei


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote:
> Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART
> to do that already.

Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core to
reexamine the software queues and the hctx dispatch list but not the requeue
list. If a block driver returns BLK_STS_RESOURCE then requests end up on the
requeue list. Hence the following code in scsi_end_request():

if (scsi_target(sdev)->single_lun || 
!list_empty(>host->starved_list))
kblockd_schedule_work(>requeue_work);
else
blk_mq_run_hw_queues(q, true);

Bart.

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 11:48:23AM -0400, Mike Snitzer wrote:
> On Tue, Sep 19 2017 at  1:43am -0400,
> Ming Lei  wrote:
> 
> > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote:
> > > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > > > "if no request has completed before the delay has expired" can't be a
> > > > reason to rerun the queue, because the queue can still be busy.
> > > 
> > > That statement of you shows that there are important aspects of the SCSI
> > > core and dm-mpath driver that you don't understand.
> > 
> > Then can you tell me why blk-mq's SCHED_RESTART can't cover
> > the rerun when there are in-flight requests? What is the case
> > in which dm-rq can return BUSY and there aren't any in-flight
> > requests meantime?
> > 
> > Also you are the author of adding 'blk_mq_delay_run_hw_queue(
> > hctx, 100/*ms*/)' in dm-rq, you never explain in commit
> > 6077c2d706097c0(dm rq: Avoid that request processing stalls
> > sporadically) what the root cause is for your request stall
> > and why this patch fixes your issue. Even you don't explain
> > why is the delay 100ms?
> > 
> > So it is a workaound, isn't it?
> > 
> > My concern is that it isn't good to add blk_mq_delay_run_hw_queue(hctx, 
> > 100/*ms*/)
> > in the hot path since it should been covered by SCHED_RESTART
> > if there are in-flight requests.
> 
> This thread proves that it is definitely brittle to be relying on fixed
> delays like this:
> https://patchwork.kernel.org/patch/9703249/

I can't agree more, because no one mentioned the root cause, maybe the
request stall has been fixed recently. Keeping the workaound in hotpath
is a bit annoying.

-- 
Ming


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 11:56:03AM -0400, Mike Snitzer wrote:
> On Tue, Sep 19 2017 at 11:36am -0400,
> Bart Van Assche  wrote:
> 
> > On Tue, 2017-09-19 at 13:43 +0800, Ming Lei wrote:
> > > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote:
> > > > If you are still looking at removing the blk_mq_delay_run_hw_queue() 
> > > > calls
> > > > then I think you are looking in the wrong direction. What kind of 
> > > > problem
> > > > are you trying to solve? Is it perhaps that there can be a delay between
> > > 
> > > Actually the improvement on dm-rq IO schedule(the patch 2 ~ 5) doesn't
> > > need this patch.
> > 
> > The approach of this patch series looks wrong to me and patch 5/5 is an ugly
> > hack in my opinion. That's why I asked you to drop the entire patch series 
> > and
> > to test whether inserting a queue run call into the dm-mpath end_io callback
> > yields a similar performance improvement to the patches you posted. Please 
> > do
> > not expect me to spend more time on this patch series if you do not come up
> > with measurement results for the proposed alternative.
> 
> Bart, asserting that Ming's work is a hack doesn't help your apparent
> goal of deligitimizing Ming's work.
> 
> Nor does it take away from the fact that your indecision on appropriate
> timeouts (let alone ability to defend and/or explain them) validates
> Ming calling them into question (which you are now dodging regularly).
> 
> But please don't take this feedback and shut-down.  Instead please work
> together more constructively.  This doesn't need to be adversarial!  I
> am at a loss for why there is such animosity from your end Bart.
> 
> Please dial it back.  It is just a distraction that fosters needless
> in-fighting.
> 
> Believe it or not: Ming is just trying to improve the code because he
> has a testbed that is showing fairly abysmal performance with dm-mq
> multipath (on lpfc with scsi-mq).
> 
> Ming, if you can: please see if what Bart has proposed (instead: run
> queue at end_io) helps.  Though I don't yet see why that should be
> needed.

Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART
to do that already.

The dm-mpath performance issue is nothing to do with that, actually
the issue is very similar with my improvement on SCSI-MQ, because
now dm_dispatch_clone_request() doesn't know if the underlying
queue is busy or not, and dm-rq/mpath is just trying to dispatch
more request to underlying queue even though it is busy, then IO
merge can't be done easily on dm-rq/mpath.

I am thinking another way to improve this issue, since some
SCSI device's queue_depth is different with .cmd_per_lun,
will post patchset soon.


-- 
Ming


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at 11:52am -0400,
Bart Van Assche  wrote:

> On Tue, 2017-09-19 at 11:48 -0400, Mike Snitzer wrote:
> > This thread proves that it is definitely brittle to be relying on fixed
> > delays like this:
> > https://patchwork.kernel.org/patch/9703249/
> 
> Hello Mike,
> 
> Sorry but I think that's a misinterpretation of my patch. I came up with that
> patch before I had found and fixed the root cause of the high system load.
> These fixes are upstream (kernel v4.13) which means that that patch is now
> obsolete.

OK, fair point.

Though fixed magic values for delay aren't going to stand the test of
time.


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at 11:36am -0400,
Bart Van Assche  wrote:

> On Tue, 2017-09-19 at 13:43 +0800, Ming Lei wrote:
> > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote:
> > > If you are still looking at removing the blk_mq_delay_run_hw_queue() calls
> > > then I think you are looking in the wrong direction. What kind of problem
> > > are you trying to solve? Is it perhaps that there can be a delay between
> > 
> > Actually the improvement on dm-rq IO schedule(the patch 2 ~ 5) doesn't
> > need this patch.
> 
> The approach of this patch series looks wrong to me and patch 5/5 is an ugly
> hack in my opinion. That's why I asked you to drop the entire patch series and
> to test whether inserting a queue run call into the dm-mpath end_io callback
> yields a similar performance improvement to the patches you posted. Please do
> not expect me to spend more time on this patch series if you do not come up
> with measurement results for the proposed alternative.

Bart, asserting that Ming's work is a hack doesn't help your apparent
goal of deligitimizing Ming's work.

Nor does it take away from the fact that your indecision on appropriate
timeouts (let alone ability to defend and/or explain them) validates
Ming calling them into question (which you are now dodging regularly).

But please don't take this feedback and shut-down.  Instead please work
together more constructively.  This doesn't need to be adversarial!  I
am at a loss for why there is such animosity from your end Bart.

Please dial it back.  It is just a distraction that fosters needless
in-fighting.

Believe it or not: Ming is just trying to improve the code because he
has a testbed that is showing fairly abysmal performance with dm-mq
multipath (on lpfc with scsi-mq).

Ming, if you can: please see if what Bart has proposed (instead: run
queue at end_io) helps.  Though I don't yet see why that should be
needed.

Mike


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Tue, 2017-09-19 at 11:48 -0400, Mike Snitzer wrote:
> This thread proves that it is definitely brittle to be relying on fixed
> delays like this:
> https://patchwork.kernel.org/patch/9703249/

Hello Mike,

Sorry but I think that's a misinterpretation of my patch. I came up with that
patch before I had found and fixed the root cause of the high system load.
These fixes are upstream (kernel v4.13) which means that that patch is now
obsolete.

Bart.

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at  1:43am -0400,
Ming Lei  wrote:

> On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote:
> > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > > "if no request has completed before the delay has expired" can't be a
> > > reason to rerun the queue, because the queue can still be busy.
> > 
> > That statement of you shows that there are important aspects of the SCSI
> > core and dm-mpath driver that you don't understand.
> 
> Then can you tell me why blk-mq's SCHED_RESTART can't cover
> the rerun when there are in-flight requests? What is the case
> in which dm-rq can return BUSY and there aren't any in-flight
> requests meantime?
> 
> Also you are the author of adding 'blk_mq_delay_run_hw_queue(
> hctx, 100/*ms*/)' in dm-rq, you never explain in commit
> 6077c2d706097c0(dm rq: Avoid that request processing stalls
> sporadically) what the root cause is for your request stall
> and why this patch fixes your issue. Even you don't explain
> why is the delay 100ms?
> 
> So it is a workaound, isn't it?
> 
> My concern is that it isn't good to add blk_mq_delay_run_hw_queue(hctx, 
> 100/*ms*/)
> in the hot path since it should been covered by SCHED_RESTART
> if there are in-flight requests.

This thread proves that it is definitely brittle to be relying on fixed
delays like this:
https://patchwork.kernel.org/patch/9703249/

Mike


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Tue, 2017-09-19 at 13:43 +0800, Ming Lei wrote:
> On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote:
> > If you are still looking at removing the blk_mq_delay_run_hw_queue() calls
> > then I think you are looking in the wrong direction. What kind of problem
> > are you trying to solve? Is it perhaps that there can be a delay between
> 
> Actually the improvement on dm-rq IO schedule(the patch 2 ~ 5) doesn't
> need this patch.

The approach of this patch series looks wrong to me and patch 5/5 is an ugly
hack in my opinion. That's why I asked you to drop the entire patch series and
to test whether inserting a queue run call into the dm-mpath end_io callback
yields a similar performance improvement to the patches you posted. Please do
not expect me to spend more time on this patch series if you do not come up
with measurement results for the proposed alternative.

Bart.

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-18 Thread Ming Lei
On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote:
> On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > "if no request has completed before the delay has expired" can't be a
> > reason to rerun the queue, because the queue can still be busy.
> 
> That statement of you shows that there are important aspects of the SCSI
> core and dm-mpath driver that you don't understand.

Then can you tell me why blk-mq's SCHED_RESTART can't cover
the rerun when there are in-flight requests? What is the case
in which dm-rq can return BUSY and there aren't any in-flight
requests meantime?

Also you are the author of adding 'blk_mq_delay_run_hw_queue(
hctx, 100/*ms*/)' in dm-rq, you never explain in commit
6077c2d706097c0(dm rq: Avoid that request processing stalls
sporadically) what the root cause is for your request stall
and why this patch fixes your issue. Even you don't explain
why is the delay 100ms?

So it is a workaound, isn't it?

My concern is that it isn't good to add blk_mq_delay_run_hw_queue(hctx, 
100/*ms*/)
in the hot path since it should been covered by SCHED_RESTART
if there are in-flight requests.

> 
> > I suggest to understand the root cause, instead of keeping this
> > ugly random delay because run hw queue after 100ms may be useless
> > in 99.99% times.
> 
> If you are still looking at removing the blk_mq_delay_run_hw_queue() calls
> then I think you are looking in the wrong direction. What kind of problem
> are you trying to solve? Is it perhaps that there can be a delay between

Actually the improvement on dm-rq IO schedule(the patch 2 ~ 5) doesn't
need this patch.


-- 
Ming


Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-18 Thread Bart Van Assche
On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> "if no request has completed before the delay has expired" can't be a
> reason to rerun the queue, because the queue can still be busy.

That statement of you shows that there are important aspects of the SCSI
core and dm-mpath driver that you don't understand.

> I suggest to understand the root cause, instead of keeping this
> ugly random delay because run hw queue after 100ms may be useless
> in 99.99% times.

If you are still looking at removing the blk_mq_delay_run_hw_queue() calls
then I think you are looking in the wrong direction. What kind of problem
are you trying to solve? Is it perhaps that there can be a delay between
dm-mpath request completion and the queueing of a new request? If so,
adding a queue run call into the dm-mpath end_io callback is probably
sufficient and probably can replace this entire patch series.

Bart.

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote:
> If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun
> the queue in the three situations:
> 
> 1) if BLK_MQ_S_SCHED_RESTART is set
> - queue is rerun after one rq is completed, see blk_mq_sched_restart()
> which is run from blk_mq_free_request()
> 
> 2) BLK_MQ_S_TAG_WAITING is set
> - queue is rerun after one tag is freed
> 
> 3) otherwise
> - queue is run immediately in blk_mq_dispatch_rq_list()
> 
> So calling blk_mq_delay_run_hw_queue() inside .queue_rq() doesn't make
> sense because no matter it is called or not, the queue still will be
> rerun soon in above three situations, and the busy req can be dispatched
> again.

NAK

Block drivers call blk_mq_delay_run_hw_queue() if it is known that the queue
has to be rerun later even if no request has completed before the delay has
expired. This patch breaks at least the SCSI core and the dm-mpath drivers.

Bart.