On Tue, 2017-12-05 at 06:42 +0800, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote:
> > * A systematic lockup for SCSI queues with queue depth 1. The
> >   following test reproduces that bug systematically:
> >   - Change the SRP initiator such that SCSI target queue depth is
> >     limited to 1.
> >   - Run the following command:
> >       srp-test/run_tests -f xfs -d -e none -r 60 -t 01
> >   See also "[PATCH 4/7] blk-mq: Avoid that request processing
> >   stalls when sharing tags"
> >   (https://marc.info/?l=linux-block&m=151208695316857). Note:
> >   reverting commit 0df21c86bdbf also fixes a sporadic SCSI request
> >   queue lockup while inserting a blk_mq_sched_mark_restart_hctx()
> >   before all blk_mq_dispatch_rq_list() calls only fixes the
> >   systematic lockup for queue depth 1.
> 
> You are the only reproducer [ ... ]

That's not correct. I'm pretty sure that if you try to reproduce this,
you will see the same hang I ran into. Does this mean that you have not
yet tried to reproduce the hang I reported?

> You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched:
> improve dispatching from sw queue")', but you don't mention any issue
> about that commit.

That's not correct either. From the commit message: "A systematic lockup
for SCSI queues with queue depth 1."

> > I think the above means that it is too risky to try to fix all bugs
> > introduced by commit 0df21c86bdbf before kernel v4.15 is released.
> > Hence revert that commit.
> 
> What is the risk?

The risk is that commit 0df21c86bdbf introduced more bugs than the ones
that have been discovered so far.

Bart.
