Hello, Bart.

On Wed, Feb 07, 2018 at 05:27:10PM +0000, Bart Van Assche wrote:
> Even with the above change I think that there is still a race between the
> code that handles timer resets and the completion handler. Anyway, the test

Can you elaborate on the scenario a bit further?  If you're referring to
lost completions, we've always had that and while we can try to close
that hole too, I don't think it's a critical issue.

> with which I triggered these races is as follows:
> - Start from what will become kernel v4.16-rc1 and apply the patch that adds
>   SRP over RoCE support to the ib_srpt driver. See also the "[PATCH v2 00/14]
>   IB/srpt: Add RDMA/CM support" patch series
>   (https://www.spinics.net/lists/linux-rdma/msg59589.html).
> - Apply my patch series that fixes a race between the SCSI error handler and
>   SCSI transport recovery.
> - Apply my patch series that improves the stability of the SCSI target core
>   (LIO).
> - Build and install that kernel.
> - Clone the following repository: https://github.com/bvanassche/srp-test.
> - Run the following test:
>   while true; do srp-test/run_tests -c -t 02-mq; done
> - While the test is running, check whether or not something weird happens.
>   Sometimes I see that scsi_times_out() crashes. Sometimes I see while running
>   this test that a soft lockup is reported inside blk_mq_do_dispatch_ctx().
> If you want I can share the tree on github that I use myself for my tests.

Heh, that's quite a bit.  I'll take you up on that git tree later, but for
now I'd really appreciate it if you could test the patch.

Thank you very much.

