Re: [PATCH v3 0/6] Make SCSI device suspend and resume work reliably

2017-09-26 Thread Ming Lei
On Mon, Sep 25, 2017 at 04:17:27PM +, Bart Van Assche wrote:
> On Mon, 2017-09-25 at 10:36 +0800, Ming Lei wrote:
> > On Sat, Sep 23, 2017 at 6:13 AM, Bart Van Assche wrote:
> > > It is known that during the resume following a hibernate sometimes the
> > > system hangs instead of coming up properly. This patch series fixes this
> > > problem. This patch series is an alternative for Ming Lei's "[PATCH V5
> > > 0/10] block/scsi: safe SCSI quiescing" patch series. The advantages of
> > > this patch series are:
> > 
> > No, your patch doesn't fix scsi quiesce on block legacy, so not an alternative
> > of my patchset at all.
> 
> This patch series definitely is an alternative for blk-mq/scsi-mq. And as you
> know my approach can be extended easily to the legacy SCSI core by adding
> blk_queue_enter() / blk_queue_exit() calls where necessary in the legacy block
> layer. I have not done this because the bug report was against scsi-mq and not
> against the legacy SCSI core. Additionally, since the legacy block layer and
> SCSI core are on their way out I did not want to spend time on modifying these.

Let me show you the legacy report and verification:

https://www.spinics.net/lists/linux-block/msg17237.html

If you have a transport_spi device at hand, the issue can be reproduced
within several minutes as follows:

- set nr_requests of the disk to 4
- in a loop, trigger a revalidate once every 5 seconds
- meanwhile, run heavy concurrent I/O in the background
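
The steps above could be scripted roughly as follows. This is only a
sketch under stated assumptions, not the script used for the report:
DISK and TARGET are placeholders, the helper names and the RUN_REPRO
and SYSFS_ROOT variables are made up for illustration, the revalidate
path assumes the spi_transport class attribute of that name, and dd
readers stand in for whatever I/O load was actually used.

```shell
#!/bin/sh
# Hypothetical reproducer sketch. DISK and TARGET are placeholders that
# must be replaced; the sysfs paths below are assumptions about the setup.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}
DISK=${DISK:-sdX}
TARGET=${TARGET:-target0:0:0}

# Shrink the request pool of a disk: $1 = disk name, $2 = new depth.
set_nr_requests() {
    echo "$2" > "$SYSFS_ROOT/block/$1/queue/nr_requests"
}

# Write to the revalidate attribute of an SPI target: $1 = target name.
trigger_revalidate() {
    echo 1 > "$SYSFS_ROOT/class/spi_transport/$1/revalidate"
}

# Read-only background load: eight concurrent direct-I/O readers on $1.
# Each reader exits once it has read the whole device.
start_io() {
    for i in 1 2 3 4 5 6 7 8; do
        dd if="/dev/$1" of=/dev/null bs=4k iflag=direct 2>/dev/null &
    done
}

# Only touch real hardware when explicitly requested (needs root).
if [ "${RUN_REPRO:-0}" = 1 ]; then
    set_nr_requests "$DISK" 4
    start_io "$DISK"
    while true; do
        trigger_revalidate "$TARGET"
        sleep 5
    done
fi
```

Running it with RUN_REPRO=1 and real DISK and TARGET values starts the
loop; without RUN_REPRO the script only defines the helpers.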

-- 
Ming


Re: [PATCH v3 0/6] Make SCSI device suspend and resume work reliably

2017-09-25 Thread h...@lst.de
On Mon, Sep 25, 2017 at 04:17:27PM +, Bart Van Assche wrote:
> This patch series definitely is an alternative for blk-mq/scsi-mq. And as you
> know my approach can be extended easily to the legacy SCSI core by adding
> blk_queue_enter() / blk_queue_exit() calls where necessary in the legacy block
> layer. I have not done this because the bug report was against scsi-mq and not
> against the legacy SCSI core. Additionally, since the legacy block layer and
> SCSI core are on their way out I did not want to spend time on modifying these.

For now we'll have to fix both paths to have equal functionality.  Even
if scsi might be on its way out there still are many drivers that
use the legacy request path, and with mmc we'll probably grow another
dual block layer driver soon, although hopefully the transition
period will be much shorter.


Re: [PATCH v3 0/6] Make SCSI device suspend and resume work reliably

2017-09-25 Thread Bart Van Assche
On Mon, 2017-09-25 at 10:36 +0800, Ming Lei wrote:
> On Sat, Sep 23, 2017 at 6:13 AM, Bart Van Assche wrote:
> > It is known that during the resume following a hibernate sometimes the
> > system hangs instead of coming up properly. This patch series fixes this
> > problem. This patch series is an alternative for Ming Lei's "[PATCH V5
> > 0/10] block/scsi: safe SCSI quiescing" patch series. The advantages of
> > this patch series are:
> 
> No, your patch doesn't fix scsi quiesce on block legacy, so not an alternative
> of my patchset at all.

This patch series definitely is an alternative for blk-mq/scsi-mq. And as you
know my approach can be extended easily to the legacy SCSI core by adding
blk_queue_enter() / blk_queue_exit() calls where necessary in the legacy block
layer. I have not done this because the bug report was against scsi-mq and not
against the legacy SCSI core. Additionally, since the legacy block layer and
SCSI core are on their way out I did not want to spend time on modifying these.

Bart.

Re: [PATCH v3 0/6] Make SCSI device suspend and resume work reliably

2017-09-24 Thread Ming Lei
On Sat, Sep 23, 2017 at 6:13 AM, Bart Van Assche wrote:
> Hello Jens,
>
> It is known that during the resume following a hibernate sometimes the
> system hangs instead of coming up properly. This patch series fixes this
> problem. This patch series is an alternative for Ming Lei's "[PATCH V5
> 0/10] block/scsi: safe SCSI quiescing" patch series. The advantages of
> this patch series are:

No, your patch doesn't fix scsi quiesce on block legacy, so not an alternative
of my patchset at all.

> - Easier to review because no new race conditions are introduced between
>   queue freezing and blk_cleanup_queue(). As the discussion that followed
>   Ming's patch series shows the correctness of the new code is hard to
>   verify.

I don't agree that my code is hard to verify. I have replied to all your
comments, and the only thing you pay special attention to is the race
between preempt quiesce and blk_cleanup_queue():

- that is simply not a race
- we have depended on drivers (legacy or blk-mq) to handle requests correctly
   after the queue is set as dying for a long, long time


-- 
Ming Lei


[PATCH v3 0/6] Make SCSI device suspend and resume work reliably

2017-09-22 Thread Bart Van Assche
Hello Jens,

It is known that during the resume following a hibernate sometimes the
system hangs instead of coming up properly. This patch series fixes this
problem. This patch series is an alternative for Ming Lei's "[PATCH V5
0/10] block/scsi: safe SCSI quiescing" patch series. The advantages of
this patch series are:
- Easier to review because no new race conditions are introduced between
  queue freezing and blk_cleanup_queue(). As the discussion that followed
  Ming's patch series shows the correctness of the new code is hard to
  verify.
- No new freeze modes and hence no new freeze mode variables.

These patches have been tested on top of a merge of the block layer
for-next branch and Linus' master tree. Linus' master tree includes
patch "KVM: x86: Fix the NULL pointer parameter in check_cr_write()"
but the block layer for-next branch does not yet.

Please consider these changes for kernel v4.15.

Thanks,

Bart.

Changes between v2 and v3:
- Made md kernel threads freezable.
- Changed the approach for quiescing SCSI devices again.
- Addressed Ming's review comments.

Changes compared to v1 of this patch series:
- Changed the approach and rewrote the patch series.

Bart Van Assche (6):
  md: Make md resync and reshape threads freezable
  block: Convert RQF_PREEMPT into REQ_PREEMPT
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  scsi: Set QUEUE_FLAG_PREEMPT_ONLY while quiesced
  block: Make SCSI device suspend and resume work reliably
  scsi-mq: Reduce suspend latency

 block/blk-core.c          | 46 +++---
 block/blk-mq-debugfs.c    |  2 +-
 block/blk-mq.c            |  4 ++--
 block/blk-timeout.c       |  2 +-
 drivers/ide/ide-atapi.c   |  3 +--
 drivers/ide/ide-io.c      |  2 +-
 drivers/ide/ide-pm.c      |  4 ++--
 drivers/md/md.c           | 21 +
 drivers/scsi/scsi_lib.c   | 36 ++--
 fs/block_dev.c            |  4 ++--
 include/linux/blk_types.h |  6 ++
 include/linux/blkdev.h    | 10 ++
 12 files changed, 100 insertions(+), 40 deletions(-)

-- 
2.14.1