Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing

2017-09-29 Thread Martin Steigerwald
Ming Lei - 27.09.17, 16:27:
> On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> > Hi Ming.
> > 
> > Ming Lei - 27.09.17, 13:48:
> > > Hi,
> > > 
> > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock.
> > > 
> > > Once SCSI device is put into QUIESCE, no new request except for
> > > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > > scsi_device_quiesce() just simply waits for completion of I/Os
> > > dispatched to SCSI stack. It isn't enough at all.
> > > 
> > > Because new request still can be coming, but all the allocated
> > > requests can't be dispatched successfully, so request pool can be
> > > consumed up easily.
> > > 
> > > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > > then system hangs forever, such as during system suspend or
> > > sending SCSI domain validation.
> > > 
> > > Both IO hang inside system suspend[1] or SCSI domain validation
> > > were reported before.
> > > 
> > > This patch introduces preempt only mode, and solves the issue
> > > by allowing RQF_PREEMPT only during SCSI quiesce.
> > > 
> > > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > > them all.
> > > 
> > > V6:
> > >   - borrow Bart's idea of preempt only, with clean
> > >     implementation (patch 5/patch 6)
> > >   - needn't any external driver's dependency, such as MD's
> > >     change
> > 
> > Do you want me to test with v6 of the patch set? If so, it would be nice
> > if you'd make a v6 branch in your git repo.
> 
> Hi Martin,
> 
> I appreciate much if you may run V6 and provide your test result,
> follows the branch:
> 
> https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6
> 
> https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6
> 
> > After an uptime of almost 6 days I am pretty confident that the V5 one
> > fixes the issue for me. So
> > 
> > Tested-by: Martin Steigerwald 
> > 
> > for V5.
> 
> Thanks for your test!

Two days and almost 6 hours, no hang yet. I bet the whole thing works. 
(3e45474d7df3bfdabe4801b5638d197df9810a79)

Tested-by: Martin Steigerwald

(It could still hang after three days, but usually I got the first hang within 
the first two days.)

Thanks,
-- 
Martin


Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing

2017-09-28 Thread Oleksandr Natalenko
Hey.

I can confirm that v6 of your patchset still works well for me. Tested on 
v4.13 kernel.

Thanks.

On Wednesday, 27 September 2017 10:52:41 CEST Ming Lei wrote:
> On Wed, Sep 27, 2017 at 04:27:51PM +0800, Ming Lei wrote:
> > On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> > > Hi Ming.
> > > 
> > > Ming Lei - 27.09.17, 13:48:
> > > > Hi,
> > > > 
> > > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock.
> > > > 
> > > > Once SCSI device is put into QUIESCE, no new request except for
> > > > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > > > scsi_device_quiesce() just simply waits for completion of I/Os
> > > > dispatched to SCSI stack. It isn't enough at all.
> > > > 
> > > > Because new request still can be coming, but all the allocated
> > > > requests can't be dispatched successfully, so request pool can be
> > > > consumed up easily.
> > > > 
> > > > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > > > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > > > then system hangs forever, such as during system suspend or
> > > > sending SCSI domain validation.
> > > > 
> > > > Both IO hang inside system suspend[1] or SCSI domain validation
> > > > were reported before.
> > > > 
> > > > This patch introduces preempt only mode, and solves the issue
> > > > by allowing RQF_PREEMPT only during SCSI quiesce.
> > > > 
> > > > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > > > them all.
> > > > 
> > > > V6:
> > > > - borrow Bart's idea of preempt only, with clean
> > > >   implementation (patch 5/patch 6)
> > > > - needn't any external driver's dependency, such as MD's
> > > >   change
> > > 
> > > Do you want me to test with v6 of the patch set? If so, it would be nice
> > > if you'd make a v6 branch in your git repo.
> > 
> > Hi Martin,
> > 
> > I appreciate much if you may run V6 and provide your test result,
> > follows the branch:
> > 
> > https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6
> > 
> > https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6
> 
> Also follows the branch against V4.13:
> 
> https://github.com/ming1/linux/tree/v4.13-safe-scsi-quiesce_V6_for_test
> 
> https://github.com/ming1/linux.git #v4.13-safe-scsi-quiesce_V6_for_test



Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing

2017-09-27 Thread Ming Lei
On Wed, Sep 27, 2017 at 04:27:51PM +0800, Ming Lei wrote:
> On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> > Hi Ming.
> > 
> > Ming Lei - 27.09.17, 13:48:
> > > Hi,
> > > 
> > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock.
> > > 
> > > Once SCSI device is put into QUIESCE, no new request except for
> > > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > > scsi_device_quiesce() just simply waits for completion of I/Os
> > > dispatched to SCSI stack. It isn't enough at all.
> > > 
> > > Because new request still can be coming, but all the allocated
> > > requests can't be dispatched successfully, so request pool can be
> > > consumed up easily.
> > > 
> > > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > > then system hangs forever, such as during system suspend or
> > > sending SCSI domain validation.
> > > 
> > > Both IO hang inside system suspend[1] or SCSI domain validation
> > > were reported before.
> > > 
> > > This patch introduces preempt only mode, and solves the issue
> > > by allowing RQF_PREEMPT only during SCSI quiesce.
> > > 
> > > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > > them all.
> > > 
> > > V6:
> > >   - borrow Bart's idea of preempt only, with clean
> > >     implementation (patch 5/patch 6)
> > >   - needn't any external driver's dependency, such as MD's
> > >     change
> > 
> > Do you want me to test with v6 of the patch set? If so, it would be nice if
> > you'd make a v6 branch in your git repo.
> 
> Hi Martin,
> 
> I appreciate much if you may run V6 and provide your test result,
> follows the branch:
> 
> https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6
> 
> https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6
> 

A branch against V4.13 is also available:

https://github.com/ming1/linux/tree/v4.13-safe-scsi-quiesce_V6_for_test

https://github.com/ming1/linux.git #v4.13-safe-scsi-quiesce_V6_for_test

-- 
Ming


Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing

2017-09-27 Thread Ming Lei
On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> Hi Ming.
> 
> Ming Lei - 27.09.17, 13:48:
> > Hi,
> > 
> > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock.
> > 
> > Once SCSI device is put into QUIESCE, no new request except for
> > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > scsi_device_quiesce() just simply waits for completion of I/Os
> > dispatched to SCSI stack. It isn't enough at all.
> > 
> > Because new request still can be coming, but all the allocated
> > requests can't be dispatched successfully, so request pool can be
> > consumed up easily.
> > 
> > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > then system hangs forever, such as during system suspend or
> > sending SCSI domain validation.
> > 
> > Both IO hang inside system suspend[1] or SCSI domain validation
> > were reported before.
> > 
> > This patch introduces preempt only mode, and solves the issue
> > by allowing RQF_PREEMPT only during SCSI quiesce.
> > 
> > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > them all.
> > 
> > V6:
> > - borrow Bart's idea of preempt only, with clean
> >   implementation (patch 5/patch 6)
> > - needn't any external driver's dependency, such as MD's
> >   change
> 
> Do you want me to test with v6 of the patch set? If so, it would be nice if
> you'd make a v6 branch in your git repo.

Hi Martin,

I would much appreciate it if you could run V6 and share your test results.
The branch is:

https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6

https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6


> 
> After an uptime of almost 6 days I am pretty confident that the V5 one fixes
> the issue for me. So
> 
> Tested-by: Martin Steigerwald 
> 
> for V5.

Thanks for your test!


-- 
Ming


[PATCH V6 0/6] block/scsi: safe SCSI quiescing

2017-09-26 Thread Ming Lei
Hi,

The current SCSI quiesce isn't safe, and it is easy to trigger an I/O deadlock.

Once a SCSI device is put into QUIESCE, no new request except for
RQF_PREEMPT can be dispatched to SCSI successfully, and
scsi_device_quiesce() simply waits for completion of the I/Os already
dispatched to the SCSI stack. That isn't enough.

New requests can still arrive, but none of the allocated
requests can be dispatched successfully, so the request pool is
easily exhausted.

A request with RQF_PREEMPT then can't be allocated and waits forever;
meanwhile scsi_device_resume() waits for completion of RQF_PREEMPT,
so the system hangs forever, for example during system suspend or
while sending SCSI domain validation.

I/O hangs during both system suspend[1] and SCSI domain validation
have been reported before.

This patchset introduces preempt-only mode, and solves the issue
by allowing only RQF_PREEMPT requests during SCSI quiesce.

Both SCSI and SCSI_MQ have this I/O deadlock issue; this patchset fixes
them all.

V6:
- borrow Bart's idea of preempt only, with a clean
  implementation (patch 5/patch 6)
- no longer depends on changes to any external driver, such as MD's
  change

V5:
- fix one tiny race by introducing blk_queue_enter_preempt_freeze();
  given this change is small enough compared with V4, I added
  tested-by directly

V4:
- reorganize patch order to make it more reasonable
- support nested preempt freeze, as required by SCSI transport spi
- check preempt freezing in slow path of blk_queue_enter()
- add "SCSI: transport_spi: resume a quiesced device"
- wake up freeze queue in setting dying for both blk-mq and legacy
- rename blk_mq_[freeze|unfreeze]_queue() in one patch
- rename .mq_freeze_wq and .mq_freeze_depth
- improve comment

V3:
- introduce q->preempt_unfreezing to fix one bug of preempt freeze
- call blk_queue_enter_live() only when queue is preempt frozen
- cleanup a bit on the implementation of preempt freeze
- only patch 6 and 7 are changed

V2:
- drop the 1st patch in V1 because percpu_ref_is_dying() is
  enough, as pointed out by Tejun
- introduce preempt version of blk_[freeze|unfreeze]_queue
- sync between preempt freeze and normal freeze
- fix warning from percpu-refcount as reported by Oleksandr


[1] https://marc.info/?t=150340250100013&r=3&w=2


Thanks,
Ming

Ming Lei (6):
  blk-mq: only run hw queues for blk-mq
  block: tracking request allocation with q_usage_counter
  block: pass flags to blk_queue_enter()
  block: prepare for passing RQF_PREEMPT to request allocation
  block: support PREEMPT_ONLY
  SCSI: set block queue at preempt only when SCSI device is put into
    quiesce

 block/blk-core.c        | 62 ++---
 block/blk-mq.c          | 14 ---
 block/blk-timeout.c     |  2 +-
 drivers/scsi/scsi_lib.c | 25 +---
 fs/block_dev.c          |  4 ++--
 include/linux/blk-mq.h  |  7 +++---
 include/linux/blkdev.h  | 27 ++---
 7 files changed, 106 insertions(+), 35 deletions(-)

-- 
2.9.5