On Wed, 2017-09-27 at 13:48 +0800, Ming Lei wrote:
> @@ -2928,12 +2929,28 @@ scsi_device_quiesce(struct scsi_device *sdev)
> {
> int err;
>
> + /*
> + * Simply quiescing SCSI device isn't safe, it is easy
> + * to use up requests because all these allocated requests
> + * can't be dispatched when device is put in QUIESCE.
> + * Then no request can be allocated and we may hang
> + * somewhere, such as system suspend/resume.
> + *
> + * So we set block queue in preempt only first, no new
> + * normal request can enter queue any more, and all pending
> + * requests are drained once blk_set_preempt_only()
> + * returns. Only RQF_PREEMPT is allowed in preempt only mode.
> + */
> + blk_set_preempt_only(sdev->request_queue, true);
> +
> mutex_lock(&sdev->state_mutex);
> err = scsi_device_set_state(sdev, SDEV_QUIESCE);
> mutex_unlock(&sdev->state_mutex);
>
> - if (err)
> + if (err) {
> + blk_set_preempt_only(sdev->request_queue, false);
> return err;
> + }
>
> scsi_run_queue(sdev->request_queue);
> while (atomic_read(&sdev->device_busy)) {
> @@ -2964,6 +2981,8 @@ void scsi_device_resume(struct scsi_device *sdev)
> scsi_device_set_state(sdev, SDEV_RUNNING) == 0)
> scsi_run_queue(sdev->request_queue);
> mutex_unlock(&sdev->state_mutex);
> +
> + blk_set_preempt_only(sdev->request_queue, false);

You should have realized yourself that this code is racy. If a request is
allocated just before scsi_device_quiesce() is called and dispatched just
after the device state has been changed to SDEV_QUIESCE, then the loop that
waits for all commands to complete will wait forever, because the SCSI prep
function returns BLKPREP_DEFER for that request.

Bart.