On Fri, May 18, 2018 at 10:38:20AM -0600, Keith Busch wrote:
> IO may be retryable, so don't wait for it in the reset path. A command
> that expires without a completion may trigger a reset and be placed on
> the requeue list; waiting for it would then deadlock the reset handler.
> 
> To fix the theoretical deadlock, this patch unblocks IO submission from
> the reset_work as before, but moves the waiting to the IO-safe scan_work
> so that the reset_work may proceed to completion. Since the unfreezing
> now happens in the controller LIVE state, the nvme device has to track
> whether the queues were frozen in order to keep the freeze depth correct.
> 
> This patch also renames the function 'nvme_dev_add' to a more
> appropriate name that describes what it actually does:
> nvme_alloc_io_tags.
> 
> Signed-off-by: Keith Busch <keith.bu...@intel.com>
> ---
>  drivers/nvme/host/core.c |  3 +++
>  drivers/nvme/host/nvme.h |  1 +
>  drivers/nvme/host/pci.c  | 46 +++++++++++++++++++++++++++++++++-------------
>  3 files changed, 37 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1de68b56b318..34d7731f1419 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -214,6 +214,7 @@ static inline bool nvme_req_needs_retry(struct request *req)
>       if (blk_noretry_request(req))
>               return false;
>       if (nvme_req(req)->status & NVME_SC_DNR)
> +
>               return false;
>       if (nvme_req(req)->retries >= nvme_max_retries)
>               return false;
> @@ -3177,6 +3178,8 @@ static void nvme_scan_work(struct work_struct *work)
>       struct nvme_id_ctrl *id;
>       unsigned nn;
>  
> +     if (ctrl->ops->update_hw_ctx)
> +             ctrl->ops->update_hw_ctx(ctrl);
>       if (ctrl->state != NVME_CTRL_LIVE)
>               return;
>  
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index c15c2ee7f61a..230c5424b197 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -320,6 +320,7 @@ struct nvme_ctrl_ops {
>       int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
>       int (*reinit_request)(void *data, struct request *rq);
>       void (*stop_ctrl)(struct nvme_ctrl *ctrl);
> +     void (*update_hw_ctx)(struct nvme_ctrl *ctrl);
>  };
>  
>  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 2bd9d84f58d0..6a7cbc631d92 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -99,6 +99,7 @@ struct nvme_dev {
>       u32 cmbloc;
>       struct nvme_ctrl ctrl;
>       struct completion ioq_wait;
> +     bool queues_froze;
>  
>       /* shadow doorbell buffer support: */
>       u32 *dbbuf_dbs;
> @@ -2065,10 +2066,33 @@ static void nvme_disable_io_queues(struct nvme_dev *dev)
>       }
>  }
>  
> +static void nvme_pci_update_hw_ctx(struct nvme_ctrl *ctrl)
> +{
> +     struct nvme_dev *dev = to_nvme_dev(ctrl);
> +     bool unfreeze;
> +
> +     mutex_lock(&dev->shutdown_lock);
> +     unfreeze = dev->queues_froze;
> +     mutex_unlock(&dev->shutdown_lock);

What if nvme_dev_disable() just sets the .queues_froze flag and
userspace sends a RESCAN command at the same time?
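
To make the window concrete, here is a rough user-space sketch of that
ordering: just pthreads, with the names (queues_froze, shutdown_lock)
borrowed from the patch, not kernel code. The user-triggered rescan can
observe the flag after nvme_dev_disable() has set it but before
reset_work has finished, so it runs the wait/unfreeze that was meant for
the reset-triggered scan:

/* Toy model of the window; timings are artificial. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t shutdown_lock = PTHREAD_MUTEX_INITIALIZER;
static bool queues_froze;
static atomic_bool reset_done;

static void *reset_path(void *arg)
{
	/* nvme_dev_disable(): freeze the queues and record it */
	pthread_mutex_lock(&shutdown_lock);
	queues_froze = true;
	pthread_mutex_unlock(&shutdown_lock);

	sleep(1);			/* reset_work still running */
	atomic_store(&reset_done, true);
	return NULL;
}

static void *user_rescan(void *arg)
{
	bool unfreeze;

	/* nvme_pci_update_hw_ctx() from a user-triggered scan */
	pthread_mutex_lock(&shutdown_lock);
	unfreeze = queues_froze;
	pthread_mutex_unlock(&shutdown_lock);

	if (unfreeze && !atomic_load(&reset_done))
		printf("rescan consumed queues_froze before reset finished\n");
	return NULL;
}

int main(void)
{
	pthread_t r, u;

	pthread_create(&r, NULL, reset_path, NULL);
	usleep(10000);			/* let the disable step run first */
	pthread_create(&u, NULL, user_rescan, NULL);
	pthread_join(r, NULL);
	pthread_join(u, NULL);
	return 0;
}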

> +
> +     if (unfreeze)
> +             nvme_wait_freeze(&dev->ctrl);
> +

A timeout may fire just before or during blk_mq_update_nr_hw_queues() or
the above nvme_wait_freeze(), and then both may hang forever.
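
To illustrate the hang, a crude standalone model (plain C with pthreads,
nothing block-layer about it): the freeze wait can only finish once all
in-flight requests drain, but a timed-out request is merely requeued, so
nothing drains until a reset completes or cancels it. The 5-second
escape below exists only so the toy terminates; the real
nvme_wait_freeze() has no such exit:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int in_flight = 1;	/* one request stuck in the driver */

static void *timeout_handler(void *arg)
{
	sleep(1);
	/* requeue instead of complete: in_flight stays at 1 */
	printf("timeout fired, request requeued, not completed\n");
	return NULL;
}

static void *scan_work(void *arg)
{
	int waited = 0;

	/* stand-in for nvme_wait_freeze(): wait for requests to drain */
	while (atomic_load(&in_flight) > 0) {
		sleep(1);
		if (++waited >= 5) {
			printf("still waiting after %ds: this is the hang\n",
			       waited);
			return NULL;	/* the real code has no such escape */
		}
	}
	return NULL;
}

int main(void)
{
	pthread_t s, t;

	pthread_create(&s, NULL, scan_work, NULL);
	pthread_create(&t, NULL, timeout_handler, NULL);
	pthread_join(s, NULL);
	pthread_join(t, NULL);
	return 0;
}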

> +     blk_mq_update_nr_hw_queues(ctrl->tagset, dev->online_queues - 1);
> +     nvme_free_queues(dev, dev->online_queues);
> +
> +     if (unfreeze)
> +             nvme_unfreeze(&dev->ctrl);
> +
> +     mutex_lock(&dev->shutdown_lock);
> +     dev->queues_froze = false;
> +     mutex_unlock(&dev->shutdown_lock);

If the running scan work was triggered from user space, the above code
may clear the .queues_froze flag incorrectly.
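
For example, with the flag cleared unconditionally at the end of
nvme_pci_update_hw_ctx(), the following interleaving leaves the freeze
depth unbalanced. A tiny single-threaded model of that ordering (again
only borrowing the names, not kernel code):

#include <stdbool.h>
#include <stdio.h>

static int freeze_depth;	/* stand-in for per-queue mq_freeze_depth */
static bool queues_froze;

int main(void)
{
	/* user-triggered scan samples the flag before the reset starts */
	bool unfreeze = queues_froze;		/* false */

	/* nvme_dev_disable() runs now: freeze and record it */
	freeze_depth++;				/* nvme_start_freeze() */
	queues_froze = true;

	/* tail of the user-triggered update_hw_ctx */
	if (unfreeze)
		freeze_depth--;			/* skipped */
	queues_froze = false;			/* wipes the reset's record */

	/* reset-triggered scan later: flag already false, no unfreeze */
	if (queues_froze)
		freeze_depth--;

	printf("freeze_depth = %d (nonzero: queues stay frozen)\n",
	       freeze_depth);
	return 0;
}

With the flag wiped by the user-triggered scan, the reset-triggered scan
skips nvme_unfreeze() and the queues stay frozen.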

Thanks,
Ming
