Re: [PATCH V3 7/8] nvme: pci: recover controller reliably

jianchao.wang Thu, 03 May 2018 02:15:50 -0700

Hi Ming

On 05/03/2018 11:17 AM, Ming Lei wrote:
>  static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
> @@ -1199,7 +1204,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>       if (nvme_should_reset(dev, csts)) {
>               nvme_warn_reset(dev, csts);
>               nvme_dev_disable(dev, false, true);
> -             nvme_reset_ctrl(&dev->ctrl);
> +             nvme_eh_reset(dev);
>               return BLK_EH_HANDLED;
>       }
>  
> @@ -1242,7 +1247,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>                        "I/O %d QID %d timeout, reset controller\n",
>                        req->tag, nvmeq->qid);
>               nvme_dev_disable(dev, false, true);
> -             nvme_reset_ctrl(&dev->ctrl);
> +             nvme_eh_reset(dev);


Without the 8th patch, invoking nvme_eh_reset in nvme_timeout is dangerous.
nvme_pre_reset_dev sends a lot of admin I/O when initializing the controller.
If these admin I/Os time out, nvme_timeout cannot handle them, because the
timeout work is sleeping while it waits for those same admin I/Os.
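
To make the dependency concrete, here is a toy C model of the problem. It is
not kernel code: struct timeout_worker, wait_for_hung_admin_cmd and the two
reset_* helpers are names made up for illustration, and the model assumes a
single worker services all timeouts.

```c
#include <stdbool.h>

/* Toy model: one worker services every request timeout. */
struct timeout_worker {
	bool busy;	/* true while a timeout handler is running on it */
};

enum admin_cmd_fate {
	ADMIN_CMD_EXPIRED,	/* the timeout handler ran and ended the command */
	ADMIN_CMD_STUCK,	/* nobody is left to expire the command */
};

/*
 * An admin command that the dead controller never completes.
 * Only the timeout worker can end it, and only if it is not busy.
 */
static enum admin_cmd_fate wait_for_hung_admin_cmd(const struct timeout_worker *w)
{
	return w->busy ? ADMIN_CMD_STUCK : ADMIN_CMD_EXPIRED;
}

/* Reset done inside the timeout handler: the worker waits on itself. */
static enum admin_cmd_fate reset_inline(struct timeout_worker *w)
{
	w->busy = true;				/* timeout handler entered */
	return wait_for_hung_admin_cmd(w);	/* handler is still running here */
}

/* Reset handed to another context: the handler returns before the wait. */
static enum admin_cmd_fate reset_deferred(struct timeout_worker *w)
{
	w->busy = true;				/* timeout handler entered */
	/* ... hand the reset over to another context ... */
	w->busy = false;			/* handler returns right away */
	return wait_for_hung_admin_cmd(w);	/* runs in the other context */
}
```

In this model reset_inline() ends with the admin command stuck, while
reset_deferred() lets the worker expire it.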

In addition, even if we take nvme_wait_freeze out of nvme_eh_reset and put it
into another context, the ctrl state is still CONNECTING, so nvme_eh_reset
cannot move forward.
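
A simplified sketch of why CONNECTING blocks a new reset. The toy_* names are
hypothetical, and the transition table below is only my assumption of how the
controller state checks behave (RESETTING can be entered from NEW or LIVE, but
not from CONNECTING); it is not copied from the driver.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of controller states. */
enum toy_ctrl_state {
	TOY_NEW,
	TOY_LIVE,
	TOY_RESETTING,
	TOY_CONNECTING,
};

/*
 * Try to move the controller to 'next'.
 * Returns true and updates *cur only if the transition is allowed.
 */
static bool toy_change_state(enum toy_ctrl_state *cur, enum toy_ctrl_state next)
{
	bool ok = false;

	switch (next) {
	case TOY_RESETTING:
		/* assumed rule: a reset cannot start while CONNECTING */
		ok = (*cur == TOY_NEW || *cur == TOY_LIVE);
		break;
	case TOY_CONNECTING:
		ok = (*cur == TOY_NEW || *cur == TOY_RESETTING);
		break;
	case TOY_LIVE:
		ok = (*cur == TOY_NEW || *cur == TOY_RESETTING ||
		      *cur == TOY_CONNECTING);
		break;
	case TOY_NEW:
		/* nothing moves back to NEW in this model */
		break;
	}

	if (ok)
		*cur = next;
	return ok;
}
```

With the state at TOY_CONNECTING, a request for TOY_RESETTING is refused and
the state stays where it is, which is the "cannot move forward" case above.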

Actually, I reported this issue to Keith before. I hit an I/O hang when the
controller died in nvme_reset_work -> nvme_wait_freeze. As you know,
nvme_reset_work cannot be scheduled again because it is still waiting there.
Here is Keith's commit for this:
http://lists.infradead.org/pipermail/linux-nvme/2018-February/015603.html

Thanks
Jianchao
