@@ -791,7 +791,8 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 	 * queues are not a live anymore, so restart the queues to fail fast
 	 * new IO
 	 */
-	blk_mq_start_stopped_hw_queues(ctrl->ctrl.admin_q, true);
+	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
+	blk_mq_kick_requeue_list(ctrl->ctrl.admin_q);

Now the queue won't be stopped via blk_mq_quiesce_queue(), so why do you add blk_mq_kick_requeue_list() here?
I think you're right. We now quiesce the queue and fast-fail inflight I/O. In nvme_complete_rq() we call blk_mq_requeue_request() with !blk_mq_queue_stopped(req->q) as the kick argument, which is now true because the queue is quiesced rather than stopped. So the requeue work is triggered and requeues the request, and when we unquiesce we simply run the hw queues again. If we were to call it with !blk_queue_quiesced(req->q) instead, I think the explicit kick would be needed though...
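
To make the reasoning above concrete, here is a toy userspace model, not kernel code. Every toy_* name and struct field is hypothetical and only mimics the flags discussed in this thread. It assumes that kicking the requeue list simply moves parked requests back onto the queue. The sketch shows that with a !stopped check the request is kicked right away on a quiesced queue, while with a !quiesced check it stays parked until someone kicks the list explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; all fields are hypothetical. */
struct toy_queue {
	bool stopped;      /* models what blk_mq_queue_stopped() reports */
	bool quiesced;     /* models what blk_queue_quiesced() reports */
	int requeue_list;  /* requests parked on the requeue list */
	int queued;        /* requests moved back, waiting for a queue run */
};

/* Models kicking the requeue list: parked requests go back on the queue. */
void toy_kick_requeue_list(struct toy_queue *q)
{
	assert(q);
	q->queued += q->requeue_list;
	q->requeue_list = 0;
}

/* Models a requeue with a kick flag: park the request, optionally kick. */
void toy_requeue_request(struct toy_queue *q, bool kick)
{
	assert(q);
	q->requeue_list++;
	if (kick)
		toy_kick_requeue_list(q);
}

/* Kick decided by !stopped: a quiesced (not stopped) queue kicks at once. */
int toy_parked_with_stopped_check(void)
{
	struct toy_queue q = { .stopped = false, .quiesced = true };

	toy_requeue_request(&q, !q.stopped);
	return q.requeue_list;
}

/* Kick decided by !quiesced (hypothetical variant): request stays parked. */
int toy_parked_with_quiesced_check(void)
{
	struct toy_queue q = { .stopped = false, .quiesced = true };

	toy_requeue_request(&q, !q.quiesced);
	return q.requeue_list;
}

/* Hypothetical variant followed by the explicit kick after unquiesce. */
int toy_parked_after_explicit_kick(void)
{
	struct toy_queue q = { .stopped = false, .quiesced = true };

	toy_requeue_request(&q, !q.quiesced);
	q.quiesced = false;           /* models the unquiesce step */
	toy_kick_requeue_list(&q);    /* the kick under discussion */
	return q.requeue_list;
}
```

In this model the first helper leaves nothing parked, the second leaves one request parked, and the third drains it again, which matches the argument that the extra kick only matters if the requeue decision were keyed on the quiesced state.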
