Re: [PATCH 12/14] lpfc: Fix create_association oops on unloading LPFC driver

James Smart Mon, 09 Apr 2018 14:14:29 -0700

On 4/9/2018 1:03 AM, Hannes Reinecke wrote:

On Sat,  7 Apr 2018 11:30:24 -0700
James Smart <[email protected]> wrote:

Driver unload isn't waiting for all outstanding nvme associations
to terminate before clearing structures. In particular, it did not
set dev_loss_tmo to 0 such that all associations are immediately
terminated. Thus the transport would enter reconnect timeouts and
reattempt reconnect to an nvme controller. The call makes a call
into the driver to create hw queues for the controller which causes
a NULL pointer reference.

Correct by changing the teardown process to change all dev_loss_tmo
timeouts to 0 so that they are immediate. Now the teardown process
initiates, the remote ports unregistered and delete callback made,
and as the associations are immediate upon remoteport unregister, the
transport will no longer invoke the callbacks for a new controller.

Signed-off-by: Dick Kennedy <[email protected]>
Signed-off-by: James Smart <[email protected]>
---
  drivers/scsi/lpfc/lpfc_hbadisc.c | 20 ++++++++++++++++++++
  1 file changed, 20 insertions(+)


Hmm. This seems to be a very circumspect way of deleting all
outstanding I/O...
Is there any guarantee that nvme_fc_set_remoteport_devloss() will
return only after all callbacks are invoked?

Well, roundabout - I agree. No, the set_remoteport_devloss won't make the guarantee, but the unregister_remoteport and the wait for the remoteport_delete call will.
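
To illustrate the ordering argument, here is a minimal userspace sketch. It is a model only, not the lpfc or nvme-fc code: every model_* name is a hypothetical stand-in, and a pthread condition variable stands in for whatever the driver actually uses to wait. It assumes the delete callback arrives asynchronously after unregister. Setting devloss just stores a value and returns; the guarantee comes from blocking until the delete callback has run.

```c
/* Userspace model only. All model_* names are hypothetical stand-ins. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model_rport {
	pthread_mutex_t lock;
	pthread_cond_t  deleted_cv;
	bool            deleted;       /* set once the delete callback has run */
	unsigned int    dev_loss_tmo;  /* seconds */
};

/* Stand-in for the driver's remoteport_delete callback. */
static void model_remoteport_delete(struct model_rport *rp)
{
	pthread_mutex_lock(&rp->lock);
	rp->deleted = true;
	pthread_cond_signal(&rp->deleted_cv);
	pthread_mutex_unlock(&rp->lock);
}

/* Stand-in for the transport finishing teardown in its own context. */
static void *model_transport_teardown(void *arg)
{
	model_remoteport_delete(arg);
	return NULL;
}

/* Only records the new timeout; it does not wait for any callback. */
static void model_set_devloss(struct model_rport *rp, unsigned int tmo)
{
	pthread_mutex_lock(&rp->lock);
	rp->dev_loss_tmo = tmo;
	pthread_mutex_unlock(&rp->lock);
}

/* Starts teardown and returns at once; the callback comes later. */
static int model_unregister(struct model_rport *rp, pthread_t *worker)
{
	return pthread_create(worker, NULL, model_transport_teardown, rp);
}

/* Blocks until the delete callback has been made. */
static void model_wait_for_delete(struct model_rport *rp)
{
	pthread_mutex_lock(&rp->lock);
	while (!rp->deleted)
		pthread_cond_wait(&rp->deleted_cv, &rp->lock);
	pthread_mutex_unlock(&rp->lock);
}

/* Unload order: devloss to 0, unregister, then wait. Returns 0 on success. */
int model_driver_unload(struct model_rport *rp)
{
	pthread_t worker;

	model_set_devloss(rp, 0);
	if (model_unregister(rp, &worker) != 0)
		return -1;
	model_wait_for_delete(rp);
	pthread_join(worker, NULL);
	assert(rp->deleted);  /* only now is it safe to clear structures */
	return 0;
}
```

In this model, dropping model_wait_for_delete() would let the unload path continue while the callback is still pending, which is the kind of window being discussed here.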

And as I look deeper at this failure scenario, I'm starting to believe that the actual problem was the missed unregister_remoteport that was one of the other problems corrected in the patch set - by patch 12 or 14.

I'm going to repost, pulling this patch from the set. We'll retest and if still needed, we'll fix it in the next patch set.


-- james
