Re: [PATCH 09/14] lpfc: Fix driver not recovering NVME rports during target link faults

2018-04-09 Thread Hannes Reinecke
On Sat,  7 Apr 2018 11:30:21 -0700
James Smart  wrote:

> During target-side port faults, the driver would not recover all
> target port logins. This resulted in a loss of nvme device discovery.
> 
> The driver is coded to wait for all GID_FT requests to complete
> before restarting discovery. A fault is seen where the outstanding
> GID_FT counts are not properly decremented, thus discovery would
> never start. Another fault was found in the clearing of the gidft_inp
> counter that would be skipped in this condition. A third fault was
> found in lpfc_nvme_register_port, which would remove a reference
> on the ndlp which then allows a node swap on a port address change
> to prematurely remove the reference and release the ndlp.
> 
> The following changes are made:
> Correct the decrementing of the outstanding GID_FT counters.
> In RSCN handling, no longer zero the counter before calling to
>   issue another GID_FT.
> No longer remove the reference on the ndlp when the ndlp->nrport
>   value is not yet null.
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---
>  drivers/scsi/lpfc/lpfc_ct.c   |  5 +
>  drivers/scsi/lpfc/lpfc_els.c  |  1 -
>  drivers/scsi/lpfc/lpfc_nvme.c | 12 ++--
>  3 files changed, 15 insertions(+), 3 deletions(-)
> 

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes


[PATCH 09/14] lpfc: Fix driver not recovering NVME rports during target link faults

2018-04-07 Thread James Smart
During target-side port faults, the driver would not recover all
target port logins. This resulted in a loss of nvme device discovery.

The driver is coded to wait for all GID_FT requests to complete
before restarting discovery. A fault is seen where the outstanding
GID_FT counts are not properly decremented, thus discovery would
never start. Another fault was found in the clearing of the gidft_inp
counter that would be skipped in this condition. A third fault was
found in lpfc_nvme_register_port, which would remove a reference
on the ndlp which then allows a node swap on a port address change
to prematurely remove the reference and release the ndlp.

The following changes are made:
Correct the decrementing of the outstanding GID_FT counters.
In RSCN handling, no longer zero the counter before calling to
  issue another GID_FT.
No longer remove the reference on the ndlp when the ndlp->nrport
  value is not yet null.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
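For readers following the counter change, here is a minimal standalone
sketch of the accounting the commit message describes: a request is
counted before it is issued, every completion gives back exactly one
count (including the path that skips the response and re-issues), and
discovery restarts only when the count reaches zero. All demo_* names
are hypothetical and exist only for illustration; this is not the lpfc
code and it models no locking.

```c
#include <assert.h>

/* Hypothetical stand-in for the vport; only the counter is modeled. */
struct demo_vport {
	int gidft_inp;      /* requests issued but not yet completed */
	int discovery_runs; /* times discovery was (re)started */
};

/* Issue one request: count it before it would go to the wire. */
static void demo_issue_gidft(struct demo_vport *vp)
{
	vp->gidft_inp++;
}

/*
 * Completion handler. The decrement happens on every completion,
 * including the "skip the response and re-issue" path, so the
 * counter cannot drift upward across retries.
 */
static void demo_cmpl_gidft(struct demo_vport *vp, int reissue)
{
	assert(vp->gidft_inp > 0);
	vp->gidft_inp--;

	if (reissue) {
		/* The re-issued request is counted again by the issue path. */
		demo_issue_gidft(vp);
		return;
	}

	/* Discovery restarts only when nothing is outstanding. */
	if (vp->gidft_inp == 0)
		vp->discovery_runs++;
}
```

With two requests outstanding, one skipped-and-re-issued completion
followed by two normal completions brings the counter back to zero and
starts discovery once. If the re-issue path skipped the decrement, the
counter would stay above zero and discovery would never start, which
is the first fault described above.
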
 drivers/scsi/lpfc/lpfc_ct.c   |  5 +
 drivers/scsi/lpfc/lpfc_els.c  |  1 -
 drivers/scsi/lpfc/lpfc_nvme.c | 12 ++--
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index 0617c8ea88c6..1e7889e45160 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -691,6 +691,11 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
vport->fc_flag &= ~FC_RSCN_DEFERRED;
spin_unlock_irq(shost->host_lock);
 
+   /* This is a GID_FT completing so the gidft_inp counter was
+* incremented before the GID_FT was issued to the wire.
+*/
+   vport->gidft_inp--;
+
/*
 * Skip processing the NS response
 * Re-issue the NS cmd
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 74895e62aaea..6d84a10fef07 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -6268,7 +6268,6 @@ lpfc_els_handle_rscn(struct lpfc_vport *vport)
 * flush the RSCN.  Otherwise, the outstanding requests
 * need to complete.
 */
-   vport->gidft_inp = 0;
if (lpfc_issue_gidft(vport) > 0)
return 1;
} else {
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 1cb2c634e9f7..22962b08c275 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -2721,8 +2721,16 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
spin_unlock_irq(&vport->phba->hbalock);
rport->ndlp = NULL;
rport->remoteport = NULL;
-   if (prev_ndlp)
-   lpfc_nlp_put(ndlp);
+
+   /* Reference only removed if previous NDLP is no longer
+* active. It might be just a swap and removing the
+* reference would cause a premature cleanup.
+*/
+   if (prev_ndlp && prev_ndlp != ndlp) {
+   if ((!NLP_CHK_NODE_ACT(prev_ndlp)) ||
+   (!prev_ndlp->nrport))
+   lpfc_nlp_put(prev_ndlp);
+   }
}
 
/* Clean bind the rport to the ndlp. */
-- 
2.13.1
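
The condition added in the lpfc_nvme.c hunk can be read in isolation as
a small predicate. The sketch below restates it with hypothetical
demo_* types; the active flag stands in for NLP_CHK_NODE_ACT() and the
nrport pointer for ndlp->nrport. It is an illustration of the check
under those assumptions, not the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node; fields mirror only what the check looks at. */
struct demo_node {
	int refcount;
	bool active;   /* stand-in for NLP_CHK_NODE_ACT(ndlp) */
	void *nrport;  /* stand-in for ndlp->nrport */
};

/*
 * Drop the previous node's reference only when there really is a
 * different previous node and it is either inactive or no longer
 * has an nrport. A swap onto the same node, or a previous node that
 * is still active and bound, keeps its reference.
 */
static bool demo_should_put_prev(const struct demo_node *prev,
				 const struct demo_node *cur)
{
	if (prev == NULL || prev == cur)
		return false;
	return !prev->active || prev->nrport == NULL;
}

/* Apply the check when re-binding the rport from prev to cur. */
static void demo_rebind(struct demo_node *prev, struct demo_node *cur)
{
	if (demo_should_put_prev(prev, cur)) {
		assert(prev->refcount > 0);
		prev->refcount--;
	}
}
```

Note that the reference, when dropped, is taken from the previous node
rather than the node being registered, matching the change from
lpfc_nlp_put(ndlp) to lpfc_nlp_put(prev_ndlp) in the hunk.
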