Bug: libfc/fnic connected to JBODs. Rapidly pull/push in the JBODs to generate multiple RSCNs rapidly. There is a race in the remote port/disc state machine that causes multiple entries for the same remote port in /sys/class/fc_remote_ports.
Rogue ports are now tracked in a separate rogue list until they go to READY state and get moved to the real rports list. An incoming RSCN that causes rediscovery or causes a re-plogi to a remote port does not search the rogues list, and thus fails to log it off before starting a new plogi. This causes a race condition where 2 rogue ports are created to the same remote port. (The same problem existed when rogues were not tracked in any list) This patch adds a fix by searching the rogue list in addition to the regular rport list on incoming RSCNs. fc_disc_lookup_rport() is also called for incoming rport requests. I wanted to avoid modifying that function to also search in the rogue list, since that would result into a case where an incoming Plogi would find the rogue session, and significant change is required to support that. So, instead, I added a new function fc_disc_lookup_all_ports() which searches in both lists. It is called on incoming RSCNs. On incoming rport requests, like Plogi etc, existing fc_disc_lookup_rport() function is used, thus avoiding changes to that code path. Also, before re-discovery is initiated, the rogue list is searched and all rogue ports are first logged off. Test: Tested the fix by running the same test case as above. This time, remote ports were logged off and re-logged on correctly, and resulted in single entry per remote port. Please verify bug and fix for libfc/fcoe driver. Signed-off-by: Abhijeet Joglekar <[email protected]> --- drivers/scsi/libfc/fc_disc.c | 44 ++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 42 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c index 6fabf66..f089db6 100644 --- a/drivers/scsi/libfc/fc_disc.c +++ b/drivers/scsi/libfc/fc_disc.c @@ -56,6 +56,39 @@ static void fc_disc_single(struct fc_disc *, struct fc_disc_port *); static void fc_disc_restart(struct fc_disc *); /** + * fc_disc_lookup_all_rports() - lookup a remote port by port_id in both + * the rports and the rogue_rports list + * @lport: Fibre Channel host port instance + * @port_id: remote port port_id to match + */ +struct fc_rport *fc_disc_lookup_all_rports(const struct fc_lport *lport, + u32 port_id) +{ + const struct fc_disc *disc = &lport->disc; + struct fc_rport *rport, *found = NULL; + struct fc_rport_libfc_priv *rdata; + + list_for_each_entry(rdata, &disc->rports, peers) { + rport = PRIV_TO_RPORT(rdata); + if (rport->port_id == port_id) { + found = rport; + goto out; + } + } + + list_for_each_entry(rdata, &disc->rogue_rports, peers) { + rport = PRIV_TO_RPORT(rdata); + if (rport->port_id == port_id) { + found = rport; + goto out; + } + } + +out: + return found; +} + +/** * fc_disc_lookup_rport() - lookup a remote port by port_id * @lport: Fibre Channel host port instance * @port_id: remote port port_id to match @@ -249,10 +282,12 @@ static void fc_disc_recv_rscn_req(struct fc_seq *sp, struct fc_frame *fp, redisc, lport->state, disc->pending); list_for_each_entry_safe(dp, next, &disc_ports, peers) { list_del(&dp->peers); - rport = lport->tt.rport_lookup(lport, dp->ids.port_id); + rport = fc_disc_lookup_all_rports(lport, + dp->ids.port_id); if (rport) { rdata = rport->dd_data; - list_del(&rdata->peers); + if (rdata->trans_state != FC_PORTSTATE_ROGUE) + list_del(&rdata->peers); lport->tt.rport_logoff(rport); } fc_disc_single(disc, dp); @@ -320,6 +355,11 @@ static void fc_disc_restart(struct fc_disc *disc) lport->tt.rport_logoff(rport); } + list_for_each_entry_safe(rdata, next, &disc->rogue_rports, peers) { + rport = PRIV_TO_RPORT(rdata); + lport->tt.rport_logoff(rport); + } + disc->requested = 1; if (!disc->pending) fc_disc_gpn_ft_req(disc); _______________________________________________ devel mailing list [email protected] http://www.open-fcoe.org/mailman/listinfo/devel
