fc remote port timeout with qla2xxx driver

2005-08-31 Thread Rudolph Pereira
Hello,

I've been trying to do some basic testing with dm-multipath and in the
course of that have hit a strange situation with the following setup:

- HBA:
scsi0 : qla2xxx
qla2300 :07:01.0: 
 QLogic Fibre Channel HBA Driver: 8.01.00b5-k-debug
  QLogic QLA2342 - 
  ISP2312: PCI-X (133 MHz) @ :07:01.0 hdma-, host#=0, fw=3.03.15 IPX

going to:
Host: scsi0 Channel: 00 Id: 00 Lun: 01
  Vendor: IFT  Model: A16F-R2221   Rev: 342E
  Type:   Direct-Access                    ANSI SCSI revision: 03

on kernel 2.6.13.

The strange situation is that when disconnecting the fibre on the
disk/enclosure end and reconnecting it, the FC port is not reenabled.

I've enabled DEBUG_QLA2100 and QL_DEBUG_LEVEL_14 and am seeing the
following debug messages:

...
Aug 31 15:54:33 baku kernel: scsi(0): fcport-0 - port retry count: 0 remaining
= note, this is where I reconnect the fibre
Aug 31 15:55:10 baku kernel: scsi(0): RSCN database changed -- 000b 1132.
Aug 31 15:55:10 baku kernel: scsi(0): qla2x00_loop_resync()
Aug 31 15:55:10 baku kernel: scsi(0): F/W Ready - OK 
Aug 31 15:55:10 baku kernel: scsi(0): fw_state=3 curr time=83702.
Aug 31 15:55:10 baku kernel: scsi(0): Configure loop -- dpc flags =0xa0
Aug 31 15:55:10 baku kernel: scsi(0): RSCN queue entry[1] = [00/0b1132].
Aug 31 15:55:10 baku kernel: scsi(0): Handle RSCN -- process RSCN for port id [0b1132].
Aug 31 15:55:10 baku kernel: scsi(0): Handle RSCN -- attempting login to [82/0b1132].
Aug 31 15:55:10 baku kernel: scsi(0): Sending Login IOCB (a0002000) to [82/0b1132].
Aug 31 15:55:11 baku kernel: scsi(0): Process IODesc -- processing a0002000.
Aug 31 15:55:11 baku kernel: scsi(0): Login IOCB -- port id [0b1132] already assigned to loop id [81].
Aug 31 15:55:11 baku kernel: scsi(0): Login IOCB -- retrying login to [81/0b1132] (2).
Aug 31 15:55:11 baku kernel: scsi(0): Sending Login IOCB (a0003000) to [81/0b1132].
Aug 31 15:55:11 baku kernel: scsi(0): Process IODesc -- processing a0003000.
Aug 31 15:55:11 baku kernel: scsi(0): Login IOCB -- status=30 mb1=0 pn=21d02367d125.
Aug 31 15:55:11 baku kernel: scsi(0): Login IOCB -- found RSCN fcport in fcports list [f7c84600].
Aug 31 15:55:11 baku kernel: scsi(0): Login IOCB -- marking existing fcport [81/0b1132] online.
Aug 31 15:55:11 baku kernel: scsi(0): Login IOCB -- Freeing RSCN fcport f5c12d80 [81/0b1132].
Aug 31 15:55:11 baku kernel: scsi(0): LOOP READY
Aug 31 15:55:11 baku kernel: scsi(0): qla2x00_loop_resync - end
Aug 31 15:55:16 baku kernel: scsi(0): Port Update -- creating RSCN fcport f5c12d80 for 81/7/6000.
Aug 31 15:55:16 baku kernel: scsi(0): Handle RSCN -- process RSCN for fcport [ff].
Aug 31 15:55:16 baku kernel: scsi(0): Handle RSCN -- attempting login to [81/ff].
Aug 31 15:55:16 baku kernel: scsi(0): Sending Login IOCB (a0004000) to [81/ff].
Aug 31 15:55:16 baku kernel: scsi(0): Port login retry: 21d02367d125, id = 0x0081 retry cnt=10
Aug 31 15:55:16 baku kernel: scsi(0): Process IODesc -- processing a0004000.
Aug 31 15:55:16 baku kernel: scsi(0): fcport-0 - port retry count: 29 remaining
Aug 31 15:55:16 baku kernel: scsi(0): qla2x00_port_login()
Aug 31 15:55:16 baku kernel: scsi(0): Trying Fabric Login w/loop id 0x0081 for port 0b1132.
Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- loop id [81] used by port id [0b1132].
Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- retrying login to [81/0b1132] (2).
Aug 31 15:55:16 baku kernel: scsi(0): Sending Login IOCB (a0005000) to [81/0b1132].
Aug 31 15:55:16 baku kernel: scsi(0): Process IODesc -- processing a0005000.
Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- status=0 mb1=0 pn=21d02367d125.
Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- found RSCN fcport in fcports list [f7c84600].
Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- marking existing fcport [81/0b1132] online.
Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- Freeing RSCN fcport f5c12d80 [81/0b1132].
Aug 31 15:55:16 baku kernel: scsi(0): port login OK: logged in ID 0x81
Aug 31 15:55:16 baku kernel: scsi(0): qla2x00_port_login - end
Aug 31 15:55:50 baku kernel:  rport-0:0-0: blocked FC remote port time out: removing target

At this point, the path is no longer available, whereas it should be
(everything's physically connected). The most worrying indication is the
final "blocked FC remote port time out" message, which suggests the port
is not being unblocked when it should be.

Has anyone seen this issue? Is it known, and if so, are there any
fixes for it?

Any information about this would be appreciated.
Thanks




Re: fc remote port timeout with qla2xxx driver

2005-08-31 Thread Andrew Vasquez
On Wed, 31 Aug 2005, Rudolph Pereira wrote:

> Aug 31 15:55:16 baku kernel: scsi(0): Sending Login IOCB (a0005000) to [81/0b1132].
> Aug 31 15:55:16 baku kernel: scsi(0): Process IODesc -- processing a0005000.
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- status=0 mb1=0 pn=21d02367d125.
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- found RSCN fcport in fcports list [f7c84600].
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- marking existing fcport [81/0b1132] online.
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- Freeing RSCN fcport f5c12d80 [81/0b1132].
> Aug 31 15:55:16 baku kernel: scsi(0): port login OK: logged in ID 0x81
> Aug 31 15:55:16 baku kernel: scsi(0): qla2x00_port_login - end
> Aug 31 15:55:50 baku kernel:  rport-0:0-0: blocked FC remote port time out: removing target
>
> At this point, the path is no longer available, whereas it should be
> (everything's physically connected). The most worrying indication is the
> final "blocked FC remote port time out" message, which suggests the port
> is not being unblocked when it should be.
>
> Has anyone seen this issue? Is it known, and if so, are there any
> fixes for it?

Hmm, could you try the attached small patch?  This should close the
hole so that the fc_remote_port state is restored to a correct state.

---

diff --git a/drivers/scsi/qla2xxx/qla_rscn.c b/drivers/scsi/qla2xxx/qla_rscn.c
--- a/drivers/scsi/qla2xxx/qla_rscn.c
+++ b/drivers/scsi/qla2xxx/qla_rscn.c
@@ -330,6 +330,8 @@ qla2x00_update_login_fcport(scsi_qla_hos
 	fcport->flags &= ~FCF_FAILOVER_NEEDED;
 	fcport->iodesc_idx_sent = IODESC_INVALID_INDEX;
 	atomic_set(&fcport->state, FCS_ONLINE);
+	if (fcport->rport)
+		fc_remote_port_unblock(fcport->rport);
 }
 
 
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
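
To illustrate why a missing unblock call can end in "removing target" even though the login itself succeeds, here is a minimal userspace sketch in C. It is not the kernel's FC transport code: the `rport_model` struct, the `model_*` function names, the state names, and the 35-second timeout are all assumptions invented for this illustration. The only behaviour taken from the thread is that a blocked remote port is removed when its timer expires and that the patch adds an unblock on successful login.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; NOT the kernel's own enum. */
enum rport_state { RPORT_ONLINE, RPORT_BLOCKED, RPORT_REMOVED };

struct rport_model {
    enum rport_state state;
    int timeout_left;           /* seconds until a blocked port is removed */
};

/* Link loss: block the port and arm the removal timer. */
void model_block(struct rport_model *rp, int timeout_secs)
{
    assert(rp != NULL && timeout_secs > 0);
    if (rp->state == RPORT_ONLINE) {
        rp->state = RPORT_BLOCKED;
        rp->timeout_left = timeout_secs;
    }
}

/* Successful relogin with an unblock call: cancel the timer. */
void model_unblock(struct rport_model *rp)
{
    assert(rp != NULL);
    if (rp->state == RPORT_BLOCKED) {
        rp->state = RPORT_ONLINE;
        rp->timeout_left = 0;
    }
}

/* Advance time; a port still blocked when the timer expires is removed. */
void model_tick(struct rport_model *rp, int seconds)
{
    assert(rp != NULL && seconds >= 0);
    if (rp->state != RPORT_BLOCKED)
        return;
    rp->timeout_left -= seconds;
    if (rp->timeout_left <= 0) {
        rp->timeout_left = 0;
        rp->state = RPORT_REMOVED;
    }
}

/* Replay the scenario: unplug, replug and log in, then keep waiting. */
enum rport_state model_replay(bool unblock_on_login)
{
    struct rport_model rp = { RPORT_ONLINE, 0 };

    model_block(&rp, 35);       /* 35 s is an illustrative value only */
    model_tick(&rp, 20);        /* fibre reconnected, login succeeds */
    if (unblock_on_login)
        model_unblock(&rp);     /* the step the patch adds */
    model_tick(&rp, 20);        /* timer would expire in this window */
    return rp.state;
}
```

In this model, `model_replay(false)` ends in `RPORT_REMOVED`, mirroring the timeout message in the original report, while `model_replay(true)` ends in `RPORT_ONLINE`.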


Re: fc remote port timeout with qla2xxx driver

2005-08-31 Thread Rudolph Pereira
On Wed, Aug 31, 2005 at 03:44:09PM -0700, Andrew Vasquez wrote:
> Hmm, could you try the attached small patch?  This should close the
> hole so that the fc_remote_port state is restored to a correct state.

This seems to fix the problem. The debug now shows:
...

Sep  1 10:05:15 baku kernel: scsi(0): LOOP READY
Sep  1 10:05:15 baku kernel: scsi(0): qla2x00_loop_resync - end
Sep  1 10:05:36 baku kernel: scsi(0): Port Update -- creating RSCN fcport f7c2a080 for 81/7/6000.
Sep  1 10:05:36 baku kernel: scsi(0): Handle RSCN -- process RSCN for fcport [ff].
Sep  1 10:05:36 baku kernel: scsi(0): Handle RSCN -- attempting login to [81/ff].
Sep  1 10:05:36 baku kernel: scsi(0): Sending Login IOCB (a0004000) to [81/ff].
Sep  1 10:05:36 baku kernel: scsi(0): Port login retry: 21d02367d125, id = 0x0081 retry cnt=10
Sep  1 10:05:36 baku kernel: scsi(0): Process IODesc -- processing a0004000.
Sep  1 10:05:36 baku kernel: scsi(0): Login IOCB -- loop id [81] used by port id [0b1132].
Sep  1 10:05:36 baku kernel: scsi(0): Login IOCB -- retrying login to [81/0b1132] (2).
Sep  1 10:05:36 baku kernel: scsi(0): Sending Login IOCB (a0005000) to [81/0b1132].
Sep  1 10:05:36 baku kernel: scsi(0): Process IODesc -- processing a0005000.
Sep  1 10:05:36 baku kernel: scsi(0): Login IOCB -- status=0 mb1=0 pn=21d02367d125.
Sep  1 10:05:36 baku kernel: scsi(0): fcport-0 - port retry count: 29 remaining
Sep  1 10:05:36 baku kernel: scsi(0): qla2x00_port_login()
Sep  1 10:05:36 baku kernel: scsi(0): Trying Fabric Login w/loop id 0x0081 for port 0b1132.
Sep  1 10:05:36 baku kernel: scsi(0): Login IOCB -- found RSCN fcport in fcports list [f7db8100].
Sep  1 10:05:36 baku kernel: scsi(0): Login IOCB -- marking existing fcport [81/0b1132] online.
Sep  1 10:05:36 baku kernel: scsi(0): Login IOCB -- Freeing RSCN fcport f7c2a080 [81/0b1132].
Sep  1 10:05:36 baku kernel: scsi(0): port login OK: logged in ID 0x81
Sep  1 10:05:36 baku kernel: scsi(0): qla2x00_port_login - end

One thing that I forgot to mention is that I'm prodding the SCSI layer to
rescan for devices by doing:

echo 1 > '/sys/class/fc_remote_ports/rport-0:0-0/device/target0:0:0/0:0:0:1/rescan'

I did this above at 10:05:36, as shown in the log, which led to the
port_login. This explains the delay between loop_resync and relogin.
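
For reference, the same write to the `rescan` attribute could be issued from a program rather than the shell. The following is only a sketch: `write_attr` is a hypothetical helper name, and it assumes nothing more than that the attribute accepts a short string written to its file, as the shell command does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write a short string to a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 * write_attr is a hypothetical helper, not an existing API.
 */
int write_attr(const char *path, const char *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputs(value, f) >= 0);
    /* Buffered data is flushed on close, so check fclose() as well. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling `write_attr` with the rescan path from the command above and the value `"1\n"` would then correspond to the `echo 1 > ...` invocation; a return of -1 means the file could not be opened or the write was rejected.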

Apologies for the basic question, but is this what one is supposed to
do? (I believe the dm-multipath stuff does this when it tries to update
devices)

If so, it seems like there might be a reference counting issue hanging
around, as I am able to do a rescan _after_ the FC port is blocked (as
indicated in the debug output), whereas I'd expect the fc_remote_port
sysfs stuff to have disappeared. Related to that, when the port is
disconnected, /sys/class/fc_remote_ports/rport-0:0-0/ still exists - I
presume this is part of the same issue.

In any case, thanks for the patch, as it seems to fix the real issue for me.

