From the trace buffer, it seems that the local port (emlxs in this case) is
causing a logout. I know the emlxs driver has its own tracing, which can be
enabled to get more data. Probably someone from Emulex can help here.

Sumit

-----Original Message-----
From: Matty [mailto:[email protected]]
Sent: Mon 5/4/2009 6:05 PM
To: Sumit Gupta
Cc: [email protected]
Subject: Re: [storage-discuss] Does COMSTAR support ACTIVE/ACTIVE paths?
 
On Mon, May 4, 2009 at 1:00 PM, Sumit Gupta <[email protected]> wrote:
>
> All the paths on COMSTAR are always active-active. I am not sure why the
> paths are showing up as faulty. Try resetting the ports (stmfadm
> offline-target wwn....  followed by stmfadm online-target wwn.....) or
> rebooting the initiator to see if that makes any difference. You can get the
> stmf trace buffer as:
>
> echo '*stmf_trace_buf/s' |mdb -k > stmftrace.txt
>
> This might reveal something.

Hey Sumit,

After rebooting the host, I ran the following dd command on the Linux client:

$ dd if=/dev/mapper/mpath2 of=/dev/null bs=1048576

After this ran for a minute or two, one of the paths failed:

$ multipath -ll
mpath2 (3600144f02d658400000049fba1bf0001) dm-0 SUN,COMSTAR
[size=100G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
 \_ 0:0:0:0 sda 8:0   [active][ready]
 \_ 1:0:0:0 sdb 8:16  [failed][ready]
mpath4 (3600144f02d658400000049ffc0b80001) dm-1 SUN,COMSTAR
[size=100G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
 \_ 0:0:0:1 sdd 8:48  [active][ready]
 \_ 1:0:0:1 sdf 8:80  [failed][ready]
mpath3 (3600144f02d658400000049ffc0bc0002) dm-2 SUN,COMSTAR
[size=100G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
 \_ 0:0:0:2 sde 8:64  [active][ready]
 \_ 1:0:0:2 sdg 8:96  [failed][ready]

And various errors were sent to the messages file:

May  5 04:55:16 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:55:16 disarm kernel: end_request: I/O error, dev sdg, sector 11630336
May  5 04:55:16 disarm kernel: device-mapper: multipath: Failing path 8:96.
May  5 04:56:02 disarm kernel: sd 1:0:0:1: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdf, sector 11620608
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11624192
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11625984
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11625728
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11626240
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11623808
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11625344
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11626496
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11626624
May  5 04:56:02 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:02 disarm kernel: end_request: I/O error, dev sdg, sector 11626752
May  5 04:56:03 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:03 disarm kernel: end_request: I/O error, dev sdg, sector 11627520
May  5 04:56:03 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:03 disarm kernel: end_request: I/O error, dev sdg, sector 11627264
May  5 04:56:03 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:03 disarm kernel: end_request: I/O error, dev sdg, sector 11627008
May  5 04:56:03 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:03 disarm kernel: end_request: I/O error, dev sdg, sector 11627648
May  5 04:56:03 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:03 disarm kernel: end_request: I/O error, dev sdg, sector 11628416
May  5 04:56:03 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:03 disarm kernel: end_request: I/O error, dev sdg, sector 11628544
May  5 04:56:47 disarm multipathd: sdf: readsector0 checker reports path is down
May  5 04:56:47 disarm multipathd: 8:96: mark as failed
May  5 04:56:47 disarm multipathd: mpath3: remaining active paths: 1
May  5 04:56:47 disarm multipathd: dm-2: add map (uevent)
May  5 04:56:47 disarm multipathd: dm-2: devmap already registered
May  5 04:56:54 disarm multipathd: sdg: readsector0 checker reports path is up
May  5 04:56:54 disarm multipathd: 8:96: reinstated
May  5 04:56:54 disarm multipathd: mpath3: remaining active paths: 2
May  5 04:56:54 disarm multipathd: dm-2: add map (uevent)
May  5 04:56:54 disarm multipathd: dm-2: devmap already registered
May  5 04:56:56 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:56:56 disarm kernel: end_request: I/O error, dev sdg, sector 11658240
May  5 04:56:56 disarm kernel: device-mapper: multipath: Failing path 8:96.
May  5 04:57:39 disarm kernel: sd 1:0:0:0: SCSI error: return code = 0x00020000
May  5 04:57:39 disarm kernel: end_request: I/O error, dev sdb, sector 11633792
May  5 04:57:39 disarm kernel: device-mapper: multipath: Failing path 8:16.
May  5 04:57:40 disarm kernel: sd 1:0:0:0: SCSI error: return code = 0x00020000
May  5 04:57:40 disarm kernel: end_request: I/O error, dev sdb, sector 11640064
May  5 04:57:49 disarm multipathd: sdf: readsector0 checker reports path is down
May  5 04:57:49 disarm multipathd: 8:96: mark as failed
May  5 04:57:49 disarm multipathd: mpath3: remaining active paths: 1
May  5 04:57:49 disarm multipathd: dm-2: add map (uevent)
May  5 04:57:49 disarm multipathd: dm-2: devmap already registered
May  5 04:57:49 disarm multipathd: dm-0: add map (uevent)
May  5 04:57:49 disarm multipathd: dm-0: devmap already registered
May  5 04:57:49 disarm multipathd: 8:16: mark as failed
May  5 04:57:49 disarm multipathd: mpath2: remaining active paths: 1
May  5 04:57:53 disarm multipathd: sdg: readsector0 checker reports path is up
May  5 04:57:53 disarm multipathd: 8:96: reinstated
May  5 04:57:53 disarm multipathd: mpath3: remaining active paths: 2
May  5 04:57:53 disarm multipathd: dm-2: add map (uevent)
May  5 04:57:53 disarm multipathd: dm-2: devmap already registered
May  5 04:57:55 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000
May  5 04:57:55 disarm kernel: end_request: I/O error, dev sdg, sector 11721728
May  5 04:57:55 disarm kernel: device-mapper: multipath: Failing path 8:96.
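
For anyone reading these messages: the Linux SCSI midlayer packs four one-byte fields into the "return code" word (status, message, host, driver, from low byte to high). A value of 0x00020000 therefore carries only a host byte of 0x02, which the kernel headers name DID_BUS_BUSY. The sketch below is a rough illustration, not a diagnostic tool: it decodes that word and tallies errors by SCSI host number. The function names, the regular expression, and the three sample lines (copied from the log above) are my own choices for illustration.

```python
import re
from collections import Counter

# Host-byte values as defined by the Linux SCSI midlayer (scsi.h).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}

# Matches lines such as "sd 1:0:0:2: SCSI error: return code = 0x00020000".
# The pattern is an assumption based on the log format shown in this thread.
ERROR_RE = re.compile(
    r"sd (\d+):(\d+):(\d+):(\d+): SCSI error: return code = (0x[0-9a-fA-F]+)"
)


def decode_return_code(code):
    """Split a 32-bit SCSI result word into its four one-byte fields."""
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,
    }


def errors_per_host(lines):
    """Count SCSI error lines per SCSI host number (first field of H:C:T:L)."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Three error lines and one non-matching line taken from the log above.
sample = [
    "May  5 04:55:16 disarm kernel: sd 1:0:0:2: SCSI error: return code = 0x00020000",
    "May  5 04:55:16 disarm kernel: end_request: I/O error, dev sdg, sector 11630336",
    "May  5 04:56:02 disarm kernel: sd 1:0:0:1: SCSI error: return code = 0x00020000",
    "May  5 04:57:39 disarm kernel: sd 1:0:0:0: SCSI error: return code = 0x00020000",
]

fields = decode_return_code(0x00020000)
print(HOST_BYTE_NAMES.get(fields["host"], "unknown"))  # prints DID_BUS_BUSY
print(dict(errors_per_host(sample)))                   # prints {1: 3}
```

Run against the full messages file, the same tally would show whether every error lands on SCSI host 1 (sdb, sdf, sdg), as it does in the excerpt here, while host 0 stays clean.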

If I switch the multi-pathing policy to failover (ACTIVE/PASSIVE),
everything appears to work correctly. It's only when I use multibus
(ACTIVE/ACTIVE) that things go south. The stmftrace data is attached,
and I'm curious to get your thoughts on what might be happening here.

Thanks,
- Ryan
--
http://prefetch.net

_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
