Disk Disappears From System When open-iscsi Re-connects To COMSTAR HA Target

2010-05-04 Thread Preston Connors
Good day,

I am utilizing an Ubuntu 9.10 amd64 open-iscsi 2.0-870 initiator
connecting to a highly available OpenSolaris snv_134 64bit COMSTAR
target. The HA target consists of one active and one passive identical
OpenSolaris servers utilizing the same disk. When the active target
fails the passive target comes on-line using the same I.P. address,
target information, and serves up the same disk.

I have verified that the OpenSolaris servers are failing over
properly. The whole fail over process takes between 30 and 45 seconds.

The problem lies after the active target goes off-line and the passive
target becomes the active target. During fail over I can see the
initiator iSCSI disk going into block state and the initiator session
going into FAIL state. When the target fail over is complete I can see
the initiator iSCSI disk is in state running and the initiator session
state is LOGGED_IN.  Even though the output from iscsiadm states this
disk is running and the iSCSI session is established I can no longer
access this iSCSI disk via the file system and the disk no longer
shows up in fdisk nor am I able to mount it. If I log out of the
target and log back in the disk becomes available and usable.
Unfortunately I do not want to log out and log back in to the target
because there will be active KVM VMs whose file systems are based on
this iSCSI connection, and the KVM VMs will cease to work if a
logout/login happens.
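
The disk and session states described above can also be read straight from sysfs, which may help show exactly when each one changes during a fail over. This is only a minimal sketch, assuming disk sdd and session ID 3 as in the iscsiadm output in this thread; the function names and the SYSFS_ROOT override are hypothetical and not part of open-iscsi.

```shell
#!/bin/sh
# Sketch: print the SCSI device state and iSCSI session state from sysfs.
# SYSFS_ROOT is a hypothetical override so the functions can be pointed at a
# test directory; on a real host it stays /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the state of a SCSI disk (e.g. running, blocked, offline),
# or "missing" if sysfs no longer exposes the device.
disk_state() {
    f="$SYSFS_ROOT/block/$1/device/state"
    if [ -r "$f" ]; then cat "$f"; else echo "missing"; fi
}

# Print the state of an iSCSI session (e.g. LOGGED_IN, FAILED),
# or "missing" if the session is not present in sysfs.
session_state() {
    f="$SYSFS_ROOT/class/iscsi_session/session$1/state"
    if [ -r "$f" ]; then cat "$f"; else echo "missing"; fi
}

# Print one timestamped line combining both states.
report_states() {
    echo "$(date +%T) disk=$1:$(disk_state "$1") session=$2:$(session_state "$2")"
}
```

Something like `while sleep 1; do report_states sdd 3; done`, left running through a fail over, would give a second-by-second record to compare against the iscsiadm output.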

Another oddity I find is that when I fail back to the original state
(the original passive node is no longer active and the off-line active
node is now back on-line and active) the disk in question becomes
available and usable by the system.

Here is output from some relevant processes while the initiator is
successfully connected to the HA target.

Any help, insight, or ideas would be greatly appreciated! I can
provide any other output and/or settings needed to provide more
in-depth information.

Thank you in advance,
Preston Connors

FDISK OUTPUT BEFORE FAIL OVER:

r...@host:~# fdisk -l /dev/sdd

Disk /dev/sdd: 5368 MB, 5368709120 bytes
166 heads, 62 sectors/track, 1018 cylinders
Units = cylinders of 10292 * 512 = 5269504 bytes
Disk identifier: 0x508cb22f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1        1018     5238597   83  Linux


ISCSIADM OUTPUT BEFORE FAIL OVER:

r...@host:~# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
iscsiadm version 2.0-870

Target: iqn.1986-03.com.sun:mirror:iscsi-failover-test
Current Portal: 192.168.1.1:3260,2
Persistent Portal: 192.168.1.1:3260,2
**
Interface:
**
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01:bb82d0f5e87f
Iface IPaddress: 192.168.1.2
Iface HWaddress: default
Iface Netdev: default
SID: 3
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE

Negotiated iSCSI params:

HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 131072
MaxXmitDataSegmentLength: 32768
FirstBurstLength: 65536
MaxBurstLength: 524288
ImmediateData: Yes
InitialR2T: Yes
MaxOutstandingR2T: 1

Attached SCSI devices:

Host Number: 6  State: running
scsi6 Channel 00 Id 0 Lun: 0
Attached scsi disk sdd  State: running

ISCSIADM OUTPUT DURING FAIL OVER:

r...@host:~# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
iscsiadm version 2.0-870

Target: iqn.1986-03.com.sun:mirror:iscsi-failover-test
Current Portal: 192.168.1.1:3260,2
Persistent Portal: 192.168.1.1:3260,2
**
Interface:
**
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01:bb82d0f5e87f
Iface IPaddress: 192.168.1.2
Iface HWaddress: default
Iface Netdev: default
SID: 3
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: FAILED
Internal iscsid Session State: REOPEN

Negotiated iSCSI params:

HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 131072
MaxXmitDataSegmentLength: 32768

Re: Disk Disappears From System When open-iscsi Re-connects To COMSTAR HA Target

2010-05-21 Thread Preston Connors
Running udevadm trigger (udevadm version 147) does not repopulate the
/dev disk entries. During the fail over, IO is being sent to the disk,
and there are IO errors relating to this disk during that time. When
this development environment is accessible again I will report back
with more specific details and error messages, and I will also try
dd if=/dev/sdd of=/dev/null on the disk while the fail over is
occurring.

On May 5, 11:51 am, Mike Christie micha...@cs.wisc.edu wrote:
 On 04/30/2010 02:07 PM, Preston Connors wrote:

  FDISK OUTPUT AFTER FAIL OVER:
  r...@kvm-host-3:~# fdisk -l /dev/sdd
  no output

 It is weird that the /dev/sdd link is now gone, but iscsiadm can see the
 disk below. iscsiadm looks in /sys/block. It does not look at the /dev dir.
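
Since iscsiadm reads /sys/block while fdisk and mount need the node under /dev, one way to confirm the mismatch is to list every block device that sysfs knows about but that has no /dev entry. A rough sketch follows; missing_dev_nodes and the SYS_BLOCK / DEV_DIR overrides are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: list block devices present in sysfs that have no node under /dev.
# SYS_BLOCK and DEV_DIR are hypothetical overrides for testing; on a real
# host they stay /sys/block and /dev.
SYS_BLOCK="${SYS_BLOCK:-/sys/block}"
DEV_DIR="${DEV_DIR:-/dev}"

missing_dev_nodes() {
    for d in "$SYS_BLOCK"/*; do
        [ -e "$d" ] || continue              # glob matched nothing
        # sysfs writes '/' in a device name as '!', so map it back
        name=$(basename "$d" | tr '!' '/')
        # -e follows symlinks, so a dangling link also counts as missing
        [ -e "$DEV_DIR/$name" ] || echo "$name"
    done
    return 0
}
```

If `missing_dev_nodes` prints sdd after the fail over, the kernel still has the disk and only the /dev node was removed.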

 The iscsi layer does not really do anything wrt /dev population except
 transport requests. I wonder what is removing the /dev links during this
 time. It is not the iscsi layer or tools (we do not touch that stuff).
 Are you using udev? Is it possible to rerun udev to create the /dev
 links? Do you see anything in /dev/disk?
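
The udev questions above could be checked roughly as follows. This is a sketch rather than a tested procedure for this setup: retrigger_block_udev only wraps the standard udevadm trigger and udevadm settle commands and needs root, while links_for_disk and the DISK_LINK_DIR override are hypothetical helpers for listing the /dev/disk symlinks that point at a given disk.

```shell
#!/bin/sh
# DISK_LINK_DIR is a hypothetical override for testing; normally /dev/disk.
DISK_LINK_DIR="${DISK_LINK_DIR:-/dev/disk}"

# Re-run udev rules for block devices and wait for the event queue to drain.
# Needs root; defined here but not called automatically.
retrigger_block_udev() {
    udevadm trigger --subsystem-match=block && udevadm settle
}

# Print every symlink under /dev/disk/by-* whose target's base name is the
# given disk (e.g. sdd). Prints nothing if no link points at it.
links_for_disk() {
    for l in "$DISK_LINK_DIR"/by-*/*; do
        [ -L "$l" ] || continue              # skip non-links and empty globs
        if [ "$(basename "$(readlink "$l")")" = "$1" ]; then
            echo "$l"
        fi
    done
    return 0
}
```

Running `links_for_disk sdd` before the fail over, after it, and again after `retrigger_block_udev` would show whether the by-path and by-id links disappear together with /dev/sdd.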

 While the failover is occurring are you running IO through the FS? Do
 you see IO errors in /var/log/messages during this time?

 After the failover, if you try to send IO to the FS what IO errors do
 you see in /var/log/messages?

 If you just do IO directly to the disk, before and after the failover
 (so just leave a dd if=/dev/sdd of=/dev/null running during the test) do
 you see any IO errors in /var/log/messages?
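
The direct-IO test suggested above might be scripted along these lines. It is a sketch under assumptions: the helper names are hypothetical, and the grep pattern assumes kernel messages of the form "I/O error, dev sdd, sector ...", which may differ between kernel versions.

```shell
#!/bin/sh
# Read the whole device once, discarding the data. Returns dd's exit
# status, which is non-zero if a read fails.
read_disk_once() {
    dd if="$1" of=/dev/null bs=1048576 2>/dev/null
}

# Count log lines reporting an I/O error for the given disk.
# $1 = log file (e.g. /var/log/messages), $2 = disk name (e.g. sdd).
# The message format matched here is an assumption.
count_io_errors() {
    grep -c "I/O error, dev $2," "$1" || true
}
```

For example, `while read_disk_once /dev/sdd; do :; done` could be left running across the fail over, with `count_io_errors /var/log/messages sdd` run before and after to see whether new errors were logged.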



  ISCSIADM OUTPUT AFTER FAIL OVER:
  r...@host:~# iscsiadm -m session -P 3
  iSCSI Transport Class version 2.0-870
  iscsiadm version 2.0-870

  Target: iqn.1986-03.com.sun:mirror:iscsi-failover-test
     Current Portal: 192.168.1.1:3260,2
     Persistent Portal: 192.168.1.1:3260,2
             **
             Interface:
             **
             Iface Name: default
             Iface Transport: tcp
             Iface Initiatorname: iqn.1993-08.org.debian:01:bb82d0f5e87f
             Iface IPaddress: 192.168.1.2
             Iface HWaddress: default
             Iface Netdev: default
             SID: 3
             iSCSI Connection State: LOGGED IN
             iSCSI Session State: LOGGED_IN
             Internal iscsid Session State: NO CHANGE
             
             Negotiated iSCSI params:
             
             HeaderDigest: None
             DataDigest: None
             MaxRecvDataSegmentLength: 131072
             MaxXmitDataSegmentLength: 32768
             FirstBurstLength: 65536
             MaxBurstLength: 524288
             ImmediateData: Yes
             InitialR2T: Yes
             MaxOutstandingR2T: 1
             
             Attached SCSI devices:
             
             Host Number: 6  State: running
             scsi6 Channel 00 Id 0 Lun: 0
                     Attached scsi disk sdd          State: running

 --
 You received this message because you are subscribed to the Google Groups 
 open-iscsi group.
 To post to this group, send email to open-is...@googlegroups.com.
 To unsubscribe from this group, send email to 
 open-iscsi+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/open-iscsi?hl=en.
