Long post, but please bear with me :)

This post is related to my previous post at:

My situation:

I have Linux initiators running open-iscsi 2.0-869 with dm-multipath and
queue_if_no_path enabled. The target is an OpenSolaris box sharing
zvols from a mirrored zpool, which means the target LUNs are virtual
devices whose storage is backed by the ZFS zpool.

The problem:

When one of the disks in the Solaris zpool dies, ZFS halts reads/
writes to the zpool for a minute or two while it waits for the disk
controller/driver to decide whether the device should be offlined.
Because the iSCSI targets are virtual devices backed by that zpool,
iSCSI reads/writes are also halted for as long as ZFS is waiting for
the device to fail. The iSCSI targets don't disappear; they just
cannot complete read/write ops - they still respond fine to logins and
target discovery. Once ZFS resumes operation, the iSCSI devices also
resume normal operation. Since I am using multipath on the Linux
initiators, the Linux boxes can wait patiently for reads/writes to
resume, but it seems the SCSI layer does not retry TUR (Test Unit
Ready) commands, which can leave the device permanently out of
operation on the initiator node.
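For reference, the queuing behavior on the initiators comes from a
multipath.conf stanza along these lines (a sketch - the WWID and alias
here are illustrative, not my actual values):

```
defaults {
    # Queue I/O instead of failing it while no paths are available
    features "1 queue_if_no_path"
}

multipaths {
    multipath {
        wwid  3600144f000000000    # illustrative WWID
        alias xyz                  # appears as /dev/mapper/xyz
    }
}
```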

The process:

- Linux initiator logs in to an iSCSI target and maps it to /dev/sdc
- multipath maps /dev/sdc to /dev/mapper/xyz with queue_if_no_path
- Apps start reading/writing to /dev/mapper/xyz
- A disk in the Solaris server fails
- ZFS halts reads/writes to the zpool, which also halts iSCSI reads/writes
- Linux reads/writes to /dev/mapper/xyz halt.
- Linux SCSI layer waits for /sys/block/sdc/device/timeout seconds
before it runs the error-handler (eh) code path
- SCSI eh tries to abort any outstanding tasks issued to the iSCSI
device. This fails.
- SCSI eh tries an LU (logical unit) reset. This fails.
- open-iscsi logs out and back into the iSCSI target. This works fine.
- SCSI eh sends a TUR to the iSCSI device. This fails because ZFS is
still waiting for the failed disk to time out
- Solaris finally marks the device as faulted. ZFS resumes normal
reads/writes on the zpool.
- Linux apps are still waiting for /dev/mapper/xyz to come back
online, but since the scsi layer only sends one TUR and never retries
if it fails, the device never comes back automatically
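Each stage above can be observed on the initiator with a few commands
('sdc' is the example device from the steps; the checks are guarded so
this is harmless on a box without that device):

```shell
# Inspect SCSI/iSCSI/multipath state on the initiator for one device.
dev=sdc
sysfs=/sys/block/$dev/device
[ -r "$sysfs/timeout" ] && cat "$sysfs/timeout"  # seconds before the eh code path runs
[ -r "$sysfs/state" ]   && cat "$sysfs/state"    # 'running' normally; 'offline' after eh gives up
command -v iscsiadm  >/dev/null && iscsiadm -m session  # the session stays logged in throughout
command -v multipath >/dev/null && multipath -ll        # path failed; I/O queued by queue_if_no_path
true
```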

My questions:

- I want the Linux initiators to queue or pause read/write requests
for up to 24 hours and periodically (every 15-30 seconds) attempt to
reset and online the iSCSI device. What is the best way to do this?
- I can extend the timeout period by setting /sys/block/sdc/device/
timeout to a larger value, but is this wise? What are the dangers of
setting this to a large value?
- I can online the device with 'echo running > /sys/block/sdc/device/
state'. This may be fine to do manually once I know ZFS has resumed
reads/writes, but what if ZFS is still halted? What if I just blindly
set this to 'running' on every iSCSI device every 15 seconds via a
script (I can't imagine that would be optimal)?
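One way to make the third option less blind is to gate the 'echo
running' on an actual successful TUR, so the device is only onlined
once the target can really serve I/O again. A rough sketch (sg_turs is
from sg3_utils; the device names, interval, and 24-hour budget are
illustrative, not a tested production script):

```shell
# wait_for_device SYSFS_DIR TUR_CMD [INTERVAL] [DEADLINE]
# Polls an offlined SCSI device and onlines it only after a TUR succeeds.
wait_for_device() {
    sysfs=$1               # e.g. /sys/block/sdc/device
    tur=$2                 # e.g. "sg_turs /dev/sdc" (from sg3_utils)
    interval=${3:-15}      # seconds between probes
    deadline=${4:-86400}   # give up after 24 hours by default
    elapsed=0
    while [ "$elapsed" -le "$deadline" ]; do
        # Nothing to do if the device is already back.
        if [ "$(cat "$sysfs/state" 2>/dev/null)" = "running" ]; then
            return 0
        fi
        # Only online the device when the target can actually serve I/O;
        # writing 'running' while ZFS is still stalled would just restart
        # the same error-handler cycle.
        if $tur >/dev/null 2>&1; then
            echo running > "$sysfs/state"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}

# Example (as root): wait_for_device /sys/block/sdc/device "sg_turs /dev/sdc"
```

One instance per iSCSI device, run from cron or a small daemon, would
cover the 24-hour window; queue_if_no_path keeps the application I/O
queued on /dev/mapper/xyz in the meantime.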

I want the Linux box to behave the way it does when an NFS connection
is disrupted: just wait (up to 24 hours) until the connection recovers,
then resume operation as if nothing had happened.

Thanks in advance,

You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
