I have a number of systems with an iscsi root filesystem. These systems
connect to a redundant pair of iscsi servers, using tgtd. I use heartbeat
to fail over the iscsi target. I'm using open-iscsi 869. I tried both the
iscsi transport from 869 and the default CentOS one (724). The iscsid used
was always 869.

I've set the replacement timeout high, so the iscsi root system should be
able to recover from the short outage if the iscsi target fails over to
another server:

node.session.timeo.replacement_timeout = 86400
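
One thing that may be worth checking is whether the running sessions actually picked up this value, since a root session is typically logged in before the regular configuration is read. The following is a minimal sketch, assuming the kernel exposes the effective per-session value as `recovery_tmo` under `/sys/class/iscsi_session/` (an assumption about the sysfs layout, not something shown in the output above). The function name `read_recovery_timeouts` is made up for illustration.

```python
import glob
import os


def read_recovery_timeouts(sysfs_root="/sys/class/iscsi_session"):
    """Return {session_name: recovery_tmo_seconds} for every session found.

    The sysfs path and the attribute name recovery_tmo are assumptions
    about how the kernel exposes the replacement timeout.
    """
    timeouts = {}
    pattern = os.path.join(sysfs_root, "session*", "recovery_tmo")
    for path in sorted(glob.glob(pattern)):
        session = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            timeouts[session] = int(f.read().strip())
    return timeouts


if __name__ == "__main__":
    configured = 86400  # value set in the configuration above
    for session, tmo in read_recovery_timeouts().items():
        note = "" if tmo == configured else "  <-- differs from configured value"
        print(f"{session}: recovery_tmo={tmo}{note}")
```

If a session reports a value other than 86400, the setting did not reach that session, which would be a separate problem from the one described below.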

Unfortunately, this doesn't always work. Sometimes the OS will report
filesystem errors and mount the fs read-only. A short time later the iscsi
targets will be reconnected, but the filesystem is already read-only by
then.

The logs show (default iscsi transport 724 was used for this test):

Apr 21 11:35:25 front003 kernel: end_request: I/O error, dev sda, sector
1336006
Apr 21 11:35:25 front003 kernel: end_request: I/O error, dev sda, sector
1336006
Apr 21 11:35:25 front003 kernel: Buffer I/O error on device sda1, logical
block 166993
Apr 21 11:35:25 front003 kernel: Buffer I/O error on device sda1, logical
block 166993
<more disk errors>
Apr 21 11:35:26 front003 kernel: ext3_abort called.
Apr 21 11:35:26 front003 kernel: ext3_abort called.
Apr 21 11:35:26 front003 kernel: EXT3-fs error (device sda1):
ext3_journal_start_sb: Detected aborted journal
Apr 21 11:35:26 front003 kernel: EXT3-fs error (device sda1):
ext3_journal_start_sb: Detected aborted journal
Apr 21 11:35:26 front003 kernel: Remounting filesystem read-only
Apr 21 11:35:26 front003 kernel: Remounting filesystem read-only
Apr 21 11:35:36 front003 kernel: connection1:0: iscsi: detected conn error
(1011)
Apr 21 11:35:36 front003 kernel: connection1:0: iscsi: detected conn error
(1011)
Apr 21 11:35:36 front003 iscsid: Kernel reported iSCSI connection 1:0
error (1011) state (3)
Apr 21 11:35:40 front003 kernel: connection5:0: iscsi: detected conn error
(1011)
Apr 21 11:35:40 front003 kernel: connection5:0: iscsi: detected conn error
(1011)
Apr 21 11:35:40 front003 kernel: connection8:0: iscsi: detected conn error
(1011)
Apr 21 11:35:40 front003 kernel: connection8:0: iscsi: detected conn error
(1011)
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: connection1:0 is operational after
recovery (1 attempts)
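
The ordering of events in this excerpt can be made explicit with a small script. This is only a sketch: the event names and the helper `first_seen` are made up for illustration, and the sample lines are copied from the log above with the wrapped lines joined. It records when each kind of message first appears and prints the offsets relative to the first one.

```python
import re

# Syslog prefix: month, day, HH:MM:SS, hostname, then the message.
STAMP_RE = re.compile(r"^\w{3}\s+\d+\s+(\d{2}):(\d{2}):(\d{2})\s+\S+\s+(.*)$")

# Illustrative event names mapped to a substring that identifies them.
EVENTS = [
    ("io_error", "I/O error"),
    ("remount_ro", "Remounting filesystem read-only"),
    ("conn_error", "detected conn error"),
    ("recovered", "is operational after"),
]


def first_seen(lines):
    """Map each event name to the time of day (seconds) it first appears."""
    seen = {}
    for line in lines:
        m = STAMP_RE.match(line)
        if not m:
            continue  # continuation line or unrelated text
        hours, minutes, seconds, message = m.groups()
        t = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        for name, needle in EVENTS:
            if needle in message and name not in seen:
                seen[name] = t
    return seen


SAMPLE = [
    "Apr 21 11:35:25 front003 kernel: end_request: I/O error, dev sda, sector 1336006",
    "Apr 21 11:35:26 front003 kernel: Remounting filesystem read-only",
    "Apr 21 11:35:36 front003 kernel: connection1:0: iscsi: detected conn error (1011)",
    "Apr 21 11:35:40 front003 iscsid: connection1:0 is operational after recovery (1 attempts)",
]

if __name__ == "__main__":
    seen = first_seen(SAMPLE)
    base = min(seen.values())
    for name, t in sorted(seen.items(), key=lambda kv: kv[1]):
        print(f"+{t - base:3d}s  {name}")
```

For the lines above, the first I/O error is logged 11 seconds before the first "detected conn error" message, and the read-only remount happens 1 second after the first I/O error.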

Is there any way to prevent this, so an iscsi root system can recover
gracefully from a short outage?

Niels

