Re: iSCSI and filesystem error recovery

2008-06-23 Thread Mike Christie
galitz wrote:
 
 
 I am evaluating iSCSI in our production environment and have a
 question.
 
 When I induce a failure by powering down the iSCSI target while there
 is active traffic and then restore the iSCSI target 5+ minutes later,
 the filesystem remains in read-only mode.  Fair enough, I see by
 reading the docs that anytime a filesystem error is generated the
 filesystem is made read-only.
 
 I can clear this by unmounting and then remounting the filesystem.  Is
 there a more elegant or a recommended way of restoring the filesystem
 to a read-write state once the iSCSI target has returned to service?
 
 Ideally we'd like this to be a transparent process.  Perhaps
 dm-multipath is what I need?
 

I think dm-multipath is best. But if you set up dm-multipath to eventually 
return IO errors to the layer above it, then you will have the same problem.

At the iscsi layer you can set node.session.timeo.replacement_timeout to 
a higher value; that is how long we will hold on to IO before failing 
it (the default is 120 seconds). There is a limitation in this code: the 
timeout cannot be set to 0, so we can only hold on to IO for a finite 
time. I just did this patch against 869.2, which allows you to set 
node.session.timeo.replacement_timeout to 0, and that will hold on to 
the IO until we reconnect.
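
As a rough illustration, the setting could look like this in iscsid.conf. The path /etc/iscsi/iscsid.conf and the value 600 are assumptions chosen for the example, not recommendations. The value 0 is only accepted with the patch below applied.

```
# /etc/iscsi/iscsid.conf (path assumed; it may differ per distribution)

# Hold on to IO for 10 minutes before failing it up to the SCSI layer.
# 600 is an example value; pick one that covers your expected outage.
node.session.timeo.replacement_timeout = 600

# With the patch below applied, 0 would mean "hold IO until we reconnect":
# node.session.timeo.replacement_timeout = 0
```

Values in iscsid.conf are used as defaults for node records created afterwards. A record that already exists has to be updated separately, for example with iscsiadm -m node -T <target-iqn> -p <portal> -o update -n node.session.timeo.replacement_timeout -v 600, where <target-iqn> and <portal> are placeholders for your own target. The new value takes effect on the next login.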

At the dm-multipath layer you can set no_path_retry to do the same 
thing. If you set it to "queue", it will hold on to IO forever or until 
the user intervenes.
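
A minimal sketch of that setting in multipath.conf, assuming it is placed in the defaults section (it can also be set per device or per multipath; the path /etc/multipath.conf is the usual location but is an assumption here):

```
# /etc/multipath.conf (path assumed)
defaults {
	# Queue IO while no path is usable instead of failing it upward.
	# A number here would mean "retry this many times, then fail".
	no_path_retry	queue
}
```

With "queue", processes doing IO to the device will block until a path comes back, so this trades IO errors for hung IO during the outage.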

But like I said, if you get FS errors then you have to unmount and 
remount. If your question was about that, then there is nothing I can do.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---

Only in open-iscsi-2.0-869.2.tmo/kernel: Module.markers
Only in open-iscsi-2.0-869.2.tmo/kernel: modules.order
diff -aurp open-iscsi-2.0-869.2/kernel/scsi_transport_iscsi.c open-iscsi-2.0-869.2.tmo/kernel/scsi_transport_iscsi.c
--- open-iscsi-2.0-869.2/kernel/scsi_transport_iscsi.c	2008-05-08 19:53:48.0 -0500
+++ open-iscsi-2.0-869.2.tmo/kernel/scsi_transport_iscsi.c	2008-06-23 16:03:20.0 -0500
@@ -431,8 +431,10 @@ static void __iscsi_block_session(struct
 	session->state = ISCSI_SESSION_FAILED;
 	spin_unlock_irqrestore(&session->lock, flags);
 	scsi_target_block(&session->dev);
-	queue_delayed_work(iscsi_eh_timer_workq, &session->recovery_work,
-			   session->recovery_tmo * HZ);
+	if (session->recovery_tmo > 0)
+		queue_delayed_work(iscsi_eh_timer_workq,
+				   &session->recovery_work,
+				   session->recovery_tmo * HZ);
 }
 
 void iscsi_block_session(struct iscsi_cls_session *session)
@@ -1089,8 +1091,7 @@ iscsi_set_param(struct iscsi_transport *
 	switch (ev->u.set_param.param) {
 	case ISCSI_PARAM_SESS_RECOVERY_TMO:
 		sscanf(data, "%d", &value);
-		if (value != 0)
-			session->recovery_tmo = value;
+		session->recovery_tmo = value;
 		break;
 	default:
 		err = transport->set_param(conn, ev->u.set_param.param,
diff -aurp open-iscsi-2.0-869.2/usr/initiator.c open-iscsi-2.0-869.2.tmo/usr/initiator.c
--- open-iscsi-2.0-869.2/usr/initiator.c	2008-05-08 19:53:48.0 -0500
+++ open-iscsi-2.0-869.2.tmo/usr/initiator.c	2008-06-23 16:05:43.0 -0500
@@ -523,13 +523,6 @@ __session_create(node_rec_t *rec, struct
 	else
 		session->initiator_alias = dconfig->initiator_alias;
 
-	/* session's eh parameters */
-	session->replacement_timeout = rec->session.timeo.replacement_timeout;
-	if (session->replacement_timeout == 0) {
-		log_error("Cannot set replacement_timeout to zero. Setting "
-			  "120 seconds\n");
-		session->replacement_timeout = DEF_REPLACEMENT_TIMEO;
-	}
 	session->fast_abort = rec->session.iscsi.FastAbort;
 	session->abort_timeout = rec->session.err_timeo.abort_timeout;
 	session->lu_reset_timeout = rec->session.err_timeo.lu_reset_timeout;
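
To make the effect of the kernel hunk concrete, here is a small standalone sketch of how recovery_tmo is interpreted once the patch is applied. It is not code from open-iscsi: both helper names are hypothetical and only mirror the `recovery_tmo > 0` check in __iscsi_block_session().

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of open-iscsi): mirrors the patched check.
 * The recovery timer is only armed for a positive timeout; with 0 no timer
 * is queued, so blocked IO is never failed by the timer.
 */
static bool recovery_timer_needed(int recovery_tmo)
{
	return recovery_tmo > 0;
}

/*
 * Hypothetical helper: seconds until queued IO would be failed, or -1 when
 * no timer is armed and IO is held until the session reconnects.
 */
static long recovery_delay_seconds(int recovery_tmo)
{
	return recovery_timer_needed(recovery_tmo) ? (long)recovery_tmo : -1L;
}
```

For example, recovery_delay_seconds(120) gives 120, the stock default, while recovery_delay_seconds(0) gives -1, meaning the IO stays queued until the target comes back.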


RE: iSCSI and filesystem error recovery

2008-06-23 Thread Geoff Galitz


Thanks for the info.

One more question: I just looked through the iscsid.conf file and the README,
but I do not see a setting that explicitly enables dm-multipath.  How do I
enable it?

-geoff


Geoff Galitz
Blankenheim NRW, Deutschland
http://www.galitz.org

