Hi Mike:

I am working on a somewhat-old SUSE bug that you may remember,
since you commented on it, though it's been 18 months. For reference:

https://bugzilla.novell.com/show_bug.cgi?id=645616

This bug involves a user having a problem when running Veritas on top
of an open-iscsi volume. They say that in SLES 10 SP2, when one of
their two paths to the target failed, iSCSI took 120 seconds
(replacement_timeout) before failing, giving the back-end controllers
time to fail over. In SP4, they found that when one of their back-end
controllers went down, iSCSI failed over immediately.

It looks like SLES 10 SP4 uses open-iscsi-2.0.865 with a few patches.

I am using iscsitarget, and I am stopping my target to simulate this
condition, where the replacement_timeout should be triggered. Is this
the correct method?

I tried applying the patch from git hash
1f1641b2c92df43895367296785fe8e4e9f96273,
"iscsid: fix relogin retry handling", but it did not help.

When I trigger this condition (using "-d 8"), it seems to go into a
retry-forever loop, in state
"iscsid: login failed STATE_XPT_WAIT/R_STAGE_SESSION_REOPEN 257".
The only thing that changes is the retry count, which seems to increase
without bound. This explains why the patch I tried does not help, since
that patch does not modify the handling of this state.

I would try a newer open-iscsi, but I read on the mailing list about
possible retry problems with 2.0.870, so I thought I'd ask your opinion
before I try that.

I'd be glad to supply the log file, but it's awfully large ... Any ideas
you have would be most appreciated.
-- 
Lee Duncan

