Mike Christie wrote:
> Hannes Reinecke wrote:
>> On Tue, Sep 23, 2008 at 12:13:19PM -0500, Mike Christie wrote:
>>> Hannes Reinecke wrote:
>>>> Hi Doron,
>>>> Doron Shoham wrote:
>>>>> Doron Shoham wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Why does the init script on suse re-discovers all iscsi targets which
>>>>>> were set to automatic login?
>>>>>> To avoid deadlocks on the root fs there is patch which limits the number
>>>>>> of retries on first login.
>>>>>> When doing so, it sets back all the default parameters (overriding any
>>>>>> user definitions).
>>>>>> I think it should be like in redhat - just login to all the targets
>>>>>> which are automatic.
>>>>>>
>>>> That's what we tried initially. However, certain switches take quite a bit
>>>> of time for the Spanning-Tree Protocol to work out the route, during which
>>>> time any connect() attempt returns with -EHOSTUNREACH.
>>>> If we do an automatic login, the login request is sent from the kernel
>>>> directly. And any connect() failure from the kernel is taken as a terminal
>>>> error, hence the login fails.
>>> Are we talking about the same thing that keeps coming up :)
>>>
>> I know. Main reason here is that I didn't have time to investigate
>
> It is ok. I like repeating what I said in this mail more than fixing
> aic7xxx bugs, so as long as you fix that driver you can do anything here :)
>
>
>> this further, so I'll have to fall back to answer the same results
>> I had the last time ...
>>
>>> I swear someone from Voltaire asked this before. You gave the same reply.
>>> And then I said you can increase node.session.initial_login_retry_max
>>> so we retry the login for all cases (almost all not CHAP or target not
>>> there errors). If we get -EHOSTUNREACH we will retry up to
>>> node.session.initial_login_retry_max times (there is a 1 second delay
>>> between retries so it is a delay of node.session.initial_login_retry_max
>>> seconds). I then said that for -EHOSTUNREACH I can add a check so that we
>>> always test for this and always retry so the user does not have to set
>>> node.session.initial_login_retry_max but I was not sure if there was a case
>>> where we would not want to retry.
>>>
>> Problem is that there are valid cases for which we should _not_ retry an
>> -EHOSTUNREACH failure case. So I wouldn't retry for EHOSTUNREACH always.
>> But increasing the initial_login_retry_max value would really help here.
>> Hmm. Will have to check, but this seems like a viable route.
>>
>> Sorry for not being responsive, but I've been kept really busy recently.
>>
>
> No problem.
>
> I have been having our users try initial_login_retry_max = 60 and they
> have reported success. For iscsistart which red hat and fedora uses for
> the root session in the initramfs I just set it to 120.
>
> For the default let me up the default to something longer than 4.

Actually, this was bad. If we have to wait for the login_timeout to fire, then initial_login_retry_max = 4 was a nice round number, and the max time we had to wait was 1 minute. If I just increase it (I stupidly tried 45 first), the possible max default wait goes up to about 11 minutes :(

So what I did was make initial_login_retry_max the max number of initial iscsi login timeouts we can withstand, and then let other initial login failures retry for up to initial_login_retry_max * login_timeout seconds.
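
To make the arithmetic above concrete, here is a rough Python sketch of the two policies. It is only a model and not open-iscsi code: the function names are made up for illustration, and the 15 second login_timeout is an assumption inferred from the numbers in this mail (4 retries giving 1 minute, 45 giving about 11 minutes). The 1 second delay between retries is the one mentioned earlier in the thread.

```python
# Hypothetical model of the retry policy discussed above; not open-iscsi code.

LOGIN_TIMEOUT = 15  # seconds; assumed, consistent with "4 retries = 1 minute"
RETRY_DELAY = 1     # seconds between retries, as mentioned earlier in the thread


def worst_case_wait_old(retry_max, login_timeout=LOGIN_TIMEOUT):
    """Old behaviour: every attempt may block until login_timeout fires."""
    return retry_max * login_timeout


def simulate_new_policy(failures, retry_max, login_timeout=LOGIN_TIMEOUT):
    """Walk a list of failures and return (seconds_elapsed, gave_up).

    "timeout" -> the attempt blocked until login_timeout fired
    "fast"    -> the attempt failed right away (e.g. -EHOSTUNREACH) and we
                 slept RETRY_DELAY before the next try
    """
    budget = retry_max * login_timeout  # total time we are willing to retry
    elapsed = 0
    timeouts = 0
    for kind in failures:
        if kind == "timeout":
            timeouts += 1
            elapsed += login_timeout
            # retry_max caps the number of login timeouts we withstand
            if timeouts >= retry_max:
                return elapsed, True
        else:
            elapsed += RETRY_DELAY
            # other failures keep retrying until the time budget is used up
            if elapsed >= budget:
                return elapsed, True
    return elapsed, False


if __name__ == "__main__":
    print(worst_case_wait_old(4))    # seconds with the old default
    print(worst_case_wait_old(45))   # seconds if the count is simply raised
    print(simulate_new_policy(["fast"] * 30, 4))
```

In this model, 30 quick -EHOSTUNREACH failures while spanning tree settles use only 30 seconds of the 60 second budget, so the login is still being retried, while 4 real login timeouts still end the attempt after 1 minute as before.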