v42bis wrote:
>
> On Aug 20, 1:39 am, Mike Christie <[EMAIL PROTECTED]> wrote:
>> v42bis wrote:
>>> Thank for the reply, Mike.
>> No problem.
>>
>>> The iscsi connections failed about 1m13s after my iscsi target went
>>> down (timestamps that follow are synced from same ntp master, however
>>> clock skew may account for a few seconds difference [1m45sec seems
>>> very conspicuous - a multiplier of default 15sec timers?]). The target
>>> went down at Aug 19 13:33:33.
>> Actually this looks like a different problem. What version of open-iscsi
>> are you using? Do a "iscsiadm -P 3". The top part should dump the
>> iscsiadm version.
>
> `iscsiadm -P 3` just spits out the usage/help information - no
> version. I know it is version open-iscsi-2.0-865.15, though.

Ah, older versions had a private info argument for debugging. It later
became stable as -P. Try "iscsiadm -m --info".

>
>>> Aug 19 13:36:42 ak1-vz2 kernel: iscsi: scsi conn_destroy(): host_busy
>>> 0 host_failed 0
>> This means that userspace decided to kill the iscsi session/connection
>> which means that we ignore the recovery/replacement timeout and just
>> kill everything which forces IO errors. We only did this for fatal
>> errors, but we should not do that anymore.
>
> What userspace process would have done that?

iscsid, the iscsi userspace daemon that handles iscsi errors and does
the login/relogin and session/connection management.

>
>>> The above did not affect normal operation of my open-iscsi initiators.
>> That is weirder. In this setup do you have multiple
>> sessions/connections? When you checked the machine were all the
>> session/connections running? There should have been two sessions that
>> were destroyed.
>
> Only one session per connection. One connection to each iscsi target.
>
> All of the filesystems and iscsi connections seemed fine, as far as I
> could tell.
>
>> In older open-iscsi userspace tools there were certain errors the target
>> could send us and iscsid would consider it a fatal error and it would
>> kill the sessions like above. For example if a target was shutting down
>> it could tell us that it was not coming back, so we would kill the
>> session. There was also a case where iscsid got confused and thought it
>> was a fatal error and would kill the session. We now just retry forever
>> or until the user kills the session manually to avoid problems like this.
>
> To confirm: open-iscsi version 2.0-869.2 and above will never kill
> iscsi sessions unless the user explicitly tells iscsid to logout/kill

Right.

> the session? I want to make sure my open-iscsi initiators never return
> errors until replacement_timeout is reached.
> I'd rather have any
> processes accessing filesystems on iscsi hang forever than have the
> connections lost and journals aborted.
>
> Looking at the code, there is no problem with setting such a high
> replacement_timeout?

With the kernel time code or iscsi code that handles the timer? As a
quick test, try setting the timer to 10 days and set the nop times to
5 seconds. Unplug the cable and in about 10 seconds you will see the
ping timeout message. Then shortly after that (within minutes instead
of days) you should see the recovery/replacement timed out message.

>
>> Please tell me you were using a older version than open-iscsi-2.0-869.2
>> :) If you were using open-iscsi-2.0-869.2 then we have a different
>> problem :(
>
> I am definitely running 2.0-865.15. I will upgrade to 2.0-869.2.
>
> It would be *very* convenient if the Changelog would include changes
> in every version and not just the current release. :)
>

Will start that on the next release.
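
For reference, the values for the quick test above (10-day replacement
timeout, 5-second nop timers) can be worked out with a small sketch
like the one below. The setting names are the usual iscsid.conf keys,
which is an assumption here since the thread does not name them, and
"nop times" is read as both the nop-out interval and the nop-out
timeout. The helper render_conf is hypothetical and only formats text;
it does not touch a running initiator.

```python
# Sketch only: computes the timer values for the quick test and prints
# them as iscsid.conf-style lines. The key names below are assumed, not
# taken from the thread; check them against your open-iscsi version.
from datetime import timedelta

REPLACEMENT_TIMEOUT = int(timedelta(days=10).total_seconds())  # 10 days in seconds
NOOP_INTERVAL = 5  # seconds between nop-out pings
NOOP_TIMEOUT = 5   # seconds to wait for a ping reply

settings = {
    "node.session.timeo.replacement_timeout": REPLACEMENT_TIMEOUT,
    "node.conn[0].timeo.noop_out_interval": NOOP_INTERVAL,
    "node.conn[0].timeo.noop_out_timeout": NOOP_TIMEOUT,
}


def render_conf(values):
    """Format the settings as 'key = value' lines (hypothetical helper)."""
    return "\n".join(f"{key} = {value}" for key, value in values.items())


# Rough delay from unplugging the cable to the ping timeout message:
# up to one interval until the next ping, plus the timeout for its reply.
ping_timeout_after = NOOP_INTERVAL + NOOP_TIMEOUT

if __name__ == "__main__":
    print(render_conf(settings))
    print(f"# ping timeout expected after about {ping_timeout_after} seconds")
```

With these values the ping timeout should show up roughly 10 seconds
after the cable is pulled, which matches the estimate given above.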