On Feb 5, 3:00 pm, Mike Christie <[EMAIL PROTECTED]> wrote:
> agspoon wrote:
> > We currently do iSCSI boot for a number of clients, and we have
> > developed a kind of cargo-cult set of rules for how it has to be
> > done. I would like to validate some of these assumptions with you
> > knowledgeable folks.
> >
> > We use iscsistart within the initrd ramdisk to log in to the target
> > and create a session to the root file system. At this point, we are
> > under the impression that we must soon thereafter start the user
> > space daemon (iscsid) in order to handle some of the iSCSI protocol
> > messages. Is this still true?
>
> Yes.

Ok, that's what we thought.

> > Looking at the recent source for iscsistart, we see that it now sets
> > the "noop_out_interval" and "noop_out_timeout" to 0 (as recommended in
> > the README file). It does not increase the "replacement_timeout"
> > value to a large time as recommended in the same section. Should we
> > still be doing this after we enter user space?
>
> The reason that nops are turned off by iscsistart is because iscsid is
> not up to handle them. It is not because of the reason mentioned in the
> README. For the noop and replacement_timeout settings the user should
> set them, because the iscsi tools do not know if you are using multipath
> or not.

Ok, this makes sense.

> > Speaking of user space. Since our root file system is fresh at each
> > boot (read-only root w/ unionfs), the DB files that would normally be
> > created during discovery do not exist. We have found that if we just
> > blindly start the daemon (iscsid) without doing a discovery first, the
> > session will die after a period of time. So what we do is run a
> > discovery against the current session to populate the DB, fix up the
> > "timeout" values, and then start the daemon. Is this the right thing
> > to do? Anybody know why things lock up if we don't do this?
>
> It should not lock up. What do you mean by the session dying after a
> period of time?

If we just start the daemon (after setting
/etc/iscsi/initiatorname.iscsi) but with an empty /var/lib/iscsi DB,
the session appears to hang after a period (1-5 minutes), and any
access to uncached blocks in the file system will block forever.

> When iscsid starts up, it will log out the current session and restart
> it with the values iscsid wants to use (from the db or hard coded
> defaults if the db is not found). Nothing should freeze or die unless we
> cannot relogin, but whether or not you have a db record for the session
> should not affect that.
>
> What version of open-iscsi are you using?

% iscsiadm -V
iscsiadm version 2.0-695 (w/ 2.6.18 kernel)

We are now moving to version 2.0-865 (w/ 2.6.21), and wanted to
validate our assumptions. Based on what you described, I suspect that
in our current case the re-login is not occurring when the daemon is
restarted. I'll do some more experimenting with the new version to see
if we have the same hanging behavior when not doing a discovery.

> > I notice a recent change in git that prevents discovery for existing
>
> That is only when using the iscsi_discovery script.
>
> > sessions. Would this change prevent us from reaching a stable root
> > file system login?
>
> It will if you are using the discovery script. If you just run iscsiadm
> it will not affect you. Or if you just create the node record by hand by
> doing
>     iscsiadm -m node -T target -p ip:port,tpgt -o new
> you would be fine.

Cool, we'll check this out.

One other question we have is how session parameters are changed. If
we issue the following command while the daemon is running (I think it
has to be running),

iscsiadm -m node --op=update myip:port,tpgt -n 'node.session.timeo.replacement_timeout' -v 86400

does this change take effect on the current session, or do we have to
restart the daemon?

> > One more observation. We are using a SANRAD target, and it uses
> > "65535" for the "Target Portal Group Tag". The open-iscsi source uses
> > "-1" (the 16bit signed conversion) to stand for
> > "PORTAL_GROUP_TAG_UNKNOWN", and causes us no end of trouble. It
> > requires us to patch iscsistart, and screws up command line parameters
> > and the like. It appears that there is a problem here with sign
> > conversion.
>
> Are you guys passing in the tpgt to iscsistart and still seeing -1 being
> used? You can pass the tpgt in:
>
> iscsistart ...... -g 65535 .....

We'll give this a shot, and let you know if we are still having
problems doing it this way.

Thanks,
Craig