Mike Christie wrote:
>> The machine has its root filesystem accessible via iSCSI (via fast LAN,
>> to a different target) which can somehow contribute to the problem? It
>> runs a 2.6.22 kernel.
>> Some bad interaction if the initiator is connected to two targets with
>> different IPs, and connection to one target is very slow?
>>
>
> There should not be. Each session/connection to the target is going to
> get its own threads for sending IO. The receiving is done in the network
> softirq and cannot sleep or dominate the use.
>
> Did you set the queue limit lower too? If so did you do it globally (set
> it in iscsid.conf and discover the targets) or did you run it for a
> specific session (run iscsiadm -m node -T target -p ip:port -o update
> -n ......)? Maybe if you did it globally the lower queue depth is
> slowing the IO execution and affecting the apps. This is probably not
> the case though. I only know of things like a big database not liking its
> IO being slowed down, and I do not think other apps would notice the slow
> down as long as IO completes.
>
> Or were there any iscsi or IO messages in the logs?

I set the limit per host.

In fact, now I only changed the timeout in /sys (the one settable via the
udev rule you gave) to 720, with:

node.session.timeo.replacement_timeout = 1000000

and other settings being default. However, since there are no
disconnections involved, it shouldn't make any difference, am I correct?

So with the timeout set to 720 in /sys/... and the replacement timeout set
very high, I still see this problem.

One thing I'm not sure I mentioned clearly: rootfs is on iSCSI as well, and
is mounted in the initrd via iscsistart. Does it make a difference here? Is
this connection still "managed" by iscsid, even though iscsid starts later
in the boot process?

Some more details:

- rootfs is connected to the target using 100 Mbit LAN

- another target/IP is connected via openvpn with the "--shaper 50000"
  option to limit the bandwidth to about 50 kB/s

- I do lots of writes and reads to the ext3 filesystem from the target with
  limited bandwidth

- in the meantime, I ping the target using both its VPN IP address and its
  real IP address

- after 20-30 minutes the ping to the VPN IP says "sendmsg: No buffer space
  available"; it can't be interrupted with ctrl+c (it is in D+ state, as ps
  indicates)

- the other ping is still pinging

- all I/O activity to the target behind the VPN is stopped/frozen

- I see occasional I/O activity to the rootfs target (connected via LAN)

- the last syslog entry is:

  iscsid: Nop-out timedout after 15 seconds on connection 2:0 state (3).
  Dropping session.

- the session is not re-established, as the VPN tunnel does not work for
  some reason

- any command from rootfs which is cached will start, for example
  "find /sys" or "ps x" will work (I started find /sys and ps earlier)

- any command from rootfs which is *not* cached will not start, i.e.
  "md5sum -h" will freeze

- kjournald is in D state (probably trying to access the inaccessible
  device), which is interesting and may be the reason why everything seems
  to freeze?
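
For anyone trying to reproduce this, the D-state observations above could be
collected with something like the following. It is only a minimal sketch,
assuming a Linux /proc filesystem; the function names parse_stat_line and
blocked_tasks are made up for illustration and are not part of open-iscsi.
It would have to be started before the hang (for example, called in a loop),
since an uncached interpreter would itself block on the rootfs.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar].strip())
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for tasks in uninterruptible sleep (state 'D')."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling blocked_tasks() returns a sorted list of (pid, name) pairs; in the
situation described above I would expect it to include kjournald and the
stuck ping.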

-- 
Tomasz Chmielewski
http://wpkg.org