On Mon, 2 Nov 2009, Santi Saez wrote:

> Randomly we get Open-iSCSI "conn errors" when connecting to an
> Infortrend A16E-G2130-4 storage array. We had discussed about this
> earlier in the list, see:
> Nov 2 18:34:02 vz-17 kernel: ping timeout of 5 secs expired, last rx
> 408250499, last ping 408249467, now 408254467
> Nov 2 18:34:02 vz-17 kernel: connection1:0: iscsi: detected conn error
> (1011)
> Nov 2 18:34:03 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error
> (1011) state (3)
> Nov 2 18:34:07 vz-17 iscsid: connection1:0 is operational after recovery (1
> attempts)
> Nov 2 18:34:52 vz-17 kernel: ping timeout of 5 secs expired, last rx

That looks vaguely familiar, although I think mine was a nop-out timeout (it might be reported in another log file). Does it mostly happen when you do long sequential reads from the Infortrend unit?

In my case it turned out to be a very low level of packet drops caused by a Cisco 2960G when 'mls qos' was enabled (which, due to an IOS bug, didn't increment the drop counter). I'm not sure whether the loss with 'mls qos' enabled is by design as part of WRED, or a side effect of the port buffers being divided up into pieces smaller than optimal.

Having TCP window scaling enabled made the problem an order of magnitude worse. Try disabling it and see whether you still have the same problem. I'd suggest something like

    dd if=/dev/sdc of=/dev/null bs=1048576 count=10

to see if that triggers it, assuming it is the same problem I was suffering from.

Every other iSCSI target I've tried recovered pretty gracefully from this, but not the Infortrend. I suspect their TCP retransmit algorithm needs a lot of love, and that it's pathologically broken when window scaling is enabled.

Sadly, when I opened a ticket with Infortrend, enclosing tcpdumps and analysis, they were no more useful than to let me know that they don't support Debian (despite having instructions for Debian iSCSI on their website), and that they don't support Western Digital drives in redundant controller configurations (it would be nice of them to have this somewhere public).

Hope that might be somewhat useful to you. Please let me / the list know how you get on.
There was sadly little information on this when I was tearing my hair out about it.

Best wishes
James

--
James Rice
Jump Networks Ltd
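
For anyone reading the quoted "ping timeout" line later, here is a minimal Python sketch of how its three counters can be turned into elapsed times. It assumes the counters are kernel jiffies and that the kernel ticks at HZ=1000; the log itself does not state either, although the 5000-tick gap between "last ping" and "now" is consistent with the 5 second timeout. The function name and the returned keys are made up for illustration, and jiffies wrap-around is ignored.

```python
import re

# Assumption: the counters are jiffies and the kernel runs at HZ=1000.
# Adjust HZ if your kernel is configured differently.
HZ = 1000

PING_TIMEOUT_RE = re.compile(
    r"ping timeout of (?P<timeout>\d+) secs expired, "
    r"last rx (?P<last_rx>\d+), last ping (?P<last_ping>\d+), now (?P<now>\d+)"
)


def ping_timeout_gaps(line, hz=HZ):
    """Parse one unwrapped 'ping timeout' log line.

    Returns a dict with the configured timeout and the seconds elapsed
    since the last ping and since the last receive, or None if the line
    is not a complete ping-timeout message.
    """
    match = PING_TIMEOUT_RE.search(line)
    if match is None:
        return None
    now = int(match.group("now"))
    return {
        "timeout_s": int(match.group("timeout")),
        # Ticks since the last ping was sent, converted to seconds.
        "since_ping_s": (now - int(match.group("last_ping"))) / hz,
        # Ticks since anything was last received on the connection.
        "since_rx_s": (now - int(match.group("last_rx"))) / hz,
    }
```

Fed the first log line from the quoted report (joined back onto one line), this gives 5.0 seconds since the last ping and 3.968 seconds since the last receive. Running it over a whole log may help show whether the gaps cluster around long sequential reads.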