On 05/21/2010 03:43 PM, Taylor wrote:
We have a SLES 11 server connected to an Equallogic 10 Gig disk array.

Initially everything seemed to work just fine.  When doing some IO
testing against the mounted volumes, in which we cause very high IO
loads, i.e. 95 to 100% as reported by iostat, we started seeing the
following messages in /var/log/messages:

May 21 14:26:20 hostA kernel:  connection27:0: ping timeout of 5 secs
expired, last rx 4426024362, last ping 4426025612, now 4426026862
May 21 14:26:20 hostA kernel:  connection27:0: detected conn error
(1011)
May 21 14:26:21 hostA iscsid: Kernel reported iSCSI connection 27:0
error (1011) state (3)
May 21 14:27:03 hostA kernel:  connection30:0: detected conn error
(1011)
May 21 14:27:26 hostA iscsid: Target requests logout within 3 seconds
for connection
May 21 14:27:26 hostA iscsid: Target dropping connection 0, reconnect
min 2 max 0
May 21 14:27:26 hostA iscsid: Kernel reported iSCSI connection 30:0
error (1011) state (4)
May 21 14:27:38 hostA kernel:  connection25:0: detected conn error
(1011)
May 21 14:27:39 hostA iscsid: Kernel reported iSCSI connection 21:0
error (1011) state (3)
May 21 14:27:39 hostA iscsid: Kernel reported iSCSI connection 25:0
error (1011) state (3)
May 21 14:28:16 hostA iscsid: connection27:0 is operational after
recovery (3 attempts)
May 21 14:28:20 hostA iscsid: connection21:0 is operational after
recovery (2 attempts)


Is there more to the log? I want to see if the target requests a logout first or if we get a ping timeout first.
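
A minimal sketch of how that ordering could be checked against the full /var/log/messages, assuming unwrapped classic syslog lines and the message wording seen in the excerpt above. The function names and the sample list are made up for illustration; this is not part of open-iscsi.

```python
import re

# Message fragments copied from the log excerpt above; other kernel or
# iscsid versions may word them differently (assumption).
PATTERNS = {
    "ping_timeout": re.compile(r"ping timeout of \d+ secs expired"),
    "target_logout": re.compile(r"Target requests logout"),
    "conn_error": re.compile(r"detected conn error"),
}

def classify_events(lines):
    """Return (timestamp, kind) for each matching line, in file order."""
    events = []
    for line in lines:
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                # Classic syslog lines start with "Mon DD HH:MM:SS" (15 chars).
                events.append((line[:15], kind))
                break
    return events

def first_trigger(events):
    """Report whether a ping timeout or a target logout request came first."""
    for _, kind in events:
        if kind in ("ping_timeout", "target_logout"):
            return kind
    return None

# Hypothetical sample built from the lines quoted above, unwrapped.
sample = [
    "May 21 14:26:20 hostA kernel:  connection27:0: ping timeout of 5 secs expired, last rx 4426024362, last ping 4426025612, now 4426026862",
    "May 21 14:26:20 hostA kernel:  connection27:0: detected conn error (1011)",
    "May 21 14:27:26 hostA iscsid: Target requests logout within 3 seconds for connection",
]
events = classify_events(sample)
print(first_trigger(events))
```

On the real log you would pass the open file instead of the sample list. The result only reflects whatever part of the log is fed in, which is why the lines before 14:26:20 matter.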


So far we've turned on flow control on the network switches, tried
adjusting multipath.conf, turned off offload parameters on the iSCSI
NICs, and adjusted iscsid.conf timeouts.

For the ping timeouts you can set node.conn[0].timeo.noop_out_interval and node.conn[0].timeo.noop_out_timeout to 0, which turns the nop-out pings off. If you are using dm-multipath, though, you might want to leave them on, but maybe make them a little longer.
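
As a rough sketch of applying those two settings to an existing node record, the helper below builds the iscsiadm update commands without running them. The target IQN and portal address are placeholders, not values from this thread, and the helper name is hypothetical. As far as I know, a changed node record only takes effect the next time the session logs in.

```python
def noop_update_commands(target, portal, interval=0, timeout=0):
    """Build iscsiadm argument lists that update the nop-out settings.

    interval/timeout of 0 turns the pings off; with dm-multipath you
    would pass larger non-zero values instead.
    """
    settings = {
        "node.conn[0].timeo.noop_out_interval": interval,
        "node.conn[0].timeo.noop_out_timeout": timeout,
    }
    return [
        ["iscsiadm", "-m", "node", "-T", target, "-p", portal,
         "-o", "update", "-n", name, "-v", str(value)]
        for name, value in settings.items()
    ]

# Placeholder target and portal (assumptions, replace with your own).
commands = noop_update_commands(
    "iqn.2001-05.com.equallogic:example-volume", "192.0.2.10:3260"
)
for cmd in commands:
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before pasting them into a root shell or handing each list to subprocess.run.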

Also, on the target side you can turn off their load balancing, which would remove the disruptions caused by the target logout requests, but that of course messes with load balancing. If you are using dm-multipath you probably do not need the target load balancing on, though (I am not 100% sure what EqualLogic recommends, but it seems like each side's algorithms could end up working against each other).



We are running a 2.6.27 kernel with open-iscsi 2.0.870-26.5.


Is that a SLES 2.6.27 kernel or a kernel.org one? If it is a SLES kernel, make sure you have the newest one, because SUSE has added fixes for the case where we thought a nop/ping timed out but really it was stuck behind a large transfer, and that transfer was executing OK. If you are using a 2.6.27 kernel.org kernel, I would upgrade to the newest upstream one.
