Hi, here is the output of /var/log/messages : Nov 21 10:32:21 bdd4a kernel: session1: iscsi_free_task freeing task itt 0xb state 1 sc 0000000000000000 Nov 21 10:32:21 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Setting next tmo 4790025113 Nov 21 10:32:25 bdd4a kernel: connection2:0: iscsi_check_transport_timeouts Sending nopout as ping Nov 21 10:32:25 bdd4a kernel: connection2:0: iscsi_check_transport_timeouts Setting next tmo 4790029108 Nov 21 10:32:25 bdd4a kernel: session2: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x3000001d datalen 0] Nov 21 10:32:26 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Sending nopout as ping Nov 21 10:32:26 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Setting next tmo 4790030113 Nov 21 10:32:26 bdd4a kernel: session1: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x10000009 datalen 0] Nov 21 10:32:26 bdd4a kernel: session1: __iscsi2_complete_pdu [op 0x20 cid 0 itt 0x9 len 0] Nov 21 10:32:26 bdd4a kernel: session1: iscsi_complete_task complete task itt 0x9 state 3 sc 0000000000000000 Nov 21 10:32:26 bdd4a kernel: session1: iscsi_free_task freeing task itt 0x9 state 1 sc 0000000000000000 Nov 21 10:32:26 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Setting next tmo 4790030113 Nov 21 10:32:30 bdd4a kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4790019108, last ping 4790024108, now 4790029108 Nov 21 10:32:30 bdd4a kernel: connection2:0: detected conn error (1011) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queuecommand iscsi: cmd 0x28 is not queued (7) Nov 21 10:32:30 bdd4a last message repeated 2287 times Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queueued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queuecomued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queueued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_qued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_qued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queuued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queued (7) Nov 21 10:32:30 bdd4a kernel: session2: iscsi2_queuued (7) ............. And so on ............. Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuecommued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_ued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuecued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsiued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_quued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuecon2: iscsi2_queuecommand iscsi: cmd 0x28 is not queued (7) Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuecommand iscsi: cmd 0x28 is not queued (7) Nov 21 10:32:31 bdd4a last message repeated 7702 times Nov 21 10:32:31 bdd4a kernel: session2: iscsi_start_session_recovery blocking session Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuecommand iscsi: cmd 0x28 is not queued (7) Nov 21 10:32:31 bdd4a kernel: session2: fail_mgmt_tasks failing mgmt itt 0x1d state 3 Nov 21 10:32:31 bdd4a kernel: session2: iscsi_complete_task complete task itt 0x1d state 3 sc 0000000000000000 Nov 21 10:32:31 bdd4a kernel: session2: iscsi_free_task freeing task itt 0x1d state 5 sc 0000000000000000 Nov 21 10:32:31 bdd4a kernel: session2: iscsi2_queuecommand iscsi: cmd 0x28 is not queued (786432) Nov 21 10:32:31 bdd4a iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3) Nov 21 10:32:31 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Sending nopout as ping Nov 21 10:32:31 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Setting next tmo 4790035113 Nov 21 10:32:31 bdd4a kernel: session1: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x1000000c datalen 0] Nov 21 10:32:31 bdd4a kernel: session1: __iscsi2_complete_pdu [op 0x20 cid 0 itt 0xc len 0] Nov 21 10:32:31 bdd4a kernel: session1: iscsi_complete_task complete task itt 0xc state 3 sc 0000000000000000 Nov 21 10:32:31 bdd4a kernel: session1: iscsi_free_task freeing task itt 0xc state 1 sc 0000000000000000 Nov 21 10:32:31 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Setting next tmo 4790035113 Nov 21 10:32:32 bdd4a kernel: session1: iscsi_prep_scsi_cmd_pdu iscsi prep [read cid 0 sc ffff8105860860c0 cdb 0x28 itt 0xd len 4096 bidi_len 0 cmdsn 50514 win 256] Nov 21 10:32:32 bdd4a kernel: session1: __iscsi2_complete_pdu [op 0x25 cid 0 itt 0xd len 0] Nov 21 10:32:32 bdd4a kernel: session1: iscsi_data_in_rsp data in with status done [sc ffff8105860860c0 res 0 itt 0xd] Nov 21 10:32:32 bdd4a kernel: session1: iscsi_complete_task complete task itt 0xd state 3 sc ffff8105860860c0 Nov 21 10:32:32 bdd4a kernel: session1: iscsi_free_task freeing task itt 0xd state 1 sc ffff8105860860c0 Nov 21 10:32:32 bdd4a kernel: session1: iscsi_prep_scsi_cmd_pdu iscsi prep [read cid 0 sc ffff81057d7113c0 cdb 0xc0 itt 0xe len 256 bidi_len 0 cmdsn 50515 win 256] Nov 21 10:32:32 bdd4a kernel: session1: __iscsi2_complete_pdu [op 0x21 cid 0 itt 0xe len 0] Nov 21 10:32:32 bdd4a kernel: session1: iscsi_scsi_cmd_rsp cmd rsp done [sc ffff81057d7113c0 res 0 itt 0xe] Nov 21 10:32:32 bdd4a kernel: session1: iscsi_complete_task complete task itt 0xe state 3 sc ffff81057d7113c0 Nov 21 10:32:32 bdd4a kernel: session1: iscsi_free_task freeing task itt 0xe state 1 sc ffff81057d7113c0 Nov 21 10:32:36 bdd4a kernel: session2: session recovery timed out after 5 secs Nov 21 10:32:36 bdd4a kernel: session2: iscsi2_queuecommand iscsi: cmd 0x28 is not queued (983040) Nov 21 10:32:36 bdd4a kernel: sd 3:0:0:0: SCSI error: return code = 0x000f0000 Nov 21 10:32:36 bdd4a kernel: end_request: I/O error, dev sdb, sector 0 Nov 21 10:32:36 bdd4a kernel: connection1:0: iscsi_check_transport_timeouts Setting next tmo 4790036132 Nov 21 10:32:36 bdd4a kernel: device-mapper: multipath: Failing path 8:16. Nov 21 10:32:36 bdd4a multipathd: sdb: directio checker reports path is down Nov 21 10:32:36 bdd4a multipathd: checker failed path 8:16 in map quorum Nov 21 10:32:36 bdd4a multipathd: quorum: remaining active paths: 1 Nov 21 10:32:36 bdd4a multipathd: dm-8: add map (uevent) Nov 21 10:32:36 bdd4a multipathd: dm-8: devmap already registered
On Nov 18, 9:06 pm, Mike Christie <micha...@cs.wisc.edu> wrote: > On 11/16/2011 09:58 AM, Amaury FRANCOIS wrote: > > > Hi, > > > I've configured iscsi on RHEL 5.7 (last kernel) with a netapp disk > > array and I'm seeing connection problems every 20 minutes (the > > connection is dropped and comes back a few minutes or seconds later). > > > Here are the messages in /var/log/messages : > > > Nov 6 04:29:01 bdd4a kernel: connection1:0: ping timeout of 5 secs > > expired, recv timeout 5, last rx 4356175664, last ping 4356180664, now > > 4356185664 > > This means that we did not see any data/pdus from the target for 5 > seconds (node.conn[0].timeo.noop_out_interval), so we sent a iscsi ping > to make sure the target was still up. The target did not respond to the > iscsi ping in 5 seconds (node.conn[0].timeo.noop_out_timeout), so we > assumed there was a connection problem. The initiator will then drop the > tcp/ip connection and try to reconnect and relogin and restart IO. > > > Nov 6 04:29:01 bdd4a kernel: connection1:0: detected conn error > > (1011) > > Nov 6 04:29:02 bdd4a iscsid: Kernel reported iSCSI connection 1:0 > > error (1011) state (3) > > Nov 6 04:29:07 bdd4a kernel: session1: session recovery timed out > > after 5 secs > > We tried to relogin/reconnect for 5 seconds > (node.session.timeo.replacement_timeout) but could not, so the > initiator was instructed to fail IO. > > It then looks like we are able to log back in around 30 seconds later. > When this happens it normally is a temporary issue with the network. For > some reason we just cannot reconnect to the target for a long'ish (30 > secs) time. > > > > > I've also traced the network with tcpdump and analyzed it with > > wireshark : > > > ARP request about the server from Netapp > > ARP response from the server -> the problem always seem to happen > > after an ARP request from the netapp array > > RST sent from the server on the iscsi connection -> !!!!! > > > All NOP in, NOP out seemed to be OK just before, any idea why this > > problem happen ? > > So in the network trace you can see the NOP get sent and its response > before the tcp disconnection? If so then that might be a bug in the > iscsi layer not processing it right. From the log above it looks like we > did not get a response. > > Can you run the initiator in debug mode? It will lead to lots and lots > of log output. > > echo 1 > /sys/module/libiscsi2/parameters/debug_libiscsi > > And could you also send your net trace if you have it. -- You received this message because you are subscribed to the Google Groups "open-iscsi" group. To post to this group, send email to open-iscsi@googlegroups.com. To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.