Mike Christie wrote:
> Erez Zilber wrote:
>> Mike,
>>
>> I'm seeing some strange behavior that I don't completely understand:
>> during heavy IO, I disconnect the cable on the target machine and
>> after a few seconds, I'm trying to logout from that target on the
>> initiator side. Here's what I see:
>>
>> t = 0: iscsi_eh_cmd_timed_out gets called for all commands that timed
>> out. It resets the timer because we have a ping task &
>> time_before_eq(conn->last_recv + (conn->recv_timeout * HZ), jiffies)
>> is true.
>>
>> t = /sys/block/sdX/device/timeout: iscsi_eh_cmd_timed_out gets called
>> again. The session state is ISCSI_STATE_IN_RECOVERY. This will be
>> called again and again every /sys/block/sdX/device/timeout seconds.
>> The function returns and resets the timer.
>>
>> t = 11: I'm using iscsiadm to logout from the bad session
>>
>> t = 60 sec: __iscsi_block_session is called. I guess that the call
>> stack is something like: mgmt_ipc_session_logout ->
>> session_logout_task -> session_conn_shutdown -> iscsi_sw_tcp_conn_stop
>> -> iscsi_conn_stop -> iscsi_start_session_recovery ->
>> iscsi_block_session -> __iscsi_block_session. I'm not sure if this was
>> initiated by the logout that iscsiadm initiated 49 seconds ago.
>>
>> Now, I see that the commands that were in flight time out:
>>
>> Jul 30 18:26:46 be8 kernel: sd 2:0:0:0: timing out command, waited 42s
>> Jul 30 18:26:46 be8 kernel: sd 2:0:0:0: SCSI error: return code = 0x060e0000
>> Jul 30 18:26:46 be8 kernel: end_request: I/O error, dev sdc, sector 100663048
>> Jul 30 18:26:46 be8 kernel: sd 2:0:0:0: timing out command, waited 42s
>> Jul 30 18:26:46 be8 kernel: sd 2:0:0:0: SCSI error: return code = 0x06000000
>> Jul 30 18:26:46 be8 kernel: end_request: I/O error, dev sdc, sector 100662792
>> Jul 30 18:26:46 be8 kernel: device-mapper: multipath: Failing path 8:32.
>> Jul 30 18:26:46 be8 kernel: sd 2:0:0:0: timing out command, waited 42s
>> Jul 30 18:26:46 be8 kernel: sd 2:0:0:0: SCSI error: return code = 0x060e0000
>> Jul 30 18:26:46 be8 kernel: end_request: I/O error, dev sdc, sector 41943296
>>
>> t = 61: ctldev_handle calls iscsi_sched_conn_context with
>> EV_CONN_ERROR. I'm not sure where this came from.
>>
>> What I don't understand is why __iscsi_block_session was called only
>> after 60 seconds. Is it configurable? Anyway, here's the node info:
>>
> 
> block session is only called from iscsi_conn_stop.
> 
> iscsi_conn_stop is called when there is a connection error like 1011. 
> The kernel will throw the error, then send a msg to userspace. Userspace 
> will then call ep_disconnect to have the kernel clean up the kernel 
> structs for the ep.  Then when that has completed, iscsid will call 
> iscsi_conn_stop.
> 
> What could be delaying the block session is that the 
> iscsi_start_session_recovery -> iscsi_suspend_tx will wait for the xmit 
> thread to flush. Before we had a timeout of 15 seconds so we could be 
> waiting in there that long. I just modified that to 3 seconds in that 
> sndtmo patch (that is also in the linux-2.6-iscsi git tree), but what we 
> really want is a way to just signal that thread to wake right away.
> 

Oh yeah, what version of open-iscsi were you using? Could you try the 
871 release with that patch 
http://groups.google.com/group/open-iscsi/browse_thread/thread/0940a3217ea0c4ff

In the older releases, we could end up waiting for 3 sndtmo periods so 
that would add 45 secs on there.


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~----------~----~----~----~------~----~------~--~---

Reply via email to