Steven Hayter wrote:
> On 28/07/2009 06:14 pm, Mike Christie wrote:
>> On 07/28/2009 06:53 AM, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> when running my device-reset testcase I've come across this:
>>>
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_eh_device_reset LU Reset [sc ffff8800731e9480 lun 6]
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set timeout
>>> Jul 28 12:46:08 tyne kernel: session1: mgmtpdu [op 0x2 hdr->itt 0x69 datalen 0]
>>> Jul 28 12:46:08 tyne kernel: connection1:0: mgmtpdu [itt 0x69 task ffff88007b022800] xmit
>>> Jul 28 12:46:08 tyne kernel: connection1:0: tmf rsp [itt 0x69] response 0 state 1
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_suspend_tx suspend Tx
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc ffff88006fd20380 itt 0x54 state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc ffff88006fd20380 lun 6 itt x54] state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc ffff88007119b880 itt 0x5d state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc ffff88007119b880 lun 6 itt x5d] state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc ffff88007116ec80 itt 0x60 state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc ffff88007116ec80 lun 6 itt x60] state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc ffff880079dd8180 itt 0x61 state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc ffff880079dd8180 lun 6 itt x61] state 3
>>> Jul 28 12:46:08 tyne kernel: connection1:0: invalid itt 0x5d in R2T hdr
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_start_tx resume Tx
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_eh_device_reset dev reset result = SUCCESS
>>> Jul 28 12:46:08 tyne kernel: connection1:0: invalid itt 0x60 in R2T hdr
>>> Jul 28 12:46:08 tyne kernel: connection1:0: invalid itt 0x61 in R2T hdr
>>>
>>> As you can see, we're receiving R2Ts for tasks we've just aborted :-(
>>>
>>> Looking closely, I don't _actually_ think that we've received them
>>> out-of-order (which would be a violation of the RFC). The problem
>>> seems to be our skb handling (again):
>>>
>>> We're reading an skb, and call the handler function once the PDU is
>>> ready. However, we're _not_ checking if there is more data to be read
>>> from the socket.
>>> So it looks to me as if we're first reading the TMF response, aborting
>>> all tasks, and then continue reading PDUs for tasks which we just
>>> aborted.
>>
>> We will definitely do this. You mean the target sends a tmf response
>> that indicates it cleaned up some tasks, then it sends pdus for the
>> tasks that should have been affected by the TMF, right? If so I do not
>> think targets are allowed to do this. In 3.5.1.4 we have:
>>
>>    After the Task Management response indicates Task Management function
>>    completion, the initiator will not receive any additional responses
>>    from the affected tasks.
>>
>> "additional responses" means scsi response pdus and data-in with status,
>> right? Does it also mean R2Ts? I thought it did, so we will just drop
>> the session when getting all those pdus we thought the target should not
>> be sending.
>>
>> If "additional responses" does not mean R2Ts, then what are we supposed
>> to do? Handle them? Silently drop them? I could not find anything in the
>> RFC.
>>
>> The nasty problem with the code and this scenario is that we preallocate
>> the tasks and itts. Once iscsi_eh_device_reset returns SUCCESS and
>> cleans up the tasks, the scsi layer can start sending us commands. We
>> could then allocate a task/itt that was used before and should have been
>> cleaned up. The target could then send us pdus for the cleaned up
>> task/itt while we are using the task/itt for a new command. Then Kablewly.
>
> It does look confusing, I think RFC 5048, Section 4.1.2. "Clarified
> Multi-Task Abort Semantics", gives guidelines as to what should happen.
>
> Every way I read it, the target shouldn't be sending R2Ts for tasks which
> are part of the affected task set. (those equal or exceeding the CmdSN
> of the reset TMF). But I've been wrong in the past.
>

Never mind, found the reason. Totally different story, but we've been the
culprit nevertheless.

iscsi_xmit_task() runs in a loop, disregarding any TMF state. So we will
happily continue sending R2T transfers even though the LU Reset has
already finished.

Patch to follow.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
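
For readers following the thread, here is a rough user-space sketch of the failure mode described above: a transmit loop that keeps answering R2Ts has to re-check the TMF state on every pass. This is not the patch mentioned above and not libiscsi code. Every name in it (the sketch_* types and functions, the three TMF states) is a hypothetical stand-in chosen for illustration, and the real fix may well look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the libiscsi structures. */
enum sketch_tmf_state {
    SKETCH_TMF_IDLE,    /* no task management function outstanding */
    SKETCH_TMF_QUEUED,  /* LU Reset sent, response not yet seen */
    SKETCH_TMF_SUCCESS  /* target reported the function complete */
};

struct sketch_session {
    enum sketch_tmf_state tmf_state;
    unsigned int tmf_lun;   /* LUN the reset applies to */
};

struct sketch_task {
    unsigned int lun;
    unsigned int itt;
    int r2t_pending;        /* Data-Out bursts still owed to the target */
};

/* True when a reset covering this task's LUN is outstanding or finished. */
static bool sketch_task_hit_by_tmf(const struct sketch_session *s,
                                   const struct sketch_task *t)
{
    return s->tmf_state != SKETCH_TMF_IDLE && s->tmf_lun == t->lun;
}

/*
 * Answer pending R2Ts for one task and return how many bursts were sent.
 * The TMF check sits inside the loop, so the task stops transmitting as
 * soon as a reset for its LUN is in flight or has completed.
 */
static int sketch_xmit_task(const struct sketch_session *s,
                            struct sketch_task *t)
{
    int sent = 0;

    assert(s != NULL && t != NULL);

    while (t->r2t_pending > 0) {
        if (sketch_task_hit_by_tmf(s, t))
            break;          /* leave the rest to the reset cleanup */
        t->r2t_pending--;   /* stands in for putting one burst on the wire */
        sent++;
    }
    return sent;
}
```

With a reset on lun 6 recorded in the session, a task on lun 6 sends nothing further, while a task on another LUN drains its pending R2Ts as before.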