Mike Christie wrote:
> Mike Christie wrote:
>> Mike Christie wrote:
>>> Mike Christie wrote:
>>>> Hannes Reinecke wrote:
>>>>> Mike Christie wrote:
>>>>>>> The second patch is the more important one, as it
>>>>>>> fixes an error during LUN Reset handling in the
>>>>>>> initiator. When sending a LUN Reset during an
>>>>>>> ongoing R2T transfer, we're suspending Tx and
>>>>>>> aborting all _SCSI_ tasks. However, once we're
>>>>>>> done there we're resuming Tx and the R2T transfer
>>>>>>> will happily continue. So we should rather be
>>>>>> This should not be happening. When iscsi_suspend_tx returns the tx
>>>>>> thread has stopped so we know there are no users accessing the task
>>>>>> (well, there could be if a target is sending a tmf response then a r2t,
>>>>>> but if the target is following the rfc there should not be).
>>>>>>
>>>>>> So when fail_scsi_tasks calls
>>>>>>
>>>>>> fail_scsi_task ->iscsi_complete_task (this will cleanup conn->task if
>>>>>> this is the same task) -> __iscsi_put_task
>>>>>>
>>>>>> this should be the last put on the task and that should release it
>>>>>> calling iscsi_free_task which should call cleanup_task to kill any
>>>>>> pending r2t handling and it would remove it from the requeue list.
>>>>>>
>>>>>> If we are sending a data-out for a task that has had fail_scsi_task
>>>>>> ->iscsi_complete_task -> __iscsi_put_task called for it then we are in
>>>>>> bigger trouble because the last put should have been called on it and we
>>>>>> are accessing a bad task.
>>>>>>
>>>>> This is the log I'm getting:
>>>>>
>>>>>
>>>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_eh_device_reset LU Reset
>>>>> [sc ffff88007b94d080 lun 6]
>>>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set
>>>>> timeout
>>>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x3a lun 6 abort
>>>>> transfer
>>>>> Jul 29 10:34:48 tyne kernel: session1: mgmtpdu [op 0x2 hdr->itt 0x5d
>>>>> datalen 0]
>>>>> Jul 29 10:34:48 tyne kernel: connection1:0: mgmtpdu [itt 0x5d task
>>>>> ffff88007a01fc00] xmit
>>>>> Jul 29 10:34:48 tyne kernel: connection1:0: tmf rsp [itt 0x5d] response
>>>>> 0 state 1
>>>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x72 lun 6 abort
>>>>> transfer
>>>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_suspend_tx suspend Tx
>>>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_complete_task task itt 0x72
>>>>> sc ffff88007b5bc580 still active
>>>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x57 lun 6 abort
>>>>> transfer
>>>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x59 lun 6 abort
>>>>> transfer
>>>>> Jul 29 10:34:48 tyne kernel: session1: Tx suspended!
>>>>>
>>>>> So we're indeed would have continued the R2T task (itt 0x57 and itt 0x59)
>>>>> even though we've
>>>>> already received a valid TMF response.
>>>>> So I'm afraid it's us ...
>>>> Ah, I misunderstood you. I do not think it has to do with the cleanup
>>>> still leaving r2ts. I am not sure where you are putting printks, but I
>>>> think it is this:
>>>>
>>>>         while (!list_empty(&conn->requeue)) {
>>>>                 if (conn->session->fast_abort && conn->tmf_state !=
>>>>                     TMF_INITIAL)
>>>>                         break;
>>>>
>>>> Once the tmf completes, we will start sending data again.
>>>>
>>> Ooops. I am too sleepy. Ignore that. I am wrong there.
>>>
>> I guess if fast_abort is 0 though, we will hit this problem. And we will
>> send data-outs when getting tmf responses as well as when we are sending
>> the tmf.
>
>
>
> I think the problem is wording like in 10.5.1:
>
> For ABORT TASK SET and CLEAR TASK SET, the issuing initiator MUST
> continue to respond to all valid target transfer tags (received via
> R2T, Text Response, NOP-In, or SCSI Data-In PDUs) related to the
> affected task set, even after issuing the task management request.
>
> I think in some other doc (probably the one Mathew and Ulrich mentioned)
> there is wording about doing similar for abort and lu resets.
>
> The things is that I think half of targets want us to respond to r2ts
> and half do not. This is where the fast_abort comes from. If set then we
> reply to r2ts and if not set we do not. I think once we get a successful

Fudge. I am really going to bed now. I mean if fast_abort is set we do
not reply to r2ts. If it is not set then we reply.
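
To make the corrected rule concrete, here is a minimal userspace sketch of
the check quoted earlier in the thread. It is not the libiscsi code: the
toy_* names, the struct, and the two-value enum are hypothetical stand-ins
for illustration only. Only the condition itself (fast_abort set and
tmf_state not TMF_INITIAL means stop servicing the requeue list) is taken
from the quoted snippet.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, NOT the real libiscsi definitions. */
enum toy_tmf_state {
	TOY_TMF_INITIAL,	/* no task management function in flight */
	TOY_TMF_QUEUED		/* a TMF (e.g. LU Reset) has been issued */
};

struct toy_session {
	bool fast_abort;
	enum toy_tmf_state tmf_state;
};

/*
 * Mirrors the quoted requeue-loop condition: we stop answering R2Ts
 * with Data-Out only when fast_abort is set AND a TMF is in flight.
 * With fast_abort clear we keep answering R2Ts during the TMF.
 */
static bool toy_should_answer_r2t(const struct toy_session *s)
{
	if (s->fast_abort && s->tmf_state != TOY_TMF_INITIAL)
		return false;
	return true;
}
```

So with fast_abort=0 and an LU Reset outstanding, toy_should_answer_r2t()
returns true, which would match the data-outs showing up around the tmf in
the log above.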