Mike Christie wrote:
> Mike Christie wrote:
>> Hannes Reinecke wrote:
>>> Mike Christie wrote:
>>>>> The second patch is the more important one, as it
>>>>> fixes an error during LUN Reset handling in the
>>>>> initiator. When sending a LUN Reset during an
>>>>> ongoing R2T transfer, we're suspending Tx and
>>>>> aborting all _SCSI_ tasks. However, once we're
>>>>> done there we're resuming Tx and the R2T transfer
>>>>> will happily continue. So we should rather be
>>>> This should not be happening. When iscsi_suspend_tx returns the tx
>>>> thread has stopped so we know there are no users accessing the task
>>>> (well, there could be if a target is sending a tmf response then a r2t,
>>>> but if the target is following the rfc there should not be).
>>>>
>>>> So when fail_scsi_tasks calls
>>>>
>>>> fail_scsi_task ->iscsi_complete_task (this will cleanup conn->task if
>>>> this is the same task) -> __iscsi_put_task
>>>>
>>>> this should be the last put on the task and that should release it
>>>> calling iscsi_free_task which should call cleanup_task to kill any
>>>> pending r2t handling and it would remove it from the requeue list.
>>>>
>>>> If we are sending a data-out for a task that has had fail_scsi_task
>>>> ->iscsi_complete_task -> __iscsi_put_task called for it then we are in
>>>> bigger trouble because the last put should have been called on it and we
>>>> are accessing a bad task.
>>>>
>>> This is the log I'm getting:
>>>
>>>
>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_eh_device_reset LU Reset [sc ffff88007b94d080 lun 6]
>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set timeout
>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x3a lun 6 abort transfer
>>> Jul 29 10:34:48 tyne kernel: session1: mgmtpdu [op 0x2 hdr->itt 0x5d datalen 0]
>>> Jul 29 10:34:48 tyne kernel: connection1:0: mgmtpdu [itt 0x5d task ffff88007a01fc00] xmit
>>> Jul 29 10:34:48 tyne kernel: connection1:0: tmf rsp [itt 0x5d] response 0 state 1
>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x72 lun 6 abort transfer
>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_suspend_tx suspend Tx
>>> Jul 29 10:34:48 tyne kernel: session1: iscsi_complete_task task itt 0x72 sc ffff88007b5bc580 still active
>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x57 lun 6 abort transfer
>>> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x59 lun 6 abort transfer
>>> Jul 29 10:34:48 tyne kernel: session1: Tx suspended!
>>>
>>> So we're indeed would have continued the R2T task (itt 0x57 and itt 0x59)
>>> even though we've
>>> already received a valid TMF response.
>>> So I'm afraid it's us ...
>> Ah, I misunderstood you. I do not think it has to do with the cleanup
>> still leaving r2ts. I am not sure where you are putting printks, but I
>> think it is this:
>>
>> while (!list_empty(&conn->requeue)) {
>>         if (conn->session->fast_abort && conn->tmf_state != TMF_INITIAL)
>>                 break;
>>
>> Once the tmf completes, we will start sending data again.
>>
>
> Ooops. I am too sleepy. Ignore that. I am wrong there.
>
I guess if fast_abort is 0 though, we will hit this problem. And we will
send data-outs when getting tmf responses as well as when we are sending
the tmf.
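
To make the fast_abort case concrete, here is a rough userspace model of
the requeue check quoted above. It is only a sketch and not the libiscsi
code: struct model_conn, model_drain_requeue() and MODEL_TMF_IN_PROGRESS
are made-up names, the requeue list is reduced to a counter, and only
fast_abort, tmf_state and TMF_INITIAL come from the snippet in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative states. Only TMF_INITIAL is named in the thread; the
 * second value is a hypothetical stand-in for "a TMF is outstanding". */
enum model_tmf_state { TMF_INITIAL, MODEL_TMF_IN_PROGRESS };

/* Hypothetical, stripped-down connection state for this model. */
struct model_conn {
	bool fast_abort;                /* models conn->session->fast_abort */
	enum model_tmf_state tmf_state; /* models conn->tmf_state */
	size_t requeue_len;             /* stand-in for the conn->requeue list */
};

/* Walk the modelled requeue list using the same condition as the quoted
 * loop and return how many tasks would have had data-outs sent. */
static size_t model_drain_requeue(struct model_conn *conn)
{
	size_t sent = 0;

	while (conn->requeue_len > 0) {
		/* Same test as in the quoted snippet: only stop when
		 * fast_abort is set and a TMF is in flight. */
		if (conn->fast_abort && conn->tmf_state != TMF_INITIAL)
			break;
		conn->requeue_len--;
		sent++;
	}
	return sent;
}
```

In this model, with fast_abort set and a TMF outstanding the loop stops
before sending anything. With fast_abort 0 nothing stops it, so every
requeued task still gets its data-out even though a TMF is in flight,
which is the case I am worried about.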