Mike Christie wrote:
> On 08/03/2009 04:13 AM, Hannes Reinecke wrote:
>> Mike Christie wrote:
>>> Mike Christie wrote:
>>>> On 07/31/2009 09:53 AM, Hannes Reinecke wrote:
>>>>> Hmm. I must admit I'm slightly at a loss here.
>>>>> I do have this function:
>>>> Did you try my second patch queue-some-io-during-tmfs2.patch
>>>> http://groups.google.com/group/open-iscsi/attach/82cbb543dd913960/queue-some-io-during-tmfs2.patch?part=2
>>>>
>>>>
>>> Here is a update patch for the complicated route. It should also work
>>> for bnx2i.
>>>
>> Hmm. I guess that's not quite correct:
>>
>> +    switch (ISCSI_TM_FUNC_VALUE(tmf)) {
>> +    case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
>> +        /*
>> +         * all scsi cmd pdus and, if fast_abort is set, data-outs
>> +         * in response to a r2t will fail to be sent here
>> +         *
>> +         * If we are sending a initial r2t with a scsi cmd pdu
>> +         * we do not hit this test, but the tmf will be sent
>> +         * after the scsi cmd pdu and data-out for the initial r2t.
>> +         */
>> +        hdr_lun = scsilun_to_int((struct scsi_lun *)tmf->lun);
>> +        if (hdr_lun == task->sc->device->lun) {
>> +            ISCSI_DBG_SESSION(conn->session,
>> +                      "Preventing task %x/%x from being "
>> +                      "sent due to lu reset in progress on "
>> +                      "lun %u\n", task->itt, task->hdr_itt,
>> +                      hdr_lun);
>> +            return -EACCES;
>> +        }
>> +        break;
>> +
>> The comment states it should abort everything but scsi data-out PDUs,
>> but in fact it doesn't do any checking on them. Plus it relies on the
>> queue handling, which is not totally transparent to the unsuspecting
>> reader (ie me).
>> So I'd prefer to move the check for fast_abort in there, then we
>> can make it explicit like this:
>>
>>     case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
>>         /*
>>          * all scsi cmd pdus and, if fast_abort is set, data-outs
>>          * in response to a r2t will fail to be sent here
>>          *
>>          * If we are sending a initial r2t with a scsi cmd pdu
>>          * we do not hit this test, but the tmf will be sent
>>          * after the scsi cmd pdu and data-out for the initial r2t.
>>          */
>>         hdr_lun = scsilun_to_int((struct scsi_lun *)tmf->lun);
>>         if (hdr_lun != task->sc->device->lun)
>>             return 0;
>>         /*
>>          * Fail all SCSI cmd PDUs
>>          */
>>         if (task->hdr->opcode != ISCSI_OP_SCSI_DATA_OUT) {
> 
> 
> You can just check for opcode == ISCSI_OP_SCSI_DATA_OUT like in the
> abort path.
> 
Yep, correct. That's an error in my patch.

>>             iscsi_conn_printk(KERN_INFO, conn,
>>                       "task [op %x(%x) itt 0x%x/0x%x lun %u] "
>>                       "reset.", task->hdr->opcode, opcode,
>>                       task->itt, task->hdr_itt,
>>                       hdr_lun);
>>             return -EACCES;
>>         }
>>         /*
>>          * And also all data-out PDUs in response to R2T
>>          * if fast_abort is set.
>>          */
>>         if (conn->session->fast_abort) {
> 
> 
> This is fine with me.
> 
Cool.

>>             iscsi_conn_printk(KERN_INFO, conn,
>>                       "task [op %x/%x itt "
>>                       "0x%x/0x%x lun %u] "
>>                       "fast abort.\n",
>>                       task->hdr->opcode, opcode,
>>                       task->itt, task->hdr_itt, hdr_lun);
>>             return -EACCES;
>>         }
>>         break;
>>
>> (obviously, we'd need to shuffle the header initialisation in _prep()
>> slightly
>> so that hdr->opcode is already set at this point.
>>
> It is not just the iscsi scsi cmd prep. It is also the data out prep.
> 
> Just check the opcode that is passed in for now. It is really messy to
> check the hdr one.
> 
Yes, I noticed :-)

> Also what is up with the simple patch? Did that work? I am not sure
> there is any point in doing the above patch, if we do not also fix the
> queueing. When the above patch hits EACCESS then because that task is at
> the head we will just retry the same one and no new IO to any other lun
> or task is sent. This is the same as the simple patch.
> 
Actually I only tested with this patch, and it looked to me as if the
spurious PDUs I've seen are now gone.

Unfortunately, the MSA still drops the connection every now and again.
And I wasn't able to take a full tcpdump of the connection as the dump
somehow did _not_ pick up the TMF PDUs I have sent. Normal traffic is
captured just fine, but the TMFs are dropped. And I'm reasonably sure
I've sent them; the logs indicate as much. Weird.

I'll give a last try by checking TTT values; the RFC indicates that
we have to honour each outstanding TTT for the target. So my current
thought is that maybe we haven't finished every outstanding TTT with
a final PDU, so that the target still waits for some data pdus to come.

Anyway, but on the good side: The 'Bad ITT' errors are gone, and we
don't transmit any spurious PDUs after reset. So that's definitely
a win here.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~----------~----~----~----~------~----~------~--~---

Reply via email to