>-----Original Message-----
>From: Mike Christie [mailto:[EMAIL PROTECTED]
>Sent: Thursday, July 31, 2008 10:37 AM
>To: Dev, Vasu
>Cc: [email protected]
>Subject: Re: [Open-FCoE] [PATCH 1/1] libfc: remove shared tag map and fix
>timer usage in fc_fcp.c (take 2)
>
>Dev, Vasu wrote:
>>> -----Original Message-----
>>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>On
>>> Behalf Of [EMAIL PROTECTED]
>>> Sent: Wednesday, July 30, 2008 1:44 AM
>>> To: [email protected]
>>> Subject: [Open-FCoE] [PATCH 1/1] libfc: remove shared tag map and fix
>timer
>>> usage in fc_fcp.c (take 2)
>>>
>>> From: Mike Christie <[EMAIL PROTECTED]>
>> So one possible solution can be adding exch lock/unlock functions
>> similar to  recently added fc pkt lock/unlock functions by you and
>> then call exch resp handler with exch lock held. This would also
>> require ensuring other EM calls (e.g. fc_exch_done()) are not in
>> call path to any circular exch locking.
>
>Yeah, I was thinking that might get difficult if it is a spin lock with
>the code how it is now. Another problem besides the circular locking is
>that he exch resp handler may sleep. For example when we fc_io_compl
>calls del_timer_sync. The sleeping problem could be fixed though.
>

Yeah this will also cause sleep with lock held issues in resp function
and this sleeping problem would need fix similar to handling host_lock
in queuecommand path.

>I was tinkering with something like del_timer_sync where the fc_fcp.c
>resp function would hold the pkt lock, set some state bit indicating it
>is cleaning up the exchange, drop the pkt lock then call a fc_exch.c
>function to clean up the exachnge. The fc_exch.c function would then not
>return until the resp function was done running. For the fc_fcp resp
>functions it would get run, grab the pkt lock then see the new state bit
>indicating that it should just exit and not process the fsp. All this
>seems maybe more complicated than what you propose though :(
>
>

This sounds better approach and will also eliminate need to call
resp function with exch lock held. I'll try to fix this race along
this approach. However if I can fix fc_exch.c to better co-ordinate
between exch_done, exch timeout, exch abort and resp calling then
that would eliminate upper layer from any tracking as
suggested above by a state bit in fsp pkt.

>>
>>> This will be fixed in a different patch.
>>>
>>
>> I can work on above suggested solution unless you are in the middle
>> of this different patch.
>
>Go ahead. I have just been timkering with different approaches and
>learning the fc_exch.c code.
>

OK.
>>
>> As you mentioned that there were issues like this prior to shared
>> tag map fix in libfc, therefore not sure on reverting shared tag map
>> fix from libfc with added over head of maintaining link list of
>
>I think we agree we need a way to loop over running commands for error
>handling operations.

True.

> Are you saying you want to keep the shared tag map
>way of looking up commands?

Thought so, I wanted libfc fast path free from keeping track of
running/outstanding commands for only error handling and instead use
any scsi-ml maintained command tracking. Since now you have better
solutions for this in pipeline as discussed in other mails (using
scsi_error.c list or exchange manager reset function), therefore
shared tag map in libfc not required any more and this patch is
good to go.

<snip>
>>> +       spin_lock(&fsp->scsi_pkt_lock);
>>> +       if (fsp->state & FC_SRB_COMPL)
>>> +               goto unlock;
>>> +
>>
>> The fsp packet might be freed by this time though FC_SCSI_REC_TOV for
>
>How will it get freed? I thought this fsp is only freed when we do this
>in fc_device_reset:
>
>         sp->state = FC_SRB_FREE;
>         fc_fcp_pkt_free(sp);
>
>
>If fc_lun_reset calls fc_lun_reset_send and the frame allocation always
>fails so that timer is getting rearmed over and over, then the call to
>
>wait_for_completion_timeout(&fsp->tm_done, FC_SCSI_TM_TOV)
>would timeout and return but we do this:
>
>         spin_lock(&fsp->scsi_pkt_lock);
>         fsp->state |= FC_SRB_COMPL;
>         spin_unlock(&fsp->scsi_pkt_lock);
>
>         del_timer_sync(&fsp->timer);
>
>the del_timer_sync call will wait for fc_lun_reset_send to complete
>before it returns or kill the pending timer then return.
>

Thanks for explanation, yeap this is not an issue.

With this patch, the entire fc_lun_reset_send() has pkt lock
held while sending frame down via fc_exch.c and network layer.
The fc_exch.c can sleep in exch allocation, I need to fix that
and that is a separate issue but in general what do you think on
calling into fc_exch.c with or without any lock held ?
_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel

Reply via email to