On 09/01/2010 05:45 PM, Chris Leech wrote:
> On Wed, Sep 01, 2010 at 02:06:26PM -0700, Vasu Dev wrote:
>>>> It looks safe to me to call scsi_done() w/o host_lock held,
>>>
>>> Hmmmm, this indeed appears to be safe now..  For some reason I had
>>> it in my head (and in TCM_Loop virtual SCSI LLD code as well) that
>>> host_lock needed to be held while calling struct scsi_cmnd->scsi_done().
>>>
>>> I assume this is some old age relic from the BLK days in the SCSI
>>> completion path, and the subsequent conversion.  I still see a couple of
>>> ancient drivers in drivers/scsi/ that are still doing this, but I
>>> believe I stand corrected in that (all..?) of the modern in-use
>>> drivers/scsi code is indeed *not* holding host_lock while calling struct
>>> scsi_cmnd->scsi_done()..
>>>
>>
>> fcoe/libfc moved to calling scsi_done w/o holding the scsi host_lock a
>> while ago, around Dec 09. It was done after discussion with Matthew and
>> Chris Leech from the fcoe side at that time; they may have more to
>> comment on this.
>
> There's not a whole lot to comment on.  Matthew Wilcox was helping me
> look for opportunities to reduce our host_lock use, and said he didn't
> think it was needed around scsi_done anymore.  It held up under testing,
> so I submitted a patch.
>

The host_lock was not actually there for any scsi_done stuff. It was 
probably lazy programming that it was held there. For that code, the 
host_lock was held in fc_queuecommand for the rport check and for the 
setting of the SCp.ptr and fsp->cmd, and it was held in the completion 
path for the SCp.ptr and fsp->cmd checks. The rport check locking got 
fixed recently and I was looking at the SCp.ptr and fsp->cmd and was 
wondering if there could be a problem where one thread completes the IO 
and sets those fields to NULL, but another thread could be completing it 
too and it would see a scsi_cmnd that is now released and reallocated by 
the other thread. So for example the fc_eh_abort code still grabs the 
host_lock when calling CMD_SP and taking a ref and checking that the fsp 
is not null.

If it is a problem then we should add some locking or some other atomic 
magic. If it is not a problem then those checks could just be removed, 
right?
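
To make the question concrete, here is a rough user-space sketch of the 
pattern I mean. It is only a model, not the libfc code: model_cmd, 
model_pkt, model_host, model_pkt_get and model_complete are made-up names, 
and a pthread mutex stands in for the host_lock. It assumes the lookup, 
NULL check and ref grab have to happen under the same lock that the 
completion path takes when it clears the pointers.

```c
/* User-space model only; every name here is hypothetical, not the libfc API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_pkt;

struct model_cmd {              /* stands in for struct scsi_cmnd */
    struct model_pkt *ptr;      /* stands in for SCp.ptr */
};

struct model_pkt {              /* stands in for the fsp */
    int refcnt;                 /* protected by model_host.lock */
    struct model_cmd *cmd;      /* stands in for fsp->cmd */
};

struct model_host {
    pthread_mutex_t lock;       /* stands in for host_lock */
};

/* Abort-style path: look up the packet and pin it, or return NULL. */
struct model_pkt *model_pkt_get(struct model_host *h, struct model_cmd *c)
{
    struct model_pkt *p;

    pthread_mutex_lock(&h->lock);
    p = c->ptr;                 /* lookup and NULL check under the lock */
    if (p != NULL) {
        assert(p->refcnt > 0);  /* still owned by someone */
        p->refcnt++;            /* take the ref before dropping the lock */
    }
    pthread_mutex_unlock(&h->lock);
    return p;
}

/* Completion path: detach the packet; returns 1 only for the first caller. */
int model_complete(struct model_host *h, struct model_cmd *c)
{
    struct model_pkt *p;

    pthread_mutex_lock(&h->lock);
    p = c->ptr;
    if (p != NULL) {
        c->ptr = NULL;          /* a second completer now sees NULL */
        p->cmd = NULL;
    }
    pthread_mutex_unlock(&h->lock);
    return p != NULL;
}
```

In this model a second completion, or an abort that arrives after the 
completion, sees NULL and backs off instead of touching a command that 
may already have been handed out again. Whether the real paths can 
actually race like this is the open question.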