On 09/01/2010 05:45 PM, Chris Leech wrote: > On Wed, Sep 01, 2010 at 02:06:26PM -0700, Vasu Dev wrote: >>>> It looks safe to me to call scsi_done() w/o host_lock held, >>> >>> Hmmmm, this indeed this appears to be safe now.. For some reason I had >>> it in my head (and in TCM_Loop virtual SCSI LLD code as well) that >>> host_lock needed to be held while calling struct scsi_cmnd->scsi_done(). >>> >>> I assume this is some old age relic from the BLK days in the SCSI >>> completion path, and the subsequent conversion. I still see a couple of >>> ancient drivers in drivers/scsi/ that are still doing this, but I >>> believe I stand corrected in that (all..?) of the modern in-use >>> drivers/scsi code is indeed *not* holding host_lock while calling struct >>> scsi_cmnd->scsi_done().. >>> >> >> fcoe/libfc moved to scsi_done w/o holding scsi host_lock a while ago >> around dec, 09 and it was done after discussion with Mathew and Chris >> Leech from fcoe side at that time, they may have more to comment on >> this. > > There's not a whole lot to comment on. Matthew Wilcox was helping me > look for opportunities to reduce our host_lock use, and said he didn't > think it was needed around scsi_done anymore. It held up under testing, > so I submitted a patch. >
The host_lock was not actually there for any scsi_done stuff. It was probably lazy programming that it was held there. For that code, the host_lock was held in fc_queuecommand for the rport check and for the setting of the SCp.ptr and fsp->cmd, and it was held in the completion path for the SCp.otr and fsp->cmd checks The rport check locking got fixed recently and I was looking at the SCp.ptr and fsp->cmd and was wondering if there could be a problem where one thread completes the IO and sets those fields to NULL, but another thread could be completing it too and it would see a scsi_cmnd that is not released and reallocated by the other thread. So for example the fc_eh_abort code still grabs the host_lock when calling CMD_SP and taking a ref and checking that the fsp is not null. If it is a problem then we should add some locking or some other atomic magic. If it is not a problem then those checks could just be removed, right? _______________________________________________ devel mailing list [email protected] http://www.open-fcoe.org/mailman/listinfo/devel
