On Wed, 2016-12-07 at 14:24 -0500, Ewan D. Milne wrote:
> On Wed, 2016-12-07 at 10:16 -0800, James Bottomley wrote:
> > On Wed, 2016-12-07 at 12:40 -0500, Ewan D. Milne wrote:
> > > On Wed, 2016-12-07 at 08:55 -0800, Bart Van Assche wrote:
> > > > On 12/07/2016 08:48 AM, Bart Van Assche wrote:
> > > > > It's a known bug. Some time ago I posted a patch that 
> > > > > serializes all scsi_device_set_state() calls but I have not 
> > > > > yet found it in the list archives. However, that patch has 
> > > > > not yet been merged.
> > > > 
> > > > See also https://www.spinics.net/lists/linux-scsi/msg66966.html
> > > > 
> > > > Bart.
> > > > 
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux
> > > > -scsi" in
> > > > the body of a message to majord...@vger.kernel.org
> > > > More majordomo info at  
> > > > http://vger.kernel.org/majordomo-info.html
> > > 
> > > Yes, however that patch does not fix Wei Fang's issue.  In fact I
> > > just received a crash dump that appears to be the same thing.  It
> > > looks like the rport went away right after the initial INQUIRY, 
> > > so we set the state to SDEV_BLOCK and stop the queue, and then 
> > > the scan code continues and sets the state back to SDEV_RUNNING.
> > 
> > So here's the violation of the state model.  The rport went
> > CREATED->BLOCK, which is wrong: it should go CREATED->CREATED_BLOCK,
> > and then the add code would set it to BLOCK instead of RUNNING.
> > 
> > The question to diagnose is why CREATED->BLOCK worked.
> > 
> > James
> > 
> 
> I believe scsi_add_lun() changed the state from CREATED->RUNNING,
> which allowed the state to change from RUNNING->BLOCK, and then
> scsi_sysfs_add_sdev() called scsi_device_set_state(), which changed
> the state from BLOCK->RUNNING but did not restart the queue.
> 
> I have a debug kernel out to the site that found this to make sure,
> assuming they can reproduce this, but I don't see any other way it 
> could have happened.

Hm, it looks like the state set in scsi_sysfs_add_sdev() is bogus.  We
expect the state to have been properly set before that (in
scsi_add_lun), so can we not simply remove it?

James

