On Wed, 2016-12-07 at 10:16 -0800, James Bottomley wrote:
> On Wed, 2016-12-07 at 12:40 -0500, Ewan D. Milne wrote:
> > On Wed, 2016-12-07 at 08:55 -0800, Bart Van Assche wrote:
> > > On 12/07/2016 08:48 AM, Bart Van Assche wrote:
> > > > It's a known bug. Some time ago I posted a patch that serializes 
> > > > all scsi_device_set_state() calls but I have not yet found it in 
> > > > the list archives. However, that patch has not yet been merged.
> > > 
> > > See also https://www.spinics.net/lists/linux-scsi/msg66966.html.
> > > 
> > > Bart.
> > > 
> > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > > > the body of a message to [email protected]
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > Yes, however that patch does not fix Wei Fang's issue.  In fact I 
> > just received a crash dump that appears to be the same thing.  It 
> > looks like the rport went away right after the initial INQUIRY, so we 
> > set the state to SDEV_BLOCK and stop the queue, and then the scan 
> > code continues and sets the state back to SDEV_RUNNING.
> 
> So here's the violation of the state model.  The rport went
> CREATED->BLOCK, which is wrong: it should go CREATED->CREATED_BLOCK, and
> then the add code would set it to BLOCK instead of RUNNING.
> 
> The question to diagnose is why CREATED->BLOCK worked.
> 
> James
> 

I believe scsi_add_lun() changed the state from CREATED->RUNNING, which
allowed the state to change from RUNNING->BLOCK, and then
scsi_sysfs_add_sdev() called scsi_device_set_state(), which changed
the state from BLOCK->RUNNING but did not restart the queue.
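
To make the suspected sequence concrete, here is a small userspace toy model.
It is only a sketch, not kernel code: the toy_* names, the queue_stopped flag,
and the transition table are assumptions that cover just the transitions
mentioned in this thread.

```c
#include <stdbool.h>

/* Toy model only: the states mirror the names used in this thread. */
enum toy_state { TOY_CREATED, TOY_CREATED_BLOCK, TOY_RUNNING, TOY_BLOCK };

struct toy_sdev {
    enum toy_state state;
    bool queue_stopped;   /* hypothetical stand-in for the stopped queue */
};

/* Returns 0 on success, -1 if the transition is not in the toy table. */
int toy_set_state(struct toy_sdev *sdev, enum toy_state next)
{
    bool ok = false;

    switch (next) {
    case TOY_CREATED_BLOCK:
        ok = (sdev->state == TOY_CREATED);
        break;
    case TOY_RUNNING:
        ok = (sdev->state == TOY_CREATED || sdev->state == TOY_BLOCK);
        break;
    case TOY_BLOCK:
        /* CREATED->BLOCK is deliberately absent, per the state model. */
        ok = (sdev->state == TOY_RUNNING ||
              sdev->state == TOY_CREATED_BLOCK);
        break;
    default:
        break;
    }
    if (!ok)
        return -1;
    sdev->state = next;
    return 0;
}

/* Blocking changes the state and stops the queue. */
int toy_block(struct toy_sdev *sdev)
{
    if (toy_set_state(sdev, TOY_BLOCK) != 0)
        return -1;
    sdev->queue_stopped = true;
    return 0;
}

/* Replays the suspected order of events and returns the final device. */
struct toy_sdev toy_replay_suspected_sequence(void)
{
    struct toy_sdev sdev = { TOY_CREATED, false };

    toy_set_state(&sdev, TOY_RUNNING);  /* scan: CREATED->RUNNING */
    toy_block(&sdev);                   /* rport loss: RUNNING->BLOCK */
    toy_set_state(&sdev, TOY_RUNNING);  /* sysfs add: BLOCK->RUNNING */
    return sdev;
}
```

In this model the device ends up in the running state with queue_stopped
still set, because the bare state change never restarts the queue.  A direct
CREATED->BLOCK request is rejected by the toy table.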

I have a debug kernel out to the site that found this to make sure,
assuming they can reproduce this, but I don't see any other way it could
have happened.

-Ewan



