On Fri, 2016-11-04 at 12:17 -0600, Bart Van Assche wrote:
> On 11/04/2016 07:47 AM, James Bottomley wrote:
> > You know after
> > 
> > if (device_remove_file_self(dev, attr))
> > 
> > returns true that s_active is held and also that KERNFS_SUICIDAL is 
> > set on the node, so the non-self remove paths can simply check for 
> > this flag and return without descending into __kernfs_remove(), 
> > which would mean they never take s_active.  That means we never get 
> > the inversion.
> 
> Hello James,
> 
> The lock inversion is not triggered by the non-self remove paths but 
> by the self-remove path.

I think we should agree first what the problem is.  The inversion
occurs between the sysfs delete path and the device node delete caused
by a remove host.  When both are happening the inversion is that when

        if (device_remove_file_self(dev, attr))
                scsi_remove_device(to_scsi_device(dev));

Happens, after the if, the s_active lock is held then
scsi_remove_device goes on to take the scan_mutex.

Conversely in scsi_remove_host, the mutex is taken first then
scsi_forget_host iterates removing the devices, but sysfs file removal
eventually takes s_active in kernfs_drain, which is called from
kernfs_remove via kernfs_remove_by_name_ns, hence the inversion.

This is therefore a conflict between the self and non-self remove
paths.

> Anyway, can you have a look at the two attached 
> patches?

Well, they're still overly complex, but perhaps due to the
misunderstanding above?  If you look through that trigger case, you'll
see that the fix is simply to check KERNFS_SUICIDAL in
kernfs_remove_by_name_ns and not descend into __kernfs_remove() if it's
set.  I think kernfs_mutex mediates this, but probably only if it's
moved lower down in kernfs_drain.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to