Talking about simultaneous disconnect and device reset:

On Mon, 3 May 2004, David Brownell wrote:

> That's not _my_ analysis ... :)
> 
> One or the other task will lock the hub first.  Simple case:  khubd
> wins, driver's top-down lock acquisition will first block (because it
> can't get past khubd) and then later fail (the device is gone, though
> that task has an old device pointer that it's still got to release).

This is the case I was concerned about.  As you say, the driver's top-down
lock acquisition will block because it can't get past khubd.  Without
something like the polling I described in my previous message, however,
your "later fail" part is wrong.  It won't fail later; it will deadlock.  
usb_reset_device() can't proceed until khubd releases the hub, khubd can't
release the hub until the driver's disconnect() returns, and the driver's
disconnect() can't return until usb_reset_device() returns.
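
For illustration only, here is a minimal userspace sketch of how a polling acquisition could break that cycle. It is not the code from my previous message and not kernel code: pthreads stand in for the kernel primitives, and the names fake_hub, fake_dev, and lock_for_reset() are hypothetical. The assumption is that whoever notices the disconnect marks the device NOTATTACHED before waiting on anything, so the resetting thread can see the state change and give up instead of sleeping on the hub lock forever.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects discussed above. */
enum { DEV_ATTACHED, DEV_NOTATTACHED };

struct fake_hub {
    pthread_mutex_t serialize;   /* plays the role of hub->serialize */
};

struct fake_dev {
    atomic_int state;            /* DEV_ATTACHED or DEV_NOTATTACHED */
    struct fake_hub *parent;
};

/*
 * Poll for the parent hub's lock instead of blocking on it.
 * Returns 0 with the lock held, or -ENODEV (lock not held) once
 * the device has been marked as gone.
 */
int lock_for_reset(struct fake_dev *dev)
{
    assert(dev != NULL && dev->parent != NULL);

    for (;;) {
        /* Give up as soon as the disconnect has been noticed. */
        if (atomic_load(&dev->state) == DEV_NOTATTACHED)
            return -ENODEV;

        if (pthread_mutex_trylock(&dev->parent->serialize) == 0) {
            /* Re-check: the state may have changed just before we won. */
            if (atomic_load(&dev->state) == DEV_NOTATTACHED) {
                pthread_mutex_unlock(&dev->parent->serialize);
                return -ENODEV;
            }
            return 0;
        }

        /* Lock is busy (e.g. held by the hub thread); let it run. */
        sched_yield();
    }
}
```

With something along these lines, a reset attempted during a disconnect returns -ENODEV, the driver's disconnect() can then return, and the hub lock gets released.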


> This disconnect() issue is a parallel of the open()/disconnect() issue.
> In both cases, there's state that must linger after disconnect() returns,
> and be cleaned up later.  In one case it's what close() accesses, and it's
> associated with a user file handle.  In the other case, it's what the
> SCSI EH task will have to work with as it's noticing -ENODEV.

That is wrong.  It's not simply a question of lingering state; it's also a
question of lingering code.  After disconnect() returns we have to assume
that the driver is no longer resident in memory.  Unlike open(),
usb_reset_device() doesn't take a reference to the driver's module.  
Hence there must not be any threads (like SCSI EH) still trying to use it.

Let's also consider the special case of usb-storage, and let's suppose for
a moment that the module won't be removed from memory when disconnect()  
returns.  It's _still_ a problem, because disconnect() calls
scsi_unregister_host() and that routine won't return until the EH has 
finished.

Fortunately, I think the polling solution will take care of all this.


Talking about marking an entire subtree as NOTATTACHED as soon as we know 
a hub is gone, without waiting to acquire any semaphores:

> What I've been talking about is a scheme where hub->serialize serves
> as the (physical) topology lock for that subtree ... and everyone uses
> the same simple locking convention.  I'd require whoever changes any
> child's state to NOTATTACHED to hold that lock, then later to actually
> get rid of the device object.
> 
> The "why not recurse" stage would be an optimization, and you could
> be right that it's not easily worked.

Actually, my slightly inelegant solution of requiring usb_disconnect() to
hold the state-spinlock (in addition to any topology or serialize
semaphores) while erasing the children[] entry seems like a pretty good
way to resolve this problem.
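
As a rough sketch of why that works (userspace C, with a pthread mutex standing in for the state spinlock; struct node, mark_subtree_notattached(), and erase_child() are hypothetical names, not the usbcore API): the marking pass walks children[] under the state lock only, and the disconnect path takes the same lock around the erase, so the walk can never follow a pointer to a child that is in the middle of being removed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CHILDREN 4

enum { ST_ATTACHED, ST_NOTATTACHED };

/* Hypothetical stand-in for a device with a children[] array. */
struct node {
    int state;
    struct node *children[MAX_CHILDREN];
};

/* Stand-in for the state spinlock; protects state and children[]. */
pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold state_lock. */
void mark_locked(struct node *n)
{
    n->state = ST_NOTATTACHED;
    for (int i = 0; i < MAX_CHILDREN; i++)
        if (n->children[i] != NULL)
            mark_locked(n->children[i]);
}

/* Mark a whole subtree as gone without taking any semaphores. */
void mark_subtree_notattached(struct node *n)
{
    pthread_mutex_lock(&state_lock);
    mark_locked(n);
    pthread_mutex_unlock(&state_lock);
}

/*
 * What the disconnect path would do: erase the parent's pointer
 * while holding the state lock, so a concurrent marking pass sees
 * either the child or NULL, never a stale pointer.
 */
void erase_child(struct node *parent, int port)
{
    assert(port >= 0 && port < MAX_CHILDREN);
    pthread_mutex_lock(&state_lock);
    parent->children[port] = NULL;
    pthread_mutex_unlock(&state_lock);
}
```

In the real code the disconnect path would still hold whatever topology or serialize semaphores it needs; the sketch only shows the extra spinlock around the erase.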

Alan Stern


