Talking about simultaneous disconnect and device reset:

On Mon, 3 May 2004, David Brownell wrote:
> That's not _my_ analysis ... :)
>
> One or the other task will lock the hub first.  Simple case:  khubd
> wins, driver's top-down lock acquisition will first block (because it
> can't get past khubd) and then later fail (the device is gone, though
> that task has an old device pointer that it's still got to release).

This is the case I was concerned about.  As you say, the driver's
top-down lock acquisition will block because it can't get past khubd.
Without something like the polling I described in my previous message,
however, your "later fail" part is wrong.  It won't fail later; it will
deadlock.  usb_reset_device() can't proceed until khubd releases the
hub, khubd can't release the hub until the driver's disconnect()
returns, and the driver's disconnect() can't return until
usb_reset_device() returns.

> This disconnect() issue is a parallel of the open()/disconnect() issue.
> In both cases, there's state that must linger after disconnect() returns,
> and be cleaned up later.  In one case it's what close() accesses, and it's
> associated with a user file handle.  In the other case, it's what the
> SCSI EH task will have to work with as it's noticing -ENODEV.

That is wrong.  It's not simply a question of lingering state; it's
also a question of lingering code.  After disconnect() returns we have
to assume that the driver is no longer resident in memory.  Unlike
open(), usb_reset_device() doesn't take a reference to the driver's
module.  Hence there can't be any threads (like SCSI EH) still trying
to use it.

Let's also consider the special case of usb-storage, and let's suppose
for a moment that the module won't be removed from memory when
disconnect() returns.  It's _still_ a problem, because disconnect()
calls scsi_unregister_host() and that routine won't return until the EH
has finished.

Fortunately, I think the polling solution will take care of all this.
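
For illustration only, here is a rough user-space sketch of the deadlock
cycle and of how polling might break it.  The details of the polling
scheme are in the earlier message and are not reproduced here, so the
loop below is an assumption: it takes "polling" to mean trying the hub
lock without blocking and re-checking whether the device is gone.  The
names hub_lock, dev_gone, reset_device_polling() and
simulate_disconnect_during_reset() are all hypothetical, and pthreads
stand in for the kernel primitives; none of this is the USB core's
actual code.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not USB core names. */
static pthread_mutex_t hub_lock = PTHREAD_MUTEX_INITIALIZER; /* models the hub's lock */
static atomic_bool dev_gone;                 /* models state == NOTATTACHED */

/* Assumed polling scheme: never block on the hub lock; give up as soon
 * as the device is seen to be gone. */
static int reset_device_polling(void)
{
    for (;;) {
        if (atomic_load(&dev_gone))
            return -ENODEV;                  /* fail instead of waiting */
        if (pthread_mutex_trylock(&hub_lock) == 0) {
            int ret = atomic_load(&dev_gone) ? -ENODEV : 0;
            /* ... the real port reset would go here when ret == 0 ... */
            pthread_mutex_unlock(&hub_lock);
            return ret;
        }
        sched_yield();                       /* back off, then poll again */
    }
}

/* Models the driver thread (e.g. SCSI EH) asking for a reset. */
static void *driver_thread(void *arg)
{
    *(int *)arg = reset_device_polling();
    return NULL;
}

/* Models khubd winning the race for the hub and then disconnecting. */
int simulate_disconnect_during_reset(void)
{
    pthread_t driver;
    int result = 0;

    atomic_store(&dev_gone, false);
    pthread_mutex_lock(&hub_lock);           /* khubd holds the hub */
    if (pthread_create(&driver, NULL, driver_thread, &result) != 0) {
        pthread_mutex_unlock(&hub_lock);
        return -EAGAIN;
    }
    atomic_store(&dev_gone, true);           /* khubd marks the device gone */
    pthread_join(driver, NULL);              /* disconnect() waits for the reset */
    pthread_mutex_unlock(&hub_lock);         /* only now is the hub released */
    return result;
}
```

Built with -pthread, simulate_disconnect_during_reset() returns -ENODEV.
If the trylock loop were replaced by a plain pthread_mutex_lock(), the
driver thread would block on hub_lock while the khubd side blocks in
pthread_join(), which is the cycle described above.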

Talking about marking an entire subtree as NOTATTACHED as soon as we
know a hub is gone, without waiting to acquire any semaphores:

> What I've been talking about is a scheme where hub->serialize serves
> as the (physical) topology lock for that subtree ... and everyone uses
> the same simple locking convention.  I'd require whoever changes any
> child's state to NOTATTACHED to hold that lock, then later to actually
> get rid of the device object.
>
> The "why not recurse" stage would be an optimization, and you could
> be right that it's not easily worked.

Actually, my slightly inelegant solution of requiring usb_disconnect()
to hold the state-spinlock (in addition to any topology or serialize
semaphores) while erasing the children[] entry seems like a pretty good
way to resolve this problem.

Alan Stern