Alan Stern wrote:
I don't understand these comments at all. How does usb_lock_device() violate a parent-first locking policy in any way that wasn't violated
before?
That's part of the point. It doesn't fix those problems which will show up most easily with suspend/resume of a device tree. We've been over that one before, not long after I first posted locktree() code and early CONFIG_USB_SUSPEND patches.
And I don't think you'd deny that usbcore has had a variety of locking issues; the previous behavior was no paragon.
There are relatively few places that need to lock more than one device at a time, and they can all be fixed up. Most of them will be fairly simple to change; probably usb_disconnect() will be the worst case.
In part because its calling convention has always been kind of rude!
So that "lock(dev); usb_disconnect(dev); unlock(dev); release(dev);" and similar "obviously correct" code sequences don't work, since it's got more than the usual amount of side-effects magic.
My biggest stumbling block is what the API should be (and how to document it!). Should the rwsem be exposed? Should routines directly do down(&udev->serialize) for locking all devices after the first? Should there be a separate pair of routines to encapsulate the notion of nested locks? How should such a pair cope with the way locktree() doesn't nest its multiple locks? How can this be documented in a way that won't make people wonder why such a simple idea needs to be made so complex?
Only people working inside usbcore should ever need to worry about the complicated stuff. Everyone else just goes lock()...unlock().
As currently coded (nyet integration tested) the rwsem is still hidden, but "hub-aware" code is allowed to down(&serialize) after lock_device() for the root of a subtree.
I started to look at making usb_lock_device() be locktree() underneath. Issues:
- Code now doing lock_device(hdev->child[i]) would change; the top level would lock_device(), lower levels would just down(&udev->serialize) and thus, at most, wait for a previously-started task to complete. I only saw a couple obvious instances to change.
- The "device is gone" fault mode is handled differently. Much of the code you touched was just failing there, after getting the lock; locktree() doesn't bother even grabbing the lock when it'd be meaningless.
- That special lock_for_reset() logic needs more thought. Maybe it just needs a trylock() version of locktree().
There were a couple other minor tweaks, but the big issues were the "lock never fails" policy, and lock_for_reset().
I don't understand why this is needed. On all occasions when a device is locked, apart from suspend() and resume(), it's not necessary to lock the entire subtree below the device. Locktree() would be overkill.
That's not very true. Disconnect processing will certainly lock the whole subtree. You can't reset hubs without resetting their children. You shouldn't be able to unbind the hub driver from a hub without disconnecting the children. And so on.
Plus of course, if it's not a hub, none of that's an issue. It'll be transparent to device drivers that don't manipulate hubs, and the only cost will be a few instructions that prevent some nasty interference scenarios.
It's not clear why usb_lock_device() has to check whether the device is gone. If the device is gone, that fact will be detected later on anyway. Why disturb all the existing code that assumes device locking always succeeds?
Any disturbance isn't actually much at all, considering that the code I changed is more or less from
lock_device(); if (it_vanished()) { unlock_device(); return -ENODEV; } to if (lock_device() < 0) return -ENODEV;
If lock_for_reset() isn't changed to the locktree() model, does it need to be changed at all? I don't see any reason why. The only case where it could matter would be if you had previously locked another device like the parent hub, but you were running outside of probe() and disconnect() -- I don't think that particular combination ever occurs.
Inconsistencies in locking models always bother me, since they normally provide very pleasant spots for bug colonies to settle.
I'd ask it this way: why should lock_for_reset() ever succeed in a case where lock() would fail?
- Dave
Alan Stern
-------------------------------------------------------
This SF.Net email sponsored by Black Hat Briefings & Training.
Attend Black Hat Briefings & Training, Las Vegas July 24-29 - digital self defense, top technical experts, no vendor pitches, unmatched networking opportunities. Visit www.blackhat.com
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel