After a restless night, I've come up with some interesting ideas...
Seems like. But I'll still poke at them ... :)
First, let's acknowledge that in 2.6 we must require drivers not to interact with a device after their disconnect() has returned. Also, let's
What's "interact" though? Not usb_put_dev(), clearly... I'd certainly agree that starting new I/O to any of the endpoints in the disconnected interface is forbidden, or changing its altsettings, etc.
But so long as the device is physically present, control traffic (ep0in/ep0out) is possible. And resetting is much the same: neither ep0 traffic nor reset() operations have stronger life-cycle constraints than "device must still be connected". Whether that's a good thing for drivers to do is a different question. There has never been enforcement of any policy in those areas.
acknowledge that in the absence of significant changes to the SCSI core, if usb-storage has a device-reset pending then its disconnect() routine _can't_ return until the reset has finished. My scheme takes these things into account.
See above re the reset(), though for argument I'll accept that usb-storage needs to work around SCSI (for now).
A key ingredient is to recognize that suspend/resume/reset don't need to lock the parent hub. They're only going to make use of one port, and it
You'll notice that "locktree" described itself as locking the port... :)
The reason it locks the parent hub, very briefly, is to force tasks to serialize. The dynamics change a bit when you're concerned with devices that are hubs ... every one of those operations refers to a TREE of downstream devices. Disconnect does too. So if someone is suspending/resuming/resetting/disconnecting an upstream device, the downstream device must be prevented from starting until that upstream operation completes.
Yes, when that tree is degenerate (just one device) the problems get a lot simpler. But when implementing suspend/resume, I can't make that simplifying assumption ... because those operations affect not just one port, but every port downstream! :)
Serializing access to a port should be done without using a semaphore or other simple synchronization primitive, because a process waiting to use a port must be forced to abort with an error if the port's child device is disconnected. That's necessary to avoid deadlock with disconnect:
khubd other thread ----- ------------ acquire port lock device wait for port process disconnect notification on the port mark device as STATE_NOTATTACHED wake up port waiters
Where "port waiters" being a new notion, like the notion of splitting apart the "port" (downstream end of the link?) and the device (upstream end?) as separate entities.
Why separate entities for both ends of the hub<-->dev link?
The "port waiters" notion looks like it it's probably worth pursuing, given the SCSI EH/disconnect issue.
call usb_disconnect(): try to lock device wake up, see device is gone fail port acquisition unlock device proceed with disconnect
Port acquisition should also fail if the hub's state changes from USB_STATE_CONFIGURED.
Using this approach, there's no need for a thread holding a child's lock to acquire the parent's lock -- no bottom-up locking. Serializing access to a port can be done fairly easily: Each hub can have a 32-bit bitmask indicating which ports are in use. There can be a waitqueue of processes waiting for an in-use port; since there's not liable to be much contention a single global waitqueue should suffice for all devices. Access to the bitmasks and the waitqueue is protected by a spinlock.
Or maybe atomic bitmasks with test-and-set, and wait_event().
Next, let's consider the 3 paths for binding:
(A) New device, khubd calls usb_set_configuration(), interfaces are probed. (B) Sysfs calls usb_set_configuration(), interfaces are unbound and then probed. (C) New driver is registered, driver model core probes all matching unbound interfaces.
There's also usbfs access, which should closely resemble (B).
Unbinding is similar. Notice that during (A) the device's hub port has
already been acquired, but during (B) and (C) it hasn't. Also, during (A) and (B) the device has been locked, but not during (C).
Modulo my proposal to eliminate that restriction for (A) and (B), removing the difference from (C).
For things to work properly, however, the device must be locked during binding. This is very difficult in (C) because any device might be probed. Solution: We'll effectively lock _all_ devices when (C) occurs.
Or, my proposal ...
We do this by introducing a new global rwsem. [...]
It turns out we will also need to distinguish cases (A) and (B). To do
Why?
that, we introduce new fields into struct usb_device and struct
usb_interface (taking a suggestion of Dave's). The usb_device field can take on the values
USB_DEVICE_ATTACHING, USB_DEVICE_ATTACHED, USB_DEVICE_DETACHING
and the usb_interface field can take on the values
USB_INTERFACE_UNBOUND, USB_INTERFACE_PROBING, USB_INTERFACE_BOUND, USB_INTEFACE_UNBINDING
Not unreasonable, though I don't yet see what those would achieve. I've wondered about SUSPENDING/RESUMING states too, but they have not seemed critical. That's a lot of states to add; and most of them won't be needed. On principle, I want to avoid adding new states ... and their consequent complexities.
Device resets are the hardest part to get right, because of the requirement that they not be allowed to block disconnect(). The key here is to change usb_reset_device's API: along with the usb_device the caller must also pass its usb_interface pointer. Then a polling loop much like what I discussed before can be used to acquire the device lock; if it fails whenever the interface is not PROBING or BOUND that will prevent any problems.
Why pass the device at all, if the interface is available? :)
That API change sounds a good thing, regardless of how any of the locking questions get resolved.
In addition, usb_reset_device() must be aware of the context in which it
is called. If the device is already locked (as it would be during
probe()) we should avoid trying to lock it again. If the device's hub's
port is already acquired (as it would be in (A)) we should avoid trying to
acquire it again. But now all the information needed to make these determinations is available. This means the device resets will work even during probe(), with no other modifications needed.
Hmm, wouldn't my own proposal for not holding usbdev->serialize also resolve that reset-during-probe issue? I get the impression you didn't quite see what I was driving at with it.
- Dave
-------------------------------------------------------
This SF.Net email is sponsored by Sleepycat Software
Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to deliver higher performing products faster, at low TCO.
http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel