Alan Stern wrote:
After a restless night, I've come up with some interesting ideas...

Seems like. But I'll still poke at them ... :)



First, let's acknowledge that in 2.6 we must require drivers not to interact with a device after their disconnect() has returned. Also, let's

What's "interact" though? Not usb_put_dev(), clearly... I'd certainly agree that starting new I/O to any of the endpoints in the disconnected interface is forbidden, or changing its altsettings, etc.

But so long as the device is physically present, control traffic
(ep0in/ep0out) is possible.  And resetting is much the same:  neither
ep0 traffic nor reset() operations have stronger life-cycle constraints
than "device must still be connected".  Whether that's a good thing
for drivers to do is a different question.  There has never been
enforcement of any policy in those areas.


acknowledge that in the absence of significant changes to the SCSI core, if usb-storage has a device-reset pending then its disconnect() routine _can't_ return until the reset has finished. My scheme takes these things into account.

See above re the reset(), though for argument I'll accept that usb-storage needs to work around SCSI (for now).


A key ingredient is to recognize that suspend/resume/reset don't need to lock the parent hub. They're only going to make use of one port, and it

You'll notice that "locktree" described itself as locking the port... :)


The reason it locks the parent hub, very briefly, is to force tasks
to serialize.  The dynamics change a bit when you're concerned with
devices that are hubs ... every one of those operations refers to
a TREE of downstream devices.  Disconnect does too.  So if someone
is suspending/resuming/resetting/disconnecting an upstream device,
the downstream device must be prevented from starting until that
upstream operation completes.


Yes, when that tree is degenerate (just one device) the problems get a lot simpler. But when implementing suspend/resume, I can't make that simplifying assumption ... because those operations affect not just one port, but every port downstream! :)


Serializing access to a port should be done without using a semaphore or other simple synchronization primitive, because a process waiting to use a port must be forced to abort with an error if the port's child device is disconnected. That's necessary to avoid deadlock with disconnect:

        khubd                           other thread
        -----                           ------------
        acquire port                    lock device
                                        wait for port
        process disconnect notification
          on the port
        mark device as STATE_NOTATTACHED
        wake up port waiters

Where "port waiters" being a new notion, like the notion of splitting apart the "port" (downstream end of the link?) and the device (upstream end?) as separate entities.

Why separate entities for both ends of the hub<-->dev link?

The "port waiters" notion looks like it it's probably worth
pursuing, given the SCSI EH/disconnect issue.

        call usb_disconnect():
          try to lock device
                                        wake up, see device is gone
                                        fail port acquisition
                                        unlock device
          proceed with disconnect

Port acquisition should also fail if the hub's state changes from USB_STATE_CONFIGURED.

Using this approach, there's no need for a thread holding a child's lock to acquire the parent's lock -- no bottom-up locking. Serializing access to a port can be done fairly easily: Each hub can have a 32-bit bitmask indicating which ports are in use. There can be a waitqueue of processes waiting for an in-use port; since there's not liable to be much contention a single global waitqueue should suffice for all devices. Access to the bitmasks and the waitqueue is protected by a spinlock.

Or maybe atomic bitmasks with test-and-set, and wait_event().



Next, let's consider the 3 paths for binding:

        (A) New device, khubd calls usb_set_configuration(),
                interfaces are probed.
        (B) Sysfs calls usb_set_configuration(), interfaces are
                unbound and then probed.
        (C) New driver is registered, driver model core probes all
                matching unbound interfaces.

There's also usbfs access, which should closely resemble (B).



Unbinding is similar. Notice that during (A) the device's hub port has
already been acquired, but during (B) and (C) it hasn't. Also, during (A) and (B) the device has been locked, but not during (C).

Modulo my proposal to eliminate that restriction for (A) and (B), removing the difference from (C).

For things to work properly, however, the device must be locked during
binding.  This is very difficult in (C) because any device might be
probed.  Solution: We'll effectively lock _all_ devices when (C) occurs.

Or, my proposal ...



We do this by introducing a new global rwsem. [...]


It turns out we will also need to distinguish cases (A) and (B). To do

Why?


that, we introduce new fields into struct usb_device and struct
usb_interface (taking a suggestion of Dave's). The usb_device field can take on the values


USB_DEVICE_ATTACHING, USB_DEVICE_ATTACHED, USB_DEVICE_DETACHING

and the usb_interface field can take on the values

        USB_INTERFACE_UNBOUND, USB_INTERFACE_PROBING,
        USB_INTERFACE_BOUND, USB_INTEFACE_UNBINDING

Not unreasonable, though I don't yet see what those would achieve. I've wondered about SUSPENDING/RESUMING states too, but they have not seemed critical. That's a lot of states to add; and most of them won't be needed. On principle, I want to avoid adding new states ... and their consequent complexities.


Device resets are the hardest part to get right, because of the
requirement that they not be allowed to block disconnect().  The key here
is to change usb_reset_device's API: along with the usb_device the caller
must also pass its usb_interface pointer.  Then a polling loop much like
what I discussed before can be used to acquire the device lock; if it
fails whenever the interface is not PROBING or BOUND that will prevent any
problems.

Why pass the device at all, if the interface is available? :)


That API change sounds a good thing, regardless of how any
of the locking questions get resolved.


In addition, usb_reset_device() must be aware of the context in which it
is called. If the device is already locked (as it would be during
probe()) we should avoid trying to lock it again. If the device's hub's
port is already acquired (as it would be in (A)) we should avoid trying to
acquire it again. But now all the information needed to make these determinations is available. This means the device resets will work even during probe(), with no other modifications needed.

Hmm, wouldn't my own proposal for not holding usbdev->serialize also resolve that reset-during-probe issue? I get the impression you didn't quite see what I was driving at with it.

- Dave




-------------------------------------------------------
This SF.Net email is sponsored by Sleepycat Software
Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to deliver higher performing products faster, at low TCO.
http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to