[linux-usb-devel] Re: USB locking problems

David Brownell Fri, 07 May 2004 09:38:46 -0700

Alan Stern wrote:

After a restless night, I've come up with some interesting ideas...

Seems like. But I'll still poke at them ... :)

First, let's acknowledge that in 2.6 we must require drivers not to interact with a device after their disconnect() has returned. Also, let's


What's "interact" though?  Not usb_put_dev(), clearly...
I'd certainly agree that starting new I/O to any of
the endpoints in the disconnected interface is forbidden,
or changing its altsettings, etc.

But so long as the device is physically present, control traffic
(ep0in/ep0out) is possible.  And resetting is much the same:  neither
ep0 traffic nor reset() operations have stronger life-cycle constraints
than "device must still be connected".  Whether that's a good thing
for drivers to do is a different question.  There has never been
enforcement of any policy in those areas.

acknowledge that in the absence of significant changes to the SCSI core, if usb-storage has a device-reset pending then its disconnect() routine _can't_ return until the reset has finished. My scheme takes these things into account.


See above re the reset(), though for argument I'll accept
that usb-storage needs to work around SCSI (for now).

A key ingredient is to recognize that suspend/resume/reset don't need to lock the parent hub. They're only going to make use of one port, and it

You'll notice that "locktree" described itself as locking the port... :)

The reason it locks the parent hub, very briefly, is to force tasks
to serialize.  The dynamics change a bit when you're concerned with
devices that are hubs ... every one of those operations refers to
a TREE of downstream devices.  Disconnect does too.  So if someone
is suspending/resuming/resetting/disconnecting an upstream device,
the downstream device must be prevented from starting until that
upstream operation completes.


Yes, when that tree is degenerate (just one device) the problems get
a lot simpler.  But when implementing suspend/resume, I can't make
that simplifying assumption ... because those operations affect not
just one port, but every port downstream!  :)

Serializing access to a port should be done without using a semaphore or other simple synchronization primitive, because a process waiting to use a port must be forced to abort with an error if the port's child device is disconnected. That's necessary to avoid deadlock with disconnect:
        khubd                           other thread
        -----                           ------------
        acquire port                    lock device
                                        wait for port
        process disconnect notification
          on the port
        mark device as STATE_NOTATTACHED
        wake up port waiters


Where "port waiters" being a new notion, like the notion
of splitting apart the "port" (downstream end of the link?)
and the device (upstream end?) as separate entities.

Why separate entities for both ends of the hub<-->dev link?

The "port waiters" notion looks like it it's probably worth
pursuing, given the SCSI EH/disconnect issue.

        call usb_disconnect():
          try to lock device
                                        wake up, see device is gone
                                        fail port acquisition
                                        unlock device
          proceed with disconnect
Port acquisition should also fail if the hub's state changes from USB_STATE_CONFIGURED.

Using this approach, there's no need for a thread holding a child's lock to acquire the parent's lock -- no bottom-up locking. Serializing access to a port can be done fairly easily: Each hub can have a 32-bit bitmask indicating which ports are in use. There can be a waitqueue of processes waiting for an in-use port; since there's not liable to be much contention a single global waitqueue should suffice for all devices. Access to the bitmasks and the waitqueue is protected by a spinlock.

Or maybe atomic bitmasks with test-and-set, and wait_event().

Next, let's consider the 3 paths for binding:

        (A) New device, khubd calls usb_set_configuration(),
                interfaces are probed.
        (B) Sysfs calls usb_set_configuration(), interfaces are
                unbound and then probed.
        (C) New driver is registered, driver model core probes all
                matching unbound interfaces.

There's also usbfs access, which should closely resemble (B).

Unbinding is similar. Notice that during (A) the device's hub port has already been acquired, but during (B) and (C) it hasn't. Also, during (A) and (B) the device has been locked, but not during (C).


Modulo my proposal to eliminate that restriction for (A) and (B),
removing the difference from (C).

For things to work properly, however, the device must be locked during
binding.  This is very difficult in (C) because any device might be
probed.  Solution: We'll effectively lock _all_ devices when (C) occurs.

Or, my proposal ...

We do this by introducing a new global rwsem. [...]

It turns out we will also need to distinguish cases (A) and (B). To do

Why?

that, we introduce new fields into struct usb_device and struct usb_interface (taking a suggestion of Dave's). The usb_device field can take on the values

USB_DEVICE_ATTACHING, USB_DEVICE_ATTACHED, USB_DEVICE_DETACHING

and the usb_interface field can take on the values
        USB_INTERFACE_UNBOUND, USB_INTERFACE_PROBING,
        USB_INTERFACE_BOUND, USB_INTEFACE_UNBINDING


Not unreasonable, though I don't yet see what those would achieve.
I've wondered about SUSPENDING/RESUMING states too, but they have
not seemed critical.  That's a lot of states to add; and most of
them won't be needed.  On principle, I want to avoid adding new
states ... and their consequent complexities.

Device resets are the hardest part to get right, because of the
requirement that they not be allowed to block disconnect().  The key here
is to change usb_reset_device's API: along with the usb_device the caller
must also pass its usb_interface pointer.  Then a polling loop much like
what I discussed before can be used to acquire the device lock; if it
fails whenever the interface is not PROBING or BOUND that will prevent any
problems.

Why pass the device at all, if the interface is available? :)

That API change sounds a good thing, regardless of how any
of the locking questions get resolved.

In addition, usb_reset_device() must be aware of the context in which it is called. If the device is already locked (as it would be during probe()) we should avoid trying to lock it again. If the device's hub's port is already acquired (as it would be in (A)) we should avoid trying to acquire it again. But now all the information needed to make these determinations is available. This means the device resets will work even during probe(), with no other modifications needed.


Hmm, wouldn't my own proposal for not holding usbdev->serialize also
resolve that reset-during-probe issue?  I get the impression you didn't
quite see what I was driving at with it.

- Dave

------------------------------------------------------- This SF.Net email is sponsored by Sleepycat Software Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to deliver higher performing products faster, at low TCO. http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3 _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

[linux-usb-devel] Re: USB locking problems

Reply via email to