If you're detaching a USB device while another thread is still holding
a reference on it (ifconfig/dhclient for example) you might end up in
a deadlock situation and you'll have to reboot* to use your USB ports.

This deadlock is triggered if the thread detaching your device sleeps
in the *_detach() function waiting for all the previously submitted 
USB transfer to finish.  Sleeping in a *_detach() function means that
the USB device has been flagged as `dying'.  But our custom USB task
code prevent any task to be scheduled as soon as the device is dying.

Sadly, if your device has been detached, there's a lot of chances that
its transfers time out.  Since a timed out transfer needs a task to be
properly completed but our stack does not allow tasks to be scheduled
anymore -> deadlock.

If you've ever found yourself with a non working USB, you can use ps(1)
to see if one of your threads is waiting on "usbsync", the symptom of
this deadlock.

# ps axk -Owchan |grep usbsync

Diff below changes the USB task code to allow abort tasks to be scheduled
and executed even if the device is dying.  This should be safe because
aborting the pipe will remove the task if it hasn't been executed.

Comments, ok?

* If you can break into DDB you don't necessarily need to reboot when
such deadlock occurs. First get the xfer address (wait channel) for the
stuck thread:
# ps axk -Onwchan |grep usbsync

Then break into DDB and finish the transfer manually:
ddb> call usb_transfer_comple(0xnwchan_addr)

Index: usb.c
===================================================================
RCS file: /cvs/src/sys/dev/usb/usb.c,v
retrieving revision 1.103
diff -u -p -r1.103 usb.c
--- usb.c       18 Dec 2014 10:44:17 -0000      1.103
+++ usb.c       11 Jan 2015 22:41:14 -0000
@@ -298,8 +298,13 @@ usb_add_task(struct usbd_device *dev, st
 {
        int s;
 
-       /* Don't add task if the device's root hub is dying. */
-       if (usbd_is_dying(dev))
+       /*
+        * If the thread detaching ``dev'' is sleeping, waiting
+        * for all submitted transfers to finish, we must be able
+        * to enqueue abort tasks.  Otherwise timeouts can't give
+        * back submitted transfers to the stack.
+        */
+       if (usbd_is_dying(dev) && (task->type != USB_TASK_TYPE_ABORT))
                return;
 
        DPRINTFN(2,("%s: task=%p state=%d type=%d\n", __func__, task,
@@ -455,12 +460,9 @@ usb_abort_task_thread(void *arg)
                 */
                task->state |= USB_TASK_STATE_RUN;
                task->state &= ~USB_TASK_STATE_ONQ;
-               /* Don't actually execute the task if dying. */
-               if (!usbd_is_dying(task->dev)) {
-                       splx(s);
-                       task->fun(task->arg);
-                       s = splusb();
-               }
+               splx(s);
+               task->fun(task->arg);
+               s = splusb();
                task->state &= ~USB_TASK_STATE_RUN;
                if (task->state == USB_TASK_STATE_NONE)
                        wakeup(task);

Reply via email to