Here are some thoughts on the perennial problem of synchronous unlinking, including the complications of resubmission within a completion routine.
Currently the synchronous case of usb_unlink_urb() has the following semantics: If the URB is not in progress the routine returns immediately with -EBUSY. If the URB is in progress then it is unlinked, the URB's status is set to -ENOENT, the unlink routine blocks until the completion function has returned, and then it returns 0. In my opinion these semantics are insufficient. Certainly they are not appropriate for the way usb_unlink_urb() is actually used in many drivers. A very hasty survey of source files in drivers/usb/class showns that most synchronous unlinkers require these semantics instead: Regardless of whether the URB was in progress or not, when the routine returns the URB must no longer be linked and must be available for immediate reuse. Also the completion function must have already returned so that it can safely be unloaded from memory (as in rmmod). Resubmission within the completion function doesn't make much sense in the context of a synchronous unlink, so it wouldn't hurt -- and indeed it might help -- if the core fails such resubmissions. The return code from the unlink doesn't really matter since callers usually don't check it, so there's no harm in continuing to return -EBUSY if the URB wasn't in progress. Another problem is that there is no protection against possible races involving one task unlinking an URB at the same time that another is submitting it. (That's true for asynchronous unlinks as well.) While this might seem an unlikely scenario, it certainly could happen if a completion function automatically resubmits and a disconnect() routine unluckily tries to unlink the URB just during the resubmission. It has been stated that both these problems can be solved by adding careful locking to the drivers. That's true, but there are a lot of drivers and it takes a good deal to work to make sure the locking is done correctly in each case. A much simpler solution is to change the core; doing so requires some relatively small changes plus updates to each of the low-level HC drivers. Here is a suggestion for such a solution. Two new fields are added to struct urb: int usage_count to keep track of nested submissions (i.e., resubmissions from within completion functions) and wait_queue_header_t *unlinkers (for blocking during synchronous unlinks). URB submission fails if urb->unlinkers is set, indicating that tasks are waiting for a synchronously-unlinked completion handler to return. urb->status is _not_ set to -EINPROGRESS during usb_submit_urb(); doing so would invite the error of asking a low-level HC driver to unlink the URB before it has been linked. Instead, the HC drivers must check urb->unlinkers, set urb->status, and increment urb->usage_count all while holding urb->lock. usb_hcd_giveback_urb() decrements urb->usage_count, and when it reaches 0 (indicating no more nested resubmissions are in progress) it wakes up the tasks on the urb->unlinkers wait queue. hcd_unlink_urb() is the function most affected by this patch, but the net change might even be a slight simplification. There's no more need for the awkward "completion splicing" technique and the code is cleaner. The patch below includes the necessary changes for the UHCI driver. Similar changes would be needed for the other HC drivers, but I'm not as familiar with them. Clearly this is not something to be added to the kernel until 2.7 starts up. A related welcome change would be to create separate functions for doing synchronous and asynchronous unlinks rather than relying on the URB_ASYNC_UNLINK flag. Comments? Alan Stern --- 2.6/include/linux/usb.h.orig Mon Oct 20 10:40:14 2003 +++ 2.6/include/linux/usb.h Sun Dec 7 10:32:34 2003 @@ -690,6 +690,8 @@ /* private, usb core and host controller only fields in the urb */ spinlock_t lock; /* lock for the URB */ atomic_t count; /* reference count of the URB */ + int usage_count; /* nesting depth */ + wait_queue_head_t *unlinkers; /* synchronous unlinkers queue */ void *hcpriv; /* private data for host controller */ struct list_head urb_list; /* list pointer to all active urbs */ int bandwidth; /* bandwidth for INT/ISO request */ --- 2.6/drivers/usb/host/uhci-hcd.c.orig Mon Oct 20 10:19:03 2003 +++ 2.6/drivers/usb/host/uhci-hcd.c Sun Dec 7 11:10:16 2003 @@ -1461,6 +1461,13 @@ return -ENOMEM; } + spin_lock(&urb->lock); + if (urb->unlinkers) { + spin_unlock(&urb->lock); + spin_unlock_irqrestore(&uhci->urb_list_lock, flags); + return -EBUSY; + } + switch (usb_pipetype(urb->pipe)) { case PIPE_CONTROL: ret = uhci_submit_control(uhci, urb, eurb); @@ -1496,17 +1503,21 @@ break; } + urb->status = ret; if (ret != -EINPROGRESS) { /* Submit failed, so delete it from the urb_list */ struct urb_priv *urbp = urb->hcpriv; list_del_init(&urbp->urb_list); + spin_unlock(&urb->lock); spin_unlock_irqrestore(&uhci->urb_list_lock, flags); uhci_destroy_urb_priv (uhci, urb); return ret; } + ++urb->usage_count; + spin_unlock(&urb->lock); spin_unlock_irqrestore(&uhci->urb_list_lock, flags); return 0; --- drivers/usb/core/urb.c.orig Mon Aug 25 09:58:29 2003 +++ drivers/usb/core/urb.c Sun Dec 7 11:36:16 2003 @@ -131,7 +131,8 @@ * host controller driver are finished with the urb. When the completion * function is called, control of the URB is returned to the device * driver which issued the request. The completion handler may then - * immediately free or reuse that URB. + * immediately free or reuse that URB. However if a process is waiting + * for a synchronous unlink of the URB, resubmission will fail. * * For control endpoints, the synchronous usb_control_msg() call is * often used (in non-interrupt context) instead of this call. @@ -220,8 +221,9 @@ return -ENODEV; if (!(op = dev->bus->op) || !op->submit_urb) return -ENODEV; + if (urb->unlinkers) + return -EBUSY; - urb->status = -EINPROGRESS; urb->actual_length = 0; urb->bandwidth = 0; --- drivers/usb/core/hcd.c.orig Fri Sep 5 13:58:05 2003 +++ drivers/usb/core/hcd.c Sun Dec 7 11:34:55 2003 @@ -1128,30 +1128,6 @@ } } -struct completion_splice { // modified urb context: - /* did we complete? */ - struct completion done; - - /* original urb data */ - usb_complete_t complete; - void *context; -}; - -static void unlink_complete (struct urb *urb, struct pt_regs *regs) -{ - struct completion_splice *splice; - - splice = (struct completion_splice *) urb->context; - - /* issue original completion call */ - urb->complete = splice->complete; - urb->context = splice->context; - urb->complete (urb, regs); - - /* then let the synchronous unlink call complete */ - complete (&splice->done); -} - /* * called in any context; note ASYNC_UNLINK restrictions * @@ -1164,8 +1140,11 @@ struct usb_hcd *hcd = 0; struct device *sys = 0; unsigned long flags; - struct completion_splice splice; + wait_queue_head_t wq_head; + wait_queue_t wq; int retval; + int synchronous; + int oldstatus, newstatus; if (!urb) return -EINVAL; @@ -1184,6 +1163,10 @@ spin_lock_irqsave (&urb->lock, flags); spin_lock (&hcd_data_lock); + if (urb->usage_count == 0) { + retval = -EINVAL; + goto done; + } if (!urb->dev || !urb->dev->bus) { retval = -ENODEV; goto done; @@ -1208,16 +1191,6 @@ goto done; } - /* Any status except -EINPROGRESS means something already started to - * unlink this URB from the hardware. So there's no more work to do. - * - * FIXME use better explicit urb state - */ - if (urb->status != -EINPROGRESS) { - retval = -EBUSY; - goto done; - } - /* PCI IRQ setup can easily be broken so that USB controllers * never get completion IRQs ... maybe even the ones we need to * finish unlinking the initial failed usb_set_address(). @@ -1231,59 +1204,59 @@ /* maybe set up to block until the urb's completion fires. the * lower level hcd code is always async, locking on urb->status - * updates; an intercepted completion unblocks us. + * updates. for synchronous unlinks we add ourselves to a wait + * queue; our state will be set back to TASK_RUNNING after the + * completion routine returns. */ + newstatus = -ECONNRESET; + synchronous = 0; + retval = -EINPROGRESS; if (!(urb->transfer_flags & URB_ASYNC_UNLINK)) { + newstatus = -ENOENT; + synchronous = 1; + retval = 0; if (in_interrupt ()) { dev_dbg (hcd->controller, "non-async unlink in_interrupt"); retval = -EWOULDBLOCK; goto done; } - /* synchronous unlink: block till we see the completion */ - init_completion (&splice.done); - splice.complete = urb->complete; - splice.context = urb->context; - urb->complete = unlink_complete; - urb->context = &splice; - urb->status = -ENOENT; - } else { - /* asynchronous unlink */ - urb->status = -ECONNRESET; + + /* Create a wait queue head for us to wait on if the urb + * doesn't already have one. + */ + if (!urb->unlinkers) { + init_waitqueue_head (&wq_head); + urb->unlinkers = &wq_head; + } + set_current_state(TASK_UNINTERRUPTIBLE); + init_waitqueue_entry(&wq, current); + add_wait_queue(urb->unlinkers, &wq); } + + /* Any status except -EINPROGRESS means something already started to + * unlink this URB from the hardware. So we don't have to do it. + */ + if ((oldstatus = urb->status) != -EINPROGRESS) + retval = -EBUSY; + else + urb->status = newstatus; + spin_unlock (&hcd_data_lock); spin_unlock_irqrestore (&urb->lock, flags); - // FIXME remove splicing, so this becomes unlink1 (hcd, urb); - if (urb == (struct urb *) hcd->rh_timer.data) { - usb_rh_status_dequeue (hcd, urb); - retval = 0; - } else { - retval = hcd->driver->urb_dequeue (hcd, urb); - - /* hcds shouldn't really fail these calls, but... */ - if (retval) { - dev_dbg (sys, "dequeue %p --> %d\n", urb, retval); - if (!(urb->transfer_flags & URB_ASYNC_UNLINK)) { - spin_lock_irqsave (&urb->lock, flags); - urb->complete = splice.complete; - urb->context = splice.context; - spin_unlock_irqrestore (&urb->lock, flags); - } - goto bye; - } - } + if (oldstatus == -EINPROGRESS) + unlink1 (hcd, urb); /* block till giveback, if needed */ - if (urb->transfer_flags & URB_ASYNC_UNLINK) - return -EINPROGRESS; - - wait_for_completion (&splice.done); - return 0; + if (synchronous) { + schedule (); + current->state = TASK_RUNNING; + } + return retval; done: spin_unlock (&hcd_data_lock); spin_unlock_irqrestore (&urb->lock, flags); -bye: if (retval && sys && sys->driver) dev_dbg (sys, "hcd_unlink_urb %p fail %d\n", urb, retval); return retval; @@ -1454,7 +1427,8 @@ * completion function. The HCD has freed all per-urb resources * (and is done using urb->hcpriv). It also released all HCD locks; * the device driver won't cause problems if it frees, modifies, - * or resubmits this URB. + * or resubmits this URB. In addition it wakes up tasks waiting on + * the urb->unlinkers queue if urb->usage_count has dropped to 0. */ void usb_hcd_giveback_urb (struct usb_hcd *hcd, struct urb *urb, struct pt_regs *regs) { @@ -1482,6 +1456,15 @@ /* pass ownership to the completion handler */ urb->complete (urb, regs); + + /* wake up synchronous unlinkers if urb->usage_count drops to 0 */ + spin_lock (&urb->lock); + if (--urb->usage_count == 0 && urb->unlinkers) { + wake_up_all (urb->unlinkers); + urb->unlinkers = NULL; + } + spin_unlock (&urb->lock); + usb_put_urb (urb); } EXPORT_SYMBOL (usb_hcd_giveback_urb); ------------------------------------------------------- This SF.net email is sponsored by: IBM Linux Tutorials. Become an expert in LINUX or just sharpen your skills. Sign up for IBM's Free Linux Tutorials. Learn everything from the bash shell to sys admin. Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel