On Wed, 26 Jul 2006, David Brownell wrote: > On Wednesday 26 July 2006 8:29 am, Alan Stern wrote: > > On Wed, 26 Jul 2006, David Brownell wrote: > > > > > > The patch uses spin_lock_irqsave() and spin_unlock_irqrestore() along > > > > with > > > > a call to wait_event(). What happens if you end up calling > > > > wait_event() > > > > with interrupts disabled? Would it be better simply to spin? > > > > > > You're ignoring that ugly > > > > > > if (in_interrupt()) > > > return 0; > > > > You're forgetting that in_interrupt() is different from irqs_disabled(). > > In fact, either may be true independent of the other. > > Although I don't think any of the current HCDs allow IRQs to be enabled > while their handlers run ... so in_interrupt() implies irqs_disabled(). > In the more general case your observation is of course correct.
What the HCDs do is irrelevant. My point was that it's silly to call spin_lock_irqsave() and spin_unlock_irqrestore() in the same code block with wait_event(). The in_interrupt() test doesn't help; it won't prevent wait_event() from being called with interrupts disabled. And if you did add an irqs_disabled() test then there wouldn't be any need for the _irq{save|restore}. Furthermore, the fact that in_interrupt() implies irqs_disabled() while inside an HCD's handler is also irrelevant. We might be inside some other handler, or not in any IRQ handler at all. > > > Arguably the best thing to do on that branch -- control requests -- would > > > be to make that the equivalent of "if (1)". Control requests to root > > > hubs are always synchronous (and often done with IRQs disabled), > > > > Says who? Can you point to any examples of control requests to root hubs > > that are done with interrupts disabled? > > EHCI, OHCI, and UHCI all need to implement those requests with IRQs disabled, > since they need to access registers protected by spin_lock_irqsave() ... Misunderstanding... When you said the control requests to root hubs are often _done_ with IRQs disabled, I thought you meant the callers often left interrupts disabled while _submitting_ the requests. But you really meant they are _implemented_ with interrupts disabled, which is true. I agree that the cleanest way of handling unlink for root-hub control requests is to do nothing at all. But this would lead to the problem described below. > > > so the > > > worst that would happen is that some other CPU would give the urb back > > > shortly after that routine returns, rather than shortly before it does so. > > > > That's _not_ the worst. The code used to be that way, and I changed it > > after it caused a crash. > > I'm not following. Which way -- equivalent to if(1)? I forget exactly what the older code did, but I am certain that it did not include the sections copied from usb_kill_urb() -- that is, it did not wait for the control URB to complete. That's the part I added. > > The problem came up when there was a control URB running while the HCD > > module was unloaded. hcd_endpoint_disable() didn't wait for the URB to > > complete, the driver was unloaded from memory, and boom! > > Hmm, the if(1) should leave the urb in ep->urb_list so hcd_endpoint_disable() > would go through the loop again, retrying until the urb completed. Or was > there something else going on? (It's late, sigh.) Yes, the URB remained in ep->urb_list and hcd_endpoint_disable() did see it when it went through the loop again. But urb->status had been changed to -ESHUTDOWN, so hcd_endpoint_disable() simply skipped over it and did not keep retrying to unlink it. (It would be a bug for usbcore to call a dequeue method more than once for the same URB.) The real problem was this: hcd_endpoint_disable() relies on hcd->driver->endpoint_disable() to wait until all URBs for an endpoint have completed. Normally that's fine, but it doesn't work with URBs for endpoints on the root hub -- because these URBs never go through the hcd->driver->enqueue() pathway to begin with. So we needed another way to wait for them. My solution was to wait in usb_rh_urb_dequeue(). Maybe there's a better way. > > Nowadays we can prevent the crash with appropriate refcounting. But it > > could still leave us calling hcd->driver->stop() while the control URB was > > running. Do you think it would be safe, given how non-standardized those > > HCD methods are? > > I'm missing something. Refcounting wouldn't solve that; the issue is that > the URB wasn't yet given back, and thus is on ep->urb_list (with a refcount, > sure, but _because_ it's on the list and thus still busy). So stop() wouldn't > be called ... if HCDs don't behave the same, that's clearly trouble. No, you're confused. (It must have been _very_ late.) Note that hcd_endpoint_disable() doesn't wait for ep->urb_list to be empty. It merely waits for hcd->driver->endpoint_disable() to return. Refcounting would solve the problem of the HCD's module being unloaded from memory while its code was still in use. It would also solve the related problem of the HCD's private data structure being freed while still in use. But if hcd_endpoint_disable() fails to wait for root-hub URBs to complete, then there is a possibility that stop() could be called while an URB was still in progress. (Actually, improvements to the hub driver have made this much less likely. Still, it's something to be aware of.) Alan Stern ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel