Please configure your email client to wrap lines after 75 columns or so.

On Fri, 27 Apr 2007, Mike Nuss wrote:

> Sometimes upon removing one of our devices (for which we have a custom
> USB driver), OHCI fails to free all the associated resources with the
> device. The problem is always associated with the "IRQ INTR_SF lossage"
> message, which I assume is probably a hardware issue (we are using the
> ZFMicro USB chipset, which has given us other headaches).
>
> Anyways, I realized that our disconnect() method was never getting
> called in these cases, and it's because there is a lockup in
> usb_hcd_endpoint_disable. Our device has two endpoints (plus control) -
> one for reads, and one for writes. When we hit this condition, it
> always hangs while disabling the read endpoint. I added a few lines of
> debug code to ohci-hcd.c and hcd.c to try to figure out what's going on,
> and traced it to the usb_kill_urb on line 1386 of hcd.c.
>
> Apr 27 18:50:43 blademan26 user.info kernel: usb 1-2.2: USB disconnect,
> address 18
> Apr 27 18:50:43 blademan26 user.debug kernel: usb 1-2.2: unregistering device
> Apr 27 18:50:43 blademan26 user.debug kernel: usb 1-2.2: usb_disable_device
> nuking all URBs
> Apr 27 18:50:43 blademan26 user.debug kernel: ohci_hcd 0000:00:13.0: shutdown
> urb c2aac180 pipe 40411280 ep2in-intr
> Apr 27 18:50:47 blademan26 user.warn kernel: ohci_hcd 0000:00:13.0: IRQ
> INTR_SF lossage
> Apr 27 18:50:47 blademan26 user.err kernel: ohci_hcd 0000:00:13.0: leak ed
> c3c5f500 (#82) state 0 (has tds)
> Apr 27 18:50:47 blademan26 user.err kernel: ohci_hcd 0000:00:13.0: free td
> c3e47640
> Apr 27 18:50:47 blademan26 user.err kernel: ohci_hcd 0000:00:13.0: freed
> Apr 27 18:50:47 blademan26 user.debug kernel: ohci_hcd 0000:00:13.0: urb list
> not empty <just after line 1373>
> Apr 27 18:50:47 blademan26 user.debug kernel: ohci_hcd 0000:00:13.0: kill
> urb: c2aac180 status -108 <just before line 1386>
>
> The call to usb_kill_urb never returns. What would cause that to happen?

Look at ohci-hcd.c:ohci_endpoint_disable().
There's a "switch(ed->state)" line; see what it does in the ED_UNLINK
case. It retries for up to 1000 ms, waiting for ed->state to change to
ED_IDLE. That's where the "IRQ INTR_SF lossage" message comes from, if
ed->state never changes.

What is _supposed_ to happen is that the controller hardware should
generate an interrupt at the start of a new frame (which happens every
millisecond). The interrupt handler calls ohci-q.c:finish_unlinks().
Near the start of that routine you'll see a label "skip_ed", which is
referred to once. If the "skip_ed" path isn't taken then control moves
down past the "rescan_this" label, eventually setting ed->state to
ED_IDLE.

So the problem is with finish_unlinks(). Either it isn't getting called,
or it is called but for some reason the "skip_ed" path is always taken.
By adding a private flag (set to true at the start of
usb_hcd_endpoint_disable and set to false at the end) and some printk
statements, you should be able to figure out what the story is.

> It seems like that this point, we know the device is long gone, so there
> should be some way to force the issue.

What matters isn't the device -- it's the USB host controller's internal
state.

Alan Stern
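
To make the two failure cases above concrete, here is a small user-space C
model of the wait on ed->state. It is only a sketch under stated
assumptions, not driver code: every model_* name, the sf_irq_works and
always_skip fields, and the counters are hypothetical stand-ins for the
printk instrumentation suggested above, and the state enum is a simplified
copy rather than the kernel's definition. Each loop iteration stands for
one 1 ms frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for the ED states; not the kernel's definitions. */
enum model_ed_state { MODEL_ED_IDLE, MODEL_ED_UNLINK };

struct model_ed {
    enum model_ed_state state;
    bool always_skip;   /* hypothetical: models the "skip_ed" path always being taken */
};

/* The "private flag": true only while the disable routine is waiting. */
static bool model_in_disable;
/* Counters standing in for printk output gated on the flag. */
static unsigned model_finish_calls;
static unsigned model_skip_hits;

/* Model of what the start-of-frame interrupt handler is expected to do. */
static void model_finish_unlinks(struct model_ed *ed)
{
    assert(ed != NULL);
    if (model_in_disable)
        model_finish_calls++;
    if (ed->always_skip) {
        if (model_in_disable)
            model_skip_hits++;
        return;                     /* "skip_ed" path: state is left alone */
    }
    ed->state = MODEL_ED_IDLE;      /* normal path: the unlink completes */
}

/*
 * Model of the wait: poll for up to 1000 simulated milliseconds.
 * Returns the number of ms waited, or -1 if the state never became idle.
 */
static int model_endpoint_disable(struct model_ed *ed, bool sf_irq_works)
{
    int ms;

    assert(ed != NULL);
    model_in_disable = true;
    model_finish_calls = 0;
    model_skip_hits = 0;

    for (ms = 0; ms < 1000 && ed->state == MODEL_ED_UNLINK; ms++) {
        if (sf_irq_works)           /* one start-of-frame interrupt per ms */
            model_finish_unlinks(ed);
    }
    model_in_disable = false;

    if (ed->state != MODEL_ED_IDLE) {
        if (model_finish_calls == 0)
            printf("lossage: finish_unlinks was never called\n");
        else
            printf("lossage: called %u times, skipped %u times\n",
                   model_finish_calls, model_skip_hits);
        return -1;
    }
    return ms;
}
```

In this model, a wait that times out with zero calls points at interrupt
delivery, while a timeout with as many skips as calls points at the
"skip_ed" test. The real driver would need the equivalent counters or
printk calls placed in finish_unlinks() itself.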