Please configure your email client to wrap lines after 75 columns or so.

On Fri, 27 Apr 2007, Mike Nuss wrote:

> Sometimes upon removing one of our devices (for which we have a custom
> USB driver), OHCI fails to free all the resources associated with the
> device. The problem is always associated with the "IRQ INTR_SF lossage"
> message, which I assume is probably a hardware issue (we are using the
> ZFMicro USB chipset, which has given us other headaches).
> 
> Anyways, I realized that our disconnect() method was never getting
> called in these cases, and it's because there is a lockup in
> usb_hcd_endpoint_disable.  Our device has two endpoints (plus control) -
> one for reads, and one for writes.  When we hit this condition, it
> always hangs while disabling the read endpoint. I added a few lines of
> debug code to ohci-hcd.c and hcd.c to try to figure out what's going on,
> and traced it to the usb_kill_urb on line 1386 of hcd.c.
> 
> Apr 27 18:50:43 blademan26 user.info kernel: usb 1-2.2: USB disconnect, address 18
> Apr 27 18:50:43 blademan26 user.debug kernel: usb 1-2.2: unregistering device
> Apr 27 18:50:43 blademan26 user.debug kernel: usb 1-2.2: usb_disable_device nuking all URBs
> Apr 27 18:50:43 blademan26 user.debug kernel: ohci_hcd 0000:00:13.0: shutdown urb c2aac180 pipe 40411280 ep2in-intr
> Apr 27 18:50:47 blademan26 user.warn kernel: ohci_hcd 0000:00:13.0: IRQ INTR_SF lossage
> Apr 27 18:50:47 blademan26 user.err kernel: ohci_hcd 0000:00:13.0: leak ed c3c5f500 (#82) state 0 (has tds)
> Apr 27 18:50:47 blademan26 user.err kernel: ohci_hcd 0000:00:13.0: free td c3e47640
> Apr 27 18:50:47 blademan26 user.err kernel: ohci_hcd 0000:00:13.0: freed
> Apr 27 18:50:47 blademan26 user.debug kernel: ohci_hcd 0000:00:13.0: urb list not empty <just after line 1373>
> Apr 27 18:50:47 blademan26 user.debug kernel: ohci_hcd 0000:00:13.0: kill urb: c2aac180 status -108 <just before line 1386>
> 
> The call to usb_kill_urb never returns. What would cause that to happen?

Look at ohci-hcd.c:ohci_endpoint_disable().  There's a "switch(ed->state)"
line; see what it does in the ED_UNLINK case.  It retries for up to 1000
ms, waiting for ed->state to change to ED_IDLE.  That's where the "IRQ
INTR_SF lossage" message comes from, if ed->state never changes.
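
To make the retry behavior concrete, here is a rough userspace model of
that wait loop.  It is a sketch only, not the ohci-hcd.c source: every
name beginning with "model_" or "MODEL_" is made up for illustration,
and each loop iteration stands in for one millisecond of waiting.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's endpoint states. */
enum model_ed_state { MODEL_ED_IDLE, MODEL_ED_UNLINK };

struct model_ed {
	enum model_ed_state state;
};

/* Simulated start-of-frame handler: a working one idles the ED. */
void model_sof_irq(struct model_ed *ed)
{
	ed->state = MODEL_ED_IDLE;
}

/*
 * Poll once per simulated millisecond until the ED goes idle.
 * irq may be NULL to model an interrupt that never arrives.
 * Returns the number of ticks waited, or -1 after 'limit' ticks.
 */
int model_wait_for_idle(struct model_ed *ed, int limit,
			void (*irq)(struct model_ed *))
{
	int waited = 0;

	assert(limit >= 0);
	while (ed->state == MODEL_ED_UNLINK) {
		if (waited == limit) {
			printf("IRQ INTR_SF lossage (model)\n");
			return -1;
		}
		if (irq)
			irq(ed);	/* frame boundary: handler runs */
		waited++;
	}
	return waited;
}
```

Calling model_wait_for_idle() with model_sof_irq returns after a single
tick; calling it with a NULL handler runs out the limit and reports the
lossage, which is the case seen in the log above.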

What is _supposed_ to happen is that the controller hardware should
generate an interrupt at the start of a new frame (which happens every
millisecond).  The interrupt handler calls ohci-q.c:finish_unlinks().  
Near the start of that routine you'll see a label "skip_ed", which is
referred to once.  If the "skip_ed" path isn't taken then control moves
down past the "rescan_this" label, eventually setting ed->state to
ED_IDLE.

So the problem is with finish_unlinks().  Either it isn't getting called, 
or it is called but for some reason the "skip_ed" path is always taken.  
By adding a private flag (set to true at the start of 
usb_hcd_endpoint_disable and set to false at the end) and some printk 
statements, you should be able to figure out what the story is.
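
The instrumentation idea can be sketched in userspace roughly as
follows.  This is only a model under stated assumptions: the "dbg_" and
"model_" names are hypothetical (none exist in the kernel), printf
stands in for printk, and the "skip" argument stands in for whatever
condition sends the real finish_unlinks() down its "skip_ed" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical debug state; none of these names exist in the kernel. */
bool dbg_in_ep_disable;		/* true while endpoint disable runs */
int dbg_calls;			/* finish_unlinks entries seen */
int dbg_skips;			/* times the skip path was taken */

/*
 * Stand-in for finish_unlinks(): reports only while the flag is set.
 * Returns true if the ED would have been moved to idle.
 */
bool model_finish_unlinks(bool skip)
{
	if (dbg_in_ep_disable) {
		dbg_calls++;
		printf("finish_unlinks called (model)\n");
	}
	if (skip) {
		if (dbg_in_ep_disable) {
			dbg_skips++;
			printf("skip_ed path taken (model)\n");
		}
		return false;
	}
	return true;
}

/*
 * Stand-in for endpoint disable: sets the flag at the start and
 * clears it at the end.  'irqs' is how many frame interrupts arrive.
 */
bool model_endpoint_disable(int irqs, bool skip)
{
	bool idle = false;

	assert(irqs >= 0);
	dbg_calls = dbg_skips = 0;
	dbg_in_ep_disable = true;
	while (irqs-- > 0 && !idle)
		idle = model_finish_unlinks(skip);
	dbg_in_ep_disable = false;
	return idle;
}
```

The two failure modes then look different in the counters: with no
interrupts delivered, dbg_calls stays at 0; with interrupts delivered
but the skip path always taken, dbg_calls and dbg_skips are equal and
nonzero.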

> It seems like at this point, we know the device is long gone, so there
> should be some way to force the issue.

What matters isn't the device -- it's the USB host controller's internal 
state.

Alan Stern

