Re: [linux-usb-devel] FW: OHCI hangs after failing to free resources

Mike Nuss Wed, 27 Jun 2007 19:01:20 -0700

Alan Stern wrote:
>
> On Fri, 4 May 2007, Mike Nuss wrote:
>
> > Further support (I dumped the 'dummy' td before freeing it). This is a
> > new trace so the addresses won't match the last trace.
> >
> > kernel: ohci_hcd 0000:00:13.0: leak ed c3e8e680 (#82) state 0 (has tds)
> > kernel: ohci_hcd 0000:00:13.0: ed->dummy td c3e8d8c0; urb 00000000 index 0; hw next td 03e8d8c0
> > ohci_hcd 0000:00:13.0:      info 00000000 CC=0 (CARRY) DI=0 SETUP
> > ohci_hcd 0000:00:13.0:      cbp 00000000 be 00000000 (len 0)
> > ohci_hcd 0000:00:13.0: free ed->dummy: td c3e8d8c0
> >
> > ed->dummy has a non-null next pointer! I don't believe that should ever
> > be the case.
>
> No, that's normal.  Read ohci-mem.c:td_alloc() and you'll see that each
> newly-allocated TD has its hwNextTD field set to point at itself.
>
> In your previous messages you found that the
>
>                 /* INTR_WDH may need to clean up first */
>                 if (td->td_dma != head)
>                         goto skip_ed;
>
> path was followed.  This indicates that a WDH signal isn't being received;
> the driver isn't getting a notification from the controller that the TD
> stored at 0x3e3d400 has been completed.  In fact, maybe the controller
> isn't completing that TD at all -- it doesn't appear to be on the done
> list.  But the HeadP pointer gets advanced beyond it.
>
> As I understand it, the controller is always supposed to advance the HeadP
> pointer at the same time as it moves a TD to the done list, and it's
> supposed to generate an IRQ when the done list's head is changed.
>
> Can you get a complete dump of all the ED and TD structures when the error
> occurs?  There's code in ohci-dbg.c to do it for you.
>
> Alan Stern
>


It's been a while since I've written about this issue. We were able to
sidestep it for our purposes by slightly modifying the OHCI driver to
check a "force_unlinks" flag during finish_unlinks.

Our device driver sets that flag when it detects that one of our devices
has not sent any data for the past 30 seconds (the devices now send
keepalive packets at least that often), and then sends the device a
command to reset itself. The device goes away, but instead of hanging,
the OHCI code sees that the flag is set, assumes that the TD it's
waiting for is never going to show up on the donelist, frees it, and
goes on. Then the device finishes resetting and comes back.
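
To make the workaround concrete, here is a rough standalone sketch of
the two checks, not the actual patch. The struct and function names
(sketch_td, keepalive_expired, sketch_unlink_decision) are made up for
illustration, the flag is passed as a plain parameter because where it
lives in the driver isn't shown here, and the timestamps are plain
seconds rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the driver's TD; NOT the real struct td. */
struct sketch_td {
	uint32_t td_dma;	/* bus address of this TD */
};

enum unlink_action { UNLINK_SKIP_ED, UNLINK_FREE_TD };

/*
 * Device-driver side (hypothetical helper): true once no data has been
 * seen for timeout_s seconds. Unsigned subtraction keeps the comparison
 * correct if the seconds counter wraps around.
 */
static bool keepalive_expired(uint32_t last_rx_s, uint32_t now_s,
			      uint32_t timeout_s)
{
	return (uint32_t)(now_s - last_rx_s) >= timeout_s;
}

/*
 * OHCI side (hypothetical helper): the decision made for a TD during
 * finish_unlinks. Without the flag, a TD that is not at the ED's head
 * is left alone on the assumption that INTR_WDH will clean it up first.
 * With the flag set, the TD is assumed never to reach the donelist and
 * is freed immediately.
 */
static enum unlink_action sketch_unlink_decision(uint32_t head_dma,
						 const struct sketch_td *td,
						 bool force_unlinks)
{
	assert(td != NULL);

	if (td->td_dma != head_dma && !force_unlinks)
		return UNLINK_SKIP_ED;	/* wait for the donelist */
	return UNLINK_FREE_TD;
}
```

In this sketch the device driver would set force_unlinks once
keepalive_expired(last_rx, now, 30) returns true and then send the
reset command; the flag only changes the outcome in the case where the
TD's address differs from the head pointer.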

Obviously this isn't a general purpose fix, but it hasn't caused any
problems for us over the past few months. However, I seem to have found
a way to reproduce the issue more reliably, so I'm still interested in
trying to find the source of the problem. Which structures would you
like me to dump? If I dump just the affected read endpoint, here's the
output:

ohci_hcd 0000:00:13.0: read endpoint, ed c2d912c0 state 0x0 type intr; next ed 00000000
ohci_hcd 0000:00:13.0:   info 08405110 MAX=64 DQ SKIP EP=2-IN DEV=16
ohci_hcd 0000:00:13.0:   tds: head 02ba7300 DATA0 tail 02ba7300
ohci_hcd 0000:00:13.0:   -> td c2ba7340; urb c272ca40 index 0; hw next td 00000000
ohci_hcd 0000:00:13.0:      info 02140000 CC=0 DATA0 DI=0 IN R
ohci_hcd 0000:00:13.0:      cbp 02dbe37a be 02dbe39f (len 38)

It seems suspicious to me that hwNextTD is null.

Let me know what else I can provide.

Thanks,
Mike

