Alan Stern wrote:
>
> On Fri, 4 May 2007, Mike Nuss wrote:
>
> > Further support (I dumped the 'dummy' td before freeing it). This is a
> > new trace so the addresses won't match the last trace.
> >
> > kernel: ohci_hcd 0000:00:13.0: leak ed c3e8e680 (#82) state 0 (has tds)
> > kernel: ohci_hcd 0000:00:13.0: ed->dummy td c3e8d8c0; urb 00000000 index
> > 0; hw next td 03e8d8c0
> > ohci_hcd 0000:00:13.0: info 00000000 CC=0 (CARRY) DI=0 SETUP
> > ohci_hcd 0000:00:13.0: cbp 00000000 be 00000000 (len 0)
> > ohci_hcd 0000:00:13.0: free ed->dummy: td c3e8d8c0
> >
> > ed->dummy has a non-null next pointer! I don't believe that should ever
> > be the case.
>
> No, that's normal. Read ohci-mem.c:td_alloc() and you'll see that each
> newly-allocated TD has its hwNextTD field set to point at itself.
>
> In your previous messages you found that the
>
>         /* INTR_WDH may need to clean up first */
>         if (td->td_dma != head)
>                 goto skip_ed;
>
> path was followed. This indicates that a WDH signal isn't being received;
> the driver isn't getting a notification from the controller that the TD
> stored at 0x3e3d400 has been completed. In fact, maybe the controller
> isn't completing that TD at all -- it doesn't appear to be on the done
> list. But the HeadP pointer gets advanced beyond it.
>
> As I understand it, the controller is always supposed to advance the HeadP
> pointer at the same time as it moves a TD to the done list, and it's
> supposed to generate an IRQ when the done list's head is changed.
>
> Can you get a complete dump of all the ED and TD structures when the error
> occurs? There's code in ohci-dbg.c to do it for you.
>
> Alan Stern
>
It's been a while since I've written about this issue. We were able to sidestep it for our purposes by slightly modifying the OHCI driver to check a "force_unlinks" flag during finish_unlinks. When we detect that one of our devices has not sent any data for the past 30 seconds (the devices now send keepalive packets at least that often), our device driver sets the force_unlinks flag and sends a command to the device to reset itself. The device goes away, but instead of hanging, the OHCI code sees that the flag is set, assumes that the TD it's waiting for is never going to show up on the done list, frees it, and goes on. Then the device finishes resetting and comes back. Obviously this isn't a general-purpose fix, but it hasn't caused any problems for us over the past few months.

However, I seem to have found a way to reproduce the issue more reliably, so I'm still interested in trying to find the source of the problem. Which structures would you like me to dump? If I dump just the affected read endpoint, here's the output:

ohci_hcd 0000:00:13.0: read endpoint, ed c2d912c0 state 0x0 type intr; next ed 00000000
ohci_hcd 0000:00:13.0: info 08405110 MAX=64 DQ SKIP EP=2-IN DEV=16
ohci_hcd 0000:00:13.0: tds: head 02ba7300 DATA0 tail 02ba7300
ohci_hcd 0000:00:13.0: -> td c2ba7340; urb c272ca40 index 0; hw next td 00000000
ohci_hcd 0000:00:13.0: info 02140000 CC=0 DATA0 DI=0 IN R
ohci_hcd 0000:00:13.0: cbp 02dbe37a be 02dbe39f (len 38)

It seems suspicious to me that hwNextTD is null.

Let me know what else I can provide.

Thanks,
Mike
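For reference, here is a rough, standalone sketch of the force_unlinks workaround described above. It is not the actual patch and not the real ohci-hcd code: the sketch_td and sketch_ed structures, the sketch_finish_unlink() function, and the way the flag is stored are all simplified assumptions made for illustration. Only the "td_dma != head" test and the idea of freeing the TD when the flag is set come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the driver's TD and ED; not the real layouts. */
struct sketch_td {
    uint32_t td_dma;          /* bus address of this TD */
    bool freed;               /* set once the sketch "frees" the TD */
};

struct sketch_ed {
    uint32_t head;            /* HeadP value read back from the hardware ED */
    struct sketch_td *first;  /* first TD still queued on this ED, or NULL */
    bool unlinked;            /* set once the ED has been cleaned up */
};

/* Hypothetical flag, set by the device driver after ~30 s without data. */
static bool force_unlinks;

/*
 * Models one ED's pass through a finish_unlinks-style loop.
 * Returns true if the ED was cleaned up, false if it was skipped.
 */
static bool sketch_finish_unlink(struct sketch_ed *ed)
{
    assert(ed != NULL);

    struct sketch_td *td = ed->first;

    if (td != NULL && td->td_dma != ed->head) {
        /*
         * HeadP has moved past this TD, so normally it should turn up
         * on the done list and the ED is skipped until it does.
         */
        if (!force_unlinks)
            return false;     /* corresponds to "goto skip_ed" */

        /* Workaround: assume the TD will never be retired and drop it. */
        td->freed = true;
        ed->first = NULL;
    }

    /* The normal unlink handling is elided in this sketch. */
    ed->unlinked = true;
    return true;
}
```

With force_unlinks clear, an ED whose first TD no longer matches HeadP is skipped on every pass, which is the hang we were seeing. With the flag set, the TD is dropped and the ED is cleaned up.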