On Tue, 2 Mar 2004, David Brownell wrote:

> If that "a5a5a5.." recurs, you might try adding 64 or 128 bytes
> to dmapool.c "struct dma_page", forcing those structs into a
> different slab cache "one not getting trashed".

Haven't seen it again. Currently my guess is that the control_msg to a
stack buffer triggered it somehow, but that's just speculation.

> > it seems it never happens when the device is
> > connected to a SiS USB-1.1 OHCI-HC! So it might even be a silicon issue
> > with the nec-ohci. Or the pci-interface of this guy is somewhat faster
> > exposing some race with the hcd.
>
> If you add a udelay() in your urb completion callback when
> the urb reports an unlink, does that make the problem vanish?
> Or printk("S") as I did.

Test in progress. printk("S") was not sufficient to get rid of it.
Currently I'm using udelay(100) and so far it hasn't appeared. But it
needs more testing ....

.... happened again while I was typing. Seems we would need some very
long delay, if it would help at all!

> I could imagine that "misc OHCI updates" patch of a few weeks
> back speeding up an OHCI implementation because it had less
> work to do (no periodic schedule dma), exposing such a race.

Hm, obviously I've missed something. Since the completion callback is
called from the interrupt, my understanding is that, at least on UP, the
only thing we could race with is the HC itself. But why would a delay in
the completion callback help then? My assumption is that all the critical
stuff is done before the HCD decides to give back the urb.

> I doubt it; the only values the HC should see as _changing_ are
> ed->hwTailP, in memory. Those PCI writes are just to make sure
> the HC _eventually_ reads the changed schedule, in case it wasn't
> already planning to do so. But if you like, have the tail end
> of that routine do a readl().

Tried with readl(&ohci->regs->control) at the end of td_submit_urb; it
didn't make any change for me. Guess you are right and it's something
else.

> It's the finish_unlinks() code that I suspect. That's a racey
> concept, and bugs still turn up from time to time. So long as
> there are two separate completion paths, I'll suspect more bugs
> are lurking.

Ok, that might explain it.

> > As you can see from the scheme above, there are almost always bulk-reads
> > submitted. Usually they will complete very fast because the device is
> > (mis-) designed to return actual_length==0 instead of NAK when there is no
>
> And there I was thinking that "test 11" logic was completely
> un-representative of when any sane driver would unlink! :)
> It hasn't been getting run much lately, it seems. (Can't OSDL
> start doing that, or something?)

In fact I've just tried running the usb testsuite against the test
firmware on ezusb-fx. Yes, with ohci-hcd test11 and test12 are failing
in exactly the same way: a bad entry with one bulk and the
almost-all-zero entry in the hash. ehci-hcd passes the whole testsuite.
I haven't used the test setup for quite some time, so it's hard to tell
when/if it was last working. IMHO this suggests that whatever the issues
with the stir4200 might be, the symptoms exist independently.

Right, test11 looks like it would probably trigger similar paths. Good,
we have an independent testcase.

> > So it could be related to the bulk-unlink?
>
> For me, yes. But yours was an interrupt transfer, yes?
> Synch vs async interrupt seems to be no issue, which is good.

Nope, it was interrupt in the past. Stephen changed this to bulk because
he had problems with uhci. Meanwhile he has made another update with the
resubmit-timer removed again, i.e. the (only) rx-urb is now resubmitted
directly from the completion callback. So far it seems this version does
not trigger the bad entry issue for me.

> The hwNextTD unalignment looks like suspiciously like (ED_C|ED_H),
> which sometimes the finish_unlinks() code needs to patch. And several
> of those "bad" entries pointed back to themselves ... yikes!
>
> DATA1 could come from that just being the toggle when the HC wrote
> back that word; it's in ED_C.

So it could be a race when unlinking the last TD from the ED and
removing the ED from the active schedule?

Btw, while talking about the usbtest-firmware.
I'm almost done with adding the ISO support. It seems it might even
work ;-) So far I've tried with ohci_hcd and a slightly modified version
of usbtest. No luck however with ehci_hcd. First I got nothing but
ENOSYS because the split-iso support is disabled in 2.6.4-rc1. Enabling
it didn't help because it didn't build, due to some missing sitd_*
functions. I guess I'd need some patches?

Martin
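
P.S. To make the (ED_C|ED_H) remark above concrete, here is a minimal
userspace sketch of what "unaligned hwNextTD" would mean, not the actual
ohci-hcd code. It assumes the OHCI layout of the ED head pointer word
(bit 0 = halted, bit 1 = toggle carry, bits 31..4 = TD address) and that
the word has already been converted to CPU byte order. HEADP_PTR_MASK,
struct headp_fields, decode_headp() and next_td_has_ed_flags() are
hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Flag bits in the low nibble of an ED head pointer word, per the OHCI
 * HeadP layout. The names match the ones used in this thread. */
#define ED_H 0x00000001u           /* halted */
#define ED_C 0x00000002u           /* toggle carry (DATA0/DATA1) */

/* Assumption: TD addresses are at least 16-byte aligned, so bits 3..0
 * of a pure pointer are always zero. */
#define HEADP_PTR_MASK 0xfffffff0u

struct headp_fields {
	uint32_t td_dma;           /* TD address with flag bits stripped */
	int halted;                /* 1 if ED_H is set */
	int toggle_carry;          /* 1 if ED_C is set (i.e. DATA1) */
};

/* Split a head pointer word into its address and flag parts. */
static struct headp_fields decode_headp(uint32_t headp)
{
	struct headp_fields f;

	f.td_dma = headp & HEADP_PTR_MASK;
	f.halted = (headp & ED_H) != 0;
	f.toggle_carry = (headp & ED_C) != 0;
	return f;
}

/* A TD's next-TD link should be a plain aligned pointer. Return 1 if its
 * low bits are non-zero and consist only of ED_C/ED_H, which would hint
 * that a head pointer word was copied there without masking. */
static int next_td_has_ed_flags(uint32_t next_td)
{
	uint32_t low = next_td & ~HEADP_PTR_MASK;

	return low != 0 && (low & ~(ED_C | ED_H)) == 0;
}
```

Running something like next_td_has_ed_flags() over the dumped "bad"
entries would only tell us whether the stray low bits are consistent
with C/H flags; it would not say which code path put them there.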