> From: "Pandita, Vikram" <[EMAIL PROTECTED]>
>
> On kernel 2.6.24-rc5(from omap-git) I am testing the EHCI controller on
> OMAP34xx.
> I have connected the device (netchip2280 + g_zero.ko ) to the EHCI controller.
>
> Device side test:
> On the gadget zero I run the IN test 6: ./testusb -a -t6 -s1024 -c100
>
> OMAP EHCI HOST status:
> On EHCI I can see that, the test does not complete sometimes.
>
> The status of EHCI registers is as follows and the test does not complete.
> The system is still up but no USB activity is happening.

What do you mean? That you're watching the traffic on the wire, and nothing at all is happening? (I'm sure TI has a few sniffers around, and if your lab is doing much USB stuff it should easily be able to get its own. TotalPhase has one at $US 1200 now...)

> # cat /sys/class/usb_host/usb_host1/async
> qh/ffc00100 dev3 hs ep1 42002103 40000000 (00001d00 data0 nak4)

That token looks suspicious. Notice the low byte of all-zeroes, including neither an "active" flag nor an error flag ... that seems like a "should not happen". Hence something the driver doesn't know what to do with ... qh_completions() will see that it's not active, qtd_copy_status() thin

If that's typical, you've got the start of a good handle on an interesting puzzle. (And I'd expect that there is indeed no traffic on the wire, not even stuff that's getting NAKed.)

You could add some diagnostic code to the QH scanning to handle that case.

> ffc01840+in len=1024 04000d80 urb c7d9b6a0
> ffc018a0#in len=1024 04000d80 urb c7d9b620
> ffc01900#in len=1024 04000d80 urb c7d9b5a0
> ffc01960#in len=1024 04000d80 urb c7d9b520
> ffc019c0#in len=1024 04000d80 urb c7d9b4a0
> ffc01a20#in len=1024 04000d80 urb c7d9b420
> ffc01a80#in len=1024 04008d80 urb c7d9b3a0
>
> I suspect the problem with EHCI on OMAP34xx,
> as the same netchip device setup with Dell PC EHCI works fine

It could be a bug in the EHCI implementation you're using. (Can you say whose?) But I wouldn't assume that's the most likely case; after all, EHCI implementations should be mature by now.

One small clue to be aware of: the fact that it works on "fast" hardware doesn't mean there's not a race lurking that could show up on slower hardware. Until its guts were more or less rewritten for the 2.6 kernel series, even OHCI had **lots** of little micro-races that would only show up on embedded ARM cores, not on PCs. It was a long slog through that code to find and fix them.
And while the OMAP3 series may be faster than previous OMAPs, it's not as fast as most five-year-old Dells waiting to be recycled ... so it'd be a good place to notice races where a relatively-faster EHCI would cause problems. I'm not saying there *are* such races, but just that the existence of one wouldn't surprise me. (However, given some other oddball reports, I do kind of expect one...) EHCI is fast enough to make such races show up more frequently than you might think.

The fix might be as simple as adding a missing memory barrier, but such stuff can be tricky to find. You're fortunate to have what seems to be a reproducible fault mode!

> Any help appreciated? How to go about debugging EHCI side?

First make sense of that partial register dump. Translate that async schedule into English and tell us what it says about, for example, the token bits in that QH. Does the QTD overlay area agree with the QTD itself? Now, how could it have gotten that way? (Remembering the QH is in a dma-coherent memory region...)

And what was the last traffic to occur *before* it stopped?

Maybe you can hook up trace hardware to your OMAP board so it snoops all access (or at least, all CPU access) to the QH pool, and notice when this trouble case shows up.

- Dave
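
As a starting point for translating the dump into English, here is a minimal userspace sketch that splits a token dword into its fields. The bit positions are taken from the qTD token layout in the EHCI specification; the names decode_token() and no_active_or_error() are made up for illustration and are not part of ehci-hcd. Whether a token with neither "active" nor an error bit is really abnormal depends on the rest of the QH state, so treat the second helper as a hint about what to look at, not a verdict.

```python
# Sketch only: decode an EHCI qTD/QH-overlay token dword as printed in the
# sysfs "async" dump. Helper names here are hypothetical, not driver code.

STATUS_BITS = [
    (0x80, "active"),
    (0x40, "halted"),
    (0x20, "data-buffer-error"),
    (0x10, "babble"),
    (0x08, "xact-error"),
    (0x04, "missed-uframe"),
    (0x02, "split-state"),
    (0x01, "ping-state"),
]

PIDS = {0: "out", 1: "in", 2: "setup"}


def decode_token(token):
    """Split a 32-bit token into the fields defined by the EHCI spec."""
    status = token & 0xFF
    return {
        "toggle": (token >> 31) & 1,          # data toggle
        "bytes": (token >> 16) & 0x7FFF,      # total bytes left to transfer
        "ioc": bool(token & (1 << 15)),       # interrupt on complete
        "c_page": (token >> 12) & 0x7,        # current buffer page
        "cerr": (token >> 10) & 0x3,          # error retry counter
        "pid": PIDS.get((token >> 8) & 0x3, "reserved"),
        "flags": [name for bit, name in STATUS_BITS if status & bit],
    }


def no_active_or_error(token):
    """True when active, halted, and all error bits (7..2) are clear."""
    return (token & 0xFC) == 0


if __name__ == "__main__":
    # Values copied from the dump quoted in this thread.
    for label, tok in [("QH overlay", 0x00001D00),
                       ("queued qTD", 0x04000D80),
                       ("last qTD", 0x04008D80)]:
        print(label, hex(tok), decode_token(tok), no_active_or_error(tok))
```

Run against the values in the quoted dump, the QH overlay token 00001d00 comes out as an IN token with zero bytes remaining, cerr 3, and no status flags at all, while the queued qTDs (04000d80) are active IN transfers of 1024 bytes each, and the last one (04008d80) additionally has interrupt-on-complete set.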
