On Thursday 05 July 2007, Mike Nuss wrote:
> Mike Nuss wrote:
> > David Brownell wrote:
> >> Hmm.  Here's a theory.  The way that the current code unlinks
> >> an ED is to set the SKIP bit *AND* remove the ED from the relevant
> >> part of the schedule.
> >>
> >> Maybe ... the hardware gets confused when the ED doesn't seem
> >> to be on the relevant list.  Like maybe it expects it to stay
> >> at the head of the ED list.  ISTR some silicon doesn't much
> >> like to see null pointers written into the hardware registers,
> >> and I know for a fact that the ed_deschedule logic was always
> >> a bit racy.
> >>
> >> That suggests that the safest route would be setting SKIP for
> >> one frame (so all pending TDs get properly retired), and
> >> only *THEN* taking it out of the queue.
> 
> There are three time periods in question.
> 
> A = before there is any problem
> 
> B = a read seems to have completed, the HC has advanced HeadP, but
> failed to put the completed TD on the donelist. At this point, something
> is seriously wrong, but ohci-hcd has no way of knowing this.
> 
> C = the device is removed. Then ohci-hcd hangs while waiting to free the
> outstanding TD, which will never happen since the HC thinks the TD is
> already finished.
> 
> The only cases in which I have observed SKIP being set are after time C.
> So I don't think that the descheduling code is responsible for the problem.
> 
> As sort-of good news, I ran some more tests today and now I know for
> *sure* that the transfer really did complete, and I can even get
> transfers going again. After time B, my device driver can detect that
> there's a problem just by the fact that no new data has come in for 30
> seconds. So, I added a new hook to ohci-q that "pretends" that the TD
> really did come back through the donelist, and goes through the normal
> TD completion routine. My driver calls this hook when it detects that
> the 30 second timeout has expired. The URB completes normally, with all
> the data intact. Then it's resubmitted for the next read, and everything
> works normally again.
> 
> I'm at a loss. It really looks like the HC just "skipped" a step for no
> reason.
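
To make the B state concrete, here is a rough userspace sketch of the
condition described above: HeadP has moved past a TD that never showed
up on the donelist, and the driver only notices because no data has
arrived for 30 seconds. All of the names (td_model, count_lost_tds,
read_stalled) are made up for illustration; this is not ohci-hcd code
and not the hook mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one TD on an endpoint queue. */
struct td_model {
	bool on_donelist;	/* HC reported this TD via the donelist */
};

/*
 * Count TDs that the HC has moved past (index < head_index, meaning
 * HeadP has advanced beyond them) but that never reached the donelist.
 * In a healthy queue this is always zero.
 */
static size_t count_lost_tds(const struct td_model *q, size_t head_index)
{
	size_t lost = 0;

	for (size_t i = 0; i < head_index; i++)
		if (!q[i].on_donelist)
			lost++;
	return lost;
}

/* Driver-side stall test: no new data for timeout_s seconds. */
static bool read_stalled(unsigned long now_s, unsigned long last_rx_s,
			 unsigned long timeout_s)
{
	return now_s - last_rx_s >= timeout_s;
}
```

A nonzero count together with an expired timeout is the situation where
"pretending" the TD completed would be the only way to get the URB back.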

I forget ... did you already try removing that special case
at the top of the IRQ handler, where it checks the low bit
of hcca->done_head?  That check is a speedup (it saves a
register read), but there's at least one implementation that
doesn't support that part of the OHCI spec correctly.  (SA1111,
which is ancient and out of production, but still ... there
could be others.)
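
For reference, a rough standalone model of that special case, written
from memory rather than copied from ohci-hcd. The function name and the
use_shortcut flag are invented for illustration; the bit positions are
the HcInterruptStatus ones from the OHCI spec.

```c
#include <stdbool.h>
#include <stdint.h>

#define INTR_WDH	(1u << 1)	/* WritebackDoneHead */
#define INTR_RHSC	(1u << 6)	/* RootHubStatusChange */

/*
 * Hypothetical model of the decision at the top of the IRQ handler.
 * Per the spec, the HC sets bit 0 of HccaDoneHead when some other
 * interrupt event is pending besides the done-head writeback.  If
 * done_head is nonzero and bit 0 is clear, the shortcut assumes WDH
 * is the only event and skips reading the status register.
 */
static uint32_t irq_events(uint32_t done_head, uint32_t intrstatus,
			   bool use_shortcut)
{
	if (use_shortcut && done_head != 0 && !(done_head & 0x01))
		return INTR_WDH;	/* trust the low bit */
	return intrstatus;		/* always read the register */
}
```

If the silicon never sets that low bit, the shortcut path reports only
WDH and any other pending event is missed on that interrupt. Taking the
shortcut out (use_shortcut = false in this model) costs a register read
per IRQ but doesn't depend on the HC getting that bit right.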

- Dave


_______________________________________________
linux-usb-devel@lists.sourceforge.net