Hi Alan,

On Thursday 20 December 2007, Alan Stern wrote:
> On Thu, 20 Dec 2007, Laurent Pinchart wrote:
> > Sunplus, the chip maker, investigated the problem and here is their
> > explanation:
> >
> > "STALL: The command timing in the linux is much faster than windows
> > system. This situation happens after the USB command is finished after
> > status stage occasionally. In the ISR, when firmware takes care an ACK
> > interrupt, for example an IN_ACK interrupt, firmware will clean the
> > IN_ACK event by “ CBREG_USB_Ep0AckEvt &= ~INTR_USB_EP0_IN_ACK; ”. In
> > this code, CPU will do (1) Read CBREG_USB_Ep0AckEvt to a buffer, (2) Use
> > buffer to do the AND operation, (3) Write the buffer value back to
> > register
> > CBREG_USB_Ep0AckEvt. If a SETUP_ACK event is enabled after between (1)
> > and (3), this event will be clear after (3) is done. This will cause the
> > STALL problem. Firmware will miss the SETUP_ACK event and thinks that a
> > NAK event is received after the USB command is finished and then return a
> > STALL."
> >
> > A related USB trace captured with a USB analyser is available at
> > http://www.irobotique.be/sunplus-trace.jpg. Please note that this trace
> > isn't related to the usbmon trace I sent in my last e-mail.
>
> The events shown in the USB trace don't match the text above.  The text
> describes a STALL response to a SETUP, but the trace shows a STALL
> response to an IN.

I know the information isn't consistent. Unfortunately that's all I've been 
able to get up to now.
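
Setting the trace aside, here is how I read the read-modify-write race in
the Sunplus text, written as a small host-side C simulation rather than real
firmware. The bit values, the plain variable standing in for the register
and the helper names are my own assumptions; only CBREG_USB_Ep0AckEvt and
INTR_USB_EP0_IN_ACK come from their explanation.

```c
#include <stdint.h>

/* Hypothetical bit values: the real register layout is not given. */
#define INTR_USB_EP0_IN_ACK    0x01u
#define INTR_USB_EP0_SETUP_ACK 0x02u

/* Plain variable standing in for the hardware event register. */
static volatile uint8_t CBREG_USB_Ep0AckEvt;

/* Simulates the USB core latching a SETUP_ACK event. */
static void hw_raise_setup_ack(void)
{
    CBREG_USB_Ep0AckEvt |= INTR_USB_EP0_SETUP_ACK;
}

/*
 * Spells out "CBREG_USB_Ep0AckEvt &= ~INTR_USB_EP0_IN_ACK;" as the three
 * steps from the explanation. When setup_arrives_mid_update is nonzero,
 * the SETUP_ACK event is latched between steps (1) and (3).
 * Returns the register value after the write-back.
 */
static uint8_t isr_clear_in_ack(int setup_arrives_mid_update)
{
    uint8_t buf = CBREG_USB_Ep0AckEvt;        /* (1) read register */

    if (setup_arrives_mid_update)
        hw_raise_setup_ack();                 /* event arrives here */

    buf &= (uint8_t)~INTR_USB_EP0_IN_ACK;     /* (2) AND in the buffer */
    CBREG_USB_Ep0AckEvt = buf;                /* (3) write stale copy back */

    return CBREG_USB_Ep0AckEvt;
}
```

With the register initially holding only IN_ACK, isr_clear_in_ack(1) returns
a value with SETUP_ACK cleared: the write in step (3) overwrites the newly
latched bit with the stale copy, which matches the "firmware will miss the
SETUP_ACK event" part of their description.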

> > The explanation seems a bit unclear to me. What I understand is that
> > interrupts can be lost if they arrive at the wrong time (seems like a
> > broken microcontroller to me if you can clear interrupt bits by writing
> > 0).
>
> Yes, that seems to be what they are trying to say.  The end result
> shouldn't be fatal; the next control transfer will time out and fail,
> so the driver should be able to retry it.

We tried patching the driver to retry control transfers 3 times and it didn't 
help. A device reset was required when the problem occurred.
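
The retry logic was roughly of the following shape. This is a simplified,
self-contained sketch and not the actual driver patch: ctrl_xfer_fn,
ctrl_xfer_retry and wedged_device_xfer are hypothetical names, and I'm
counting "3 times" as three attempts in total.

```c
#include <stddef.h>

#define MAX_ATTEMPTS 3

/* Hypothetical transfer callback: returns >= 0 on success, < 0 on error. */
typedef int (*ctrl_xfer_fn)(void *ctx);

/*
 * Calls xfer up to MAX_ATTEMPTS times and stops at the first success.
 * Returns the result of the last attempt; the number of attempts made is
 * stored in *attempts_out when that pointer is non-NULL.
 */
static int ctrl_xfer_retry(ctrl_xfer_fn xfer, void *ctx, int *attempts_out)
{
    int ret = -1;
    int attempts = 0;

    while (attempts < MAX_ATTEMPTS) {
        attempts++;
        ret = xfer(ctx);
        if (ret >= 0)
            break;
    }

    if (attempts_out != NULL)
        *attempts_out = attempts;
    return ret;
}

/* Stand-in for a device that fails every transfer until it is reset. */
static int wedged_device_xfer(void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return -1;
}
```

Against a device in that state every attempt fails the same way, so the
wrapper just returns the last error after the third try. That's what we saw
in practice.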

> > Windows and Linux probably schedule transactions that make a control
> > transfer differently.
>
> I doubt that.  The scheduling is done in the EHCI hardware, not by
> software.  I did go back and look at some old Windows traces; the time
> interval between adjacent packets is extremely small.  I doubt it is
> any slower than Linux.

I'm lost there. I don't know why Windows and Linux timings are different.

> Now maybe Windows has a longer time interval _between_ control
> transfers.  That's certainly possible.  You could get the same effect
> by modifying the Linux driver to add a short delay.

That's not the case. I tried adding a one second delay between control 
transfers and it didn't really help. Instead of failing at the first control 
transfer, the device failed at the second one.

> > Some Linux users are unlucky and have their EHCI controller
> > schedule some packet exactly in the race condition time window.
> >
> > Could you have a look at the trace captured by Sunplus ? You might
> > understand their explanation differently than I do.
>
> It's hard to tell much from a single trace like that.  It would be
> easier if we had two similar traces to compare, one from Linux and one
> from Windows.

I know, but I haven't been able to get that. It's hard to get engineers from 
Logitech and Sunplus to spend a few days working on this issue when it 
doesn't affect Windows users. I'm already very grateful for the time they 
have spent chasing this bug.

The problem has affected many users. It often makes their camera completely 
unusable. I had to start telling them they could just throw their camera 
away :-(

Best regards,

Laurent Pinchart