On 18/10/19(Fri) 14:55, Stefan Sperling wrote:
> On Fri, Oct 18, 2019 at 12:54:06PM +0200, Stefan Sperling wrote:
> > On Fri, Oct 18, 2019 at 11:37:01AM +0200, Martin Pieuchot wrote:
> > > The question here is how an `xfer' with a status of NORMAL_COMPLETION
> > > can be found in the pipe's queue?
> >
> > I don't know yet.
>
> I have managed to track it down after realizing that the xfer which
> triggers the problem is allocated very early on, in ulpt_open().
>
> It it always is one of these (from ulpt.c):
>
> sc->sc_in_xfer1 = usbd_alloc_xfer(sc->sc_udev);
> sc->sc_in_xfer2 = usbd_alloc_xfer(sc->sc_udev);
>
> Our friend which queues an xfer to the pipe after an error occured
> is this xfer callback which lacks any error checking whatsoever:
>
> static void
> ulpt_input(struct usbd_xfer *xfer, void *priv, usbd_status status)
> {
> struct ulpt_softc *sc = priv;
>
> DPRINTFN(2,("ulpt_input: got some data\n"));
> /* Do it again. */
> if (xfer == sc->sc_in_xfer1)
> usbd_transfer(sc->sc_in_xfer2);
> else
> usbd_transfer(sc->sc_in_xfer1);
> }
>
>
> Proof that the problematic pipe is ulpt's sc->sc_in_pipe:
>
> ulptwrite: transfer 16384 bytes
> ulpt_input: xfer=0xfffffd811f8da0f0 status=0
> (at this point sc_in_xfer1 has completed, and sc_in_xfer2 is queued)
> xhci0: txerr? code 4
> ulpt_input: xfer=0xfffffd811f8da2d0 status=13
> (at this point sc_in_xfer2 has failed, and sc_in_xfer1 is queued)
> usb0: usb_needs_explore
> (the USB stack tries to detach the printer which has been turned off)
> usb_add_task: task=0xffff800000093e80 state=0 type=1
> usb_explore: usb0
> usbd_do_request_flags
> usbd_do_request_flags: new xfer=0xfffffd811f8da690
> usbd_do_request_flags
> usbd_do_request_flags: new xfer=0xfffffd811f8da690
> ulpt_detach: sc=0xffff800000e63600
> ulpt_detach: aborting pipe=0xffff800000ddd000
> usbd_abort_pipe: pipe=0xffff800000ddd000
> ulpt_detach: aborting pipe=0xffff800000dbc000
> (ulpt attempts to abort sc->sc_in_pipe which has sc_in_xfer1 queued)
> usbd_abort_pipe: pipe=0xffff800000dbc000
> usbd_abort_pipe: pipe=0xffff800000dbc000 xfer=0xfffffd811f8da0f0
> (methods=0xffffffff81f41508)
> xhci_abort_xfer: xfer=0xfffffd811f8da0f0 status=NORMAL_COMPLETION
> err=CANCELLED actlen=23 len=64 idx=-1
> xhci_abort_xfer: already done
> (endless loop as discussed before)
>
> So I would suggest this diff:
As soon as a transfer with an error condition is passed to
usb_transfer_complete(), usbd_start_next() won't be called.
If the callback of that transfer had already enqueue a new xfer, via
usbd_transfer(), it will stay on the queue until the pipe is aborted.
Currently all our drivers call usbd_transfer() without reinitializing
`xfer->status' and that confuses *hci's abort() methods.
Does the diff below help?
Index: usbdi.c
===================================================================
RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.101
diff -u -p -r1.101 usbdi.c
--- usbdi.c 6 Oct 2019 17:11:51 -0000 1.101
+++ usbdi.c 18 Oct 2019 15:11:29 -0000
@@ -294,6 +294,7 @@ usbd_transfer(struct usbd_xfer *xfer)
usbd_dump_queue(pipe);
#endif
xfer->done = 0;
+ xfer->status = USBD_NOT_STARTED;
if (pipe->aborting)
return (USBD_CANCELLED);