Re: xhci(4): fix endless loop when transfer is aborted

Martin Pieuchot Fri, 18 Oct 2019 08:16:09 -0700

On 18/10/19(Fri) 14:55, Stefan Sperling wrote:
> On Fri, Oct 18, 2019 at 12:54:06PM +0200, Stefan Sperling wrote:
> > On Fri, Oct 18, 2019 at 11:37:01AM +0200, Martin Pieuchot wrote:
> > > The question here is how an `xfer' with a status of NORMAL_COMPLETION
> > > can be found in the pipe's queue?
> > 
> > I don't know yet.
> 
> I have managed to track it down after realizing that the xfer which
> triggers the problem is allocated very early on, in ulpt_open().
> 
> It it always is one of these (from ulpt.c):
> 
>               sc->sc_in_xfer1 = usbd_alloc_xfer(sc->sc_udev);
>               sc->sc_in_xfer2 = usbd_alloc_xfer(sc->sc_udev);
> 
> Our friend which queues an xfer to the pipe after an error occured
> is this xfer callback which lacks any error checking whatsoever:
> 
> static void
> ulpt_input(struct usbd_xfer *xfer, void *priv, usbd_status status)
> {
>       struct ulpt_softc *sc = priv;
> 
>       DPRINTFN(2,("ulpt_input: got some data\n"));
>       /* Do it again. */
>       if (xfer == sc->sc_in_xfer1)
>               usbd_transfer(sc->sc_in_xfer2);
>       else
>               usbd_transfer(sc->sc_in_xfer1);
> }
> 
> 
> Proof that the problematic pipe is ulpt's sc->sc_in_pipe:
> 
> ulptwrite: transfer 16384 bytes
> ulpt_input: xfer=0xfffffd811f8da0f0 status=0
>   (at this point sc_in_xfer1 has completed, and sc_in_xfer2 is queued)
> xhci0: txerr? code 4
> ulpt_input: xfer=0xfffffd811f8da2d0 status=13
>  (at this point sc_in_xfer2 has failed, and sc_in_xfer1 is queued)
> usb0: usb_needs_explore
>  (the USB stack tries to detach the printer which has been turned off)
> usb_add_task: task=0xffff800000093e80 state=0 type=1
> usb_explore: usb0
> usbd_do_request_flags
> usbd_do_request_flags: new xfer=0xfffffd811f8da690
> usbd_do_request_flags
> usbd_do_request_flags: new xfer=0xfffffd811f8da690
> ulpt_detach: sc=0xffff800000e63600
> ulpt_detach: aborting pipe=0xffff800000ddd000
> usbd_abort_pipe: pipe=0xffff800000ddd000
> ulpt_detach: aborting pipe=0xffff800000dbc000
>   (ulpt attempts to abort sc->sc_in_pipe which has sc_in_xfer1 queued)
> usbd_abort_pipe: pipe=0xffff800000dbc000
> usbd_abort_pipe: pipe=0xffff800000dbc000 xfer=0xfffffd811f8da0f0 
> (methods=0xffffffff81f41508)
> xhci_abort_xfer: xfer=0xfffffd811f8da0f0 status=NORMAL_COMPLETION 
> err=CANCELLED actlen=23 len=64 idx=-1
> xhci_abort_xfer: already done
>   (endless loop as discussed before)
> 
> So I would suggest this diff:


As soon as a transfer with an error condition is passed to
usb_transfer_complete(), usbd_start_next() won't be called.

If the callback of that transfer had already enqueue a new xfer, via
usbd_transfer(), it will stay on the queue until the pipe is aborted.

Currently all our drivers call usbd_transfer() without reinitializing
`xfer->status' and that confuses *hci's abort() methods.

Does the diff below help?

Index: usbdi.c
===================================================================
RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.101
diff -u -p -r1.101 usbdi.c
--- usbdi.c     6 Oct 2019 17:11:51 -0000       1.101
+++ usbdi.c     18 Oct 2019 15:11:29 -0000
@@ -294,6 +294,7 @@ usbd_transfer(struct usbd_xfer *xfer)
                usbd_dump_queue(pipe);
 #endif
        xfer->done = 0;
+       xfer->status = USBD_NOT_STARTED;
 
        if (pipe->aborting)
                return (USBD_CANCELLED);

Re: xhci(4): fix endless loop when transfer is aborted

Reply via email to