On Sat, Sep 17, 2022 at 10:40:29PM +0200, Marcus Glocker wrote:

> On Sat, Sep 17, 2022 at 11:22:41AM -0700, Stephan Somogyi wrote:
> 
> > Starting with arm64 snapshot kernel 1818 and continuing to 1822, the latest
> > snapshot, I've been experiencing a persistent problem with the RPi3's USB
> > bus locking up in a way that requires physical access to power cycle, and
> > is thus a fairly serious regression. This system has been running
> > continuously on -current since about 6.8-current without anything even
> > remotely like this happening.
> > 
> > The 100bT interface is at smsc0 on the usb bus. Initially, it looked like
> > there may have been a weird race condition since I also had a USB-based
> > flash drive plugged in, but moving that drive around to the other ports,
> > and eventual complete removal, hasn't stopped the hanging.
> > 
> > The hang is visible in dmesg as follows:
> > 
> > usbd_start_next: error=5
> > usbd_start_next: error=5
> > usbd_free_xfer: xfer=0xffffff8004e74a20 not free
> > smsc0: warning: Failed to read register 0x114
> > smsc0: warning: MII is busy
> > 
> > Searching around, I find references to some of these errors in FreeBSD and
> > OpenBSD going back at least to 2014, but no clear resolution. It's
> > _possible_ that I have some kind of creeping hardware failure, but it
> > doesn't seem likely.
> > 
> > Once the error messages appear, I can no longer access the system over the
> > network. I've since connected the serial console. If I try to reboot while
> > it's in this state, the system will hang hard and not even respond to the
> > console. If I try `ifconfig smsc0 down` it hangs in the same way.
> > 
> > While the USB drive was still part of the repro configuration, attempting
> > to sync or otherwise access the drive also resulted in the hard hang,
> > leading me to conclude this is a USB issue rather than either a mass
> > storage or an ethernet issue.
> > 
> > I've also done the usual variable elimination by using different USB
> > drives, different ethernet cables, different port & different switch, etc.
> > I no longer appear to be able to isolate this further myself.
> > 
> > My only recourse once it's in this state is to hard power cycle.
> > 
> > I'm happy to try and help debug further; I strongly prefer that
> > 7.2-release/-stable doesn't include this behavior.
> > 
> > s.
> 
> We had some changes recently in dwctwo(4).  I currently think that your
> issue might be related to the last commit to dwc2.c, revision 1.67.  I'll
> prepare a diff and send it to you for testing by tomorrow.  We might
> need some iterations.  Worst case we can try to revert that commit.

Does this diff fix your issue?


Index: sys/dev/usb/dwc2/dwc2.c
===================================================================
RCS file: /cvs/src/sys/dev/usb/dwc2/dwc2.c,v
retrieving revision 1.67
diff -u -p -u -p -r1.67 dwc2.c
--- sys/dev/usb/dwc2/dwc2.c     10 Sep 2022 08:13:16 -0000      1.67
+++ sys/dev/usb/dwc2/dwc2.c     18 Sep 2022 07:41:24 -0000
@@ -242,7 +242,6 @@ dwc2_allocx(struct usbd_bus *bus)
 void
 dwc2_freex(struct usbd_bus *bus, struct usbd_xfer *xfer)
 {
-       struct dwc2_xfer *dxfer = DWC2_XFER2DXFER(xfer);
        struct dwc2_softc *sc = DWC2_BUS2SC(bus);
 
        DPRINTFN(10, "\n");
@@ -255,7 +254,6 @@ dwc2_freex(struct usbd_bus *bus, struct 
        xfer->busy_free = XFER_FREE;
 #endif
        DWC2_EVCNT_INCR(sc->sc_ev_xferpoolput);
-       dwc2_hcd_urb_free(sc->sc_hsotg, dxfer->urb, xfer->nframes);
        pool_put(&sc->sc_xferpool, xfer);
 }
 
Index: sys/dev/usb/dwc2/dwc2_hcd.c
===================================================================
RCS file: /cvs/src/sys/dev/usb/dwc2/dwc2_hcd.c,v
retrieving revision 1.28
diff -u -p -u -p -r1.28 dwc2_hcd.c
--- sys/dev/usb/dwc2/dwc2_hcd.c 9 Sep 2022 21:16:54 -0000       1.28
+++ sys/dev/usb/dwc2/dwc2_hcd.c 18 Sep 2022 07:41:24 -0000
@@ -4312,6 +4312,7 @@ void dwc2_host_complete(struct dwc2_hsot
                                        xfer);
        }
 
+       dwc2_hcd_urb_free(sc->sc_hsotg, dxfer->urb, xfer->nframes);
        qtd->urb = NULL;
        timeout_del(&xfer->timeout_handle);
        usb_rem_task(xfer->device, &xfer->abort_task);
