Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 2019/02/15 23:37, Nick Hudson wrote: On 15/02/2019 13:44, Rin Okuyama wrote: On 2019/02/15 21:57, Jonathan A. Kollasch wrote: On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch IMO, this is because NOT_STARTED queues are removed in a way that is unexpected to ehci. For my old amd64 box, the attached patch fixes assertion failures when devices are detached from ehci. I would like to commit it together with the previous patch. ux_state has probably out lived its usefulness. Other/All HCs do the same ux_state check. So, either all need to be updated or we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe Nick The latter does not work because ehci and uhci also check its own flags, ex_isdone and ux_isdone, respectively. Therefore, I chose the former: http://www.netbsd.org/~rin/usbhc_freex_20190215.patch Is this OK for you? Thanks, rin
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 15/02/2019 13:44, Rin Okuyama wrote: On 2019/02/15 21:57, Jonathan A. Kollasch wrote: On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch IMO, this is because NOT_STARTED queues are removed in a way that is unexpected to ehci. For my old amd64 box, the attached patch fixes assertion failures when devices are detached from ehci. I would like to commit it together with the previous patch. ux_state has probably out lived its usefulness. Other/All HCs do the same ux_state check. So, either all need to be updated or we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe Nick
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On Fri, Feb 15, 2019 at 10:44:23PM +0900, Rin Okuyama wrote: > On 2019/02/15 21:57, Jonathan A. Kollasch wrote: > > On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: > > > Hi, > > > > > > On 2019/02/13 3:54, Nick Hudson wrote: > > > > On 12/02/2019 16:02, Rin Okuyama wrote: > > > > > Hi, > > > > > > > > > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with > > > > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is > > > > > stopped > > > > > by "ifconfig down" or detached. > > > > > > > > > > As discussed in the previous message, this is due to infinite loop in > > > > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever > > > > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove > > > > > them. > > > > > > > > > > > > Why not the attached patch instead? > > > > > > > > Nick > > > > > > Thank you so much for your prompt reply! > > > > > > It works if s/ux_state/ux_status/ here: > > > > > > + if (xfer->ux_state == USBD_NOT_STARTED) { > > > > > > Can I commit the revised patch? > > > > > > > The revised patch results in > > https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 > > firing upon `drvctl detach -d bwfm0` on Pinebook. > > > > Jonathan Kollasch > > IMO, this is because NOT_STARTED queues are removed in a way that is > unexpected to ehci. For my old amd64 box, the attached patch fixes > assertion failures when devices are detached from ehci. I would like > to commit it together with the previous patch. > Works for me; please commit. (I'm not 100% sure it's perfect, but it's better than it was, and we can fix it again later if necessary.) Jonathan Kollasch
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 2019/02/15 21:57, Jonathan A. Kollasch wrote: On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch IMO, this is because NOT_STARTED queues are removed in a way that is unexpected to ehci. For my old amd64 box, the attached patch fixes assertion failures when devices are detached from ehci. I would like to commit it together with the previous patch. Thanks, rin Index: sys/dev/usb/ehci.c === RCS file: /home/netbsd/src/sys/dev/usb/ehci.c,v retrieving revision 1.265 diff -p -u -r1.265 ehci.c --- sys/dev/usb/ehci.c 18 Sep 2018 02:00:06 - 1.265 +++ sys/dev/usb/ehci.c 13 Feb 2019 19:13:34 - @@ -1562,9 +1562,10 @@ ehci_freex(struct usbd_bus *bus, struct struct ehci_softc *sc = EHCI_BUS2SC(bus); struct ehci_xfer *ex __diagused = EHCI_XFER2EXFER(xfer); - KASSERTMSG(xfer->ux_state == XFER_BUSY, "xfer %p state %d\n", xfer, - xfer->ux_state); - KASSERT(ex->ex_isdone); + KASSERTMSG(xfer->ux_status == USBD_NOT_STARTED || + xfer->ux_state == XFER_BUSY, + "xfer %p state %d\n", xfer, xfer->ux_state); + KASSERT(xfer->ux_status == USBD_NOT_STARTED || ex->ex_isdone); #ifdef DIAGNOSTIC xfer->ux_state = XFER_FREE;
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: > Hi, > > On 2019/02/13 3:54, Nick Hudson wrote: > > On 12/02/2019 16:02, Rin Okuyama wrote: > > > Hi, > > > > > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with > > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped > > > by "ifconfig down" or detached. > > > > > > As discussed in the previous message, this is due to infinite loop in > > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever > > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. > > > > > > Why not the attached patch instead? > > > > Nick > > Thank you so much for your prompt reply! > > It works if s/ux_state/ux_status/ here: > > + if (xfer->ux_state == USBD_NOT_STARTED) { > > Can I commit the revised patch? > The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? Thanks, rin Index: usbdi.c === RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v retrieving revision 1.181 diff -p -u -r1.181 usbdi.c --- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 - 1.181 +++ sys/dev/usb/usbdi.c 13 Feb 2019 09:32:00 - @@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe) USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx " "(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer, (uintptr_t)pipe->up_methods, 0); - /* Make the HC abort it (and invoke the callback). */ - pipe->up_methods->upm_abort(xfer); - /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + if (xfer->ux_status == USBD_NOT_STARTED) { + SIMPLEQ_REMOVE_HEAD(&pipe->up_queue, ux_next); + } else { + /* Make the HC abort it (and invoke the callback). */ + pipe->up_methods->upm_abort(xfer); + /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + } } pipe->up_aborting = 0; return USBD_NORMAL_COMPLETION;
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick ? sys/dev/usb/cscope.out Index: sys/dev/usb/usbdi.c === RCS file: /cvsroot/src/sys/dev/usb/usbdi.c,v retrieving revision 1.181 diff -u -p -r1.181 usbdi.c --- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 - 1.181 +++ sys/dev/usb/usbdi.c 12 Feb 2019 18:51:28 - @@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe) USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx " "(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer, (uintptr_t)pipe->up_methods, 0); - /* Make the HC abort it (and invoke the callback). */ - pipe->up_methods->upm_abort(xfer); - /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + if (xfer->ux_state == USBD_NOT_STARTED) { + SIMPLEQ_REMOVE_HEAD(&pipe->up_queue, ux_next); + } else { + /* Make the HC abort it (and invoke the callback). */ + pipe->up_methods->upm_abort(xfer); + /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + } } pipe->up_aborting = 0; return USBD_NORMAL_COMPLETION;
Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. I guess that this can happen for host controllers in which up_serialise is true for bulk transfers [it does not occur for dwctwo(4), in which up_serialise == false for bulk transfers]. For such adapters, IMO, usbd_start_next(pipe) should be called whenever some xfer is removed from pipe->up_queue. Actually, the freeze seems to be fixed by the attached patch. Can I commit the patch? Or, am I missing something? Thanks, rin Index: usbdi.c === RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v retrieving revision 1.181 diff -p -u -r1.181 usbdi.c --- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 - 1.181 +++ sys/dev/usb/usbdi.c 12 Feb 2019 14:07:35 - @@ -1013,7 +1013,7 @@ usb_transfer_complete(struct usbd_xfer * if (erred && pipe->up_iface != NULL) /* not control pipe */ pipe->up_running = 0; } - if (pipe->up_running && pipe->up_serialise) + if (pipe->up_serialise) usbd_start_next(pipe); } On 2019/02/09 22:28, sc dying wrote: On Sat, Feb 9, 2019 at 8:18 AM Rin Okuyama wrote: Hi, On 2019/02/08 23:16, sc dying wrote: On 2019/01/31 05:51, Rin Okuyama wrote: By the way, I find that the system hangs silently by "ifconfig mueN down" or detaching LAN7500 from USB port when multiple outstanding requests are enabled. This does not occur when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas to fix it? My axen dongle locks up if AXEN_RX_LIST_CNT > 1 when ifconfig down on amd64 8.99.34. db{0}> bt breakpoint() at netbsd:breakpoint+0x5 comintr() at netbsd:comintr+0x861 Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66 --- interrupt --- xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9 usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27 axen_stop() at netbsd:axen_stop+0xc4 axen_ioctl() at netbsd:axen_ioctl+0x1d9 doifioctl() at netbsd:doifioctl+0xa99 sys_ioctl() at netbsd:sys_ioctl+0x11c syscall() at netbsd:syscall+0xb4 --- syscall (number 54) --- 732731d1a88a: Looks like kernel goes infinite loop in usbd_ar_pipe by some reason. It tries to abort NOT_STARTED xfers. Can you tell me how to reproduce this failure? "ifconfig down" works for me on RPI3B+. I tested on intel 3000 series CPU + Ivebridge with kernel from my local tree. But I found it does not happen with the lastest kernel from https://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/kernel/netbsd-GENERIC.gz.