Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Rin Okuyama

On 2019/02/15 23:37, Nick Hudson wrote:

On 15/02/2019 13:44, Rin Okuyama wrote:

On 2019/02/15 21:57, Jonathan A. Kollasch wrote:

On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+    if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?



The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


IMO, this is because NOT_STARTED queues are removed in a way that is
unexpected to ehci. For my old amd64 box, the attached patch fixes
assertion failures when devices are detached from ehci. I would like
to commit it together with the previous patch.



ux_state has probably out lived its usefulness.

Other/All HCs do the same ux_state check. So, either all need to be updated or 
we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe


Nick


The latter does not work because ehci and uhci also check its own flags,
ex_isdone and ux_isdone, respectively. Therefore, I chose the former:

http://www.netbsd.org/~rin/usbhc_freex_20190215.patch

Is this OK for you?

Thanks,
rin


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Nick Hudson

On 15/02/2019 13:44, Rin Okuyama wrote:

On 2019/02/15 21:57, Jonathan A. Kollasch wrote:

On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is 
stopped

by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove 
them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+    if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?



The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


IMO, this is because NOT_STARTED queues are removed in a way that is
unexpected to ehci. For my old amd64 box, the attached patch fixes
assertion failures when devices are detached from ehci. I would like
to commit it together with the previous patch.



ux_state has probably out lived its usefulness.

Other/All HCs do the same ux_state check. So, either all need to be 
updated or we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe



Nick


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Jonathan A. Kollasch
On Fri, Feb 15, 2019 at 10:44:23PM +0900, Rin Okuyama wrote:
> On 2019/02/15 21:57, Jonathan A. Kollasch wrote:
> > On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:
> > > Hi,
> > > 
> > > On 2019/02/13 3:54, Nick Hudson wrote:
> > > > On 12/02/2019 16:02, Rin Okuyama wrote:
> > > > > Hi,
> > > > > 
> > > > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
> > > > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is 
> > > > > stopped
> > > > > by "ifconfig down" or detached.
> > > > > 
> > > > > As discussed in the previous message, this is due to infinite loop in
> > > > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
> > > > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove 
> > > > > them.
> > > > 
> > > > 
> > > > Why not the attached patch instead?
> > > > 
> > > > Nick
> > > 
> > > Thank you so much for your prompt reply!
> > > 
> > > It works if s/ux_state/ux_status/ here:
> > > 
> > > + if (xfer->ux_state == USBD_NOT_STARTED) {
> > > 
> > > Can I commit the revised patch?
> > > 
> > 
> > The revised patch results in
> > https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
> > firing upon `drvctl detach -d bwfm0` on Pinebook.
> > 
> > Jonathan Kollasch
> 
> IMO, this is because NOT_STARTED queues are removed in a way that is
> unexpected to ehci. For my old amd64 box, the attached patch fixes
> assertion failures when devices are detached from ehci. I would like
> to commit it together with the previous patch.
> 

Works for me; please commit.  (I'm not 100% sure it's perfect, but
it's better than it was, and we can fix it again later if necessary.)

Jonathan Kollasch


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Rin Okuyama

On 2019/02/15 21:57, Jonathan A. Kollasch wrote:

On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+   if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?



The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


IMO, this is because NOT_STARTED queues are removed in a way that is
unexpected to ehci. For my old amd64 box, the attached patch fixes
assertion failures when devices are detached from ehci. I would like
to commit it together with the previous patch.

Thanks,
rin

Index: sys/dev/usb/ehci.c
===
RCS file: /home/netbsd/src/sys/dev/usb/ehci.c,v
retrieving revision 1.265
diff -p -u -r1.265 ehci.c
--- sys/dev/usb/ehci.c  18 Sep 2018 02:00:06 -  1.265
+++ sys/dev/usb/ehci.c  13 Feb 2019 19:13:34 -
@@ -1562,9 +1562,10 @@ ehci_freex(struct usbd_bus *bus, struct
struct ehci_softc *sc = EHCI_BUS2SC(bus);
struct ehci_xfer *ex __diagused = EHCI_XFER2EXFER(xfer);
 
-	KASSERTMSG(xfer->ux_state == XFER_BUSY, "xfer %p state %d\n", xfer,

-   xfer->ux_state);
-   KASSERT(ex->ex_isdone);
+   KASSERTMSG(xfer->ux_status == USBD_NOT_STARTED ||
+   xfer->ux_state == XFER_BUSY,
+   "xfer %p state %d\n", xfer, xfer->ux_state);
+   KASSERT(xfer->ux_status == USBD_NOT_STARTED || ex->ex_isdone);
 
 #ifdef DIAGNOSTIC

xfer->ux_state = XFER_FREE;


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Jonathan A. Kollasch
On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:
> Hi,
> 
> On 2019/02/13 3:54, Nick Hudson wrote:
> > On 12/02/2019 16:02, Rin Okuyama wrote:
> > > Hi,
> > > 
> > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
> > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
> > > by "ifconfig down" or detached.
> > > 
> > > As discussed in the previous message, this is due to infinite loop in
> > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
> > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.
> > 
> > 
> > Why not the attached patch instead?
> > 
> > Nick
> 
> Thank you so much for your prompt reply!
> 
> It works if s/ux_state/ux_status/ here:
> 
> + if (xfer->ux_state == USBD_NOT_STARTED) {
> 
> Can I commit the revised patch?
> 

The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Rin Okuyama

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+   if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?

Thanks,
rin

Index: usbdi.c
===
RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.181
diff -p -u -r1.181 usbdi.c
--- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 -  1.181
+++ sys/dev/usb/usbdi.c 13 Feb 2019 09:32:00 -
@@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe)
USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx "
"(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer,
(uintptr_t)pipe->up_methods, 0);
-   /* Make the HC abort it (and invoke the callback). */
-   pipe->up_methods->upm_abort(xfer);
-   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */
+   if (xfer->ux_status == USBD_NOT_STARTED) {
+   SIMPLEQ_REMOVE_HEAD(&pipe->up_queue, ux_next);
+   } else {
+   /* Make the HC abort it (and invoke the callback). */
+   pipe->up_methods->upm_abort(xfer);
+   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); 
*/
+   }
}
pipe->up_aborting = 0;
return USBD_NORMAL_COMPLETION;


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-12 Thread Nick Hudson

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick
? sys/dev/usb/cscope.out
Index: sys/dev/usb/usbdi.c
===
RCS file: /cvsroot/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.181
diff -u -p -r1.181 usbdi.c
--- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 -  1.181
+++ sys/dev/usb/usbdi.c 12 Feb 2019 18:51:28 -
@@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe)
USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx "
"(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer,
(uintptr_t)pipe->up_methods, 0);
-   /* Make the HC abort it (and invoke the callback). */
-   pipe->up_methods->upm_abort(xfer);
-   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */
+   if (xfer->ux_state == USBD_NOT_STARTED) {
+   SIMPLEQ_REMOVE_HEAD(&pipe->up_queue, ux_next);
+   } else {
+   /* Make the HC abort it (and invoke the callback). */
+   pipe->up_methods->upm_abort(xfer);
+   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); 
*/
+   }
}
pipe->up_aborting = 0;
return USBD_NORMAL_COMPLETION;


Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-12 Thread Rin Okuyama

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.

I guess that this can happen for host controllers in which up_serialise
is true for bulk transfers [it does not occur for dwctwo(4), in which
up_serialise == false for bulk transfers]. For such adapters, IMO,
usbd_start_next(pipe) should be called whenever some xfer is removed
from pipe->up_queue.

Actually, the freeze seems to be fixed by the attached patch.
Can I commit the patch? Or, am I missing something?

Thanks,
rin

Index: usbdi.c
===
RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.181
diff -p -u -r1.181 usbdi.c
--- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 -  1.181
+++ sys/dev/usb/usbdi.c 12 Feb 2019 14:07:35 -
@@ -1013,7 +1013,7 @@ usb_transfer_complete(struct usbd_xfer *
if (erred && pipe->up_iface != NULL) /* not control pipe */
pipe->up_running = 0;
}
-   if (pipe->up_running && pipe->up_serialise)
+   if (pipe->up_serialise)
usbd_start_next(pipe);
 }
 



On 2019/02/09 22:28, sc dying wrote:

On Sat, Feb 9, 2019 at 8:18 AM Rin Okuyama  wrote:


Hi,

On 2019/02/08 23:16, sc dying wrote:

On 2019/01/31 05:51, Rin Okuyama wrote:

By the way, I find that the system hangs silently by
"ifconfig mueN down" or detaching LAN7500 from USB port when
multiple outstanding requests are enabled. This does not occur
when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
to fix it?


My axen dongle locks up if AXEN_RX_LIST_CNT > 1
when ifconfig down on amd64 8.99.34.

db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x861
Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66
--- interrupt ---
xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c
usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9
usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
axen_stop() at netbsd:axen_stop+0xc4
axen_ioctl() at netbsd:axen_ioctl+0x1d9
doifioctl() at netbsd:doifioctl+0xa99
sys_ioctl() at netbsd:sys_ioctl+0x11c
syscall() at netbsd:syscall+0xb4
--- syscall (number 54) ---
732731d1a88a:

Looks like kernel goes infinite loop in usbd_ar_pipe by some reason.
It tries to abort NOT_STARTED xfers.


Can you tell me how to reproduce this failure?
"ifconfig down" works for me on RPI3B+.


I tested on intel 3000 series CPU + Ivebridge
with kernel from my local tree.
But I found it does not happen with the lastest kernel from
https://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/kernel/netbsd-GENERIC.gz.