On Mon, Apr 27, 2015 at 03:13:06PM +0200, Martin Pieuchot wrote:
> On 27/04/15(Mon) 13:55, Remi Locherer wrote:
> > On Sun, Apr 26, 2015 at 10:07:19AM +0200, Stefan Sperling wrote:
> > > On Sun, Apr 26, 2015 at 09:36:06AM +0200, Remi Locherer wrote:
> > With a series of removals and insertions of the urtwn device I could
> > provoke another panic. I didn't wait for a timeout message to appear.
>
> Great! You found another bug :)
Hmmm ....
> > [...]
> > urtwn0 detached
> > urtwn0 at uhub2 port 2 "Realtek 802.11n WLAN Adapter" rev 2.00/2.00 addr 3
> > urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 80:1f:02:da:13:e3
> > usb_transfer_complete: actlen > len 4294951948 > 4
> > usb_transfer_complete: actlen > len 4294951180 > 4
> > usb_insert_transfer: xfer=0xffffff011a23e118 not free
>
> That's the real problem, some code path tries to reuse a xfer that's
> still on the ring.
>
> > kernel: protection fault trap, code=0
> > Stopped at ehci_check_intr+0xb: movq 0x10(%rax),%rax
> > ddb{0}> ehci_check_intr() at ehci_check_intr+0xb
> > ehci_softintr() at ehci_softintr+0x3f
> > softintr_dispatch() at softintr_dispatch+0x7f
> > Xsoftnet() at Xsoftnet+0x2d
>
> This trace tells us that the pipe is no longer valid, which means that
> the device has been removed but a xfer is still referenced by ehci.
>
> The output of "ps" could help us understand what's going wrong in such
> a case. If you can, please get it next time :)
>
> If you think you can reproduce this bug too, here's a diff that would
> get us a useful trace:
It seems to be easier to trigger this bug than the first one.
ddb trace and ps output:
https://relo.ch/urtwncrash_trace_part1.jpg
https://relo.ch/urtwncrash_trace_part2.jpg
https://relo.ch/urtwncrash_ps_part1.jpg
https://relo.ch/urtwncrash_ps_part2.jpg
Unfortunately "boot reboot" in ddb did not work so I had to upload
the photos. But at least one line number appeared in the output so now I
know how to build a kernel with debug symbols ;)
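
A side note on the "actlen > len" lines in the log above: both actlen
values sit just below 2^32, which would be consistent with a small
negative number being printed as an unsigned 32-bit value. This is only
an arithmetic observation, not something confirmed in the driver code.
A minimal sketch of the conversion, where as_signed32() is a made-up
helper name and not part of the USB stack:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel): interpret a 32-bit
 * unsigned value as a two's complement signed number, without relying
 * on implementation-defined casts.
 */
static int64_t
as_signed32(uint32_t v)
{
	if (v & 0x80000000U)
		return (int64_t)v - 4294967296LL;	/* subtract 2^32 */
	return (int64_t)v;
}
```

With this helper, 4294951948 maps to -15348 and 4294951180 maps to
-16116.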
>
> Index: usbdi.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
> retrieving revision 1.81
> diff -u -p -r1.81 usbdi.c
> --- usbdi.c 14 Mar 2015 03:38:50 -0000 1.81
> +++ usbdi.c 27 Apr 2015 13:08:33 -0000
> @@ -824,10 +824,8 @@ usb_insert_transfer(struct usbd_xfer *xf
> DPRINTFN(5,("usb_insert_transfer: pipe=%p running=%d timeout=%d\n",
> pipe, pipe->running, xfer->timeout));
> #ifdef DIAGNOSTIC
> - if (xfer->busy_free != XFER_FREE) {
> - printf("%s: xfer=%p not free\n", __func__, xfer);
> - return (USBD_INVAL);
> - }
> + if (xfer->busy_free != XFER_FREE)
> + panic("%s: xfer=%p not free\n", __func__, xfer);
> xfer->busy_free = XFER_ONQU;
> #endif
> s = splusb();
>
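
For readers following along, the DIAGNOSTIC check touched by the diff
above can be illustrated with a small userland sketch. It assumes only
the two states visible in the diff (free and on the queue); all sk_*
names are hypothetical stand-ins, and sk_complete_transfer() is an
assumed counterpart that is not shown in this thread. It is not the
kernel code.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the xfer states; the values are arbitrary. */
enum sk_state { SK_XFER_FREE, SK_XFER_ONQU };

struct sk_xfer {
	enum sk_state busy_free;
};

/*
 * Queue an xfer. Returns 0 on success, -1 if the xfer is still marked
 * as queued, which mirrors the "not free" message in the log. The
 * diff replaces this early return with a panic to get a trace.
 */
static int
sk_insert_transfer(struct sk_xfer *xfer)
{
	if (xfer->busy_free != SK_XFER_FREE) {
		fprintf(stderr, "%s: xfer=%p not free\n", __func__,
		    (void *)xfer);
		return -1;
	}
	xfer->busy_free = SK_XFER_ONQU;
	return 0;
}

/* Assumed counterpart: mark the xfer free again once it completes. */
static void
sk_complete_transfer(struct sk_xfer *xfer)
{
	xfer->busy_free = SK_XFER_FREE;
}
```

Submitting the same xfer twice without an intervening completion hits
the "not free" branch, which is the situation the panic is meant to
catch with a usable trace.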