Hi Martin,

On Wed, Aug 29, 2018 at 10:44:51AM -0300, Martin Pieuchot wrote:
> On 28/08/18(Tue) 22:22, Tom Murphy wrote:
> > On Tue, Aug 28, 2018 at 04:20:41PM -0300, Martin Pieuchot wrote:
> > > Hello Tom,
> > >
> > > On 28/08/18(Tue) 11:10, Tom Murphy wrote:
> > > > On Tue, Aug 28, 2018 at 02:49:38PM +0900, Bryan Linton wrote:
> > > > > > On 2018-08-25 21:40:57, Tom Murphy wrote:
> > > > > > On Thu, Aug 23, 2018 at 08:45:54PM +0900, Tom Murphy wrote:
> > > > > > > I've narrowed it down.
> > > > > > >
> > > > > > > > Last kernel where adb works: June 24 09:59:46 MDT 2018
> > > > > > > > 1st kernel where adb panics: June 25 13:10:32 MDT 2018
> > >
> > > The real problem is in the xhci(4) driver. When a command with a
> > > timeout is submitted we should ensure no other command is enqueued
> > > before continuing. Sadly the driver did not include any mechanism
> > > to serialize command submissions. Diff below does that and should
> > > fix your problem.
> > >
> > > Can you try it on top of -current? Make sure you have no diff
> > > reverted.
> >
> > Hi,
> >
> > I think I spoke a little too soon. I found a case where it
> > started printing xhci0: command timeout over and over until
> > eventually the kernel panics with a protection fault. I couldn't
> > catch the backtrace properly, but it looked around the same area
> > as this original bug report.
>
> Without backtrace I can't make progress.

Apologies for the delay. I just found time to reproduce this. Here's
a backtrace:

kernel: protection fault trap, code=0
Stopped at xhci_abort_xfer+0x57: cmpb $0,0x471(%r14)
ddb{1}> bt
xhci_abort_xfer(9cbfd0450eaf1f9c,4) at xhci_abort_xfer+0x57
usbd_transfer(b41ac578436f71d1) at usbd_transfer+0x24d
ugen_do_read(7fd9672719de8ccd,ffff800031d66660,ffff800001530000,
ffffff047e7b2ba0) at ugen_do_read+0x347
ugenread(d30250dfbb30abef,ffff800031d66660,ffff800031d66560) at
ugenread+0x47
spec_read(ae30accb64c92cb4) at spec_read+0xab
VOP_READ(1be36585d83bbddf,2d04c5610d671db7,ffffff047e7b2ba0,
ffffff0400000000) at VOP_READ+0x49
vn_read(f8ef5df9c7a62c48,ffffff03ff215788,10) at vn_read+0xf5
dofilereadv(1dd31abb163e61f3,30,ffff8000fffea028,3,
ffff800031d66790) at dofilereadv+0xe0
sys_read(cc3368a684a7a8f,2d04c5610d671db7,18) at sys_read+0x5c
syscall(eec44dfe2a55ecc1) at syscall+0x32a
Xsyscall(6,3,2b360799d50,3,1,2b3d0a18e00) at Xsyscall+0x128
end of kernel
end trace frame: 0x2b39d0546f0, count: -11

I triggered it by disconnecting the phone, reconnecting it, and then
running 'adb kill-server' so that I could re-run 'adb start-server'
and see whether it picked up the phone. The panic happened right
after 'adb kill-server'.

Please let me know if there's anything more you'd like me to test.

Thanks,
Tom