On 30/08/18(Thu) 18:15, Tom Murphy wrote:
> On Thu, Aug 30, 2018 at 10:30:04AM -0300, Martin Pieuchot wrote:
> > On 30/08/18(Thu) 14:00, Tom Murphy wrote:
> > > On Wed, Aug 29, 2018 at 10:44:51AM -0300, Martin Pieuchot wrote:
> > > > On 28/08/18(Tue) 22:22, Tom Murphy wrote:
> > > > > On Tue, Aug 28, 2018 at 04:20:41PM -0300, Martin Pieuchot wrote:
> > > > > > Hello Tom,
> > > > > > 
> > > > > > On 28/08/18(Tue) 11:10, Tom Murphy wrote:
> > > > > > > On Tue, Aug 28, 2018 at 02:49:38PM +0900, Bryan Linton wrote:
> > > > > > > > On 2018-08-25 21:40:57, Tom Murphy <[email protected]> wrote:
> > > > > > > > > On Thu, Aug 23, 2018 at 08:45:54PM +0900, Tom Murphy wrote:
> > > > > > > > > > > I've narrowed it down.
> > > > > > > > > > >
> > > > > > > > > > > Last kernel where adb works:   June 24 09:59:46 MDT 2018
> > > > > > > > > > > First kernel where adb panics: June 25 13:10:32 MDT 2018
> > > > > > 
> > > > > > The real problem is in the xhci(4) driver.  When a command with a
> > > > > > timeout is submitted we should ensure no other command is enqueued
> > > > > > before continuing.  Sadly the driver did not include any mechanism
> > > > > > to serialize command submissions.  Diff below does that and should
> > > > > > fix your problem.
> > > > > > 
> > > > > > Can you try it on top of -current?  Make sure you have no diff
> > > > > > reverted.
> > > > > 
> > > > > Hi,
> > > > > 
> > > > >   I think I spoke a little too soon. I found a case where it
> > > > > started printing xhci0: command timeout over and over until
> > > > > eventually the kernel panics with a protection fault. I couldn't
> > > > > catch the backtrace properly, but it looked around the same area
> > > > > as this original bug report.
> > > > 
> > > > Without backtrace I can't make progress.
> > > 
> > > Apologies for the delay. Just found time to reproduce this. Here's
> > > a backtrace:
> > 
> > Almost, can you send the full dmesg with the backtrace at the end?
> 
> Hi, sorry, here's the dmesg with the backtrace.

Is this the live dmesg?  I don't see any 'xhci0: command timeout' in
it.  Btw, that message doesn't exist, so I can't tell which code path
is triggering the problem.  Could you build a kernel with XHCI_DEBUG
enabled, reproduce the page fault, and send the dmesg (at least the
last 10 lines before the crash) plus the trace?
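
For readers following the thread: the serialization idea quoted above
(no other command may be enqueued while one with a timeout is pending)
can be sketched in userland C with pthreads.  This is only an
illustration under stated assumptions, not the xhci(4) diff: cmd_slot,
cmd_submit() and the fake_hw_*() callbacks are hypothetical names
invented here, and the "hardware" is simulated by a callback that runs
while the lock is held.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a controller that runs one command at a time. */
struct cmd_slot {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;
	bool		busy;	/* a command is currently in flight */
	bool		done;	/* completion arrived for that command */
};

void
cmd_slot_init(struct cmd_slot *s)
{
	pthread_mutex_init(&s->mtx, NULL);
	pthread_cond_init(&s->cv, NULL);
	s->busy = false;
	s->done = false;
}

/* Absolute CLOCK_REALTIME deadline `ms' milliseconds from now. */
static struct timespec
deadline_after_ms(long ms)
{
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += ms / 1000;
	ts.tv_nsec += (ms % 1000) * 1000000L;
	if (ts.tv_nsec >= 1000000000L) {
		ts.tv_sec++;
		ts.tv_nsec -= 1000000000L;
	}
	return ts;
}

/*
 * Submit one command and wait up to timeout_ms for its completion.
 * Returns 0 on completion or ETIMEDOUT.  Either way the slot is only
 * released after this command is finished with, so a second submitter
 * can never enqueue on top of a pending one.
 */
int
cmd_submit(struct cmd_slot *s, void (*start)(struct cmd_slot *),
    long timeout_ms)
{
	struct timespec ts;
	int error = 0;

	pthread_mutex_lock(&s->mtx);
	while (s->busy)			/* serialize submissions */
		pthread_cond_wait(&s->cv, &s->mtx);
	assert(!s->busy);
	s->busy = true;
	s->done = false;

	start(s);			/* hand the command to the "hardware" */

	ts = deadline_after_ms(timeout_ms);
	while (!s->done && error == 0)
		error = pthread_cond_timedwait(&s->cv, &s->mtx, &ts);
	if (s->done)
		error = 0;

	s->busy = false;		/* let the next submitter in */
	pthread_cond_broadcast(&s->cv);
	pthread_mutex_unlock(&s->mtx);
	return error;
}

/* Fake hardware that completes immediately (called with mtx held). */
void
fake_hw_completes(struct cmd_slot *s)
{
	s->done = true;
}

/* Fake hardware that never answers, forcing the timeout path. */
void
fake_hw_hangs(struct cmd_slot *s)
{
	(void)s;
}
```

Calling cmd_submit(&slot, fake_hw_hangs, 10) on an initialized slot
returns ETIMEDOUT and leaves the slot free for the next caller.  A real
driver would additionally have to stop or abort the timed-out command
before reusing the ring, which this sketch does not attempt to model.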
