[email protected] (Mark Kettenis), 2020.11.29 (Sun) 14:05 (CET):
> > Date: Sun, 29 Nov 2020 12:54:10 +0000
> > From: Stuart Henderson <[email protected]>
> > 
> > On 2020/11/29 13:20, Theo Buehler wrote:
> > > On Sun, Nov 29, 2020 at 11:22:06AM +0000, Stuart Henderson wrote:
> > > > I have now seen mine crash with just the base "on by default" daemons,
> > > > one incoming ssh connection, top, and dhclient running.
> > > > 
> > > > I'm going to try bisecting old kernels to see if I can figure out when
> > > > it was introduced.
> > > > 
> > > > It might also be interesting to try GENERIC rather than GENERIC.MP.
> > > > 
> > > 
> > > Thanks for digging into this. Your APU seems much worse off than mine,
> > > which takes a few weeks before crashing these days, so it's not much use
> > > for bisecting.
> > > 
> > > Just a few data points that may help, assuming we see the same thing.
> > > 
> > > I had been running the firmware 4.10.0.3 for more than a year with
> > > seemingly no issues, but I updated to 4.12.0.6 early November.
> > > 
> > > My snapshot updates prior to running into crashes were
> > > 
> > > Jul 7 -> Aug 21 -> Sep 21.
> > > 
> > > The first crash I had was with the Sep 21 snapshot after a bit more than
> > > a week uptime.
> > > 
> > > With early October snapshots it got particularly bad with crashes almost
> > > daily, that's when I reported. The first snap I saw crashing when going
> > > back and forth was from Sep 5.
> > > 
> > > Assuming you see the same thing as me, this would likely narrow the
> > > window for bisecting to
> > > 
> > > Jul 7 <-> Sep 5.
> > > 
> > > I always ran GENERIC.MP.
> > > 
> > 
> > Thanks, I found your earlier mail and started with Sep 11 which crashed
> > after about half an hour. I would have tried something around the 5th next
> > (there weren't many snaps built 5-13th) but given what you say I'll go a
> > little earlier so I'm now trying Sep 2 and I have kernels from a few other
> > snapshots around then lined up.
> 
> Please do note that the problems you (sthen@ and tb@) are describing
> are probably different from the issue Marcus is reporting in this
> thread.
> 
> Marcus, if your issue disappears when you're not running fstat, there
> is no reason for you to chase the issue by running older kernels.

Thanks for the hint!

I'll see tomorrow if the issue disappears, once people start working.

My plan would be to restart the hanging apu4 as soon as someone lends me
a helping hand with the power socket. This is the CARP master. I'll
leave all the extra thingies (fstat) running to make it hang again and get
more traces. Is there anything I should do when at the ddb> prompt?
Would remote console access via the CARP backup machine help? 

Marcus
