On 2017/11/13 08:44, Martin Pieuchot wrote:
> On 12/11/17(Sun) 22:10, Stuart Henderson wrote:
> > On 2017/11/12 22:48, Martin Pieuchot wrote:
> > > On 12/11/17(Sun) 21:30, Stuart Henderson wrote:
> > > > iked box, GENERIC.MP + WITNESS, -current as of Friday 10th:
> > >
> > > Weird, did you tweak "kern.splassert" on this box? Otherwise is looks
> > > like a major corruption.
> >
> > It would have kern.splassert=2. (I know this can cause problems
> > sometimes, though this would be the first time in 5+ years I've bumped
> > into it, most of my routers where I have serial console have this set).
>
> Well the panic below correspond to a value of 0 or > 3.

Confirmed, it was definitely set to 2.

> > > > login: panic: kernel diagnostic assertion "_kernel_lock_held()" failed:
> > > > file "/src/cvs-openbsd/sys/kern/uipc_socket2.c", line 310
> > > ^^^
> > > Looks like one CPU is triggering this.
> > >
> > > > splassert: soassertlocked: want 1 have 256
> > > >
> > > > panic: spl assertion failure in soassertlocked
> > > ^^^
> > > That can't be coming from the same CPU..
> > > >
> > > >
> > > >
> > > > Starting stack trace...
> > > > Faulted in traceback, aborting...
> > > > panic(splassert: if_down: want 1 have 256
> > > > panic: spl assertion failure in if_down) at
> > > > Faulted in traceback, aborting...
> > > > panicsplassert: if_down: want 1 have 256
> > > > +0x133panic: spl assertion failure in if_down
> > > > Faulted in traceback, aborting...
> > > >
> > > > <repeated a few times>
> > > >
> > > > It's stuck at this point, I can't enter ddb.
> > >
> > > Are you running with WITNESS on purpose? Can you reproduce such problem
> > > without it? I'm not saying it's WITNESS fault, but it's clear that
> > > WITNESS kernels aren't ready for production yet.
> > >
> >
> > I'm trying to get more information because it had either hanged or
> > panicked previously (it didn't have serial connected at the time and
> > the machine was needed so it had to be rebooted before I had chance
> > to dig into it).
>
> From which snapshot was the kernel that hanged or panic'd?
>

It was running this:

OpenBSD 6.2-current (GENERIC.MP) #199: Tue Nov 7 18:41:54 MST 2017

I've got it onto a remote control PDU now, now looking for some machine
with an old enough ssh client to be able to connect to the PDU :-|

Which kernel would be most useful to run now? I have now moved it
to -current GENERIC.MP with the "fast path" chunk removed from
amd64/amd64/fpu.c fpu_kernel_enter() which we still suspect as maybe
having some issues.