On Tue, Jun 2, 2026 at 9:25 PM Bjoern A. Zeeb
<[email protected]> wrote:
>
> On Wed, 27 May 2026, Bjoern A. Zeeb wrote:
>
> > On Tue, 26 May 2026, Bjoern A. Zeeb wrote:
> >
> >> Hi,
> >>
> >> I got some LinuxKPI problems sorted and can finally shut down a system w/o
> >> a driver panicking but now I see on a recent main (pxe booted in bhyve);
> >> this seems reproducible and typing reset I get the next panic and the next
> >> and the next and ... until bhyve stops after scrolling for a few seconds.
> >>
> >> Anyone seen this or any ideas?  I'll try to build a plain main kernel
> >> otherwise
> >> to check that it's not anything else...
> >
> > I have already found the next LinuxKPI bug.
> >
> > If I just boot a kernel and do a shutdown -r I do not run into it
> > so unless it rings a bell for someone else as well, please ignore this for
> > now.
>
> It just happened again;  no known LinuxKPI bugs in the way this time.
>
> So maybe it's real after all...
>
>
> >> Syncing disks, vnodes remaining... 0 done
> >> All buffers synced.
> >> Uptime: 46s
> >> kernel trap 12 with interrupts disabled
> >>
> >>
> >> Fatal trap 12: page fault while in kernel mode
> >> cpuid = 0; apic id = 00
> >> fault virtual address   = 0xfffffe00a58a0630
> >> fault code              = supervisor read data, page not present
> >> instruction pointer     = 0x20:0xffffffff80c0ebe8
> >> stack pointer           = 0x28:0xfffffe008bc49bb0
> >> frame pointer           = 0x28:0xfffffe008bc49c20
> >> code segment            = base 0x0, limit 0xfffff, type 0x1b
> >>                        = DPL 0, pres 1, long 1, def32 0, gran 1
> >> processor eflags        = resume, IOPL = 0
> >> current process         = 11 (idle: cpu0)
> >> rdi: 0000000000002f2c rsi: 0000000000008000 rdx: 0000000000002e2d
> >> rcx: 0000000000002e2c  r8: fffffe00a58a0630  r9: 000000007fff2744
> >> rax: fffffe000ef4e000 rbx: 0000000000002e2c rbp: fffffe008bc49c20
> >> r10: 00000000000003e7 r11: 000000000000044c r12: 0000002f2d000000
> >> r13: 0000002f2d000000 r14: 0000002e2dd1597a r15: ffffffff82b28300
> >> trap number             = 12
> >> panic: page fault
> >> cpuid = 0
> >> time = 1779819492
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x36/frame
> >> 0xfffffe008bc498e0
> >> vpanic() at vpanic+0x149/frame 0xfffffe008bc49a10
> >> panic() at panic+0x43/frame 0xfffffe008bc49a70
> >> trap_pfault() at trap_pfault+0x449/frame 0xfffffe008bc49ae0
> >> calltrap() at calltrap+0x8/frame 0xfffffe008bc49ae0
> >> --- trap 0xc, rip = 0xffffffff80c0ebe8, rsp = 0xfffffe008bc49bb0, rbp =
> >> 0xfffffe008bc49c20 ---
> >> callout_process() at callout_process+0x138/frame 0xfffffe008bc49c20
> >> handleevents() at handleevents+0x19a/frame 0xfffffe008bc49c60
> >> timercb() at timercb+0x19e/frame 0xfffffe008bc49cc0
> >> lapic_handle_timer() at lapic_handle_timer+0xa4/frame 0xfffffe008bc49cf0
> >> Xtimerint() at Xtimerint+0xb1/frame 0xfffffe008bc49cf0
> >> --- interrupt, rip = 0xffffffff810b1104, rsp = 0xfffffe008bc49dc0, rbp =
> >> 0xfffffe008bc49dd0 ---
> >> cpu_idle_acpi() at cpu_idle_acpi+0x54/frame 0xfffffe008bc49dd0
> >> cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008bc49df0
> >> sched_ule_idletd() at sched_ule_idletd+0x524/frame 0xfffffe008bc49ef0
> >> fork_exit() at fork_exit+0x82/frame 0xfffffe008bc49f30
> >> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008bc49f30
> >> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >> KDB: enter: panic
> >> [ thread pid 11 tid 100003 ]
> >> Stopped at      kdb_enter+0x33: movq    $0,0x15be0c2(%rip)
> >> db> reset
> >> panic: mtx_lock_spin: recursed on non-recursive mutex callout @
> >> /usr/src/sys/kern/kern_timeout.c:576
> >>
> >> cpuid = 0
> >> time = 1779819492
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x36/frame
> >> 0xfffffe008bc49160
> >> vpanic() at vpanic+0x149/frame 0xfffffe008bc49290
> >> panic() at panic+0x43/frame 0xfffffe008bc492f0
> >> __mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x11b/frame
> >> 0xfffffe008bc49330
> >> _callout_stop_safe() at _callout_stop_safe+0x106/frame 0xfffffe008bc493a0
> >> shutdown_resettodr() at shutdown_resettodr+0x15/frame 0xfffffe008bc493b0
> >> kern_reboot() at kern_reboot+0x2a3/frame 0xfffffe008bc493f0
> >> db_reset() at db_reset+0x108/frame 0xfffffe008bc49420
> >> db_command() at db_command+0x3aa/frame 0xfffffe008bc494e0
> >> db_command_loop() at db_command_loop+0x4d/frame 0xfffffe008bc494f0
> >> db_trap() at db_trap+0x100/frame 0xfffffe008bc49590
> >> kdb_trap() at kdb_trap+0x25f/frame 0xfffffe008bc496e0
> >> trap() at trap+0x888/frame 0xfffffe008bc49810
> >> calltrap() at calltrap+0x8/frame 0xfffffe008bc49810
> >> --- trap 0x3, rip = 0xffffffff80c44f43, rsp = 0xfffffe008bc498e8, rbp =
> >> 0xfffffe008bc49a10 ---
> >> kdb_enter() at kdb_enter+0x33/frame 0xfffffe008bc49a10
> >> panic() at panic+0x43/frame 0xfffffe008bc49a70
> >> trap_pfault() at trap_pfault+0x449/frame 0xfffffe008bc49ae0
> >> calltrap() at calltrap+0x8/frame 0xfffffe008bc49ae0
> >> --- trap 0xc, rip = 0xffffffff80c0ebe8, rsp = 0xfffffe008bc49bb0, rbp =
> >> 0xfffffe008bc49c20 ---
> >> callout_process() at callout_process+0x138/frame 0xfffffe008bc49c20
> >> handleevents() at handleevents+0x19a/frame 0xfffffe008bc49c60
> >> timercb() at timercb+0x19e/frame 0xfffffe008bc49cc0
> >> lapic_handle_timer() at lapic_handle_timer+0xa4/frame 0xfffffe008bc49cf0
> >> Xtimerint() at Xtimerint+0xb1/frame 0xfffffe008bc49cf0
> >> --- interrupt, rip = 0xffffffff810b1104, rsp = 0xfffffe008bc49dc0, rbp =
> >> 0xfffffe008bc49dd0 ---
> >> cpu_idle_acpi() at cpu_idle_acpi+0x54/frame 0xfffffe008bc49dd0
> >> cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008bc49df0
> >> sched_ule_idletd() at sched_ule_idletd+0x524/frame 0xfffffe008bc49ef0
> >> fork_exit() at fork_exit+0x82/frame 0xfffffe008bc49f30
> >> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008bc49f30
> >> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >> panic: mtx_lock_spin: recursed on non-recursive mutex callout @
> >> /usr/src/sys/kern/kern_timeout.c:576
> >>
> >> cpuid = 0
> >> time = 1779819492
> >> ..
> >> ..
> >> ..
> >>
> >>
> >>
> >
> >
>
> --
> Bjoern A. Zeeb                                                     r15:7
>

Can you resolve this to a source line?
> callout_process() at callout_process+0x138

Just guessing from my local kernel, that may be the first touch of a
callout in the LIST_FOREACH_SAFE loop of callout_process().  If so, that
suggests a use-after-free of some callout, with a dangling pointer
to it remaining in the list.  Maybe someone freed a callout without
stopping it first.  Or maybe the list is corrupt in some other way.
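
To illustrate the failure mode I am guessing at, here is a rough userspace
sketch, not kernel code.  Every name in it (model_callout, model_bucket,
model_insert, model_stop, model_process, model_destroy) is made up for
illustration and is not the kern_timeout.c implementation.  It only shows why
an entry has to be unlinked before its memory is freed: the walker touches
each entry it finds, so an entry freed while still linked would be touched
after free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel callout; NOT the real struct callout. */
struct model_callout {
	struct model_callout *next;	/* next entry in the bucket */
	struct model_callout **prevp;	/* address of the pointer that points at us */
	int fired;			/* set when the walker touches the entry */
};

/* Hypothetical stand-in for one bucket of pending callouts. */
struct model_bucket {
	struct model_callout *first;
};

/* Link an entry at the head of the bucket. */
static void
model_insert(struct model_bucket *b, struct model_callout *c)
{
	c->fired = 0;
	c->next = b->first;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = &b->first;
	b->first = c;
}

/* Unlink an entry; harmless if it is not linked. */
static void
model_stop(struct model_callout *c)
{
	if (c->prevp == NULL)
		return;
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/*
 * Walk the bucket the way a "safe" foreach does: save the successor
 * first, then touch the entry.  The write to c->fired is the "first
 * touch"; if c had been freed while still linked, this is where a
 * use-after-free would show up.
 */
static int
model_process(struct model_bucket *b)
{
	struct model_callout *c, *tmp;
	int visited = 0;

	for (c = b->first; c != NULL; c = tmp) {
		tmp = c->next;
		assert(*c->prevp == c);	/* basic list consistency check */
		c->fired = 1;
		visited++;
	}
	return (visited);
}

/* Correct teardown order: unlink first, free second. */
static void
model_destroy(struct model_callout *c)
{
	model_stop(c);
	free(c);
}
```

Calling free() on an entry without model_stop() first would leave the
neighbouring entry (or the bucket head) pointing at freed memory, and the next
model_process() call would dereference it.  In the real kernel the analogous
rule is to stop or drain a callout (see callout_drain(9)) before freeing the
structure that embeds it.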

Ryan
