On Thu, Feb 07, 2019 at 11:18:40PM -0600, Kyle wrote:
> I recently upgraded a box running 6.2 to 6.4 via clean install. After a few
> days of running normally it started locking up, usually within a minute or so
> after booting up to the login prompt. ddb appears on the console.
>
> I eventually thought to try booting bsd.sp, which has been running for about
> a day now without locking up.
>
> Any clues to point me in the right direction would be much appreciated.
>
> Here are some excerpts from the serial console (different sessions) and a dmesg:
>
>
> ddb{0}> show panic
> the kernel did not panic
> ddb{0}> trace
> acpicpu_idle() at acpicpu_idle+0x1ea
> sched_idle(0) at sched_idle+0x245
> end trace frame: 0x0, count: -2
>
>
>
> login: NMI ... going o debugger

NMIs (Non-Maskable Interrupts) are often an indication of hardware problems.

	-Otto

> ddb{0}䀿 movq $0x8,%rcxuaei[1;24r c
> db{}>
> ddb{0}>
> d{0}>
> ddb{0}
> ddb{0}>
> ddb{0}>
> ddb{�}>
> ddb{0>
> db{0}> show panic
> the kernel did not panic
> ddb{0}>
> the kernel did not panic
> ddb{0}>
> the kernel did not panic
> ddb{0}>
> the kernel did not panic
> ddb{0}>
> the kernel di no paniddb{0}>
> the erne i nic
> ddb{0}>
> the kernel did not nic
> ddb{0}>
> thernel id not nc
> ddb{0}>
> the keel did notpanic
> ddb{0}>
> theknel did not nic
> ddb{0}>
> the kernel did not panic
> ddb{0}>
> the kernel did not panic
> ddb{0}>
> the kernel did not panc
> ddb0}>
> the kernel did not panic
> ddb{0}> traccce
> No such command
> ddb{0}>
> the kernel did not panic
> ddb{0}> tracce
> No such command
> ddb{0}> traace
> No such command
> ddb{0}> trace
> acpicpu_idle() at acpicpu_idle+0x1ea
> sched_idle(0) at sched_idle+0x245
> end trace frame: 0x0, count: -2
> ddb{0}>
> acpicpu_idle() at acpicpu_idle+0x1ea
> end trace frame: 0xffff8000218fca60, count: 0
> ddb{0}>
> acpicpu_idle() at acpicpu_idle+0x1ea
> end trace frame: 0xffff8000218fca60, count: 0
> ddb{0}>
> acpicpu_idle() at acpicpu_idle+0x1ea
> end trace frame: 0xffff8000218fca60, count: 0
> ddb{0}>
> acpicpu_idle() at acpicpu_idle+0x1ea
> end trace frame: 0xffff8000218fca60, count: 0
> ddb{0}>
> acpicpu_idle() at acpicpu_idle+0x1ea
> end trace frame: 0xffff8000218fca60, count: 0
> ddb{0}>
> acpicpu_idle() at acpicpu_idle+0x1ea
> end trace frame: 0xffff8000218fca60, count: 0
> ddb{0}> boot dump
> syncing disks..
>
>
>
>
> ddb{1}> boot dump
> syncing disks...panic: kernel diagnostic assertion "p->p_wchan == NULL" failed: file "/usr/src/sys/kern/kern_sched.c", line 338
> Stopped at db_enter+0x12: popq %r11
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> 72309 19595 73 0x100010 0x80 0 syslogd
> db_enter() at db_enter+0x12
> panic() at panic+0x120
> __assert(ffffffff811929f4,ffff80002191fa00,ffff800021750ff0,ffff8000ffffe960) at __assert+0x24
> sched_chooseproc() at sched_chooseproc+0x241
> mi_switch() at mi_switch+0x1b4
> sleep_finish(6d3fbc03daad5a02,ffff80002191fb10) at sleep_finish+0x7f
> sleep_finish_all(270c6ff115c03531,ffff80002191fb10) at sleep_finish_all+0x1f
> tsleep(64263b673ec8ed92,ffffff02417d4200,ffffff027f616830,65420) at tsleep+0xcd
> getblk(f6e14443bcd449df,ffffff027f6167d0,ffff80002191fd00,0,ffffff027f3d3000) at getblk+0xf5
> bread(ffff800000145000,ffffff027f616a28,ffffff027f3d3000,0) at bread+0x1b
> ffs_update(292bb10918b461bf,ffffff027f616a28) at ffs_update+0xfc
> VOP_FSYNC(f6e14443bcb92266,ffff80002191fe38,2be1d547d68afbf,ffff8000ffffe960) at VOP_FSYNC+0x52
> ffs_sync_vnode(5116a32c89f5a557,ffff80002191fe38) at ffs_sync_vnode+0xd2
> vfs_mount_foreach_vnode(9582de35d54b8f61,2,ffff8000ffffe960) at vfs_mount_foreach_vnode+0x4e
> end trace frame: 0xffff80002191fea0, count: 0
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb{1}> boot sync
> panic: kernel diagnostic assertion "__mp_lock_held(&sched_lock, curcpu()) == 0" failed: file "/usr/src/sys/kern/kern_lock.c", line 63
> Stopped at db_enter+0x12: popq %r11
> db_enter() at db_enter+0x12
> panic() at panic+0x120
> __assert(ffffffff811929f4,ffff80002191f530,0,ffffff02369aeae8) at __assert+0x24
> _kernel_lock(778d687e7309b96e,1) at _kernel_lock+0xea
> solock(537a4000962b3bf3) at solock+0x44
> route_input(6e54ffebe841494b,ffff80002191f610,ffff80000012f000) at route_input+0xd1
> if_down(ffff80000012f000) at if_down+0x94
> if_downall() at if_downall+0x62
> boot(c) at boot+0x8d
> reboot(4800) at reboot+0x5a
> nvramattach(ffffffff81d05260) at nvramattach
> db_boot_sync_cmd(ffffffff819a1c9e,ffff80002191f6c0,ffffffff81d05260,1) at db_boot_sync_cmd+0xe
> db_command(be4ef64b60647db8,0) at db_command+0x2b4
> db_command_loop() at db_command_loop+0x96
> end trace frame: 0xffff80002191f820, count: 0
> ddb{1}>
>
>
>
> ddb{1}> dmesg
> OpenBSD 6.4 (GENERIC.MP) #6: Sat Jan 26 20:37:44 CET 2019
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 8544854016 (8149MB)
> avail mem = 8276611072 (7893MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x7f4d8000 (50 entries)
> bios0: vendor American Megatrends Inc. version "1.1a" date 08/27/2015
> bios0: Supermicro A1SAi
> acpi0 at bios0: rev 2
> acpi0: sleep states S0 S5
> acpi0: tables DSDT FACP FPDT FIDT SPMI MCFG WDAT UEFI APIC BDAT HPET SSDT HEST BERT ERST EINJ
> acpi0: wakeup devices PEX1(S0) PEX2(S0) PEX3(S0) PEX4(S0) EHC1(S0)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xe0000000, bus 0-255
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Atom(TM) CPU C2558 @ 2.40GHz, 2400.43 MHz, 06-4d-08
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu0: 1MB 64b/line 16-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 100MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.0.0.0.0.3, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Atom(TM) CPU C2558 @ 2.40GHz, 2400.01 MHz, 06-4d-08
> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu1: 1MB 64b/line 16-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Atom(TM) CPU C2558 @ 2.40GHz, 2400.01 MHz, 06-4d-08
> cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu2: 1MB 64b/line 16-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Atom(TM) CPU C2558 @ 2.40GHz, 2400.01 MHz, 06-4d-08
> cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu3: 1MB 64b/line 16-way L2 cache
> cpu3: smt 0, core 3, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
> acpihpet0 at acpi0: 14318179 Hz
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus 1 (PEX1)
> acpiprt2 at acpi0: bus 2 (BR04)
> acpiprt3 at acpi0: bus 3 (PEX2)
> acpiprt4 at acpi0: bus 4 (PEX3)
> acpiprt5 at acpi0: bus -1 (PEX4)
> acpicpu0 at acpi0: C2(350@41 mwait.3@0x51), C1(1000@1 mwait.1), PSS
> acpicpu1 at acpi0: C2(350@41 mwait.3@0x51), C1(1000@1 mwait.1), PSS
> acpicpu2 at acpi0: C2(350@41 mwait.3@0x51), C1(1000@1 mwait.1), PSS
> acpicpu3 at acpi0: C2(350@41 mwait.3@0x51), C1(1000@1 mwait.1), PSS
> "PNP0003" at acpi0 not configured
> acpicmos0 at acpi0
> "IPI0001" at acpi0 not configured
> "PNP0C33" at acpi0 not configured
> ipmi at mainbus0 not configured
> cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2400, 2300, 2200, 2100, 2000, 1900, 1800, 1700, 1600, 1500, 1400, 1300, 1200 MHz
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel Atom C2000 Host" rev 0x02
> ppb0 at pci0 dev 1 function 0 "Intel Atom C2000 PCIE" rev 0x02: msi
> pci1 at ppb0 bus 1
> ppb1 at pci1 dev 0 function 0 "ASPEED Technology AST1150 PCI" rev 0x03
> pci2 at ppb1 bus 2
> vga1 at pci2 dev 0 function 0 "ASPEED Technology AST2000" rev 0x30
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> ppb2 at pci0 dev 2 function 0 "Intel Atom C2000 PCIE" rev 0x02: msi
> pci3 at ppb2 bus 3
> xhci0 at pci3 dev 0 function 0 "Renesas uPD720201 xHCI" rev 0x03: msi, xHCI 1.0
> usb0 at xhci0: USB revision 3.0
> uhub0 at usb0 configuration 1 interface 0 "Renesas xHCI root hub" rev 3.00/1.00 addr 1
> ppb3 at pci0 dev 3 function 0 "Intel Atom C2000 PCIE" rev 0x02: msi
> pci4 at ppb3 bus 4
> vendor "Intel", unknown product 0x1f18 (class processor subclass Co-processor, rev 0x02) at pci0 dev 11 function 0 not configured
> pchb1 at pci0 dev 14 function 0 "Intel Atom C2000 RAS" rev 0x02
> "Intel Atom C2000 RCEC" rev 0x02 at pci0 dev 15 function 0 not configured
> "Intel Atom C2000 SMBus" rev 0x02 at pci0 dev 19 function 0 not configured
> em0 at pci0 dev 20 function 0 "Intel I354 SGMII" rev 0x03: msi, address 0c:c4:7a:ad:2a:e4
> em1 at pci0 dev 20 function 1 "Intel I354 SGMII" rev 0x03: msi, address 0c:c4:7a:ad:2a:e5
> em2 at pci0 dev 20 function 2 "Intel I354 SGMII" rev 0x03: msi, address 0c:c4:7a:ad:2a:e6
> em3 at pci0 dev 20 function 3 "Intel I354 SGMII" rev 0x03: msi, address 0c:c4:7a:ad:2a:e7
> ehci0 at pci0 dev 22 function 0 "Intel Atom C2000 USB" rev 0x02: apic 2 int 23
> usb1 at ehci0: USB revision 2.0
> uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
> ahci0 at pci0 dev 23 function 0 "Intel Atom C2000 AHCI" rev 0x02: msi, AHCI 1.3
> scsibus1 at ahci0: 32 targets
> ahci1 at pci0 dev 24 function 0 "Intel Atom C2000 AHCI" rev 0x02: msi, AHCI 1.3
> ahci1: port 0: 3.0Gb/s
> scsibus2 at ahci1: 32 targets
> sd0 at scsibus2 targ 0 lun 0: <ATA, SanDisk SSD P4 3, SSD> SCSI3 0/direct fixed naa.5001b44509be486c
> sd0: 30533MB, 512 bytes/sector, 62533296 sectors, thin
> pcib0 at pci0 dev 31 function 0 "Intel Atom C2000 PCU" rev 0x02
> ichiic0 at pci0 dev 31 function 3 "Intel Atom C2000 PCU SMBus" rev 0x02: apic 2 int 18
> iic0 at ichiic0
> sdtemp0 at iic0 addr 0x18: at30ts00
> iic0: addr 0x2e 00=3f words 00=3f3f 01=0000 02=0000 03=0000 04=0000 05=0000 06=0000 07=0000
> spdmem0 at iic0 addr 0x50: 8GB DDR3 SDRAM ECC PC3-12800 with thermal sensor
> isa0 at pcib0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> com0: console
> com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pms0 at pckbc0 (aux slot)
> wsmouse0 at pms0 mux 0
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation)
> uhub2 at uhub1 port 1 configuration 1 interface 0 "Intel product 0x07db" rev 2.00/0.02 addr 2
> uhub3 at uhub2 port 3 configuration 1 interface 0 "ATEN International product 0x7000" rev 2.00/0.00 addr 3
> uhidev0 at uhub3 port 1 configuration 1 interface 0 "ATEN International product 0x2419" rev 1.10/1.00 addr 4
> uhidev0: iclass 3/1
> ukbd0 at uhidev0: 8 variable keys, 6 key codes
> wskbd1 at ukbd0 mux 1
> wskbd1: connecting to wsdisplay0
> uhidev1 at uhub3 port 1 configuration 1 interface 1 "ATEN International product 0x2419" rev 1.10/1.00 addr 4
> uhidev1: iclass 3/1
> ums0 at uhidev1: 3 buttons, Z dir
> wsmouse1 at ums0 mux 0
> vscsi0 at root
> scsibus3 at vscsi0: 256 targets
> softraid0 at root
> scsibus4 at softraid0: 256 targets
> root on sd0a (f7ed85462a4bd128.a) swap on sd0b dump on sd0b
> WARNING: / was not properly unmounted
> NMI ... going to debugger
> NSMtIo p..p.ed agtoi ng tao debuggecrpi
> cpu_idle+0x1ea: movq $0x8,%rcx
> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
> ddb{1}>
> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
> ddb{1}
> > ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
> > ddb{1
> }> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
> ddb{
> 1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
> ddb
> {1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
> dd
> b{1}> ddb{1}> ddb{1}> ddb{1}> ddb{1}>
>