Hello,

Thanks for your bug report.

On 07/09/21(Tue) 15:18, M Smith wrote:
> > Synopsis: OpenBSD amd64 6.9 repeatable kernel panic starting X
> > Category: kernel
> > Environment:
>
> System : OpenBSD 6.9
> Details : OpenBSD 6.9 (GENERIC.MP) #4: Tue Aug 10 08:12:23 MDT 2021
>
> r...@syspatch-69-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
>
> > Description:
>
> I have been investigating a largely repeatable OpenBSD 6.9
> amd64 panic. Essentially the OS drops into the kernel debugger about 90% of
> the time when starting X on specific hardware, and is doing so with what
> seems like a memory related issue - possibly errant modification by
> concurrent threads.

Indeed. You're certainly hitting a VM/pmap bug.

> The event is reproducible across two independent machines (both new).
> Each machine has identical underlying hardware. A memory checker run
> overnight on one machine did not identify any underlying memory issues.

That points to something in your setup which exposes the bug.

> The hardware: Avalue EMS-TGL-S85-A1-1R, CPU an 11th Gen Intel(R)
> Core(TM) i7-1185G7E @ 2.80GHz with 2x 16GB memory boards (32GB in total).
>
> The mentioned possible errant memory modification, the assertion
> underlying this panic
> (https://www.sirranet.co.nz/openbsd_542456/69_panic.html) suggests that
> kernel execution has failed to obtain a necessary exclusivity lock. Various
> other panics differ in that many feature assertions based on "pool_do_get ...
> offset ???" with the offset identifying the trigger condition, hinting at a
> memory inconsistency.
>
> Testing on 7.0-current
> (https://www.sirranet.co.nz/openbsd_542456/70_panic.html) sometimes results
> in a panic on boot before invoking startX, other times the boot fails to
> complete cleanly at the kernel linking step with the error "reodering
> libraries ld in calloc(): chunk infor corrupted" and simular errors.
> Whether these two events are related to the 6.9 panic is anything but
> conclusive.
>
> I see others have posted what looks like the same issue. I have posted
> the above detail however as the assert identifying the lack of kernel lock
> looks as though it may be of some value.
> https://marc.info/?t=161769314800002&r=1&w=2
> https://marc.info/?t=162390602600001&r=1&w=2

All those reports have in common an 11th Gen Intel CPU.

> Any ideas would be greatly appreciated.

You could start by booting bsd.sp to rule out any HW problem. Does the
corruption happen with a vanilla install, or does running a particular
program make it easier to trigger?

> I can easily test/re-test on both 6.9 and 7.0-current).

Does it also happen if you disable drm at boot?
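
In case it helps, here is a rough sketch of both tests as typed at the boot
loader prompt. It assumes the single-processor kernel is installed as /bsd.sp
and that the graphics driver attaching on this machine is inteldrm; the driver
name is my guess from the Intel CPU, so please check your dmesg first.

```
# Test 1: boot the single-processor kernel instead of GENERIC.MP
boot> boot bsd.sp

# Test 2: boot the usual kernel into the boot-time configuration editor (UKC),
# disable the drm driver for this boot only, then continue booting
boot> boot -c
UKC> disable inteldrm
UKC> quit
```

Changes made in UKC this way only last for that boot, so a normal reboot
brings drm back.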