Hi, I'm running the 20220910 snapshot. There hasn't been a single crash in two weeks. Thanks!
On Thu, 1 Sep 2022 22:03:25 +0200
Radek <[email protected]> wrote:

> Hello,
> after 2 days uptime there was another crash.
>
> OpenBSD 7.2-beta (GENERIC.MP) #712: Mon Aug 29 12:35:51 MDT 2022
>     [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> ddb{2}> show panic
> *cpu2: uvm_fault(0xffffffff823e0440, 0x0, 0, 2) -> e
>
> ddb{2}> trace
> amdgpu_vram_mgr_reserve_range(ffff8000226d3738,14,9) at amdgpu_vram_mgr_reserve_range+0x101
> esp46_input(ffff8000226d3738,ffff8000226d3744,32,2) at esp46_input+0xee
> ip_deliver(ffff8000226d3738,ffff8000226d3744,32,2) at ip_deliver+0x137
> ipintr() at ipintr+0x69
> if_netisr(0) at if_netisr+0xea
> taskq_thread(ffff80000002c080) at taskq_thread+0x100
> end trace frame: 0x0, count: -6
>
> ddb{2}> show register
> rdi       0xffff800000cf4478
> rsi       0
> rbp       0xffff8000226d3630
> rbx       0x4
> rdx       0xd0ec    __ALIGN_SIZE+0xc0ec
> rcx       0x5
> rax       0
> r8        0x10
> r9        0x40bc0e0c79f31f6e
> r10       0
> r11       0x9a636d5cd973370a
> r12       0x32
> r13       0x14
> r14       0xffff8000226d3738
> r15       0xffff800000cf4448
> rip       0xffffffff81968321    amdgpu_vram_mgr_reserve_range+0x101
> cs        0x8
> rflags    0x10246    __ALIGN_SIZE+0xf246
> rsp       0xffff8000226d3598
> ss        0x10
> amdgpu_vram_mgr_reserve_range+0x101:    addb    %al,0(%rax)
>
> ddb{2}> ps
>    PID     TID   PPID  UID  S       FLAGS  WAIT      COMMAND
>  86171  333411  63955   74  3   0x1100092  bpf       pflogd
>  63955  461469      1    0  3        0x80  netio     pflogd
>  44926  303198      1    0  3    0x100083  ttyin     getty
>  91823  366301      1    0  3    0x100098  kqread    cron
>  56167  242659      1    0  3        0x80  nanoslp   apcupsd
>  56167  147809      1    0  3   0x4000088  sigwait   apcupsd
>  56167  171020      1    0  3   0x4000080  nanoslp   apcupsd
>  47439   46683      1   99  3   0x1100090  kqread    sndiod
>  51482  395614      1  110  3    0x100090  kqread    sndiod
>   5163  480478  99386   95  3   0x1100092  kqread    smtpd
>  76226  349104  99386  103  3   0x1100092  kqread    smtpd
>  53394   15461  99386   95  3   0x1100092  kqread    smtpd
>   7344  376375  99386   95  3    0x100092  kqread    smtpd
>  63235  309137  99386   95  3   0x1100092  kqread    smtpd
>  48160   64945  99386   95  3   0x1100092  kqread    smtpd
>  99386   38247      1    0  3    0x100080  kqread    smtpd
>  41420  333984      1   77  3   0x1100090  kqread    dhcpd
>  17481  420686      1    0  3        0x88  kqread    sshd
>  18611  160721  77165   68  3   0x1000090  kqread    isakmpd
>  77165  391744      1    0  3        0x80  netio     isakmpd
>  79254   85499      1    0  3    0x100080  kqread    ntpd
>  10257   99703  79620   83  3    0x100092  kqread    ntpd
>  79620  463938      1   83  3   0x1100092  kqread    ntpd
>  91894  184130  24465   73  3   0x1100090  kqread    syslogd
>  24465  244414      1    0  3    0x100082  netio     syslogd
>  62105  141098      1    0  3    0x100080  kqread    resolvd
>  19257  376757  61629   77  3    0x100092  kqread    dhcpleased
>  76652  204506  61629   77  3    0x100092  kqread    dhcpleased
>  61629  486499      1    0  3        0x80  kqread    dhcpleased
>  90626  362267  95555  115  3    0x100092  kqread    slaacd
>  93187   97889  95555  115  3    0x100092  kqread    slaacd
>  95555  477868      1    0  3    0x100080  kqread    slaacd
>  11780   22955      0    0  3     0x14200  bored     smr
>  98305  221595      0    0  3     0x14200  pgzero    zerothread
>  96593  391889      0    0  3     0x14200  aiodoned  aiodoned
>  30232  412444      0    0  3     0x14200  syncer    update
>  45741  353942      0    0  3     0x14200  cleaner   cleaner
>  39902  310884      0    0  3     0x14200  reaper    reaper
>  65354  212624      0    0  3     0x14200  pgdaemon  pagedaemon
>  33348  495407      0    0  3     0x14200  mmctsk    sdmmc0
>  74730  412089      0    0  3     0x14200  usbtsk    usbtask
>  86868  405536      0    0  3     0x14200  usbatsk   usbatsk
>  20735  139086      0    0  3  0x40014200  acpi0     acpi0
>   6266   77841      0    0  3  0x40014200            idle3
>  61610  247494      0    0  3  0x40014200            idle2
>  55796  232481      0    0  7  0x40014200            idle1
>  91772  351464      0    0  3     0x14200  bored     sensors
>  89632  253333      0    0  3     0x14200  bored     softnet
>   4662  418517      0    0  3     0x14200  bored     softnet
>  22887  421630      0    0  7     0x14200            softnet
> *56210  145960      0    0  7     0x14200            softnet
>  45994  274986      0    0  3     0x14200  bored     systqmp
>  23497  183217      0    0  3     0x14200  bored     systq
>  99138  346429      0    0  3  0x40014200  bored     softclock
>  83442  336559      0    0  7  0x40014200            idle0
>      1  436237      0    0  3        0x82  wait      init
>      0       0     -1    0  3     0x10200  scheduler swapper
>
> ddb{2}> mach ddbcpu 0
> Stopped at x86_ipi_db+0x12: leave
> x86_ipi_db(ffffffff822d4ff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> _kernel_lock() at _kernel_lock+0xa6
> softintr_dispatch(0) at softintr_dispatch+0x49
> Xsoftclock() at Xsoftclock+0x1f
> acpicpu_idle() at acpicpu_idle+0x11f
> sched_idle(ffffffff822d4ff0) at sched_idle+0x280
> end trace frame: 0x0, count: 7
>
> ddb{0}> mach ddbcpu 1
> Stopped at x86_ipi_db+0x12: leave
> x86_ipi_db(ffff800022508ff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> acpicpu_idle() at acpicpu_idle+0x11f
> sched_idle(ffff800022508ff0) at sched_idle+0x280
> end trace frame: 0x0, count: 10
>
> ddb{1}> mach ddbcpu 2
> Stopped at amdgpu_vram_mgr_reserve_range+0x101: addb %al,0(%rax)
> amdgpu_vram_mgr_reserve_range(ffff8000226d3738,14,9) at amdgpu_vram_mgr_reserve_range+0x101
> esp46_input(ffff8000226d3738,ffff8000226d3744,32,2) at esp46_input+0xee
> ip_deliver(ffff8000226d3738,ffff8000226d3744,32,2) at ip_deliver+0x137
> ipintr() at ipintr+0x69
> if_netisr(0) at if_netisr+0xea
> taskq_thread(ffff80000002c080) at taskq_thread+0x100
> end trace frame: 0x0, count: 9
>
> ddb{2}> mach ddbcpu 3
> Stopped at x86_ipi_db+0x12: leave
> x86_ipi_db(ffff80002251aff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> __mp_acquire_count(ffff80000002c100,ffff80000002c118) at __mp_acquire_count
> taskq_next_work(ffff80000002c100,ffff8000226d93d0) at taskq_next_work+0x61
> taskq_thread(ffff80000002c100) at taskq_thread+0xeb
> end trace frame: 0x0, count: 9
> ddb{3}>
>
>
> On Wed, 31 Aug 2022 22:07:45 +0200
> Radek <[email protected]> wrote:
>
> > Hello Alexandr, hello Alexander,
> >
> > > does your box run also diff committed [1] by bluhm@ ~week ago?
> > No, I didn't. I missed that diff. I upgraded to a new snapshot yesterday. It
> > works fine so far.
> > OpenBSD 7.2-beta (GENERIC.MP) #712: Mon Aug 29 12:35:51 MDT 2022
> >     [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> >
> > Thank you Alexander for your extensive explanation of the proper ddb
> > commands order.
> >
> > Radek
> >
> >
> > On Mon, 29 Aug 2022 12:30:31 +0200
> > Alexander Bluhm <[email protected]> wrote:
> >
> > > On Mon, Aug 29, 2022 at 04:42:45AM +0200, Radek wrote:
> > > > the same problem occurs on -current.
> > >
> > > It is not the same problem. Traces are different. But I guess
> > > your setup triggers some sort of race.
> > >
> > > Previous crashes with 7.1 were in route and IPsec, now it is in pf.
> > > Unfortunately you missed my pf fragment fix by a couple of hours.
> > > Please try a newer snapshot.
> > >
> > > OpenBSD 7.2-beta (GENERIC.MP) #705: Mon Aug 22 12:25:07 MDT 2022
> > > Changes by: [email protected] 2022/08/22 14:35:39
> > >
> > > I could not figure out what is wrong with 7.1-stable crashes. The
> > > register and ps output are not from the CPU where the crash happened.
> > > You have to run show register and ps before switching CPU with mach
> > > ddbcpu.
> > >
> > > So first run show panic. Then trace, show register, ps.
> > > Finally inspect the other CPU with mach ddbcpu.
> > >
> > > The number in ddb{2}> prompt shows the CPU you are currently on.
> > > If "show panic" mentions more than one CPU, the one with the * is
> > > the interesting one. Usually ddb drops to that initially. Traces
> > > from other CPU help to see if something was running concurrently.
> > >
> > > bluhm
> > >
> >
> >
> > Radek
> >
>
>
> Radek

Radek
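
PS: For anyone who finds this thread later, here is a rough sketch of the
ddb command order Alexander describes in the quoted mail, as I understand
it. The prompt number 2 is just the CPU from this report, and N is a
placeholder for each of the other CPU numbers. Typing trace again after
each switch is my assumption, based on his note that traces from the other
CPUs help.

    ddb{2}> show panic
    ddb{2}> trace
    ddb{2}> show register
    ddb{2}> ps
    ddb{2}> mach ddbcpu N
    ddb{N}> trace

The point of the order is that show register and ps are taken on the CPU
that panicked, before mach ddbcpu moves ddb to another one.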
