Hello,
after 2 days of uptime there was another crash.
OpenBSD 7.2-beta (GENERIC.MP) #712: Mon Aug 29 12:35:51 MDT 2022
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
ddb{2}> show panic
*cpu2: uvm_fault(0xffffffff823e0440, 0x0, 0, 2) -> e
ddb{2}> trace
amdgpu_vram_mgr_reserve_range(ffff8000226d3738,14,9) at amdgpu_vram_mgr_reserve_range+0x101
esp46_input(ffff8000226d3738,ffff8000226d3744,32,2) at esp46_input+0xee
ip_deliver(ffff8000226d3738,ffff8000226d3744,32,2) at ip_deliver+0x137
ipintr() at ipintr+0x69
if_netisr(0) at if_netisr+0xea
taskq_thread(ffff80000002c080) at taskq_thread+0x100
end trace frame: 0x0, count: -6
ddb{2}> show register
rdi 0xffff800000cf4478
rsi 0
rbp 0xffff8000226d3630
rbx 0x4
rdx 0xd0ec __ALIGN_SIZE+0xc0ec
rcx 0x5
rax 0
r8 0x10
r9 0x40bc0e0c79f31f6e
r10 0
r11 0x9a636d5cd973370a
r12 0x32
r13 0x14
r14 0xffff8000226d3738
r15 0xffff800000cf4448
rip 0xffffffff81968321 amdgpu_vram_mgr_reserve_range+0x101
cs 0x8
rflags 0x10246 __ALIGN_SIZE+0xf246
rsp 0xffff8000226d3598
ss 0x10
amdgpu_vram_mgr_reserve_range+0x101: addb %al,0(%rax)
ddb{2}> ps
PID TID PPID UID S FLAGS WAIT COMMAND
86171 333411 63955 74 3 0x1100092 bpf pflogd
63955 461469 1 0 3 0x80 netio pflogd
44926 303198 1 0 3 0x100083 ttyin getty
91823 366301 1 0 3 0x100098 kqread cron
56167 242659 1 0 3 0x80 nanoslp apcupsd
56167 147809 1 0 3 0x4000088 sigwait apcupsd
56167 171020 1 0 3 0x4000080 nanoslp apcupsd
47439 46683 1 99 3 0x1100090 kqread sndiod
51482 395614 1 110 3 0x100090 kqread sndiod
5163 480478 99386 95 3 0x1100092 kqread smtpd
76226 349104 99386 103 3 0x1100092 kqread smtpd
53394 15461 99386 95 3 0x1100092 kqread smtpd
7344 376375 99386 95 3 0x100092 kqread smtpd
63235 309137 99386 95 3 0x1100092 kqread smtpd
48160 64945 99386 95 3 0x1100092 kqread smtpd
99386 38247 1 0 3 0x100080 kqread smtpd
41420 333984 1 77 3 0x1100090 kqread dhcpd
17481 420686 1 0 3 0x88 kqread sshd
18611 160721 77165 68 3 0x1000090 kqread isakmpd
77165 391744 1 0 3 0x80 netio isakmpd
79254 85499 1 0 3 0x100080 kqread ntpd
10257 99703 79620 83 3 0x100092 kqread ntpd
79620 463938 1 83 3 0x1100092 kqread ntpd
91894 184130 24465 73 3 0x1100090 kqread syslogd
24465 244414 1 0 3 0x100082 netio syslogd
62105 141098 1 0 3 0x100080 kqread resolvd
19257 376757 61629 77 3 0x100092 kqread dhcpleased
76652 204506 61629 77 3 0x100092 kqread dhcpleased
61629 486499 1 0 3 0x80 kqread dhcpleased
90626 362267 95555 115 3 0x100092 kqread slaacd
93187 97889 95555 115 3 0x100092 kqread slaacd
95555 477868 1 0 3 0x100080 kqread slaacd
11780 22955 0 0 3 0x14200 bored smr
98305 221595 0 0 3 0x14200 pgzero zerothread
96593 391889 0 0 3 0x14200 aiodoned aiodoned
30232 412444 0 0 3 0x14200 syncer update
45741 353942 0 0 3 0x14200 cleaner cleaner
39902 310884 0 0 3 0x14200 reaper reaper
65354 212624 0 0 3 0x14200 pgdaemon pagedaemon
33348 495407 0 0 3 0x14200 mmctsk sdmmc0
74730 412089 0 0 3 0x14200 usbtsk usbtask
86868 405536 0 0 3 0x14200 usbatsk usbatsk
20735 139086 0 0 3 0x40014200 acpi0 acpi0
6266 77841 0 0 3 0x40014200 idle3
61610 247494 0 0 3 0x40014200 idle2
55796 232481 0 0 7 0x40014200 idle1
91772 351464 0 0 3 0x14200 bored sensors
89632 253333 0 0 3 0x14200 bored softnet
4662 418517 0 0 3 0x14200 bored softnet
22887 421630 0 0 7 0x14200 softnet
*56210 145960 0 0 7 0x14200 softnet
45994 274986 0 0 3 0x14200 bored systqmp
23497 183217 0 0 3 0x14200 bored systq
99138 346429 0 0 3 0x40014200 bored softclock
83442 336559 0 0 7 0x40014200 idle0
1 436237 0 0 3 0x82 wait init
0 0 -1 0 3 0x10200 scheduler swapper
ddb{2}> mach ddbcpu 0
Stopped at x86_ipi_db+0x12: leave
x86_ipi_db(ffffffff822d4ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xa6
softintr_dispatch(0) at softintr_dispatch+0x49
Xsoftclock() at Xsoftclock+0x1f
acpicpu_idle() at acpicpu_idle+0x11f
sched_idle(ffffffff822d4ff0) at sched_idle+0x280
end trace frame: 0x0, count: 7
ddb{0}> mach ddbcpu 1
Stopped at x86_ipi_db+0x12: leave
x86_ipi_db(ffff800022508ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
acpicpu_idle() at acpicpu_idle+0x11f
sched_idle(ffff800022508ff0) at sched_idle+0x280
end trace frame: 0x0, count: 10
ddb{1}> mach ddbcpu 2
Stopped at amdgpu_vram_mgr_reserve_range+0x101: addb %al,0(%rax)
amdgpu_vram_mgr_reserve_range(ffff8000226d3738,14,9) at amdgpu_vram_mgr_reserve_range+0x101
esp46_input(ffff8000226d3738,ffff8000226d3744,32,2) at esp46_input+0xee
ip_deliver(ffff8000226d3738,ffff8000226d3744,32,2) at ip_deliver+0x137
ipintr() at ipintr+0x69
if_netisr(0) at if_netisr+0xea
taskq_thread(ffff80000002c080) at taskq_thread+0x100
end trace frame: 0x0, count: 9
ddb{2}> mach ddbcpu 3
Stopped at x86_ipi_db+0x12: leave
x86_ipi_db(ffff80002251aff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
__mp_acquire_count(ffff80000002c100,ffff80000002c118) at __mp_acquire_count
taskq_next_work(ffff80000002c100,ffff8000226d93d0) at taskq_next_work+0x61
taskq_thread(ffff80000002c100) at taskq_thread+0xeb
end trace frame: 0x0, count: 9
ddb{3}>
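For reference, the session above follows the command order recommended in the
quoted mail below: everything on the panicking CPU first, then the other CPUs.
A rough sketch of that sequence as typed at the prompt (the trace after each
mach ddbcpu is assumed here, since only its output appears in the log above):

```
ddb{2}> show panic
ddb{2}> trace
ddb{2}> show register
ddb{2}> ps
ddb{2}> mach ddbcpu 0
ddb{0}> trace
ddb{0}> mach ddbcpu 1
ddb{1}> trace
ddb{1}> mach ddbcpu 2
ddb{2}> trace
ddb{2}> mach ddbcpu 3
ddb{3}> trace
```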
On Wed, 31 Aug 2022 22:07:45 +0200
Radek <[email protected]> wrote:
> Hello Alexandr, hello Alexander,
>
> > does your box also run the diff committed [1] by bluhm@ about a week ago?
> No, it didn't. I missed that diff. I upgraded to a new snapshot yesterday. It
> works fine so far.
> OpenBSD 7.2-beta (GENERIC.MP) #712: Mon Aug 29 12:35:51 MDT 2022
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Thank you, Alexander, for your detailed explanation of the proper order of
> ddb commands.
>
> Radek
>
>
> On Mon, 29 Aug 2022 12:30:31 +0200
> Alexander Bluhm <[email protected]> wrote:
>
> > On Mon, Aug 29, 2022 at 04:42:45AM +0200, Radek wrote:
> > > the same problem occurs on -current.
> >
> > It is not the same problem. Traces are different. But I guess
> > your setup triggers some sort of race.
> >
> > Previous crashes with 7.1 were in route and IPsec, now it is in pf.
> > Unfortunately you missed my pf fragment fix by a couple of hours.
> > Please try a newer snapshot.
> >
> > OpenBSD 7.2-beta (GENERIC.MP) #705: Mon Aug 22 12:25:07 MDT 2022
> > Changes by: [email protected] 2022/08/22 14:35:39
> >
> > I could not figure out what is wrong with the 7.1-stable crashes. The
> > register and ps output are not from the CPU where the crash happened.
> > You have to run show register and ps before switching CPUs with mach
> > ddbcpu.
> >
> > So first run show panic. Then trace, show register, ps.
> > Finally, inspect the other CPUs with mach ddbcpu.
> >
> > The number in the ddb{2}> prompt shows the CPU you are currently on.
> > If "show panic" mentions more than one CPU, the one with the * is
> > the interesting one. Usually ddb drops to that initially. Traces
> > from the other CPUs help to see if something was running concurrently.
> >
> > bluhm
> >
>
Radek