hello,

On 2023-11-04 15:26, Alexandr Nedvedicky wrote:
> Hello Johan,
>
> On Sat, Nov 04, 2023 at 10:01:06AM -0400, Johan Huldtgren wrote:
> > hello,
> >
> > On 2023-11-03 19:10, Alexandr Nedvedicky wrote:
> > > Hello Johan,
> > >
> > >
> > > On Fri, Nov 03, 2023 at 12:27:53PM -0400, Johan Huldtgren wrote:
> > > </snip>
> > > >
> > > > so this box just has the default (from when it was installed) ruleset.
> > > >
> > > > $ doas cat /etc/pf.conf
> > > > # $OpenBSD: pf.conf,v 1.55 2017/12/03 20:40:04 sthen Exp $
> > > > #
> > > > # See pf.conf(5) and /etc/examples/pf.conf
> > > >
> > > > set skip on lo
> > > > set state-defaults pflow
> > > >
> > > > block return    # block stateless traffic
> > > > pass            # establish keep-state
> > > >
> > > > # By default, do not permit remote connections to X11
> > > > block return in on ! lo0 proto tcp to port 6000:6010
> > > >
> > > > # Port build user does not need network
> > > > block return out log proto {tcp udp} user _pbuild
> > > >
> > >
> > > So that's surprising then... Looks like you are very lucky
> > > to hit the ASSERT. I'm surprised we have not seen it earlier.
> > >
> > > Diff below makes sure pf_test() function does not overwrite
> > > timeout member in pf_state structure when timeout is set
> > > to PFTM_UNLINKED already. We also modify/update timeout member
> > > under protection of state mutex (pf_state::mtx).
> > >
> > >
> > > Can you test the diff below? It applies to current as well to 7.4
> >
> > I've rebuilt with your diff, as the panic was seemingly random I'm not
> > sure how I can test, but I'll let this system run with your patch and
> > report any issues should I see them. If you have any specific things
> > you'd like me to try don't hesitate to let me know. dmesg below for
> > complteness sake.
> >
> > thanks again,
> >
>
> I'm afraid there is nothing more to do than keep an eye on your
> system. I think what really increased a chance here is the number
> of CPUs your box has.
>
> It is OK if you can come back with report early in December to let
> us know if it helps or if there are more similar issues (which I'm
> sure there are still some left).
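
As an illustration only (this is not the actual diff), the guard described in the quoted message can be sketched in userland C. Everything here is a hypothetical stand-in: `struct state`, `TM_UNLINKED`, `TM_ESTABLISHED` and `state_set_timeout()` are made-up names, and a pthread mutex plays the role of pf_state::mtx. The idea is simply that the timeout is only written while holding the mutex, and never once it has been marked unlinked.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; the real definitions live in the kernel's pf code. */
enum { TM_ESTABLISHED = 1, TM_UNLINKED = 99 };

struct state {
	pthread_mutex_t	mtx;		/* plays the role of pf_state::mtx */
	int		timeout;
};

/*
 * Set a new timeout unless the state has already been marked unlinked.
 * The check and the write both happen under the mutex, so a concurrent
 * unlink cannot be overwritten. Returns 1 if updated, 0 if left alone.
 */
static int
state_set_timeout(struct state *st, int timeout)
{
	int updated = 0;

	assert(st != NULL);
	pthread_mutex_lock(&st->mtx);
	if (st->timeout != TM_UNLINKED) {
		st->timeout = timeout;
		updated = 1;
	}
	pthread_mutex_unlock(&st->mtx);
	return updated;
}
```

With this shape, a caller that races with the code unlinking the state either updates the timeout before the unlink or sees the unlinked marker and leaves it alone.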

so my machine panicked today, but the panic this time is completely
different. I don't know if it's related to this issue, the patch, or a
completely new issue, but I figured I'd start reporting it here.
Unfortunately, when I tried to swap CPU to collect traces from the other
ones the machine froze and I was forced to power cycle it. So I have the
panic and initial trace but that's it.

panic: ip_output no HDR
Stopped at      db_enter+0x14:  popq    %rbp
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
  74003  25022      0        0x10          0    2  afpd
 355827  29745    107   0x1100002  0x4000000    3  vmd
 451006  29745    107   0x1100002  0x4000000    4  vmd
 131508  78367    107   0x1100002  0x4000000    5  vmd
 112644  78367    107   0x1100002  0x4000000    1  vmd
*133058  91446      0     0x14000      0x200    0  softnet0
db_enter() at db_enter+0x14
panic(ffffffff820c20df) at panic+0xc3
ip_output(fffffd8076b76e00,0,fffffd9c9e59e708,0,0,fffffd9c9e59e690,e4a23bf8c0204936) at ip_output+0xa26
udp_output(fffffd9c9e59e690,fffffd8076b76e00,fffffd8079d14b00,0) at udp_output+0x3be
sosend(fffffd9c9e59f000,fffffd8079d14b00,0,fffffd8076b76e00,0,0) at sosend+0x37f
pflow_output_process(ffff8000011a0800) at pflow_output_process+0x67
taskq_thread(ffff800000035200) at taskq_thread+0x100
end trace frame: 0x0, count: 8
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{0}>
ddb{0}> show panic
*cpu0: ip_output no HDR
ddb{0}> trace
db_enter() at db_enter+0x14
panic(ffffffff820c20df) at panic+0xc3
ip_output(fffffd8076b76e00,0,fffffd9c9e59e708,0,0,fffffd9c9e59e690,e4a23bf8c0204936) at ip_output+0xa26
udp_output(fffffd9c9e59e690,fffffd8076b76e00,fffffd8079d14b00,0) at udp_output+0x3be
sosend(fffffd9c9e59f000,fffffd8079d14b00,0,fffffd8076b76e00,0,0) at sosend+0x37f
pflow_output_process(ffff8000011a0800) at pflow_output_process+0x67
taskq_thread(ffff800000035200) at taskq_thread+0x100
end trace frame: 0x0, count: -7

thanks,

.jh