Hello Hrvoje,

On Fri, Nov 25, 2022 at 09:57:15AM +0100, Hrvoje Popovski wrote:
> Hi,
>
> I think that this is similar problem as what David Hill send on tech@
> with subject "splassert on boot"
>
> I've checkout tree few minutes ago and in there should be
> mvs@ "Remove netlock assertion within PF_LOCK()" and
> dlg@ "get rid of NET_LOCK in the pf purge work" diffs.
>
> on boot I'm getting this splassert
>
> splassert: pfsync_delete_state: want 2 have 256
> Starting stack trace...
> pfsync_delete_state(fffffd83a66644d8) at pfsync_delete_state+0x58
> pf_remove_state(fffffd83a66644d8) at pf_remove_state+0x14b
> pf_purge_expired_states(1fdb,40) at pf_purge_expired_states+0x202
> pf_purge_states(0) at pf_purge_states+0x1c
> taskq_thread(ffffffff822f69c8) at taskq_thread+0x11a
> end trace frame: 0x0, count: 252
> End of stack trace.

I sent a diff to David Hill yesterday [1]. It looks like I forgot
to cc tech@.

> splassert: pfsync_delete_state: want 2 have 0
> Starting stack trace...
> pfsync_delete_state(fffffd83a6676628) at pfsync_delete_state+0x58
> pf_remove_state(fffffd83a6676628) at pf_remove_state+0x14b
> pf_purge_expired_states(1f9c,40) at pf_purge_expired_states+0x202
> pf_purge_states(0) at pf_purge_states+0x1c
> taskq_thread(ffffffff822f69c8) at taskq_thread+0x11a
> end trace frame: 0x0, count: 252
> End of stack trace.
>
>
> and if i destroy pfsync interface and then sh /etc/netstart box panic
>
> uvm_fault(0xffffffff823d3250, 0x810, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at pfsync_q_ins+0x1a: movq 0x810(%r13),%rsi
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> * 68977  95532      0     0x14000      0x200    3K systqmp
> pfsync_q_ins(fffffd83a6676628,2) at pfsync_q_ins+0x1a
> pf_remove_state(fffffd83a6676628) at pf_remove_state+0x14b
> pf_purge_expired_states(1f9c,40) at pf_purge_expired_states+0x202
> pf_purge_states(0) at pf_purge_states+0x1c
> taskq_thread(ffffffff822f69c8) at taskq_thread+0x11a
> end trace frame: 0x0, count: 10
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports. Insufficient info makes it difficult to find and fix bugs.
> ddb{3}>
>

It looks like we need to synchronize pfsync destroy with the timer
thread.

Thanks for the great testing.

regards
sashan

--------8<---------------8<---------------8<------------------8<--------
diff --git a/sys/net/if_pfsync.c b/sys/net/if_pfsync.c
index f69790ee98d..24963a546de 100644
--- a/sys/net/if_pfsync.c
+++ b/sys/net/if_pfsync.c
@@ -1865,8 +1865,6 @@ pfsync_undefer(struct pfsync_deferral *pd, int drop)
 {
 	struct pfsync_softc *sc = pfsyncif;
 
-	NET_ASSERT_LOCKED();
-
 	if (sc == NULL)
 		return;
 
@@ -2128,8 +2126,6 @@ pfsync_delete_state(struct pf_state *st)
 {
 	struct pfsync_softc *sc = pfsyncif;
 
-	NET_ASSERT_LOCKED();
-
 	if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
 		return;
 
@@ -2188,8 +2184,6 @@ pfsync_clear_states(u_int32_t creatorid, const char *ifname)
 		struct pfsync_clr clr;
 	} __packed r;
 
-	NET_ASSERT_LOCKED();
-
 	if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
 		return;
 