Hello Hrvoje,

On Fri, Nov 25, 2022 at 09:57:15AM +0100, Hrvoje Popovski wrote:
> Hi,
> 
> I think that this is similar problem as what David Hill send on tech@
> with subject "splassert on boot"
> 
> I've checkout tree few minutes ago and in there should be
> mvs@ "Remove netlock assertion within PF_LOCK()" and
> dlg@ "get rid of NET_LOCK in the pf purge work" diffs.
> 
> on boot I'm getting this splassert
> 
> splassert: pfsync_delete_state: want 2 have 256
> Starting stack trace...
> pfsync_delete_state(fffffd83a66644d8) at pfsync_delete_state+0x58
> pf_remove_state(fffffd83a66644d8) at pf_remove_state+0x14b
> pf_purge_expired_states(1fdb,40) at pf_purge_expired_states+0x202
> pf_purge_states(0) at pf_purge_states+0x1c
> taskq_thread(ffffffff822f69c8) at taskq_thread+0x11a
> end trace frame: 0x0, count: 252
> End of stack trace.

    I've sent a diff yesterday to David Hill [1]. It looks like
    I've forgot to add cc' to tach@


> splassert: pfsync_delete_state: want 2 have 0
> Starting stack trace...
> pfsync_delete_state(fffffd83a6676628) at pfsync_delete_state+0x58
> pf_remove_state(fffffd83a6676628) at pf_remove_state+0x14b
> pf_purge_expired_states(1f9c,40) at pf_purge_expired_states+0x202
> pf_purge_states(0) at pf_purge_states+0x1c
> taskq_thread(ffffffff822f69c8) at taskq_thread+0x11a
> end trace frame: 0x0, count: 252
> End of stack trace.
> 
> 
> and if i destroy pfsync interface and then sh /etc/netstart box panic
> 
> uvm_fault(0xffffffff823d3250, 0x810, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at      pfsync_q_ins+0x1a:      movq    0x810(%r13),%rsi
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> * 68977  95532      0     0x14000      0x200    3K systqmp
> pfsync_q_ins(fffffd83a6676628,2) at pfsync_q_ins+0x1a
> pf_remove_state(fffffd83a6676628) at pf_remove_state+0x14b
> pf_purge_expired_states(1f9c,40) at pf_purge_expired_states+0x202
> pf_purge_states(0) at pf_purge_states+0x1c
> taskq_thread(ffffffff822f69c8) at taskq_thread+0x11a
> end trace frame: 0x0, count: 10
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{3}>
> 

    Looks like we need to synchronize pfsync destroy with timer thread.

thanks for great testing.

regards
sashan

--------8<---------------8<---------------8<------------------8<--------
diff --git a/sys/net/if_pfsync.c b/sys/net/if_pfsync.c
index f69790ee98d..24963a546de 100644
--- a/sys/net/if_pfsync.c
+++ b/sys/net/if_pfsync.c
@@ -1865,8 +1865,6 @@ pfsync_undefer(struct pfsync_deferral *pd, int drop)
 {
        struct pfsync_softc *sc = pfsyncif;
 
-       NET_ASSERT_LOCKED();
-
        if (sc == NULL)
                return;
 
@@ -2128,8 +2126,6 @@ pfsync_delete_state(struct pf_state *st)
 {
        struct pfsync_softc *sc = pfsyncif;
 
-       NET_ASSERT_LOCKED();
-
        if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
                return;
 
@@ -2188,8 +2184,6 @@ pfsync_clear_states(u_int32_t creatorid, const char 
*ifname)
                struct pfsync_clr clr;
        } __packed r;
 
-       NET_ASSERT_LOCKED();
-
        if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
                return;
 

Reply via email to