Re: pfsync panic after pf_purge backout

2022-11-29 Thread Hrvoje Popovski
On 28.11.2022. 17:07, Alexandr Nedvedicky wrote:
> The diff below should avoid the panic above (and similar panics in
> pfsync_q_del()). It also prints some 'error' messages to the system
> message buffer (a.k.a. dmesg).
> 
> We panic because we attempt to remove a state from a pfsync queue which is
> already empty. pfsync_q_del() must be acting on some stale information
> kept in the `st` argument (state).
> 
> The information we print into the dmesg buffer should help us understand
> what's going on. At the moment I cannot explain how there can be a state
> which claims to be on a state queue while the queue in question is empty.
> 
> I'd like to ask you to give the diff below a try and repeat your test.
> Let it run for some time and collect the 'dmesg' output for me once the
> usual uptime-to-panic has elapsed during a test run.


Hi,

here's the panic with WITNESS, this diff and this one
https://www.mail-archive.com/tech@openbsd.org/msg72582.html

I will leave box in ddb ...


wsmouse1 at ums1 mux 0
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 10722814532632609891)nosync: no unlinked: no timeout: PFTM_TCP_OPENING
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 13293719008977519715)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 12439372090211468387)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 18234710963242042467)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<4>com1: 1 silo overflow, 0 ibuf overflows
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 14224529277758899299)nosync: yes unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 4752076075771004003)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 15947026397151003747)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 13471286855508329571)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 7900905875382371427)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 5432860724700808291)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9249501979711276131)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 13913476377404146787)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9300623790024590435)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 12501915670263792739)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 13488704270743536739)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 15536914117678236771)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 10039324092961031267)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9090857887935661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9090857887935661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9090857887935661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9090857887935661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 9090857887935661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 18237615958861972579)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 15963740150714238051)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 2422065882583172195)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 10553714141111420003)nosync: no unlinked: no timeout: PFTM_TCP_OPENING
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 15242429745157538915)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id 4989188051813434467)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC
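
When going through a capture like the one above, it can help to pull the
fields out of each "stale queue" message so they can be tallied by state id
or by timeout. The following is only a sketch in C: it assumes each message
sits on a single line in the format produced by the debugging diff, and
parse_stale_line() and struct stale_rec are hypothetical names made up for
this illustration, not part of pfsync.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the fields of one "stale queue" message. */
struct stale_rec {
	char			queue[32];	/* e.g. PFSYNC_S_UPD_C */
	unsigned long long	id;		/* state id */
	char			nosync[8];	/* "yes" or "no" */
	char			unlinked[8];	/* "yes" or "no" */
	char			timeout[32];	/* e.g. PFTM_TCP_CLOSED */
};

/*
 * Parse one dmesg line. Returns 1 if the line is a complete
 * "stale queue" message and all five fields were read, 0 otherwise.
 */
int
parse_stale_line(const char *line, struct stale_rec *r)
{
	const char *p;

	/* Skip the "<3>pfsync: " prefix, if any. */
	p = strstr(line, "pfsync_q_del: stale queue (");
	if (p == NULL)
		return (0);

	/* There is no space between ")" and "nosync:" in the message. */
	return (sscanf(p,
	    "pfsync_q_del: stale queue (%31[^)]) in state (id %llu)"
	    "nosync: %7s unlinked: %7s timeout: %31s",
	    r->queue, &r->id, r->nosync, r->unlinked, r->timeout) == 5);
}
```

Feeding every line of the capture through parse_stale_line() and counting
how often each id occurs is one way to see whether the same state is
reported repeatedly. Truncated or unrelated lines simply return 0.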

Re: pfsync panic after pf_purge backout

2022-11-28 Thread Alexandr Nedvedicky
Hello,


> 
> 
> Hi,
> 
> here's panic with WITNESS and this diff on tech@
> https://www.mail-archive.com/tech@openbsd.org/msg72582.html
> 
> I will stop now because I'm not sure what I'm doing and which diffs I'm
> testing...
> 
> 
> r620-1# uvm_fault(0x8248ea28, 0x17, 0, 2) -> e
> kernel: page fault trap, code=0
> Stopped at      pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *300703  35643      0     0x14000      0x200    1K systq
>  237790  10061  0 0x14000 0x42000  softclock
> pfsync_q_del(fd8323dc3900) at pfsync_q_del+0x96
> pfsync_delete_state(fd8323dc3900) at pfsync_delete_state+0x118
> pf_remove_state(fd8323dc3900) at pf_remove_state+0x14e
> pf_purge_expired_states(c3501) at pf_purge_expired_states+0x1b3
> pf_purge(823ae080) at pf_purge+0x28
> taskq_thread(822cbe30) at taskq_thread+0x11a
> end trace frame: 0x0, count: 9
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{1}>
> 
> 

The diff below should avoid the panic above (and similar panics in
pfsync_q_del()). It also prints some 'error' messages to the system
message buffer (a.k.a. dmesg).

We panic because we attempt to remove a state from a pfsync queue which is
already empty. pfsync_q_del() must be acting on some stale information
kept in the `st` argument (state).

The information we print into the dmesg buffer should help us understand
what's going on. At the moment I cannot explain how there can be a state
which claims to be on a state queue while the queue in question is empty.

I'd like to ask you to give the diff below a try and repeat your test.
Let it run for some time and collect the 'dmesg' output for me once the
usual uptime-to-panic has elapsed during a test run.

thanks a lot
regards
sashan

8<---8<---8<--8<
diff --git a/sys/net/if_pfsync.c b/sys/net/if_pfsync.c
index f69790ee98d..d4be84a1f57 100644
--- a/sys/net/if_pfsync.c
+++ b/sys/net/if_pfsync.c
@@ -2254,6 +2254,74 @@ pfsync_q_ins(struct pf_state *st, int q)
 	} while (0);
 }
 
+const char *
+pfsync_qname(int q)
+{
+	switch (q) {
+	case PFSYNC_S_IACK:
+		return ("PFSYNC_S_IACK");
+
+	case PFSYNC_S_UPD_C:
+		return ("PFSYNC_S_UPD_C");
+
+	case PFSYNC_S_DEL:
+		return ("PFSYNC_S_DEL");
+
+	case PFSYNC_S_INS:
+		return ("PFSYNC_S_INS");
+
+	case PFSYNC_S_UPD:
+		return ("PFSYNC_S_UPD");
+
+	case PFSYNC_S_COUNT:
+		return ("PFSYNC_S_COUNT");
+
+	case PFSYNC_S_DEFER:
+		return ("PFSYNC_S_DEFER");
+
+	case PFSYNC_S_NONE:
+		return ("PFSYNC_S_NONE");
+
+	default:
+		return ("???");
+	}
+}
+
+const char *
+pfsync_timeout_name(unsigned int timeout)
+{
+	const char *timeout_name[] = {
+		"PFTM_TCP_FIRST_PACKET",
+		"PFTM_TCP_OPENING",
+		"PFTM_TCP_ESTABLISHED",
+		"PFTM_TCP_CLOSING",
+		"PFTM_TCP_FIN_WAIT",
+		"PFTM_TCP_CLOSED",
+		"PFTM_UDP_FIRST_PACKET",
+		"PFTM_UDP_SINGLE",
+		"PFTM_UDP_MULTIPLE",
+		"PFTM_ICMP_FIRST_PACKET",
+		"PFTM_ICMP_ERROR_REPLY",
+		"PFTM_OTHER_FIRST_PACKET",
+		"PFTM_OTHER_SINGLE",
+		"PFTM_OTHER_MULTIPLE",
+		"PFTM_FRAG",
+		"PFTM_INTERVAL",
+		"PFTM_ADAPTIVE_START",
+		"PFTM_ADAPTIVE_END",
+		"PFTM_SRC_NODE",
+		"PFTM_TS_DIFF",
+		"PFTM_MAX",
+		"PFTM_PURGE",
+		"PFTM_UNLINKED"
+	};
+
+	if (timeout > PFTM_UNLINKED)
+		return ("???");
+	else
+		return (timeout_name[timeout]);
+}
+
 void
 pfsync_q_del(struct pf_state *st)
 {
@@ -2273,6 +2341,19 @@ pfsync_q_del(struct pf_state *st)
 		mtx_leave(&sc->sc_st_mtx);
 		return;
 	}
+
+	if (TAILQ_EMPTY(&sc->sc_qs[q])) {
+		mtx_leave(&sc->sc_st_mtx);
+		DPFPRINTF(LOG_ERR,
+		    "%s: stale queue (%s) in state (id %llu)"
+		    "nosync: %s unlinked: %s timeout: %s", __func__,
+		    pfsync_qname(q), st->id,
+		    (st->state_flags & PFSTATE_NOSYNC) ? "yes" : "no",
+		    (st->timeout == PFTM_UNLINKED) ? "yes" : "no",
+		    pfsync_timeout_name(st->timeout));
+		return;
+	}
+
 	atomic_sub_long(&sc->sc_len, pfsync_qs[q].len);
 	TAILQ_REMOVE(&sc->sc_qs[q], st, sync_list);
 	if (TAILQ_EMPTY(&sc->sc_qs[q]))
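
The idea behind the check can be shown outside the kernel with the
<sys/queue.h> TAILQ macros. What follows is a simplified userland sketch,
not the pfsync code: struct state, struct softc, Q_NONE, Q_COUNT, q_ins()
and q_del() are all hypothetical stand-ins, and locking is left out. Each
state remembers which queue it believes it is on, and the removal refuses
to touch a queue that is already empty.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define Q_NONE	(-1)	/* hypothetical marker: state is on no queue */
#define Q_COUNT	2	/* hypothetical number of queues */

struct state {
	int			sync_state;	/* queue the state claims to be on */
	TAILQ_ENTRY(state)	sync_list;
};

TAILQ_HEAD(state_queue, state);

struct softc {
	struct state_queue	qs[Q_COUNT];
};

void
softc_init(struct softc *sc)
{
	int q;

	for (q = 0; q < Q_COUNT; q++)
		TAILQ_INIT(&sc->qs[q]);
}

void
q_ins(struct softc *sc, struct state *st, int q)
{
	TAILQ_INSERT_TAIL(&sc->qs[q], st, sync_list);
	st->sync_state = q;
}

/*
 * Remove st from the queue it claims to be on.
 * Returns 0 if there was nothing to do or the state was removed,
 * -1 if the claim is stale (bad index, or the queue is empty).
 */
int
q_del(struct softc *sc, struct state *st)
{
	int q = st->sync_state;

	if (q == Q_NONE)
		return (0);
	if (q < 0 || q >= Q_COUNT)
		return (-1);

	/* An empty queue cannot contain st: do not call TAILQ_REMOVE. */
	if (TAILQ_EMPTY(&sc->qs[q]))
		return (-1);

	TAILQ_REMOVE(&sc->qs[q], st, sync_list);
	st->sync_state = Q_NONE;
	return (0);
}
```

Note that in this sketch the guard only catches the case where the queue is
empty. A state whose sync_state names a non-empty queue it is not actually
linked on would still reach TAILQ_REMOVE, so the check is a diagnostic aid
rather than a proof of membership.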



Re: pfsync panic after pf_purge backout

2022-11-27 Thread Hrvoje Popovski
On 27.11.2022. 9:28, Hrvoje Popovski wrote:
> On 27.11.2022. 1:51, Alexandr Nedvedicky wrote:
>> Hello,
>>
>> On Sat, Nov 26, 2022 at 08:33:28PM +0100, Hrvoje Popovski wrote:
>> 
>>> I just need to say that with all the pf, pfsync and pf_purge diffs
>>> from after the hackathon + this diff on tech@
>>> https://www.mail-archive.com/tech@openbsd.org/msg72582.html
>>> my production firewall seems stable, and it wasn't without that diff.
>> This diff is still waiting for an OK. It makes pfsync use the
>> state mutex to safely dereference keys.
>>
>>> I'm not sure if we have same diffs but even Josmar Pierri on bugs@
>>> https://www.mail-archive.com/bugs@openbsd.org/msg18994.html
>>> who had panics quite regularly with that diff on tech@ seems to have
>>> stable firewall now.
>>>
>>>
>>>
>>> r620-1# uvm_fault(0x82374288, 0x17, 0, 2) -> e
>>> kernel: page fault trap, code=0
>>> Stopped at      pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
>>>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>>> *192892  19920      0     0x14000      0x200    5K softnet
>>> pfsync_q_del(fd82e8a4ce20) at pfsync_q_del+0x96
>>> pf_remove_state(fd82e8a4ce20) at pf_remove_state+0x14b
>>> pfsync_in_del_c(fd8006d843b8,c,79,0) at pfsync_in_del_c+0x6f
>>> pfsync_input(800022d60ad8,800022d60ae4,f0,2) at pfsync_input+0x33c
>>> ip_deliver(800022d60ad8,800022d60ae4,f0,2) at ip_deliver+0x113
>>> ipintr() at ipintr+0x69
>>> if_netisr(0) at if_netisr+0xea
>>> taskq_thread(8003) at taskq_thread+0x100
>>> end trace frame: 0x0, count: 7
>>> https://www.openbsd.org/ddb.html describes the minimum info required in
>>> bug reports.  Insufficient info makes it difficult to find and fix bugs.
>>> ddb{5}>
>>>
>> Those panics are causing me headaches. This most likely got uncovered
>> by the diff which adds a mutex. The mutex makes pfsync stable enough
>> that you can now trigger previously unknown bugs.
> 
> Hi,
> 
> here's panic with WITNESS. Now I will try to trigger panic with that
> mutex diff on tech@


Hi,

here's panic with WITNESS and this diff on tech@
https://www.mail-archive.com/tech@openbsd.org/msg72582.html

I will stop now because I'm not sure what I'm doing and which diffs I'm
testing...


r620-1# uvm_fault(0x8248ea28, 0x17, 0, 2) -> e
kernel: page fault trap, code=0
Stopped at      pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*300703  35643      0     0x14000      0x200    1K systq
 237790  10061  0 0x14000 0x42000  softclock
pfsync_q_del(fd8323dc3900) at pfsync_q_del+0x96
pfsync_delete_state(fd8323dc3900) at pfsync_delete_state+0x118
pf_remove_state(fd8323dc3900) at pf_remove_state+0x14e
pf_purge_expired_states(c3501) at pf_purge_expired_states+0x1b3
pf_purge(823ae080) at pf_purge+0x28
taskq_thread(822cbe30) at taskq_thread+0x11a
end trace frame: 0x0, count: 9
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}>


ddb{1}> show panic
*cpu1: uvm_fault(0x8248ea28, 0x17, 0, 2) -> e
ddb{1}>


ddb{1}> show reg
rdi         0x9
rsi         0xf
rbp         0x800022d593c0
rbx         0xfd83347714a8
rdx         0x
rcx         0x10
rax         0xf
r8          0x7fff
r9          0x800022d59570
r10         0xcc7e29c6fd100f64
r11         0x1e575244acf63fd3
r12         0x808c4000
r13         0xfd8318aac200
r14         0xfd8323dc3900
r15         0x808c47e0
rip         0x817d3ec6    pfsync_q_del+0x96
cs          0x8
rflags      0x10286    __ALIGN_SIZE+0xf286
rsp         0x800022d59390
ss          0x10
pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
ddb{1}>



ddb{1}>  show locks
exclusive rwlock pf_state_lock r = 0 (0x822b03a0)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x17f
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pf_lock r = 0 (0x822b0370)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x173
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pfstates r = 0 (0x822c57d0)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x167
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock netlock r = 0 (0x822b2590)
#0  witness_lock+0x311
#1  rw_enter+0x292
#2  pf_purge_expired_states+0x15b
#3  pf_purge+0x28
#4  taskq_thread+0x11a
#5  proc_trampoline+0x1c
exclusive kernel_lock &kernel_lock r = 1 (0x824be1f8)
#0  witness_lock+0x311
#1  __mp_acquire_count+0x38
#2  mi_switch+0x28b
#3  sleep_finish+0xfe
#4  rw_enter+0x232
#5  pf_purge_expired_states+0x15b
#6  pf_purge+0x28
#

Re: pfsync panic after pf_purge backout

2022-11-27 Thread Hrvoje Popovski
On 27.11.2022. 1:51, Alexandr Nedvedicky wrote:
> Hello,
> 
> On Sat, Nov 26, 2022 at 08:33:28PM +0100, Hrvoje Popovski wrote:
> 
>> I just need to say that with all the pf, pfsync and pf_purge diffs
>> from after the hackathon + this diff on tech@
>> https://www.mail-archive.com/tech@openbsd.org/msg72582.html
>> my production firewall seems stable, and it wasn't without that diff.
> 
> This diff is still waiting for an OK. It makes pfsync use the
> state mutex to safely dereference keys.
> 
>>
>> I'm not sure if we have same diffs but even Josmar Pierri on bugs@
>> https://www.mail-archive.com/bugs@openbsd.org/msg18994.html
>> who had panics quite regularly with that diff on tech@ seems to have
>> stable firewall now.
>>
>>
>>
>> r620-1# uvm_fault(0x82374288, 0x17, 0, 2) -> e
>> kernel: page fault trap, code=0
>> Stopped at      pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
>>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>> *192892  19920      0     0x14000      0x200    5K softnet
>> pfsync_q_del(fd82e8a4ce20) at pfsync_q_del+0x96
>> pf_remove_state(fd82e8a4ce20) at pf_remove_state+0x14b
>> pfsync_in_del_c(fd8006d843b8,c,79,0) at pfsync_in_del_c+0x6f
>> pfsync_input(800022d60ad8,800022d60ae4,f0,2) at pfsync_input+0x33c
>> ip_deliver(800022d60ad8,800022d60ae4,f0,2) at ip_deliver+0x113
>> ipintr() at ipintr+0x69
>> if_netisr(0) at if_netisr+0xea
>> taskq_thread(8003) at taskq_thread+0x100
>> end trace frame: 0x0, count: 7
>> https://www.openbsd.org/ddb.html describes the minimum info required in
>> bug reports.  Insufficient info makes it difficult to find and fix bugs.
>> ddb{5}>
>>
> 
> Those panics are causing me headaches. This most likely got uncovered
> by the diff which adds a mutex. The mutex makes pfsync stable enough
> that you can now trigger previously unknown bugs.


Hi,

here's panic with WITNESS. Now I will try to trigger panic with that
mutex diff on tech@

r620-1# uvm_fault(0x824be2f8, 0x17, 0, 2) -> e
kernel: page fault trap, code=0
Stopped at      pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
 267891  77512     83    0x100012          0    2  ntpd
*250028  94877      0     0x14000      0x200    3K systq
  35323  58401  0 0x14000 0x42000  softclock
pfsync_q_del(fd82c6826da0) at pfsync_q_del+0x96
pfsync_delete_state(fd82c6826da0) at pfsync_delete_state+0x118
pf_remove_state(fd82c6826da0) at pf_remove_state+0x14b
pf_purge_expired_states(b3368) at pf_purge_expired_states+0x1b3
pf_purge(823d8c80) at pf_purge+0x28
taskq_thread(822c73e8) at taskq_thread+0x11a
end trace frame: 0x0, count: 9
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.


ddb{3}> show panic
*cpu3: uvm_fault(0x824be2f8, 0x17, 0, 2) -> e
ddb{3}>


ddb{3}> show reg
rdi         0x9
rsi         0xf
rbp         0x800022d59b80
rbx         0xfd842e15b7d8
rdx         0x
rcx         0x10
rax         0xf
r8          0x7fff
r9          0x800022d59d20
r10         0x362267f9c796ed3a
r11         0xcadda0efd2fc372f
r12         0x808c3000
r13         0xfd8310c597b0
r14         0xfd82c6826da0
r15         0x808c37e0
rip         0x81398f56    pfsync_q_del+0x96
cs          0x8
rflags      0x10286    __ALIGN_SIZE+0xf286
rsp         0x800022d59b50
ss          0x10
pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)


ddb{3}> show locks
exclusive rwlock pf_state_lock r = 0 (0x822e2d58)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x17f
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pf_lock r = 0 (0x822e2d28)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x173
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pfstates r = 0 (0x822bf040)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x167
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock netlock r = 0 (0x822bb6a0)
#0  witness_lock+0x311
#1  rw_enter+0x292
#2  pf_purge_expired_states+0x15b
#3  pf_purge+0x28
#4  taskq_thread+0x11a
#5  proc_trampoline+0x1c
exclusive kernel_lock &kernel_lock r = 1 (0x824bd7e8)
#0  witness_lock+0x311
#1  __mp_acquire_count+0x38
#2  mi_switch+0x28b
#3  sleep_finish+0xfe
#4  rw_enter+0x232
#5  pf_purge_expired_states+0x15b
#6  pf_purge+0x28
#7  taskq_thread+0x11a
#8  proc_trampoline+0x1c
shared rwlock systq r = 0 (0x822c7458)
#0  witness_lock+0x311
#1  taskq_thread+0x10d
#2  proc_trampoline+0x1c
exclusive mutex &sc->sc_st_mtx r = 0 (0x808c37f0)
#0  witness_lock+0x311

Re: pfsync panic after pf_purge backout

2022-11-26 Thread Alexandr Nedvedicky
Hello,

On Sat, Nov 26, 2022 at 08:33:28PM +0100, Hrvoje Popovski wrote:

> I just need to say that with all the pf, pfsync and pf_purge diffs
> from after the hackathon + this diff on tech@
> https://www.mail-archive.com/tech@openbsd.org/msg72582.html
> my production firewall seems stable, and it wasn't without that diff.

This diff is still waiting for an OK. It makes pfsync use the
state mutex to safely dereference keys.

> 
> I'm not sure if we have same diffs but even Josmar Pierri on bugs@
> https://www.mail-archive.com/bugs@openbsd.org/msg18994.html
> who had panics quite regularly with that diff on tech@ seems to have
> stable firewall now.
> 
> 
> 
> r620-1# uvm_fault(0x82374288, 0x17, 0, 2) -> e
> kernel: page fault trap, code=0
> Stopped at      pfsync_q_del+0x96:      movq    %rdx,0x8(%rax)
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *192892  19920      0     0x14000      0x200    5K softnet
> pfsync_q_del(fd82e8a4ce20) at pfsync_q_del+0x96
> pf_remove_state(fd82e8a4ce20) at pf_remove_state+0x14b
> pfsync_in_del_c(fd8006d843b8,c,79,0) at pfsync_in_del_c+0x6f
> pfsync_input(800022d60ad8,800022d60ae4,f0,2) at pfsync_input+0x33c
> ip_deliver(800022d60ad8,800022d60ae4,f0,2) at ip_deliver+0x113
> ipintr() at ipintr+0x69
> if_netisr(0) at if_netisr+0xea
> taskq_thread(8003) at taskq_thread+0x100
> end trace frame: 0x0, count: 7
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{5}>
> 

Those panics are causing me headaches. This most likely got uncovered
by the diff which adds a mutex. The mutex makes pfsync stable enough
that you can now trigger previously unknown bugs.

thanks and
regards
sashan