Hello Olivier and Sebastien,
I took a look at old version of pf_state_key_link_reverse(),
before my commit [1] changed it. The answer was there:
7368 void
7369 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key
*skrev)
7370 {
7371 /* Note that sk and skrev may be equal, then we refcount twice. */
7372 KASSERT(sk != skrev);
7373 KASSERT(sk->reverse == NULL);
7374 KASSERT(skrev->reverse == NULL);
7375 sk->reverse = pf_state_key_ref(skrev);
7376 skrev->reverse = pf_state_key_ref(sk);
7377 }
comment at line 7371 says the skrev and sk may be equal.
Taking a look at current pf_state_key_link_reverse(), which is in the tree
right now, behaves differently to old function when sk and skrev are
identical:
7368 void
7369 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key
*skrev)
7370 {
7371 struct pf_state_key *old_reverse;
7372
7373 old_reverse = atomic_cas_ptr(&sk->reverse, NULL, skrev);
7374 if (old_reverse != NULL)
7375 KASSERT(old_reverse == skrev);
7376 else
7377 pf_state_key_ref(skrev);
7378
7379 old_reverse = atomic_cas_ptr(&skrev->reverse, NULL, sk);
7380 if (old_reverse != NULL)
7381 KASSERT(old_reverse == sk);
7382 else
7383 pf_state_key_ref(sk);
7384 }
note that if both keys are identical, we are going to grab just single
reference. The story goes like that:
we assume sk == skrev
sk->reverse == NULL when we do atomic_cas() at line 7373
the cas sets sk->reverse to skrev, which is equal to sk
the old_reverse is NULL when we reach 7374, therefore we
grab reference to skrev
at line 7379 we do cas_ptr on skrev->reverse.
because sk == skrev, the skrev->reverse is not NULL.
the old_reverse is not NULL when we reach 7380,
old_reverse equals to skrev (and sk)
old_reverse == skrev == sk
in this case we do KASSERT(old_reverse == sk) at line 7381,
the assertion holds so we just return from the function
without getting extra reference. this is wrong.
I'd like to ask you for yet another brave test. just to verify
the story I dream of above really happens. The plan is to put
yet another KASSERT() to pf_state_key_link_reverse():
KASSERT(sk != skrev);
I expect the KASSERT will fire sooner or later. I'm not sure
which packet could trigger such condition (sk == skrev), I
suspect this could be kind of multicast/broadcast packet.
Hence I'd like to ask you to give a try KASSSERT() above.
diff below is against current tree. It adds desired assert to
your pf_state_key_link_reverse().
thank you very much for your help.
regards
sashan
[1] https://marc.info/?l=openbsd-cvs&m=161951631726837&w=2
--------8<---------------8<---------------8<------------------8<--------
diff --git a/sys/net/pf.c b/sys/net/pf.c
index 23eebf4a274..298568dec28 100644
--- a/sys/net/pf.c
+++ b/sys/net/pf.c
@@ -7368,19 +7368,12 @@ pf_inp_unlink(struct inpcb *inp)
void
pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
{
- struct pf_state_key *old_reverse;
-
- old_reverse = atomic_cas_ptr(&sk->reverse, NULL, skrev);
- if (old_reverse != NULL)
- KASSERT(old_reverse == skrev);
- else
- pf_state_key_ref(skrev);
-
- old_reverse = atomic_cas_ptr(&skrev->reverse, NULL, sk);
- if (old_reverse != NULL)
- KASSERT(old_reverse == sk);
- else
- pf_state_key_ref(sk);
+ /* Note that sk and skrev may be equal, then we refcount twice. */
+ KASSERT(sk != skrev);
+ KASSERT(sk->reverse == NULL);
+ KASSERT(skrev->reverse == NULL);
+ sk->reverse = pf_state_key_ref(skrev);
+ skrev->reverse = pf_state_key_ref(sk);
}
#if NPFLOG > 0