On 10/06/19(Mon) 09:29, Alexandr Nedvedicky wrote:
> Hello,
>
> sorry for extra delay (was off-line over the weekend).
>
> On Sat, Jun 08, 2019 at 09:46:24PM +1000, Jonathan Matthew wrote:
> > On Tue, Jun 04, 2019 at 01:50:51AM +0200, Alexandr Nedvedicky wrote:
> > > Hello,
> > >
> > > I've managed to get pf_test() running in parallel on forwarding path in my
> > > experimental tree. And there was some fall out. PF died on ASSERT() in
> > > pf_state_key_link_reverse() at line 7371:
> > >
> > > 7368 pf_state_key_link_reverse(struct pf_state_key *sk, ...)
> > > 7369 {
> > > 7370 /* Note that sk and skrev may be equal, then we ... */
> > > 7371 KASSERT(sk->reverse == NULL);
> > > 7372 KASSERT(skrev->reverse == NULL);
> > > 7373 sk->reverse = pf_state_key_ref(skrev);
> > > 7374 skrev->reverse = pf_state_key_ref(sk);
> > > 7375 }
> > >
> > >
> > > pf_state_key_link_reverse() function is being called from pf_find_state()
> > > here:
> > >
> > > 1074 if (sk == NULL) {
> > > 1075 if ((sk = RB_FIND(pf_state_tree, &pf_statetbl,
> > > 1076 (struct pf_state_key *)key)) == NULL)
> > > 1077 return (PF_DROP);
> > > 1078 if (pd->dir == PF_OUT && pkt_sk &&
> > > 1079 pf_compare_state_keys(pkt_sk, sk, ...) == 0)
> > > 1080 pf_state_key_link_reverse(sk, pkt_sk);
> > > 1081         else if (pd->dir == PF_OUT && pd->m->m_pkthdr.pf.inp &&
> > > 1082             !pd->m->m_pkthdr.pf.inp->inp_pf_sk && !sk->inp)
> > > 1083             pf_state_key_link_inpcb(sk, pd->m->m_pkt...);
> > > 1084 }
> > >
> > > pf_find_state() performs a lookup in the PF state table and as such
> > > runs under the PF_STATE_ENTER_READ() lock. So there may actually be
> > > multiple threads running pf_state_key_link_reverse() concurrently.
> > > The thread that loses the race trips the assert at line 7371. The
> > > patch below uses atomic ops to handle the race properly.
> >
> > Does this mean you're ending up with packets from a single flow being
> > processed on multiple threads at the same time?
>
> Yes, that's correct. The patch above comes from my private branch [1].
> mpi@ pointed out in an off-line email exchange that the patch unlocks
> local inbound packets too, which comes a bit early. However, for the
> forwarding case things seem to work (given all the panics I could see
> and fix).
The way I understand Jono's question is: "Do we want this locking?" If
we decide that a single flow will always be processed by a single CPU,
is this locking necessary? If it isn't, should we add an assert to
document that design decision?