On Tue, Aug 09, 2022 at 09:26:58PM +0200, Hrvoje Popovski wrote:
> Hi all,
>
> when sending lot of traffic over firewall with pflow and if I run
> ifconfig pflow0 destroy I'm getting kernel: protection fault trap.
>
>
> This is latest snapshot:
> OpenBSD 7.2-beta (GENERIC.MP) #677: Mon Aug 8 18:58:49 MDT 2022
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
>
> r620-1# ifconfig pflow0 destroy
> kernel: protection fault trap, code=0
> Stopped at in_nam2sin+0x29: cmpb $0x2,0x1(%rdx)
>
Hi,
The kernel lock within pflow_output_process() doesn't help because the
following sosend() has sleep points. So, at least pflow_clone_destroy()
should wait until pflow_output_process() finished. We should use
taskq_del_barrier(9) instead of task_del(9).
Also we need to unlink dying pflow(4) interface from the stack before
start destruction.
This diff should help. Please keep in mind, this diff is incomplete,
because it doesn't fix the race between pflowioctl() and
pflow_output_process(). This race is much more complicated, because we
need to introduce the new lock to protect `so' and take it before call
sosend(), but the sosend() takes netlock, which is taken before
pflowioctl() where we modify `so'. This introduces re-locking games to
pflowioctl() path, I so want to make this with separate diff, because
this potential panic was not triggered.
Index: sys/net/if_pflow.c
===================================================================
RCS file: /cvs/src/sys/net/if_pflow.c,v
retrieving revision 1.94
diff -u -p -r1.94 if_pflow.c
--- sys/net/if_pflow.c 6 Jun 2022 14:45:41 -0000 1.94
+++ sys/net/if_pflow.c 9 Aug 2022 21:29:04 -0000
@@ -274,6 +274,10 @@ pflow_clone_destroy(struct ifnet *ifp)
error = 0;
+ NET_LOCK();
+ SLIST_REMOVE(&pflowif_list, sc, pflow_softc, sc_next);
+ NET_UNLOCK();
+
if (timeout_initialized(&sc->sc_tmo))
timeout_del(&sc->sc_tmo);
if (timeout_initialized(&sc->sc_tmo6))
@@ -281,7 +285,7 @@ pflow_clone_destroy(struct ifnet *ifp)
if (timeout_initialized(&sc->sc_tmo_tmpl))
timeout_del(&sc->sc_tmo_tmpl);
pflow_flush(sc);
- task_del(net_tq(ifp->if_index), &sc->sc_outputtask);
+ taskq_del_barrier(net_tq(ifp->if_index), &sc->sc_outputtask);
mq_purge(&sc->sc_outputqueue);
m_freem(sc->send_nam);
if (sc->so != NULL) {
@@ -293,9 +297,6 @@ pflow_clone_destroy(struct ifnet *ifp)
if (sc->sc_flowsrc != NULL)
free(sc->sc_flowsrc, M_DEVBUF, sc->sc_flowsrc->sa_len);
if_detach(ifp);
- NET_LOCK();
- SLIST_REMOVE(&pflowif_list, sc, pflow_softc, sc_next);
- NET_UNLOCK();
free(sc, M_DEVBUF, sizeof(*sc));
return (error);
}