On Sun, Jul 10, 2022 at 07:34:29PM +0300, Vitaliy Makkoveev wrote:
> While testing my private pipex(4) diff Hrvoje Popovski caught
> (*if_qstart)() related panic:
>
> r420-1# rcctl restart npppd
>
> npppdkernel: protection fault trap, code=0
>
> Stopped at ifq_dequeue+0x55: movq %rdx,0x20(%rax)
>
>
>
> ddb{3}> trace
>
> ifq_dequeue(ffff80002281ef28) at ifq_dequeue+0x55
>
> pppx_if_qstart(ffff80002281ef28) at pppx_if_qstart+0x6f
>
> ifq_serialize(ffff80002281ef28,ffff80002281f008) at ifq_serialize+0xfd
>
> taskq_thread(ffff800000030080) at taskq_thread+0x100
>
> end trace frame: 0x0, count: -4
>
> The diff, which Hrvoje testing, allows us to have multiple concurrent
> pppx_if_qstart() handlers absolutely without any netlock or kernel lock
> serialization, so the if_detach() call could be occurred while we have
> at least one start routine (*if_qstart)() running on other CPU. In this
> case the if_detach() thread should wait until all running (*if_qstart)()
> threads finished, but it doesn't.
>
> We have the ifq_barrier() routine for that, so use it just after we
> unlinked dying interface from the stack. It's not accessible by
> if_get(9) or if_unit(9). It has no concurrent ioctl(2) threads. It is
> also detached from the pseudo devices like bridge(4). But it could have
> concurrent (*if_qstart)() handlers running. So wait them here, and then
> continue destruction.
>
> It's not related to this diff, but we also could do lockless `if_ioctl'
> assignment to if_detached_ioctl().
>
> ok?
OK bluhm@
> Index: sys/net/if.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.657
> diff -u -p -r1.657 if.c
> --- sys/net/if.c 29 Jun 2022 09:08:07 -0000 1.657
> +++ sys/net/if.c 10 Jul 2022 12:55:56 -0000
> @@ -1029,6 +1029,10 @@ if_detach(struct ifnet *ifp)
> /* Other CPUs must not have a reference before we start destroying. */
> if_remove(ifp);
>
> + ifp->if_qstart = if_detached_qstart;
> +
> + /* Wait until the start routines finished. */
> + ifq_barrier(&ifp->if_snd);
> ifq_clr_oactive(&ifp->if_snd);
>
> #if NBPFILTER > 0
> @@ -1037,7 +1041,6 @@ if_detach(struct ifnet *ifp)
>
> NET_LOCK();
> s = splnet();
> - ifp->if_qstart = if_detached_qstart;
> ifp->if_ioctl = if_detached_ioctl;
> ifp->if_watchdog = NULL;
>