While testing my private pipex(4) diff Hrvoje Popovski caught
(*if_qstart)() related panic:

r420-1# rcctl restart  npppd                                                    
npppdkernel: protection fault trap, code=0                                      
Stopped at      ifq_dequeue+0x55:       movq    %rdx,0x20(%rax)                 
                                                                                
ddb{3}> trace                                                                   
ifq_dequeue(ffff80002281ef28) at ifq_dequeue+0x55                               
pppx_if_qstart(ffff80002281ef28) at pppx_if_qstart+0x6f                         
ifq_serialize(ffff80002281ef28,ffff80002281f008) at ifq_serialize+0xfd          
taskq_thread(ffff800000030080) at taskq_thread+0x100                            
end trace frame: 0x0, count: -4

The diff, which Hrvoje testing, allows us to have multiple concurrent
pppx_if_qstart() handlers absolutely without any netlock or kernel lock
serialization, so the if_detach() call could be occurred while we have
at least one start routine (*if_qstart)() running on other CPU. In this
case the if_detach() thread should wait until all running (*if_qstart)()
threads finished, but it doesn't.

We have the ifq_barrier() routine for that, so use it just after we
unlinked dying interface from the stack. It's not accessible by
if_get(9) or if_unit(9). It has no concurrent ioctl(2) threads. It is
also detached from the pseudo devices like bridge(4). But it could have
concurrent (*if_qstart)() handlers running. So wait them here, and then
continue destruction.

It's not related to this diff, but we also could do lockless `if_ioctl'
assignment to if_detached_ioctl().

ok?

Index: sys/net/if.c
===================================================================
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.657
diff -u -p -r1.657 if.c
--- sys/net/if.c        29 Jun 2022 09:08:07 -0000      1.657
+++ sys/net/if.c        10 Jul 2022 12:55:56 -0000
@@ -1029,6 +1029,10 @@ if_detach(struct ifnet *ifp)
        /* Other CPUs must not have a reference before we start destroying. */
        if_remove(ifp);
 
+       ifp->if_qstart = if_detached_qstart;
+
+       /* Wait until the start routines finished. */
+       ifq_barrier(&ifp->if_snd);
        ifq_clr_oactive(&ifp->if_snd);
 
 #if NBPFILTER > 0
@@ -1037,7 +1041,6 @@ if_detach(struct ifnet *ifp)
 
        NET_LOCK();
        s = splnet();
-       ifp->if_qstart = if_detached_qstart;
        ifp->if_ioctl = if_detached_ioctl;
        ifp->if_watchdog = NULL;
 

Reply via email to