On Sun, Jul 10, 2022 at 07:34:29PM +0300, Vitaliy Makkoveev wrote:
> While testing my private pipex(4) diff Hrvoje Popovski caught
> (*if_qstart)() related panic:
> 
> r420-1# rcctl restart  npppd                                                  
>   
> npppdkernel: protection fault trap, code=0                                    
>   
> Stopped at      ifq_dequeue+0x55:       movq    %rdx,0x20(%rax)               
>   
>                                                                               
>   
> ddb{3}> trace                                                                 
>   
> ifq_dequeue(ffff80002281ef28) at ifq_dequeue+0x55                             
>   
> pppx_if_qstart(ffff80002281ef28) at pppx_if_qstart+0x6f                       
>   
> ifq_serialize(ffff80002281ef28,ffff80002281f008) at ifq_serialize+0xfd        
>   
> taskq_thread(ffff800000030080) at taskq_thread+0x100                          
>   
> end trace frame: 0x0, count: -4
> 
> The diff, which Hrvoje testing, allows us to have multiple concurrent
> pppx_if_qstart() handlers absolutely without any netlock or kernel lock
> serialization, so the if_detach() call could be occurred while we have
> at least one start routine (*if_qstart)() running on other CPU. In this
> case the if_detach() thread should wait until all running (*if_qstart)()
> threads finished, but it doesn't.
> 
> We have the ifq_barrier() routine for that, so use it just after we
> unlinked dying interface from the stack. It's not accessible by
> if_get(9) or if_unit(9). It has no concurrent ioctl(2) threads. It is
> also detached from the pseudo devices like bridge(4). But it could have
> concurrent (*if_qstart)() handlers running. So wait them here, and then
> continue destruction.
> 
> It's not related to this diff, but we also could do lockless `if_ioctl'
> assignment to if_detached_ioctl().
> 
> ok?

OK bluhm@

> Index: sys/net/if.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.657
> diff -u -p -r1.657 if.c
> --- sys/net/if.c      29 Jun 2022 09:08:07 -0000      1.657
> +++ sys/net/if.c      10 Jul 2022 12:55:56 -0000
> @@ -1029,6 +1029,10 @@ if_detach(struct ifnet *ifp)
>       /* Other CPUs must not have a reference before we start destroying. */
>       if_remove(ifp);
>  
> +     ifp->if_qstart = if_detached_qstart;
> +
> +     /* Wait until the start routines finished. */
> +     ifq_barrier(&ifp->if_snd);
>       ifq_clr_oactive(&ifp->if_snd);
>  
>  #if NBPFILTER > 0
> @@ -1037,7 +1041,6 @@ if_detach(struct ifnet *ifp)
>  
>       NET_LOCK();
>       s = splnet();
> -     ifp->if_qstart = if_detached_qstart;
>       ifp->if_ioctl = if_detached_ioctl;
>       ifp->if_watchdog = NULL;
>  

Reply via email to