On Fri, Apr 10, 2020 at 01:22:32PM -0600, Theo de Raadt wrote:
> I would like an explanation of the root cause.  What commit broke this?

The root cause for these asserts is that bridge_ioctl() drops NET_LOCK().


net/if_bridge.c revision 1.332
date: 2019/05/12 19:53:22;  author: mpi;  state: Exp;  lines: +92 -71;  
commitid: sO9kfjHuGizxTwI1;
Switch the list of span interfaces and interfaces to SMR.

This removes the KERNEL_LOCK() around the list iteration in bridge_enqueue().

Since the NET_LOCK() isn't protecting any data structure, release it early
in all the code paths coming from the Network Stack to prevent possible
deadlock situations with smr_barrier().

bridge_input() is still KERNEL_LOCK()ed as well as bridge_filterrule().

ok visa@


> Visa Hankala <[email protected]> wrote:
> 
> > On Fri, Apr 10, 2020 at 10:16:47AM -0400, David Hill wrote:
> > > On a April 9, 2020 cvs built machine:
> > > 
> > > An splassert shows when I add or delete vlan0 on bridge0
> > > 
> > > # ifconfig bridge0 add vlan0
> > > splassert: vlan_ioctl: want 2 have 0
> > > Starting stack trace...
> > > vlan_ioctl(ffff800000936800,80206910,ffff80003316f318) at vlan_ioctl+0x65
> > > ifpromisc(ffff800000936800,1) at ifpromisc+0xbb
> > > bridge_ioctl(ffff8000009da000,8060693c,ffff80003316f520) at
> > > bridge_ioctl+0x8c8
> > > ifioctl(fffffd811e7e5e40,8060693c,ffff80003316f520,ffff800032fd49e8) at
> > > ifioctl+0xa03
> > > soo_ioctl(fffffd81015768e8,8060693c,ffff80003316f520,ffff800032fd49e8) at
> > > soo_ioctl+0x171
> > > sys_ioctl(ffff800032fd49e8,ffff80003316f630,ffff80003316f690) at
> > > sys_ioctl+0x2df
> > > syscall(ffff80003316f700) at syscall+0x389
> > > Xsyscall() at Xsyscall+0x128
> > > end of kernel
> > > end trace frame: 0x7f7ffffdcef0, count: 249
> > > End of stack trace.
> > > 
> > > # ifconfig bridge0 del vlan0
> > > splassert: vlan_ioctl: want 2 have 0
> > > Starting stack trace...
> > > vlan_ioctl(ffff800000936800,80206910,ffff80003316ee78) at vlan_ioctl+0x65
> > > ifpromisc(ffff800000936800,0) at ifpromisc+0xbb
> > > bridge_ifremove(ffff8000009ed900) at bridge_ifremove+0xa4
> > > bridge_ioctl(ffff8000009da000,8060693d,ffff80003316f0c0) at
> > > bridge_ioctl+0x91e
> > > ifioctl(fffffd811e7e5e40,8060693d,ffff80003316f0c0,ffff800032fd49e8) at
> > > ifioctl+0xa03
> > > soo_ioctl(fffffd81015768e8,8060693d,ffff80003316f0c0,ffff800032fd49e8) at
> > > soo_ioctl+0x171
> > > sys_ioctl(ffff800032fd49e8,ffff80003316f1d0,ffff80003316f230) at
> > > sys_ioctl+0x2df
> > > syscall(ffff80003316f2a0) at syscall+0x389
> > > Xsyscall() at Xsyscall+0x128
> > > end of kernel
> > > end trace frame: 0x7f7fffff6a20, count: 248
> > 
> > Acquiring NET_LOCK() when calling ifpromisc() seems to help.
> > This is similar to what bpf(4) does.
> > 
> > It looks that besides bridge(4), there are other places in the kernel
> > where ifpromisc() is called without NET_LOCK().
> > 
> > Index: net/if_bridge.c
> > ===================================================================
> > RCS file: src/sys/net/if_bridge.c,v
> > retrieving revision 1.338
> > diff -u -p -r1.338 if_bridge.c
> > --- net/if_bridge.c 6 Nov 2019 03:51:26 -0000       1.338
> > +++ net/if_bridge.c 10 Apr 2020 19:05:05 -0000
> > @@ -313,7 +313,9 @@ bridge_ioctl(struct ifnet *ifp, u_long c
> >                     break;
> >             }
> >  
> > +           NET_LOCK();
> >             error = ifpromisc(ifs, 1);
> > +           NET_UNLOCK();
> >             if (error != 0) {
> >                     free(bif, M_DEVBUF, sizeof(*bif));
> >                     break;
> > @@ -558,7 +560,9 @@ bridge_ifremove(struct bridge_iflist *bi
> >     }
> >  
> >     bif->ifp->if_bridgeidx = 0;
> > +   NET_LOCK();
> >     error = ifpromisc(bif->ifp, 0);
> > +   NET_UNLOCK();
> >  
> >     bridge_rtdelete(sc, bif->ifp, 0);
> >     bridge_flushrule(bif);
> > 

Reply via email to