On Sat, Apr 11, 2020 at 01:43:20PM +0000, Visa Hankala wrote:
> On Sat, Apr 11, 2020 at 11:09:54PM +1000, David Gwynne wrote:
> > On Sat, Apr 11, 2020 at 03:21:49AM +0000, Visa Hankala wrote:
> > > On Fri, Apr 10, 2020 at 01:30:47PM -0600, Theo de Raadt wrote:
> > > > Why did it take almost a year to find this?
> > > >
> > > > Or is this bug due to ioctl(2) becoming UNLOCKED on 2020/02/22?
> > >
> > > This is not related to ioctl(2) becoming UNLOCKED. Lower-layer ioctl
> > > code, soo_ioctl() included, lock the kernel when needed. However, most
> > > .if_ioctl backends need NET_LOCK() in addition to KERNEL_LOCK(). In
> > > most cases, that is satisfied by ifioctl() which acquires the lock
> > > before invoking .if_ioctl(). bridge_ioctl() nullifies this by
> > > releasing NET_LOCK().
> >
> > yes.
> >
> > i came up with the following diff before i read the thread here. it's
> > largely identical to what you (visa) already came up with, but it adds
> > some extra checks to ifpromisc based on the doco in around struct ifnet
> > members in src/sys/net/if_var.h. i audited the rest of the ifpromisc
> > calls and found another one in if_aggr that i was able to trigger.
> >
> > i think the only other call to ifpromisc outside src/sys/net is in carp,
> > and i managed to convinced myself that all those calls hold NET_LOCK
> > already.
> >
> > Index: if.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/if.c,v
> > retrieving revision 1.601
> > diff -u -p -r1.601 if.c
> > --- if.c 10 Mar 2020 09:11:55 -0000 1.601
> > +++ if.c 11 Apr 2020 13:08:46 -0000
> > @@ -3031,7 +3031,9 @@ ifpromisc(struct ifnet *ifp, int pswitch
> > unsigned short oif_flags;
> > int oif_pcount, error;
> >
> > + NET_ASSERT_LOCKED(); /* modifying if_flags */
> > oif_flags = ifp->if_flags;
> > + KERNEL_ASSERT_LOCKED(); /* modifying if_pcount */
> > oif_pcount = ifp->if_pcount;
> > if (pswitch) {
> > if (ifp->if_pcount++ != 0)
> > Index: if_aggr.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/if_aggr.c,v
> > retrieving revision 1.28
> > diff -u -p -r1.28 if_aggr.c
> > --- if_aggr.c 11 Mar 2020 07:01:42 -0000 1.28
> > +++ if_aggr.c 11 Apr 2020 13:08:46 -0000
> > @@ -589,8 +589,10 @@ aggr_clone_destroy(struct ifnet *ifp)
> > if_detach(ifp);
> >
> > /* last ref, no need to lock. aggr_p_dtor locks anyway */
> > + NET_LOCK();
> > while ((p = TAILQ_FIRST(&sc->sc_ports)) != NULL)
> > aggr_p_dtor(sc, p, "destroy");
> > + NET_UNLOCK();
> >
> > free(sc, M_DEVBUF, sizeof(*sc));
> >
> > Index: if_bridge.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/if_bridge.c,v
> > retrieving revision 1.338
> > diff -u -p -r1.338 if_bridge.c
> > --- if_bridge.c 6 Nov 2019 03:51:26 -0000 1.338
> > +++ if_bridge.c 11 Apr 2020 13:08:46 -0000
> > @@ -313,7 +313,9 @@ bridge_ioctl(struct ifnet *ifp, u_long c
> > break;
> > }
> >
> > + NET_LOCK();
> > error = ifpromisc(ifs, 1);
> > + NET_UNLOCK();
> > if (error != 0) {
> > free(bif, M_DEVBUF, sizeof(*bif));
> > break;
> > @@ -558,7 +560,9 @@ bridge_ifremove(struct bridge_iflist *bi
> > }
> >
> > bif->ifp->if_bridgeidx = 0;
> > + NET_LOCK();
> > error = ifpromisc(bif->ifp, 0);
> > + NET_UNLOCK();
> >
> > bridge_rtdelete(sc, bif->ifp, 0);
> > bridge_flushrule(bif);
> > Index: if_tpmr.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/if_tpmr.c,v
> > retrieving revision 1.9
> > diff -u -p -r1.9 if_tpmr.c
> > --- if_tpmr.c 11 Apr 2020 11:01:03 -0000 1.9
> > +++ if_tpmr.c 11 Apr 2020 13:08:46 -0000
> > @@ -201,12 +201,14 @@ tpmr_clone_destroy(struct ifnet *ifp)
> >
> > if_detach(ifp);
> >
> > + NET_LOCK();
> > for (i = 0; i < nitems(sc->sc_ports); i++) {
> > struct tpmr_port *p = SMR_PTR_GET_LOCKED(&sc->sc_ports[i]);
> > if (p == NULL)
> > continue;
> > tpmr_p_dtor(sc, p, "destroy");
> > }
> > + NET_UNLOCK();
>
> This makes tpmr_p_dtor() call smr_barrier() with NET_LOCK() held.
tpmr_p_dtor is already called with NET_LOCK held, this just does it
consistently now.
> There is a risk of a deadlock if some SMR callback wants to take that
> lock. Currently all the callbacks look NET_LOCK()-free, though.
>
> A similar point applies to the smr_barrier() in aggr(4).
Similarly, the smr_barrier in aggr(4) was usually called with NET_LOCK
held, and is now consistently called with it.
> Could tpmr_p_dtor() and aggr_map() release the lock momentarily when
> calling the barrier?
That looks straightforward to do in tpmr_p_dtor(). aggr_map() is going
to take a lot more thought before I can say. aggr_map() is called
all over the place, and I'll have to fit a lot more code back into my
head to figure out what's going on.
However, you say that currently there are no SMR callbacks that take the
NET_LOCK. Can we take advantage of that and put the ifpromisc diffs in,
and work on the smr_barrier bits separately?
dlg