On Mon, Aug 15, 2016 at 08:42:06AM +0200, Martin Pieuchot wrote:
> On 08/08/16(Mon) 11:48, Martin Pieuchot wrote:
> > The rtable_walk() & prio bug I just sent a fix for should theoretically
> > not cause any trouble.  Sadly it is piled on top of another bug for
> > which a fix is attached.
> > 
> > When an interface is removed the current code starts by purging all its
> > corresponding route entries.  This is wrong because the per-AF code has
> > some knowledge of which automagic route should be removed first.
> > 
> > In other words, the rtable_walk() hang should never have been triggered
> > because the IPv4-specific code should take care of removing the
> > RTF_BROADCAST entry.  I believe that this ordering problem is the reason
> > why error code are ignored in AF-specific code paths.
> > 
> > Diff attached fixes that, ok?
> 
> Anyone?
> 

IMO this is a bit of a catch-22 here. Until recently we did store less
interface specific stuff in the routing table and so it made sense to
clean up the routes first (the assumption was that this will first remove
all gateway routes and probably also the arp/ndp caches). This makes
removing of the interface specific stuff simpler because the upper layers
are already clean. Now we turn this upside down but then we may get burned
because gateway routes are cleaned too late and put the routing table into
an inconsitent state. I know you removed a fair amount of references that
would cause crashes because of bad pointers. Need to update my mental
model of the rtable / ifa / ifp interaction to see if this is indeed not
possible. At the same time I wonder if we need multiple walks to make this
save and easier to understand... (replacing dragons with other smaller
dragons is not that helpful because they grow over time).

I try to find time tonight to look into this.

> > Index: net/if.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/if.c,v
> > retrieving revision 1.436
> > diff -u -p -r1.436 if.c
> > --- net/if.c        13 Jul 2016 16:45:19 -0000      1.436
> > +++ net/if.c        22 Jul 2016 12:45:28 -0000
> > @@ -931,7 +931,6 @@ if_detach(struct ifnet *ifp)
> >  #if NBPFILTER > 0
> >     bpfdetach(ifp);
> >  #endif
> > -   rt_if_remove(ifp);
> >     rti_delete(ifp);
> >  #if NETHER > 0 && defined(NFSCLIENT)
> >     if (ifp->if_index == revarp_ifidx)
> > @@ -944,6 +943,7 @@ if_detach(struct ifnet *ifp)
> >  #ifdef INET6
> >     in6_ifdetach(ifp);
> >  #endif
> > +   rt_if_remove(ifp);
> >  #if NPF > 0
> >     pfi_detach_ifnet(ifp);
> >  #endif
> > @@ -1931,15 +1931,15 @@ ifioctl(struct socket *so, u_long cmd, c
> >                      */
> >                     if (up)
> >                             if_down(ifp);
> > -                   rt_if_remove(ifp);
> >                     rti_delete(ifp);
> >  #ifdef MROUTING
> >                     vif_delete(ifp);
> >  #endif
> > +                   in_ifdetach(ifp);
> >  #ifdef INET6
> >                     in6_ifdetach(ifp);
> >  #endif
> > -                   in_ifdetach(ifp);
> > +                   rt_if_remove(ifp);
> >                     splx(s);
> >             }
> >  
> 

-- 
:wq Claudio

Reply via email to