Re: Purge route entries when an address is removed
On 2015/09/13 11:15, Martin Pieuchot wrote: > Currently we leave RTF_STATIC route entries in the table when the > address they are attached to is removed from a system. > > That's why ifas need to be refcounted and that's why we have *a lot* > of checks in the stack to not use cached routes attached to such ifa. > > I'd like to simplify all of this by simply purging all the routes > attached to an ifa being removed. This behavior is coherent with > the fact that routes *need* an ifa to be inserted in the table. > > This makes the kernel simpler as it no longer try to find a new ifa > when a route with a stale address is being used. This does bad things with pppoe(4) default routes, the usual way to configure this is with a wildcard 0.0.0.0 in hostname.pppoe0 and with default pointing with -ifp pppoe0. I'm not 100% sure about this but I guess that when IPCP negotiates an address and removes the temporary 0.0.0.0 wildcard address to configure it on the interface, the default -ifp route is also killed. If you want to play with this yourself and don't have pppoe available, you can build a pppoe test rig using npppd. On the client side: cat >> /etc/hostname.pppoe0 << EOF inet 0.0.0.0 255.255.255.255 0.0.0.1 pppoedev em0 \ authproto chap authname test authkey yayaya !route add default -ifp pppoe0 0.0.0.1 EOF On the server side (will nat/route, assuming it has internet access itself): ifconfig pppx0 up sysctl net.pipex.enable=1 sysctl net.inet.ip.forwarding=1 /etc/npppd/npppd.conf authentication LOCAL type local { users-file "/etc/npppd/npppd-users" } tunnel PPPOE protocol pppoe { listen on interface em1 } ipcp IPCP { pool-address 172.16.192.2-172.16.192.254 dns-servers 8.8.8.8 } interface pppx0 address 172.16.192.1 ipcp IPCP bind tunnel from PPPOE authenticated by LOCAL to pppx0 /etc/npppd/npppd-users test:password=yayaya: /etc/pf.conf pass out quick on egress inet received-on pppx nat-to egress:0 ..
Re: Purge route entries when an address is removed
Martin Pieuchot(m...@openbsd.org) on 2015.09.13 16:08:50 +0200: > On 13/09/15(Sun) 15:51, Alexander Bluhm wrote: > > On Sun, Sep 13, 2015 at 11:15:50AM +0200, Martin Pieuchot wrote: > > > This makes the kernel simpler as it no longer try to find a new ifa > > > when a route with a stale address is being used. > > > > This makes the code simpler, which is good. > > > > I am still not convinced that we want to loose the feature that the > > routes jump to another interface address. When we have multiple > > suiteable addresses and one gets deleted, the system can use another > > one. > > This is the price to pay for making the code simpler. I strongly > believe this "feature" is a side effect of history that should not > have been added in the first place. > > However I'd like to fix potential issues with this diff before committing > it, so tests are welcome :) Hi, i found the time to play with your diff. On a router (where you probably wouldn't to this operationaly) you get tons of "route xxx vanished before delete" from bgpd and ospfd, but they continue to work, apparently as intended. I havent found a box with pppoe to test. it would be nice if someone could test that. as far as it goes, ok from me. /Benno > > The patch itself looks correct. Just one question: > > > > > @@ -850,22 +850,10 @@ rtrequest1(int req, struct rt_addrinfo * > > > return (EINVAL); > > > if ((rt->rt_flags & RTF_CLONING) == 0) > > > return (EINVAL); > > > - if (rt->rt_ifa->ifa_ifp) { > > > - info->rti_ifa = rt->rt_ifa; > > > - } else { > > > - /* > > > - * The address of the cloning route is not longer > > > - * configured on an interface, but its descriptor > > > - * is still there because of reference counting. > > > - * > > > - * Try to find a similar active address and use > > > - * it for the cloned route. The cloning route > > > - * will get the new address and interface later. > > > - */ > > > - info->rti_ifa = NULL; > > > - info->rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; > > > - } > > > + if (rt->rt_ifa->ifa_ifp == NULL) > > > + return (EAGAIN); > > > > Why return EAGAIN here? Should it be EINVAL like in the other > > cases? Can this happen at all? > > This should never happen but I'd like to take a safe approach and be > able to differentiate the error code if this still happens. I'd > happily turn this into a KASSERT() but not right now. +1
Re: Purge route entries when an address is removed
On 2015/09/22 23:07, Sebastian Benoit wrote: > Martin Pieuchot(m...@openbsd.org) on 2015.09.13 16:08:50 +0200: > > On 13/09/15(Sun) 15:51, Alexander Bluhm wrote: > > > On Sun, Sep 13, 2015 at 11:15:50AM +0200, Martin Pieuchot wrote: > > > > This makes the kernel simpler as it no longer try to find a new ifa > > > > when a route with a stale address is being used. > > > > > > This makes the code simpler, which is good. > > > > > > I am still not convinced that we want to loose the feature that the > > > routes jump to another interface address. When we have multiple > > > suiteable addresses and one gets deleted, the system can use another > > > one. > > > > This is the price to pay for making the code simpler. I strongly > > believe this "feature" is a side effect of history that should not > > have been added in the first place. > > > > However I'd like to fix potential issues with this diff before committing > > it, so tests are welcome :) > > Hi, > > i found the time to play with your diff. > > On a router (where you probably wouldn't to this operationaly) you get tons > of "route xxx vanished before delete" from bgpd and ospfd, but they continue > to work, apparently as intended. > > I havent found a box with pppoe to test. it would be nice if someone could > test that. Preferably someone with a relatively unstable pppoe. Maybe one of those providers which kills sessions every day and forces an address change..
Re: Purge route entries when an address is removed
Martin Pieuchot [m...@openbsd.org] wrote: > Currently we leave RTF_STATIC route entries in the table when the > address they are attached to is removed from a system. > > That's why ifas need to be refcounted and that's why we have *a lot* > of checks in the stack to not use cached routes attached to such ifa. > > I'd like to simplify all of this by simply purging all the routes > attached to an ifa being removed. This behavior is coherent with > the fact that routes *need* an ifa to be inserted in the table. > > This makes the kernel simpler as it no longer try to find a new ifa > when a route with a stale address is being used. > I'm pretty sure this feature has been broken for a while. Once upon a time in OpenBSD, this used to work: ifconfig em0 192.168.1.2/24 route add default 192.168.1.1 ping 8.8.8.8 reply! ifconfig em0 delete 192.168.1.2 ifconfig em1 192.168.1.2/24 ping 8.8.8.8 reply! For some years, this has been required: ifconfig em0 192.168.1.2/24 route add default 192.168.1.1 ping 8.8.8.8 reply! ifconfig em0 delete 192.168.1.2 ifconfig em1 192.168.1.2/24 **route delete default** **route add default 192.168.1.1** ping 8.8.8.8 reply! It appears, what you're proposing is: ifconfig em0 192.168.1.2/24 route add default 192.168.1.1 ping 8.8.8.8 reply! ifconfig em0 delete 192.168.1.2 ifconfig em1 192.168.1.2/24 **route add default 192.168.1.1** ping 8.8.8.8 reply! As a user (only), I really like the original behavior, it seems to be consistent with the behavior of some other systems, at least in the case of the default router. If there was some way to do this without stupid complexity, it might be worth considering. In any event, what you propose seems much better than the current state of affairs, where you end up with broken routes in the table that need nothing more than to be removed. Chris
Purge route entries when an address is removed
Currently we leave RTF_STATIC route entries in the table when the address they are attached to is removed from a system. That's why ifas need to be refcounted and that's why we have *a lot* of checks in the stack to not use cached routes attached to such ifa. I'd like to simplify all of this by simply purging all the routes attached to an ifa being removed. This behavior is coherent with the fact that routes *need* an ifa to be inserted in the table. This makes the kernel simpler as it no longer try to find a new ifa when a route with a stale address is being used. Tests and oks welcome. Index: net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.376 diff -u -p -r1.376 if.c --- net/if.c12 Sep 2015 20:26:06 - 1.376 +++ net/if.c13 Sep 2015 08:53:56 - @@ -143,6 +143,7 @@ struct if_clone *if_clone_lookup(const c intif_group_egress_build(void); +void if_purgeaddrs(struct ifnet *); void if_link_state_change_task(void *); void if_input_process(void *); @@ -893,7 +894,6 @@ if_deactivate(struct ifnet *ifp) void if_detach(struct ifnet *ifp) { - struct ifaddr *ifa; struct ifg_list *ifg; struct domain *dp; int i, s; @@ -916,7 +916,6 @@ if_detach(struct ifnet *ifp) #if NBPFILTER > 0 bpfdetach(ifp); #endif - rt_if_remove(ifp); rti_delete(ifp); #if NETHER > 0 && defined(NFSCLIENT) if (ifp == revarp_ifp) @@ -944,16 +943,7 @@ if_detach(struct ifnet *ifp) if_free_sadl(ifp); /* We should not have any address left at this point. */ - if (!TAILQ_EMPTY(>if_addrlist)) { -#ifdef DIAGNOSTIC - printf("%s: address list non empty\n", ifp->if_xname); -#endif - while ((ifa = TAILQ_FIRST(>if_addrlist)) != NULL) { - ifa_del(ifp, ifa); - ifa->ifa_ifp = NULL; - ifafree(ifa); - } - } + if_purgeaddrs(ifp); free(ifp->if_addrhooks, M_TEMP, 0); free(ifp->if_linkstatehooks, M_TEMP, 0); @@ -1570,6 +1560,28 @@ if_put(struct ifnet *ifp) atomic_dec_int(>if_refcnt); } + +/* + * Remove the remaining addresses from an interface, this should + * theorically not be necessary. + */ +void +if_purgeaddrs(struct ifnet *ifp) +{ + struct ifaddr *ifa; + + if (!TAILQ_EMPTY(>if_addrlist)) { +#ifdef DIAGNOSTIC + printf("%s: address list non empty\n", ifp->if_xname); +#endif + while ((ifa = TAILQ_FIRST(>if_addrlist)) != NULL) { + ifa_del(ifp, ifa); + ifa->ifa_ifp = NULL; + ifafree(ifa); + } + } +} + /* * Interface ioctls. */ @@ -1912,7 +1924,6 @@ ifioctl(struct socket *so, u_long cmd, c */ if (up) if_down(ifp); - rt_if_remove(ifp); rti_delete(ifp); #ifdef MROUTING vif_delete(ifp); @@ -1921,6 +1932,10 @@ ifioctl(struct socket *so, u_long cmd, c in6_ifdetach(ifp); #endif in_ifdetach(ifp); + + /* We should not have any address left at this point. */ + if_purgeaddrs(ifp); + splx(s); } @@ -2509,6 +2524,7 @@ void ifa_del(struct ifnet *ifp, struct ifaddr *ifa) { TAILQ_REMOVE(>if_addrlist, ifa, ifa_list); + rt_ifa_purge(ifa); } void Index: net/route.c === RCS file: /cvs/src/sys/net/route.c,v retrieving revision 1.239 diff -u -p -r1.239 route.c --- net/route.c 12 Sep 2015 20:50:17 - 1.239 +++ net/route.c 13 Sep 2015 09:00:43 - @@ -152,7 +152,7 @@ voidrt_timer_init(void); intrtable_alloc(void ***, u_int); intrtflushclone1(struct rtentry *, void *, u_int); void rtflushclone(unsigned int, struct rtentry *); -intrt_if_remove_rtdelete(struct rtentry *, void *, u_int); +intrt_ifa_remove_rtdelete(struct rtentry *, void *, unsigned int); struct ifaddr *ifa_ifwithroute(int, struct sockaddr *, struct sockaddr *, u_int); @@ -850,22 +850,10 @@ rtrequest1(int req, struct rt_addrinfo * return (EINVAL); if ((rt->rt_flags & RTF_CLONING) == 0) return (EINVAL); - if (rt->rt_ifa->ifa_ifp) { - info->rti_ifa = rt->rt_ifa; - } else { - /* -* The address of the cloning route is not longer -* configured on an interface, but its descriptor -* is still there because of reference counting. -* -* Try to find a similar active
Re: Purge route entries when an address is removed
On 13/09/15(Sun) 15:51, Alexander Bluhm wrote: > On Sun, Sep 13, 2015 at 11:15:50AM +0200, Martin Pieuchot wrote: > > This makes the kernel simpler as it no longer try to find a new ifa > > when a route with a stale address is being used. > > This makes the code simpler, which is good. > > I am still not convinced that we want to loose the feature that the > routes jump to another interface address. When we have multiple > suiteable addresses and one gets deleted, the system can use another > one. This is the price to pay for making the code simpler. I strongly believe this "feature" is a side effect of history that should not have been added in the first place. However I'd like to fix potential issues with this diff before committing it, so tests are welcome :) > The patch itself looks correct. Just one question: > > > @@ -850,22 +850,10 @@ rtrequest1(int req, struct rt_addrinfo * > > return (EINVAL); > > if ((rt->rt_flags & RTF_CLONING) == 0) > > return (EINVAL); > > - if (rt->rt_ifa->ifa_ifp) { > > - info->rti_ifa = rt->rt_ifa; > > - } else { > > - /* > > -* The address of the cloning route is not longer > > -* configured on an interface, but its descriptor > > -* is still there because of reference counting. > > -* > > -* Try to find a similar active address and use > > -* it for the cloned route. The cloning route > > -* will get the new address and interface later. > > -*/ > > - info->rti_ifa = NULL; > > - info->rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; > > - } > > + if (rt->rt_ifa->ifa_ifp == NULL) > > + return (EAGAIN); > > Why return EAGAIN here? Should it be EINVAL like in the other > cases? Can this happen at all? This should never happen but I'd like to take a safe approach and be able to differentiate the error code if this still happens. I'd happily turn this into a KASSERT() but not right now.
Re: Purge route entries when an address is removed
On Sun, Sep 13, 2015 at 11:15:50AM +0200, Martin Pieuchot wrote: > This makes the kernel simpler as it no longer try to find a new ifa > when a route with a stale address is being used. This makes the code simpler, which is good. I am still not convinced that we want to loose the feature that the routes jump to another interface address. When we have multiple suiteable addresses and one gets deleted, the system can use another one. > Tests and oks welcome. The patch itself looks correct. Just one question: > @@ -850,22 +850,10 @@ rtrequest1(int req, struct rt_addrinfo * > return (EINVAL); > if ((rt->rt_flags & RTF_CLONING) == 0) > return (EINVAL); > - if (rt->rt_ifa->ifa_ifp) { > - info->rti_ifa = rt->rt_ifa; > - } else { > - /* > - * The address of the cloning route is not longer > - * configured on an interface, but its descriptor > - * is still there because of reference counting. > - * > - * Try to find a similar active address and use > - * it for the cloned route. The cloning route > - * will get the new address and interface later. > - */ > - info->rti_ifa = NULL; > - info->rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; > - } > + if (rt->rt_ifa->ifa_ifp == NULL) > + return (EAGAIN); Why return EAGAIN here? Should it be EINVAL like in the other cases? Can this happen at all? bluhm