On Tue, 2008-06-10 at 22:50 -0400, Peter Memishian wrote:
> I happened upon an interesting and unfortunate interaction today that's
> worthy of some discussion. Specifically, when the IFF_RUNNING flag is
> cleared on an IP interface, dhcpagent purges any routes it added over the
> interface on the grounds that the routes can no longer be used, thus
> allowing any overlapping (but still usable) routes to be used.

By "overlapping", you mean other default routes? If they were more
specific routes, then they were being used instead of the default route
to begin with by definition, and removing the default route would have
no effect for those destinations covered by the more specific route.
Does dhcpagent install multiple default routes?

> However, in the common case with e.g. two interfaces together in an IPMP
> group that has DHCP data addresses, when the group fails, the IPMP IP
> interface's IFF_RUNNING flag will be cleared and thus its routes removed
> by dhcpagent. At that point, if probe-based failure detection is enabled,
> in.mpathd will fallback to multicast targets. For sites configured not to
> answer in.mpathd's multicast probes, this means the interface will *never*
> repair. For sites where the multicast probes will be answered but by
> nodes that are not representative of overall connectivity, this will lead
> to a spurious repair, followed by a subsequent failure when the routes are
> restored but the routers still prove to be unreachable. Neither behavior
> seems acceptable.

I don't see the latter unacceptable behavior as being specific to this
problem, but a problem with multicast probes in general. If fail-back
due to multicast probe confirmation is "unacceptable", then why does
in.mpathd do this at all? The first failure mode in which multicast
probes are blocked is indeed bad.

> Thoughts? Clearly, we could remove the code in the DHCP client that
> removes the routes (at least when IPMP is in use), but it makes some sense
> as-is.

Yes, the DHCP client behavior is acceptable IMO. Since IPv4 has no
neighbor unreachability detection like IPv6, leaving the route there
means that all of your new connection attempts hang for over 3 minutes
while your TCP SYNs go into the bit bucket instead of returning
immediately with "Network is unreachable".

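To make the two points above concrete (a more specific route always wins
over the default, and a missing default route turns a long hang into an
immediate error), here is a minimal sketch in Python. The routing table,
the interface names (ipmp0, net1) and the lookup() helper are all
hypothetical; the sketch only models longest-prefix route selection and
is not how the kernel or dhcpagent actually implements it.

```python
import ipaddress

def lookup(routes, dst):
    """Return the most specific (network, interface) route covering dst,
    or None when no route matches. A None result models the case where
    connect() would fail right away with "Network is unreachable"."""
    addr = ipaddress.ip_address(dst)
    matches = [route for route in routes if addr in route[0]]
    if not matches:
        return None
    # Longest-prefix match: the route with the largest prefix length wins.
    return max(matches, key=lambda route: route[0].prefixlen)

# Hypothetical table: a DHCP-supplied default route plus one specific route.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "ipmp0"),
    (ipaddress.ip_network("10.1.0.0/16"), "net1"),
]

# Same table after the default route has been purged.
without_default = [r for r in routes if r[0].prefixlen != 0]

for table in (routes, without_default):
    for dst in ("10.1.2.3", "192.0.2.1"):
        print(dst, "->", lookup(table, dst))
```

In this model, removing the default route changes nothing for 10.1.2.3,
which was already covered by the more specific route, while 192.0.2.1
goes from "sent out the failed interface" to "no route at all".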
> Further, Jim mentioned that routing daemons also do this, though I
> didn't see anything that did this in ON's in.routed or in SFW's quagga
> source.

Consider a default route learned through a routing protocol. When the
link goes down, we may not delete that route immediately based on the
link state flag, but we will cease to receive updates from the router
and will eventually delete the route after a short time (based on the
routing protocol algorithm). The end result is the same: the default
route goes away when the link goes down. Routing daemons could probably
go a step further and delete routes based on link status alone, but
there's no explicit need to implement that since routing protocols
handle this indirectly.

> Another alternative would be to only remove the routes if there
> is in fact an overlapping route, but that may be non-trivial to implement.

I think the right behavior is to always remove the route. I mentioned
above what I believe to be the main benefit of removing the route (it
makes diagnosis of application connection problems much easier). I also
don't see the relationship with "overlapping routes", so I may be
missing your (or the code's) rationale there.

> We could also document that of explicit "host routes" need to be used with
> probe-based failure detection when when DHCP (or dynamic routing?) is in
> use, but that may not go far enough.

Documentation would certainly be an option. Sowmini had another
suggestion that might be acceptable, which is to leave IFF_RUNNING alone
and potentially use another interface flag to denote complete and total
failure. With that approach, you unfortunately lose the benefit of
having dhcpagent remove the default route to begin with while allowing
the interface to recover.

Another way to do this while preserving the semantics you've defined for
IFF_RUNNING would be to modify dhcpagent to ignore the IFF_RUNNING flag
on IPMP interfaces.

-Seb
