 > > Discussing this issue with Erik this morning, we came up with a proposal
 > > that has similar effect but with a lot less risk: in.mpathd could simply
 > > ignore requests to delete targets from an interface associated with a
 > > failed group, and continue to probe the existing target set (possibly
 > > an expanded set, if new targets are added).  When an interface in the
 > > group repairs, it could then rebuild the target list based on the latest
 > > routing table.  I prototyped this (literally a one line change) and it
 > > "seems" to work.
 > > 
 > > Thoughts?
 > 
 > That sounds like it should work.  Is there potentially a race condition
 > between dhcpagent removing the route and in.mpathd realizing that the
 > interface has failed and therefore should not remove the next hop from
 > the target list?

No, since in.mpathd is responsible for triggering the group failure by
setting the IFF_FAILED flag on the last usable interface in the group.
Setting IFF_FAILED will in turn cause the kernel to clear IFF_RUNNING on
the IPMP IP interface, which will cause dhcpagent/routing daemons to
remove the routes -- but by that time, in.mpathd is well aware the group
has failed.
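
To make the ordering concrete, here is a rough sketch of the behaviour
under discussion.  It is only a model: struct group, target_add(),
target_delete() and group_repair() are hypothetical names invented for
illustration, not in.mpathd's actual data structures or functions, and
the real prototype is not shown here.  The point is that the failed flag
is already set when the route deletions arrive, so the delete is
ignored, and the target list is rebuilt from the current routes on
repair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TARGETS 8

/* Hypothetical model of one IPMP group's probe state (not in.mpathd's types). */
struct group {
    bool   failed;                      /* set when the group is declared failed */
    size_t ntargets;                    /* number of valid entries in targets[] */
    char   targets[MAX_TARGETS][16];    /* dotted-quad probe targets */
};

/* Add a probe target.  Additions are accepted even while the group is failed. */
static bool target_add(struct group *g, const char *addr)
{
    assert(g != NULL && addr != NULL);
    if (g->ntargets == MAX_TARGETS || strlen(addr) >= sizeof (g->targets[0]))
        return false;
    strcpy(g->targets[g->ntargets++], addr);
    return true;
}

/* Delete a probe target.  Returns false if the request was ignored. */
static bool target_delete(struct group *g, const char *addr)
{
    assert(g != NULL && addr != NULL);
    if (g->failed)
        return false;   /* proposed behaviour: keep probing the old targets */
    for (size_t i = 0; i < g->ntargets; i++) {
        if (strcmp(g->targets[i], addr) == 0) {
            /* Fill the hole with the last entry; order does not matter here. */
            g->ntargets--;
            if (i != g->ntargets)
                strcpy(g->targets[i], g->targets[g->ntargets]);
            return true;
        }
    }
    return false;
}

/* On repair, clear the failed state and rebuild the list from current routes. */
static void group_repair(struct group *g, const char *const *routes, size_t nroutes)
{
    assert(g != NULL);
    g->failed = false;
    g->ntargets = 0;
    for (size_t i = 0; i < nroutes; i++)
        (void) target_add(g, routes[i]);
}
```

In this model the daemon itself sets g->failed before anything else can
react to the failure, which is why a later target_delete() triggered by
route removal cannot win the race.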

-- 
meem
