On Fri, Aug 12, 2022 at 03:06:14PM +0200, Theo Buehler wrote:
> On Fri, Aug 12, 2022 at 12:43:09PM +0200, Claudio Jeker wrote:
> > There is currently a race in bgpd when multiple nexthop become invalid at
> > the same time. The problem is that the decision process may select an
> > alternative path that also has a no longer valid nexthop but the process
> > that does all the adjustments did not reach that prefix yet.
> > The main issue here is that the RDE uses the true_nexthop to communicate
> > this change but the true_nexthop is 0 or :: in this case.
> >
> > This diff solves the issue by no longer using true_nexthop in the kroute
> > message but instead have the kroute code do the lookup instead. The state
> > in kroute is always up to date so the system knows if the nexthop is valid
> > or not and either issues a change or remove depending on that.
> >
> > With this the rde no longer uses the true_nexthop (it is only there to be
> > reported to bgpctl). It only cares about exit_nexthop, validity and the
> > connected network info. All of those should not cause any problems during
> > nexthop flips.
>
> This reads fine. If you fix the grammar of the comment, then it's ok, e.g.:
>
> > + * Ignore the nexthop for VPN routes. The gateway is a forced
>
> s/a //
>
> > + * to an mpe(4) interface route using an MPLS label.
Something is still missing in this diff: changing a route does not update
the kroutes. I guess the kroute code needs to do this properly now. I thought
I could still use the old slow path for this, but it looks like the update is
suppressed. I will look more into this and come up with a new version (with
the grammar fix from above).

-- 
:wq Claudio