On Fri, Aug 12, 2022 at 03:06:14PM +0200, Theo Buehler wrote:
> On Fri, Aug 12, 2022 at 12:43:09PM +0200, Claudio Jeker wrote:
> > There is currently a race in bgpd when multiple nexthop become invalid at
> > the same time. The problem is that the decision process may select an
> > alternative path that also has a no longer valid nexthop but the process
> > that does all the adjustments did not reach that prefix yet.
> > The main issue here is that the RDE uses the true_nexthop to communicate
> > this change but the true_nexthop is 0 or :: in this case.
> > 
> > This diff solves the issue by no longer using true_nexthop in the kroute
> > message but instead have the kroute code do the lookup instead. The state
> > in kroute is always up to date so the system knows if the nexthop is valid
> > or not and either issues a change or remove depending on that.
> > 
> > With this the rde no longer uses the true_nexthop (it is only there to be
> > reported to bgpctl). It only cares about exit_nexthop, validity and the
> > connected network info. All of those should not cause any problems during
> > nexthop flips.
> 
> This reads fine. If you fix the grammar of the comment, then it's ok, e.g.:
> 
> > +    * Ignore the nexthop for VPN routes. The gateway is a forced
> 
> s/a //
> 
> > +    * to an mpe(4) interface route using an MPLS label.

Something is still missing in this diff. Since changing a route does not
update the kroutes. Guess the kroute code needs to do this proper now, I
thought I could still use the old slow path for this but it looks like the
update is surpressed. 

Will look more into this and come up with a new version (with the grammar
fix from above).
-- 
:wq Claudio

Reply via email to