Re: bgpd better nexthop handling

Theo Buehler Mon, 15 Aug 2022 06:01:12 -0700

On Mon, Aug 15, 2022 at 02:26:32PM +0200, Claudio Jeker wrote:
> On Fri, Aug 12, 2022 at 03:30:16PM +0200, Claudio Jeker wrote:
> > On Fri, Aug 12, 2022 at 03:06:14PM +0200, Theo Buehler wrote:
> > > On Fri, Aug 12, 2022 at 12:43:09PM +0200, Claudio Jeker wrote:
> > > > There is currently a race in bgpd when multiple nexthop become invalid 
> > > > at
> > > > the same time. The problem is that the decision process may select an
> > > > alternative path that also has a no longer valid nexthop but the process
> > > > that does all the adjustments did not reach that prefix yet.
> > > > The main issue here is that the RDE uses the true_nexthop to communicate
> > > > this change but the true_nexthop is 0 or :: in this case.
> > > > 
> > > > This diff solves the issue by no longer using true_nexthop in the kroute
> > > > message but instead have the kroute code do the lookup instead. The 
> > > > state
> > > > in kroute is always up to date so the system knows if the nexthop is 
> > > > valid
> > > > or not and either issues a change or remove depending on that.
> > > > 
> > > > With this the rde no longer uses the true_nexthop (it is only there to 
> > > > be
> > > > reported to bgpctl). It only cares about exit_nexthop, validity and the
> > > > connected network info. All of those should not cause any problems 
> > > > during
> > > > nexthop flips.
> > > 
> > > This reads fine. If you fix the grammar of the comment, then it's ok, 
> > > e.g.:
> > > 
> > > > +        * Ignore the nexthop for VPN routes. The gateway is a forced
> > > 
> > > s/a //
> > > 
> > > > +        * to an mpe(4) interface route using an MPLS label.
> > 
> > Something is still missing in this diff. Since changing a route does not
> > update the kroutes. Guess the kroute code needs to do this proper now, I
> > thought I could still use the old slow path for this but it looks like the
> > update is surpressed. 
> > 
> > Will look more into this and come up with a new version (with the grammar
> > fix from above).
> 
> Here we go. The bug I was chasing was actually introduced a few weeks ago.
> The problem is that when you RTM_CHANGE a route knexthop_track() fails to
> notice the change (okr and kr remain the same since the route did not
> change) and so no update is issued.
> Fix this by introducing knexthop_update() that just forces an update if
> the kroute covering this knexthop is modified. The rest of the diff is
> unmodified.
>


ok tb

I have one small question:

> +/*
> + * Called on route change.
> + */
> +void
> +knexthop_update(struct ktable *kt, struct kroute_full *kf)
> +{
> +     struct knexthop *kn;
> +
> +     RB_FOREACH(kn, knexthop_tree, KT2KNT(kt))
> +             if (prefix_compare(&kf->prefix, &kn->nexthop,
> +                 kf->prefixlen) == 0)
> +                     knexthop_send_update(kn);

Can there actually be multiple knexthops with the same prefix or could we
return early here?

> +}
> +
>  void
>  knexthop_send_update(struct knexthop *kn)
>  {

Re: bgpd better nexthop handling

Reply via email to