On Mon, Aug 15, 2022 at 02:26:32PM +0200, Claudio Jeker wrote: > On Fri, Aug 12, 2022 at 03:30:16PM +0200, Claudio Jeker wrote: > > On Fri, Aug 12, 2022 at 03:06:14PM +0200, Theo Buehler wrote: > > > On Fri, Aug 12, 2022 at 12:43:09PM +0200, Claudio Jeker wrote: > > > > There is currently a race in bgpd when multiple nexthop become invalid > > > > at > > > > the same time. The problem is that the decision process may select an > > > > alternative path that also has a no longer valid nexthop but the process > > > > that does all the adjustments did not reach that prefix yet. > > > > The main issue here is that the RDE uses the true_nexthop to communicate > > > > this change but the true_nexthop is 0 or :: in this case. > > > > > > > > This diff solves the issue by no longer using true_nexthop in the kroute > > > > message but instead have the kroute code do the lookup instead. The > > > > state > > > > in kroute is always up to date so the system knows if the nexthop is > > > > valid > > > > or not and either issues a change or remove depending on that. > > > > > > > > With this the rde no longer uses the true_nexthop (it is only there to > > > > be > > > > reported to bgpctl). It only cares about exit_nexthop, validity and the > > > > connected network info. All of those should not cause any problems > > > > during > > > > nexthop flips. > > > > > > This reads fine. If you fix the grammar of the comment, then it's ok, > > > e.g.: > > > > > > > + * Ignore the nexthop for VPN routes. The gateway is a forced > > > > > > s/a // > > > > > > > + * to an mpe(4) interface route using an MPLS label. > > > > Something is still missing in this diff. Since changing a route does not > > update the kroutes. Guess the kroute code needs to do this proper now, I > > thought I could still use the old slow path for this but it looks like the > > update is surpressed. > > > > Will look more into this and come up with a new version (with the grammar > > fix from above). > > Here we go. The bug I was chasing was actually introduced a few weeks ago. > The problem is that when you RTM_CHANGE a route knexthop_track() fails to > notice the change (okr and kr remain the same since the route did not > change) and so no update is issued. > Fix this by introducing knexthop_update() that just forces an update if > the kroute covering this knexthop is modified. The rest of the diff is > unmodified. >
ok tb I have one small question: > +/* > + * Called on route change. > + */ > +void > +knexthop_update(struct ktable *kt, struct kroute_full *kf) > +{ > + struct knexthop *kn; > + > + RB_FOREACH(kn, knexthop_tree, KT2KNT(kt)) > + if (prefix_compare(&kf->prefix, &kn->nexthop, > + kf->prefixlen) == 0) > + knexthop_send_update(kn); Can there actually be multiple knexthops with the same prefix or could we return early here? > +} > + > void > knexthop_send_update(struct knexthop *kn) > {