On Tue, Feb 4, 2025 at 3:04 PM Felix Huettner via dev <[email protected]> wrote: > > Previously we just assumed that if a LR had multiple LRPs bound to the > local chassis that all routes where valid for all of these LRPs. > This commit handles a previous option in ovn-nb that allowes the user to > specify the interface name that a route needs to use to be acceptable > for a given LRP. > The users can then have a 1:1 relationship between LRPs and interfaces > to allow ovn-controller to determine which route belongs to which LRP. > > This can e.g. be used if a chassis has multiple uplinks. Each of these > uplinks could be assigned to a different LRP of the same LR connected to > the same localnet. > The option allows then to map e.g. a learned default route to each > individual LRP. If one of the links would go down just one of the > learned routes would be removed, causing ECMP across multiple chassis to > be correctly weighted. > > Acked-by: Dumitru Ceara <[email protected]> > Signed-off-by: Felix Huettner <[email protected]> > ---
Thank you for the patch, this is indeed an important use case, I wonder a bit about the approach to it though. Most constructs and options currently in OVN refer to logical constructs that reside either in OVN or in OVS. Your proposal introduces an option that refers to system details, which may or may not be the same across chassis. Putting my feet into the shoes of a CMS developer, how would I know how to map between these two worlds using only OVN APIs? I assume that you are using this feature together with the LRP options:routing-protocol-redirect feature, which takes the name of an LSP. Could we instead either use that value, or have this new option take a LSP name which you then can map to a system interface on each chassis as needed in the ovn-controller code? -- Frode Nordahl > v5->v6: > * addressed review comments > v3->v4: > - addressed review comments. > - updated commit message to be more descriptive. > > controller/route-exchange-netlink.c | 2 ++ > controller/route-exchange-netlink.h | 3 +++ > controller/route-exchange.c | 20 ++++++++++++++------ > controller/route.c | 12 ++++++++---- > controller/route.h | 6 ++++-- > tests/system-ovn.at | 23 ++++++++++++++++++++++- > 6 files changed, 53 insertions(+), 13 deletions(-) > > diff --git a/controller/route-exchange-netlink.c > b/controller/route-exchange-netlink.c > index 0969f15a1..82b87ae4e 100644 > --- a/controller/route-exchange-netlink.c > +++ b/controller/route-exchange-netlink.c > @@ -230,6 +230,8 @@ handle_route_msg(const struct route_table_msg *msg, void > *data) > rr->prefix = rd->rta_dst; > rr->plen = rd->rtm_dst_len; > rr->nexthop = nexthop->addr; > + memcpy(rr->ifname, nexthop->ifname, IFNAMSIZ); > + rr->ifname[IFNAMSIZ] = 0; > ovs_list_push_back(handle_data->learned_routes, &rr->list_node); > } > return; > diff --git a/controller/route-exchange-netlink.h > b/controller/route-exchange-netlink.h > index 9ea2fb8cb..dee651971 100644 > --- a/controller/route-exchange-netlink.h > +++ b/controller/route-exchange-netlink.h > @@ -21,6 +21,7 @@ > #include <stdint.h> > #include "openvswitch/list.h" > #include <netinet/in.h> > +#include <net/if.h> > > /* This value is arbitrary but currently unused. > * See https://github.com/iproute2/iproute2/blob/main/etc/iproute2/rt_protos > */ > @@ -35,6 +36,8 @@ struct re_nl_received_route_node { > struct in6_addr prefix; > unsigned int plen; > struct in6_addr nexthop; > + /* Adding 1 to this to be sure we actually have a terminating '\0' */ > + char ifname[IFNAMSIZ + 1]; > }; > > int re_nl_create_vrf(const char *ifname, uint32_t table_id); > diff --git a/controller/route-exchange.c b/controller/route-exchange.c > index 850316465..a1976234b 100644 > --- a/controller/route-exchange.c > +++ b/controller/route-exchange.c > @@ -95,7 +95,7 @@ route_lookup(struct hmap *route_map, > static void > sb_sync_learned_routes(const struct ovs_list *learned_routes, > const struct sbrec_datapath_binding *datapath, > - const struct sset *bound_ports, > + const struct smap *bound_ports, > struct ovsdb_idl_txn *ovnsb_idl_txn, > struct ovsdb_idl_index *sbrec_port_binding_by_name, > struct ovsdb_idl_index > *sbrec_learned_route_by_datapath) > @@ -110,8 +110,9 @@ sb_sync_learned_routes(const struct ovs_list > *learned_routes, > SBREC_LEARNED_ROUTE_FOR_EACH_EQUAL (sb_route, filter, > sbrec_learned_route_by_datapath) { > /* If the port is not local we don't care about it. > - * Some other ovn-controller will handle it. */ > - if (!sset_contains(bound_ports, > + * Some other ovn-controller will handle it. > + * We may not use smap_get since the value might be validly NULL. */ > + if (!smap_get_node(bound_ports, > sb_route->logical_port->logical_port)) { > continue; > } > @@ -125,11 +126,18 @@ sb_sync_learned_routes(const struct ovs_list > *learned_routes, > learned_route->plen); > char *nexthop = normalize_v46(&learned_route->nexthop); > > - const char *logical_port_name; > - SSET_FOR_EACH (logical_port_name, bound_ports) { > + struct smap_node *port_node; > + SMAP_FOR_EACH (port_node, bound_ports) { > + /* The user specified an ifname, but we learned it on a different > + * port. */ > + if (port_node->value && strcmp(port_node->value, > + learned_route->ifname)) { > + continue; > + } > + > const struct sbrec_port_binding *logical_port = > lport_lookup_by_name(sbrec_port_binding_by_name, > - logical_port_name); > + port_node->key); > if (!logical_port) { > continue; > } > diff --git a/controller/route.c b/controller/route.c > index 108435bad..7b8eed289 100644 > --- a/controller/route.c > +++ b/controller/route.c > @@ -82,7 +82,7 @@ advertise_datapath_cleanup(struct advertise_datapath_entry > *ad) > free(ar); > } > hmap_destroy(&ad->routes); > - sset_destroy(&ad->bound_ports); > + smap_destroy(&ad->bound_ports); > free(ad); > } > > @@ -114,7 +114,7 @@ route_run(struct route_ctx_in *r_ctx_in, > ad = xzalloc(sizeof(*ad)); > ad->db = ld->datapath; > hmap_init(&ad->routes); > - sset_init(&ad->bound_ports); > + smap_init(&ad->bound_ports); > > /* This is a LR datapath, find LRPs with route exchange options > * that are bound locally. */ > @@ -132,10 +132,14 @@ route_run(struct route_ctx_in *r_ctx_in, > > ad->maintain_vrf |= smap_get_bool( > &repb->options, "dynamic-routing-maintain-vrf", false); > - sset_add(&ad->bound_ports, local_peer->logical_port); > + char *ifname = nullable_xstrdup( > + smap_get(&repb->options, > + "dynamic-routing-ifname")); > + smap_add_nocopy(&ad->bound_ports, > + xstrdup(local_peer->logical_port), ifname); > } > > - if (sset_is_empty(&ad->bound_ports)) { > + if (smap_is_empty(&ad->bound_ports)) { > advertise_datapath_cleanup(ad); > continue; > } > diff --git a/controller/route.h b/controller/route.h > index 10b9434fc..c53134559 100644 > --- a/controller/route.h > +++ b/controller/route.h > @@ -22,6 +22,7 @@ > #include <netinet/in.h> > #include "openvswitch/hmap.h" > #include "sset.h" > +#include "smap.h" > > struct hmap; > struct ovsdb_idl_index; > @@ -50,8 +51,9 @@ struct advertise_datapath_entry { > struct hmap routes; > > /* The name of the port bindings locally bound for this datapath and > - * running route exchange logic. */ > - struct sset bound_ports; > + * running route exchange logic. > + * The key is the port name and the value is the ifname if set. */ > + struct smap bound_ports; > }; > > struct advertise_route_entry { > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > index afe0ee902..f41a23115 100644 > --- a/tests/system-ovn.at > +++ b/tests/system-ovn.at > @@ -15660,6 +15660,17 @@ check_row_count Learned_Route 1 > lp=$(fetch_column port_binding _uuid logical_port=internet-phys) > check_row_count Learned_Route 1 logical_port=$lp ip_prefix=233.252.0.0/24 > nexthop=192.168.10.10 > > +# By setting a learning interface filter we will now forget about this route. > +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > + options:dynamic-routing-ifname=thisdoesnotexist > +check_row_count Learned_Route 0 > + > +# Changing it to "lo" will allow us to learn the route again. > +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > + options:dynamic-routing-ifname=lo > +check_row_count Learned_Route 1 > + > + > OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > as ovn-sb > @@ -15869,10 +15880,20 @@ check ip route add 233.252.0.0/24 via 192.168.10.10 > dev lo onlink vrf ovnvrf1337 > # For now we trigger a recompute as route watching is not yet implemented. > check ovn-appctl -t ovn-controller inc-engine/recompute > check ovn-nbctl --wait=hv sync > -check_row_count Learned_Route 2 > +check_row_count Learned_Route 1 > lp=$(fetch_column port_binding _uuid logical_port=internet-phys) > check_row_count Learned_Route 1 logical_port=$lp ip_prefix=233.252.0.0/24 > nexthop=192.168.10.10 > > +# By setting a learning interface filter we will now forget about this route. > +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > + options:dynamic-routing-ifname=thisdoesnotexist > +check_row_count Learned_Route 0 > + > +# Changing it to "lo" will allow us to learn the route again. > +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > + options:dynamic-routing-ifname=lo > +check_row_count Learned_Route 1 > + > as ovn-sb > OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > -- > 2.47.1 > > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
