On Thu, Feb 06, 2025 at 01:27:37PM +0100, Frode Nordahl wrote: > tor. 6. feb. 2025, 13:22 skrev Dumitru Ceara <[email protected]>: > > > On 2/6/25 1:08 PM, Felix Huettner wrote: > > > On Thu, Feb 06, 2025 at 11:48:45AM +0100, Dumitru Ceara wrote: > > >> On 2/6/25 11:38 AM, Frode Nordahl wrote: > > >>> On Thu, Feb 6, 2025 at 11:22 AM Felix Huettner > > >>> <[email protected]> wrote: > > >>>> > > >>>> On Thu, Feb 06, 2025 at 10:51:29AM +0100, Dumitru Ceara wrote: > > >>>>> On 2/6/25 10:31 AM, Frode Nordahl wrote: > > >>>>>> On Tue, Feb 4, 2025 at 3:04 PM Felix Huettner via dev > > >>>>>> <[email protected]> wrote: > > >>>>>>> > > >>>>>>> Previously we just assumed that if a LR had multiple LRPs bound to > > the > > >>>>>>> local chassis that all routes where valid for all of these LRPs. > > >>>>>>> This commit handles a previous option in ovn-nb that allowes the > > user to > > >>>>>>> specify the interface name that a route needs to use to be > > acceptable > > >>>>>>> for a given LRP. > > >>>>>>> The users can then have a 1:1 relationship between LRPs and > > interfaces > > >>>>>>> to allow ovn-controller to determine which route belongs to which > > LRP. > > >>>>>>> > > >>>>>>> This can e.g. be used if a chassis has multiple uplinks. Each of > > these > > >>>>>>> uplinks could be assigned to a different LRP of the same LR > > connected to > > >>>>>>> the same localnet. > > >>>>>>> The option allows then to map e.g. a learned default route to each > > >>>>>>> individual LRP. If one of the links would go down just one of the > > >>>>>>> learned routes would be removed, causing ECMP across multiple > > chassis to > > >>>>>>> be correctly weighted. > > >>>>>>> > > >>>>>>> Acked-by: Dumitru Ceara <[email protected]> > > >>>>>>> Signed-off-by: Felix Huettner <[email protected]> > > >>>>>>> --- > > >>>> > > >>>> Hi Frode, Hi Dumitru, > > >>>> > > >>>>>> > > >>>>>> Thank you for the patch, this is indeed an important use case, I > > >>>>>> wonder a bit about the approach to it though. > > >>>>>> > > >>>>>> Most constructs and options currently in OVN refer to logical > > >>>>>> constructs that reside either in OVN or in OVS. Your proposal > > >>>>>> introduces an option that refers to system details, which may or may > > >>>>>> not be the same across chassis. > > >>>>>> > > >>>>>> Putting my feet into the shoes of a CMS developer, how would I know > > >>>>>> how to map between these two worlds using only OVN APIs? > > >>>>>> > > >>>>>> I assume that you are using this feature together with the LRP > > >>>>>> options:routing-protocol-redirect feature, which takes the name of > > an > > >>>>>> LSP. > > >>>>>> > > >>>>>> Could we instead either use that value, or have this new option > > take a > > >>>>>> LSP name which you then can map to a system interface on each > > chassis > > >>>>>> as needed in the ovn-controller code? > > >>>>>> > > >>>>> > > >>>>> Ah, good point, you're right Frode. dynamic-routing-ifname is the > > name > > >>>>> of a system interface and there's no guarantee that will be the same > > on > > >>>>> all chassis. Using a LSP name instead might be the best way forward > > >>>>> indeed. We could use > > >>>>> local_binding_is_up(LRP.options.dynamic-routing-ifname) to determine > > if > > >>>>> that associated LSP is bound locally or not. > > >>>> > > >>>> I don't think we can directly map from an lsp name to the interface > > >>>> name. > > >>>> Allthough the following argumentation comes from the time where > > >>>> the patchseries had netns support i think this would still be valid. > > >>>> In the netns usecase i would assume that the routing-protocol-redirect > > >>>> LSP is locally bound to a veth pair. One part of the pair is plugged > > >>>> into OVS and mapped to the LSP. > > >>>> The other end of the pair would be in a netns. This would also be the > > >>>> interface that would actually learn the routes. > > >>>> > > >>>> So the interface that is mapped to the LSP would be different from the > > >>>> one we learn routes on. > > >>>> In that case i assumed that it is easier for users to ensure that the > > >>>> interface name across chassis is identical based on a value they are > > >>>> free to choose. > > > > > > Hi Frode, Hi Dumitru, > > > > > >>> > > >>> Dealing with the many ways a VIF can be attached in a system is indeed > > >>> non-trivial. Thank you for highlighting the veth pair case, I think > > >>> that is indeed the most common one. It is useful also without > > >>> namespaces, as it would allow OVS to be supervisor of one side and the > > >>> VRF to be supervisor of the peer. > > >>> > > >>>> What might be an option is to use the LSP path you describe above with > > >>>> an optional overwrite on the local ovsdb. > > >>>> > > >>>> What do you think of that? > > >>> > > >>> For the veth pair case, from cursory view the peer ifindex is visible > > >>> in `ethtool -S` for both sides, even across namespaces. > > >>> > > >>> However, we don't know how people will actually wire this, so perhaps > > >>> your suggestion of adding some option in the local ovsdb is the best > > >>> course of action. > > >>> > > >>> external_id in the Open vSwitch Interfaces table? > > >>> > > >> > > >> I'm trying to figure out how trivial incremental processing would be in > > >> ovn-controller for new OVS.Interface.external_ids. > > >> > > >> Felix, would route_runtime_data_handler() be called if such a new > > >> external-id changes value? > > >> > > >> If it would then I assume it would return "false" (we don't yet support > > >> I-P for that case) and trigger a route reconciliation. So that'd be > > >> fine. Or am I missing something? > > > > > > I would assume the same, but honestly i would only be sure once i tried > > > it out. > > > > > > Another thing that came to my mind: Going this way prevents the user > > > from using an interface filter if there is no LSP mapping to the > > > interface. > > > I think that something like this could happen if you integrate routing > > > with some custom agent. Especially if such a agent does not do in-band > > > BGP but rather uses some outside logic to learn routes. > > > > > > In such cases it might make more sense to use the external_id column in > > > the OpenVswitch table. > > > Per default we would lookup the interface based on the locally bound > > > LSP. > > > However if we find that LSP name in the external_ids of the OpenVswitch > > > table then we use the interface name in there, independent if that LSP > > > is bound locally or even exists. > > > > > > What are your opinions on that? > > > > > > > > > I would try such an implementation in the next hour or so. > > > If i don't get it to a nice state by then i would send out v7 with the > > > current state of this patch. > > > If needed this patch could be removed from the series and would probably > > > not affect the others too much. > > > > > > > This sounds OK to me, if Frode agrees. > > > > Works for me.
Hi Dumitru, Hi Frode, I think i have a working solution and i will send it out in v7. It creates some larger changes in: * northd: Sync routing data to pb. * controller: Support learning routes per iface. I will therefor remove the existing Acks from these commits. Additionally some smaller changes are in: * controller: Announce routes via route-exchange. * controller: Watch for route changes But these are just changes in the testcases to bring them in line with the above changes. I hope that changeset fits for everyone and that the quality is fine even though it was built fast :) Thanks a lot, Felix > > -- > Frode Nordahl > > > Thanks a lot, > > > Felix > > > > > > > > >> > > >> Thanks, > > >> Dumitru > > >> > > >>> -- > > >>> Frode Nordahl > > >>> > > >>>> Thanks a lot, > > >>>> Felix > > >>>> > > >>>>> > > >>>>> Regards, > > >>>>> Dumitru > > >>>>> > > >>>>>> -- > > >>>>>> Frode Nordahl > > >>>>>> > > >>>>>>> v5->v6: > > >>>>>>> * addressed review comments > > >>>>>>> v3->v4: > > >>>>>>> - addressed review comments. > > >>>>>>> - updated commit message to be more descriptive. > > >>>>>>> > > >>>>>>> controller/route-exchange-netlink.c | 2 ++ > > >>>>>>> controller/route-exchange-netlink.h | 3 +++ > > >>>>>>> controller/route-exchange.c | 20 ++++++++++++++------ > > >>>>>>> controller/route.c | 12 ++++++++---- > > >>>>>>> controller/route.h | 6 ++++-- > > >>>>>>> tests/system-ovn.at | 23 ++++++++++++++++++++++- > > >>>>>>> 6 files changed, 53 insertions(+), 13 deletions(-) > > >>>>>>> > > >>>>>>> diff --git a/controller/route-exchange-netlink.c > > b/controller/route-exchange-netlink.c > > >>>>>>> index 0969f15a1..82b87ae4e 100644 > > >>>>>>> --- a/controller/route-exchange-netlink.c > > >>>>>>> +++ b/controller/route-exchange-netlink.c > > >>>>>>> @@ -230,6 +230,8 @@ handle_route_msg(const struct route_table_msg > > *msg, void *data) > > >>>>>>> rr->prefix = rd->rta_dst; > > >>>>>>> rr->plen = rd->rtm_dst_len; > > >>>>>>> rr->nexthop = nexthop->addr; > > >>>>>>> + memcpy(rr->ifname, nexthop->ifname, IFNAMSIZ); > > >>>>>>> + rr->ifname[IFNAMSIZ] = 0; > > >>>>>>> ovs_list_push_back(handle_data->learned_routes, > > &rr->list_node); > > >>>>>>> } > > >>>>>>> return; > > >>>>>>> diff --git a/controller/route-exchange-netlink.h > > b/controller/route-exchange-netlink.h > > >>>>>>> index 9ea2fb8cb..dee651971 100644 > > >>>>>>> --- a/controller/route-exchange-netlink.h > > >>>>>>> +++ b/controller/route-exchange-netlink.h > > >>>>>>> @@ -21,6 +21,7 @@ > > >>>>>>> #include <stdint.h> > > >>>>>>> #include "openvswitch/list.h" > > >>>>>>> #include <netinet/in.h> > > >>>>>>> +#include <net/if.h> > > >>>>>>> > > >>>>>>> /* This value is arbitrary but currently unused. > > >>>>>>> * See > > https://github.com/iproute2/iproute2/blob/main/etc/iproute2/rt_protos */ > > >>>>>>> @@ -35,6 +36,8 @@ struct re_nl_received_route_node { > > >>>>>>> struct in6_addr prefix; > > >>>>>>> unsigned int plen; > > >>>>>>> struct in6_addr nexthop; > > >>>>>>> + /* Adding 1 to this to be sure we actually have a terminating > > '\0' */ > > >>>>>>> + char ifname[IFNAMSIZ + 1]; > > >>>>>>> }; > > >>>>>>> > > >>>>>>> int re_nl_create_vrf(const char *ifname, uint32_t table_id); > > >>>>>>> diff --git a/controller/route-exchange.c > > b/controller/route-exchange.c > > >>>>>>> index 850316465..a1976234b 100644 > > >>>>>>> --- a/controller/route-exchange.c > > >>>>>>> +++ b/controller/route-exchange.c > > >>>>>>> @@ -95,7 +95,7 @@ route_lookup(struct hmap *route_map, > > >>>>>>> static void > > >>>>>>> sb_sync_learned_routes(const struct ovs_list *learned_routes, > > >>>>>>> const struct sbrec_datapath_binding > > *datapath, > > >>>>>>> - const struct sset *bound_ports, > > >>>>>>> + const struct smap *bound_ports, > > >>>>>>> struct ovsdb_idl_txn *ovnsb_idl_txn, > > >>>>>>> struct ovsdb_idl_index > > *sbrec_port_binding_by_name, > > >>>>>>> struct ovsdb_idl_index > > *sbrec_learned_route_by_datapath) > > >>>>>>> @@ -110,8 +110,9 @@ sb_sync_learned_routes(const struct ovs_list > > *learned_routes, > > >>>>>>> SBREC_LEARNED_ROUTE_FOR_EACH_EQUAL (sb_route, filter, > > >>>>>>> > > sbrec_learned_route_by_datapath) { > > >>>>>>> /* If the port is not local we don't care about it. > > >>>>>>> - * Some other ovn-controller will handle it. */ > > >>>>>>> - if (!sset_contains(bound_ports, > > >>>>>>> + * Some other ovn-controller will handle it. > > >>>>>>> + * We may not use smap_get since the value might be > > validly NULL. */ > > >>>>>>> + if (!smap_get_node(bound_ports, > > >>>>>>> sb_route->logical_port->logical_port)) > > { > > >>>>>>> continue; > > >>>>>>> } > > >>>>>>> @@ -125,11 +126,18 @@ sb_sync_learned_routes(const struct ovs_list > > *learned_routes, > > >>>>>>> > > learned_route->plen); > > >>>>>>> char *nexthop = normalize_v46(&learned_route->nexthop); > > >>>>>>> > > >>>>>>> - const char *logical_port_name; > > >>>>>>> - SSET_FOR_EACH (logical_port_name, bound_ports) { > > >>>>>>> + struct smap_node *port_node; > > >>>>>>> + SMAP_FOR_EACH (port_node, bound_ports) { > > >>>>>>> + /* The user specified an ifname, but we learned it on > > a different > > >>>>>>> + * port. */ > > >>>>>>> + if (port_node->value && strcmp(port_node->value, > > >>>>>>> + > > learned_route->ifname)) { > > >>>>>>> + continue; > > >>>>>>> + } > > >>>>>>> + > > >>>>>>> const struct sbrec_port_binding *logical_port = > > >>>>>>> lport_lookup_by_name(sbrec_port_binding_by_name, > > >>>>>>> - logical_port_name); > > >>>>>>> + port_node->key); > > >>>>>>> if (!logical_port) { > > >>>>>>> continue; > > >>>>>>> } > > >>>>>>> diff --git a/controller/route.c b/controller/route.c > > >>>>>>> index 108435bad..7b8eed289 100644 > > >>>>>>> --- a/controller/route.c > > >>>>>>> +++ b/controller/route.c > > >>>>>>> @@ -82,7 +82,7 @@ advertise_datapath_cleanup(struct > > advertise_datapath_entry *ad) > > >>>>>>> free(ar); > > >>>>>>> } > > >>>>>>> hmap_destroy(&ad->routes); > > >>>>>>> - sset_destroy(&ad->bound_ports); > > >>>>>>> + smap_destroy(&ad->bound_ports); > > >>>>>>> free(ad); > > >>>>>>> } > > >>>>>>> > > >>>>>>> @@ -114,7 +114,7 @@ route_run(struct route_ctx_in *r_ctx_in, > > >>>>>>> ad = xzalloc(sizeof(*ad)); > > >>>>>>> ad->db = ld->datapath; > > >>>>>>> hmap_init(&ad->routes); > > >>>>>>> - sset_init(&ad->bound_ports); > > >>>>>>> + smap_init(&ad->bound_ports); > > >>>>>>> > > >>>>>>> /* This is a LR datapath, find LRPs with route exchange > > options > > >>>>>>> * that are bound locally. */ > > >>>>>>> @@ -132,10 +132,14 @@ route_run(struct route_ctx_in *r_ctx_in, > > >>>>>>> > > >>>>>>> ad->maintain_vrf |= smap_get_bool( > > >>>>>>> &repb->options, "dynamic-routing-maintain-vrf", > > false); > > >>>>>>> - sset_add(&ad->bound_ports, local_peer->logical_port); > > >>>>>>> + char *ifname = nullable_xstrdup( > > >>>>>>> + smap_get(&repb->options, > > >>>>>>> + > > "dynamic-routing-ifname")); > > >>>>>>> + smap_add_nocopy(&ad->bound_ports, > > >>>>>>> + xstrdup(local_peer->logical_port), > > ifname); > > >>>>>>> } > > >>>>>>> > > >>>>>>> - if (sset_is_empty(&ad->bound_ports)) { > > >>>>>>> + if (smap_is_empty(&ad->bound_ports)) { > > >>>>>>> advertise_datapath_cleanup(ad); > > >>>>>>> continue; > > >>>>>>> } > > >>>>>>> diff --git a/controller/route.h b/controller/route.h > > >>>>>>> index 10b9434fc..c53134559 100644 > > >>>>>>> --- a/controller/route.h > > >>>>>>> +++ b/controller/route.h > > >>>>>>> @@ -22,6 +22,7 @@ > > >>>>>>> #include <netinet/in.h> > > >>>>>>> #include "openvswitch/hmap.h" > > >>>>>>> #include "sset.h" > > >>>>>>> +#include "smap.h" > > >>>>>>> > > >>>>>>> struct hmap; > > >>>>>>> struct ovsdb_idl_index; > > >>>>>>> @@ -50,8 +51,9 @@ struct advertise_datapath_entry { > > >>>>>>> struct hmap routes; > > >>>>>>> > > >>>>>>> /* The name of the port bindings locally bound for this > > datapath and > > >>>>>>> - * running route exchange logic. */ > > >>>>>>> - struct sset bound_ports; > > >>>>>>> + * running route exchange logic. > > >>>>>>> + * The key is the port name and the value is the ifname if > > set. */ > > >>>>>>> + struct smap bound_ports; > > >>>>>>> }; > > >>>>>>> > > >>>>>>> struct advertise_route_entry { > > >>>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > >>>>>>> index afe0ee902..f41a23115 100644 > > >>>>>>> --- a/tests/system-ovn.at > > >>>>>>> +++ b/tests/system-ovn.at > > >>>>>>> @@ -15660,6 +15660,17 @@ check_row_count Learned_Route 1 > > >>>>>>> lp=$(fetch_column port_binding _uuid logical_port=internet-phys) > > >>>>>>> check_row_count Learned_Route 1 logical_port=$lp ip_prefix= > > 233.252.0.0/24 nexthop=192.168.10.10 > > >>>>>>> > > >>>>>>> +# By setting a learning interface filter we will now forget about > > this route. > > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > > >>>>>>> + options:dynamic-routing-ifname=thisdoesnotexist > > >>>>>>> +check_row_count Learned_Route 0 > > >>>>>>> + > > >>>>>>> +# Changing it to "lo" will allow us to learn the route again. > > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > > >>>>>>> + options:dynamic-routing-ifname=lo > > >>>>>>> +check_row_count Learned_Route 1 > > >>>>>>> + > > >>>>>>> + > > >>>>>>> OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > >>>>>>> > > >>>>>>> as ovn-sb > > >>>>>>> @@ -15869,10 +15880,20 @@ check ip route add 233.252.0.0/24 via > > 192.168.10.10 dev lo onlink vrf ovnvrf1337 > > >>>>>>> # For now we trigger a recompute as route watching is not yet > > implemented. > > >>>>>>> check ovn-appctl -t ovn-controller inc-engine/recompute > > >>>>>>> check ovn-nbctl --wait=hv sync > > >>>>>>> -check_row_count Learned_Route 2 > > >>>>>>> +check_row_count Learned_Route 1 > > >>>>>>> lp=$(fetch_column port_binding _uuid logical_port=internet-phys) > > >>>>>>> check_row_count Learned_Route 1 logical_port=$lp ip_prefix= > > 233.252.0.0/24 nexthop=192.168.10.10 > > >>>>>>> > > >>>>>>> +# By setting a learning interface filter we will now forget about > > this route. > > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > > >>>>>>> + options:dynamic-routing-ifname=thisdoesnotexist > > >>>>>>> +check_row_count Learned_Route 0 > > >>>>>>> + > > >>>>>>> +# Changing it to "lo" will allow us to learn the route again. > > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > > >>>>>>> + options:dynamic-routing-ifname=lo > > >>>>>>> +check_row_count Learned_Route 1 > > >>>>>>> + > > >>>>>>> as ovn-sb > > >>>>>>> OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > >>>>>>> > > >>>>>>> -- > > >>>>>>> 2.47.1 > > >>>>>>> > > >>>>>>> > > >>>>>>> _______________________________________________ > > >>>>>>> dev mailing list > > >>>>>>> [email protected] > > >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > >>>>>> > > >>>>> > > >>> > > >> > > > > > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
