tor. 6. feb. 2025, 13:22 skrev Dumitru Ceara <[email protected]>: > On 2/6/25 1:08 PM, Felix Huettner wrote: > > On Thu, Feb 06, 2025 at 11:48:45AM +0100, Dumitru Ceara wrote: > >> On 2/6/25 11:38 AM, Frode Nordahl wrote: > >>> On Thu, Feb 6, 2025 at 11:22 AM Felix Huettner > >>> <[email protected]> wrote: > >>>> > >>>> On Thu, Feb 06, 2025 at 10:51:29AM +0100, Dumitru Ceara wrote: > >>>>> On 2/6/25 10:31 AM, Frode Nordahl wrote: > >>>>>> On Tue, Feb 4, 2025 at 3:04 PM Felix Huettner via dev > >>>>>> <[email protected]> wrote: > >>>>>>> > >>>>>>> Previously we just assumed that if a LR had multiple LRPs bound to > the > >>>>>>> local chassis that all routes where valid for all of these LRPs. > >>>>>>> This commit handles a previous option in ovn-nb that allowes the > user to > >>>>>>> specify the interface name that a route needs to use to be > acceptable > >>>>>>> for a given LRP. > >>>>>>> The users can then have a 1:1 relationship between LRPs and > interfaces > >>>>>>> to allow ovn-controller to determine which route belongs to which > LRP. > >>>>>>> > >>>>>>> This can e.g. be used if a chassis has multiple uplinks. Each of > these > >>>>>>> uplinks could be assigned to a different LRP of the same LR > connected to > >>>>>>> the same localnet. > >>>>>>> The option allows then to map e.g. a learned default route to each > >>>>>>> individual LRP. If one of the links would go down just one of the > >>>>>>> learned routes would be removed, causing ECMP across multiple > chassis to > >>>>>>> be correctly weighted. > >>>>>>> > >>>>>>> Acked-by: Dumitru Ceara <[email protected]> > >>>>>>> Signed-off-by: Felix Huettner <[email protected]> > >>>>>>> --- > >>>> > >>>> Hi Frode, Hi Dumitru, > >>>> > >>>>>> > >>>>>> Thank you for the patch, this is indeed an important use case, I > >>>>>> wonder a bit about the approach to it though. > >>>>>> > >>>>>> Most constructs and options currently in OVN refer to logical > >>>>>> constructs that reside either in OVN or in OVS. Your proposal > >>>>>> introduces an option that refers to system details, which may or may > >>>>>> not be the same across chassis. > >>>>>> > >>>>>> Putting my feet into the shoes of a CMS developer, how would I know > >>>>>> how to map between these two worlds using only OVN APIs? > >>>>>> > >>>>>> I assume that you are using this feature together with the LRP > >>>>>> options:routing-protocol-redirect feature, which takes the name of > an > >>>>>> LSP. > >>>>>> > >>>>>> Could we instead either use that value, or have this new option > take a > >>>>>> LSP name which you then can map to a system interface on each > chassis > >>>>>> as needed in the ovn-controller code? > >>>>>> > >>>>> > >>>>> Ah, good point, you're right Frode. dynamic-routing-ifname is the > name > >>>>> of a system interface and there's no guarantee that will be the same > on > >>>>> all chassis. Using a LSP name instead might be the best way forward > >>>>> indeed. We could use > >>>>> local_binding_is_up(LRP.options.dynamic-routing-ifname) to determine > if > >>>>> that associated LSP is bound locally or not. > >>>> > >>>> I don't think we can directly map from an lsp name to the interface > >>>> name. > >>>> Allthough the following argumentation comes from the time where > >>>> the patchseries had netns support i think this would still be valid. > >>>> In the netns usecase i would assume that the routing-protocol-redirect > >>>> LSP is locally bound to a veth pair. One part of the pair is plugged > >>>> into OVS and mapped to the LSP. > >>>> The other end of the pair would be in a netns. This would also be the > >>>> interface that would actually learn the routes. > >>>> > >>>> So the interface that is mapped to the LSP would be different from the > >>>> one we learn routes on. > >>>> In that case i assumed that it is easier for users to ensure that the > >>>> interface name across chassis is identical based on a value they are > >>>> free to choose. > > > > Hi Frode, Hi Dumitru, > > > >>> > >>> Dealing with the many ways a VIF can be attached in a system is indeed > >>> non-trivial. Thank you for highlighting the veth pair case, I think > >>> that is indeed the most common one. It is useful also without > >>> namespaces, as it would allow OVS to be supervisor of one side and the > >>> VRF to be supervisor of the peer. > >>> > >>>> What might be an option is to use the LSP path you describe above with > >>>> an optional overwrite on the local ovsdb. > >>>> > >>>> What do you think of that? > >>> > >>> For the veth pair case, from cursory view the peer ifindex is visible > >>> in `ethtool -S` for both sides, even across namespaces. > >>> > >>> However, we don't know how people will actually wire this, so perhaps > >>> your suggestion of adding some option in the local ovsdb is the best > >>> course of action. > >>> > >>> external_id in the Open vSwitch Interfaces table? > >>> > >> > >> I'm trying to figure out how trivial incremental processing would be in > >> ovn-controller for new OVS.Interface.external_ids. > >> > >> Felix, would route_runtime_data_handler() be called if such a new > >> external-id changes value? > >> > >> If it would then I assume it would return "false" (we don't yet support > >> I-P for that case) and trigger a route reconciliation. So that'd be > >> fine. Or am I missing something? > > > > I would assume the same, but honestly i would only be sure once i tried > > it out. > > > > Another thing that came to my mind: Going this way prevents the user > > from using an interface filter if there is no LSP mapping to the > > interface. > > I think that something like this could happen if you integrate routing > > with some custom agent. Especially if such a agent does not do in-band > > BGP but rather uses some outside logic to learn routes. > > > > In such cases it might make more sense to use the external_id column in > > the OpenVswitch table. > > Per default we would lookup the interface based on the locally bound > > LSP. > > However if we find that LSP name in the external_ids of the OpenVswitch > > table then we use the interface name in there, independent if that LSP > > is bound locally or even exists. > > > > What are your opinions on that? > > > > > > I would try such an implementation in the next hour or so. > > If i don't get it to a nice state by then i would send out v7 with the > > current state of this patch. > > If needed this patch could be removed from the series and would probably > > not affect the others too much. > > > > This sounds OK to me, if Frode agrees. >
Works for me. -- Frode Nordahl > Thanks a lot, > > Felix > > > > > >> > >> Thanks, > >> Dumitru > >> > >>> -- > >>> Frode Nordahl > >>> > >>>> Thanks a lot, > >>>> Felix > >>>> > >>>>> > >>>>> Regards, > >>>>> Dumitru > >>>>> > >>>>>> -- > >>>>>> Frode Nordahl > >>>>>> > >>>>>>> v5->v6: > >>>>>>> * addressed review comments > >>>>>>> v3->v4: > >>>>>>> - addressed review comments. > >>>>>>> - updated commit message to be more descriptive. > >>>>>>> > >>>>>>> controller/route-exchange-netlink.c | 2 ++ > >>>>>>> controller/route-exchange-netlink.h | 3 +++ > >>>>>>> controller/route-exchange.c | 20 ++++++++++++++------ > >>>>>>> controller/route.c | 12 ++++++++---- > >>>>>>> controller/route.h | 6 ++++-- > >>>>>>> tests/system-ovn.at | 23 ++++++++++++++++++++++- > >>>>>>> 6 files changed, 53 insertions(+), 13 deletions(-) > >>>>>>> > >>>>>>> diff --git a/controller/route-exchange-netlink.c > b/controller/route-exchange-netlink.c > >>>>>>> index 0969f15a1..82b87ae4e 100644 > >>>>>>> --- a/controller/route-exchange-netlink.c > >>>>>>> +++ b/controller/route-exchange-netlink.c > >>>>>>> @@ -230,6 +230,8 @@ handle_route_msg(const struct route_table_msg > *msg, void *data) > >>>>>>> rr->prefix = rd->rta_dst; > >>>>>>> rr->plen = rd->rtm_dst_len; > >>>>>>> rr->nexthop = nexthop->addr; > >>>>>>> + memcpy(rr->ifname, nexthop->ifname, IFNAMSIZ); > >>>>>>> + rr->ifname[IFNAMSIZ] = 0; > >>>>>>> ovs_list_push_back(handle_data->learned_routes, > &rr->list_node); > >>>>>>> } > >>>>>>> return; > >>>>>>> diff --git a/controller/route-exchange-netlink.h > b/controller/route-exchange-netlink.h > >>>>>>> index 9ea2fb8cb..dee651971 100644 > >>>>>>> --- a/controller/route-exchange-netlink.h > >>>>>>> +++ b/controller/route-exchange-netlink.h > >>>>>>> @@ -21,6 +21,7 @@ > >>>>>>> #include <stdint.h> > >>>>>>> #include "openvswitch/list.h" > >>>>>>> #include <netinet/in.h> > >>>>>>> +#include <net/if.h> > >>>>>>> > >>>>>>> /* This value is arbitrary but currently unused. > >>>>>>> * See > https://github.com/iproute2/iproute2/blob/main/etc/iproute2/rt_protos */ > >>>>>>> @@ -35,6 +36,8 @@ struct re_nl_received_route_node { > >>>>>>> struct in6_addr prefix; > >>>>>>> unsigned int plen; > >>>>>>> struct in6_addr nexthop; > >>>>>>> + /* Adding 1 to this to be sure we actually have a terminating > '\0' */ > >>>>>>> + char ifname[IFNAMSIZ + 1]; > >>>>>>> }; > >>>>>>> > >>>>>>> int re_nl_create_vrf(const char *ifname, uint32_t table_id); > >>>>>>> diff --git a/controller/route-exchange.c > b/controller/route-exchange.c > >>>>>>> index 850316465..a1976234b 100644 > >>>>>>> --- a/controller/route-exchange.c > >>>>>>> +++ b/controller/route-exchange.c > >>>>>>> @@ -95,7 +95,7 @@ route_lookup(struct hmap *route_map, > >>>>>>> static void > >>>>>>> sb_sync_learned_routes(const struct ovs_list *learned_routes, > >>>>>>> const struct sbrec_datapath_binding > *datapath, > >>>>>>> - const struct sset *bound_ports, > >>>>>>> + const struct smap *bound_ports, > >>>>>>> struct ovsdb_idl_txn *ovnsb_idl_txn, > >>>>>>> struct ovsdb_idl_index > *sbrec_port_binding_by_name, > >>>>>>> struct ovsdb_idl_index > *sbrec_learned_route_by_datapath) > >>>>>>> @@ -110,8 +110,9 @@ sb_sync_learned_routes(const struct ovs_list > *learned_routes, > >>>>>>> SBREC_LEARNED_ROUTE_FOR_EACH_EQUAL (sb_route, filter, > >>>>>>> > sbrec_learned_route_by_datapath) { > >>>>>>> /* If the port is not local we don't care about it. > >>>>>>> - * Some other ovn-controller will handle it. */ > >>>>>>> - if (!sset_contains(bound_ports, > >>>>>>> + * Some other ovn-controller will handle it. > >>>>>>> + * We may not use smap_get since the value might be > validly NULL. */ > >>>>>>> + if (!smap_get_node(bound_ports, > >>>>>>> sb_route->logical_port->logical_port)) > { > >>>>>>> continue; > >>>>>>> } > >>>>>>> @@ -125,11 +126,18 @@ sb_sync_learned_routes(const struct ovs_list > *learned_routes, > >>>>>>> > learned_route->plen); > >>>>>>> char *nexthop = normalize_v46(&learned_route->nexthop); > >>>>>>> > >>>>>>> - const char *logical_port_name; > >>>>>>> - SSET_FOR_EACH (logical_port_name, bound_ports) { > >>>>>>> + struct smap_node *port_node; > >>>>>>> + SMAP_FOR_EACH (port_node, bound_ports) { > >>>>>>> + /* The user specified an ifname, but we learned it on > a different > >>>>>>> + * port. */ > >>>>>>> + if (port_node->value && strcmp(port_node->value, > >>>>>>> + > learned_route->ifname)) { > >>>>>>> + continue; > >>>>>>> + } > >>>>>>> + > >>>>>>> const struct sbrec_port_binding *logical_port = > >>>>>>> lport_lookup_by_name(sbrec_port_binding_by_name, > >>>>>>> - logical_port_name); > >>>>>>> + port_node->key); > >>>>>>> if (!logical_port) { > >>>>>>> continue; > >>>>>>> } > >>>>>>> diff --git a/controller/route.c b/controller/route.c > >>>>>>> index 108435bad..7b8eed289 100644 > >>>>>>> --- a/controller/route.c > >>>>>>> +++ b/controller/route.c > >>>>>>> @@ -82,7 +82,7 @@ advertise_datapath_cleanup(struct > advertise_datapath_entry *ad) > >>>>>>> free(ar); > >>>>>>> } > >>>>>>> hmap_destroy(&ad->routes); > >>>>>>> - sset_destroy(&ad->bound_ports); > >>>>>>> + smap_destroy(&ad->bound_ports); > >>>>>>> free(ad); > >>>>>>> } > >>>>>>> > >>>>>>> @@ -114,7 +114,7 @@ route_run(struct route_ctx_in *r_ctx_in, > >>>>>>> ad = xzalloc(sizeof(*ad)); > >>>>>>> ad->db = ld->datapath; > >>>>>>> hmap_init(&ad->routes); > >>>>>>> - sset_init(&ad->bound_ports); > >>>>>>> + smap_init(&ad->bound_ports); > >>>>>>> > >>>>>>> /* This is a LR datapath, find LRPs with route exchange > options > >>>>>>> * that are bound locally. */ > >>>>>>> @@ -132,10 +132,14 @@ route_run(struct route_ctx_in *r_ctx_in, > >>>>>>> > >>>>>>> ad->maintain_vrf |= smap_get_bool( > >>>>>>> &repb->options, "dynamic-routing-maintain-vrf", > false); > >>>>>>> - sset_add(&ad->bound_ports, local_peer->logical_port); > >>>>>>> + char *ifname = nullable_xstrdup( > >>>>>>> + smap_get(&repb->options, > >>>>>>> + > "dynamic-routing-ifname")); > >>>>>>> + smap_add_nocopy(&ad->bound_ports, > >>>>>>> + xstrdup(local_peer->logical_port), > ifname); > >>>>>>> } > >>>>>>> > >>>>>>> - if (sset_is_empty(&ad->bound_ports)) { > >>>>>>> + if (smap_is_empty(&ad->bound_ports)) { > >>>>>>> advertise_datapath_cleanup(ad); > >>>>>>> continue; > >>>>>>> } > >>>>>>> diff --git a/controller/route.h b/controller/route.h > >>>>>>> index 10b9434fc..c53134559 100644 > >>>>>>> --- a/controller/route.h > >>>>>>> +++ b/controller/route.h > >>>>>>> @@ -22,6 +22,7 @@ > >>>>>>> #include <netinet/in.h> > >>>>>>> #include "openvswitch/hmap.h" > >>>>>>> #include "sset.h" > >>>>>>> +#include "smap.h" > >>>>>>> > >>>>>>> struct hmap; > >>>>>>> struct ovsdb_idl_index; > >>>>>>> @@ -50,8 +51,9 @@ struct advertise_datapath_entry { > >>>>>>> struct hmap routes; > >>>>>>> > >>>>>>> /* The name of the port bindings locally bound for this > datapath and > >>>>>>> - * running route exchange logic. */ > >>>>>>> - struct sset bound_ports; > >>>>>>> + * running route exchange logic. > >>>>>>> + * The key is the port name and the value is the ifname if > set. */ > >>>>>>> + struct smap bound_ports; > >>>>>>> }; > >>>>>>> > >>>>>>> struct advertise_route_entry { > >>>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at > >>>>>>> index afe0ee902..f41a23115 100644 > >>>>>>> --- a/tests/system-ovn.at > >>>>>>> +++ b/tests/system-ovn.at > >>>>>>> @@ -15660,6 +15660,17 @@ check_row_count Learned_Route 1 > >>>>>>> lp=$(fetch_column port_binding _uuid logical_port=internet-phys) > >>>>>>> check_row_count Learned_Route 1 logical_port=$lp ip_prefix= > 233.252.0.0/24 nexthop=192.168.10.10 > >>>>>>> > >>>>>>> +# By setting a learning interface filter we will now forget about > this route. > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > >>>>>>> + options:dynamic-routing-ifname=thisdoesnotexist > >>>>>>> +check_row_count Learned_Route 0 > >>>>>>> + > >>>>>>> +# Changing it to "lo" will allow us to learn the route again. > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > >>>>>>> + options:dynamic-routing-ifname=lo > >>>>>>> +check_row_count Learned_Route 1 > >>>>>>> + > >>>>>>> + > >>>>>>> OVS_APP_EXIT_AND_WAIT([ovn-controller]) > >>>>>>> > >>>>>>> as ovn-sb > >>>>>>> @@ -15869,10 +15880,20 @@ check ip route add 233.252.0.0/24 via > 192.168.10.10 dev lo onlink vrf ovnvrf1337 > >>>>>>> # For now we trigger a recompute as route watching is not yet > implemented. > >>>>>>> check ovn-appctl -t ovn-controller inc-engine/recompute > >>>>>>> check ovn-nbctl --wait=hv sync > >>>>>>> -check_row_count Learned_Route 2 > >>>>>>> +check_row_count Learned_Route 1 > >>>>>>> lp=$(fetch_column port_binding _uuid logical_port=internet-phys) > >>>>>>> check_row_count Learned_Route 1 logical_port=$lp ip_prefix= > 233.252.0.0/24 nexthop=192.168.10.10 > >>>>>>> > >>>>>>> +# By setting a learning interface filter we will now forget about > this route. > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > >>>>>>> + options:dynamic-routing-ifname=thisdoesnotexist > >>>>>>> +check_row_count Learned_Route 0 > >>>>>>> + > >>>>>>> +# Changing it to "lo" will allow us to learn the route again. > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \ > >>>>>>> + options:dynamic-routing-ifname=lo > >>>>>>> +check_row_count Learned_Route 1 > >>>>>>> + > >>>>>>> as ovn-sb > >>>>>>> OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > >>>>>>> > >>>>>>> -- > >>>>>>> 2.47.1 > >>>>>>> > >>>>>>> > >>>>>>> _______________________________________________ > >>>>>>> dev mailing list > >>>>>>> [email protected] > >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>>>> > >>>>> > >>> > >> > > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
