On 2/6/25 11:38 AM, Frode Nordahl wrote:
> On Thu, Feb 6, 2025 at 11:22 AM Felix Huettner
> <[email protected]> wrote:
>>
>> On Thu, Feb 06, 2025 at 10:51:29AM +0100, Dumitru Ceara wrote:
>>> On 2/6/25 10:31 AM, Frode Nordahl wrote:
>>>> On Tue, Feb 4, 2025 at 3:04 PM Felix Huettner via dev
>>>> <[email protected]> wrote:
>>>>>
>>>>> Previously we just assumed that if a LR had multiple LRPs bound to the
>>>>> local chassis that all routes where valid for all of these LRPs.
>>>>> This commit handles a previous option in ovn-nb that allowes the user to
>>>>> specify the interface name that a route needs to use to be acceptable
>>>>> for a given LRP.
>>>>> The users can then have a 1:1 relationship between LRPs and interfaces
>>>>> to allow ovn-controller to determine which route belongs to which LRP.
>>>>>
>>>>> This can e.g. be used if a chassis has multiple uplinks. Each of these
>>>>> uplinks could be assigned to a different LRP of the same LR connected to
>>>>> the same localnet.
>>>>> The option allows then to map e.g. a learned default route to each
>>>>> individual LRP. If one of the links would go down just one of the
>>>>> learned routes would be removed, causing ECMP across multiple chassis to
>>>>> be correctly weighted.
>>>>>
>>>>> Acked-by: Dumitru Ceara <[email protected]>
>>>>> Signed-off-by: Felix Huettner <[email protected]>
>>>>> ---
>>
>> Hi Frode, Hi Dumitru,
>>
>>>>
>>>> Thank you for the patch, this is indeed an important use case, I
>>>> wonder a bit about the approach to it though.
>>>>
>>>> Most constructs and options currently in OVN refer to logical
>>>> constructs that reside either in OVN or in OVS. Your proposal
>>>> introduces an option that refers to system details, which may or may
>>>> not be the same across chassis.
>>>>
>>>> Putting my feet into the shoes of a CMS developer, how would I know
>>>> how to map between these two worlds using only OVN APIs?
>>>>
>>>> I assume that you are using this feature together with the LRP
>>>> options:routing-protocol-redirect feature, which takes the name of an
>>>> LSP.
>>>>
>>>> Could we instead either use that value, or have this new option take a
>>>> LSP name which you then can map to a system interface on each chassis
>>>> as needed in the ovn-controller code?
>>>>
>>>
>>> Ah, good point, you're right Frode. dynamic-routing-ifname is the name
>>> of a system interface and there's no guarantee that will be the same on
>>> all chassis.  Using a LSP name instead might be the best way forward
>>> indeed.  We could use
>>> local_binding_is_up(LRP.options.dynamic-routing-ifname) to determine if
>>> that associated LSP is bound locally or not.
>>
>> I don't think we can directly map from an lsp name to the interface
>> name.
>> Allthough the following argumentation comes from the time where
>> the patchseries had netns support i think this would still be valid.
>> In the netns usecase i would assume that the routing-protocol-redirect
>> LSP is locally bound to a veth pair. One part of the pair is plugged
>> into OVS and mapped to the LSP.
>> The other end of the pair would be in a netns. This would also be the
>> interface that would actually learn the routes.
>>
>> So the interface that is mapped to the LSP would be different from the
>> one we learn routes on.
>> In that case i assumed that it is easier for users to ensure that the
>> interface name across chassis is identical based on a value they are
>> free to choose.
> 
> Dealing with the many ways a VIF can be attached in a system is indeed
> non-trivial. Thank you for highlighting the veth pair case, I think
> that is indeed the most common one. It is useful also without
> namespaces, as it would allow OVS to be supervisor of one side and the
> VRF to be supervisor of the peer.
> 
>> What might be an option is to use the LSP path you describe above with
>> an optional overwrite on the local ovsdb.
>>
>> What do you think of that?
> 
> For the veth pair case, from cursory view the peer ifindex is visible
> in `ethtool -S` for both sides, even across namespaces.
> 
> However, we don't know how people will actually wire this, so perhaps
> your suggestion of adding some option in the local ovsdb is the best
> course of action.
> 
> external_id in the Open vSwitch Interfaces table?
> 

I'm trying to figure out how trivial incremental processing would be in
ovn-controller for new OVS.Interface.external_ids.

Felix, would route_runtime_data_handler() be called if such a new
external-id changes value?

If it would then I assume it would return "false" (we don't yet support
I-P for that case) and trigger a route reconciliation.  So that'd be
fine.  Or am I missing something?

Thanks,
Dumitru

> --
> Frode Nordahl
> 
>> Thanks a lot,
>> Felix
>>
>>>
>>> Regards,
>>> Dumitru
>>>
>>>> --
>>>> Frode Nordahl
>>>>
>>>>> v5->v6:
>>>>>   * addressed review comments
>>>>> v3->v4:
>>>>>   - addressed review comments.
>>>>>   - updated commit message to be more descriptive.
>>>>>
>>>>>  controller/route-exchange-netlink.c |  2 ++
>>>>>  controller/route-exchange-netlink.h |  3 +++
>>>>>  controller/route-exchange.c         | 20 ++++++++++++++------
>>>>>  controller/route.c                  | 12 ++++++++----
>>>>>  controller/route.h                  |  6 ++++--
>>>>>  tests/system-ovn.at                 | 23 ++++++++++++++++++++++-
>>>>>  6 files changed, 53 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/controller/route-exchange-netlink.c 
>>>>> b/controller/route-exchange-netlink.c
>>>>> index 0969f15a1..82b87ae4e 100644
>>>>> --- a/controller/route-exchange-netlink.c
>>>>> +++ b/controller/route-exchange-netlink.c
>>>>> @@ -230,6 +230,8 @@ handle_route_msg(const struct route_table_msg *msg, 
>>>>> void *data)
>>>>>              rr->prefix = rd->rta_dst;
>>>>>              rr->plen = rd->rtm_dst_len;
>>>>>              rr->nexthop = nexthop->addr;
>>>>> +            memcpy(rr->ifname, nexthop->ifname, IFNAMSIZ);
>>>>> +            rr->ifname[IFNAMSIZ] = 0;
>>>>>              ovs_list_push_back(handle_data->learned_routes, 
>>>>> &rr->list_node);
>>>>>          }
>>>>>          return;
>>>>> diff --git a/controller/route-exchange-netlink.h 
>>>>> b/controller/route-exchange-netlink.h
>>>>> index 9ea2fb8cb..dee651971 100644
>>>>> --- a/controller/route-exchange-netlink.h
>>>>> +++ b/controller/route-exchange-netlink.h
>>>>> @@ -21,6 +21,7 @@
>>>>>  #include <stdint.h>
>>>>>  #include "openvswitch/list.h"
>>>>>  #include <netinet/in.h>
>>>>> +#include <net/if.h>
>>>>>
>>>>>  /* This value is arbitrary but currently unused.
>>>>>   * See 
>>>>> https://github.com/iproute2/iproute2/blob/main/etc/iproute2/rt_protos */
>>>>> @@ -35,6 +36,8 @@ struct re_nl_received_route_node {
>>>>>      struct in6_addr prefix;
>>>>>      unsigned int plen;
>>>>>      struct in6_addr nexthop;
>>>>> +    /* Adding 1 to this to be sure we actually have a terminating '\0' */
>>>>> +    char ifname[IFNAMSIZ + 1];
>>>>>  };
>>>>>
>>>>>  int re_nl_create_vrf(const char *ifname, uint32_t table_id);
>>>>> diff --git a/controller/route-exchange.c b/controller/route-exchange.c
>>>>> index 850316465..a1976234b 100644
>>>>> --- a/controller/route-exchange.c
>>>>> +++ b/controller/route-exchange.c
>>>>> @@ -95,7 +95,7 @@ route_lookup(struct hmap *route_map,
>>>>>  static void
>>>>>  sb_sync_learned_routes(const struct ovs_list *learned_routes,
>>>>>                         const struct sbrec_datapath_binding *datapath,
>>>>> -                       const struct sset *bound_ports,
>>>>> +                       const struct smap *bound_ports,
>>>>>                         struct ovsdb_idl_txn *ovnsb_idl_txn,
>>>>>                         struct ovsdb_idl_index 
>>>>> *sbrec_port_binding_by_name,
>>>>>                         struct ovsdb_idl_index 
>>>>> *sbrec_learned_route_by_datapath)
>>>>> @@ -110,8 +110,9 @@ sb_sync_learned_routes(const struct ovs_list 
>>>>> *learned_routes,
>>>>>      SBREC_LEARNED_ROUTE_FOR_EACH_EQUAL (sb_route, filter,
>>>>>                                          sbrec_learned_route_by_datapath) 
>>>>> {
>>>>>          /* If the port is not local we don't care about it.
>>>>> -         * Some other ovn-controller will handle it. */
>>>>> -        if (!sset_contains(bound_ports,
>>>>> +         * Some other ovn-controller will handle it.
>>>>> +         * We may not use smap_get since the value might be validly 
>>>>> NULL. */
>>>>> +        if (!smap_get_node(bound_ports,
>>>>>                             sb_route->logical_port->logical_port)) {
>>>>>              continue;
>>>>>          }
>>>>> @@ -125,11 +126,18 @@ sb_sync_learned_routes(const struct ovs_list 
>>>>> *learned_routes,
>>>>>                                                 learned_route->plen);
>>>>>          char *nexthop = normalize_v46(&learned_route->nexthop);
>>>>>
>>>>> -        const char *logical_port_name;
>>>>> -        SSET_FOR_EACH (logical_port_name, bound_ports) {
>>>>> +        struct smap_node *port_node;
>>>>> +        SMAP_FOR_EACH (port_node, bound_ports) {
>>>>> +            /* The user specified an ifname, but we learned it on a 
>>>>> different
>>>>> +             * port. */
>>>>> +            if (port_node->value && strcmp(port_node->value,
>>>>> +                                           learned_route->ifname)) {
>>>>> +                continue;
>>>>> +            }
>>>>> +
>>>>>              const struct sbrec_port_binding *logical_port =
>>>>>                  lport_lookup_by_name(sbrec_port_binding_by_name,
>>>>> -                                     logical_port_name);
>>>>> +                                     port_node->key);
>>>>>              if (!logical_port) {
>>>>>                  continue;
>>>>>              }
>>>>> diff --git a/controller/route.c b/controller/route.c
>>>>> index 108435bad..7b8eed289 100644
>>>>> --- a/controller/route.c
>>>>> +++ b/controller/route.c
>>>>> @@ -82,7 +82,7 @@ advertise_datapath_cleanup(struct 
>>>>> advertise_datapath_entry *ad)
>>>>>          free(ar);
>>>>>      }
>>>>>      hmap_destroy(&ad->routes);
>>>>> -    sset_destroy(&ad->bound_ports);
>>>>> +    smap_destroy(&ad->bound_ports);
>>>>>      free(ad);
>>>>>  }
>>>>>
>>>>> @@ -114,7 +114,7 @@ route_run(struct route_ctx_in *r_ctx_in,
>>>>>          ad = xzalloc(sizeof(*ad));
>>>>>          ad->db = ld->datapath;
>>>>>          hmap_init(&ad->routes);
>>>>> -        sset_init(&ad->bound_ports);
>>>>> +        smap_init(&ad->bound_ports);
>>>>>
>>>>>          /* This is a LR datapath, find LRPs with route exchange options
>>>>>           * that are bound locally. */
>>>>> @@ -132,10 +132,14 @@ route_run(struct route_ctx_in *r_ctx_in,
>>>>>
>>>>>              ad->maintain_vrf |= smap_get_bool(
>>>>>                  &repb->options, "dynamic-routing-maintain-vrf", false);
>>>>> -            sset_add(&ad->bound_ports, local_peer->logical_port);
>>>>> +            char *ifname = nullable_xstrdup(
>>>>> +                                    smap_get(&repb->options,
>>>>> +                                             "dynamic-routing-ifname"));
>>>>> +            smap_add_nocopy(&ad->bound_ports,
>>>>> +                            xstrdup(local_peer->logical_port), ifname);
>>>>>          }
>>>>>
>>>>> -        if (sset_is_empty(&ad->bound_ports)) {
>>>>> +        if (smap_is_empty(&ad->bound_ports)) {
>>>>>              advertise_datapath_cleanup(ad);
>>>>>              continue;
>>>>>          }
>>>>> diff --git a/controller/route.h b/controller/route.h
>>>>> index 10b9434fc..c53134559 100644
>>>>> --- a/controller/route.h
>>>>> +++ b/controller/route.h
>>>>> @@ -22,6 +22,7 @@
>>>>>  #include <netinet/in.h>
>>>>>  #include "openvswitch/hmap.h"
>>>>>  #include "sset.h"
>>>>> +#include "smap.h"
>>>>>
>>>>>  struct hmap;
>>>>>  struct ovsdb_idl_index;
>>>>> @@ -50,8 +51,9 @@ struct advertise_datapath_entry {
>>>>>      struct hmap routes;
>>>>>
>>>>>      /* The name of the port bindings locally bound for this datapath and
>>>>> -     * running route exchange logic. */
>>>>> -    struct sset bound_ports;
>>>>> +     * running route exchange logic.
>>>>> +     * The key is the port name and the value is the ifname if set. */
>>>>> +    struct smap bound_ports;
>>>>>  };
>>>>>
>>>>>  struct advertise_route_entry {
>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>>> index afe0ee902..f41a23115 100644
>>>>> --- a/tests/system-ovn.at
>>>>> +++ b/tests/system-ovn.at
>>>>> @@ -15660,6 +15660,17 @@ check_row_count Learned_Route 1
>>>>>  lp=$(fetch_column port_binding _uuid logical_port=internet-phys)
>>>>>  check_row_count Learned_Route 1 logical_port=$lp 
>>>>> ip_prefix=233.252.0.0/24 nexthop=192.168.10.10
>>>>>
>>>>> +# By setting a learning interface filter we will now forget about this 
>>>>> route.
>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>> +      options:dynamic-routing-ifname=thisdoesnotexist
>>>>> +check_row_count Learned_Route 0
>>>>> +
>>>>> +# Changing it to "lo" will allow us to learn the route again.
>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>> +      options:dynamic-routing-ifname=lo
>>>>> +check_row_count Learned_Route 1
>>>>> +
>>>>> +
>>>>>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>>
>>>>>  as ovn-sb
>>>>> @@ -15869,10 +15880,20 @@ check ip route add 233.252.0.0/24 via 
>>>>> 192.168.10.10 dev lo onlink vrf ovnvrf1337
>>>>>  # For now we trigger a recompute as route watching is not yet 
>>>>> implemented.
>>>>>  check ovn-appctl -t ovn-controller inc-engine/recompute
>>>>>  check ovn-nbctl --wait=hv sync
>>>>> -check_row_count Learned_Route 2
>>>>> +check_row_count Learned_Route 1
>>>>>  lp=$(fetch_column port_binding _uuid logical_port=internet-phys)
>>>>>  check_row_count Learned_Route 1 logical_port=$lp 
>>>>> ip_prefix=233.252.0.0/24 nexthop=192.168.10.10
>>>>>
>>>>> +# By setting a learning interface filter we will now forget about this 
>>>>> route.
>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>> +      options:dynamic-routing-ifname=thisdoesnotexist
>>>>> +check_row_count Learned_Route 0
>>>>> +
>>>>> +# Changing it to "lo" will allow us to learn the route again.
>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>> +      options:dynamic-routing-ifname=lo
>>>>> +check_row_count Learned_Route 1
>>>>> +
>>>>>  as ovn-sb
>>>>>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>>
>>>>> --
>>>>> 2.47.1
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> [email protected]
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>
>>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to