On 2/6/25 1:08 PM, Felix Huettner wrote:
> On Thu, Feb 06, 2025 at 11:48:45AM +0100, Dumitru Ceara wrote:
>> On 2/6/25 11:38 AM, Frode Nordahl wrote:
>>> On Thu, Feb 6, 2025 at 11:22 AM Felix Huettner
>>> <[email protected]> wrote:
>>>>
>>>> On Thu, Feb 06, 2025 at 10:51:29AM +0100, Dumitru Ceara wrote:
>>>>> On 2/6/25 10:31 AM, Frode Nordahl wrote:
>>>>>> On Tue, Feb 4, 2025 at 3:04 PM Felix Huettner via dev
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Previously we just assumed that if a LR had multiple LRPs bound to the
>>>>>>> local chassis that all routes where valid for all of these LRPs.
>>>>>>> This commit handles a previous option in ovn-nb that allowes the user to
>>>>>>> specify the interface name that a route needs to use to be acceptable
>>>>>>> for a given LRP.
>>>>>>> The users can then have a 1:1 relationship between LRPs and interfaces
>>>>>>> to allow ovn-controller to determine which route belongs to which LRP.
>>>>>>>
>>>>>>> This can e.g. be used if a chassis has multiple uplinks. Each of these
>>>>>>> uplinks could be assigned to a different LRP of the same LR connected to
>>>>>>> the same localnet.
>>>>>>> The option allows then to map e.g. a learned default route to each
>>>>>>> individual LRP. If one of the links would go down just one of the
>>>>>>> learned routes would be removed, causing ECMP across multiple chassis to
>>>>>>> be correctly weighted.
>>>>>>>
>>>>>>> Acked-by: Dumitru Ceara <[email protected]>
>>>>>>> Signed-off-by: Felix Huettner <[email protected]>
>>>>>>> ---
>>>>
>>>> Hi Frode, Hi Dumitru,
>>>>
>>>>>>
>>>>>> Thank you for the patch, this is indeed an important use case, I
>>>>>> wonder a bit about the approach to it though.
>>>>>>
>>>>>> Most constructs and options currently in OVN refer to logical
>>>>>> constructs that reside either in OVN or in OVS. Your proposal
>>>>>> introduces an option that refers to system details, which may or may
>>>>>> not be the same across chassis.
>>>>>>
>>>>>> Putting my feet into the shoes of a CMS developer, how would I know
>>>>>> how to map between these two worlds using only OVN APIs?
>>>>>>
>>>>>> I assume that you are using this feature together with the LRP
>>>>>> options:routing-protocol-redirect feature, which takes the name of an
>>>>>> LSP.
>>>>>>
>>>>>> Could we instead either use that value, or have this new option take a
>>>>>> LSP name which you then can map to a system interface on each chassis
>>>>>> as needed in the ovn-controller code?
>>>>>>
>>>>>
>>>>> Ah, good point, you're right Frode. dynamic-routing-ifname is the name
>>>>> of a system interface and there's no guarantee that will be the same on
>>>>> all chassis.  Using a LSP name instead might be the best way forward
>>>>> indeed.  We could use
>>>>> local_binding_is_up(LRP.options.dynamic-routing-ifname) to determine if
>>>>> that associated LSP is bound locally or not.
>>>>
>>>> I don't think we can directly map from an lsp name to the interface
>>>> name.
>>>> Allthough the following argumentation comes from the time where
>>>> the patchseries had netns support i think this would still be valid.
>>>> In the netns usecase i would assume that the routing-protocol-redirect
>>>> LSP is locally bound to a veth pair. One part of the pair is plugged
>>>> into OVS and mapped to the LSP.
>>>> The other end of the pair would be in a netns. This would also be the
>>>> interface that would actually learn the routes.
>>>>
>>>> So the interface that is mapped to the LSP would be different from the
>>>> one we learn routes on.
>>>> In that case i assumed that it is easier for users to ensure that the
>>>> interface name across chassis is identical based on a value they are
>>>> free to choose.
> 
> Hi Frode, Hi Dumitru,
> 
>>>
>>> Dealing with the many ways a VIF can be attached in a system is indeed
>>> non-trivial. Thank you for highlighting the veth pair case, I think
>>> that is indeed the most common one. It is useful also without
>>> namespaces, as it would allow OVS to be supervisor of one side and the
>>> VRF to be supervisor of the peer.
>>>
>>>> What might be an option is to use the LSP path you describe above with
>>>> an optional overwrite on the local ovsdb.
>>>>
>>>> What do you think of that?
>>>
>>> For the veth pair case, from cursory view the peer ifindex is visible
>>> in `ethtool -S` for both sides, even across namespaces.
>>>
>>> However, we don't know how people will actually wire this, so perhaps
>>> your suggestion of adding some option in the local ovsdb is the best
>>> course of action.
>>>
>>> external_id in the Open vSwitch Interfaces table?
>>>
>>
>> I'm trying to figure out how trivial incremental processing would be in
>> ovn-controller for new OVS.Interface.external_ids.
>>
>> Felix, would route_runtime_data_handler() be called if such a new
>> external-id changes value?
>>
>> If it would then I assume it would return "false" (we don't yet support
>> I-P for that case) and trigger a route reconciliation.  So that'd be
>> fine.  Or am I missing something?
> 
> I would assume the same, but honestly i would only be sure once i tried
> it out.
> 
> Another thing that came to my mind: Going this way prevents the user
> from using an interface filter if there is no LSP mapping to the
> interface.
> I think that something like this could happen if you integrate routing
> with some custom agent. Especially if such a agent does not do in-band
> BGP but rather uses some outside logic to learn routes.
> 
> In such cases it might make more sense to use the external_id column in
> the OpenVswitch table.
> Per default we would lookup the interface based on the locally bound
> LSP.
> However if we find that LSP name in the external_ids of the OpenVswitch
> table then we use the interface name in there, independent if that LSP
> is bound locally or even exists.
> 
> What are your opinions on that?
> 
> 
> I would try such an implementation in the next hour or so.
> If i don't get it to a nice state by then i would send out v7 with the
> current state of this patch.
> If needed this patch could be removed from the series and would probably
> not affect the others too much.
> 

This sounds OK to me, if Frode agrees.

> Thanks a lot,
> Felix
> 
> 
>>
>> Thanks,
>> Dumitru
>>
>>> --
>>> Frode Nordahl
>>>
>>>> Thanks a lot,
>>>> Felix
>>>>
>>>>>
>>>>> Regards,
>>>>> Dumitru
>>>>>
>>>>>> --
>>>>>> Frode Nordahl
>>>>>>
>>>>>>> v5->v6:
>>>>>>>   * addressed review comments
>>>>>>> v3->v4:
>>>>>>>   - addressed review comments.
>>>>>>>   - updated commit message to be more descriptive.
>>>>>>>
>>>>>>>  controller/route-exchange-netlink.c |  2 ++
>>>>>>>  controller/route-exchange-netlink.h |  3 +++
>>>>>>>  controller/route-exchange.c         | 20 ++++++++++++++------
>>>>>>>  controller/route.c                  | 12 ++++++++----
>>>>>>>  controller/route.h                  |  6 ++++--
>>>>>>>  tests/system-ovn.at                 | 23 ++++++++++++++++++++++-
>>>>>>>  6 files changed, 53 insertions(+), 13 deletions(-)
>>>>>>>
>>>>>>> diff --git a/controller/route-exchange-netlink.c 
>>>>>>> b/controller/route-exchange-netlink.c
>>>>>>> index 0969f15a1..82b87ae4e 100644
>>>>>>> --- a/controller/route-exchange-netlink.c
>>>>>>> +++ b/controller/route-exchange-netlink.c
>>>>>>> @@ -230,6 +230,8 @@ handle_route_msg(const struct route_table_msg *msg, 
>>>>>>> void *data)
>>>>>>>              rr->prefix = rd->rta_dst;
>>>>>>>              rr->plen = rd->rtm_dst_len;
>>>>>>>              rr->nexthop = nexthop->addr;
>>>>>>> +            memcpy(rr->ifname, nexthop->ifname, IFNAMSIZ);
>>>>>>> +            rr->ifname[IFNAMSIZ] = 0;
>>>>>>>              ovs_list_push_back(handle_data->learned_routes, 
>>>>>>> &rr->list_node);
>>>>>>>          }
>>>>>>>          return;
>>>>>>> diff --git a/controller/route-exchange-netlink.h 
>>>>>>> b/controller/route-exchange-netlink.h
>>>>>>> index 9ea2fb8cb..dee651971 100644
>>>>>>> --- a/controller/route-exchange-netlink.h
>>>>>>> +++ b/controller/route-exchange-netlink.h
>>>>>>> @@ -21,6 +21,7 @@
>>>>>>>  #include <stdint.h>
>>>>>>>  #include "openvswitch/list.h"
>>>>>>>  #include <netinet/in.h>
>>>>>>> +#include <net/if.h>
>>>>>>>
>>>>>>>  /* This value is arbitrary but currently unused.
>>>>>>>   * See 
>>>>>>> https://github.com/iproute2/iproute2/blob/main/etc/iproute2/rt_protos */
>>>>>>> @@ -35,6 +36,8 @@ struct re_nl_received_route_node {
>>>>>>>      struct in6_addr prefix;
>>>>>>>      unsigned int plen;
>>>>>>>      struct in6_addr nexthop;
>>>>>>> +    /* Adding 1 to this to be sure we actually have a terminating '\0' 
>>>>>>> */
>>>>>>> +    char ifname[IFNAMSIZ + 1];
>>>>>>>  };
>>>>>>>
>>>>>>>  int re_nl_create_vrf(const char *ifname, uint32_t table_id);
>>>>>>> diff --git a/controller/route-exchange.c b/controller/route-exchange.c
>>>>>>> index 850316465..a1976234b 100644
>>>>>>> --- a/controller/route-exchange.c
>>>>>>> +++ b/controller/route-exchange.c
>>>>>>> @@ -95,7 +95,7 @@ route_lookup(struct hmap *route_map,
>>>>>>>  static void
>>>>>>>  sb_sync_learned_routes(const struct ovs_list *learned_routes,
>>>>>>>                         const struct sbrec_datapath_binding *datapath,
>>>>>>> -                       const struct sset *bound_ports,
>>>>>>> +                       const struct smap *bound_ports,
>>>>>>>                         struct ovsdb_idl_txn *ovnsb_idl_txn,
>>>>>>>                         struct ovsdb_idl_index 
>>>>>>> *sbrec_port_binding_by_name,
>>>>>>>                         struct ovsdb_idl_index 
>>>>>>> *sbrec_learned_route_by_datapath)
>>>>>>> @@ -110,8 +110,9 @@ sb_sync_learned_routes(const struct ovs_list 
>>>>>>> *learned_routes,
>>>>>>>      SBREC_LEARNED_ROUTE_FOR_EACH_EQUAL (sb_route, filter,
>>>>>>>                                          
>>>>>>> sbrec_learned_route_by_datapath) {
>>>>>>>          /* If the port is not local we don't care about it.
>>>>>>> -         * Some other ovn-controller will handle it. */
>>>>>>> -        if (!sset_contains(bound_ports,
>>>>>>> +         * Some other ovn-controller will handle it.
>>>>>>> +         * We may not use smap_get since the value might be validly 
>>>>>>> NULL. */
>>>>>>> +        if (!smap_get_node(bound_ports,
>>>>>>>                             sb_route->logical_port->logical_port)) {
>>>>>>>              continue;
>>>>>>>          }
>>>>>>> @@ -125,11 +126,18 @@ sb_sync_learned_routes(const struct ovs_list 
>>>>>>> *learned_routes,
>>>>>>>                                                 learned_route->plen);
>>>>>>>          char *nexthop = normalize_v46(&learned_route->nexthop);
>>>>>>>
>>>>>>> -        const char *logical_port_name;
>>>>>>> -        SSET_FOR_EACH (logical_port_name, bound_ports) {
>>>>>>> +        struct smap_node *port_node;
>>>>>>> +        SMAP_FOR_EACH (port_node, bound_ports) {
>>>>>>> +            /* The user specified an ifname, but we learned it on a 
>>>>>>> different
>>>>>>> +             * port. */
>>>>>>> +            if (port_node->value && strcmp(port_node->value,
>>>>>>> +                                           learned_route->ifname)) {
>>>>>>> +                continue;
>>>>>>> +            }
>>>>>>> +
>>>>>>>              const struct sbrec_port_binding *logical_port =
>>>>>>>                  lport_lookup_by_name(sbrec_port_binding_by_name,
>>>>>>> -                                     logical_port_name);
>>>>>>> +                                     port_node->key);
>>>>>>>              if (!logical_port) {
>>>>>>>                  continue;
>>>>>>>              }
>>>>>>> diff --git a/controller/route.c b/controller/route.c
>>>>>>> index 108435bad..7b8eed289 100644
>>>>>>> --- a/controller/route.c
>>>>>>> +++ b/controller/route.c
>>>>>>> @@ -82,7 +82,7 @@ advertise_datapath_cleanup(struct 
>>>>>>> advertise_datapath_entry *ad)
>>>>>>>          free(ar);
>>>>>>>      }
>>>>>>>      hmap_destroy(&ad->routes);
>>>>>>> -    sset_destroy(&ad->bound_ports);
>>>>>>> +    smap_destroy(&ad->bound_ports);
>>>>>>>      free(ad);
>>>>>>>  }
>>>>>>>
>>>>>>> @@ -114,7 +114,7 @@ route_run(struct route_ctx_in *r_ctx_in,
>>>>>>>          ad = xzalloc(sizeof(*ad));
>>>>>>>          ad->db = ld->datapath;
>>>>>>>          hmap_init(&ad->routes);
>>>>>>> -        sset_init(&ad->bound_ports);
>>>>>>> +        smap_init(&ad->bound_ports);
>>>>>>>
>>>>>>>          /* This is a LR datapath, find LRPs with route exchange options
>>>>>>>           * that are bound locally. */
>>>>>>> @@ -132,10 +132,14 @@ route_run(struct route_ctx_in *r_ctx_in,
>>>>>>>
>>>>>>>              ad->maintain_vrf |= smap_get_bool(
>>>>>>>                  &repb->options, "dynamic-routing-maintain-vrf", false);
>>>>>>> -            sset_add(&ad->bound_ports, local_peer->logical_port);
>>>>>>> +            char *ifname = nullable_xstrdup(
>>>>>>> +                                    smap_get(&repb->options,
>>>>>>> +                                             
>>>>>>> "dynamic-routing-ifname"));
>>>>>>> +            smap_add_nocopy(&ad->bound_ports,
>>>>>>> +                            xstrdup(local_peer->logical_port), ifname);
>>>>>>>          }
>>>>>>>
>>>>>>> -        if (sset_is_empty(&ad->bound_ports)) {
>>>>>>> +        if (smap_is_empty(&ad->bound_ports)) {
>>>>>>>              advertise_datapath_cleanup(ad);
>>>>>>>              continue;
>>>>>>>          }
>>>>>>> diff --git a/controller/route.h b/controller/route.h
>>>>>>> index 10b9434fc..c53134559 100644
>>>>>>> --- a/controller/route.h
>>>>>>> +++ b/controller/route.h
>>>>>>> @@ -22,6 +22,7 @@
>>>>>>>  #include <netinet/in.h>
>>>>>>>  #include "openvswitch/hmap.h"
>>>>>>>  #include "sset.h"
>>>>>>> +#include "smap.h"
>>>>>>>
>>>>>>>  struct hmap;
>>>>>>>  struct ovsdb_idl_index;
>>>>>>> @@ -50,8 +51,9 @@ struct advertise_datapath_entry {
>>>>>>>      struct hmap routes;
>>>>>>>
>>>>>>>      /* The name of the port bindings locally bound for this datapath 
>>>>>>> and
>>>>>>> -     * running route exchange logic. */
>>>>>>> -    struct sset bound_ports;
>>>>>>> +     * running route exchange logic.
>>>>>>> +     * The key is the port name and the value is the ifname if set. */
>>>>>>> +    struct smap bound_ports;
>>>>>>>  };
>>>>>>>
>>>>>>>  struct advertise_route_entry {
>>>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>>>>> index afe0ee902..f41a23115 100644
>>>>>>> --- a/tests/system-ovn.at
>>>>>>> +++ b/tests/system-ovn.at
>>>>>>> @@ -15660,6 +15660,17 @@ check_row_count Learned_Route 1
>>>>>>>  lp=$(fetch_column port_binding _uuid logical_port=internet-phys)
>>>>>>>  check_row_count Learned_Route 1 logical_port=$lp 
>>>>>>> ip_prefix=233.252.0.0/24 nexthop=192.168.10.10
>>>>>>>
>>>>>>> +# By setting a learning interface filter we will now forget about this 
>>>>>>> route.
>>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>>>> +      options:dynamic-routing-ifname=thisdoesnotexist
>>>>>>> +check_row_count Learned_Route 0
>>>>>>> +
>>>>>>> +# Changing it to "lo" will allow us to learn the route again.
>>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>>>> +      options:dynamic-routing-ifname=lo
>>>>>>> +check_row_count Learned_Route 1
>>>>>>> +
>>>>>>> +
>>>>>>>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>>>>
>>>>>>>  as ovn-sb
>>>>>>> @@ -15869,10 +15880,20 @@ check ip route add 233.252.0.0/24 via 
>>>>>>> 192.168.10.10 dev lo onlink vrf ovnvrf1337
>>>>>>>  # For now we trigger a recompute as route watching is not yet 
>>>>>>> implemented.
>>>>>>>  check ovn-appctl -t ovn-controller inc-engine/recompute
>>>>>>>  check ovn-nbctl --wait=hv sync
>>>>>>> -check_row_count Learned_Route 2
>>>>>>> +check_row_count Learned_Route 1
>>>>>>>  lp=$(fetch_column port_binding _uuid logical_port=internet-phys)
>>>>>>>  check_row_count Learned_Route 1 logical_port=$lp 
>>>>>>> ip_prefix=233.252.0.0/24 nexthop=192.168.10.10
>>>>>>>
>>>>>>> +# By setting a learning interface filter we will now forget about this 
>>>>>>> route.
>>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>>>> +      options:dynamic-routing-ifname=thisdoesnotexist
>>>>>>> +check_row_count Learned_Route 0
>>>>>>> +
>>>>>>> +# Changing it to "lo" will allow us to learn the route again.
>>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
>>>>>>> +      options:dynamic-routing-ifname=lo
>>>>>>> +check_row_count Learned_Route 1
>>>>>>> +
>>>>>>>  as ovn-sb
>>>>>>>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>>>>
>>>>>>> --
>>>>>>> 2.47.1
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> dev mailing list
>>>>>>> [email protected]
>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>
>>>>>
>>>
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to