On Thu, Feb 06, 2025 at 01:27:37PM +0100, Frode Nordahl wrote:
> tor. 6. feb. 2025, 13:22 skrev Dumitru Ceara <[email protected]>:
> 
> > On 2/6/25 1:08 PM, Felix Huettner wrote:
> > > On Thu, Feb 06, 2025 at 11:48:45AM +0100, Dumitru Ceara wrote:
> > >> On 2/6/25 11:38 AM, Frode Nordahl wrote:
> > >>> On Thu, Feb 6, 2025 at 11:22 AM Felix Huettner
> > >>> <[email protected]> wrote:
> > >>>>
> > >>>> On Thu, Feb 06, 2025 at 10:51:29AM +0100, Dumitru Ceara wrote:
> > >>>>> On 2/6/25 10:31 AM, Frode Nordahl wrote:
> > >>>>>> On Tue, Feb 4, 2025 at 3:04 PM Felix Huettner via dev
> > >>>>>> <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>> Previously we just assumed that if a LR had multiple LRPs bound to
> > the
> > >>>>>>> local chassis that all routes where valid for all of these LRPs.
> > >>>>>>> This commit handles a previous option in ovn-nb that allowes the
> > user to
> > >>>>>>> specify the interface name that a route needs to use to be
> > acceptable
> > >>>>>>> for a given LRP.
> > >>>>>>> The users can then have a 1:1 relationship between LRPs and
> > interfaces
> > >>>>>>> to allow ovn-controller to determine which route belongs to which
> > LRP.
> > >>>>>>>
> > >>>>>>> This can e.g. be used if a chassis has multiple uplinks. Each of
> > these
> > >>>>>>> uplinks could be assigned to a different LRP of the same LR
> > connected to
> > >>>>>>> the same localnet.
> > >>>>>>> The option allows then to map e.g. a learned default route to each
> > >>>>>>> individual LRP. If one of the links would go down just one of the
> > >>>>>>> learned routes would be removed, causing ECMP across multiple
> > chassis to
> > >>>>>>> be correctly weighted.
> > >>>>>>>
> > >>>>>>> Acked-by: Dumitru Ceara <[email protected]>
> > >>>>>>> Signed-off-by: Felix Huettner <[email protected]>
> > >>>>>>> ---
> > >>>>
> > >>>> Hi Frode, Hi Dumitru,
> > >>>>
> > >>>>>>
> > >>>>>> Thank you for the patch, this is indeed an important use case, I
> > >>>>>> wonder a bit about the approach to it though.
> > >>>>>>
> > >>>>>> Most constructs and options currently in OVN refer to logical
> > >>>>>> constructs that reside either in OVN or in OVS. Your proposal
> > >>>>>> introduces an option that refers to system details, which may or may
> > >>>>>> not be the same across chassis.
> > >>>>>>
> > >>>>>> Putting my feet into the shoes of a CMS developer, how would I know
> > >>>>>> how to map between these two worlds using only OVN APIs?
> > >>>>>>
> > >>>>>> I assume that you are using this feature together with the LRP
> > >>>>>> options:routing-protocol-redirect feature, which takes the name of
> > an
> > >>>>>> LSP.
> > >>>>>>
> > >>>>>> Could we instead either use that value, or have this new option
> > take a
> > >>>>>> LSP name which you then can map to a system interface on each
> > chassis
> > >>>>>> as needed in the ovn-controller code?
> > >>>>>>
> > >>>>>
> > >>>>> Ah, good point, you're right Frode. dynamic-routing-ifname is the
> > name
> > >>>>> of a system interface and there's no guarantee that will be the same
> > on
> > >>>>> all chassis.  Using a LSP name instead might be the best way forward
> > >>>>> indeed.  We could use
> > >>>>> local_binding_is_up(LRP.options.dynamic-routing-ifname) to determine
> > if
> > >>>>> that associated LSP is bound locally or not.
> > >>>>
> > >>>> I don't think we can directly map from an lsp name to the interface
> > >>>> name.
> > >>>> Allthough the following argumentation comes from the time where
> > >>>> the patchseries had netns support i think this would still be valid.
> > >>>> In the netns usecase i would assume that the routing-protocol-redirect
> > >>>> LSP is locally bound to a veth pair. One part of the pair is plugged
> > >>>> into OVS and mapped to the LSP.
> > >>>> The other end of the pair would be in a netns. This would also be the
> > >>>> interface that would actually learn the routes.
> > >>>>
> > >>>> So the interface that is mapped to the LSP would be different from the
> > >>>> one we learn routes on.
> > >>>> In that case i assumed that it is easier for users to ensure that the
> > >>>> interface name across chassis is identical based on a value they are
> > >>>> free to choose.
> > >
> > > Hi Frode, Hi Dumitru,
> > >
> > >>>
> > >>> Dealing with the many ways a VIF can be attached in a system is indeed
> > >>> non-trivial. Thank you for highlighting the veth pair case, I think
> > >>> that is indeed the most common one. It is useful also without
> > >>> namespaces, as it would allow OVS to be supervisor of one side and the
> > >>> VRF to be supervisor of the peer.
> > >>>
> > >>>> What might be an option is to use the LSP path you describe above with
> > >>>> an optional overwrite on the local ovsdb.
> > >>>>
> > >>>> What do you think of that?
> > >>>
> > >>> For the veth pair case, from cursory view the peer ifindex is visible
> > >>> in `ethtool -S` for both sides, even across namespaces.
> > >>>
> > >>> However, we don't know how people will actually wire this, so perhaps
> > >>> your suggestion of adding some option in the local ovsdb is the best
> > >>> course of action.
> > >>>
> > >>> external_id in the Open vSwitch Interfaces table?
> > >>>
> > >>
> > >> I'm trying to figure out how trivial incremental processing would be in
> > >> ovn-controller for new OVS.Interface.external_ids.
> > >>
> > >> Felix, would route_runtime_data_handler() be called if such a new
> > >> external-id changes value?
> > >>
> > >> If it would then I assume it would return "false" (we don't yet support
> > >> I-P for that case) and trigger a route reconciliation.  So that'd be
> > >> fine.  Or am I missing something?
> > >
> > > I would assume the same, but honestly i would only be sure once i tried
> > > it out.
> > >
> > > Another thing that came to my mind: Going this way prevents the user
> > > from using an interface filter if there is no LSP mapping to the
> > > interface.
> > > I think that something like this could happen if you integrate routing
> > > with some custom agent. Especially if such a agent does not do in-band
> > > BGP but rather uses some outside logic to learn routes.
> > >
> > > In such cases it might make more sense to use the external_id column in
> > > the OpenVswitch table.
> > > Per default we would lookup the interface based on the locally bound
> > > LSP.
> > > However if we find that LSP name in the external_ids of the OpenVswitch
> > > table then we use the interface name in there, independent if that LSP
> > > is bound locally or even exists.
> > >
> > > What are your opinions on that?
> > >
> > >
> > > I would try such an implementation in the next hour or so.
> > > If i don't get it to a nice state by then i would send out v7 with the
> > > current state of this patch.
> > > If needed this patch could be removed from the series and would probably
> > > not affect the others too much.
> > >
> >
> > This sounds OK to me, if Frode agrees.
> >
> 
> Works for me.

Hi Dumitru, Hi Frode,

I think i have a working solution and i will send it out in v7.

It creates some larger changes in:
* northd: Sync routing data to pb.
* controller: Support learning routes per iface.
I will therefor remove the existing Acks from these commits.

Additionally some smaller changes are in:
* controller: Announce routes via route-exchange.
* controller: Watch for route changes
But these are just changes in the testcases to bring them in line with
the above changes.

I hope that changeset fits for everyone and that the quality is fine
even though it was built fast :)

Thanks a lot,
Felix

> 
> --
> Frode Nordahl
> 
> > Thanks a lot,
> > > Felix
> > >
> > >
> > >>
> > >> Thanks,
> > >> Dumitru
> > >>
> > >>> --
> > >>> Frode Nordahl
> > >>>
> > >>>> Thanks a lot,
> > >>>> Felix
> > >>>>
> > >>>>>
> > >>>>> Regards,
> > >>>>> Dumitru
> > >>>>>
> > >>>>>> --
> > >>>>>> Frode Nordahl
> > >>>>>>
> > >>>>>>> v5->v6:
> > >>>>>>>   * addressed review comments
> > >>>>>>> v3->v4:
> > >>>>>>>   - addressed review comments.
> > >>>>>>>   - updated commit message to be more descriptive.
> > >>>>>>>
> > >>>>>>>  controller/route-exchange-netlink.c |  2 ++
> > >>>>>>>  controller/route-exchange-netlink.h |  3 +++
> > >>>>>>>  controller/route-exchange.c         | 20 ++++++++++++++------
> > >>>>>>>  controller/route.c                  | 12 ++++++++----
> > >>>>>>>  controller/route.h                  |  6 ++++--
> > >>>>>>>  tests/system-ovn.at                 | 23 ++++++++++++++++++++++-
> > >>>>>>>  6 files changed, 53 insertions(+), 13 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/controller/route-exchange-netlink.c
> > b/controller/route-exchange-netlink.c
> > >>>>>>> index 0969f15a1..82b87ae4e 100644
> > >>>>>>> --- a/controller/route-exchange-netlink.c
> > >>>>>>> +++ b/controller/route-exchange-netlink.c
> > >>>>>>> @@ -230,6 +230,8 @@ handle_route_msg(const struct route_table_msg
> > *msg, void *data)
> > >>>>>>>              rr->prefix = rd->rta_dst;
> > >>>>>>>              rr->plen = rd->rtm_dst_len;
> > >>>>>>>              rr->nexthop = nexthop->addr;
> > >>>>>>> +            memcpy(rr->ifname, nexthop->ifname, IFNAMSIZ);
> > >>>>>>> +            rr->ifname[IFNAMSIZ] = 0;
> > >>>>>>>              ovs_list_push_back(handle_data->learned_routes,
> > &rr->list_node);
> > >>>>>>>          }
> > >>>>>>>          return;
> > >>>>>>> diff --git a/controller/route-exchange-netlink.h
> > b/controller/route-exchange-netlink.h
> > >>>>>>> index 9ea2fb8cb..dee651971 100644
> > >>>>>>> --- a/controller/route-exchange-netlink.h
> > >>>>>>> +++ b/controller/route-exchange-netlink.h
> > >>>>>>> @@ -21,6 +21,7 @@
> > >>>>>>>  #include <stdint.h>
> > >>>>>>>  #include "openvswitch/list.h"
> > >>>>>>>  #include <netinet/in.h>
> > >>>>>>> +#include <net/if.h>
> > >>>>>>>
> > >>>>>>>  /* This value is arbitrary but currently unused.
> > >>>>>>>   * See
> > https://github.com/iproute2/iproute2/blob/main/etc/iproute2/rt_protos */
> > >>>>>>> @@ -35,6 +36,8 @@ struct re_nl_received_route_node {
> > >>>>>>>      struct in6_addr prefix;
> > >>>>>>>      unsigned int plen;
> > >>>>>>>      struct in6_addr nexthop;
> > >>>>>>> +    /* Adding 1 to this to be sure we actually have a terminating
> > '\0' */
> > >>>>>>> +    char ifname[IFNAMSIZ + 1];
> > >>>>>>>  };
> > >>>>>>>
> > >>>>>>>  int re_nl_create_vrf(const char *ifname, uint32_t table_id);
> > >>>>>>> diff --git a/controller/route-exchange.c
> > b/controller/route-exchange.c
> > >>>>>>> index 850316465..a1976234b 100644
> > >>>>>>> --- a/controller/route-exchange.c
> > >>>>>>> +++ b/controller/route-exchange.c
> > >>>>>>> @@ -95,7 +95,7 @@ route_lookup(struct hmap *route_map,
> > >>>>>>>  static void
> > >>>>>>>  sb_sync_learned_routes(const struct ovs_list *learned_routes,
> > >>>>>>>                         const struct sbrec_datapath_binding
> > *datapath,
> > >>>>>>> -                       const struct sset *bound_ports,
> > >>>>>>> +                       const struct smap *bound_ports,
> > >>>>>>>                         struct ovsdb_idl_txn *ovnsb_idl_txn,
> > >>>>>>>                         struct ovsdb_idl_index
> > *sbrec_port_binding_by_name,
> > >>>>>>>                         struct ovsdb_idl_index
> > *sbrec_learned_route_by_datapath)
> > >>>>>>> @@ -110,8 +110,9 @@ sb_sync_learned_routes(const struct ovs_list
> > *learned_routes,
> > >>>>>>>      SBREC_LEARNED_ROUTE_FOR_EACH_EQUAL (sb_route, filter,
> > >>>>>>>
> > sbrec_learned_route_by_datapath) {
> > >>>>>>>          /* If the port is not local we don't care about it.
> > >>>>>>> -         * Some other ovn-controller will handle it. */
> > >>>>>>> -        if (!sset_contains(bound_ports,
> > >>>>>>> +         * Some other ovn-controller will handle it.
> > >>>>>>> +         * We may not use smap_get since the value might be
> > validly NULL. */
> > >>>>>>> +        if (!smap_get_node(bound_ports,
> > >>>>>>>                             sb_route->logical_port->logical_port))
> > {
> > >>>>>>>              continue;
> > >>>>>>>          }
> > >>>>>>> @@ -125,11 +126,18 @@ sb_sync_learned_routes(const struct ovs_list
> > *learned_routes,
> > >>>>>>>
> >  learned_route->plen);
> > >>>>>>>          char *nexthop = normalize_v46(&learned_route->nexthop);
> > >>>>>>>
> > >>>>>>> -        const char *logical_port_name;
> > >>>>>>> -        SSET_FOR_EACH (logical_port_name, bound_ports) {
> > >>>>>>> +        struct smap_node *port_node;
> > >>>>>>> +        SMAP_FOR_EACH (port_node, bound_ports) {
> > >>>>>>> +            /* The user specified an ifname, but we learned it on
> > a different
> > >>>>>>> +             * port. */
> > >>>>>>> +            if (port_node->value && strcmp(port_node->value,
> > >>>>>>> +
> >  learned_route->ifname)) {
> > >>>>>>> +                continue;
> > >>>>>>> +            }
> > >>>>>>> +
> > >>>>>>>              const struct sbrec_port_binding *logical_port =
> > >>>>>>>                  lport_lookup_by_name(sbrec_port_binding_by_name,
> > >>>>>>> -                                     logical_port_name);
> > >>>>>>> +                                     port_node->key);
> > >>>>>>>              if (!logical_port) {
> > >>>>>>>                  continue;
> > >>>>>>>              }
> > >>>>>>> diff --git a/controller/route.c b/controller/route.c
> > >>>>>>> index 108435bad..7b8eed289 100644
> > >>>>>>> --- a/controller/route.c
> > >>>>>>> +++ b/controller/route.c
> > >>>>>>> @@ -82,7 +82,7 @@ advertise_datapath_cleanup(struct
> > advertise_datapath_entry *ad)
> > >>>>>>>          free(ar);
> > >>>>>>>      }
> > >>>>>>>      hmap_destroy(&ad->routes);
> > >>>>>>> -    sset_destroy(&ad->bound_ports);
> > >>>>>>> +    smap_destroy(&ad->bound_ports);
> > >>>>>>>      free(ad);
> > >>>>>>>  }
> > >>>>>>>
> > >>>>>>> @@ -114,7 +114,7 @@ route_run(struct route_ctx_in *r_ctx_in,
> > >>>>>>>          ad = xzalloc(sizeof(*ad));
> > >>>>>>>          ad->db = ld->datapath;
> > >>>>>>>          hmap_init(&ad->routes);
> > >>>>>>> -        sset_init(&ad->bound_ports);
> > >>>>>>> +        smap_init(&ad->bound_ports);
> > >>>>>>>
> > >>>>>>>          /* This is a LR datapath, find LRPs with route exchange
> > options
> > >>>>>>>           * that are bound locally. */
> > >>>>>>> @@ -132,10 +132,14 @@ route_run(struct route_ctx_in *r_ctx_in,
> > >>>>>>>
> > >>>>>>>              ad->maintain_vrf |= smap_get_bool(
> > >>>>>>>                  &repb->options, "dynamic-routing-maintain-vrf",
> > false);
> > >>>>>>> -            sset_add(&ad->bound_ports, local_peer->logical_port);
> > >>>>>>> +            char *ifname = nullable_xstrdup(
> > >>>>>>> +                                    smap_get(&repb->options,
> > >>>>>>> +
> >  "dynamic-routing-ifname"));
> > >>>>>>> +            smap_add_nocopy(&ad->bound_ports,
> > >>>>>>> +                            xstrdup(local_peer->logical_port),
> > ifname);
> > >>>>>>>          }
> > >>>>>>>
> > >>>>>>> -        if (sset_is_empty(&ad->bound_ports)) {
> > >>>>>>> +        if (smap_is_empty(&ad->bound_ports)) {
> > >>>>>>>              advertise_datapath_cleanup(ad);
> > >>>>>>>              continue;
> > >>>>>>>          }
> > >>>>>>> diff --git a/controller/route.h b/controller/route.h
> > >>>>>>> index 10b9434fc..c53134559 100644
> > >>>>>>> --- a/controller/route.h
> > >>>>>>> +++ b/controller/route.h
> > >>>>>>> @@ -22,6 +22,7 @@
> > >>>>>>>  #include <netinet/in.h>
> > >>>>>>>  #include "openvswitch/hmap.h"
> > >>>>>>>  #include "sset.h"
> > >>>>>>> +#include "smap.h"
> > >>>>>>>
> > >>>>>>>  struct hmap;
> > >>>>>>>  struct ovsdb_idl_index;
> > >>>>>>> @@ -50,8 +51,9 @@ struct advertise_datapath_entry {
> > >>>>>>>      struct hmap routes;
> > >>>>>>>
> > >>>>>>>      /* The name of the port bindings locally bound for this
> > datapath and
> > >>>>>>> -     * running route exchange logic. */
> > >>>>>>> -    struct sset bound_ports;
> > >>>>>>> +     * running route exchange logic.
> > >>>>>>> +     * The key is the port name and the value is the ifname if
> > set. */
> > >>>>>>> +    struct smap bound_ports;
> > >>>>>>>  };
> > >>>>>>>
> > >>>>>>>  struct advertise_route_entry {
> > >>>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > >>>>>>> index afe0ee902..f41a23115 100644
> > >>>>>>> --- a/tests/system-ovn.at
> > >>>>>>> +++ b/tests/system-ovn.at
> > >>>>>>> @@ -15660,6 +15660,17 @@ check_row_count Learned_Route 1
> > >>>>>>>  lp=$(fetch_column port_binding _uuid logical_port=internet-phys)
> > >>>>>>>  check_row_count Learned_Route 1 logical_port=$lp ip_prefix=
> > 233.252.0.0/24 nexthop=192.168.10.10
> > >>>>>>>
> > >>>>>>> +# By setting a learning interface filter we will now forget about
> > this route.
> > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
> > >>>>>>> +      options:dynamic-routing-ifname=thisdoesnotexist
> > >>>>>>> +check_row_count Learned_Route 0
> > >>>>>>> +
> > >>>>>>> +# Changing it to "lo" will allow us to learn the route again.
> > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
> > >>>>>>> +      options:dynamic-routing-ifname=lo
> > >>>>>>> +check_row_count Learned_Route 1
> > >>>>>>> +
> > >>>>>>> +
> > >>>>>>>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
> > >>>>>>>
> > >>>>>>>  as ovn-sb
> > >>>>>>> @@ -15869,10 +15880,20 @@ check ip route add 233.252.0.0/24 via
> > 192.168.10.10 dev lo onlink vrf ovnvrf1337
> > >>>>>>>  # For now we trigger a recompute as route watching is not yet
> > implemented.
> > >>>>>>>  check ovn-appctl -t ovn-controller inc-engine/recompute
> > >>>>>>>  check ovn-nbctl --wait=hv sync
> > >>>>>>> -check_row_count Learned_Route 2
> > >>>>>>> +check_row_count Learned_Route 1
> > >>>>>>>  lp=$(fetch_column port_binding _uuid logical_port=internet-phys)
> > >>>>>>>  check_row_count Learned_Route 1 logical_port=$lp ip_prefix=
> > 233.252.0.0/24 nexthop=192.168.10.10
> > >>>>>>>
> > >>>>>>> +# By setting a learning interface filter we will now forget about
> > this route.
> > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
> > >>>>>>> +      options:dynamic-routing-ifname=thisdoesnotexist
> > >>>>>>> +check_row_count Learned_Route 0
> > >>>>>>> +
> > >>>>>>> +# Changing it to "lo" will allow us to learn the route again.
> > >>>>>>> +check ovn-nbctl --wait=hv set Logical_Router_Port internet-phys \
> > >>>>>>> +      options:dynamic-routing-ifname=lo
> > >>>>>>> +check_row_count Learned_Route 1
> > >>>>>>> +
> > >>>>>>>  as ovn-sb
> > >>>>>>>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> 2.47.1
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> _______________________________________________
> > >>>>>>> dev mailing list
> > >>>>>>> [email protected]
> > >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> > >
> >
> >
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to