On 2/13/25 10:18 PM, [email protected] wrote:
> On Thu, 2025-02-13 at 19:21 +0100, Felix Huettner wrote:
>> On Thu, Feb 13, 2025 at 02:33:45PM +0100, Martin Kalcok wrote:
>>> This change builds on top of the new "dynamic routing" OVN feature
>>> that allows advertising routes to the fabric network. New accepted
>>> values are added to "dynamic-routing-redistribute" option.
>>>
>>> * nat - When included in the option, ovn-controller will advertise
>>> routes for external NAT IPs of the LR, as well as those
>>> of the neighboring routers (directly connected or through
>>> the shared LS).
>>> * lb - When included in the option, ovn-controller will advertise
>>> routes for LB VIPs of the LR, as well as those
>>> of the neighboring routers (directly connected or through
>>> the shared LS).
>>
>> Hi Martin, Frode, Dumitru,
>>
>> from my perspective that patch looks good.
>>
>> Acked-By: Felix Huettner <[email protected]>
>>
>> I have added some comments on things below that might make sense to
>> do
>> differently. But honestly from my perspective they should not stand
>> in
>> the way of this patchset.
>>
>> There is a small type in ovn-nb.xml which should probably be fixed
>> though.
>>
>>>
>>> Co-authored-by: Frode Nordahl <[email protected]>
>>> Co-authored-by: Dumitru Ceara <[email protected]>
>>> Signed-off-by: Frode Nordahl <[email protected]>
>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>> Signed-off-by: Martin Kalcok <[email protected]>
>>> ---
>>> TODO.rst | 6 +
>>> lib/stopwatch-names.h | 1 +
>>> northd/en-advertised-route-sync.c | 202 ++++++++--
>>> northd/en-advertised-route-sync.h | 4 +
>>> northd/en-learned-route-sync.c | 3 +-
>>> northd/inc-proc-northd.c | 6 +
>>> northd/northd.c | 246 +++++++++++-
>>> northd/northd.h | 32 +-
>>> ovn-nb.xml | 36 ++
>>> tests/ovn-northd.at | 601
>>> ++++++++++++++++++++++++++++
>>> tests/system-ovn.at | 628
>>> ++++++++++++++++++++++++++++++
>>> 11 files changed, 1716 insertions(+), 49 deletions(-)
>>>
>>> diff --git a/TODO.rst b/TODO.rst
>>> index d75db3152..c50b2b980 100644
>>> --- a/TODO.rst
>>> +++ b/TODO.rst
>>> @@ -156,3 +156,9 @@ OVN To-do List
>>> monitoring conditions to update before we actually try to
>>> learn routes.
>>> Otherwise we could try to add duplicated Learned_Routes and
>>> the ovnsb
>>> commit would fail.
>>> + * Consider splitting parsed_route structure. When creating
>>> parsed routes
>>> + with tracked_port explicitly set, other members of this
>>> structure are
>>> + usually unused/default. A new structure dedicated to routes
>>> with
>>> + explicitly defined tracked_port would be more efficient.
>>> + More details in
>>> +
>>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420985.html
>>> diff --git a/lib/stopwatch-names.h b/lib/stopwatch-names.h
>>> index c12dd28d0..13aa5e7bf 100644
>>> --- a/lib/stopwatch-names.h
>>> +++ b/lib/stopwatch-names.h
>>> @@ -36,5 +36,6 @@
>>> #define LS_STATEFUL_RUN_STOPWATCH_NAME "ls_stateful"
>>> #define ADVERTISED_ROUTE_SYNC_RUN_STOPWATCH_NAME
>>> "advertised_route_sync"
>>> #define LEARNED_ROUTE_SYNC_RUN_STOPWATCH_NAME "learned_route_sync"
>>> +#define DYNAMIC_ROUTES_RUN_STOPWATCH_NAME "dynamic_routes"
>>>
>>> #endif
>>> diff --git a/northd/en-advertised-route-sync.c b/northd/en-
>>> advertised-route-sync.c
>>> index cfea7c39b..279020a25 100644
>>> --- a/northd/en-advertised-route-sync.c
>>> +++ b/northd/en-advertised-route-sync.c
>>> @@ -30,7 +30,8 @@ advertised_route_table_sync(
>>> struct ovsdb_idl_txn *ovnsb_txn,
>>> const struct sbrec_advertised_route_table
>>> *sbrec_advertised_route_table,
>>> const struct lr_stateful_table *lr_stateful_table,
>>> - const struct hmap *parsed_routes,
>>> + const struct hmap *routes,
>>> + const struct hmap *dynamic_routes,
>>> struct advertised_route_sync_data *data);
>>>
>>> bool
>>> @@ -141,6 +142,8 @@ en_advertised_route_sync_run(struct engine_node
>>> *node, void *data OVS_UNUSED)
>>> struct advertised_route_sync_data *routes_sync_data = data;
>>> struct routes_data *routes_data
>>> = engine_get_input_data("routes", node);
>>> + struct dynamic_routes_data *dynamic_routes_data
>>> + = engine_get_input_data("dynamic_routes", node);
>>> const struct engine_context *eng_ctx = engine_get_context();
>>> const struct sbrec_advertised_route_table
>>> *sbrec_advertised_route_table =
>>> EN_OVSDB_GET(engine_get_input("SB_advertised_route",
>>> node));
>>> @@ -153,12 +156,84 @@ en_advertised_route_sync_run(struct
>>> engine_node *node, void *data OVS_UNUSED)
>>> sbrec_advertised_route_table,
>>> &lr_stateful_data->table,
>>> &routes_data->parsed_routes,
>>> + &dynamic_routes_data-
>>>> parsed_routes,
>>> routes_sync_data);
>>>
>>> stopwatch_stop(ADVERTISED_ROUTE_SYNC_RUN_STOPWATCH_NAME,
>>> time_msec());
>>> engine_set_node_state(node, EN_UPDATED);
>>> }
>>>
>>> +void
>>> +*en_dynamic_routes_init(struct engine_node *node OVS_UNUSED,
>>> + struct engine_arg *arg OVS_UNUSED)
>>> +{
>>> + struct dynamic_routes_data *data = xmalloc(sizeof *data);
>>> + *data = (struct dynamic_routes_data) {
>>> + .parsed_routes = HMAP_INITIALIZER(&data->parsed_routes),
>>> + };
>>> +
>>> + return data;
>>> +}
>>> +
>>> +static void
>>> +en_dynamic_routes_clean(struct dynamic_routes_data *data)
>>> +{
>>> + struct parsed_route *r;
>>> + HMAP_FOR_EACH_POP (r, key_node, &data->parsed_routes) {
>>> + parsed_route_free(r);
>>> + }
>>> +}
>>> +void
>>> +en_dynamic_routes_cleanup(void *data_)
>>> +{
>>> + struct dynamic_routes_data *data = data_;
>>> +
>>> + en_dynamic_routes_clean(data);
>>> + hmap_destroy(&data->parsed_routes);
>>> +}
>>> +
>>> +void
>>> +en_dynamic_routes_run(struct engine_node *node, void *data)
>>> +{
>>> + struct dynamic_routes_data *dynamic_routes_data = data;
>>> + struct northd_data *northd_data =
>>> engine_get_input_data("northd", node);
>>> + struct ed_type_lr_stateful *lr_stateful_data =
>>> + engine_get_input_data("lr_stateful", node);
>>> +
>>> + en_dynamic_routes_clean(data);
>>> + const struct lr_stateful_record *lr_stateful_rec;
>>> + HMAP_FOR_EACH (lr_stateful_rec, key_node,
>>> + &lr_stateful_data->table.entries) {
>>> + const struct ovn_datapath *od =
>>> + ovn_datapaths_find_by_index(&northd_data-
>>>> lr_datapaths,
>>> + lr_stateful_rec-
>>>> lr_index);
>>> + if (!od->dynamic_routing) {
>>> + continue;
>>> + }
>>> + build_nat_parsed_routes(od, lr_stateful_rec->lrnat_rec,
>>> + &dynamic_routes_data-
>>>> parsed_routes);
>>> + build_nat_connected_parsed_routes(od, &lr_stateful_data-
>>>> table,
>>> + &dynamic_routes_data-
>>>> parsed_routes);
>>> +
>>> + build_lb_parsed_routes(od, lr_stateful_rec->lb_ips,
>>> + &dynamic_routes_data-
>>>> parsed_routes);
>>> + build_lb_connected_parsed_routes(od, &lr_stateful_data-
>>>> table,
>>> + &dynamic_routes_data-
>>>> parsed_routes);
>>> + }
>>> + engine_set_node_state(node, EN_UPDATED);
>>> +}
>>> +
>>> +bool
>>> +dynamic_routes_lr_stateful_handler(struct engine_node *node
>>> OVS_UNUSED,
>>> + void *data OVS_UNUSED)
>>> +{
>>> + /* XXX: Incremental processing of dynamic routes for stateful
>>> + * configuration changes is not yet supported. Return false
>>> and
>>> + * trigger a recomputation.*/
>>> + return false;
>>> +}
>>
>> I am not sure if we should have a function that always returns false.
>> Maybe it would be more obvious if engine_add_input just gets NULL as
>> a
>> handler?
>
> I don't have strong opinion on this, but the function gives us a good
> place to store comment about eventually implementing I-P :D
>
That's true, however, I think it's better if we pass NULL as handler.
Like that we have a single place where we can easily tell which input
handlers are not implemented yet. I'd also say, let's add a TODO.rst
item for this too, we have one for Learned Routes.
>>
>>> +
>>> +
>>> struct ar_entry {
>>> struct hmap_node hmap_node;
>>>
>>> @@ -333,12 +408,84 @@ publish_host_routes(struct hmap *sync_routes,
>>> }
>>> }
>>>
>>> +static void
>>> +advertised_route_table_sync_route_add(
>>> + const struct lr_stateful_table *lr_stateful_table,
>>> + struct advertised_route_sync_data *data,
>>> + struct uuidset *host_route_lrps,
>>> + struct hmap *sync_routes,
>>> + const struct parsed_route *route)
>>> +{
>>> + if (route->is_discard_route) {
>>> + return;
>>> + }
>>> + if (prefix_is_link_local(&route->prefix, route->plen)) {
>>> + return;
>>> + }
>>> + if (!route->od->dynamic_routing) {
>>> + return;
>>> + }
>>> +
>>> + enum dynamic_routing_redistribute_mode drr =
>>> + route->out_port->dynamic_routing_redistribute;
>>> + if (route->source == ROUTE_SOURCE_CONNECTED) {
>>> + if (!drr_mode_CONNECTED_is_set(drr)) {
>>> + return;
>>> + }
>>> + /* If we advertise host routes, we only need to do so once
>>> per
>>> + * LRP. */
>>> + const struct uuid *lrp_uuid = &route->out_port->nbrp-
>>>> header_.uuid;
>>> + if (drr_mode_CONNECTED_AS_HOST_is_set(drr) &&
>>> + !uuidset_contains(host_route_lrps, lrp_uuid)) {
>>> + uuidset_insert(host_route_lrps, lrp_uuid);
>>> + publish_host_routes(sync_routes, lr_stateful_table,
>>> route, data);
>>> + return;
>>> + }
>>> + }
>>> + if (route->source == ROUTE_SOURCE_STATIC &&
>>> !drr_mode_STATIC_is_set(drr)) {
>>> + return;
>>> + }
>>> + if (route->source == ROUTE_SOURCE_NAT) {
>>> + if (!drr_mode_NAT_is_set(drr)) {
>>> + return;
>>> + }
>>> + /* If NAT route tracks port on a different DP than the one
>>> that
>>> + * advertises the route, we need to watch for changes on
>>> that DP as
>>> + * well. */
>>> + if (route->tracked_port && route->tracked_port->od !=
>>> route->od) {
>>> + uuidset_insert(&data->nb_lr,
>>> + &route->tracked_port->od->nbr-
>>>> header_.uuid);
>>> + }
>>> + }
>>> + if (route->source == ROUTE_SOURCE_LB) {
>>> + if (!drr_mode_LB_is_set(drr)) {
>>> + return;
>>> + }
>>> + /* If LB route tracks port on a different DP than the one
>>> that
>>> + * advertises the route, we need to watch for changes on
>>> that DP as
>>> + * well. */
>>> + if (route->tracked_port && route->tracked_port->od !=
>>> route->od) {
>>> + uuidset_insert(&data->nb_lr,
>>> + &route->tracked_port->od->nbr-
>>>> header_.uuid);
>>> + }
>>> + }
>>> +
>>> + char *ip_prefix = normalize_v46_prefix(&route->prefix, route-
>>>> plen);
>>> + const struct sbrec_port_binding *tracked_port = NULL;
>>> + if (route->tracked_port) {
>>> + tracked_port = route->tracked_port->sb;
>>> + }
>>> + ar_add_entry(sync_routes, route->od->sb, route->out_port->sb,
>>> + ip_prefix, tracked_port);
>>> +}
>>> +
>>> static void
>>> advertised_route_table_sync(
>>> struct ovsdb_idl_txn *ovnsb_txn,
>>> const struct sbrec_advertised_route_table
>>> *sbrec_advertised_route_table,
>>> const struct lr_stateful_table *lr_stateful_table,
>>> - const struct hmap *parsed_routes,
>>> + const struct hmap *routes,
>>> + const struct hmap *dynamic_routes,
>>> struct advertised_route_sync_data *data)
>>> {
>>> struct hmap sync_routes = HMAP_INITIALIZER(&sync_routes);
>>> @@ -346,47 +493,24 @@ advertised_route_table_sync(
>>> const struct parsed_route *route;
>>>
>>> struct ar_entry *route_e;
>>> - const struct sbrec_advertised_route *sb_route;
>>> - HMAP_FOR_EACH (route, key_node, parsed_routes) {
>>> - if (route->is_discard_route) {
>>> - continue;
>>> - }
>>> - if (prefix_is_link_local(&route->prefix, route->plen)) {
>>> - continue;
>>> - }
>>> - if (!route->od->dynamic_routing) {
>>> - continue;
>>> - }
>>>
>>> - enum dynamic_routing_redistribute_mode drr =
>>> - route->out_port->dynamic_routing_redistribute;
>>> - if (route->source == ROUTE_SOURCE_CONNECTED) {
>>> - if (!drr_mode_CONNECTED_is_set(drr)) {
>>> - continue;
>>> - }
>>> - /* If we advertise host routes, we only need to do so
>>> once per
>>> - * LRP. */
>>> - const struct uuid *lrp_uuid =
>>> - &route->out_port->nbrp->header_.uuid;
>>> - if (drr_mode_CONNECTED_AS_HOST_is_set(drr) &&
>>> - !uuidset_contains(&host_route_lrps, lrp_uuid)) {
>>> - uuidset_insert(&host_route_lrps, lrp_uuid);
>>> - publish_host_routes(&sync_routes,
>>> lr_stateful_table,
>>> - route, data);
>>> - continue;
>>> - }
>>> - }
>>> - if (route->source == ROUTE_SOURCE_STATIC &&
>>> - !drr_mode_STATIC_is_set(drr)) {
>>> - continue;
>>> - }
>>> -
>>> - char *ip_prefix = normalize_v46_prefix(&route->prefix,
>>> route->plen);
>>> - route_e = ar_add_entry(&sync_routes, route->od->sb,
>>> - route->out_port->sb, ip_prefix,
>>> NULL);
>>> + /* First build the set of non-dynamic routes that need sync-
>>> ing. */
>>> + HMAP_FOR_EACH (route, key_node, routes) {
>>> + advertised_route_table_sync_route_add(lr_stateful_table,
>>> + data,
>>> &host_route_lrps,
>>> + &sync_routes,
>>> + route);
>>> + }
>>> + /* Then add the set of dynamic routes that need sync-ing. */
>>> + HMAP_FOR_EACH (route, key_node, dynamic_routes) {
>>> + advertised_route_table_sync_route_add(lr_stateful_table,
>>> + data,
>>> &host_route_lrps,
>>> + &sync_routes,
>>> + route);
>>> }
>>> uuidset_destroy(&host_route_lrps);
>>>
>>> + const struct sbrec_advertised_route *sb_route;
>>> SBREC_ADVERTISED_ROUTE_TABLE_FOR_EACH_SAFE (sb_route,
>>>
>>> sbrec_advertised_route_table) {
>>> route_e = ar_find(&sync_routes, sb_route->datapath,
>>> diff --git a/northd/en-advertised-route-sync.h b/northd/en-
>>> advertised-route-sync.h
>>> index 1f24fd329..e18b75643 100644
>>> --- a/northd/en-advertised-route-sync.h
>>> +++ b/northd/en-advertised-route-sync.h
>>> @@ -36,4 +36,8 @@ void *en_advertised_route_sync_init(struct
>>> engine_node *, struct engine_arg *);
>>> void en_advertised_route_sync_cleanup(void *data);
>>> void en_advertised_route_sync_run(struct engine_node *, void
>>> *data);
>>>
>>> +bool dynamic_routes_lr_stateful_handler(struct engine_node *, void
>>> *data);
>>> +void *en_dynamic_routes_init(struct engine_node *, struct
>>> engine_arg *);
>>> +void en_dynamic_routes_cleanup(void *data);
>>> +void en_dynamic_routes_run(struct engine_node *, void *data);
>>> #endif /* EN_ADVERTISED_ROUTE_SYNC_H */
>>> diff --git a/northd/en-learned-route-sync.c b/northd/en-learned-
>>> route-sync.c
>>> index 406f1551f..4e87b3265 100644
>>> --- a/northd/en-learned-route-sync.c
>>> +++ b/northd/en-learned-route-sync.c
>>> @@ -181,7 +181,8 @@ parse_route_from_sbrec_route(struct hmap
>>> *parsed_routes_out,
>>>
>>> parsed_route_add(od, nexthop, &prefix, plen, false,
>>> lrp_addr_s,
>>> out_port, 0, false, false, NULL,
>>> - ROUTE_SOURCE_LEARNED, &route->header_,
>>> parsed_routes_out);
>>> + ROUTE_SOURCE_LEARNED, &route->header_, NULL,
>>> + parsed_routes_out);
>>> }
>>>
>>> static void
>>> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
>>> index 438daf1c6..1bdc10e6a 100644
>>> --- a/northd/inc-proc-northd.c
>>> +++ b/northd/inc-proc-northd.c
>>> @@ -175,6 +175,7 @@ static ENGINE_NODE(multicast_igmp,
>>> "multicast_igmp");
>>> static ENGINE_NODE(acl_id, "acl_id");
>>> static ENGINE_NODE(advertised_route_sync,
>>> "advertised_route_sync");
>>> static ENGINE_NODE(learned_route_sync, "learned_route_sync");
>>> +static ENGINE_NODE(dynamic_routes, "dynamic_routes");
>>>
>>> void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
>>> struct ovsdb_idl_loop *sb)
>>> @@ -289,7 +290,12 @@ void inc_proc_northd_init(struct
>>> ovsdb_idl_loop *nb,
>>> engine_add_input(&en_ecmp_nexthop, &en_sb_mac_binding,
>>> ecmp_nexthop_mac_binding_handler);
>>>
>>> + engine_add_input(&en_dynamic_routes, &en_lr_stateful,
>>> + dynamic_routes_lr_stateful_handler);
>>> + engine_add_input(&en_dynamic_routes, &en_northd,
>>> engine_noop_handler);
>>> +
>>> engine_add_input(&en_advertised_route_sync, &en_routes, NULL);
>>> + engine_add_input(&en_advertised_route_sync,
>>> &en_dynamic_routes, NULL);
>>> engine_add_input(&en_advertised_route_sync,
>>> &en_sb_advertised_route,
>>> NULL);
>>> engine_add_input(&en_advertised_route_sync, &en_lr_stateful,
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 1c9433e96..e9866e7be 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -848,6 +848,14 @@ parse_dynamic_routing_redistribute(
>>> out |= DRRM_STATIC;
>>> continue;
>>> }
>>> + if (!strcmp(token, "nat")) {
>>> + out |= DRRM_NAT;
>>> + continue;
>>> + }
>>> + if (!strcmp(token, "lb")) {
>>> + out |= DRRM_LB;
>>> + continue;
>>> + }
>>> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>> VLOG_WARN_RL(&rl,
>>> "unkown dynamic-routing-redistribute option
>>> '%s' on %s",
>>> @@ -11012,6 +11020,7 @@ parsed_route_init(const struct ovn_datapath
>>> *od,
>>> bool ecmp_symmetric_reply,
>>> const struct sset *ecmp_selection_fields,
>>> enum route_source source,
>>> + const struct ovn_port *tracked_port,
>>> const struct ovsdb_idl_row *source_hint)
>>> {
>>>
>>> @@ -11027,6 +11036,7 @@ parsed_route_init(const struct ovn_datapath
>>> *od,
>>> new_pr->is_discard_route = is_discard_route;
>>> new_pr->lrp_addr_s = nullable_xstrdup(lrp_addr_s);
>>> new_pr->out_port = out_port;
>>> + new_pr->tracked_port = tracked_port;
>>> new_pr->source = source;
>>> if (ecmp_selection_fields) {
>>> sset_clone(&new_pr->ecmp_selection_fields,
>>> ecmp_selection_fields);
>>> @@ -11052,7 +11062,7 @@ parsed_route_clone(const struct
>>> parsed_route *pr)
>>> pr->od, nexthop, pr->prefix, pr->plen, pr-
>>>> is_discard_route,
>>> pr->lrp_addr_s, pr->out_port, pr->route_table_id, pr-
>>>> is_src_route,
>>> pr->ecmp_symmetric_reply, &pr->ecmp_selection_fields, pr-
>>>> source,
>>> - pr->source_hint);
>>> + pr->tracked_port, pr->source_hint);
>>>
>>> new_pr->hash = pr->hash;
>>> return new_pr;
>>> @@ -11095,13 +11105,14 @@ parsed_route_add(const struct
>>> ovn_datapath *od,
>>> const struct sset *ecmp_selection_fields,
>>> enum route_source source,
>>> const struct ovsdb_idl_row *source_hint,
>>> + const struct ovn_port *tracked_port,
>>> struct hmap *routes)
>>> {
>>>
>>> struct parsed_route *new_pr = parsed_route_init(
>>> od, nexthop, *prefix, plen, is_discard_route, lrp_addr_s,
>>> out_port,
>>> route_table_id, is_src_route, ecmp_symmetric_reply,
>>> - ecmp_selection_fields, source, source_hint);
>>> + ecmp_selection_fields, source, tracked_port, source_hint);
>>>
>>> new_pr->hash = route_hash(new_pr);
>>>
>>> @@ -11238,7 +11249,7 @@ parsed_routes_add_static(const struct
>>> ovn_datapath *od,
>>> parsed_route_add(od, nexthop, &prefix, plen, is_discard_route,
>>> lrp_addr_s,
>>> out_port, route_table_id, is_src_route,
>>> ecmp_symmetric_reply, &ecmp_selection_fields,
>>> source,
>>> - &route->header_, routes);
>>> + &route->header_, NULL, routes);
>>> sset_destroy(&ecmp_selection_fields);
>>> }
>>>
>>> @@ -11256,7 +11267,7 @@ parsed_routes_add_connected(const struct
>>> ovn_datapath *od,
>>> false, addr->addr_s, op,
>>> 0, false,
>>> false, NULL, ROUTE_SOURCE_CONNECTED,
>>> - &op->nbrp->header_, routes);
>>> + &op->nbrp->header_, NULL, routes);
>>> }
>>>
>>> for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
>>> @@ -11268,7 +11279,7 @@ parsed_routes_add_connected(const struct
>>> ovn_datapath *od,
>>> false, addr->addr_s, op,
>>> 0, false,
>>> false, NULL, ROUTE_SOURCE_CONNECTED,
>>> - &op->nbrp->header_, routes);
>>> + &op->nbrp->header_, NULL, routes);
>>> }
>>> }
>>>
>>> @@ -11306,6 +11317,229 @@ build_parsed_routes(const struct
>>> ovn_datapath *od, const struct hmap *lr_ports,
>>> }
>>> }
>>>
>>> +/* This function adds new parsed route for each entry in lr_nat
>>> record
>>> + * to "routes". Logical port of the route is set to
>>> "advertising_op" and
>>> + * tracked port is set to NAT's distributed gw port. If NAT
>>> doesn't have
>>> + * DGP (for example if it's set on gateway router), no tracked
>>> port will
>>> + * be set.*/
>>> +static void
>>> +build_nat_parsed_route_for_port(const struct ovn_port
>>> *advertising_op,
>>> + const struct lr_nat_record
>>> *lr_nat,
>>> + struct hmap *routes)
>>> +{
>>> + const struct ovn_datapath *advertising_od = advertising_op-
>>>> od;
>>> +
>>> + for (size_t i = 0; i < lr_nat->n_nat_entries; i++) {
>>> + const struct ovn_nat *nat = &lr_nat->nat_entries[i];
>>> + int plen = nat_entry_is_v6(nat) ? 128 : 32;
>>> + struct in6_addr prefix;
>>> + ip46_parse(nat->nb->external_ip, &prefix);
>>> + parsed_route_add(advertising_od, NULL, &prefix, plen,
>>> false,
>>> + nat->nb->external_ip, advertising_op, 0,
>>> false,
>>> + false, NULL, ROUTE_SOURCE_NAT, &nat->nb-
>>>> header_,
>>> + nat->l3dgw_port, routes);
>>
>> I am thinking about if in the case of a distributed dnat_and_snat NAT
>> the tracked_port should be logical_port.
>> But it is late so i am not really sure.
>> Also it is something that can easily be changed later when there is a
>> need.
>
> I could swear we had that.
> Anyway, I was also manually testing different topologies, and I found
> out that we actually need to advertise tracked_port even for GW
> routers. The reason being that when two GW routers are connected via LS
> and both of them redistribute NAT, when you add NAT on one of the
> routers, both will announce it with the same metric.
>
> I'll roll these changes together, they should be minor.
>
+1, let's also add a test for the distributed NAT case, please.
>>
>>> + }
>>> +}
>>> +
>>> +/* Generate parsed routes for NAT external IPs in lr_nat, for each
>>> ovn port
>>> + * in "od" that has enabled redistribution of NAT adresses.*/
>>> +void
>>> +build_nat_parsed_routes(const struct ovn_datapath *od,
>>> + const struct lr_nat_record *lr_nat,
>>> + struct hmap *routes)
>>> +{
>>> + const struct ovn_port *op;
>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>> + if (!drr_mode_NAT_is_set(op-
>>>> dynamic_routing_redistribute)) {
>>> + continue;
>>> + }
>>> +
>>> + build_nat_parsed_route_for_port(op, lr_nat, routes);
>>> + }
>>> +}
>>> +
>>> +/* Similar to build_nat_parsed_routes, this function generates
>>> parsed routes
>>> + * for nat records in neighboring routers. For each ovn port in
>>> "od" that has
>>> + * enabled redistribution of NAT adresses, look up their neighbors
>>> (either
>>> + * directly routers, or routers connected through common LS) and
>>> advertise
>>> + * thier external NAT IPs too.*/
>>> +void
>>> +build_nat_connected_parsed_routes(
>>> + const struct ovn_datapath *od,
>>> + const struct lr_stateful_table *lr_stateful_table,
>>> + struct hmap *routes)
>>> +{
>>> + const struct ovn_port *op;
>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>> + if (!drr_mode_NAT_is_set(op-
>>>> dynamic_routing_redistribute)) {
>>> + continue;
>>> + }
>>> +
>>> + if (!op->peer) {
>>> + continue;
>>> + }
>>> +
>>> + struct ovn_datapath *peer_od = op->peer->od;
>>> + if (!peer_od->nbs && !peer_od->nbr) {
>>> + continue;
>>> + }
>>> +
>>> + const struct ovn_port *peer_port = NULL;
>>> + /* This is a directly connected LR peer. */
>>> + if (peer_od->nbr) {
>>> + peer_port = op->peer;
>>> +
>>> + const struct lr_stateful_record *peer_lr_stateful =
>>> + lr_stateful_table_find_by_index(lr_stateful_table,
>>> + peer_od->index);
>>> + if (!peer_lr_stateful) {
>>> + continue;
>>> + }
>>> +
>>> + /* Advertise peer's NAT routes via the local port too.
>>> */
>>> + build_nat_parsed_route_for_port(op, peer_lr_stateful-
>>>> lrnat_rec,
>>> + routes);
>>> + return;
>>> + }
>>> +
>>> + /* This peer is LSP, we need to check all connected router
>>> ports
>>> + * for NAT.*/
>>> + for (size_t i = 0; i < peer_od->n_router_ports; i++) {
>>> + peer_port = peer_od->router_ports[i]->peer;
>>> + if (peer_port == op) {
>>> + /* Skip advertising router. */
>>> + continue;
>>> + }
>>> +
>>> + const struct lr_stateful_record *peer_lr_stateful =
>>> + lr_stateful_table_find_by_index(lr_stateful_table,
>>> + peer_port->od-
>>>> index);
>>> + if (!peer_lr_stateful) {
>>> + continue;
>>> + }
>>> +
>>> + /* Advertise peer's NAT routes via the local port too.
>>> */
>>> + build_nat_parsed_route_for_port(op, peer_lr_stateful-
>>>> lrnat_rec,
>>> + routes);
>>> + }
>>> + }
>>> +}
>>> +
>>> +/* This function adds new parsed route for each IP in lb_ips to
>>> "routes".*/
>>> +static void
>>> +build_lb_parsed_route_for_port(const struct ovn_port
>>> *advertising_op,
>>> + const struct ovn_port
>>> *tracked_port,
>>> + const struct ovn_lb_ip_set *lb_ips,
>>> + struct hmap *routes)
>>> +{
>>> + const struct ovn_datapath *advertising_od = advertising_op-
>>>> od;
>>> +
>>> + const char *ip_address;
>>> + SSET_FOR_EACH (ip_address, &lb_ips->ips_v4) {
>>> + struct in6_addr prefix;
>>> + ip46_parse(ip_address, &prefix);
>>> + parsed_route_add(advertising_od, NULL, &prefix, 32, false,
>>> + ip_address, advertising_op, 0, false,
>>> false,
>>> + NULL, ROUTE_SOURCE_LB, &advertising_op-
>>>> nbrp->header_,
>>> + tracked_port, routes);
>>> + }
>>> + SSET_FOR_EACH (ip_address, &lb_ips->ips_v6) {
>>> + struct in6_addr prefix;
>>> + ip46_parse(ip_address, &prefix);
>>> + parsed_route_add(advertising_od, NULL, &prefix, 128,
>>> false,
>>> + ip_address, advertising_op, 0, false,
>>> false,
>>> + NULL, ROUTE_SOURCE_LB, &advertising_op-
>>>> nbrp->header_,
>>> + tracked_port, routes);
>>> + }
>>> +}
>>> +
>>> +/* Similar to build_lb_parsed_routes, this function generates
>>> parsed routes
>>> + * for LB VIPs of neighboring routers. For each ovn port in "od"
>>> that has
>>> + * enabled redistribution of LB VIPs, look up their neighbors
>>> (either
>>> + * directly routers, or routers connected through common LS) and
>>> advertise
>>> + * thier LB VIPs too.*/
>>
>> Since this looks extremely similar to
>> build_nat_connected_parsed_routes
>> maybe it would make sense to merge them and just call
>> build_lb_parsed_route_for_port and build_nat_parsed_route_for_port.
>> Then we don't need to iterate over the same thing twice.
>>
>> On the other hand this greatly increases clariy
>
> I'd personally stick with separate functions in this case. There's
> quite a lot of branching already, and the two function have slight
> differences that re (at least to me) easier to follow when the
> functions are separate.
>
I think I'd prefer separate functions too to be honest. It's not that
much bloat in the end.
>>
>>> +void
>>> +build_lb_connected_parsed_routes(
>>> + const struct ovn_datapath *od,
>>> + const struct lr_stateful_table *lr_stateful_table,
>>> + struct hmap *routes)
>>> +{
>>> + const struct ovn_port *op;
>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>> + if (!drr_mode_LB_is_set(op->dynamic_routing_redistribute))
>>> {
>>> + continue;
>>> + }
>>> +
>>> + if (!op->peer) {
>>> + continue;
>>> + }
>>> +
>>> + struct ovn_datapath *peer_od = op->peer->od;
>>> + if (!peer_od->nbs && !peer_od->nbr) {
>>> + continue;
>>> + }
>>> +
>>> + const struct lr_stateful_record *lr_stateful_rec;
>>> + const struct ovn_port *peer_port = NULL;
>>> + /* This is directly connected LR peer. */
>>> + if (peer_od->nbr) {
>>> + lr_stateful_rec = lr_stateful_table_find_by_index(
>>> + lr_stateful_table, peer_od->index);
>>> + peer_port = op->peer;
>>> + build_lb_parsed_route_for_port(op, peer_port,
>>> + lr_stateful_rec-
>>>> lb_ips, routes);
>>> + return;
>>> + }
>>> +
>>> + /* This peer is LSP, we need to check all connected router
>>> ports for
>>> + * LBs.*/
>>> + for (size_t i = 0; i < peer_od->n_router_ports; i++) {
>>> + peer_port = peer_od->router_ports[i]->peer;
>>> + if (peer_port == op) {
>>> + /* no need to check for LBs on ovn_port that
>>> initiated this
>>> + * function.*/
>>> + continue;
>>> + }
>>> + lr_stateful_rec = lr_stateful_table_find_by_index(
>>> + lr_stateful_table, peer_port->od->index);
>>> +
>>> + build_lb_parsed_route_for_port(op, peer_port,
>>> + lr_stateful_rec-
>>>> lb_ips, routes);
>>> + }
>>> + }
>>> +}
>>> +
>>> +void
>>> +build_lb_parsed_routes(const struct ovn_datapath *od,
>>> + const struct ovn_lb_ip_set *lb_ips,
>>> + struct hmap *routes)
>>> +{
>>> + const struct ovn_port *op;
>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>> + if (!drr_mode_LB_is_set(op->dynamic_routing_redistribute))
>>> {
>>> + continue;
>>> + }
>>> +
>>> + /* Traffic processed by a load balancer is:
>>> + * - handled by the chassis where a gateway router is
>>> bound
>>> + * OR
>>> + * - always redirected to a distributed gateway router
>>> port
>>> + *
>>> + * Advertise the LB IPs via all 'op' if this is a gateway
>>> router or
>>> + * throuh all DGPs of this distributed router otherwise.
>>> */
>>> + struct ovn_port *op_ = NULL;
>>> + size_t n_tracked_ports = !od->is_gw_router ? od-
>>>> n_l3dgw_ports : 1;
>>> + struct ovn_port **tracked_ports = !od->is_gw_router
>>> + ? od->l3dgw_ports
>>> + : &op_;
>>> +
>>> + for (size_t i = 0; i < n_tracked_ports; i++) {
>>> + build_lb_parsed_route_for_port(op, tracked_ports[i],
>>> lb_ips,
>>> + routes);
>>
>> I honestly never really thought about a case where we want to
>> advertise
>> the same route with multiple tracked_ports.
>> But i guess there is nothing speaking against it :)
>
> The automagical handling of metrics based on the tracked port is quite
> convenient :)
>
Yes, if I'm not wrong it was Frode who said that this is a sign that the
schema was designed properly. :)
>>
>>> + }
>>> + }
>>> +
>>> +}
>>> struct ecmp_route_list_node {
>>> struct ovs_list list_node;
>>> uint16_t id; /* starts from 1 */
>>> @@ -11480,6 +11714,8 @@ route_source_to_offset(enum route_source
>>> source)
>>> return ROUTE_PRIO_OFFSET_STATIC;
>>> case ROUTE_SOURCE_LEARNED:
>>> return ROUTE_PRIO_OFFSET_LEARNED;
>>> + case ROUTE_SOURCE_NAT:
>>> + case ROUTE_SOURCE_LB:
>>> default:
>>> OVS_NOT_REACHED();
>>> }
>>> diff --git a/northd/northd.h b/northd/northd.h
>>> index b984e124d..a767fd834 100644
>>> --- a/northd/northd.h
>>> +++ b/northd/northd.h
>>> @@ -186,11 +186,15 @@ struct route_policy {
>>> };
>>>
>>> struct routes_data {
>>> - struct hmap parsed_routes;
>>> + struct hmap parsed_routes; /* Stores struct parsed_route. */
>>> struct simap route_tables;
>>> struct hmap bfd_active_connections;
>>> };
>>>
>>> +struct dynamic_routes_data {
>>> + struct hmap parsed_routes; /* Stores struct parsed_route. */
>>> +};
>>> +
>>> struct route_policies_data {
>>> struct hmap route_policies;
>>> struct hmap bfd_active_connections;
>>> @@ -308,10 +312,12 @@ struct mcast_port_info {
>>> * (e.g., IGMP join/leave). */
>>> };
>>>
>>> -#define DRR_MODES \
>>> - DRR_MODE(CONNECTED, 0) \
>>> +#define DRR_MODES \
>>> + DRR_MODE(CONNECTED, 0) \
>>> DRR_MODE(CONNECTED_AS_HOST, 1) \
>>> - DRR_MODE(STATIC, 2)
>>> + DRR_MODE(STATIC, 2) \
>>> + DRR_MODE(NAT, 3) \
>>> + DRR_MODE(LB, 4)
>>>
>>> enum dynamic_routing_redistribute_mode_bits {
>>> #define DRR_MODE(PROTOCOL, BIT) DRRM_##PROTOCOL##_BIT = BIT,
>>> @@ -746,6 +752,10 @@ enum route_source {
>>> ROUTE_SOURCE_STATIC,
>>> /* The route is dynamically learned by an ovn-controller. */
>>> ROUTE_SOURCE_LEARNED,
>>> + /* The route is derived from a NAT's external IP. */
>>> + ROUTE_SOURCE_NAT,
>>> + /* The route is derived from a LB's VIP. */
>>> + ROUTE_SOURCE_LB,
>>> };
>>>
>>> struct parsed_route {
>>> @@ -765,6 +775,7 @@ struct parsed_route {
>>> const struct ovsdb_idl_row *source_hint;
>>> char *lrp_addr_s;
>>> const struct ovn_port *out_port;
>>> + const struct ovn_port *tracked_port; /* May be NULL. */
>>> };
>>>
>>> struct parsed_route *parsed_route_clone(const struct parsed_route
>>> *);
>>> @@ -784,6 +795,7 @@ void parsed_route_add(const struct ovn_datapath
>>> *od,
>>> const struct sset *ecmp_selection_fields,
>>> enum route_source source,
>>> const struct ovsdb_idl_row *source_hint,
>>> + const struct ovn_port *tracked_port,
>>> struct hmap *routes);
>>>
>>> bool
>>> @@ -818,6 +830,18 @@ void route_policies_destroy(struct
>>> route_policies_data *);
>>> void build_parsed_routes(const struct ovn_datapath *, const struct
>>> hmap *,
>>> const struct hmap *, struct hmap *,
>>> struct simap *,
>>> struct hmap *);
>>> +void build_nat_parsed_routes(const struct ovn_datapath *,
>>> + const struct lr_nat_record *,
>>> + struct hmap *);
>>> +void build_nat_connected_parsed_routes(const struct ovn_datapath
>>> *,
>>> + const struct
>>> lr_stateful_table *,
>>> + struct hmap *routes);
>>> +void build_lb_parsed_routes(const struct ovn_datapath *,
>>> + const struct ovn_lb_ip_set *,
>>> + struct hmap *);
>>> +void build_lb_connected_parsed_routes(const struct ovn_datapath *,
>>> + const struct
>>> lr_stateful_table *,
>>> + struct hmap *routes);
>>> uint32_t get_route_table_id(struct simap *, const char *);
>>> void routes_init(struct routes_data *);
>>> void routes_destroy(struct routes_data *);
>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>> index 20d30dd58..bd3d2bf3e 100644
>>> --- a/ovn-nb.xml
>>> +++ b/ovn-nb.xml
>>> @@ -3101,6 +3101,24 @@ or
>>> <ref table="Advertised_Route" db="OVN_Southbound"/>
>>> table.
>>> </p>
>>>
>>> + <p>
>>> + If <code>nb</code> is in the list then northd will
>>> create entries in
>>
>> I guess this is a typo and should be "<code>lb</code>"?
>
> Thanks for catching this, indeed a typo.
>>
>>> + <ref table="Advertised_Route" db="OVN_Southbound"/>
>>> table for each
>>> + Load Balancer VIP on this router and it's neighboring
>>> routers.
>>> + Neighboring routers are those that are either directly
>>> connected,
>>> + via Logical Router Port, or those that are connected via
>>> shared
>>> + Logical Switch.
>>> + </p>
>>> +
>>> + <p>
>>> + If <code>nat</code> is in the list then northd will
>>> create entries in
>>> + <ref table="Advertised_Route" db="OVN_Southbound"/>
>>> table for each
>>> + NAT's external IP on this router and it's neighboring
>>> routers.
>>> + Neighboring routers are those that are either directly
>>> connected,
>>> + via Logical Router Port, or those that are connected via
>>> shared
>>> + Logical Switch.
>>> + </p>
>>> +
>>> <p>
>>> This value can be overwritten on a per LRP basis using
>>> <ref column="options" key="dynamic-routing-redistribute"
>>> @@ -3950,6 +3968,24 @@ or
>>> <ref table="Advertised_Route" db="OVN_Southbound"/>
>>> table.
>>> </p>
>>>
>>> + <p>
>>> + If <code>nb</code> is in the list then northd will
>>> create entries in
>>
>> here too.
>>
>> Thanks a lot for your work,
>> Felix
>
> Thanks for the review Felix.
>
Indeed, thanks for the careful review! I have a few more comments on
the ovn-northd.at tests though, please see below.
>>
>>> + <ref table="Advertised_Route" db="OVN_Southbound"/>
>>> table for each
>>> + Load Balancer VIP on this port's router, and it's
>>> neighboring
>>> + routers. Neighboring routers are those that are either
>>> directly
>>> + connected to this Logical Router Port, or those that are
>>> connected
>>> + via shared Logical Switch.
>>> + </p>
>>> +
>>> + <p>
>>> + If <code>nat</code> is in the list then northd will
>>> create entries in
>>> + <ref table="Advertised_Route" db="OVN_Southbound"/>
>>> table for each
>>> + NAT's external IP on this port's router, and it's
>>> neighboring
>>> + routers. Neighboring routers are those that are either
>>> directly
>>> + connected to this Logical Router Port, or those that are
>>> connected
>>> + via shared Logical Switch.
>>> + </p>
>>> +
>>> <p>
>>> If not set the value from <ref column="options"
>>> key="dynamic-routing-redistribute"
>>> table="Logical_Router"/> on the
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index 64991ff75..c59512b29 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -15542,3 +15542,604 @@ CHECK_NO_CHANGE_AFTER_RECOMPUTE
>>>
>>> AT_CLEANUP
>>> ])
For the tests in ovn-northd.at, I think we should:
- remove unneeded DB table dumps
- reindent ovn-nbctl commands
- fix comments
- use fetch_column/check_column/check_row_count instead of bare AT_CHECKs.
I was trying to cover the review comments (and taking care of the test
nits) here:
https://github.com/dceara/ovn/commit/0785375
If it looks OK to you, feel free to use it in v7.
That commit doesn't add a test for the distributed NAT scenario though.
Thanks,
Dumitru
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev