Hi Martin, On 2/13/25 10:32 PM, Dumitru Ceara wrote: > On 2/13/25 10:18 PM, [email protected] wrote: >> On Thu, 2025-02-13 at 19:21 +0100, Felix Huettner wrote: >>> On Thu, Feb 13, 2025 at 02:33:45PM +0100, Martin Kalcok wrote: >>>> This change builds on top of the new "dynamic routing" OVN feature >>>> that allows advertising routes to the fabric network. New accepted >>>> values are added to "dynamic-routing-redistribute" option. >>>> >>>> * nat - When included in the option, ovn-controller will advertise >>>> routes for external NAT IPs of the LR, as well as those >>>> of the neighboring routers (directly connected or through >>>> the shared LS). >>>> * lb - When included in the option, ovn-controller will advertise >>>> routes for LB VIPs of the LR, as well as those >>>> of the neighboring routers (directly connected or through >>>> the shared LS). >>> >>> Hi Martin, Frode, Dumitru, >>> >>> from my perspective that patch looks good. >>> >>> Acked-By: Felix Huettner <[email protected]> >>> >>> I have added some comments on things below that might make sense to >>> do >>> differently. But honestly from my perspective they should not stand >>> in >>> the way of this patchset. >>> >>> There is a small type in ovn-nb.xml which should probably be fixed >>> though. >>> >>>> >>>> Co-authored-by: Frode Nordahl <[email protected]> >>>> Co-authored-by: Dumitru Ceara <[email protected]> >>>> Signed-off-by: Frode Nordahl <[email protected]> >>>> Signed-off-by: Dumitru Ceara <[email protected]> >>>> Signed-off-by: Martin Kalcok <[email protected]> >>>> --- >>>> TODO.rst | 6 + >>>> lib/stopwatch-names.h | 1 + >>>> northd/en-advertised-route-sync.c | 202 ++++++++-- >>>> northd/en-advertised-route-sync.h | 4 + >>>> northd/en-learned-route-sync.c | 3 +- >>>> northd/inc-proc-northd.c | 6 + >>>> northd/northd.c | 246 +++++++++++- >>>> northd/northd.h | 32 +- >>>> ovn-nb.xml | 36 ++ >>>> tests/ovn-northd.at | 601 >>>> ++++++++++++++++++++++++++++ >>>> tests/system-ovn.at | 628 >>>> ++++++++++++++++++++++++++++++ >>>> 11 files changed, 1716 insertions(+), 49 deletions(-) >>>> >>>> diff --git a/TODO.rst b/TODO.rst >>>> index d75db3152..c50b2b980 100644 >>>> --- a/TODO.rst >>>> +++ b/TODO.rst >>>> @@ -156,3 +156,9 @@ OVN To-do List >>>> monitoring conditions to update before we actually try to >>>> learn routes. >>>> Otherwise we could try to add duplicated Learned_Routes and >>>> the ovnsb >>>> commit would fail. >>>> + * Consider splitting parsed_route structure. When creating >>>> parsed routes >>>> + with tracked_port explicitly set, other members of this >>>> structure are >>>> + usually unused/default. A new structure dedicated to routes >>>> with >>>> + explicitly defined tracked_port would be more efficient. >>>> + More details in >>>> + >>>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420985.html >>>> diff --git a/lib/stopwatch-names.h b/lib/stopwatch-names.h >>>> index c12dd28d0..13aa5e7bf 100644 >>>> --- a/lib/stopwatch-names.h >>>> +++ b/lib/stopwatch-names.h >>>> @@ -36,5 +36,6 @@ >>>> #define LS_STATEFUL_RUN_STOPWATCH_NAME "ls_stateful" >>>> #define ADVERTISED_ROUTE_SYNC_RUN_STOPWATCH_NAME >>>> "advertised_route_sync" >>>> #define LEARNED_ROUTE_SYNC_RUN_STOPWATCH_NAME "learned_route_sync" >>>> +#define DYNAMIC_ROUTES_RUN_STOPWATCH_NAME "dynamic_routes" >>>> >>>> #endif >>>> diff --git a/northd/en-advertised-route-sync.c b/northd/en- >>>> advertised-route-sync.c >>>> index cfea7c39b..279020a25 100644 >>>> --- a/northd/en-advertised-route-sync.c >>>> +++ b/northd/en-advertised-route-sync.c >>>> @@ -30,7 +30,8 @@ advertised_route_table_sync( >>>> struct ovsdb_idl_txn *ovnsb_txn, >>>> const struct sbrec_advertised_route_table >>>> *sbrec_advertised_route_table, >>>> const struct lr_stateful_table *lr_stateful_table, >>>> - const struct hmap *parsed_routes, >>>> + const struct hmap *routes, >>>> + const struct hmap *dynamic_routes, >>>> struct advertised_route_sync_data *data); >>>> >>>> bool >>>> @@ -141,6 +142,8 @@ en_advertised_route_sync_run(struct engine_node >>>> *node, void *data OVS_UNUSED) >>>> struct advertised_route_sync_data *routes_sync_data = data; >>>> struct routes_data *routes_data >>>> = engine_get_input_data("routes", node); >>>> + struct dynamic_routes_data *dynamic_routes_data >>>> + = engine_get_input_data("dynamic_routes", node); >>>> const struct engine_context *eng_ctx = engine_get_context(); >>>> const struct sbrec_advertised_route_table >>>> *sbrec_advertised_route_table = >>>> EN_OVSDB_GET(engine_get_input("SB_advertised_route", >>>> node)); >>>> @@ -153,12 +156,84 @@ en_advertised_route_sync_run(struct >>>> engine_node *node, void *data OVS_UNUSED) >>>> sbrec_advertised_route_table, >>>> &lr_stateful_data->table, >>>> &routes_data->parsed_routes, >>>> + &dynamic_routes_data- >>>>> parsed_routes, >>>> routes_sync_data); >>>> >>>> stopwatch_stop(ADVERTISED_ROUTE_SYNC_RUN_STOPWATCH_NAME, >>>> time_msec()); >>>> engine_set_node_state(node, EN_UPDATED); >>>> } >>>> >>>> +void >>>> +*en_dynamic_routes_init(struct engine_node *node OVS_UNUSED, >>>> + struct engine_arg *arg OVS_UNUSED) >>>> +{ >>>> + struct dynamic_routes_data *data = xmalloc(sizeof *data); >>>> + *data = (struct dynamic_routes_data) { >>>> + .parsed_routes = HMAP_INITIALIZER(&data->parsed_routes), >>>> + }; >>>> + >>>> + return data; >>>> +} >>>> + >>>> +static void >>>> +en_dynamic_routes_clean(struct dynamic_routes_data *data) >>>> +{ >>>> + struct parsed_route *r; >>>> + HMAP_FOR_EACH_POP (r, key_node, &data->parsed_routes) { >>>> + parsed_route_free(r); >>>> + } >>>> +} >>>> +void >>>> +en_dynamic_routes_cleanup(void *data_) >>>> +{ >>>> + struct dynamic_routes_data *data = data_; >>>> + >>>> + en_dynamic_routes_clean(data); >>>> + hmap_destroy(&data->parsed_routes); >>>> +} >>>> + >>>> +void >>>> +en_dynamic_routes_run(struct engine_node *node, void *data) >>>> +{ >>>> + struct dynamic_routes_data *dynamic_routes_data = data; >>>> + struct northd_data *northd_data = >>>> engine_get_input_data("northd", node); >>>> + struct ed_type_lr_stateful *lr_stateful_data = >>>> + engine_get_input_data("lr_stateful", node); >>>> + >>>> + en_dynamic_routes_clean(data); >>>> + const struct lr_stateful_record *lr_stateful_rec; >>>> + HMAP_FOR_EACH (lr_stateful_rec, key_node, >>>> + &lr_stateful_data->table.entries) { >>>> + const struct ovn_datapath *od = >>>> + ovn_datapaths_find_by_index(&northd_data- >>>>> lr_datapaths, >>>> + lr_stateful_rec- >>>>> lr_index); >>>> + if (!od->dynamic_routing) { >>>> + continue; >>>> + } >>>> + build_nat_parsed_routes(od, lr_stateful_rec->lrnat_rec, >>>> + &dynamic_routes_data- >>>>> parsed_routes); >>>> + build_nat_connected_parsed_routes(od, &lr_stateful_data- >>>>> table, >>>> + &dynamic_routes_data- >>>>> parsed_routes); >>>> + >>>> + build_lb_parsed_routes(od, lr_stateful_rec->lb_ips, >>>> + &dynamic_routes_data- >>>>> parsed_routes); >>>> + build_lb_connected_parsed_routes(od, &lr_stateful_data- >>>>> table, >>>> + &dynamic_routes_data- >>>>> parsed_routes); >>>> + } >>>> + engine_set_node_state(node, EN_UPDATED); >>>> +} >>>> + >>>> +bool >>>> +dynamic_routes_lr_stateful_handler(struct engine_node *node >>>> OVS_UNUSED, >>>> + void *data OVS_UNUSED) >>>> +{ >>>> + /* XXX: Incremental processing of dynamic routes for stateful >>>> + * configuration changes is not yet supported. Return false >>>> and >>>> + * trigger a recomputation.*/ >>>> + return false; >>>> +} >>> >>> I am not sure if we should have a function that always returns false. >>> Maybe it would be more obvious if engine_add_input just gets NULL as >>> a >>> handler? >> >> I don't have strong opinion on this, but the function gives us a good >> place to store comment about eventually implementing I-P :D >> > > That's true, however, I think it's better if we pass NULL as handler. > Like that we have a single place where we can easily tell which input > handlers are not implemented yet. I'd also say, let's add a TODO.rst > item for this too, we have one for Learned Routes. > >>> >>>> + >>>> + >>>> struct ar_entry { >>>> struct hmap_node hmap_node; >>>> >>>> @@ -333,12 +408,84 @@ publish_host_routes(struct hmap *sync_routes, >>>> } >>>> } >>>> >>>> +static void >>>> +advertised_route_table_sync_route_add( >>>> + const struct lr_stateful_table *lr_stateful_table, >>>> + struct advertised_route_sync_data *data, >>>> + struct uuidset *host_route_lrps, >>>> + struct hmap *sync_routes, >>>> + const struct parsed_route *route) >>>> +{ >>>> + if (route->is_discard_route) { >>>> + return; >>>> + } >>>> + if (prefix_is_link_local(&route->prefix, route->plen)) { >>>> + return; >>>> + } >>>> + if (!route->od->dynamic_routing) { >>>> + return; >>>> + } >>>> + >>>> + enum dynamic_routing_redistribute_mode drr = >>>> + route->out_port->dynamic_routing_redistribute; >>>> + if (route->source == ROUTE_SOURCE_CONNECTED) { >>>> + if (!drr_mode_CONNECTED_is_set(drr)) { >>>> + return; >>>> + } >>>> + /* If we advertise host routes, we only need to do so once >>>> per >>>> + * LRP. */ >>>> + const struct uuid *lrp_uuid = &route->out_port->nbrp- >>>>> header_.uuid; >>>> + if (drr_mode_CONNECTED_AS_HOST_is_set(drr) && >>>> + !uuidset_contains(host_route_lrps, lrp_uuid)) { >>>> + uuidset_insert(host_route_lrps, lrp_uuid); >>>> + publish_host_routes(sync_routes, lr_stateful_table, >>>> route, data); >>>> + return; >>>> + } >>>> + } >>>> + if (route->source == ROUTE_SOURCE_STATIC && >>>> !drr_mode_STATIC_is_set(drr)) { >>>> + return; >>>> + } >>>> + if (route->source == ROUTE_SOURCE_NAT) { >>>> + if (!drr_mode_NAT_is_set(drr)) { >>>> + return; >>>> + } >>>> + /* If NAT route tracks port on a different DP than the one >>>> that >>>> + * advertises the route, we need to watch for changes on >>>> that DP as >>>> + * well. */ >>>> + if (route->tracked_port && route->tracked_port->od != >>>> route->od) { >>>> + uuidset_insert(&data->nb_lr, >>>> + &route->tracked_port->od->nbr- >>>>> header_.uuid); >>>> + } >>>> + } >>>> + if (route->source == ROUTE_SOURCE_LB) { >>>> + if (!drr_mode_LB_is_set(drr)) { >>>> + return; >>>> + } >>>> + /* If LB route tracks port on a different DP than the one >>>> that >>>> + * advertises the route, we need to watch for changes on >>>> that DP as >>>> + * well. */ >>>> + if (route->tracked_port && route->tracked_port->od != >>>> route->od) { >>>> + uuidset_insert(&data->nb_lr, >>>> + &route->tracked_port->od->nbr- >>>>> header_.uuid); >>>> + } >>>> + } >>>> + >>>> + char *ip_prefix = normalize_v46_prefix(&route->prefix, route- >>>>> plen); >>>> + const struct sbrec_port_binding *tracked_port = NULL; >>>> + if (route->tracked_port) { >>>> + tracked_port = route->tracked_port->sb; >>>> + } >>>> + ar_add_entry(sync_routes, route->od->sb, route->out_port->sb, >>>> + ip_prefix, tracked_port); >>>> +} >>>> + >>>> static void >>>> advertised_route_table_sync( >>>> struct ovsdb_idl_txn *ovnsb_txn, >>>> const struct sbrec_advertised_route_table >>>> *sbrec_advertised_route_table, >>>> const struct lr_stateful_table *lr_stateful_table, >>>> - const struct hmap *parsed_routes, >>>> + const struct hmap *routes, >>>> + const struct hmap *dynamic_routes, >>>> struct advertised_route_sync_data *data) >>>> { >>>> struct hmap sync_routes = HMAP_INITIALIZER(&sync_routes); >>>> @@ -346,47 +493,24 @@ advertised_route_table_sync( >>>> const struct parsed_route *route; >>>> >>>> struct ar_entry *route_e; >>>> - const struct sbrec_advertised_route *sb_route; >>>> - HMAP_FOR_EACH (route, key_node, parsed_routes) { >>>> - if (route->is_discard_route) { >>>> - continue; >>>> - } >>>> - if (prefix_is_link_local(&route->prefix, route->plen)) { >>>> - continue; >>>> - } >>>> - if (!route->od->dynamic_routing) { >>>> - continue; >>>> - } >>>> >>>> - enum dynamic_routing_redistribute_mode drr = >>>> - route->out_port->dynamic_routing_redistribute; >>>> - if (route->source == ROUTE_SOURCE_CONNECTED) { >>>> - if (!drr_mode_CONNECTED_is_set(drr)) { >>>> - continue; >>>> - } >>>> - /* If we advertise host routes, we only need to do so >>>> once per >>>> - * LRP. */ >>>> - const struct uuid *lrp_uuid = >>>> - &route->out_port->nbrp->header_.uuid; >>>> - if (drr_mode_CONNECTED_AS_HOST_is_set(drr) && >>>> - !uuidset_contains(&host_route_lrps, lrp_uuid)) { >>>> - uuidset_insert(&host_route_lrps, lrp_uuid); >>>> - publish_host_routes(&sync_routes, >>>> lr_stateful_table, >>>> - route, data); >>>> - continue; >>>> - } >>>> - } >>>> - if (route->source == ROUTE_SOURCE_STATIC && >>>> - !drr_mode_STATIC_is_set(drr)) { >>>> - continue; >>>> - } >>>> - >>>> - char *ip_prefix = normalize_v46_prefix(&route->prefix, >>>> route->plen); >>>> - route_e = ar_add_entry(&sync_routes, route->od->sb, >>>> - route->out_port->sb, ip_prefix, >>>> NULL); >>>> + /* First build the set of non-dynamic routes that need sync- >>>> ing. */ >>>> + HMAP_FOR_EACH (route, key_node, routes) { >>>> + advertised_route_table_sync_route_add(lr_stateful_table, >>>> + data, >>>> &host_route_lrps, >>>> + &sync_routes, >>>> + route); >>>> + } >>>> + /* Then add the set of dynamic routes that need sync-ing. */ >>>> + HMAP_FOR_EACH (route, key_node, dynamic_routes) { >>>> + advertised_route_table_sync_route_add(lr_stateful_table, >>>> + data, >>>> &host_route_lrps, >>>> + &sync_routes, >>>> + route); >>>> } >>>> uuidset_destroy(&host_route_lrps); >>>> >>>> + const struct sbrec_advertised_route *sb_route; >>>> SBREC_ADVERTISED_ROUTE_TABLE_FOR_EACH_SAFE (sb_route, >>>> >>>> sbrec_advertised_route_table) { >>>> route_e = ar_find(&sync_routes, sb_route->datapath, >>>> diff --git a/northd/en-advertised-route-sync.h b/northd/en- >>>> advertised-route-sync.h >>>> index 1f24fd329..e18b75643 100644 >>>> --- a/northd/en-advertised-route-sync.h >>>> +++ b/northd/en-advertised-route-sync.h >>>> @@ -36,4 +36,8 @@ void *en_advertised_route_sync_init(struct >>>> engine_node *, struct engine_arg *); >>>> void en_advertised_route_sync_cleanup(void *data); >>>> void en_advertised_route_sync_run(struct engine_node *, void >>>> *data); >>>> >>>> +bool dynamic_routes_lr_stateful_handler(struct engine_node *, void >>>> *data); >>>> +void *en_dynamic_routes_init(struct engine_node *, struct >>>> engine_arg *); >>>> +void en_dynamic_routes_cleanup(void *data); >>>> +void en_dynamic_routes_run(struct engine_node *, void *data); >>>> #endif /* EN_ADVERTISED_ROUTE_SYNC_H */ >>>> diff --git a/northd/en-learned-route-sync.c b/northd/en-learned- >>>> route-sync.c >>>> index 406f1551f..4e87b3265 100644 >>>> --- a/northd/en-learned-route-sync.c >>>> +++ b/northd/en-learned-route-sync.c >>>> @@ -181,7 +181,8 @@ parse_route_from_sbrec_route(struct hmap >>>> *parsed_routes_out, >>>> >>>> parsed_route_add(od, nexthop, &prefix, plen, false, >>>> lrp_addr_s, >>>> out_port, 0, false, false, NULL, >>>> - ROUTE_SOURCE_LEARNED, &route->header_, >>>> parsed_routes_out); >>>> + ROUTE_SOURCE_LEARNED, &route->header_, NULL, >>>> + parsed_routes_out); >>>> } >>>> >>>> static void >>>> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c >>>> index 438daf1c6..1bdc10e6a 100644 >>>> --- a/northd/inc-proc-northd.c >>>> +++ b/northd/inc-proc-northd.c >>>> @@ -175,6 +175,7 @@ static ENGINE_NODE(multicast_igmp, >>>> "multicast_igmp"); >>>> static ENGINE_NODE(acl_id, "acl_id"); >>>> static ENGINE_NODE(advertised_route_sync, >>>> "advertised_route_sync"); >>>> static ENGINE_NODE(learned_route_sync, "learned_route_sync"); >>>> +static ENGINE_NODE(dynamic_routes, "dynamic_routes"); >>>> >>>> void inc_proc_northd_init(struct ovsdb_idl_loop *nb, >>>> struct ovsdb_idl_loop *sb) >>>> @@ -289,7 +290,12 @@ void inc_proc_northd_init(struct >>>> ovsdb_idl_loop *nb, >>>> engine_add_input(&en_ecmp_nexthop, &en_sb_mac_binding, >>>> ecmp_nexthop_mac_binding_handler); >>>> >>>> + engine_add_input(&en_dynamic_routes, &en_lr_stateful, >>>> + dynamic_routes_lr_stateful_handler); >>>> + engine_add_input(&en_dynamic_routes, &en_northd, >>>> engine_noop_handler); >>>> + >>>> engine_add_input(&en_advertised_route_sync, &en_routes, NULL); >>>> + engine_add_input(&en_advertised_route_sync, >>>> &en_dynamic_routes, NULL); >>>> engine_add_input(&en_advertised_route_sync, >>>> &en_sb_advertised_route, >>>> NULL); >>>> engine_add_input(&en_advertised_route_sync, &en_lr_stateful, >>>> diff --git a/northd/northd.c b/northd/northd.c >>>> index 1c9433e96..e9866e7be 100644 >>>> --- a/northd/northd.c >>>> +++ b/northd/northd.c >>>> @@ -848,6 +848,14 @@ parse_dynamic_routing_redistribute( >>>> out |= DRRM_STATIC; >>>> continue; >>>> } >>>> + if (!strcmp(token, "nat")) { >>>> + out |= DRRM_NAT; >>>> + continue; >>>> + } >>>> + if (!strcmp(token, "lb")) { >>>> + out |= DRRM_LB; >>>> + continue; >>>> + } >>>> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, >>>> 1); >>>> VLOG_WARN_RL(&rl, >>>> "unkown dynamic-routing-redistribute option >>>> '%s' on %s", >>>> @@ -11012,6 +11020,7 @@ parsed_route_init(const struct ovn_datapath >>>> *od, >>>> bool ecmp_symmetric_reply, >>>> const struct sset *ecmp_selection_fields, >>>> enum route_source source, >>>> + const struct ovn_port *tracked_port, >>>> const struct ovsdb_idl_row *source_hint) >>>> { >>>> >>>> @@ -11027,6 +11036,7 @@ parsed_route_init(const struct ovn_datapath >>>> *od, >>>> new_pr->is_discard_route = is_discard_route; >>>> new_pr->lrp_addr_s = nullable_xstrdup(lrp_addr_s); >>>> new_pr->out_port = out_port; >>>> + new_pr->tracked_port = tracked_port; >>>> new_pr->source = source; >>>> if (ecmp_selection_fields) { >>>> sset_clone(&new_pr->ecmp_selection_fields, >>>> ecmp_selection_fields); >>>> @@ -11052,7 +11062,7 @@ parsed_route_clone(const struct >>>> parsed_route *pr) >>>> pr->od, nexthop, pr->prefix, pr->plen, pr- >>>>> is_discard_route, >>>> pr->lrp_addr_s, pr->out_port, pr->route_table_id, pr- >>>>> is_src_route, >>>> pr->ecmp_symmetric_reply, &pr->ecmp_selection_fields, pr- >>>>> source, >>>> - pr->source_hint); >>>> + pr->tracked_port, pr->source_hint); >>>> >>>> new_pr->hash = pr->hash; >>>> return new_pr; >>>> @@ -11095,13 +11105,14 @@ parsed_route_add(const struct >>>> ovn_datapath *od, >>>> const struct sset *ecmp_selection_fields, >>>> enum route_source source, >>>> const struct ovsdb_idl_row *source_hint, >>>> + const struct ovn_port *tracked_port, >>>> struct hmap *routes) >>>> { >>>> >>>> struct parsed_route *new_pr = parsed_route_init( >>>> od, nexthop, *prefix, plen, is_discard_route, lrp_addr_s, >>>> out_port, >>>> route_table_id, is_src_route, ecmp_symmetric_reply, >>>> - ecmp_selection_fields, source, source_hint); >>>> + ecmp_selection_fields, source, tracked_port, source_hint); >>>> >>>> new_pr->hash = route_hash(new_pr); >>>> >>>> @@ -11238,7 +11249,7 @@ parsed_routes_add_static(const struct >>>> ovn_datapath *od, >>>> parsed_route_add(od, nexthop, &prefix, plen, is_discard_route, >>>> lrp_addr_s, >>>> out_port, route_table_id, is_src_route, >>>> ecmp_symmetric_reply, &ecmp_selection_fields, >>>> source, >>>> - &route->header_, routes); >>>> + &route->header_, NULL, routes); >>>> sset_destroy(&ecmp_selection_fields); >>>> } >>>> >>>> @@ -11256,7 +11267,7 @@ parsed_routes_add_connected(const struct >>>> ovn_datapath *od, >>>> false, addr->addr_s, op, >>>> 0, false, >>>> false, NULL, ROUTE_SOURCE_CONNECTED, >>>> - &op->nbrp->header_, routes); >>>> + &op->nbrp->header_, NULL, routes); >>>> } >>>> >>>> for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { >>>> @@ -11268,7 +11279,7 @@ parsed_routes_add_connected(const struct >>>> ovn_datapath *od, >>>> false, addr->addr_s, op, >>>> 0, false, >>>> false, NULL, ROUTE_SOURCE_CONNECTED, >>>> - &op->nbrp->header_, routes); >>>> + &op->nbrp->header_, NULL, routes); >>>> } >>>> } >>>> >>>> @@ -11306,6 +11317,229 @@ build_parsed_routes(const struct >>>> ovn_datapath *od, const struct hmap *lr_ports, >>>> } >>>> } >>>> >>>> +/* This function adds new parsed route for each entry in lr_nat >>>> record >>>> + * to "routes". Logical port of the route is set to >>>> "advertising_op" and >>>> + * tracked port is set to NAT's distributed gw port. If NAT >>>> doesn't have >>>> + * DGP (for example if it's set on gateway router), no tracked >>>> port will >>>> + * be set.*/ >>>> +static void >>>> +build_nat_parsed_route_for_port(const struct ovn_port >>>> *advertising_op, >>>> + const struct lr_nat_record >>>> *lr_nat, >>>> + struct hmap *routes) >>>> +{ >>>> + const struct ovn_datapath *advertising_od = advertising_op- >>>>> od; >>>> + >>>> + for (size_t i = 0; i < lr_nat->n_nat_entries; i++) { >>>> + const struct ovn_nat *nat = &lr_nat->nat_entries[i]; >>>> + int plen = nat_entry_is_v6(nat) ? 128 : 32; >>>> + struct in6_addr prefix; >>>> + ip46_parse(nat->nb->external_ip, &prefix); >>>> + parsed_route_add(advertising_od, NULL, &prefix, plen, >>>> false, >>>> + nat->nb->external_ip, advertising_op, 0, >>>> false, >>>> + false, NULL, ROUTE_SOURCE_NAT, &nat->nb- >>>>> header_, >>>> + nat->l3dgw_port, routes); >>> >>> I am thinking about if in the case of a distributed dnat_and_snat NAT >>> the tracked_port should be logical_port. >>> But it is late so i am not really sure. >>> Also it is something that can easily be changed later when there is a >>> need. >> >> I could swear we had that. >> Anyway, I was also manually testing different topologies, and I found >> out that we actually need to advertise tracked_port even for GW >> routers. The reason being that when two GW routers are connected via LS >> and both of them redistribute NAT, when you add NAT on one of the >> routers, both will announce it with the same metric. >> >> I'll roll these changes together, they should be minor. >> > > +1, let's also add a test for the distributed NAT case, please. > >>> >>>> + } >>>> +} >>>> + >>>> +/* Generate parsed routes for NAT external IPs in lr_nat, for each >>>> ovn port >>>> + * in "od" that has enabled redistribution of NAT adresses.*/ >>>> +void >>>> +build_nat_parsed_routes(const struct ovn_datapath *od, >>>> + const struct lr_nat_record *lr_nat, >>>> + struct hmap *routes) >>>> +{ >>>> + const struct ovn_port *op; >>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) { >>>> + if (!drr_mode_NAT_is_set(op- >>>>> dynamic_routing_redistribute)) { >>>> + continue; >>>> + } >>>> + >>>> + build_nat_parsed_route_for_port(op, lr_nat, routes); >>>> + } >>>> +} >>>> + >>>> +/* Similar to build_nat_parsed_routes, this function generates >>>> parsed routes >>>> + * for nat records in neighboring routers. For each ovn port in >>>> "od" that has >>>> + * enabled redistribution of NAT adresses, look up their neighbors >>>> (either >>>> + * directly routers, or routers connected through common LS) and >>>> advertise >>>> + * thier external NAT IPs too.*/ >>>> +void >>>> +build_nat_connected_parsed_routes( >>>> + const struct ovn_datapath *od, >>>> + const struct lr_stateful_table *lr_stateful_table, >>>> + struct hmap *routes) >>>> +{ >>>> + const struct ovn_port *op; >>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) { >>>> + if (!drr_mode_NAT_is_set(op- >>>>> dynamic_routing_redistribute)) { >>>> + continue; >>>> + } >>>> + >>>> + if (!op->peer) { >>>> + continue; >>>> + } >>>> + >>>> + struct ovn_datapath *peer_od = op->peer->od; >>>> + if (!peer_od->nbs && !peer_od->nbr) { >>>> + continue; >>>> + } >>>> + >>>> + const struct ovn_port *peer_port = NULL; >>>> + /* This is a directly connected LR peer. */ >>>> + if (peer_od->nbr) { >>>> + peer_port = op->peer; >>>> + >>>> + const struct lr_stateful_record *peer_lr_stateful = >>>> + lr_stateful_table_find_by_index(lr_stateful_table, >>>> + peer_od->index); >>>> + if (!peer_lr_stateful) { >>>> + continue; >>>> + } >>>> + >>>> + /* Advertise peer's NAT routes via the local port too. >>>> */ >>>> + build_nat_parsed_route_for_port(op, peer_lr_stateful- >>>>> lrnat_rec, >>>> + routes); >>>> + return; >>>> + } >>>> + >>>> + /* This peer is LSP, we need to check all connected router >>>> ports >>>> + * for NAT.*/ >>>> + for (size_t i = 0; i < peer_od->n_router_ports; i++) { >>>> + peer_port = peer_od->router_ports[i]->peer; >>>> + if (peer_port == op) { >>>> + /* Skip advertising router. */ >>>> + continue; >>>> + } >>>> + >>>> + const struct lr_stateful_record *peer_lr_stateful = >>>> + lr_stateful_table_find_by_index(lr_stateful_table, >>>> + peer_port->od- >>>>> index); >>>> + if (!peer_lr_stateful) { >>>> + continue; >>>> + } >>>> + >>>> + /* Advertise peer's NAT routes via the local port too. >>>> */ >>>> + build_nat_parsed_route_for_port(op, peer_lr_stateful- >>>>> lrnat_rec, >>>> + routes); >>>> + } >>>> + } >>>> +} >>>> + >>>> +/* This function adds new parsed route for each IP in lb_ips to >>>> "routes".*/ >>>> +static void >>>> +build_lb_parsed_route_for_port(const struct ovn_port >>>> *advertising_op, >>>> + const struct ovn_port >>>> *tracked_port, >>>> + const struct ovn_lb_ip_set *lb_ips, >>>> + struct hmap *routes) >>>> +{ >>>> + const struct ovn_datapath *advertising_od = advertising_op- >>>>> od; >>>> + >>>> + const char *ip_address; >>>> + SSET_FOR_EACH (ip_address, &lb_ips->ips_v4) { >>>> + struct in6_addr prefix; >>>> + ip46_parse(ip_address, &prefix); >>>> + parsed_route_add(advertising_od, NULL, &prefix, 32, false, >>>> + ip_address, advertising_op, 0, false, >>>> false, >>>> + NULL, ROUTE_SOURCE_LB, &advertising_op- >>>>> nbrp->header_, >>>> + tracked_port, routes); >>>> + } >>>> + SSET_FOR_EACH (ip_address, &lb_ips->ips_v6) { >>>> + struct in6_addr prefix; >>>> + ip46_parse(ip_address, &prefix); >>>> + parsed_route_add(advertising_od, NULL, &prefix, 128, >>>> false, >>>> + ip_address, advertising_op, 0, false, >>>> false, >>>> + NULL, ROUTE_SOURCE_LB, &advertising_op- >>>>> nbrp->header_, >>>> + tracked_port, routes); >>>> + } >>>> +} >>>> + >>>> +/* Similar to build_lb_parsed_routes, this function generates >>>> parsed routes >>>> + * for LB VIPs of neighboring routers. For each ovn port in "od" >>>> that has >>>> + * enabled redistribution of LB VIPs, look up their neighbors >>>> (either >>>> + * directly routers, or routers connected through common LS) and >>>> advertise >>>> + * thier LB VIPs too.*/ >>> >>> Since this looks extremely similar to >>> build_nat_connected_parsed_routes >>> maybe it would make sense to merge them and just call >>> build_lb_parsed_route_for_port and build_nat_parsed_route_for_port. >>> Then we don't need to iterate over the same thing twice. >>> >>> On the other hand this greatly increases clariy >> >> I'd personally stick with separate functions in this case. There's >> quite a lot of branching already, and the two function have slight >> differences that re (at least to me) easier to follow when the >> functions are separate. >> > > I think I'd prefer separate functions too to be honest. It's not that > much bloat in the end. > >>> >>>> +void >>>> +build_lb_connected_parsed_routes( >>>> + const struct ovn_datapath *od, >>>> + const struct lr_stateful_table *lr_stateful_table, >>>> + struct hmap *routes) >>>> +{ >>>> + const struct ovn_port *op; >>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) { >>>> + if (!drr_mode_LB_is_set(op->dynamic_routing_redistribute)) >>>> { >>>> + continue; >>>> + } >>>> + >>>> + if (!op->peer) { >>>> + continue; >>>> + } >>>> + >>>> + struct ovn_datapath *peer_od = op->peer->od; >>>> + if (!peer_od->nbs && !peer_od->nbr) { >>>> + continue; >>>> + } >>>> + >>>> + const struct lr_stateful_record *lr_stateful_rec; >>>> + const struct ovn_port *peer_port = NULL; >>>> + /* This is directly connected LR peer. */ >>>> + if (peer_od->nbr) { >>>> + lr_stateful_rec = lr_stateful_table_find_by_index( >>>> + lr_stateful_table, peer_od->index); >>>> + peer_port = op->peer; >>>> + build_lb_parsed_route_for_port(op, peer_port, >>>> + lr_stateful_rec- >>>>> lb_ips, routes); >>>> + return; >>>> + } >>>> + >>>> + /* This peer is LSP, we need to check all connected router >>>> ports for >>>> + * LBs.*/ >>>> + for (size_t i = 0; i < peer_od->n_router_ports; i++) { >>>> + peer_port = peer_od->router_ports[i]->peer; >>>> + if (peer_port == op) { >>>> + /* no need to check for LBs on ovn_port that >>>> initiated this >>>> + * function.*/ >>>> + continue; >>>> + } >>>> + lr_stateful_rec = lr_stateful_table_find_by_index( >>>> + lr_stateful_table, peer_port->od->index); >>>> + >>>> + build_lb_parsed_route_for_port(op, peer_port, >>>> + lr_stateful_rec- >>>>> lb_ips, routes); >>>> + } >>>> + } >>>> +} >>>> + >>>> +void >>>> +build_lb_parsed_routes(const struct ovn_datapath *od, >>>> + const struct ovn_lb_ip_set *lb_ips, >>>> + struct hmap *routes) >>>> +{ >>>> + const struct ovn_port *op; >>>> + HMAP_FOR_EACH (op, dp_node, &od->ports) { >>>> + if (!drr_mode_LB_is_set(op->dynamic_routing_redistribute)) >>>> { >>>> + continue; >>>> + } >>>> + >>>> + /* Traffic processed by a load balancer is: >>>> + * - handled by the chassis where a gateway router is >>>> bound >>>> + * OR >>>> + * - always redirected to a distributed gateway router >>>> port >>>> + * >>>> + * Advertise the LB IPs via all 'op' if this is a gateway >>>> router or >>>> + * throuh all DGPs of this distributed router otherwise. >>>> */ >>>> + struct ovn_port *op_ = NULL; >>>> + size_t n_tracked_ports = !od->is_gw_router ? od- >>>>> n_l3dgw_ports : 1; >>>> + struct ovn_port **tracked_ports = !od->is_gw_router >>>> + ? od->l3dgw_ports >>>> + : &op_; >>>> + >>>> + for (size_t i = 0; i < n_tracked_ports; i++) { >>>> + build_lb_parsed_route_for_port(op, tracked_ports[i], >>>> lb_ips, >>>> + routes); >>> >>> I honestly never really thought about a case where we want to >>> advertise >>> the same route with multiple tracked_ports. >>> But i guess there is nothing speaking against it :) >> >> The automagical handling of metrics based on the tracked port is quite >> convenient :) >> > > Yes, if I'm not wrong it was Frode who said that this is a sign that the > schema was designed properly. :) > >>> >>>> + } >>>> + } >>>> + >>>> +} >>>> struct ecmp_route_list_node { >>>> struct ovs_list list_node; >>>> uint16_t id; /* starts from 1 */ >>>> @@ -11480,6 +11714,8 @@ route_source_to_offset(enum route_source >>>> source) >>>> return ROUTE_PRIO_OFFSET_STATIC; >>>> case ROUTE_SOURCE_LEARNED: >>>> return ROUTE_PRIO_OFFSET_LEARNED; >>>> + case ROUTE_SOURCE_NAT: >>>> + case ROUTE_SOURCE_LB: >>>> default: >>>> OVS_NOT_REACHED(); >>>> } >>>> diff --git a/northd/northd.h b/northd/northd.h >>>> index b984e124d..a767fd834 100644 >>>> --- a/northd/northd.h >>>> +++ b/northd/northd.h >>>> @@ -186,11 +186,15 @@ struct route_policy { >>>> }; >>>> >>>> struct routes_data { >>>> - struct hmap parsed_routes; >>>> + struct hmap parsed_routes; /* Stores struct parsed_route. */ >>>> struct simap route_tables; >>>> struct hmap bfd_active_connections; >>>> }; >>>> >>>> +struct dynamic_routes_data { >>>> + struct hmap parsed_routes; /* Stores struct parsed_route. */ >>>> +}; >>>> + >>>> struct route_policies_data { >>>> struct hmap route_policies; >>>> struct hmap bfd_active_connections; >>>> @@ -308,10 +312,12 @@ struct mcast_port_info { >>>> * (e.g., IGMP join/leave). */ >>>> }; >>>> >>>> -#define DRR_MODES \ >>>> - DRR_MODE(CONNECTED, 0) \ >>>> +#define DRR_MODES \ >>>> + DRR_MODE(CONNECTED, 0) \ >>>> DRR_MODE(CONNECTED_AS_HOST, 1) \ >>>> - DRR_MODE(STATIC, 2) >>>> + DRR_MODE(STATIC, 2) \ >>>> + DRR_MODE(NAT, 3) \ >>>> + DRR_MODE(LB, 4) >>>> >>>> enum dynamic_routing_redistribute_mode_bits { >>>> #define DRR_MODE(PROTOCOL, BIT) DRRM_##PROTOCOL##_BIT = BIT, >>>> @@ -746,6 +752,10 @@ enum route_source { >>>> ROUTE_SOURCE_STATIC, >>>> /* The route is dynamically learned by an ovn-controller. */ >>>> ROUTE_SOURCE_LEARNED, >>>> + /* The route is derived from a NAT's external IP. */ >>>> + ROUTE_SOURCE_NAT, >>>> + /* The route is derived from a LB's VIP. */ >>>> + ROUTE_SOURCE_LB, >>>> }; >>>> >>>> struct parsed_route { >>>> @@ -765,6 +775,7 @@ struct parsed_route { >>>> const struct ovsdb_idl_row *source_hint; >>>> char *lrp_addr_s; >>>> const struct ovn_port *out_port; >>>> + const struct ovn_port *tracked_port; /* May be NULL. */ >>>> }; >>>> >>>> struct parsed_route *parsed_route_clone(const struct parsed_route >>>> *); >>>> @@ -784,6 +795,7 @@ void parsed_route_add(const struct ovn_datapath >>>> *od, >>>> const struct sset *ecmp_selection_fields, >>>> enum route_source source, >>>> const struct ovsdb_idl_row *source_hint, >>>> + const struct ovn_port *tracked_port, >>>> struct hmap *routes); >>>> >>>> bool >>>> @@ -818,6 +830,18 @@ void route_policies_destroy(struct >>>> route_policies_data *); >>>> void build_parsed_routes(const struct ovn_datapath *, const struct >>>> hmap *, >>>> const struct hmap *, struct hmap *, >>>> struct simap *, >>>> struct hmap *); >>>> +void build_nat_parsed_routes(const struct ovn_datapath *, >>>> + const struct lr_nat_record *, >>>> + struct hmap *); >>>> +void build_nat_connected_parsed_routes(const struct ovn_datapath >>>> *, >>>> + const struct >>>> lr_stateful_table *, >>>> + struct hmap *routes); >>>> +void build_lb_parsed_routes(const struct ovn_datapath *, >>>> + const struct ovn_lb_ip_set *, >>>> + struct hmap *); >>>> +void build_lb_connected_parsed_routes(const struct ovn_datapath *, >>>> + const struct >>>> lr_stateful_table *, >>>> + struct hmap *routes); >>>> uint32_t get_route_table_id(struct simap *, const char *); >>>> void routes_init(struct routes_data *); >>>> void routes_destroy(struct routes_data *); >>>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>>> index 20d30dd58..bd3d2bf3e 100644 >>>> --- a/ovn-nb.xml >>>> +++ b/ovn-nb.xml >>>> @@ -3101,6 +3101,24 @@ or >>>> <ref table="Advertised_Route" db="OVN_Southbound"/> >>>> table. >>>> </p> >>>> >>>> + <p> >>>> + If <code>nb</code> is in the list then northd will >>>> create entries in >>> >>> I guess this is a typo and should be "<code>lb</code>"? >> >> Thanks for catching this, indeed a typo. >>> >>>> + <ref table="Advertised_Route" db="OVN_Southbound"/> >>>> table for each >>>> + Load Balancer VIP on this router and it's neighboring >>>> routers. >>>> + Neighboring routers are those that are either directly >>>> connected, >>>> + via Logical Router Port, or those that are connected via >>>> shared >>>> + Logical Switch. >>>> + </p> >>>> + >>>> + <p> >>>> + If <code>nat</code> is in the list then northd will >>>> create entries in >>>> + <ref table="Advertised_Route" db="OVN_Southbound"/> >>>> table for each >>>> + NAT's external IP on this router and it's neighboring >>>> routers. >>>> + Neighboring routers are those that are either directly >>>> connected, >>>> + via Logical Router Port, or those that are connected via >>>> shared >>>> + Logical Switch. >>>> + </p> >>>> + >>>> <p> >>>> This value can be overwritten on a per LRP basis using >>>> <ref column="options" key="dynamic-routing-redistribute" >>>> @@ -3950,6 +3968,24 @@ or >>>> <ref table="Advertised_Route" db="OVN_Southbound"/> >>>> table. >>>> </p> >>>> >>>> + <p> >>>> + If <code>nb</code> is in the list then northd will >>>> create entries in >>> >>> here too. >>> >>> Thanks a lot for your work, >>> Felix >> >> Thanks for the review Felix. >> > > Indeed, thanks for the careful review! I have a few more comments on > the ovn-northd.at tests though, please see below. > >>> >>>> + <ref table="Advertised_Route" db="OVN_Southbound"/> >>>> table for each >>>> + Load Balancer VIP on this port's router, and it's >>>> neighboring >>>> + routers. Neighboring routers are those that are either >>>> directly >>>> + connected to this Logical Router Port, or those that are >>>> connected >>>> + via shared Logical Switch. >>>> + </p> >>>> + >>>> + <p> >>>> + If <code>nat</code> is in the list then northd will >>>> create entries in >>>> + <ref table="Advertised_Route" db="OVN_Southbound"/> >>>> table for each >>>> + NAT's external IP on this port's router, and it's >>>> neighboring >>>> + routers. Neighboring routers are those that are either >>>> directly >>>> + connected to this Logical Router Port, or those that are >>>> connected >>>> + via shared Logical Switch. >>>> + </p> >>>> + >>>> <p> >>>> If not set the value from <ref column="options" >>>> key="dynamic-routing-redistribute" >>>> table="Logical_Router"/> on the >>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at >>>> index 64991ff75..c59512b29 100644 >>>> --- a/tests/ovn-northd.at >>>> +++ b/tests/ovn-northd.at >>>> @@ -15542,3 +15542,604 @@ CHECK_NO_CHANGE_AFTER_RECOMPUTE >>>> >>>> AT_CLEANUP >>>> ]) > > For the tests in ovn-northd.at, I think we should: > - remove unneeded DB table dumps > - reindent ovn-nbctl commands > - fix comments > - use fetch_column/check_column/check_row_count instead of bare AT_CHECKs. > > I was trying to cover the review comments (and taking care of the test > nits) here: > https://github.com/dceara/ovn/commit/0785375 > > If it looks OK to you, feel free to use it in v7. > That commit doesn't add a test for the distributed NAT scenario though. >
Actually, I couldn't help it and wrote the distributed NAT test and it uncovered a bug in my incremental commit above. The distributed NAT test and the fix for the bug are available here: https://github.com/dceara/ovn/commit/119d849 The actual dev branch is: https://github.com/dceara/ovn/commits/refs/heads/tmp-bgp-lb-nat-advertise-v6/ The last two commits there can be squashed in patch 4/4 if they look OK to you too. Regards, Dumitru _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
