On 22.01.2026 00:17, [email protected] wrote: > On Mon, 2026-01-19 at 12:23 +0300, Alexandra Rukomoinikova wrote: >> 1) Added new option "distributed" for load balancers. >> With this feature, balancers will work in distributed mode across >> compute nodes, >> balancing only to local backends (excluding east-west traffic) >> >> 2) If a load balancer is running on a router with a DGP, the router will >> no >> longer be centralized on the gateway - this means that access to the >> physical network will also be available from hosts where the >> distributed balancer backends are located. >> >> 3) Configuration requirements for distributed load balancers: >> 1) ip_port_mapping must be specified >> 2) Balancing in the underlay fabric between hosts with backends >> >> Example: >> Load Balancer: lb1 with VIP 1.1.1.1 and distributed option enabled. >> Fabric is configured with a static ECMP route for 1.1.1.1/32: >> nexthop via ip_host1 weight 1 (hosts backend1) >> nexthop via ip_host2 weight 1 (hosts backend2) >> nexthop via ip_host3 weight 2 (hosts backend3 and backend4) >> >> As part of testing, the following estimates of the distribution of requests >> to balancers were obtained: >> for i in $(seq 5000); do curl http://1.1.1.1:80 2>/dev/null ; echo ; >> done | awk '{print $2}' | sort | uniq -c >> 1265 "backend 4", >> 1260 "backend 3", >> 1224 "backend 2", >> 1251 "backend 1", >> Thus, requests using ECMP balancing are distributed between backends >> approximately evenly. 
>> >> Suggested-by: Vladislav Odintsov <[email protected]> >> Signed-off-by: Alexandra Rukomoinikova <[email protected]> >> --- >> v3 --> v4: 1) fix ARP receiving for distributed load balancing since >> merging 01277be6c833ef1d1c3fb568c139d3f0ac2b363a >> 2) remove system tests because they turned out to be useless, added >> new multinode tests for the use case >> 3) updated ovn-sb doc for the new action >> 4) updated doc about the distributed option >> 5) rebased on >> https://github.com/ovn-org/ovn-fake-multinode/pull/114 > Hi Alexandra, > thank you for the v4. The multinode tests convey the topology much > better. Overall I'd say it LGTM, though I'm not super confident in the > area of incremental processing, so an additional +1 from someone else > would be nice. I'll leave some in-line comments. Hi Martin! Thank you for your time. I'll reply below. > > I also tried the topology that we discussed in the last version[0] > (Distributed LR connected to fabric via GW routers on each chassis) and > I found that the load-balancing gets concentrated only on the chassis > that hosts the chassis-redirect port for the DGP. > I feel like we should be able to avoid sending the traffic of the > distributed load balancer to the CR port, I just haven't found the > right tweak to do so. What do you think? > > NOTE: I don't think that this scenario needs to be supported for this > patch to get accepted, but it would be nice if it did, because AFAIK > this is (more or less) the topology of the Openstack+OVN BGP > implementation. > > [0] https://gist.github.com/mkalcok/7bc9af3f79969067bc1cea01841f6f47 > Yes, I'll fix this feature for this topology in a separate patch series. Overall, not sending traffic to the gateway seems like the right approach. I'll think about it again and submit a patch. 
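For readers following along, here is a minimal configuration sketch of the feature as described in the commit message. All names and addresses (lb1, lr0, lsp-backend1, 10.0.0.x) are hypothetical, and the relaxed ip_port_mappings syntax (src_ip optional without health checks) is the one this patch proposes, so treat this as illustrative rather than authoritative:

```
# Create a load balancer and enable the new per-chassis mode
# proposed by this patch (option name "distributed" is from the patch).
ovn-nbctl lb-add lb1 1.1.1.1:80 "10.0.0.10:80,10.0.0.11:80"
ovn-nbctl set load_balancer lb1 options:distributed=true

# Map each backend IP to the logical port hosting it, so northd
# can emit ct_lb_mark_local with per-backend locality information.
# src_ip may be omitted here only because no health checks are set.
ovn-nbctl set load_balancer lb1 ip_port_mappings:10.0.0.10=lsp-backend1
ovn-nbctl set load_balancer lb1 ip_port_mappings:10.0.0.11=lsp-backend2

# Attach the load balancer to the router; ECMP routes for 1.1.1.1/32
# toward each chassis are configured in the underlay fabric, as in
# the commit message example.
ovn-nbctl lr-lb-add lr0 lb1
```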
>> --- >> NEWS | 3 + >> northd/en-lb-data.c | 9 + >> northd/en-lb-data.h | 3 + >> northd/en-lr-stateful.c | 3 + >> northd/en-lr-stateful.h | 2 + >> northd/lb.c | 91 ++++++---- >> northd/lb.h | 7 +- >> northd/northd.c | 197 ++++++++++++++------- >> northd/northd.h | 17 ++ >> ovn-nb.xml | 16 +- >> ovn-sb.xml | 11 ++ >> tests/multinode-macros.at | 14 ++ >> tests/multinode.at | 159 +++++++++++++++++ >> tests/ovn-northd.at | 364 >> ++++++++++++++++++++++++++++++++++++++ >> 14 files changed, 791 insertions(+), 105 deletions(-) >> >> diff --git a/NEWS b/NEWS >> index 9883fb81d..4d49b94f1 100644 >> --- a/NEWS >> +++ b/NEWS >> @@ -82,6 +82,9 @@ Post v25.09.0 >> other_config column. >> - Introduce the capability to specify multiple ips for ovn-evpn- >> local-ip >> option. >> + - Add "distributed" option for load balancers, which forces >> traffic to be >> + routed only to backend instances running locally on the same >> chassis >> + it arrives on. >> >> OVN v25.09.0 - xxx xx xxxx >> -------------------------- >> diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c >> index 6d52d465e..f67ab9c5f 100644 >> --- a/northd/en-lb-data.c >> +++ b/northd/en-lb-data.c >> @@ -166,6 +166,7 @@ lb_data_load_balancer_handler(struct engine_node >> *node, void *data) >> add_crupdated_lb_to_tracked_data(lb, trk_lb_data, >> lb->health_checks); >> trk_lb_data->has_routable_lb |= lb->routable; >> + trk_lb_data->has_distributed_lb |= lb->is_distributed; >> continue; >> } >> >> @@ -180,6 +181,7 @@ lb_data_load_balancer_handler(struct engine_node >> *node, void *data) >> add_deleted_lb_to_tracked_data(lb, trk_lb_data, >> lb->health_checks); >> trk_lb_data->has_routable_lb |= lb->routable; >> + trk_lb_data->has_distributed_lb |= lb->is_distributed; >> } else { >> /* Load balancer updated. 
*/ >> bool health_checks = lb->health_checks; >> @@ -189,11 +191,13 @@ lb_data_load_balancer_handler(struct >> engine_node *node, void *data) >> sset_swap(&lb->ips_v6, &old_ips_v6); >> enum lb_neighbor_responder_mode neigh_mode = lb- >>> neigh_mode; >> bool routable = lb->routable; >> + bool distributed_mode = lb->is_distributed; >> ovn_northd_lb_reinit(lb, tracked_lb); >> health_checks |= lb->health_checks; >> struct crupdated_lb *clb = >> add_crupdated_lb_to_tracked_data( >> lb, trk_lb_data, health_checks); >> trk_lb_data->has_routable_lb |= lb->routable; >> + trk_lb_data->has_distributed_lb |= lb->is_distributed; >> >> /* Determine the inserted and deleted vips and store >> them in >> * the tracked data. */ >> @@ -226,6 +230,10 @@ lb_data_load_balancer_handler(struct engine_node >> *node, void *data) >> /* If neigh_mode is updated trigger a full >> recompute. */ >> return EN_UNHANDLED; >> } >> + if (distributed_mode != lb->is_distributed) { >> + /* If distributed mode is updated trigger a full >> recompute. */ >> + return EN_UNHANDLED; >> + } >> } >> } >> >> @@ -687,6 +695,7 @@ handle_od_lb_changes(struct nbrec_load_balancer >> **nbrec_lbs, >> >> lb_uuid); >> ovs_assert(lb); >> trk_lb_data->has_routable_lb |= lb->routable; >> + trk_lb_data->has_distributed_lb |= lb- >>> is_distributed; >> } >> } >> >> diff --git a/northd/en-lb-data.h b/northd/en-lb-data.h >> index 1da087656..90e85b8c4 100644 >> --- a/northd/en-lb-data.h >> +++ b/northd/en-lb-data.h >> @@ -82,6 +82,9 @@ struct tracked_lb_data { >> >> /* Indicates if any lb (in the tracked data) has 'routable' flag >> set. */ >> bool has_routable_lb; >> + >> + /* Indicates if any lb (in the tracked data) has 'distributed' >> flag set. */ >> + bool has_distributed_lb; >> }; >> >> /* Datapath (logical switch) to lb/lbgrp association data. 
*/ >> diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c >> index 212c0641c..5a738f4c3 100644 >> --- a/northd/en-lr-stateful.c >> +++ b/northd/en-lr-stateful.c >> @@ -326,6 +326,7 @@ lr_stateful_lb_data_handler(struct engine_node >> *node, void *data_) >> ovn_datapaths_find_by_index(input_data.lr_datapaths, >> lr_stateful_rec- >>> lr_index); >> lr_stateful_rec->has_lb_vip = od_has_lb_vip(od); >> + lr_stateful_rec->has_distributed_lb = od- >>> is_distributed; >> } >> >> return EN_HANDLED_UPDATED; >> @@ -527,7 +528,9 @@ lr_stateful_record_create(struct >> lr_stateful_table *table, >> if (nbr->n_nat) { >> lr_stateful_rebuild_vip_nats(lr_stateful_rec); >> } >> + >> lr_stateful_rec->has_lb_vip = od_has_lb_vip(od); >> + lr_stateful_rec->has_distributed_lb = od->is_distributed; >> >> hmap_insert(&table->entries, &lr_stateful_rec->key_node, >> uuid_hash(&lr_stateful_rec->nbr_uuid)); >> diff --git a/northd/en-lr-stateful.h b/northd/en-lr-stateful.h >> index 146f768c3..3b0c54521 100644 >> --- a/northd/en-lr-stateful.h >> +++ b/northd/en-lr-stateful.h >> @@ -59,6 +59,8 @@ struct lr_stateful_record { >> >> bool has_lb_vip; >> >> + bool has_distributed_lb; >> + >> /* Load Balancer vIPs relevant for this datapath. 
*/ >> struct ovn_lb_ip_set *lb_ips; >> >> diff --git a/northd/lb.c b/northd/lb.c >> index 0822bc2d7..bfb255ccd 100644 >> --- a/northd/lb.c >> +++ b/northd/lb.c >> @@ -85,12 +85,12 @@ ovn_lb_ip_set_clone(struct ovn_lb_ip_set >> *lb_ip_set) >> return clone; >> } >> >> -static >> -void ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb, >> - const struct ovn_lb_vip *lb_vip, >> - const struct nbrec_load_balancer >> *nbrec_lb, >> - const char *vip_port_str, const char >> *backend_ips, >> - bool template) >> +static void >> +ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb, >> + const struct ovn_lb_vip *lb_vip, >> + const struct nbrec_load_balancer *nbrec_lb, >> + const char *vip_port_str, const char >> *backend_ips, >> + bool template) >> { >> lb_vip_nb->backend_ips = xstrdup(backend_ips); >> lb_vip_nb->n_backends = vector_len(&lb_vip->backends); >> @@ -101,19 +101,22 @@ void ovn_northd_lb_vip_init(struct >> ovn_northd_lb_vip *lb_vip_nb, >> } >> >> /* >> - * Initializes health check configuration for load balancer VIP >> - * backends. Parses the ip_port_mappings in the format : >> - * "ip:logical_port:src_ip[:az_name]". >> + * Parses ip_port_mappings in the format: >> + * "ip:logical_port[:src_ip][:az_name]". >> + * The src_ip parameter is optional when distributed mode is enabled >> + * and no health checks are configured. >> * If az_name is present and non-empty, it indicates this is a >> * remote service monitor (backend is in another availability zone), >> * it should be propogated to another AZ by interconnection >> processing. 
>> */ >> static void >> -ovn_lb_vip_backends_health_check_init(const struct ovn_northd_lb >> *lb, >> - const struct ovn_lb_vip >> *lb_vip, >> - struct ovn_northd_lb_vip >> *lb_vip_nb) >> +ovn_lb_vip_backends_ip_port_mappings_init(const struct ovn_northd_lb >> *lb, >> + const struct ovn_lb_vip >> *lb_vip, >> + struct ovn_northd_lb_vip >> *lb_vip_nb) >> { >> struct ds key = DS_EMPTY_INITIALIZER; >> + bool allow_without_src_ip = lb->is_distributed >> + && !lb_vip_nb->lb_health_check; >> >> for (size_t j = 0; j < vector_len(&lb_vip->backends); j++) { >> const struct ovn_lb_backend *backend = >> @@ -127,26 +130,34 @@ ovn_lb_vip_backends_health_check_init(const >> struct ovn_northd_lb *lb, >> continue; >> } >> >> - char *svc_mon_src_ip = NULL; >> - char *az_name = NULL; >> + struct ovn_northd_lb_backend *backend_nb = NULL; >> + char *port_name = NULL, *az_name = NULL, *first_colon = >> NULL; >> + char *svc_mon_src_ip = NULL, *src_ip = NULL; >> bool is_remote = false; >> - char *port_name = xstrdup(s); >> - char *src_ip = NULL; >> >> - char *first_colon = strchr(port_name, ':'); >> - if (!first_colon) { >> - free(port_name); >> - continue; >> + port_name = xstrdup(s); >> + first_colon = strchr(port_name, ':'); >> + >> + if (!first_colon && allow_without_src_ip) { >> + if (!*port_name) { >> + VLOG_WARN("Empty port name in distributed mode for >> IP %s", >> + ds_cstr(&key)); >> + goto cleanup; >> + } >> + is_remote = false; >> + goto init_backend; >> + } else if (!first_colon) { >> + VLOG_WARN("Expected ':' separator for: %s", port_name); >> + goto cleanup; >> } >> - *first_colon = '\0'; >> >> + *first_colon = '\0'; >> if (first_colon[1] == '[') { >> /* IPv6 case - format: port:[ipv6]:az or port:[ipv6] */ >> char *ip_end = strchr(first_colon + 2, ']'); >> if (!ip_end) { >> VLOG_WARN("Malformed IPv6 address in backend %s", >> s); >> - free(port_name); >> - continue; >> + goto cleanup; >> } >> >> src_ip = first_colon + 2; >> @@ -157,8 +168,7 @@ 
ovn_lb_vip_backends_health_check_init(const >> struct ovn_northd_lb *lb, >> if (!*az_name) { >> VLOG_WARN("Empty AZ name specified for backend >> %s", >> port_name); >> - free(port_name); >> - continue; >> + goto cleanup; >> } >> is_remote = true; >> } >> @@ -172,31 +182,31 @@ ovn_lb_vip_backends_health_check_init(const >> struct ovn_northd_lb *lb, >> if (!*az_name) { >> VLOG_WARN("Empty AZ name specified for backend >> %s", >> port_name); >> - free(port_name); >> - continue; >> + goto cleanup; >> } >> - is_remote = true; >> + is_remote = true; >> } >> } >> >> struct sockaddr_storage svc_mon_src_addr; >> if (!src_ip || !inet_parse_address(src_ip, >> &svc_mon_src_addr)) { >> VLOG_WARN("Invalid svc mon src IP %s", src_ip ? src_ip : >> "NULL"); >> + goto cleanup; >> } else { >> struct ds src_ip_s = DS_EMPTY_INITIALIZER; >> ss_format_address_nobracks(&svc_mon_src_addr, >> &src_ip_s); >> svc_mon_src_ip = ds_steal_cstr(&src_ip_s); >> } >> >> - if (svc_mon_src_ip) { >> - struct ovn_northd_lb_backend *backend_nb = >> - &lb_vip_nb->backends_nb[j]; >> - backend_nb->health_check = true; >> - backend_nb->logical_port = xstrdup(port_name); >> - backend_nb->svc_mon_src_ip = svc_mon_src_ip; >> - backend_nb->az_name = is_remote ? xstrdup(az_name) : >> NULL; >> - backend_nb->remote_backend = is_remote; >> - } >> +init_backend: >> + backend_nb = &lb_vip_nb->backends_nb[j]; >> + backend_nb->health_check = lb_vip_nb->lb_health_check; >> + backend_nb->logical_port = xstrdup(port_name); >> + backend_nb->svc_mon_src_ip = svc_mon_src_ip; >> + backend_nb->az_name = is_remote ? 
xstrdup(az_name) : NULL; >> + backend_nb->remote_backend = is_remote; >> + backend_nb->distributed_backend = lb->is_distributed; >> +cleanup: >> free(port_name); >> } >> >> @@ -367,6 +377,9 @@ ovn_northd_lb_init(struct ovn_northd_lb *lb, >> lb->hairpin_snat_ip = xstrdup(snat_ip); >> } >> >> + lb->is_distributed = smap_get_bool(&nbrec_lb->options, >> "distributed", >> + false); >> + >> sset_init(&lb->ips_v4); >> sset_init(&lb->ips_v6); >> struct smap_node *node; >> @@ -406,8 +419,8 @@ ovn_northd_lb_init(struct ovn_northd_lb *lb, >> } >> n_vips++; >> >> - if (lb_vip_nb->lb_health_check) { >> - ovn_lb_vip_backends_health_check_init(lb, lb_vip, >> lb_vip_nb); >> + if (lb_vip_nb->lb_health_check || lb->is_distributed) { >> + ovn_lb_vip_backends_ip_port_mappings_init(lb, lb_vip, >> lb_vip_nb); >> } >> } >> >> diff --git a/northd/lb.h b/northd/lb.h >> index 53dc4abf0..0f2555a1e 100644 >> --- a/northd/lb.h >> +++ b/northd/lb.h >> @@ -74,8 +74,12 @@ struct ovn_northd_lb { >> /* Indicates if the load balancer has health checks configured. >> */ >> bool health_checks; >> >> - char *hairpin_snat_ip; >> + /* Indicates if distributed option is enabled for load balancer. >> */ >> + bool is_distributed; >> + >> bool use_stateless_nat; >> + >> + char *hairpin_snat_ip; >> }; >> >> /* ovn-northd specific backend information. */ >> @@ -91,6 +95,7 @@ struct ovn_northd_lb_backend { >> bool health_check; >> /* Set to true if port does not locate in local AZ. */ >> bool remote_backend; >> + bool distributed_backend; >> /* Logical port to which the ip belong to. */ >> char *logical_port; >> /* Source IP address to be used for service monitoring. 
*/ >> diff --git a/northd/northd.c b/northd/northd.c >> index 2d01a01d9..a8c7df7a6 100644 >> --- a/northd/northd.c >> +++ b/northd/northd.c >> @@ -565,6 +565,7 @@ ovn_datapath_create(struct hmap *datapaths, const >> struct uuid *key, >> od->localnet_ports = VECTOR_EMPTY_INITIALIZER(struct ovn_port >> *); >> od->lb_with_stateless_mode = false; >> od->ipam_info_initialized = false; >> + od->is_distributed = false; >> od->tunnel_key = sdp->sb_dp->tunnel_key; >> init_mcast_info_for_datapath(od); >> return od; >> @@ -3315,6 +3316,46 @@ ovn_lb_svc_create(struct ovsdb_idl_txn >> *ovnsb_txn, >> } >> } >> >> +static bool >> +backend_is_available(const struct ovn_northd_lb *lb, >> + const struct ovn_lb_backend *backend, >> + const struct ovn_northd_lb_backend *backend_nb, >> + const struct svc_monitors_map_data >> *svc_mons_data) >> +{ >> + struct service_monitor_info *mon_info = >> + get_service_mon(svc_mons_data->local_svc_monitors_map, >> + svc_mons_data->ic_learned_svc_monitors_map, >> + backend->ip_str, >> + backend_nb->logical_port, >> + backend->port, >> + lb->nlb->protocol && lb->nlb->protocol[0] ? >> + lb->nlb->protocol : "tcp"); >> + >> + if (!mon_info) { >> + return false; >> + } >> + >> + ovs_assert(mon_info->sbrec_mon); >> + >> + return mon_info->sbrec_mon->status && >> + strcmp(mon_info->sbrec_mon->status, "online") ? >> + false : true; > Wouldn't this function return `true` if the `mon_info->sbrec_mon- >> status` is NULL? Maybe I'm just misunderstanding the way this ternary > operation is evaluated, perhaps additional parentheses would help here > to better draw the individual "groups" of conditions that are > evaluated. > > nit: There are two spaces between "return" and "mon_info->" The problem here is my choice of name for this function; it checks by name whether the backend is available, but returns the exact opposite binary value - I'll fix that. 
>> +} >> + >> +static inline void >> +append_lb_backend_to_action(const struct ovn_lb_backend *backend, >> + const struct ovn_northd_lb_backend >> *backend_nb, >> + bool distributed_mode, >> + struct ds *action) >> +{ >> + bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip); >> + if (distributed_mode) { >> + ds_put_format(action, "\"%s\":", backend_nb->logical_port); >> + } >> + ds_put_format(action, ipv6 ? "[%s]:%"PRIu16"," : >> "%s:%"PRIu16",", >> + backend->ip_str, backend->port); >> +} >> + >> static bool >> build_lb_vip_actions(const struct ovn_northd_lb *lb, >> const struct ovn_lb_vip *lb_vip, >> @@ -3340,9 +3381,11 @@ build_lb_vip_actions(const struct >> ovn_northd_lb *lb, >> } >> } >> >> - if (lb_vip_nb->lb_health_check) { >> - ds_put_cstr(action, "ct_lb_mark(backends="); >> + ds_put_format(action, "%s", lb->is_distributed >> + ? "ct_lb_mark_local(backends=" >> + : "ct_lb_mark(backends="); >> >> + if (lb_vip_nb->lb_health_check || lb->is_distributed) { >> size_t i = 0; >> size_t n_active_backends = 0; >> const struct ovn_lb_backend *backend; >> @@ -3350,45 +3393,37 @@ build_lb_vip_actions(const struct >> ovn_northd_lb *lb, >> struct ovn_northd_lb_backend *backend_nb = >> &lb_vip_nb->backends_nb[i++]; >> >> - if (!backend_nb->health_check) { >> + if (lb_vip_nb->lb_health_check && >> + !backend_nb->health_check) { > nit: This condition is a bit confusing to me. Could you please add a short > comment that explains when this situation can happen? This can happen when the load balancer itself is distributed but ip_port_mappings is not defined for this backend. I really don't like these two ifs, but then the code in northd/lb.c would need to be written differently, and I don't yet see how that would look good. 
>> continue; >> } >> >> - const char *protocol = lb->nlb->protocol; >> - if (!protocol || !protocol[0]) { >> - protocol = "tcp"; >> - } >> - >> - struct service_monitor_info *mon_info = >> - get_service_mon(svc_mons_data- >>> local_svc_monitors_map, >> - svc_mons_data- >>> ic_learned_svc_monitors_map, >> - backend->ip_str, >> - backend_nb->logical_port, >> - backend->port, >> - protocol); >> - >> - if (!mon_info) { >> + if (lb->is_distributed && >> + !backend_nb->distributed_backend) { >> continue; >> } >> >> - ovs_assert(mon_info->sbrec_mon); >> - if (mon_info->sbrec_mon->status && >> - strcmp(mon_info->sbrec_mon->status, "online")) { >> + >> + if (backend_nb->health_check && >> + !backend_is_available(lb, >> + backend, >> + backend_nb, >> + svc_mons_data)) { >> continue; >> } >> >> n_active_backends++; >> - bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip); >> - ds_put_format(action, ipv6 ? "[%s]:%"PRIu16"," : >> "%s:%"PRIu16",", >> - backend->ip_str, backend->port); >> + append_lb_backend_to_action(backend, >> + backend_nb, >> + backend_nb- >>> distributed_backend, >> + action); >> } >> ds_chomp(action, ','); >> >> drop = !n_active_backends && !lb_vip->empty_backend_rej; >> reject = !n_active_backends && lb_vip->empty_backend_rej; >> } else { >> - ds_put_format(action, "ct_lb_mark(backends=%s", >> - lb_vip_nb->backend_ips); >> + ds_put_format(action, "%s", lb_vip_nb->backend_ips); >> } >> >> if (reject) { >> @@ -3425,6 +3460,20 @@ build_lb_vip_actions(const struct >> ovn_northd_lb *lb, >> return reject; >> } >> >> +static inline void >> +handle_od_lb_datapath_modes(struct ovn_datapath *od, >> + struct ovn_lb_datapaths *lb_dps, >> + bool od_is_switch) >> +{ > nit/question: Couldn't you use NULL value of either `od->nbr` or `od- >> nbs` to determine if the OD is switch or router? Instead of using > explicit function argument? 
ack, i will fix >> + if (od_is_switch && od->lb_with_stateless_mode) { >> + hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od); >> + } >> + >> + if (!od_is_switch && lb_dps->lb->is_distributed) { >> + od->is_distributed = true; >> + } >> +} >> + >> static void >> build_lb_datapaths(const struct hmap *lbs, const struct hmap >> *lb_groups, >> struct ovn_datapaths *ls_datapaths, >> @@ -3467,9 +3516,7 @@ build_lb_datapaths(const struct hmap *lbs, >> const struct hmap *lb_groups, >> lb_dps = ovn_lb_datapaths_find(lb_datapaths_map, >> lb_uuid); >> ovs_assert(lb_dps); >> ovn_lb_datapaths_add_ls(lb_dps, 1, &od, >> ods_size(ls_datapaths)); >> - if (od->lb_with_stateless_mode) { >> - hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od); >> - } >> + handle_od_lb_datapath_modes(od, lb_dps, true); >> } >> >> for (size_t i = 0; i < od->nbs->n_load_balancer_group; i++) >> { >> @@ -3503,6 +3550,7 @@ build_lb_datapaths(const struct hmap *lbs, >> const struct hmap *lb_groups, >> lb_dps = ovn_lb_datapaths_find(lb_datapaths_map, >> lb_uuid); >> ovs_assert(lb_dps); >> ovn_lb_datapaths_add_lr(lb_dps, 1, &od, >> ods_size(lr_datapaths)); >> + handle_od_lb_datapath_modes(od, lb_dps, false); >> } >> } >> >> @@ -3855,6 +3903,7 @@ sync_pb_for_lrp(struct ovn_port *op, >> smap_add(&new, "distributed-port", op->primary_port->key); >> >> bool always_redirect = >> + !lr_stateful_rec->has_distributed_lb && >> !lr_stateful_rec->lrnat_rec->has_distributed_nat && >> !l3dgw_port_has_associated_vtep_lports(op- >>> primary_port); >> >> @@ -5444,10 +5493,7 @@ northd_handle_lb_data_changes(struct >> tracked_lb_data *trk_lb_data, >> lb_dps = ovn_lb_datapaths_find(lb_datapaths_map, >> &uuidnode->uuid); >> ovs_assert(lb_dps); >> ovn_lb_datapaths_add_ls(lb_dps, 1, &od, >> ods_size(ls_datapaths)); >> - >> - if (od->lb_with_stateless_mode) { >> - hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od); >> - } >> + handle_od_lb_datapath_modes(od, lb_dps, true); >> >> /* Add the lb to the northd tracked data. 
*/ >> hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps); >> @@ -5486,6 +5532,7 @@ northd_handle_lb_data_changes(struct >> tracked_lb_data *trk_lb_data, >> lb_dps = ovn_lb_datapaths_find(lb_datapaths_map, >> &uuidnode->uuid); >> ovs_assert(lb_dps); >> ovn_lb_datapaths_add_lr(lb_dps, 1, &od, >> ods_size(lr_datapaths)); >> + handle_od_lb_datapath_modes(od, lb_dps, false); >> >> /* Add the lb to the northd tracked data. */ >> hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps); >> @@ -9689,22 +9736,15 @@ build_lswitch_arp_chassis_resident(const >> struct ovn_datapath *od, >> { >> struct sset distributed_nat_ports = >> SSET_INITIALIZER(&distributed_nat_ports); >> - struct sset resident_ports = SSET_INITIALIZER(&resident_ports); >> - struct sset inports = SSET_INITIALIZER(&inports); >> + struct hmapx resident_ports = >> HMAPX_INITIALIZER(&resident_ports); >> struct ds match = DS_EMPTY_INITIALIZER; >> >> - struct hmapx_node *node; >> - HMAPX_FOR_EACH (node, &od->phys_ports) { >> - struct ovn_port *op = node->data; >> - sset_add(&inports, op->json_key); >> - } >> - >> struct ovn_port *op; >> VECTOR_FOR_EACH (&od->router_ports, op) { >> struct ovn_port *op_r = op->peer; >> >> if (lrp_is_l3dgw(op_r)) { >> - sset_add(&resident_ports, op_r->cr_port->json_key); >> + hmapx_add(&resident_ports, op_r); >> } >> } >> >> @@ -9720,23 +9760,32 @@ build_lswitch_arp_chassis_resident(const >> struct ovn_datapath *od, >> } >> } >> >> - if (!sset_is_empty(&inports) && !sset_is_empty(&resident_ports)) >> { >> + if (!hmapx_is_empty(&od->phys_ports) && >> !hmapx_is_empty(&resident_ports)) { >> + struct hmapx_node *node; >> const char *port_name; >> >> - SSET_FOR_EACH (port_name, &inports) { >> + HMAPX_FOR_EACH (node, &od->phys_ports) { >> + op = node->data; >> + >> ds_clear(&match); >> - ds_put_format(&match, "arp.op == 1 && inport == %s", >> - port_name); >> + ds_put_format(&match, "(arp.op == 1 || arp.op == 2) " >> + "&& inport == %s", >> + op->json_key); >> ovn_lflow_add(lflows, od, 
S_SWITCH_IN_CHECK_PORT_SEC, >> 75, >> ds_cstr(&match), REGBIT_EXT_ARP " = 1; >> next;", >> ar->lflow_ref); >> } >> >> - SSET_FOR_EACH (port_name, &resident_ports) { >> + HMAPX_FOR_EACH (node, &resident_ports) { >> + op = node->data; >> + >> ds_clear(&match); >> - ds_put_format(&match, REGBIT_EXT_ARP" == 1 " >> - "&& is_chassis_resident(%s)", >> - port_name); >> + ds_put_format(&match, REGBIT_EXT_ARP" == 1"); >> + if (od_is_centralized(op->od)) { >> + ds_put_format(&match, " && is_chassis_resident(%s)", >> + op >> + ->cr_port->json_key); > nit: the "->cr_port" probably doesn't need to be on its own line. > >> + } >> ovn_lflow_add(lflows, od, S_SWITCH_IN_APPLY_PORT_SEC, >> 75, >> ds_cstr(&match), "next;", ar->lflow_ref); >> } >> @@ -9755,8 +9804,7 @@ build_lswitch_arp_chassis_resident(const struct >> ovn_datapath *od, >> } >> >> sset_destroy(&distributed_nat_ports); >> - sset_destroy(&resident_ports); >> - sset_destroy(&inports); >> + hmapx_destroy(&resident_ports); >> ds_destroy(&match); >> } >> >> @@ -10769,8 +10817,13 @@ build_lswitch_ip_unicast_lookup(struct >> ovn_port *op, >> : debug_drop_action(); >> >> if (lsp_is_router(op->nbsp) && op->peer && op->peer->nbrp) { >> + /* Distributed gateway ports default to centralized mode. >> + * They operate in distributed mode only when configured >> + * on their bound router. */ >> + bool peer_lrp_is_centralized = od_is_centralized(op->peer- >>> od); >> + >> /* For ports connected to logical routers add flows to >> bypass the >> - * broadcast flooding of ARP/ND requests in table 19. We >> direct the >> + * broadcast flooding of ARP/ND requests in table 22. We >> direct the >> * requests only to the router port that owns the IP >> address. 
>> */ >> build_lswitch_rport_arp_req_flows(op->peer, op->od, op, >> lflows, >> @@ -10785,7 +10838,9 @@ build_lswitch_ip_unicast_lookup(struct >> ovn_port *op, >> ds_put_format(match, "eth.dst == %s", op->peer- >>> lrp_networks.ea_s); >> } >> >> - if (!vector_is_empty(&op->peer->od->l3dgw_ports) && >> + >> + if (peer_lrp_is_centralized && >> + !vector_is_empty(&op->peer->od->l3dgw_ports) && >> !vector_is_empty(&op->od->localnet_ports)) { >> add_lrp_chassis_resident_check(op->peer, match); >> } else if (op->cr_port) { >> @@ -12635,6 +12690,13 @@ build_distr_lrouter_nat_flows_for_lb(struct >> lrouter_nat_lb_flows_ctx *ctx, >> size_t new_match_len = ctx->new_match->length; >> size_t undnat_match_len = ctx->undnat_match->length; >> >> + bool lb_is_centralized = !ctx->lb->is_distributed; >> + >> + /* If load balancer is distributed, then the response traffic >> + * must be returned through the distributed port.*/ >> + const char *gw_outport = lb_is_centralized ? dgp->cr_port- >>> json_key >> + : dgp->json_key; >> + >> const char *meter = NULL; >> >> if (ctx->reject) { >> @@ -12646,8 +12708,9 @@ build_distr_lrouter_nat_flows_for_lb(struct >> lrouter_nat_lb_flows_ctx *ctx, >> dgp, meter); >> } >> >> - if (!vector_is_empty(&ctx->lb_vip->backends) || >> - !ctx->lb_vip->empty_backend_rej) { >> + if (lb_is_centralized && >> + (!vector_is_empty(&ctx->lb_vip->backends) || >> + !ctx->lb_vip->empty_backend_rej)) { >> ds_put_format(ctx->new_match, " && is_chassis_resident(%s)", >> dgp->cr_port->json_key); >> } >> @@ -12684,18 +12747,21 @@ build_distr_lrouter_nat_flows_for_lb(struct >> lrouter_nat_lb_flows_ctx *ctx, >> * the undnat stage. 
>> */ >> ds_put_format(ctx->undnat_match, ") && outport == %s", dgp- >>> json_key); >> - ds_clear(ctx->gw_redir_action); >> - ds_put_format(ctx->gw_redir_action, "outport = %s; next;", >> - dgp->cr_port->json_key); >> + ds_put_format(ctx->gw_redir_action, >> + "outport = %s; next;", gw_outport); >> >> ovn_lflow_add(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT, 200, >> ds_cstr(ctx->undnat_match), ds_cstr(ctx- >>> gw_redir_action), >> lflow_ref, WITH_HINT(&ctx->lb->nlb->header_)); >> ds_truncate(ctx->undnat_match, undnat_match_len); >> >> - ds_put_format(ctx->undnat_match, ") && (inport == %s || outport >> == %s)" >> - " && is_chassis_resident(%s)", dgp->json_key, dgp- >>> json_key, >> - dgp->cr_port->json_key); >> + ds_put_format(ctx->undnat_match, ") && (inport == %s || outport >> == %s)", >> + dgp->json_key, dgp->json_key); >> + >> + if (lb_is_centralized) { >> + ds_put_format(ctx->undnat_match, " && >> is_chassis_resident(%s)", >> + dgp->cr_port->json_key); >> + } >> ovn_lflow_add(ctx->lflows, od, S_ROUTER_OUT_UNDNAT, 120, >> ds_cstr(ctx->undnat_match), >> ds_cstr(&undnat_action), >> lflow_ref, WITH_HINT(&ctx->lb->nlb->header_)); >> @@ -14022,6 +14088,10 @@ build_gateway_mtu_flow(struct lflow_table >> *lflows, struct ovn_port *op, >> static bool >> consider_l3dgw_port_is_centralized(struct ovn_port *op) >> { >> + if (!od_is_centralized(op->od)) { >> + return false; >> + } >> + >> if (l3dgw_port_has_associated_vtep_lports(op)) { >> return false; >> } >> @@ -16224,7 +16294,7 @@ build_ipv6_input_flows_for_lrouter_port( >> * router's own IP address. */ >> for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { >> ds_clear(match); >> - if (lrp_is_l3dgw(op)) { >> + if (lrp_is_l3dgw(op) && od_is_centralized(op->od)) { >> /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s >> * should only be sent from the gateway chassi, so that >> * upstream MAC learning points to the gateway chassis. 
>> @@ -16500,7 +16570,8 @@ build_lrouter_ipv4_ip_input(struct ovn_port >> *op, >> op->lrp_networks.ipv4_addrs[i].network_s, >> op->lrp_networks.ipv4_addrs[i].plen); >> >> - if (!vector_is_empty(&op->od->l3dgw_ports) && op->peer >> + if (od_is_centralized(op->od) && >> + !vector_is_empty(&op->od->l3dgw_ports) && op->peer >> && !vector_is_empty(&op->peer->od->localnet_ports)) { >> add_lrp_chassis_resident_check(op, match); >> } >> diff --git a/northd/northd.h b/northd/northd.h >> index e4d7de9f9..84b76f886 100644 >> --- a/northd/northd.h >> +++ b/northd/northd.h >> @@ -450,6 +450,11 @@ struct ovn_datapath { >> /* Indicates that the LS has valid vni associated with it. */ >> bool has_evpn_vni; >> >> + /* True if datapath has some distributed dependencies. >> + * Currently, this only applies to load balancers attached to >> datapath >> + * with distributed mode enabled. */ >> + bool is_distributed; >> + >> /* OVN northd only needs to know about logical router gateway >> ports for >> * NAT/LB on a distributed router. The "distributed gateway >> ports" are >> * populated only when there is a gateway chassis or ha chassis >> group >> @@ -1151,6 +1156,18 @@ ovn_port_must_learn_route(const struct >> ovn_port *op, >> return true; >> } >> >> + /* Returns true if datapath 'od' operates in centralized mode on >> gateway. >> + * >> + * Returns false when datapath is distributed. A datapath is >> distributed >> + * only when configured with the 'distributed' option enabled. In >> distributed >> + * mode, ARP/ND processing is handled locally on each node. >> + */ >> +static inline bool >> +od_is_centralized(const struct ovn_datapath *od) >> +{ >> + return !od->is_distributed; >> +} >> + > nit/personal opinion: Having a function that takes a "struct" just to > return inverted value of struct's member feels a bit > redundant/overkill. But I might be wrong. 
We are passing a pointer, not a structure, so this is normal; besides, this is an inline function. > > >> struct ovn_port *ovn_port_find(const struct hmap *ports, const char >> *name); >> >> void build_igmp_lflows(struct hmap *igmp_groups, >> diff --git a/ovn-nb.xml b/ovn-nb.xml >> index e74c0d010..02a864151 100644 >> --- a/ovn-nb.xml >> +++ b/ovn-nb.xml >> @@ -2399,13 +2399,15 @@ >> <p> >> Maps from endpoint IP to a colon-separated pair of logical >> port name >> and source IP, >> - e.g. <code><var>port_name</var>:<var>sourc_ip</var></code> >> for IPv4. >> + e.g. >> <code><var>port_name</var>:<var>source_ip</var></code> for IPv4. >> Health checks are sent to this port with the specified >> source IP. >> For IPv6 square brackets must be used around IP address, >> e.g: >> - <code><var>port_name</var>:<var>[sourc_ip]</var></code> >> + <code><var>port_name</var>:<var>[source_ip]</var></code> >> Remote endpoint: >> Specify :target_zone_name at the end of the above syntax >> to create >> remote health checks in a specific zone. >> + For distributed load balancers, ip_port_mappings is >> required. >> + In the absence of health checks, source_ip is optional. >> </p> >> >> <p> >> @@ -2608,6 +2610,16 @@ or >> traffic may be dropped in scenarios where we have different >> chassis >> for each DGP. This option is set to <code>false</code> by >> default. >> </column> >> + >> + <column name="options" key="distributed"> >> + This option enables distributed load balancing across >> compute nodes, >> + ensuring traffic is always routed to local backends, >> eliminating >> + east-west traffic between nodes. >> + Required configuration: <ref column="ip_port_mappings"/>. >> + NOTE: The addressing of the underlay network must not >> overlap with >> + the addressing of the Load Balancer VIP. > Thank you for adding this note/expansion. I wonder if we could also add > a bit more explanation to it. 
e.g.:
>
> NOTE: The addressing of the underlay network must not overlap with the
> addressing of Load Balancer VIP. If the Load Balancer is attached to a
> router that is directly connected to the underlay network and the VIP
> belongs to the same subnet as used on the underlay network, the traffic
> won't be spread across all chassis! Instead, it will be concentrated
> only on the chassis that hosts the Distributed Gateway Port of the
> router.
>
> It doesn't have to be word-for-word, but I believe that we should
> mention potential effect that this misconfiguration could cause. I'm
> sure it will help someone in the future when debugging a mysterious LB
> behavior :D
>
>
Ack, I will fix.
>> +    </column>
>> +
>>    </group>
>> </table>
>>
>> diff --git a/ovn-sb.xml b/ovn-sb.xml
>> index 623aaeffd..19c7a6877 100644
>> --- a/ovn-sb.xml
>> +++ b/ovn-sb.xml
>> @@ -2130,6 +2130,17 @@
>>          </p>
>>        </dd>
>>
>> +      <dt><code>ct_lb_mark_local;</code></dt>
>> +
>> <dt><code>ct_lb_mark_local(backends=<var>lport_name</var>[<var>ip</va
>> r>[:<var>port</var>][,...][;
>> hash_fields=<var>field1</var>,<var>field2</var>,...][;
>> ct_flag]);</code></dt>
>> +      <dd>
>> +        <p>
>> +          Same as <code>ct_lb_mark</code>, with the key
>> difference that it
>> +          implements local-only load balancing. This mode
>> selects backends
>> +          only from those running on the current chassis,
>> preventing
>> +          traffic from being forwarded to backends on remote
>> nodes.
>> +        </p>
>> +      </dd>
>> +
>>        <dt>
>>          <code><var>R</var> = dns_lookup();</code>
>>        </dt>
>> diff --git a/tests/multinode-macros.at b/tests/multinode-macros.at
>> index ad09ac562..24b15f64b 100644
>> --- a/tests/multinode-macros.at
>> +++ b/tests/multinode-macros.at
>> @@ -22,6 +22,20 @@ m4_define([M_NS_CHECK_EXEC],
>>     [ AT_CHECK([M_NS_EXEC([$1], [$2], [$3])],
>> m4_shift(m4_shift(m4_shift($@)))) ]
>> )
>>
>> +# M_FAKE_NODE_EXEC([fake_node], [command])
>> +#
>> +# Execute 'command' in 'fake_node'
>> +m4_define([M_FAKE_NODE_EXEC],
>> +    [podman exec $1 $2])
>> +
>> +# M_FAKE_NODE_CHECK_EXEC([fake_node], [command], other_params...)
>> +#
>> +# Wrapper for AT_CHECK that executes 'command' inside 'fake_node'.
>> +# 'other_params' are passed as they are to AT_CHECK.
>> +m4_define([M_FAKE_NODE_CHECK_EXEC],
>> +    [ AT_CHECK([M_FAKE_NODE_EXEC([$1], [$2])],
>> m4_shift(m4_shift($@))) ]
>> +)
>> +
>> # M_DAEMONIZE([fake_node],[command],[pidfile])
>> m4_define([M_DAEMONIZE],
>>     [podman exec $1 $2 & echo $!
> $3
>> diff --git a/tests/multinode.at b/tests/multinode.at
>> index 83cdd6c01..b1336c31c 100644
>> --- a/tests/multinode.at
>> +++ b/tests/multinode.at
>> @@ -4296,3 +4296,162 @@ M_NS_CHECK_EXEC([ovn-chassis-2], [ovn-ext2],
>> [ping6 -q -c 3 -i 0.3 -w 2 6812:86:
>> m_wait_row_count mac_binding 1 ip="6812\:86\:\:102"
>> logical_port="lr1-pub"
>>
>> AT_CLEANUP
>> +
>> +AT_SETUP([Distributed load balancing: IPv4])
>> +#
>> +# ┌──────────────────────┐
>> +# │ fabric (leaf switch) │
>> +# │         gw-1         │
>> +# └─────────┬────────────┘
>> +#           │
>> +#    ┌──────┴──────┐
>> +#    │             │
>> +#    ▼             ▼
>> +#  route         route
>> +# weight 1      weight 2 (2 backends)
>> +#    │             │
>> +#    ▼             ▼
>> +# ┌───────┐     ┌───────┐
>> +# │Chassis│     │Chassis│
>> +# │   1   │     │   2   │
>> +# └───────┘     └───────┘
>> +#
>> +
>> +check_fake_multinode_setup
>> +cleanup_multinode_resources
>> +for c in ovn-chassis-1 ovn-chassis-2
>> +do
>> +    m_as $c ip link del lb-host
>> +    m_as $c ovs-vsctl del-br br-lb
>> +    m_as $c ovs-vsctl del-port lb-ovs
>> +done
>> +
>> +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip link show | grep -q
>> genev_sys])
>> +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip link show | grep -q
>> genev_sys])
>> +
>> +check multinode_nbctl ls-add pub \
>> +    -- lsp-add-router-port pub pub-lr1 lr1-pub \
>> +    -- lsp-add-localnet-port pub pub-ln public
>> +
>> +check multinode_nbctl lr-add lr1 \
>> +    -- lrp-add lr1 lr1-pub 00:00:00:00:00:01 169.254.1.254/24 \
>> +    -- lrp-add lr1 lr1-down 00:00:00:00:00:02 192.168.1.254/24
>> +
>> +check multinode_nbctl ls-add ls1
>> +check multinode_nbctl lsp-add ls1 ls1p1
>> +check multinode_nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02
>> 192.168.1.1"
>> +check multinode_nbctl lsp-add ls1 ls1p2
>> +check multinode_nbctl lsp-set-addresses ls1p2 "00:00:00:01:02:02
>> 192.168.1.2"
>> +check multinode_nbctl lsp-add ls1 ls1p3
>> +check multinode_nbctl lsp-set-addresses ls1p3 "00:00:00:01:03:02
>> 192.168.1.3"
>> +check multinode_nbctl lsp-add-router-port ls1 ls1-lr1 lr1-down
>> +
>> +check multinode_nbctl
lrp-set-gateway-chassis lr1-pub ovn-chassis-2
>> +
>> +# Create default route
>> +check multinode_nbctl lr-route-add lr1 0.0.0.0/0 169.254.1.253 lr1-
>> pub
>> +
>> +m_as ovn-chassis-1 /data/create_fake_vm.sh ls1p1 ls1p1
>> 00:00:00:01:01:02 1500 192.168.1.1 24 192.168.1.254 2001::1/64
>> 2001::a
>> +m_as ovn-chassis-2 /data/create_fake_vm.sh ls1p2 ls1p2
>> 00:00:00:01:02:02 1500 192.168.1.2 24 192.168.1.254 2001::2/64
>> 2001::a
>> +m_as ovn-chassis-2 /data/create_fake_vm.sh ls1p3 ls1p3
>> 00:00:00:01:03:02 1500 192.168.1.3 24 192.168.1.254 2001::3/64
>> 2001::a
>> +
>> +# Create load balancer
>> +lb_vip="1.1.1.1"
> nit: I think we should stay away from using real/routable IP addresses
> in the tests (in this and other tests in this change). Let's use either
> private block or one of the blocks reserved for documentation (RFC
> 5737)
Ack, I will fix.
> Thanks again,
> Martin.
>
>> +check multinode_nbctl lb-add lb1 $lb_vip:80
>> 192.168.1.1:10880,192.168.1.2:10880,192.168.1.3:10880
>> +check multinode_nbctl lr-lb-add lr1 lb1
>> +check multinode_nbctl set Load_Balancer lb1
>> ip_port_mappings:192.168.1.1=ls1p1:192.168.1.199
>> +check multinode_nbctl set Load_Balancer lb1
>> ip_port_mappings:192.168.1.2=ls1p2:192.168.1.199
>> +check multinode_nbctl set Load_Balancer lb1
>> ip_port_mappings:192.168.1.3=ls1p3:192.168.1.199
>> +check multinode_nbctl set load_balancer lb1 options:distributed=true
>> +
>> +ip_ch1=$(m_as ovn-chassis-1 ip a show dev eth1 | grep "inet " | awk
>> '{print $2}'| cut -d '/' -f1)
>> +ip_ch2=$(m_as ovn-chassis-2 ip a show dev eth1 | grep "inet " | awk
>> '{print $2}'| cut -d '/' -f1)
>> +
>> +# Add multipath route to load balancer VIP with weighted nexthops on
>> "fabric" host:
>> +check m_as ovn-gw-1 ip route flush $lb_vip
>> +check m_as ovn-gw-1 ip route add $lb_vip nexthop via $ip_ch1 dev
>> eth1 weight 1 nexthop via $ip_ch2 dev eth1 weight 2
>> +
>> +# Set kernel multipath hash policy to L3/L4 (source/destination
>> IP+port)
>> +# Policy 1 =
Layer 3/4 hash (src/dst IP+port) >> +M_FAKE_NODE_CHECK_EXEC([ovn-gw-1], [sysctl -w >> net.ipv4.fib_multipath_hash_policy=1], \ >> +[0], [dnl >> +net.ipv4.fib_multipath_hash_policy = 1 >> +]) >> + >> +# Check OpenFlow group filling: it should only contain local >> backends >> +M_FAKE_NODE_CHECK_EXEC([ovn-chassis-1], [ovs-ofctl dump-groups br- >> int | sed -e 's/table=[[0-9]]*/table=<cleared>/g'], [0], [dnl >> +NXST_GROUP_DESC reply (xid=0x2): >> + >> group_id=1,type=select,selection_method=dp_hash,bucket=bucket_id:0,we >> ight:100,actions=ct(commit,table=<cleared>,zone=NXM_NX_REG11[[0..15]] >> ,nat(dst=192.168.1.1:10880),exec(load:0x1->NXM_NX_CT_MARK[[1]])) >> +]) >> + >> +M_FAKE_NODE_CHECK_EXEC([ovn-chassis-2], [ovs-ofctl dump-groups br- >> int | sed -e 's/table=[[0-9]]*/table=<cleared>/g'], [0], [dnl >> +NXST_GROUP_DESC reply (xid=0x2): >> + >> group_id=2,type=select,selection_method=dp_hash,bucket=bucket_id:1,we >> ight:100,actions=ct(commit,table=<cleared>,zone=NXM_NX_REG11[[0..15]] >> ,nat(dst=192.168.1.2:10880),exec(load:0x1- >>> NXM_NX_CT_MARK[[1]])),bucket=bucket_id:2,weight:100,actions=ct(commi >> t,table=<cleared>,zone=NXM_NX_REG11[[0..15]],nat(dst=192.168.1.3:1088 >> 0),exec(load:0x1->NXM_NX_CT_MARK[[1]])) >> +]) >> + >> +physicl_gw_mac_address="30:42:f5:a7:46:65" >> + >> +# Configure infrastructure on chassis hosts: >> +# lb-host (physicl_gw_mac_address) - (veth) lb-ovs - br-lb - br-int >> +for c in ovn-chassis-1 ovn-chassis-2 >> +do >> + check m_as $c ip link add lb-host type veth peer lb-ovs >> + on_exit "m_as $c ip link del lb-host" >> + >> + check m_as $c ip link set dev lb-host address >> $physicl_gw_mac_address >> + check m_as $c ip addr add 169.254.1.253/24 dev lb-host >> + check m_as $c ip link set lb-host up >> + check m_as $c ip link set lb-ovs up >> + >> + check m_as $c ovs-vsctl add-br br-lb >> + on_exit "m_as $c ovs-vsctl del-br br-lb" >> + check m_as $c ovs-vsctl add-port br-lb lb-ovs >> + on_exit "m_as $c ovs-vsctl del-port lb-ovs" >> + 
check m_as $c ovs-vsctl set open . external-ids:ovn-bridge- >> mappings=public:br-lb >> + >> + check m_as $c ip route flush $lb_vip >> + check m_as $c ip r add $lb_vip via 169.254.1.254 dev lb-host >> + on_exit "m_as $c ip route flush $lb_vip" >> +done >> + >> +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ovs-vsctl show | grep -q patch- >> pub-ln-to-br-int]) >> +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ovs-vsctl show | grep -q patch- >> pub-ln-to-br-int]) >> + >> +M_NS_DAEMONIZE([ovn-chassis-1], [ls1p1], [$PYTHON3 >> /data/tcp_metadata_server.py \ >> + --bind-host >> 192.168.1.1 \ >> + --bind-port 10880 >> \ >> + --reply-string >> ls1p1 >/dev/null 2>&1], [nc1.pid]) >> +M_NS_DAEMONIZE([ovn-chassis-2], [ls1p2], [$PYTHON3 >> /data/tcp_metadata_server.py \ >> + --bind-host >> 192.168.1.2 \ >> + --bind-port 10880 >> \ >> + --reply-string >> ls1p2 >/dev/null 2>&1], [nc2.pid]) >> +M_NS_DAEMONIZE([ovn-chassis-2], [ls1p3], [$PYTHON3 >> /data/tcp_metadata_server.py \ >> + --bind-host >> 192.168.1.3 \ >> + --bind-port 10880 >> \ >> + --reply-string >> ls1p3 >/dev/null 2>&1], [nc3.pid]) >> + >> +# Running tcpdump on chassis nodes to ensure there's no east-west >> traffic during balancing, >> +# which means balancing is happening directly on the node. >> +M_START_TCPDUMP([ovn-chassis-1], [-c 2 -neei genev_sys_6081 port >> 10880], [ch1_genev]) >> +M_START_TCPDUMP([ovn-chassis-1], [-c 2 -neei eth2 port 10880], >> [ch1_eth2]) >> +M_START_TCPDUMP([ovn-chassis-2], [-c 2 -neei genev_sys_6081 port >> 10880], [ch2_genev]) >> +M_START_TCPDUMP([ovn-chassis-2], [-c 2 -neei eth2 port 10880], >> [ch2_eth2]) >> + >> +M_FAKE_NODE_CHECK_EXEC([ovn-gw-1], [/bin/bash -c 'for i in $(seq >> 500); \ >> + do curl --http0.9 >> http://1.1.1.1:80 2>/dev/null ; \ >> + echo ; done | sort | uniq -c ' > reply], >> [0], []) >> + >> +# Check that requests are distributed among all backends. 
>> +AT_CHECK([grep -q ls1p1 reply && grep -q ls1p2 reply && grep -q >> ls1p3 reply], [0], []) >> + >> +# Check east-west traffic is clean >> +AT_CHECK([cat ch1_genev.tcpdump], [0], [dnl >> +]) >> +AT_CHECK([cat ch1_eth2.tcpdump], [0], [dnl >> +]) >> +AT_CHECK([cat ch2_genev.tcpdump], [0], [dnl >> +]) >> +AT_CHECK([cat ch2_eth2.tcpdump], [0], [dnl >> +]) >> + >> +AT_CLEANUP >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at >> index 25655c456..0d5af3568 100644 >> --- a/tests/ovn-northd.at >> +++ b/tests/ovn-northd.at >> @@ -18289,6 +18289,9 @@ ovn_start >> # ip_port_mappings syntax: ip:lport_name:src_ip:<az_name>(for remote >> lports) >> >> check ovn-nbctl ls-add ls1 >> +check ovn-nbctl lr-add lr1 >> + >> +ovn-appctl -t ovn-northd vlog/disable-rate-limit >> >> check ovn-nbctl lb-add lb1_ipv4 1.1.1.1:80 >> 192.168.0.1:10880,192.168.0.2:10880,192.168.0.3:10880 >> AT_CHECK([ovn-nbctl --wait=sb \ >> @@ -18358,6 +18361,154 @@ check ovn-nbctl set load_balancer lb1_ipv4 >> ip_port_mappings:192.168.0.1=lport1:1 >> check_row_count sb:Service_Monitor 0 >> >> OVS_WAIT_UNTIL([grep "Empty AZ name specified" northd/ovn- >> northd.log]) >> + >> +check ovn-nbctl lb-del lb1_ipv4 >> + >> +# Check correct setup of distributed load balancers. 
>> +echo > northd/ovn-northd.log >> +check ovn-nbctl lb-add lb_distubuted 1.1.1.1:80 >> 192.168.0.1:10880,192.168.0.2:10880 >> +check ovn-nbctl lr-lb-add lr1 lb_distubuted >> +check ovn-nbctl set load_balancer lb_distubuted >> options:distributed=true >> + >> +# Check that load balancer does not work in a distributed mode - >> there is no ip_port_mappings setting >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), action=(drop;) >> +]) >> + >> +# Check that the load balancer has only one backend available since >> the only one backend has ip_port_mappings >> +check ovn-nbctl set load_balancer lb_distubuted >> ip_port_mappings:192.168.0.1=lport1 >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), >> action=(ct_lb_mark_local(backends="lport1":192.168.0.1:10880);) >> +]) >> + >> +check ovn-nbctl set load_balancer lb_distubuted >> ip_port_mappings:192.168.0.2=lport2 >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), >> action=(ct_lb_mark_local(backends="lport1":192.168.0.1:10880,"lport2" >> :192.168.0.2:10880);) >> +]) >> + >> +# Check if health check is configured, ip_port_mappings must be >> provided. 
>> +AT_CHECK([ovn-nbctl --wait=sb \ >> + -- --id=@hc create Load_Balancer_Health_Check >> vip="1.1.1.1\:80" \ >> + options:failure_count=100 \ >> + -- add Load_Balancer lb_distubuted health_check @hc | >> uuidfilt], [0], [<0> >> +]) >> + >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +OVS_WAIT_UNTIL([grep "Expected ':' separator for:" northd/ovn- >> northd.log]) >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), action=(drop;) >> +]) >> + >> +check ovn-nbctl lb-del lb_distubuted >> +echo > northd/ovn-northd.log >> + >> +# ipv6 configuration >> +check ovn-nbctl lb-add lb1 [[2001::a]]:80 >> [[2001::3]]:80,[[2002::3]]:80 >> + >> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \ >> +Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" -- add >> Load_Balancer . \ >> +health_check @hc >> + >> +check_row_count sb:Service_Monitor 0 >> +check ovn-nbctl --wait=sb set load_balancer . >> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]\" >> + >> +check_row_count sb:Service_Monitor 1 >> +ovn-sbctl list service_monitor >> +check_column "2001::3" sb:Service_Monitor ip logical_port=lport1 >> +check_column 80 sb:Service_Monitor port logical_port=lport1 >> +check_column tcp sb:Service_Monitor protocol logical_port=lport1 >> +check_column "2001::2" sb:Service_Monitor src_ip logical_port=lport1 >> +check_column false sb:Service_Monitor ic_learned logical_port=lport1 >> +check_column false sb:Service_Monitor remote logical_port=lport1 >> +check_column "" sb:Service_Monitor logical_input_port >> logical_port=lport1 >> + >> +# Empty src_ip. >> +check ovn-nbctl clear load_balancer lb1 ip_port_mappings >> +check ovn-nbctl --wait=sb set load_balancer . 
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:\"
>> +OVS_WAIT_UNTIL([grep "Invalid svc mon src IP" northd/ovn-
>> northd.log])
>> +check_row_count sb:Service_Monitor 0
>> +echo > northd/ovn-northd.log
>> +
>> +# Incorrect IP address.
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[invalid]]\"=\"lport1:\"
>> +OVS_WAIT_UNTIL([grep "bad IP address" northd/ovn-northd.log])
>> +echo > northd/ovn-northd.log
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:invalid\"
>> +OVS_WAIT_UNTIL([grep "bad IP address" northd/ovn-northd.log])
>> +echo > northd/ovn-northd.log
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]:az_name\"
>> +check_row_count sb:Service_Monitor 1
>> +ovn-sbctl list service_monitor
>> +check_column "2001::3" sb:Service_Monitor ip logical_port=lport1
>> +check_column 80 sb:Service_Monitor port logical_port=lport1
>> +check_column tcp sb:Service_Monitor protocol logical_port=lport1
>> +check_column "2001::2" sb:Service_Monitor src_ip logical_port=lport1
>> +check_column false sb:Service_Monitor ic_learned logical_port=lport1
>> +check_column true sb:Service_Monitor remote logical_port=lport1
>> +check_column "" sb:Service_Monitor logical_input_port
>> logical_port=lport1
>> +
>> +uuid=$(ovn-sbctl -d bare --no-headings --columns _uuid find
>> Service_Monitor logical_port=lport1)
>> +
>> +# Check az_name presence in options.
>> +AT_CHECK([ovn-sbctl get Service_Monitor ${uuid} options:az-name],
>> +[0], [az_name
>> +])
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]:\"
>> +check_row_count sb:Service_Monitor 0
>> +OVS_WAIT_UNTIL([grep "Empty AZ name specified" northd/ovn-
>> northd.log])
>> +
>> +echo > northd/ovn-northd.log
>> +check ovn-nbctl lb-del lb1
>> +
>> +# Check correct setup of distributed load balancers.
>> +check ovn-nbctl lb-add lb_distubuted [[2001::a]]:80 >> [[2001::3]]:80,[[2002::3]]:80 >> +check ovn-nbctl lr-lb-add lr1 lb_distubuted >> +check ovn-nbctl set load_balancer lb_distubuted >> options:distributed=true >> + >> +# Check that load balancer does not work in a distributed mode - >> there is no ip_port_mappings setting >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), action=(drop;) >> +]) >> + >> +echo > northd/ovn-northd.log >> +check ovn-nbctl set load_balancer . >> ip_port_mappings:\"[[2001::3]]\"=\"lport1\" >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), >> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80);) >> +]) >> + >> +echo > northd/ovn-northd.log >> +check ovn-nbctl set load_balancer . >> ip_port_mappings:\"[[2002::3]]\"=\"lport2\" >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), >> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80,"lport2":[[ >> 2002::3]]:80);) >> +]) >> + >> +echo > northd/ovn-northd.log >> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \ >> +Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" -- add >> Load_Balancer . 
\ >> +health_check @hc >> +OVS_WAIT_UNTIL([grep "Expected ':' separator for:" northd/ovn- >> northd.log]) >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), action=(drop;) >> +]) >> + >> +check ovn-nbctl --wait=sb set load_balancer . >> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]\" >> +ovn-sbctl lflow-list lr1 > lr1_lflow >> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80), >> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80);) >> +]) >> + >> +check ovn-nbctl lb-del lb_distubuted >> + >> OVN_CLEANUP_NORTHD >> AT_CLEANUP >> ]) >> @@ -19251,3 +19402,216 @@ AT_CHECK([grep "lr_in_policy[[^_]]" >> lr0flows | ovn_strip_lflows | sort], [0], [d >> OVN_CLEANUP_NORTHD >> AT_CLEANUP >> ]) >> + >> +OVN_FOR_EACH_NORTHD_NO_HV([ >> +AT_SETUP([Distributed lb: logical-flow test - IPv4/IPv6 case]) >> +ovn_start >> + >> +check ovn-nbctl ls-add outside >> + >> +check ovn-nbctl lsp-add outside outside \ >> + -- lsp-set-addresses outside unknown \ >> + -- lsp-set-type outside localnet >> + >> +check ovn-nbctl --wait=sb set Logical_Switch_Port outside >> tag_request=2 >> + >> +check ovn-nbctl lsp-add outside outside-down \ >> + -- lsp-set-type outside-down router \ >> + -- lsp-set-addresses outside-down router \ >> + -- lsp-set-options outside-down router-port=lr1-up >> + >> +check ovn-nbctl lr-add lr1 \ >> + -- lrp-add lr1 lr1-up 11:11:11:11:11:11 169.254.0.1/24 >> 2001:db8:abcd:0002::bad/64 \ >> + -- lrp-add lr1 lr1-down 12:12:12:12:12:12 192.168.0.1/24 >> 2001:db8:abcd:0001::c0fe/64 >> + >> +check ovn-nbctl ls-add ls1 \ >> + -- lsp-add 
ls1 lport1 \ >> + -- lsp-set-addresses lport1 "13:13:13:13:13:13 192.168.0.101" >> \ >> + -- lsp-add ls1 lport2 \ >> + -- lsp-set-addresses lport2 "14:14:14:14:14:14 192.168.0.102" >> + >> +check ovn-nbctl lsp-add ls1 ls1-up \ >> + -- lsp-set-type ls1-up router \ >> + -- lsp-set-addresses ls1-up router \ >> + -- lsp-set-options ls1-up router-port=lr1-down >> + >> +check ovn-nbctl --wait=sb sync >> + >> +check ovn-nbctl ha-chassis-group-add gateway >> +check ovn-nbctl ha-chassis-group-add-chassis gateway hv1 1 >> +ha_g_uuid=$(fetch_column nb:HA_Chassis_Group _uuid name=gateway) >> +lr1_up_uuid=$(fetch_column nb:Logical_Router_Port _uuid name=lr1-up) >> +check ovn-nbctl set logical_router_port $lr1_up_uuid >> ha_chassis_group=$ha_g_uuid >> + >> +check ovn-nbctl --wait=sb sync >> + >> +check ovn-nbctl lb-add lb1_ipv4 1.1.1.1:80 >> 192.168.0.101:10880,192.168.0.102:10880 >> +check ovn-nbctl set Load_Balancer lb1_ipv4 >> ip_port_mappings:192.168.0.101=lport1:192.168.0.199 >> +check ovn-nbctl set Load_Balancer lb1_ipv4 >> ip_port_mappings:192.168.0.102=lport2:192.168.0.199 >> +check ovn-nbctl lr-lb-add lr1 lb1_ipv4 >> +check ovn-nbctl --wait=sb sync >> + >> +check ovn-nbctl lb-add lb1_ipv6 [[2000::1]]:80 >> [[2001:db8:abcd:1::2]]:10882 >> +check ovn-nbctl set Load_Balancer lb1_ipv6 >> ip_port_mappings:\"[[2001:db8:abcd:1::2]]\"=\"lport1\" >> +check ovn-nbctl lr-lb-add lr1 lb1_ipv6 >> +check ovn-nbctl --wait=sb sync >> + >> +ovn-sbctl lflow-list lr1 > lr1_lflows_before >> +ovn-sbctl lflow-list outside > outside_lflows_before >> + >> +AT_CHECK([cat outside_lflows_before | grep ls_in_l2_lkup | grep >> priority=50 | ovn_strip_lflows], [0], [dnl >> + table=??(ls_in_l2_lkup ), priority=50 , match=(eth.dst == >> 11:11:11:11:11:11 && is_chassis_resident("cr-lr1-up")), >> action=(outport = "outside-down"; output;) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_before | grep lr_in_ip_input | grep >> priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl >> + 
table=??(lr_in_ip_input ), priority=90 , match=(inport == >> "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa == >> 169.254.0.0/24 && is_chassis_resident("cr-lr1-up")), action=(eth.dst >> = eth.src; eth.src = xreg0[[0..47]]; arp.op = 2; /* ARP reply */ >> arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa; >> outport = inport; flags.loopback = 1; output;) >> + table=??(lr_in_ip_input ), priority=90 , match=(ip4.dst == >> 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst >> <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; >> ) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_before | grep lr_in_ip_input | grep >> priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0], >> [dnl >> + table=??(lr_in_ip_input ), priority=90 , match=(inport == >> "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} && >> nd_ns && nd.target == 2001:db8:abcd:2::bad && >> is_chassis_resident("cr-lr1-up")), action=(nd_na_router { eth.src = >> xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport >> = inport; flags.loopback = 1; output; };) >> + table=??(lr_in_ip_input ), priority=90 , match=(ip6.dst == >> {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type == >> 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; >> icmp6.type = 129; flags.loopback = 1; next; ) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_before | grep lr_in_admission | grep >> priority=50 | ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_admission ), priority=50 , match=(eth.dst == >> 11:11:11:11:11:11 && inport == "lr1-up" && is_chassis_resident("cr- >> lr1-up")), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;) >> + table=??(lr_in_admission ), priority=50 , match=(eth.dst == >> 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] = >> 12:12:12:12:12:12; next;) >> + table=??(lr_in_admission ), priority=50 , match=(eth.mcast && >> inport == "lr1-down"), action=(xreg0[[0..47]] = 
12:12:12:12:12:12; >> next;) >> + table=??(lr_in_admission ), priority=50 , match=(eth.mcast && >> inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; >> next;) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_before | grep lr_out_undnat | grep >> priority=120 | ovn_strip_lflows], [0], [dnl >> + table=??(lr_out_undnat ), priority=120 , match=(ip4 && >> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == >> 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport >> == "lr1-up") && is_chassis_resident("cr-lr1-up")), action=(ct_dnat;) >> + table=??(lr_out_undnat ), priority=120 , match=(ip6 && >> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport == >> "lr1-up" || outport == "lr1-up") && is_chassis_resident("cr-lr1- >> up")), action=(ct_dnat;) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_before | grep lr_in_gw_redirect | >> ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_gw_redirect ), priority=0 , match=(1), >> action=(next;) >> + table=??(lr_in_gw_redirect ), priority=200 , match=(ip4 && >> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == >> 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"), >> action=(outport = "cr-lr1-up"; next;) >> + table=??(lr_in_gw_redirect ), priority=200 , match=(ip6 && >> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport == >> "lr1-up"), action=(outport = "cr-lr1-up"; next;) >> + table=??(lr_in_gw_redirect ), priority=50 , match=(outport == >> "lr1-up"), action=(outport = "cr-lr1-up"; next;) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_before | grep lr_in_dnat | grep >> priority=120 | ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")), >> action=(ct_lb_mark(backends=192.168.0.101:10880,192.168.0.102:10880); >> ) >> + table=??(lr_in_dnat ), priority=120 , match=(ct.new && >> !ct.rel && ip6 && ip6.dst == 
2000::1 && reg1[[16..23]] == 6 && >> reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")), >> action=(ct_lb_mark(backends=[[2001:db8:abcd:1::2]]:10882);) >> +]) >> + >> +AT_CHECK([cat outside_lflows_before | grep ls_in_check_port_sec | >> grep priority=75 | ovn_strip_lflows], [0], [dnl >> + table=??(ls_in_check_port_sec), priority=75 , match=((arp.op == >> 1 || arp.op == 2) && inport == "outside"), action=(reg0[[22]] = 1; >> next;) >> +]) >> + >> +AT_CHECK([cat outside_lflows_before | grep ls_in_apply_port_sec | >> grep priority=75 | ovn_strip_lflows], [0], [dnl >> + table=??(ls_in_apply_port_sec), priority=75 , match=(reg0[[22]] >> == 1 && is_chassis_resident("cr-lr1-up")), action=(next;) >> +]) >> + >> +check ovn-nbctl clear logical_router_port $lr1_up_uuid >> ha_chassis_group >> +check ovn-nbctl ha-chassis-group-del gateway >> +check ovn-nbctl ha-chassis-group-add gateway2 >> +check ovn-nbctl ha-chassis-group-add-chassis gateway2 test 1 >> +ha_g_uuid=$(fetch_column nb:HA_Chassis_Group _uuid name=gateway2) >> +lr1_up_uuid=$(fetch_column nb:Logical_Router_Port _uuid name=lr1-up) >> +check ovn-nbctl set logical_router_port $lr1_up_uuid >> ha_chassis_group=$ha_g_uuid >> + >> +check ovn-nbctl set load_balancer lb1_ipv4 options:distributed=true >> +check ovn-nbctl --wait=hv sync >> + >> +ovn-sbctl lflow-list outside > outside_lflows_after >> +ovn-sbctl lflow-list lr1 > lr1_lflows_after >> + >> +AT_CHECK([cat outside_lflows_after | grep ls_in_l2_lkup | grep >> priority=50 | ovn_strip_lflows], [0], [dnl >> + table=??(ls_in_l2_lkup ), priority=50 , match=(eth.dst == >> 11:11:11:11:11:11), action=(outport = "outside-down"; output;) >> +]) >> + >> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep >> priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl >> + table=??(lr_in_ip_input ), priority=90 , match=(inport == >> "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa == >> 169.254.0.0/24), action=(eth.dst = eth.src; eth.src = 
>> xreg0[[0..47]]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
>> +  table=??(lr_in_ip_input ), priority=90 , match=(ip4.dst == 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input ), priority=90 , match=(inport == "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} && nd_ns && nd.target == 2001:db8:abcd:2::bad), action=(nd_na_router { eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
>> +  table=??(lr_in_ip_input ), priority=90 , match=(ip6.dst == {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_admission | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.dst == 11:11:11:11:11:11 && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.dst == 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.mcast && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.mcast && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_out_undnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
>> +  table=??(lr_out_undnat ), priority=120 , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport == "lr1-up" || outport == "lr1-up") && is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_gw_redirect | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_gw_redirect ), priority=0 , match=(1), action=(next;)
>> +  table=??(lr_in_gw_redirect ), priority=200 , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"), action=(outport = "lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect ), priority=200 , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect ), priority=50 , match=(outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_dnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat ), priority=120 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(ct_lb_mark_local(backends="lport1":192.168.0.101:10880,"lport2":192.168.0.102:10880);)
>> +  table=??(lr_in_dnat ), priority=120 , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")), action=(ct_lb_mark(backends=[[2001:db8:abcd:1::2]]:10882);)
>> +])
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_check_port_sec | grep priority=75 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_check_port_sec), priority=75 , match=((arp.op == 1 || arp.op == 2) && inport == "outside"), action=(reg0[[22]] = 1; next;)
>> +])
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_apply_port_sec | grep priority=75 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_apply_port_sec), priority=75 , match=(reg0[[22]] == 1), action=(next;)
>> +])
>> +
>> +check ovn-nbctl set load_balancer lb1_ipv6 options:distributed=true
>> +check ovn-nbctl --wait=hv sync
>> +
>> +ovn-sbctl lflow-list outside > outside_lflows_after
>> +ovn-sbctl lflow-list lr1 > lr1_lflows_after
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_out_undnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
>> +  table=??(lr_out_undnat ), priority=120 , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport == "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_gw_redirect | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_gw_redirect ), priority=0 , match=(1), action=(next;)
>> +  table=??(lr_in_gw_redirect ), priority=200 , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"), action=(outport = "lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect ), priority=200 , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport == "lr1-up"), action=(outport = "lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect ), priority=50 , match=(outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_dnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat ), priority=120 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(ct_lb_mark_local(backends="lport1":192.168.0.101:10880,"lport2":192.168.0.102:10880);)
>> +  table=??(lr_in_dnat ), priority=120 , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(ct_lb_mark_local(backends="lport1":[[2001:db8:abcd:1::2]]:10882);)
>> +])
>> +
>> +check ovn-nbctl set load_balancer lb1_ipv6 options:distributed=false
>> +check ovn-nbctl --wait=hv sync
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_l2_lkup | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_l2_lkup ), priority=50 , match=(eth.dst == 11:11:11:11:11:11), action=(outport = "outside-down"; output;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input ), priority=90 , match=(inport == "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa == 169.254.0.0/24), action=(eth.dst = eth.src; eth.src = xreg0[[0..47]]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
>> +  table=??(lr_in_ip_input ), priority=90 , match=(ip4.dst == 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input ), priority=90 , match=(inport == "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} && nd_ns && nd.target == 2001:db8:abcd:2::bad), action=(nd_na_router { eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
>> +  table=??(lr_in_ip_input ), priority=90 , match=(ip6.dst == {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_admission | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.dst == 11:11:11:11:11:11 && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.dst == 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.mcast && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission ), priority=50 , match=(eth.mcast && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +])
>> +
>> +OVN_CLEANUP_NORTHD
>> +AT_CLEANUP
>> +])
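As a side note, for anyone who wants to sanity-check the request-distribution pipeline from the cover letter without a live VIP, here is a minimal stand-in. It is only a sketch: the `echo` lines replace real `curl http://1.1.1.1:80` replies, and a plain round-robin stands in for the fabric's ECMP hashing, so the counts come out exactly even rather than approximately even.

```shell
# Stand-in for the cover-letter check: each echo replaces one curl
# reply from a backend behind the VIP. A simple round-robin over four
# backends replaces ECMP hashing, so each backend is hit exactly
# 4000 / 4 = 1000 times.
for i in $(seq 4000); do
    echo "backend $(( (i % 4) + 1 ))"
done | sort | uniq -c
```

Against a real deployment you would keep `curl` in place of `echo`, as in the cover letter, and expect the counts to be only approximately equal, since ECMP hashes the 5-tuple rather than rotating per request.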
-- 
regards,
Alexandra.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
