On 22.01.2026 00:17, [email protected] wrote:
> On Mon, 2026-01-19 at 12:23 +0300, Alexandra Rukomoinikova wrote:
>> 1) Added a new "distributed" option for load balancers.
>>     With this option, balancers work in a distributed fashion across
>> compute nodes,
>>     balancing only to local backends (eliminating east-west traffic)
>>
>> 2) If a load balancer is running on a router with a DGP, the router
>> will no
>>     longer be centralized on the gateway - this means that access to
>> the physical network will also be available from hosts where the
>> distributed balancer backends are located.
>>
>> 3) Configuration requirements for distributed load balancers:
>>      1) ip_port_mapping must be specified
>>      2) ECMP balancing must be configured in the underlay fabric
>>         between hosts with backends
>>
>> Example:
>> Load Balancer: lb1 with VIP 1.1.1.1 and the distributed option enabled.
>> Fabric is configured with a static ECMP route for 1.1.1.1/32:
>>      nexthop via ip_host1 weight 1 (hosts backend1)
>>      nexthop via ip_host2 weight 1 (hosts backend2)
>>      nexthop via ip_host3 weight 2 (hosts backend3 and backend4)
>>
>> As part of testing, the following distribution of requests across
>> the backends was obtained:
>> for i in $(seq 5000); do curl http://1.1.1.1:80 2>/dev/null ; echo ;
>> done | awk '{print $2}' | sort | uniq -c
>>     1265 “backend 4",
>>     1260 “backend 3",
>>     1224 “backend 2",
>>     1251 “backend 1",
>> Thus, with ECMP balancing, requests are distributed approximately
>> evenly across the backends.
>>
>> Suggested-by: Vladislav Odintsov <[email protected]>
>> Signed-off-by: Alexandra Rukomoinikova <[email protected]>
>> ---
>>   v3 --> v4: 1) fixed ARP receiving for distributed load balancing
>> since merging 01277be6c833ef1d1c3fb568c139d3f0ac2b363a
>>              2) removed system tests because they turned out to be
>> useless; added new multinode tests for the use case
>>              3) updated ovn-sb doc for the new action
>>              4) updated doc about the distributed option
>>              5) rebased on
>> https://github.com/ovn-org/ovn-fake-multinode/pull/114
> Hi Alexandra,
> thank you for the v4. The multinode tests convey the topology much
> better. Overall I'd say it LGTM, though I'm not super confident in the
> area of incremental processing, so an additional +1 from someone else
> would be nice. I'll leave some in-line comments.
Hi Martin! Thank you for your time. I'll reply below.
>
> I also tried the topology that we discussed in the last version[0]
> (Distributed LR connected to fabric via GW routers on each chassis) and
> I found that the load-balancing gets concentrated only on the chassis
> that hosts chassis-redirect port for the DGP.
> I feel like we should be able to avoid sending the traffic of
> distributed load balancer to the CR port, I just haven't found the
> right tweak to do so. What do you think?
>
> NOTE: I don't think that this scenario needs to be supported for this
> patch to get accepted, but it would be nice if it did, because AFAIK
> this is (more or less) the topology of Openstack+OVN BGP
> implementation.
>
> [0] https://gist.github.com/mkalcok/7bc9af3f79969067bc1cea01841f6f47
>   
Yes, I'll fix this feature for this topology in a separate patch series. 
Overall, not sending traffic to the gateway seems like the right
approach.
I'll think about it again and submit a patch.
>> ---
>>   NEWS                      |   3 +
>>   northd/en-lb-data.c       |   9 +
>>   northd/en-lb-data.h       |   3 +
>>   northd/en-lr-stateful.c   |   3 +
>>   northd/en-lr-stateful.h   |   2 +
>>   northd/lb.c               |  91 ++++++----
>>   northd/lb.h               |   7 +-
>>   northd/northd.c           | 197 ++++++++++++++-------
>>   northd/northd.h           |  17 ++
>>   ovn-nb.xml                |  16 +-
>>   ovn-sb.xml                |  11 ++
>>   tests/multinode-macros.at |  14 ++
>>   tests/multinode.at        | 159 +++++++++++++++++
>>   tests/ovn-northd.at       | 364
>> ++++++++++++++++++++++++++++++++++++++
>>   14 files changed, 791 insertions(+), 105 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 9883fb81d..4d49b94f1 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -82,6 +82,9 @@ Post v25.09.0
>>        other_config column.
>>      - Introduce the capability to specify multiple ips for ovn-evpn-
>> local-ip
>>        option.
>> +  - Add "distributed" option for load balancer, that forces traffic
>> to be
>> +    routed only to backend instances running locally on the same
>> chassis
>> +    it arrives on.
>>   
>>   OVN v25.09.0 - xxx xx xxxx
>>   --------------------------
>> diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c
>> index 6d52d465e..f67ab9c5f 100644
>> --- a/northd/en-lb-data.c
>> +++ b/northd/en-lb-data.c
>> @@ -166,6 +166,7 @@ lb_data_load_balancer_handler(struct engine_node
>> *node, void *data)
>>               add_crupdated_lb_to_tracked_data(lb, trk_lb_data,
>>                                                lb->health_checks);
>>               trk_lb_data->has_routable_lb |= lb->routable;
>> +            trk_lb_data->has_distributed_lb |= lb->is_distributed;
>>               continue;
>>           }
>>   
>> @@ -180,6 +181,7 @@ lb_data_load_balancer_handler(struct engine_node
>> *node, void *data)
>>               add_deleted_lb_to_tracked_data(lb, trk_lb_data,
>>                                              lb->health_checks);
>>               trk_lb_data->has_routable_lb |= lb->routable;
>> +            trk_lb_data->has_distributed_lb |= lb->is_distributed;
>>           } else {
>>               /* Load balancer updated. */
>>               bool health_checks = lb->health_checks;
>> @@ -189,11 +191,13 @@ lb_data_load_balancer_handler(struct
>> engine_node *node, void *data)
>>               sset_swap(&lb->ips_v6, &old_ips_v6);
>>               enum lb_neighbor_responder_mode neigh_mode = lb-
>>> neigh_mode;
>>               bool routable = lb->routable;
>> +            bool distributed_mode = lb->is_distributed;
>>               ovn_northd_lb_reinit(lb, tracked_lb);
>>               health_checks |= lb->health_checks;
>>               struct crupdated_lb *clb =
>> add_crupdated_lb_to_tracked_data(
>>                   lb, trk_lb_data, health_checks);
>>               trk_lb_data->has_routable_lb |= lb->routable;
>> +            trk_lb_data->has_distributed_lb |= lb->is_distributed;
>>   
>>               /* Determine the inserted and deleted vips and store
>> them in
>>                * the tracked data. */
>> @@ -226,6 +230,10 @@ lb_data_load_balancer_handler(struct engine_node
>> *node, void *data)
>>                   /* If neigh_mode is updated trigger a full
>> recompute. */
>>                   return EN_UNHANDLED;
>>               }
>> +            if (distributed_mode != lb->is_distributed) {
>> +                /* If distributed mode is updated trigger a full
>> recompute. */
>> +                return EN_UNHANDLED;
>> +            }
>>           }
>>       }
>>   
>> @@ -687,6 +695,7 @@ handle_od_lb_changes(struct nbrec_load_balancer
>> **nbrec_lbs,
>>                                                                
>> lb_uuid);
>>                   ovs_assert(lb);
>>                   trk_lb_data->has_routable_lb |= lb->routable;
>> +                trk_lb_data->has_distributed_lb |= lb-
>>> is_distributed;
>>               }
>>           }
>>   
>> diff --git a/northd/en-lb-data.h b/northd/en-lb-data.h
>> index 1da087656..90e85b8c4 100644
>> --- a/northd/en-lb-data.h
>> +++ b/northd/en-lb-data.h
>> @@ -82,6 +82,9 @@ struct tracked_lb_data {
>>   
>>       /* Indicates if any lb (in the tracked data) has 'routable' flag
>> set. */
>>       bool has_routable_lb;
>> +
>> +    /* Indicates if any lb (in the tracked data) has 'distributed'
>> flag set. */
>> +    bool has_distributed_lb;
>>   };
>>   
>>   /* Datapath (logical switch) to lb/lbgrp association data. */
>> diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c
>> index 212c0641c..5a738f4c3 100644
>> --- a/northd/en-lr-stateful.c
>> +++ b/northd/en-lr-stateful.c
>> @@ -326,6 +326,7 @@ lr_stateful_lb_data_handler(struct engine_node
>> *node, void *data_)
>>                   ovn_datapaths_find_by_index(input_data.lr_datapaths,
>>                                               lr_stateful_rec-
>>> lr_index);
>>               lr_stateful_rec->has_lb_vip = od_has_lb_vip(od);
>> +            lr_stateful_rec->has_distributed_lb = od-
>>> is_distributed;
>>           }
>>   
>>           return EN_HANDLED_UPDATED;
>> @@ -527,7 +528,9 @@ lr_stateful_record_create(struct
>> lr_stateful_table *table,
>>       if (nbr->n_nat) {
>>           lr_stateful_rebuild_vip_nats(lr_stateful_rec);
>>       }
>> +
>>       lr_stateful_rec->has_lb_vip = od_has_lb_vip(od);
>> +    lr_stateful_rec->has_distributed_lb = od->is_distributed;
>>   
>>       hmap_insert(&table->entries, &lr_stateful_rec->key_node,
>>                   uuid_hash(&lr_stateful_rec->nbr_uuid));
>> diff --git a/northd/en-lr-stateful.h b/northd/en-lr-stateful.h
>> index 146f768c3..3b0c54521 100644
>> --- a/northd/en-lr-stateful.h
>> +++ b/northd/en-lr-stateful.h
>> @@ -59,6 +59,8 @@ struct lr_stateful_record {
>>   
>>       bool has_lb_vip;
>>   
>> +    bool has_distributed_lb;
>> +
>>       /* Load Balancer vIPs relevant for this datapath. */
>>       struct ovn_lb_ip_set *lb_ips;
>>   
>> diff --git a/northd/lb.c b/northd/lb.c
>> index 0822bc2d7..bfb255ccd 100644
>> --- a/northd/lb.c
>> +++ b/northd/lb.c
>> @@ -85,12 +85,12 @@ ovn_lb_ip_set_clone(struct ovn_lb_ip_set
>> *lb_ip_set)
>>       return clone;
>>   }
>>   
>> -static
>> -void ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb,
>> -                            const struct ovn_lb_vip *lb_vip,
>> -                            const struct nbrec_load_balancer
>> *nbrec_lb,
>> -                            const char *vip_port_str, const char
>> *backend_ips,
>> -                            bool template)
>> +static void
>> +ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb,
>> +                       const struct ovn_lb_vip *lb_vip,
>> +                       const struct nbrec_load_balancer *nbrec_lb,
>> +                       const char *vip_port_str, const char
>> *backend_ips,
>> +                       bool template)
>>   {
>>       lb_vip_nb->backend_ips = xstrdup(backend_ips);
>>       lb_vip_nb->n_backends = vector_len(&lb_vip->backends);
>> @@ -101,19 +101,22 @@ void ovn_northd_lb_vip_init(struct
>> ovn_northd_lb_vip *lb_vip_nb,
>>   }
>>   
>>   /*
>> - * Initializes health check configuration for load balancer VIP
>> - * backends. Parses the ip_port_mappings in the format :
>> - * "ip:logical_port:src_ip[:az_name]".
>> + * Parses ip_port_mappings in the format:
>> + * "ip:logical_port[:src_ip][:az_name]".
>> + * The src_ip parameter is optional when distributed mode is
>> + * enabled and health checks are not configured.
>>    * If az_name is present and non-empty, it indicates this is a
>>    * remote service monitor (backend is in another availability zone),
>> it should be propagated to another AZ by interconnection
>> processing.
>>    */
>>   static void
>> -ovn_lb_vip_backends_health_check_init(const struct ovn_northd_lb
>> *lb,
>> -                                      const struct ovn_lb_vip
>> *lb_vip,
>> -                                      struct ovn_northd_lb_vip
>> *lb_vip_nb)
>> +ovn_lb_vip_backends_ip_port_mappings_init(const struct ovn_northd_lb
>> *lb,
>> +                                          const struct ovn_lb_vip
>> *lb_vip,
>> +                                          struct ovn_northd_lb_vip
>> *lb_vip_nb)
>>   {
>>       struct ds key = DS_EMPTY_INITIALIZER;
>> +    bool allow_without_src_ip = lb->is_distributed
>> +                                && !lb_vip_nb->lb_health_check;
>>   
>>       for (size_t j = 0; j < vector_len(&lb_vip->backends); j++) {
>>           const struct ovn_lb_backend *backend =
>> @@ -127,26 +130,34 @@ ovn_lb_vip_backends_health_check_init(const
>> struct ovn_northd_lb *lb,
>>               continue;
>>           }
>>   
>> -        char *svc_mon_src_ip = NULL;
>> -        char *az_name = NULL;
>> +        struct ovn_northd_lb_backend *backend_nb = NULL;
>> +        char *port_name = NULL, *az_name = NULL, *first_colon =
>> NULL;
>> +        char *svc_mon_src_ip = NULL, *src_ip = NULL;
>>           bool is_remote = false;
>> -        char *port_name = xstrdup(s);
>> -        char *src_ip = NULL;
>>   
>> -        char *first_colon = strchr(port_name, ':');
>> -        if (!first_colon) {
>> -            free(port_name);
>> -            continue;
>> +        port_name = xstrdup(s);
>> +        first_colon = strchr(port_name, ':');
>> +
>> +        if (!first_colon && allow_without_src_ip) {
>> +            if (!*port_name) {
>> +                VLOG_WARN("Empty port name in distributed mode for
>> IP %s",
>> +                          ds_cstr(&key));
>> +                goto cleanup;
>> +            }
>> +            is_remote = false;
>> +            goto init_backend;
>> +        } else if (!first_colon) {
>> +            VLOG_WARN("Expected ':' separator for: %s", port_name);
>> +            goto cleanup;
>>           }
>> -        *first_colon = '\0';
>>   
>> +        *first_colon = '\0';
>>           if (first_colon[1] == '[') {
>>               /* IPv6 case - format: port:[ipv6]:az or port:[ipv6] */
>>               char *ip_end = strchr(first_colon + 2, ']');
>>               if (!ip_end) {
>>                   VLOG_WARN("Malformed IPv6 address in backend %s",
>> s);
>> -                free(port_name);
>> -                continue;
>> +                goto cleanup;
>>               }
>>   
>>               src_ip = first_colon + 2;
>> @@ -157,8 +168,7 @@ ovn_lb_vip_backends_health_check_init(const
>> struct ovn_northd_lb *lb,
>>                   if (!*az_name) {
>>                       VLOG_WARN("Empty AZ name specified for backend
>> %s",
>>                                 port_name);
>> -                    free(port_name);
>> -                    continue;
>> +                    goto cleanup;
>>                   }
>>                   is_remote = true;
>>               }
>> @@ -172,31 +182,31 @@ ovn_lb_vip_backends_health_check_init(const
>> struct ovn_northd_lb *lb,
>>                   if (!*az_name) {
>>                       VLOG_WARN("Empty AZ name specified for backend
>> %s",
>>                                 port_name);
>> -                    free(port_name);
>> -                    continue;
>> +                    goto cleanup;
>>                   }
>> -            is_remote = true;
>> +                is_remote = true;
>>               }
>>           }
>>   
>>           struct sockaddr_storage svc_mon_src_addr;
>>           if (!src_ip || !inet_parse_address(src_ip,
>> &svc_mon_src_addr)) {
>>               VLOG_WARN("Invalid svc mon src IP %s", src_ip ? src_ip :
>> "NULL");
>> +            goto cleanup;
>>           } else {
>>               struct ds src_ip_s = DS_EMPTY_INITIALIZER;
>>               ss_format_address_nobracks(&svc_mon_src_addr,
>> &src_ip_s);
>>               svc_mon_src_ip = ds_steal_cstr(&src_ip_s);
>>           }
>>   
>> -        if (svc_mon_src_ip) {
>> -            struct ovn_northd_lb_backend *backend_nb =
>> -                &lb_vip_nb->backends_nb[j];
>> -            backend_nb->health_check = true;
>> -            backend_nb->logical_port = xstrdup(port_name);
>> -            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
>> -            backend_nb->az_name = is_remote ? xstrdup(az_name) :
>> NULL;
>> -            backend_nb->remote_backend = is_remote;
>> -        }
>> +init_backend:
>> +        backend_nb = &lb_vip_nb->backends_nb[j];
>> +        backend_nb->health_check = lb_vip_nb->lb_health_check;
>> +        backend_nb->logical_port = xstrdup(port_name);
>> +        backend_nb->svc_mon_src_ip = svc_mon_src_ip;
>> +        backend_nb->az_name = is_remote ? xstrdup(az_name) : NULL;
>> +        backend_nb->remote_backend = is_remote;
>> +        backend_nb->distributed_backend = lb->is_distributed;
>> +cleanup:
>>           free(port_name);
>>       }
>>   
>> @@ -367,6 +377,9 @@ ovn_northd_lb_init(struct ovn_northd_lb *lb,
>>           lb->hairpin_snat_ip = xstrdup(snat_ip);
>>       }
>>   
>> +    lb->is_distributed = smap_get_bool(&nbrec_lb->options,
>> "distributed",
>> +                                       false);
>> +
>>       sset_init(&lb->ips_v4);
>>       sset_init(&lb->ips_v6);
>>       struct smap_node *node;
>> @@ -406,8 +419,8 @@ ovn_northd_lb_init(struct ovn_northd_lb *lb,
>>           }
>>           n_vips++;
>>   
>> -        if (lb_vip_nb->lb_health_check) {
>> -            ovn_lb_vip_backends_health_check_init(lb, lb_vip,
>> lb_vip_nb);
>> +        if (lb_vip_nb->lb_health_check || lb->is_distributed) {
>> +            ovn_lb_vip_backends_ip_port_mappings_init(lb, lb_vip,
>> lb_vip_nb);
>>           }
>>       }
>>   
>> diff --git a/northd/lb.h b/northd/lb.h
>> index 53dc4abf0..0f2555a1e 100644
>> --- a/northd/lb.h
>> +++ b/northd/lb.h
>> @@ -74,8 +74,12 @@ struct ovn_northd_lb {
>>       /* Indicates if the load balancer has health checks configured.
>> */
>>       bool health_checks;
>>   
>> -    char *hairpin_snat_ip;
>> +    /* Indicates if distributed option is enabled for load balancer.
>> */
>> +    bool is_distributed;
>> +
>>       bool use_stateless_nat;
>> +
>> +    char *hairpin_snat_ip;
>>   };
>>   
>>   /* ovn-northd specific backend information. */
>> @@ -91,6 +95,7 @@ struct ovn_northd_lb_backend {
>>       bool health_check;
>>        /* Set to true if port does not locate in local AZ. */
>>       bool remote_backend;
>> +    bool distributed_backend;
>>       /* Logical port to which the ip belong to. */
>>       char *logical_port;
>>       /* Source IP address to be used for service monitoring. */
>> diff --git a/northd/northd.c b/northd/northd.c
>> index 2d01a01d9..a8c7df7a6 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -565,6 +565,7 @@ ovn_datapath_create(struct hmap *datapaths, const
>> struct uuid *key,
>>       od->localnet_ports = VECTOR_EMPTY_INITIALIZER(struct ovn_port
>> *);
>>       od->lb_with_stateless_mode = false;
>>       od->ipam_info_initialized = false;
>> +    od->is_distributed = false;
>>       od->tunnel_key = sdp->sb_dp->tunnel_key;
>>       init_mcast_info_for_datapath(od);
>>       return od;
>> @@ -3315,6 +3316,46 @@ ovn_lb_svc_create(struct ovsdb_idl_txn
>> *ovnsb_txn,
>>       }
>>   }
>>   
>> +static bool
>> +backend_is_available(const struct ovn_northd_lb *lb,
>> +                     const struct ovn_lb_backend *backend,
>> +                     const struct ovn_northd_lb_backend *backend_nb,
>> +                     const struct svc_monitors_map_data
>> *svc_mons_data)
>> +{
>> +    struct service_monitor_info *mon_info =
>> +        get_service_mon(svc_mons_data->local_svc_monitors_map,
>> +                        svc_mons_data->ic_learned_svc_monitors_map,
>> +                        backend->ip_str,
>> +                        backend_nb->logical_port,
>> +                        backend->port,
>> +                        lb->nlb->protocol && lb->nlb->protocol[0] ?
>> +                        lb->nlb->protocol : "tcp");
>> +
>> +    if (!mon_info) {
>> +        return false;
>> +    }
>> +
>> +    ovs_assert(mon_info->sbrec_mon);
>> +
>> +    return  mon_info->sbrec_mon->status &&
>> +            strcmp(mon_info->sbrec_mon->status, "online") ?
>> +            false : true;
> Wouldn't this function return `true` if `mon_info->sbrec_mon->status`
> is NULL? Maybe I'm just misunderstanding the way this ternary
> operation is evaluated; perhaps additional parentheses would help here
> to better draw the individual "groups" of conditions that are
> evaluated.
>
> nit: There are two spaces between "return" and "mon_info->"
The problem here is my choice of name for this function; its name says
it checks whether the backend is available, but it returns the exact
opposite binary value. I'll fix that.
>> +}
>> +
>> +static inline void
>> +append_lb_backend_to_action(const struct ovn_lb_backend *backend,
>> +                            const struct ovn_northd_lb_backend
>> *backend_nb,
>> +                            bool distributed_mode,
>> +                            struct ds *action)
>> +{
>> +    bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip);
>> +    if (distributed_mode) {
>> +        ds_put_format(action, "\"%s\":", backend_nb->logical_port);
>> +    }
>> +    ds_put_format(action, ipv6 ? "[%s]:%"PRIu16"," :
>> "%s:%"PRIu16",",
>> +                  backend->ip_str, backend->port);
>> +}
>> +
>>   static bool
>>   build_lb_vip_actions(const struct ovn_northd_lb *lb,
>>                        const struct ovn_lb_vip *lb_vip,
>> @@ -3340,9 +3381,11 @@ build_lb_vip_actions(const struct
>> ovn_northd_lb *lb,
>>           }
>>       }
>>   
>> -    if (lb_vip_nb->lb_health_check) {
>> -        ds_put_cstr(action, "ct_lb_mark(backends=");
>> +    ds_put_format(action, "%s", lb->is_distributed
>> +                  ? "ct_lb_mark_local(backends="
>> +                  : "ct_lb_mark(backends=");
>>   
>> +    if (lb_vip_nb->lb_health_check || lb->is_distributed) {
>>           size_t i = 0;
>>           size_t n_active_backends = 0;
>>           const struct ovn_lb_backend *backend;
>> @@ -3350,45 +3393,37 @@ build_lb_vip_actions(const struct
>> ovn_northd_lb *lb,
>>               struct ovn_northd_lb_backend *backend_nb =
>>                   &lb_vip_nb->backends_nb[i++];
>>   
>> -            if (!backend_nb->health_check) {
>> +            if (lb_vip_nb->lb_health_check &&
>> +                !backend_nb->health_check) {
> nit: This condition is a bit confusing to me. Could you please add a
> short comment that explains when this situation can happen?
This can happen when the load balancer itself is distributed, but
ip_port_mappings is not defined for this backend. I really don't like
these two ifs, but avoiding them would require restructuring the code
in northd/lb.c, and I don't yet see how to make that look good.
>>                   continue;
>>               }
>>   
>> -            const char *protocol = lb->nlb->protocol;
>> -            if (!protocol || !protocol[0]) {
>> -                protocol = "tcp";
>> -            }
>> -
>> -            struct service_monitor_info *mon_info =
>> -                get_service_mon(svc_mons_data-
>>> local_svc_monitors_map,
>> -                                svc_mons_data-
>>> ic_learned_svc_monitors_map,
>> -                                backend->ip_str,
>> -                                backend_nb->logical_port,
>> -                                backend->port,
>> -                                protocol);
>> -
>> -            if (!mon_info) {
>> +            if (lb->is_distributed &&
>> +                !backend_nb->distributed_backend) {
>>                   continue;
>>               }
>>   
>> -            ovs_assert(mon_info->sbrec_mon);
>> -            if (mon_info->sbrec_mon->status &&
>> -                    strcmp(mon_info->sbrec_mon->status, "online")) {
>> +
>> +            if (backend_nb->health_check &&
>> +                !backend_is_available(lb,
>> +                                      backend,
>> +                                      backend_nb,
>> +                                      svc_mons_data)) {
>>                   continue;
>>               }
>>   
>>               n_active_backends++;
>> -            bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip);
>> -            ds_put_format(action, ipv6 ? "[%s]:%"PRIu16"," :
>> "%s:%"PRIu16",",
>> -                          backend->ip_str, backend->port);
>> +            append_lb_backend_to_action(backend,
>> +                                        backend_nb,
>> +                                        backend_nb-
>>> distributed_backend,
>> +                                        action);
>>           }
>>           ds_chomp(action, ',');
>>   
>>           drop = !n_active_backends && !lb_vip->empty_backend_rej;
>>           reject = !n_active_backends && lb_vip->empty_backend_rej;
>>       } else {
>> -        ds_put_format(action, "ct_lb_mark(backends=%s",
>> -                      lb_vip_nb->backend_ips);
>> +        ds_put_format(action, "%s", lb_vip_nb->backend_ips);
>>       }
>>   
>>       if (reject) {
>> @@ -3425,6 +3460,20 @@ build_lb_vip_actions(const struct
>> ovn_northd_lb *lb,
>>       return reject;
>>   }
>>   
>> +static inline void
>> +handle_od_lb_datapath_modes(struct ovn_datapath *od,
>> +                            struct ovn_lb_datapaths *lb_dps,
>> +                            bool od_is_switch)
>> +{
> nit/question: Couldn't you use the NULL value of either `od->nbr` or
> `od->nbs` to determine if the OD is a switch or router? Instead of
> using an explicit function argument?
Ack, I will fix.
>> +    if (od_is_switch && od->lb_with_stateless_mode) {
>> +        hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od);
>> +    }
>> +
>> +    if (!od_is_switch && lb_dps->lb->is_distributed) {
>> +        od->is_distributed = true;
>> +    }
>> +}
>> +
>>   static void
>>   build_lb_datapaths(const struct hmap *lbs, const struct hmap
>> *lb_groups,
>>                      struct ovn_datapaths *ls_datapaths,
>> @@ -3467,9 +3516,7 @@ build_lb_datapaths(const struct hmap *lbs,
>> const struct hmap *lb_groups,
>>               lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
>> lb_uuid);
>>               ovs_assert(lb_dps);
>>               ovn_lb_datapaths_add_ls(lb_dps, 1, &od,
>> ods_size(ls_datapaths));
>> -            if (od->lb_with_stateless_mode) {
>> -                hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od);
>> -            }
>> +            handle_od_lb_datapath_modes(od, lb_dps, true);
>>           }
>>   
>>           for (size_t i = 0; i < od->nbs->n_load_balancer_group; i++)
>> {
>> @@ -3503,6 +3550,7 @@ build_lb_datapaths(const struct hmap *lbs,
>> const struct hmap *lb_groups,
>>               lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
>> lb_uuid);
>>               ovs_assert(lb_dps);
>>               ovn_lb_datapaths_add_lr(lb_dps, 1, &od,
>> ods_size(lr_datapaths));
>> +            handle_od_lb_datapath_modes(od, lb_dps, false);
>>           }
>>       }
>>   
>> @@ -3855,6 +3903,7 @@ sync_pb_for_lrp(struct ovn_port *op,
>>           smap_add(&new, "distributed-port", op->primary_port->key);
>>   
>>           bool always_redirect =
>> +            !lr_stateful_rec->has_distributed_lb &&
>>               !lr_stateful_rec->lrnat_rec->has_distributed_nat &&
>>               !l3dgw_port_has_associated_vtep_lports(op-
>>> primary_port);
>>   
>> @@ -5444,10 +5493,7 @@ northd_handle_lb_data_changes(struct
>> tracked_lb_data *trk_lb_data,
>>               lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
>> &uuidnode->uuid);
>>               ovs_assert(lb_dps);
>>               ovn_lb_datapaths_add_ls(lb_dps, 1, &od,
>> ods_size(ls_datapaths));
>> -
>> -            if (od->lb_with_stateless_mode) {
>> -                hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od);
>> -            }
>> +            handle_od_lb_datapath_modes(od, lb_dps, true);
>>   
>>               /* Add the lb to the northd tracked data. */
>>               hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps);
>> @@ -5486,6 +5532,7 @@ northd_handle_lb_data_changes(struct
>> tracked_lb_data *trk_lb_data,
>>               lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
>> &uuidnode->uuid);
>>               ovs_assert(lb_dps);
>>               ovn_lb_datapaths_add_lr(lb_dps, 1, &od,
>> ods_size(lr_datapaths));
>> +            handle_od_lb_datapath_modes(od, lb_dps, false);
>>   
>>               /* Add the lb to the northd tracked data. */
>>               hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps);
>> @@ -9689,22 +9736,15 @@ build_lswitch_arp_chassis_resident(const
>> struct ovn_datapath *od,
>>   {
>>       struct sset distributed_nat_ports =
>>           SSET_INITIALIZER(&distributed_nat_ports);
>> -    struct sset resident_ports = SSET_INITIALIZER(&resident_ports);
>> -    struct sset inports = SSET_INITIALIZER(&inports);
>> +    struct hmapx resident_ports =
>> HMAPX_INITIALIZER(&resident_ports);
>>       struct ds match = DS_EMPTY_INITIALIZER;
>>   
>> -    struct hmapx_node *node;
>> -    HMAPX_FOR_EACH (node, &od->phys_ports) {
>> -        struct ovn_port *op = node->data;
>> -        sset_add(&inports, op->json_key);
>> -    }
>> -
>>       struct ovn_port *op;
>>       VECTOR_FOR_EACH (&od->router_ports, op) {
>>           struct ovn_port *op_r = op->peer;
>>   
>>           if (lrp_is_l3dgw(op_r)) {
>> -            sset_add(&resident_ports, op_r->cr_port->json_key);
>> +            hmapx_add(&resident_ports, op_r);
>>           }
>>       }
>>   
>> @@ -9720,23 +9760,32 @@ build_lswitch_arp_chassis_resident(const
>> struct ovn_datapath *od,
>>           }
>>       }
>>   
>> -    if (!sset_is_empty(&inports) && !sset_is_empty(&resident_ports))
>> {
>> +    if (!hmapx_is_empty(&od->phys_ports) &&
>> !hmapx_is_empty(&resident_ports)) {
>> +        struct hmapx_node *node;
>>           const char *port_name;
>>   
>> -        SSET_FOR_EACH (port_name, &inports) {
>> +        HMAPX_FOR_EACH (node, &od->phys_ports) {
>> +            op = node->data;
>> +
>>               ds_clear(&match);
>> -            ds_put_format(&match, "arp.op == 1 && inport == %s",
>> -                          port_name);
>> +            ds_put_format(&match, "(arp.op == 1 || arp.op == 2) "
>> +                                  "&& inport == %s",
>> +                          op->json_key);
>>               ovn_lflow_add(lflows, od, S_SWITCH_IN_CHECK_PORT_SEC,
>> 75,
>>                             ds_cstr(&match), REGBIT_EXT_ARP " = 1;
>> next;",
>>                             ar->lflow_ref);
>>           }
>>   
>> -        SSET_FOR_EACH (port_name, &resident_ports) {
>> +        HMAPX_FOR_EACH (node, &resident_ports) {
>> +            op = node->data;
>> +
>>               ds_clear(&match);
>> -            ds_put_format(&match, REGBIT_EXT_ARP" == 1 "
>> -                                  "&& is_chassis_resident(%s)",
>> -                          port_name);
>> +            ds_put_format(&match, REGBIT_EXT_ARP" == 1");
>> +            if (od_is_centralized(op->od)) {
>> +                ds_put_format(&match, " && is_chassis_resident(%s)",
>> +                              op
>> +                              ->cr_port->json_key);
> nit: the "->cr_port" probably doesn't need to be on its own line.
>
>> +            }
>>               ovn_lflow_add(lflows, od, S_SWITCH_IN_APPLY_PORT_SEC,
>> 75,
>>                             ds_cstr(&match), "next;", ar->lflow_ref);
>>           }
>> @@ -9755,8 +9804,7 @@ build_lswitch_arp_chassis_resident(const struct
>> ovn_datapath *od,
>>       }
>>   
>>       sset_destroy(&distributed_nat_ports);
>> -    sset_destroy(&resident_ports);
>> -    sset_destroy(&inports);
>> +    hmapx_destroy(&resident_ports);
>>       ds_destroy(&match);
>>   }
>>   
>> @@ -10769,8 +10817,13 @@ build_lswitch_ip_unicast_lookup(struct
>> ovn_port *op,
>>                            : debug_drop_action();
>>   
>>       if (lsp_is_router(op->nbsp) && op->peer && op->peer->nbrp) {
>> +        /* Distributed gateway ports default to centralized mode.
>> +         * They operate in distributed mode only when configured
>> +         * on their bound router. */
>> +        bool peer_lrp_is_centralized = od_is_centralized(op->peer-
>>> od);
>> +
>>           /* For ports connected to logical routers add flows to
>> bypass the
>> -         * broadcast flooding of ARP/ND requests in table 19. We
>> direct the
>> +         * broadcast flooding of ARP/ND requests in table 22. We
>> direct the
>>            * requests only to the router port that owns the IP
>> address.
>>            */
>>           build_lswitch_rport_arp_req_flows(op->peer, op->od, op,
>> lflows,
>> @@ -10785,7 +10838,9 @@ build_lswitch_ip_unicast_lookup(struct
>> ovn_port *op,
>>               ds_put_format(match, "eth.dst == %s", op->peer-
>>> lrp_networks.ea_s);
>>           }
>>   
>> -        if (!vector_is_empty(&op->peer->od->l3dgw_ports) &&
>> +
>> +        if (peer_lrp_is_centralized &&
>> +            !vector_is_empty(&op->peer->od->l3dgw_ports) &&
>>               !vector_is_empty(&op->od->localnet_ports)) {
>>               add_lrp_chassis_resident_check(op->peer, match);
>>           } else if (op->cr_port) {
>> @@ -12635,6 +12690,13 @@ build_distr_lrouter_nat_flows_for_lb(struct
>> lrouter_nat_lb_flows_ctx *ctx,
>>       size_t new_match_len = ctx->new_match->length;
>>       size_t undnat_match_len = ctx->undnat_match->length;
>>   
>> +    bool lb_is_centralized = !ctx->lb->is_distributed;
>> +
>> +    /* If the load balancer is distributed, the response traffic
>> +     * must be returned through the distributed port. */
>> +    const char *gw_outport = lb_is_centralized ? dgp->cr_port-
>>> json_key
>> +                                               : dgp->json_key;
>> +
>>       const char *meter = NULL;
>>   
>>       if (ctx->reject) {
>> @@ -12646,8 +12708,9 @@ build_distr_lrouter_nat_flows_for_lb(struct
>> lrouter_nat_lb_flows_ctx *ctx,
>>                                                       dgp, meter);
>>       }
>>   
>> -    if (!vector_is_empty(&ctx->lb_vip->backends) ||
>> -        !ctx->lb_vip->empty_backend_rej) {
>> +    if (lb_is_centralized &&
>> +        (!vector_is_empty(&ctx->lb_vip->backends) ||
>> +        !ctx->lb_vip->empty_backend_rej)) {
>>           ds_put_format(ctx->new_match, " && is_chassis_resident(%s)",
>>                         dgp->cr_port->json_key);
>>       }
>> @@ -12684,18 +12747,21 @@ build_distr_lrouter_nat_flows_for_lb(struct
>> lrouter_nat_lb_flows_ctx *ctx,
>>        * the undnat stage.
>>        */
>>       ds_put_format(ctx->undnat_match, ") && outport == %s", dgp-
>>> json_key);
>> -    ds_clear(ctx->gw_redir_action);
>> -    ds_put_format(ctx->gw_redir_action, "outport = %s; next;",
>> -                  dgp->cr_port->json_key);
>> +    ds_put_format(ctx->gw_redir_action,
>> +                  "outport = %s; next;", gw_outport);
>>   
>>       ovn_lflow_add(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT, 200,
>>                     ds_cstr(ctx->undnat_match), ds_cstr(ctx-
>>> gw_redir_action),
>>                     lflow_ref, WITH_HINT(&ctx->lb->nlb->header_));
>>       ds_truncate(ctx->undnat_match, undnat_match_len);
>>   
>> -    ds_put_format(ctx->undnat_match, ") && (inport == %s || outport
>> == %s)"
>> -                  " && is_chassis_resident(%s)", dgp->json_key, dgp-
>>> json_key,
>> -                  dgp->cr_port->json_key);
>> +    ds_put_format(ctx->undnat_match, ") && (inport == %s || outport
>> == %s)",
>> +                  dgp->json_key, dgp->json_key);
>> +
>> +    if (lb_is_centralized) {
>> +        ds_put_format(ctx->undnat_match, " &&
>> is_chassis_resident(%s)",
>> +                      dgp->cr_port->json_key);
>> +    }
>>       ovn_lflow_add(ctx->lflows, od, S_ROUTER_OUT_UNDNAT, 120,
>>                     ds_cstr(ctx->undnat_match),
>> ds_cstr(&undnat_action),
>>                     lflow_ref, WITH_HINT(&ctx->lb->nlb->header_));
>> @@ -14022,6 +14088,10 @@ build_gateway_mtu_flow(struct lflow_table
>> *lflows, struct ovn_port *op,
>>   static bool
>>   consider_l3dgw_port_is_centralized(struct ovn_port *op)
>>   {
>> +    if (!od_is_centralized(op->od)) {
>> +        return false;
>> +    }
>> +
>>       if (l3dgw_port_has_associated_vtep_lports(op)) {
>>           return false;
>>       }
>> @@ -16224,7 +16294,7 @@ build_ipv6_input_flows_for_lrouter_port(
>>        * router's own IP address. */
>>       for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
>>           ds_clear(match);
>> -        if (lrp_is_l3dgw(op)) {
>> +        if (lrp_is_l3dgw(op) && od_is_centralized(op->od)) {
>>               /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
>>                * should only be sent from the gateway chassi, so that
>>                * upstream MAC learning points to the gateway chassis.
>> @@ -16500,7 +16570,8 @@ build_lrouter_ipv4_ip_input(struct ovn_port
>> *op,
>>                         op->lrp_networks.ipv4_addrs[i].network_s,
>>                         op->lrp_networks.ipv4_addrs[i].plen);
>>   
>> -        if (!vector_is_empty(&op->od->l3dgw_ports) && op->peer
>> +        if (od_is_centralized(op->od) &&
>> +            !vector_is_empty(&op->od->l3dgw_ports) && op->peer
>>               && !vector_is_empty(&op->peer->od->localnet_ports)) {
>>               add_lrp_chassis_resident_check(op, match);
>>           }
>> diff --git a/northd/northd.h b/northd/northd.h
>> index e4d7de9f9..84b76f886 100644
>> --- a/northd/northd.h
>> +++ b/northd/northd.h
>> @@ -450,6 +450,11 @@ struct ovn_datapath {
>>       /* Indicates that the LS has valid vni associated with it. */
>>       bool has_evpn_vni;
>>   
>> +    /* True if the datapath has distributed dependencies.  Currently,
>> +     * this only applies to load balancers attached to the datapath
>> +     * with distributed mode enabled. */
>> +    bool is_distributed;
>> +
>>       /* OVN northd only needs to know about logical router gateway
>> ports for
>>        * NAT/LB on a distributed router.  The "distributed gateway
>> ports" are
>>        * populated only when there is a gateway chassis or ha chassis
>> group
>> @@ -1151,6 +1156,18 @@ ovn_port_must_learn_route(const struct
>> ovn_port *op,
>>       return true;
>>   }
>>   
>> +/* Returns true if datapath 'od' operates in centralized mode on the
>> + * gateway.
>> + *
>> + * Returns false when the datapath is distributed. A datapath is
>> + * distributed only when configured with the 'distributed' option
>> + * enabled. In distributed mode, ARP/ND processing is handled locally
>> + * on each node.
>> + */
>> +static inline bool
>> +od_is_centralized(const struct ovn_datapath *od)
>> +{
>> +    return !od->is_distributed;
>> +}
>> +
> nit/personal opinion: Having a function that takes a "struct" just to
> return inverted value of struct's member feels a bit
> redundant/overkill. But I might be wrong.
We are passing a pointer, not a structure - this is normal, and besides, 
this is an inline function
>
>   
>>   struct ovn_port *ovn_port_find(const struct hmap *ports, const char
>> *name);
>>   
>>   void build_igmp_lflows(struct hmap *igmp_groups,
>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>> index e74c0d010..02a864151 100644
>> --- a/ovn-nb.xml
>> +++ b/ovn-nb.xml
>> @@ -2399,13 +2399,15 @@
>>           <p>
>>             Maps from endpoint IP to a colon-separated pair of logical
>> port name
>>             and source IP,
>> -          e.g. <code><var>port_name</var>:<var>sourc_ip</var></code>
>> for IPv4.
>> +          e.g.
>> <code><var>port_name</var>:<var>source_ip</var></code> for IPv4.
>>             Health checks are sent to this port with the specified
>> source IP.
>>             For IPv6 square brackets must be used around IP address,
>> e.g:
>> -          <code><var>port_name</var>:<var>[sourc_ip]</var></code>
>> +          <code><var>port_name</var>:<var>[source_ip]</var></code>
>>             Remote endpoint:
>>             Specify :target_zone_name at the end of the above syntax
>> to create
>>             remote health checks in a specific zone.
>> +          For distributed load balancers, ip_port_mappings is required.
>> +          If no health checks are configured, source_ip is optional.
>>           </p>
>>   
>>           <p>
>> @@ -2608,6 +2610,16 @@ or
>>           traffic may be dropped in scenarios where we have different
>> chassis
>>           for each DGP. This option is set to <code>false</code> by
>> default.
>>         </column>
>> +
>> +      <column name="options" key="distributed">
>> +        This option enables distributed load balancing across compute
>> +        nodes, ensuring traffic is always routed to local backends and
>> +        eliminating east-west traffic between nodes.
>> +        Required configuration: <ref column="ip_port_mappings"/>.
>> +        NOTE: The addressing of the underlay network must not overlap
>> +        with the addressing of the Load Balancer VIP.
> Thank you for adding this note/expansion. I wonder if we could also add
> a bit more explanation to it. e.g.:
>
> NOTE: The addressing of the underlay network must not overlap with the
> addressing of Load Balancer VIP. If the Load Balancer is attached to a
> router that is directly connected to the underlay network and the VIP
> belongs to the same subnet as used on the underlay network, the traffic
> won't be spread across all chassis! Instead, it will be concentrated
> only on the chassis that hosts the Distributed Gateway Port of the
> router.
>
> It doesn't have to be word-for-word, but I believe that we should
> mention potential effect that this misconfiguration could cause. I'm
> sure it will help someone in the future when debugging a mysterious LB
> behavior :D
>
>
>
ack, i will fix
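To illustrate the failure mode Martin describes: if the VIP falls inside a subnet the fabric is directly connected to, the fabric resolves it via ARP/ND instead of the static ECMP route, and traffic concentrates on the DGP chassis. A minimal operator-side pre-flight check (a hypothetical helper sketch, not part of OVN) could look like:

```python
import ipaddress

def vip_overlaps_underlay(vip, underlay_subnets):
    """Return True if the load balancer VIP falls inside any directly
    connected underlay subnet, i.e. the fabric would resolve it on-link
    instead of following the static ECMP route."""
    vip_addr = ipaddress.ip_address(vip)
    return any(vip_addr in ipaddress.ip_network(s) for s in underlay_subnets)

# VIP outside the underlay: the ECMP route is used, traffic is spread.
print(vip_overlaps_underlay("192.0.2.10", ["198.51.100.0/24"]))    # False
# VIP inside the underlay subnet: traffic concentrates on the chassis
# hosting the router's Distributed Gateway Port.
print(vip_overlaps_underlay("198.51.100.10", ["198.51.100.0/24"]))  # True
```

This only models the reachability decision; the subnet list and function name are illustrative.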
>> +      </column>
>> +
>>       </group>
>>     </table>
>>   
>> diff --git a/ovn-sb.xml b/ovn-sb.xml
>> index 623aaeffd..19c7a6877 100644
>> --- a/ovn-sb.xml
>> +++ b/ovn-sb.xml
>> @@ -2130,6 +2130,17 @@
>>             </p>
>>           </dd>
>>   
>> +        <dt><code>ct_lb_mark_local;</code></dt>
>> +
>> <dt><code>ct_lb_mark_local(backends=<var>lport_name</var>[<var>ip</va
>> r>[:<var>port</var>][,...][;
>> hash_fields=<var>field1</var>,<var>field2</var>,...][;
>> ct_flag]);</code></dt>
>> +        <dd>
>> +          <p>
>> +              Same as <code>ct_lb_mark</code>, with the key
>> difference that it
>> +              implements local-only load balancing. This mode
>> selects backends
>> +              only from those running on the current chassis,
>> preventing
>> +              traffic from being forwarded to backends on remote
>> +              nodes.
>> +          </p>
>> +        </dd>
>> +
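Conceptually, the new action is equivalent to filtering the backend set by chassis residency before the hash-based selection. A simplified Python model (not the datapath implementation; backend layout and names are illustrative):

```python
import hashlib

def select_local_backend(backends, local_chassis, flow_key):
    """Pick a backend for a flow, considering only backends resident on
    the local chassis (simplified model of ct_lb_mark_local)."""
    local = [b for b in backends if b["chassis"] == local_chassis]
    if not local:
        return None  # No local backend: the connection is dropped/rejected.
    # A stable hash of the flow key selects among local backends only.
    h = int(hashlib.sha256(flow_key.encode()).hexdigest(), 16)
    return local[h % len(local)]["ip"]

backends = [
    {"ip": "192.168.1.1", "chassis": "ch1"},
    {"ip": "192.168.1.2", "chassis": "ch2"},
    {"ip": "192.168.1.3", "chassis": "ch2"},
]
# On ch1 only its single local backend can ever be chosen.
print(select_local_backend(backends, "ch1", "10.0.0.5:34567"))  # 192.168.1.1
```

This matches the OpenFlow group dumps checked in the multinode test below: each chassis's select group contains only its local backends.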
>>           <dt>
>>             <code><var>R</var> = dns_lookup();</code>
>>           </dt>
>> diff --git a/tests/multinode-macros.at b/tests/multinode-macros.at
>> index ad09ac562..24b15f64b 100644
>> --- a/tests/multinode-macros.at
>> +++ b/tests/multinode-macros.at
>> @@ -22,6 +22,20 @@ m4_define([M_NS_CHECK_EXEC],
>>       [ AT_CHECK([M_NS_EXEC([$1], [$2], [$3])],
>> m4_shift(m4_shift(m4_shift($@)))) ]
>>   )
>>   
>> +# M_FAKE_NODE_EXEC([fake_node], [command])
>> +#
>> +# Execute 'command' in 'fake_node'
>> +m4_define([M_FAKE_NODE_EXEC],
>> +    [podman exec $1 $2])
>> +
>> +# M_FAKE_NODE_CHECK_EXEC([fake_node], [command], other_params...)
>> +#
>> +# Wrapper for AT_CHECK that executes 'command' inside 'fake_node'.
>> +# 'other_params' are passed as they are to AT_CHECK.
>> +m4_define([M_FAKE_NODE_CHECK_EXEC],
>> +    [ AT_CHECK([M_FAKE_NODE_EXEC([$1], [$2])],
>> m4_shift(m4_shift($@))) ]
>> +)
>> +
>>   # M_DAEMONIZE([fake_node],[command],[pidfile])
>>   m4_define([M_DAEMONIZE],
>>       [podman exec $1 $2 & echo $! > $3
>> diff --git a/tests/multinode.at b/tests/multinode.at
>> index 83cdd6c01..b1336c31c 100644
>> --- a/tests/multinode.at
>> +++ b/tests/multinode.at
>> @@ -4296,3 +4296,162 @@ M_NS_CHECK_EXEC([ovn-chassis-2], [ovn-ext2],
>> [ping6 -q -c 3 -i 0.3 -w 2 6812:86:
>>   m_wait_row_count mac_binding 1 ip="6812\:86\:\:102"
>> logical_port="lr1-pub"
>>   
>>   AT_CLEANUP
>> +
>> +AT_SETUP([Distributed load balancing: IPv4])
>> +#
>> +# ┌──────────────────────┐
>> +# │ fabric (leaf switch) │
>> +# │       gw-1           │
>> +# └─────────┬────────────┘
>> +#           │
>> +#    ┌──────┴──────┐
>> +#    │             │
>> +#    ▼             ▼
>> +#  route          route
>> +#  weight 1      weight 2 (2 backends)
>> +#    │             │
>> +#    ▼             ▼
>> +# ┌───────┐      ┌───────┐
>> +# │Chassis│      │Chassis│
>> +# │   1   │      │   2   │
>> +# └───────┘      └───────┘
>> +#
>> +
>> +check_fake_multinode_setup
>> +cleanup_multinode_resources
>> +for c in ovn-chassis-1 ovn-chassis-2
>> +do
>> +    m_as $c ip link del lb-host
>> +    m_as $c ovs-vsctl del-br br-lb
>> +    m_as $c ovs-vsctl del-port lb-ovs
>> +done
>> +
>> +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip link show | grep -q
>> genev_sys])
>> +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip link show | grep -q
>> genev_sys])
>> +
>> +check multinode_nbctl ls-add pub                \
>> +    -- lsp-add-router-port pub pub-lr1 lr1-pub  \
>> +    -- lsp-add-localnet-port pub pub-ln public
>> +
>> +check multinode_nbctl lr-add lr1 \
>> +    -- lrp-add lr1 lr1-pub 00:00:00:00:00:01 169.254.1.254/24 \
>> +    -- lrp-add lr1 lr1-down 00:00:00:00:00:02 192.168.1.254/24
>> +
>> +check multinode_nbctl ls-add ls1
>> +check multinode_nbctl lsp-add ls1 ls1p1
>> +check multinode_nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02
>> 192.168.1.1"
>> +check multinode_nbctl lsp-add ls1 ls1p2
>> +check multinode_nbctl lsp-set-addresses ls1p2 "00:00:00:01:02:02
>> 192.168.1.2"
>> +check multinode_nbctl lsp-add ls1 ls1p3
>> +check multinode_nbctl lsp-set-addresses ls1p3 "00:00:00:01:03:02
>> 192.168.1.3"
>> +check multinode_nbctl lsp-add-router-port ls1 ls1-lr1 lr1-down
>> +
>> +check multinode_nbctl lrp-set-gateway-chassis lr1-pub ovn-chassis-2
>> +
>> +# Create default route
>> +check multinode_nbctl lr-route-add lr1 0.0.0.0/0 169.254.1.253 lr1-
>> pub
>> +
>> +m_as ovn-chassis-1 /data/create_fake_vm.sh ls1p1 ls1p1
>> 00:00:00:01:01:02 1500 192.168.1.1 24 192.168.1.254 2001::1/64
>> 2001::a
>> +m_as ovn-chassis-2 /data/create_fake_vm.sh ls1p2 ls1p2
>> 00:00:00:01:02:02 1500 192.168.1.2 24 192.168.1.254 2001::2/64
>> 2001::a
>> +m_as ovn-chassis-2 /data/create_fake_vm.sh ls1p3 ls1p3
>> 00:00:00:01:03:02 1500 192.168.1.3 24 192.168.1.254 2001::3/64
>> 2001::a
>> +
>> +# Create load balancer
>> +lb_vip="1.1.1.1"
> nit: I think we should stay away from using real/routable IP addresses
> in the tests (in this and other tests in this change). Let's use either
> a private block or one of the blocks reserved for documentation (RFC
> 5737)
ack, i will fix
> Thanks again,
> Martin.
>
>> +check multinode_nbctl lb-add lb1 $lb_vip:80
>> 192.168.1.1:10880,192.168.1.2:10880,192.168.1.3:10880
>> +check multinode_nbctl lr-lb-add lr1 lb1
>> +check multinode_nbctl set Load_Balancer lb1
>> ip_port_mappings:192.168.1.1=ls1p1:192.168.1.199
>> +check multinode_nbctl set Load_Balancer lb1
>> ip_port_mappings:192.168.1.2=ls1p2:192.168.1.199
>> +check multinode_nbctl set Load_Balancer lb1
>> ip_port_mappings:192.168.1.3=ls1p3:192.168.1.199
>> +check multinode_nbctl set load_balancer lb1 options:distributed=true
>> +
>> +ip_ch1=$(m_as ovn-chassis-1 ip a show dev eth1 | grep "inet " | awk
>> '{print $2}'| cut -d '/' -f1)
>> +ip_ch2=$(m_as ovn-chassis-2 ip a show dev eth1 | grep "inet " | awk
>> '{print $2}'| cut -d '/' -f1)
>> +
>> +# Add multipath route to load balancer VIP with weighted nexthops on
>> "fabric" host:
>> +check m_as ovn-gw-1 ip route flush $lb_vip
>> +check m_as ovn-gw-1 ip route add $lb_vip  nexthop via $ip_ch1 dev
>> eth1 weight 1 nexthop via $ip_ch2 dev eth1 weight 2
>> +
>> +# Set kernel multipath hash policy to L3/L4 (source/destination
>> IP+port)
>> +# Policy 1 = Layer 3/4 hash (src/dst IP+port)
>> +M_FAKE_NODE_CHECK_EXEC([ovn-gw-1], [sysctl -w
>> net.ipv4.fib_multipath_hash_policy=1], \
>> +[0], [dnl
>> +net.ipv4.fib_multipath_hash_policy = 1
>> +])
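As a sanity check of what the weighted nexthops above should produce (weight 1 toward chassis-1, weight 2 toward chassis-2), here is a small model of an L3/L4 flow hash over many flows. This is only a sketch assuming a uniform hash, loosely analogous to the kernel's fib_multipath_hash_policy=1 behavior; the addresses and flow count are illustrative:

```python
import hashlib
from collections import Counter

def ecmp_pick(flow, nexthops):
    """Model of weighted ECMP: expand nexthops by weight into hash
    buckets, then index by a hash of the flow's L3/L4 tuple."""
    buckets = [nh for nh, weight in nexthops for _ in range(weight)]
    h = int(hashlib.sha256(flow.encode()).hexdigest(), 16)
    return buckets[h % len(buckets)]

nexthops = [("ip_ch1", 1), ("ip_ch2", 2)]
counts = Counter(ecmp_pick(f"198.51.100.1:{p}->192.0.2.1:80", nexthops)
                 for p in range(20000, 25000))
# Expect roughly a 1:2 split across the 5000 simulated flows.
print(counts)
```

The real split depends on the kernel's hash, but it should land near the same 1:2 ratio, matching the per-backend request counts reported in the cover letter.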
>> +
>> +# Check OpenFlow group filling: it should only contain local
>> backends
>> +M_FAKE_NODE_CHECK_EXEC([ovn-chassis-1], [ovs-ofctl dump-groups br-
>> int | sed -e 's/table=[[0-9]]*/table=<cleared>/g'], [0], [dnl
>> +NXST_GROUP_DESC reply (xid=0x2):
>> +
>> group_id=1,type=select,selection_method=dp_hash,bucket=bucket_id:0,we
>> ight:100,actions=ct(commit,table=<cleared>,zone=NXM_NX_REG11[[0..15]]
>> ,nat(dst=192.168.1.1:10880),exec(load:0x1->NXM_NX_CT_MARK[[1]]))
>> +])
>> +
>> +M_FAKE_NODE_CHECK_EXEC([ovn-chassis-2], [ovs-ofctl dump-groups br-
>> int | sed -e 's/table=[[0-9]]*/table=<cleared>/g'], [0], [dnl
>> +NXST_GROUP_DESC reply (xid=0x2):
>> +
>> group_id=2,type=select,selection_method=dp_hash,bucket=bucket_id:1,we
>> ight:100,actions=ct(commit,table=<cleared>,zone=NXM_NX_REG11[[0..15]]
>> ,nat(dst=192.168.1.2:10880),exec(load:0x1-
>>> NXM_NX_CT_MARK[[1]])),bucket=bucket_id:2,weight:100,actions=ct(commi
>> t,table=<cleared>,zone=NXM_NX_REG11[[0..15]],nat(dst=192.168.1.3:1088
>> 0),exec(load:0x1->NXM_NX_CT_MARK[[1]]))
>> +])
>> +
>> +physical_gw_mac_address="30:42:f5:a7:46:65"
>> +
>> +# Configure infrastructure on chassis hosts:
>> +# lb-host (physical_gw_mac_address) - (veth) lb-ovs - br-lb - br-int
>> +for c in ovn-chassis-1 ovn-chassis-2
>> +do
>> +    check m_as $c ip link add lb-host type veth peer lb-ovs
>> +    on_exit "m_as $c ip link del lb-host"
>> +
>> +    check m_as $c ip link set dev lb-host address $physical_gw_mac_address
>> +    check m_as $c ip addr add 169.254.1.253/24 dev lb-host
>> +    check m_as $c ip link set lb-host up
>> +    check m_as $c ip link set lb-ovs up
>> +
>> +    check m_as $c ovs-vsctl add-br br-lb
>> +    on_exit "m_as $c ovs-vsctl del-br br-lb"
>> +    check m_as $c ovs-vsctl add-port br-lb lb-ovs
>> +    on_exit "m_as $c ovs-vsctl del-port lb-ovs"
>> +    check m_as $c ovs-vsctl set open . external-ids:ovn-bridge-
>> mappings=public:br-lb
>> +
>> +    check m_as $c ip route flush $lb_vip
>> +    check m_as $c ip r add $lb_vip via 169.254.1.254 dev lb-host
>> +    on_exit "m_as $c ip route flush $lb_vip"
>> +done
>> +
>> +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ovs-vsctl show | grep -q patch-
>> pub-ln-to-br-int])
>> +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ovs-vsctl show | grep -q patch-
>> pub-ln-to-br-int])
>> +
>> +M_NS_DAEMONIZE([ovn-chassis-1], [ls1p1], [$PYTHON3
>> /data/tcp_metadata_server.py \
>> +                                                   --bind-host
>> 192.168.1.1 \
>> +                                                   --bind-port 10880
>> \
>> +                                                   --reply-string
>> ls1p1 >/dev/null 2>&1], [nc1.pid])
>> +M_NS_DAEMONIZE([ovn-chassis-2], [ls1p2], [$PYTHON3
>> /data/tcp_metadata_server.py \
>> +                                                   --bind-host
>> 192.168.1.2 \
>> +                                                   --bind-port 10880
>> \
>> +                                                   --reply-string
>> ls1p2 >/dev/null 2>&1], [nc2.pid])
>> +M_NS_DAEMONIZE([ovn-chassis-2], [ls1p3], [$PYTHON3
>> /data/tcp_metadata_server.py \
>> +                                                   --bind-host
>> 192.168.1.3 \
>> +                                                   --bind-port 10880
>> \
>> +                                                   --reply-string
>> ls1p3 >/dev/null 2>&1], [nc3.pid])
>> +
>> +# Run tcpdump on the chassis nodes to ensure there is no east-west
>> +# traffic during balancing, i.e. balancing happens directly on the node.
>> +M_START_TCPDUMP([ovn-chassis-1], [-c 2 -neei genev_sys_6081 port
>> 10880], [ch1_genev])
>> +M_START_TCPDUMP([ovn-chassis-1], [-c 2 -neei eth2 port 10880],
>> [ch1_eth2])
>> +M_START_TCPDUMP([ovn-chassis-2], [-c 2 -neei genev_sys_6081 port
>> 10880], [ch2_genev])
>> +M_START_TCPDUMP([ovn-chassis-2], [-c 2 -neei eth2 port 10880],
>> [ch2_eth2])
>> +
>> +M_FAKE_NODE_CHECK_EXEC([ovn-gw-1], [/bin/bash -c 'for i in $(seq
>> 500); \
>> +                          do curl --http0.9
>> http://1.1.1.1:80 2>/dev/null ; \
>> +                          echo ; done | sort | uniq -c ' > reply],
>> [0], [])
>> +
>> +# Check that requests are distributed among all backends.
>> +AT_CHECK([grep -q ls1p1 reply && grep -q ls1p2 reply && grep -q
>> ls1p3 reply], [0], [])
>> +
>> +# Check east-west traffic is clean
>> +AT_CHECK([cat ch1_genev.tcpdump], [0], [dnl
>> +])
>> +AT_CHECK([cat ch1_eth2.tcpdump], [0], [dnl
>> +])
>> +AT_CHECK([cat ch2_genev.tcpdump], [0], [dnl
>> +])
>> +AT_CHECK([cat ch2_eth2.tcpdump], [0], [dnl
>> +])
>> +
>> +AT_CLEANUP
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 25655c456..0d5af3568 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -18289,6 +18289,9 @@ ovn_start
>>   # ip_port_mappings syntax: ip:lport_name:src_ip:<az_name>(for remote
>> lports)
>>   
>>   check ovn-nbctl ls-add ls1
>> +check ovn-nbctl lr-add lr1
>> +
>> +ovn-appctl -t ovn-northd vlog/disable-rate-limit
>>   
>>   check ovn-nbctl lb-add lb1_ipv4 1.1.1.1:80
>> 192.168.0.1:10880,192.168.0.2:10880,192.168.0.3:10880
>>   AT_CHECK([ovn-nbctl --wait=sb \
>> @@ -18358,6 +18361,154 @@ check ovn-nbctl set load_balancer lb1_ipv4
>> ip_port_mappings:192.168.0.1=lport1:1
>>   check_row_count sb:Service_Monitor 0
>>   
>>   OVS_WAIT_UNTIL([grep "Empty AZ name specified" northd/ovn-
>> northd.log])
>> +
>> +check ovn-nbctl lb-del lb1_ipv4
>> +
>> +# Check correct setup of distributed load balancers.
>> +echo > northd/ovn-northd.log
>> +check ovn-nbctl lb-add lb_distubuted 1.1.1.1:80
>> 192.168.0.1:10880,192.168.0.2:10880
>> +check ovn-nbctl lr-lb-add lr1 lb_distubuted
>> +check ovn-nbctl set load_balancer lb_distubuted
>> options:distributed=true
>> +
>> +# Check that the load balancer does not work in distributed mode:
>> +# ip_port_mappings is not set.
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80), action=(drop;)
>> +])
>> +
>> +# Check that the load balancer has only one backend available, since
>> +# only one backend has an ip_port_mappings entry.
>> +check ovn-nbctl set load_balancer lb_distubuted
>> ip_port_mappings:192.168.0.1=lport1
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80),
>> action=(ct_lb_mark_local(backends="lport1":192.168.0.1:10880);)
>> +])
>> +
>> +check ovn-nbctl set load_balancer lb_distubuted
>> ip_port_mappings:192.168.0.2=lport2
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80),
>> action=(ct_lb_mark_local(backends="lport1":192.168.0.1:10880,"lport2"
>> :192.168.0.2:10880);)
>> +])
>> +
>> +# Check that if a health check is configured, ip_port_mappings with a
>> +# source IP must be provided.
>> +AT_CHECK([ovn-nbctl --wait=sb \
>> +          -- --id=@hc create Load_Balancer_Health_Check
>> vip="1.1.1.1\:80" \
>> +             options:failure_count=100 \
>> +          -- add Load_Balancer lb_distubuted health_check @hc |
>> uuidfilt], [0], [<0>
>> +])
>> +
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +OVS_WAIT_UNTIL([grep "Expected ':' separator for:" northd/ovn-
>> northd.log])
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80), action=(drop;)
>> +])
>> +
>> +check ovn-nbctl lb-del lb_distubuted
>> +echo > northd/ovn-northd.log
>> +
>> +# ipv6 configuration
>> +check ovn-nbctl lb-add lb1 [[2001::a]]:80
>> [[2001::3]]:80,[[2002::3]]:80
>> +
>> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
>> +Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" -- add
>> Load_Balancer . \
>> +health_check @hc
>> +
>> +check_row_count sb:Service_Monitor 0
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]\"
>> +
>> +check_row_count sb:Service_Monitor 1
>> +ovn-sbctl list service_monitor
>> +check_column "2001::3" sb:Service_Monitor ip logical_port=lport1
>> +check_column 80 sb:Service_Monitor port logical_port=lport1
>> +check_column tcp sb:Service_Monitor protocol logical_port=lport1
>> +check_column "2001::2" sb:Service_Monitor src_ip logical_port=lport1
>> +check_column false sb:Service_Monitor ic_learned logical_port=lport1
>> +check_column false sb:Service_Monitor remote logical_port=lport1
>> +check_column "" sb:Service_Monitor logical_input_port
>> logical_port=lport1
>> +
>> +# Empty src_ip.
>> +check ovn-nbctl clear load_balancer lb1 ip_port_mappings
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:\"
>> +OVS_WAIT_UNTIL([grep "Invalid svc mon src IP" northd/ovn-
>> northd.log])
>> +check_row_count sb:Service_Monitor 0
>> +echo > northd/ovn-northd.log
>> +
>> +# Incorrect IP address.
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[invalid]]\"=\"lport1:\"
>> +OVS_WAIT_UNTIL([grep "bad IP address" northd/ovn-northd.log])
>> +echo > northd/ovn-northd.log
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:invalid\"
>> +OVS_WAIT_UNTIL([grep "bad IP address" northd/ovn-northd.log])
>> +echo > northd/ovn-northd.log
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]:az_name\"
>> +check_row_count sb:Service_Monitor 1
>> +ovn-sbctl list service_monitor
>> +check_column "2001::3" sb:Service_Monitor ip logical_port=lport1
>> +check_column 80 sb:Service_Monitor port logical_port=lport1
>> +check_column tcp sb:Service_Monitor protocol logical_port=lport1
>> +check_column "2001::2" sb:Service_Monitor src_ip logical_port=lport1
>> +check_column false sb:Service_Monitor ic_learned logical_port=lport1
>> +check_column true sb:Service_Monitor remote logical_port=lport1
>> +check_column "" sb:Service_Monitor logical_input_port
>> logical_port=lport1
>> +
>> +uuid=$(ovn-sbctl -d bare --no-headings --columns _uuid find
>> Service_Monitor logical_port=lport1)
>> +
>> +# Check az_name presence in options.
>> +AT_CHECK([ovn-sbctl get Service_Monitor ${uuid} options:az-name],
>> +[0], [az_name
>> +])
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]:\"
>> +check_row_count sb:Service_Monitor 0
>> +OVS_WAIT_UNTIL([grep "Empty AZ name specified" northd/ovn-
>> northd.log])
>> +
>> +echo > northd/ovn-northd.log
>> +check ovn-nbctl lb-del lb1
>> +
>> +# Check correct setup of distributed load balancers.
>> +check ovn-nbctl lb-add lb_distubuted [[2001::a]]:80
>> [[2001::3]]:80,[[2002::3]]:80
>> +check ovn-nbctl lr-lb-add lr1 lb_distubuted
>> +check ovn-nbctl set load_balancer lb_distubuted
>> options:distributed=true
>> +
>> +# Check that the load balancer does not work in distributed mode:
>> +# ip_port_mappings is not set.
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80), action=(drop;)
>> +])
>> +
>> +echo > northd/ovn-northd.log
>> +check ovn-nbctl set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1\"
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80),
>> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80);)
>> +])
>> +
>> +echo > northd/ovn-northd.log
>> +check ovn-nbctl set load_balancer .
>> ip_port_mappings:\"[[2002::3]]\"=\"lport2\"
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80),
>> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80,"lport2":[[
>> 2002::3]]:80);)
>> +])
>> +
>> +echo > northd/ovn-northd.log
>> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
>> +Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" -- add
>> Load_Balancer . \
>> +health_check @hc
>> +OVS_WAIT_UNTIL([grep "Expected ':' separator for:" northd/ovn-
>> northd.log])
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80), action=(drop;)
>> +])
>> +
>> +check ovn-nbctl --wait=sb set load_balancer .
>> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]\"
>> +ovn-sbctl lflow-list lr1 > lr1_lflow
>> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
>> ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
>> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
>> reg1[[0..15]] == 80),
>> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80);)
>> +])
>> +
>> +check ovn-nbctl lb-del lb_distubuted
>> +
>>   OVN_CLEANUP_NORTHD
>>   AT_CLEANUP
>>   ])
>> @@ -19251,3 +19402,216 @@ AT_CHECK([grep "lr_in_policy[[^_]]"
>> lr0flows | ovn_strip_lflows | sort], [0], [d
>>   OVN_CLEANUP_NORTHD
>>   AT_CLEANUP
>>   ])
>> +
>> +OVN_FOR_EACH_NORTHD_NO_HV([
>> +AT_SETUP([Distributed lb: logical-flow test - IPv4/IPv6 case])
>> +ovn_start
>> +
>> +check ovn-nbctl ls-add outside
>> +
>> +check ovn-nbctl lsp-add outside outside \
>> +      -- lsp-set-addresses outside unknown \
>> +      -- lsp-set-type outside localnet
>> +
>> +check ovn-nbctl --wait=sb set Logical_Switch_Port outside
>> tag_request=2
>> +
>> +check ovn-nbctl lsp-add outside outside-down \
>> +      -- lsp-set-type outside-down router \
>> +      -- lsp-set-addresses outside-down router \
>> +      -- lsp-set-options outside-down router-port=lr1-up
>> +
>> +check ovn-nbctl lr-add lr1 \
>> +      -- lrp-add lr1 lr1-up 11:11:11:11:11:11 169.254.0.1/24
>> 2001:db8:abcd:0002::bad/64 \
>> +      -- lrp-add lr1 lr1-down 12:12:12:12:12:12 192.168.0.1/24
>> 2001:db8:abcd:0001::c0fe/64
>> +
>> +check ovn-nbctl ls-add ls1 \
>> +      -- lsp-add ls1 lport1 \
>> +      -- lsp-set-addresses lport1 "13:13:13:13:13:13 192.168.0.101"
>> \
>> +      -- lsp-add ls1 lport2 \
>> +      -- lsp-set-addresses lport2 "14:14:14:14:14:14 192.168.0.102"
>> +
>> +check ovn-nbctl lsp-add ls1 ls1-up \
>> +      -- lsp-set-type ls1-up router \
>> +      -- lsp-set-addresses ls1-up router \
>> +      -- lsp-set-options ls1-up router-port=lr1-down
>> +
>> +check ovn-nbctl --wait=sb sync
>> +
>> +check ovn-nbctl ha-chassis-group-add gateway
>> +check ovn-nbctl ha-chassis-group-add-chassis gateway hv1 1
>> +ha_g_uuid=$(fetch_column nb:HA_Chassis_Group _uuid name=gateway)
>> +lr1_up_uuid=$(fetch_column nb:Logical_Router_Port _uuid name=lr1-up)
>> +check ovn-nbctl set logical_router_port $lr1_up_uuid
>> ha_chassis_group=$ha_g_uuid
>> +
>> +check ovn-nbctl --wait=sb sync
>> +
>> +check ovn-nbctl lb-add lb1_ipv4 1.1.1.1:80
>> 192.168.0.101:10880,192.168.0.102:10880
>> +check ovn-nbctl set Load_Balancer lb1_ipv4
>> ip_port_mappings:192.168.0.101=lport1:192.168.0.199
>> +check ovn-nbctl set Load_Balancer lb1_ipv4
>> ip_port_mappings:192.168.0.102=lport2:192.168.0.199
>> +check ovn-nbctl lr-lb-add lr1 lb1_ipv4
>> +check ovn-nbctl --wait=sb sync
>> +
>> +check ovn-nbctl lb-add lb1_ipv6 [[2000::1]]:80
>> [[2001:db8:abcd:1::2]]:10882
>> +check ovn-nbctl set Load_Balancer lb1_ipv6
>> ip_port_mappings:\"[[2001:db8:abcd:1::2]]\"=\"lport1\"
>> +check ovn-nbctl lr-lb-add lr1 lb1_ipv6
>> +check ovn-nbctl --wait=sb sync
>> +
>> +ovn-sbctl lflow-list lr1 > lr1_lflows_before
>> +ovn-sbctl lflow-list outside > outside_lflows_before
>> +
>> +AT_CHECK([cat outside_lflows_before | grep ls_in_l2_lkup | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 11:11:11:11:11:11 && is_chassis_resident("cr-lr1-up")), action=(outport = "outside-down"; output;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_before | grep lr_in_ip_input | grep priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport == "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa == 169.254.0.0/24 && is_chassis_resident("cr-lr1-up")), action=(eth.dst = eth.src; eth.src = xreg0[[0..47]]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip4.dst == 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_before | grep lr_in_ip_input | grep priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport == "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} && nd_ns && nd.target == 2001:db8:abcd:2::bad && is_chassis_resident("cr-lr1-up")), action=(nd_na_router { eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip6.dst == {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_before | grep lr_in_admission | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst == 11:11:11:11:11:11 && inport == "lr1-up" && is_chassis_resident("cr-lr1-up")), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst == 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_before | grep lr_out_undnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport == "lr1-up") && is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
>> +  table=??(lr_out_undnat      ), priority=120  , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport == "lr1-up" || outport == "lr1-up") && is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_before | grep lr_in_gw_redirect | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
>> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_before | grep lr_in_dnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")), action=(ct_lb_mark(backends=192.168.0.101:10880,192.168.0.102:10880);)
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")), action=(ct_lb_mark(backends=[[2001:db8:abcd:1::2]]:10882);)
>> +])
>> +
>> +AT_CHECK([cat outside_lflows_before | grep ls_in_check_port_sec | grep priority=75 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_check_port_sec), priority=75   , match=((arp.op == 1 || arp.op == 2) && inport == "outside"), action=(reg0[[22]] = 1; next;)
>> +])
>> +
>> +AT_CHECK([cat outside_lflows_before | grep ls_in_apply_port_sec | grep priority=75 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1 && is_chassis_resident("cr-lr1-up")), action=(next;)
>> +])
>> +
>> +check ovn-nbctl clear logical_router_port $lr1_up_uuid ha_chassis_group
>> +check ovn-nbctl ha-chassis-group-del gateway
>> +check ovn-nbctl ha-chassis-group-add gateway2
>> +check ovn-nbctl ha-chassis-group-add-chassis gateway2 test 1
>> +ha_g_uuid=$(fetch_column nb:HA_Chassis_Group _uuid name=gateway2)
>> +lr1_up_uuid=$(fetch_column nb:Logical_Router_Port _uuid name=lr1-up)
>> +check ovn-nbctl set logical_router_port $lr1_up_uuid ha_chassis_group=$ha_g_uuid
>> +
>> +check ovn-nbctl set load_balancer lb1_ipv4 options:distributed=true
>> +check ovn-nbctl --wait=hv sync
>> +
>> +ovn-sbctl lflow-list outside > outside_lflows_after
>> +ovn-sbctl lflow-list lr1 > lr1_lflows_after
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_l2_lkup | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 11:11:11:11:11:11), action=(outport = "outside-down"; output;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport == "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa == 169.254.0.0/24), action=(eth.dst = eth.src; eth.src = xreg0[[0..47]]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip4.dst == 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport == "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} && nd_ns && nd.target == 2001:db8:abcd:2::bad), action=(nd_na_router { eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip6.dst == {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_admission | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst == 11:11:11:11:11:11 && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst == 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_out_undnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
>> +  table=??(lr_out_undnat      ), priority=120  , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport == "lr1-up" || outport == "lr1-up") && is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_gw_redirect | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
>> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"), action=(outport = "lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_dnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(ct_lb_mark_local(backends="lport1":192.168.0.101:10880,"lport2":192.168.0.102:10880);)
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")), action=(ct_lb_mark(backends=[[2001:db8:abcd:1::2]]:10882);)
>> +])
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_check_port_sec | grep priority=75 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_check_port_sec), priority=75   , match=((arp.op == 1 || arp.op == 2) && inport == "outside"), action=(reg0[[22]] = 1; next;)
>> +])
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_apply_port_sec | grep priority=75 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1), action=(next;)
>> +])
>> +
>> +check ovn-nbctl set load_balancer lb1_ipv6 options:distributed=true
>> +check ovn-nbctl --wait=hv sync
>> +
>> +ovn-sbctl lflow-list outside > outside_lflows_after
>> +ovn-sbctl lflow-list lr1 > lr1_lflows_after
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_out_undnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
>> +  table=??(lr_out_undnat      ), priority=120  , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport == "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_gw_redirect | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
>> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 && ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src == 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"), action=(outport = "lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 && ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport == "lr1-up"), action=(outport = "lr1-up"; next;)
>> +  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport == "lr1-up"), action=(outport = "cr-lr1-up"; next;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_dnat | grep priority=120 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(ct_lb_mark_local(backends="lport1":192.168.0.101:10880,"lport2":192.168.0.102:10880);)
>> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(ct_lb_mark_local(backends="lport1":[[2001:db8:abcd:1::2]]:10882);)
>> +])
>> +
>> +check ovn-nbctl set load_balancer lb1_ipv6 options:distributed=false
>> +check ovn-nbctl --wait=hv sync
>> +
>> +AT_CHECK([cat outside_lflows_after | grep ls_in_l2_lkup | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 11:11:11:11:11:11), action=(outport = "outside-down"; output;)
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport == "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa == 169.254.0.0/24), action=(eth.dst = eth.src; eth.src = xreg0[[0..47]]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip4.dst == 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport == "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} && nd_ns && nd.target == 2001:db8:abcd:2::bad), action=(nd_na_router { eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
>> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip6.dst == {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
>> +])
>> +
>> +AT_CHECK([cat lr1_lflows_after | grep lr_in_admission | grep priority=50 | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst == 11:11:11:11:11:11 && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst == 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12; next;)
>> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
>> +])
>> +
>> +OVN_CLEANUP_NORTHD
>> +AT_CLEANUP
>> +])


-- 
regards,
Alexandra.
