On Mon, 2026-01-19 at 12:23 +0300, Alexandra Rukomoinikova wrote:
> 1) Added new option "distributed" for load balancers.
>    With this option, balancers will work in a distributed fashion
>    across compute nodes, balancing only to local backends
>    (eliminating east-west traffic).
> 
> 2) If a load balancer is attached to a router with a DGP, the router
>    will no longer be centralized on the gateway - this means that
>    access to the physical network will also be available from the
>    hosts where the distributed balancer backends are located.
> 
> 3) Configuration requirements for distributed load balancers:
>     1) ip_port_mapping must be specified.
>     2) The underlay fabric must balance traffic between the hosts
>        with backends.
> 
> Example:
> Load Balancer: lb1 with VIP 1.1.1.1 and distributed option enabled.
> Fabric is configured with a static ECMP route for 1.1.1.1/32:
>     nexthop via ip_host1 weight 1 (hosts backend1)
>     nexthop via ip_host2 weight 1 (hosts backend2)
>     nexthop via ip_host3 weight 2 (hosts backend3 and backend4)
> 
> As part of testing, the following distribution of requests across the
> backends was obtained:
> for i in $(seq 5000); do curl http://1.1.1.1:80 2>/dev/null ; echo ;
> done | awk '{print $2}' | sort | uniq -c
>    1265 "backend 4",
>    1260 "backend 3",
>    1224 "backend 2",
>    1251 "backend 1",
> Thus, with ECMP balancing, requests are distributed approximately
> evenly across the backends.
> 
> Suggested-by: Vladislav Odintsov <[email protected]>
> Signed-off-by: Alexandra Rukomoinikova <[email protected]>
> ---
>  v3 --> v4: 1) fixed ARP receiving for distributed load balancing
>                since merging 01277be6c833ef1d1c3fb568c139d3f0ac2b363a
>             2) removed system tests, as they turned out to be
>                useless; added new multinode tests for the use case
>             3) updated the ovn-sb doc for the new action
>             4) updated the doc about the distributed option
>             5) rebased on
> https://github.com/ovn-org/ovn-fake-multinode/pull/114

Hi Alexandra,
thank you for the v4. The multinode tests convey the topology much
better. Overall I'd say it LGTM, though I'm not super confident in the
area of incremental processing, so an additional +1 from someone else
would be nice. I'll leave some in-line comments.

I also tried the topology that we discussed in the last version[0]
(distributed LR connected to the fabric via GW routers on each chassis)
and I found that the load balancing gets concentrated only on the
chassis that hosts the chassis-redirect port for the DGP.
I feel like we should be able to avoid sending the traffic of a
distributed load balancer to the CR port, I just haven't found the
right tweak to do so. What do you think?

NOTE: I don't think that this scenario needs to be supported for this
patch to get accepted, but it would be nice if it were, because AFAIK
this is (more or less) the topology of the OpenStack+OVN BGP
implementation.

[0] https://gist.github.com/mkalcok/7bc9af3f79969067bc1cea01841f6f47
 
> ---
>  NEWS                      |   3 +
>  northd/en-lb-data.c       |   9 +
>  northd/en-lb-data.h       |   3 +
>  northd/en-lr-stateful.c   |   3 +
>  northd/en-lr-stateful.h   |   2 +
>  northd/lb.c               |  91 ++++++----
>  northd/lb.h               |   7 +-
>  northd/northd.c           | 197 ++++++++++++++-------
>  northd/northd.h           |  17 ++
>  ovn-nb.xml                |  16 +-
>  ovn-sb.xml                |  11 ++
>  tests/multinode-macros.at |  14 ++
>  tests/multinode.at        | 159 +++++++++++++++++
>  tests/ovn-northd.at       | 364
> ++++++++++++++++++++++++++++++++++++++
>  14 files changed, 791 insertions(+), 105 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 9883fb81d..4d49b94f1 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -82,6 +82,9 @@ Post v25.09.0
>       other_config column.
>     - Introduce the capability to specify multiple ips for ovn-evpn-
> local-ip
>       option.
> +  - Add "distributed" option for load balancer, that forces traffic
> to be
> +    routed only to backend instances running locally on the same
> chassis
> +    it arrives on.
>  
>  OVN v25.09.0 - xxx xx xxxx
>  --------------------------
> diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c
> index 6d52d465e..f67ab9c5f 100644
> --- a/northd/en-lb-data.c
> +++ b/northd/en-lb-data.c
> @@ -166,6 +166,7 @@ lb_data_load_balancer_handler(struct engine_node
> *node, void *data)
>              add_crupdated_lb_to_tracked_data(lb, trk_lb_data,
>                                               lb->health_checks);
>              trk_lb_data->has_routable_lb |= lb->routable;
> +            trk_lb_data->has_distributed_lb |= lb->is_distributed;
>              continue;
>          }
>  
> @@ -180,6 +181,7 @@ lb_data_load_balancer_handler(struct engine_node
> *node, void *data)
>              add_deleted_lb_to_tracked_data(lb, trk_lb_data,
>                                             lb->health_checks);
>              trk_lb_data->has_routable_lb |= lb->routable;
> +            trk_lb_data->has_distributed_lb |= lb->is_distributed;
>          } else {
>              /* Load balancer updated. */
>              bool health_checks = lb->health_checks;
> @@ -189,11 +191,13 @@ lb_data_load_balancer_handler(struct
> engine_node *node, void *data)
>              sset_swap(&lb->ips_v6, &old_ips_v6);
>              enum lb_neighbor_responder_mode neigh_mode = lb-
> >neigh_mode;
>              bool routable = lb->routable;
> +            bool distributed_mode = lb->is_distributed;
>              ovn_northd_lb_reinit(lb, tracked_lb);
>              health_checks |= lb->health_checks;
>              struct crupdated_lb *clb =
> add_crupdated_lb_to_tracked_data(
>                  lb, trk_lb_data, health_checks);
>              trk_lb_data->has_routable_lb |= lb->routable;
> +            trk_lb_data->has_distributed_lb |= lb->is_distributed;
>  
>              /* Determine the inserted and deleted vips and store
> them in
>               * the tracked data. */
> @@ -226,6 +230,10 @@ lb_data_load_balancer_handler(struct engine_node
> *node, void *data)
>                  /* If neigh_mode is updated trigger a full
> recompute. */
>                  return EN_UNHANDLED;
>              }
> +            if (distributed_mode != lb->is_distributed) {
> +                /* If neigh_mode is updated trigger a full
> recompute. */
> +                return EN_UNHANDLED;
> +            }
>          }
>      }
>  
> @@ -687,6 +695,7 @@ handle_od_lb_changes(struct nbrec_load_balancer
> **nbrec_lbs,
>                                                               
> lb_uuid);
>                  ovs_assert(lb);
>                  trk_lb_data->has_routable_lb |= lb->routable;
> +                trk_lb_data->has_distributed_lb |= lb-
> >is_distributed;
>              }
>          }
>  
> diff --git a/northd/en-lb-data.h b/northd/en-lb-data.h
> index 1da087656..90e85b8c4 100644
> --- a/northd/en-lb-data.h
> +++ b/northd/en-lb-data.h
> @@ -82,6 +82,9 @@ struct tracked_lb_data {
>  
>      /* Indicates if any lb (in the tracked data) has 'routable' flag
> set. */
>      bool has_routable_lb;
> +
> +    /* Indicates if any lb (in the tracked data) has 'distibuted'
> flag set. */
> +    bool has_distributed_lb;
>  };
>  
>  /* Datapath (logical switch) to lb/lbgrp association data. */
> diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c
> index 212c0641c..5a738f4c3 100644
> --- a/northd/en-lr-stateful.c
> +++ b/northd/en-lr-stateful.c
> @@ -326,6 +326,7 @@ lr_stateful_lb_data_handler(struct engine_node
> *node, void *data_)
>                  ovn_datapaths_find_by_index(input_data.lr_datapaths,
>                                              lr_stateful_rec-
> >lr_index);
>              lr_stateful_rec->has_lb_vip = od_has_lb_vip(od);
> +            lr_stateful_rec->has_distributed_lb = od-
> >is_distributed;
>          }
>  
>          return EN_HANDLED_UPDATED;
> @@ -527,7 +528,9 @@ lr_stateful_record_create(struct
> lr_stateful_table *table,
>      if (nbr->n_nat) {
>          lr_stateful_rebuild_vip_nats(lr_stateful_rec);
>      }
> +
>      lr_stateful_rec->has_lb_vip = od_has_lb_vip(od);
> +    lr_stateful_rec->has_distributed_lb = od->is_distributed;
>  
>      hmap_insert(&table->entries, &lr_stateful_rec->key_node,
>                  uuid_hash(&lr_stateful_rec->nbr_uuid));
> diff --git a/northd/en-lr-stateful.h b/northd/en-lr-stateful.h
> index 146f768c3..3b0c54521 100644
> --- a/northd/en-lr-stateful.h
> +++ b/northd/en-lr-stateful.h
> @@ -59,6 +59,8 @@ struct lr_stateful_record {
>  
>      bool has_lb_vip;
>  
> +    bool has_distributed_lb;
> +
>      /* Load Balancer vIPs relevant for this datapath. */
>      struct ovn_lb_ip_set *lb_ips;
>  
> diff --git a/northd/lb.c b/northd/lb.c
> index 0822bc2d7..bfb255ccd 100644
> --- a/northd/lb.c
> +++ b/northd/lb.c
> @@ -85,12 +85,12 @@ ovn_lb_ip_set_clone(struct ovn_lb_ip_set
> *lb_ip_set)
>      return clone;
>  }
>  
> -static
> -void ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb,
> -                            const struct ovn_lb_vip *lb_vip,
> -                            const struct nbrec_load_balancer
> *nbrec_lb,
> -                            const char *vip_port_str, const char
> *backend_ips,
> -                            bool template)
> +static void
> +ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb,
> +                       const struct ovn_lb_vip *lb_vip,
> +                       const struct nbrec_load_balancer *nbrec_lb,
> +                       const char *vip_port_str, const char
> *backend_ips,
> +                       bool template)
>  {
>      lb_vip_nb->backend_ips = xstrdup(backend_ips);
>      lb_vip_nb->n_backends = vector_len(&lb_vip->backends);
> @@ -101,19 +101,22 @@ void ovn_northd_lb_vip_init(struct
> ovn_northd_lb_vip *lb_vip_nb,
>  }
>  
>  /*
> - * Initializes health check configuration for load balancer VIP
> - * backends. Parses the ip_port_mappings in the format :
> - * "ip:logical_port:src_ip[:az_name]".
> + * Parses ip_port_mappings in the format :
> + * "ip:logical_port[:src_ip][:az_name]".
> + * src_ip parameter is optional when distributed mode is enabled,
> + * without health checks configured.
>   * If az_name is present and non-empty, it indicates this is a
>   * remote service monitor (backend is in another availability zone),
>   * it should be propogated to another AZ by interconnection
> processing.
>   */
>  static void
> -ovn_lb_vip_backends_health_check_init(const struct ovn_northd_lb
> *lb,
> -                                      const struct ovn_lb_vip
> *lb_vip,
> -                                      struct ovn_northd_lb_vip
> *lb_vip_nb)
> +ovn_lb_vip_backends_ip_port_mappings_init(const struct ovn_northd_lb
> *lb,
> +                                          const struct ovn_lb_vip
> *lb_vip,
> +                                          struct ovn_northd_lb_vip
> *lb_vip_nb)
>  {
>      struct ds key = DS_EMPTY_INITIALIZER;
> +    bool allow_without_src_ip = lb->is_distributed
> +                                && !lb_vip_nb->lb_health_check;
>  
>      for (size_t j = 0; j < vector_len(&lb_vip->backends); j++) {
>          const struct ovn_lb_backend *backend =
> @@ -127,26 +130,34 @@ ovn_lb_vip_backends_health_check_init(const
> struct ovn_northd_lb *lb,
>              continue;
>          }
>  
> -        char *svc_mon_src_ip = NULL;
> -        char *az_name = NULL;
> +        struct ovn_northd_lb_backend *backend_nb = NULL;
> +        char *port_name = NULL, *az_name = NULL, *first_colon =
> NULL;
> +        char *svc_mon_src_ip = NULL, *src_ip = NULL;
>          bool is_remote = false;
> -        char *port_name = xstrdup(s);
> -        char *src_ip = NULL;
>  
> -        char *first_colon = strchr(port_name, ':');
> -        if (!first_colon) {
> -            free(port_name);
> -            continue;
> +        port_name = xstrdup(s);
> +        first_colon = strchr(port_name, ':');
> +
> +        if (!first_colon && allow_without_src_ip) {
> +            if (!*port_name) {
> +                VLOG_WARN("Empty port name in distributed mode for
> IP %s",
> +                          ds_cstr(&key));
> +                goto cleanup;
> +            }
> +            is_remote = false;
> +            goto init_backend;
> +        } else if (!first_colon) {
> +            VLOG_WARN("Expected ':' separator for: %s", port_name);
> +            goto cleanup;
>          }
> -        *first_colon = '\0';
>  
> +        *first_colon = '\0';
>          if (first_colon[1] == '[') {
>              /* IPv6 case - format: port:[ipv6]:az or port:[ipv6] */
>              char *ip_end = strchr(first_colon + 2, ']');
>              if (!ip_end) {
>                  VLOG_WARN("Malformed IPv6 address in backend %s",
> s);
> -                free(port_name);
> -                continue;
> +                goto cleanup;
>              }
>  
>              src_ip = first_colon + 2;
> @@ -157,8 +168,7 @@ ovn_lb_vip_backends_health_check_init(const
> struct ovn_northd_lb *lb,
>                  if (!*az_name) {
>                      VLOG_WARN("Empty AZ name specified for backend
> %s",
>                                port_name);
> -                    free(port_name);
> -                    continue;
> +                    goto cleanup;
>                  }
>                  is_remote = true;
>              }
> @@ -172,31 +182,31 @@ ovn_lb_vip_backends_health_check_init(const
> struct ovn_northd_lb *lb,
>                  if (!*az_name) {
>                      VLOG_WARN("Empty AZ name specified for backend
> %s",
>                                port_name);
> -                    free(port_name);
> -                    continue;
> +                    goto cleanup;
>                  }
> -            is_remote = true;
> +                is_remote = true;
>              }
>          }
>  
>          struct sockaddr_storage svc_mon_src_addr;
>          if (!src_ip || !inet_parse_address(src_ip,
> &svc_mon_src_addr)) {
>              VLOG_WARN("Invalid svc mon src IP %s", src_ip ? src_ip :
> "NULL");
> +            goto cleanup;
>          } else {
>              struct ds src_ip_s = DS_EMPTY_INITIALIZER;
>              ss_format_address_nobracks(&svc_mon_src_addr,
> &src_ip_s);
>              svc_mon_src_ip = ds_steal_cstr(&src_ip_s);
>          }
>  
> -        if (svc_mon_src_ip) {
> -            struct ovn_northd_lb_backend *backend_nb =
> -                &lb_vip_nb->backends_nb[j];
> -            backend_nb->health_check = true;
> -            backend_nb->logical_port = xstrdup(port_name);
> -            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
> -            backend_nb->az_name = is_remote ? xstrdup(az_name) :
> NULL;
> -            backend_nb->remote_backend = is_remote;
> -        }
> +init_backend:
> +        backend_nb = &lb_vip_nb->backends_nb[j];
> +        backend_nb->health_check = lb_vip_nb->lb_health_check;
> +        backend_nb->logical_port = xstrdup(port_name);
> +        backend_nb->svc_mon_src_ip = svc_mon_src_ip;
> +        backend_nb->az_name = is_remote ? xstrdup(az_name) : NULL;
> +        backend_nb->remote_backend = is_remote;
> +        backend_nb->distributed_backend = lb->is_distributed;
> +cleanup:
>          free(port_name);
>      }
>  
> @@ -367,6 +377,9 @@ ovn_northd_lb_init(struct ovn_northd_lb *lb,
>          lb->hairpin_snat_ip = xstrdup(snat_ip);
>      }
>  
> +    lb->is_distributed = smap_get_bool(&nbrec_lb->options,
> "distributed",
> +                                       false);
> +
>      sset_init(&lb->ips_v4);
>      sset_init(&lb->ips_v6);
>      struct smap_node *node;
> @@ -406,8 +419,8 @@ ovn_northd_lb_init(struct ovn_northd_lb *lb,
>          }
>          n_vips++;
>  
> -        if (lb_vip_nb->lb_health_check) {
> -            ovn_lb_vip_backends_health_check_init(lb, lb_vip,
> lb_vip_nb);
> +        if (lb_vip_nb->lb_health_check || lb->is_distributed) {
> +            ovn_lb_vip_backends_ip_port_mappings_init(lb, lb_vip,
> lb_vip_nb);
>          }
>      }
>  
> diff --git a/northd/lb.h b/northd/lb.h
> index 53dc4abf0..0f2555a1e 100644
> --- a/northd/lb.h
> +++ b/northd/lb.h
> @@ -74,8 +74,12 @@ struct ovn_northd_lb {
>      /* Indicates if the load balancer has health checks configured.
> */
>      bool health_checks;
>  
> -    char *hairpin_snat_ip;
> +    /* Indicates if distributed option is enabled for load balancer.
> */
> +    bool is_distributed;
> +
>      bool use_stateless_nat;
> +
> +    char *hairpin_snat_ip;
>  };
>  
>  /* ovn-northd specific backend information. */
> @@ -91,6 +95,7 @@ struct ovn_northd_lb_backend {
>      bool health_check;
>       /* Set to true if port does not locate in local AZ. */
>      bool remote_backend;
> +    bool distributed_backend;
>      /* Logical port to which the ip belong to. */
>      char *logical_port;
>      /* Source IP address to be used for service monitoring. */
> diff --git a/northd/northd.c b/northd/northd.c
> index 2d01a01d9..a8c7df7a6 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -565,6 +565,7 @@ ovn_datapath_create(struct hmap *datapaths, const
> struct uuid *key,
>      od->localnet_ports = VECTOR_EMPTY_INITIALIZER(struct ovn_port
> *);
>      od->lb_with_stateless_mode = false;
>      od->ipam_info_initialized = false;
> +    od->is_distributed = false;
>      od->tunnel_key = sdp->sb_dp->tunnel_key;
>      init_mcast_info_for_datapath(od);
>      return od;
> @@ -3315,6 +3316,46 @@ ovn_lb_svc_create(struct ovsdb_idl_txn
> *ovnsb_txn,
>      }
>  }
>  
> +static bool
> +backend_is_available(const struct ovn_northd_lb *lb,
> +                     const struct ovn_lb_backend *backend,
> +                     const struct ovn_northd_lb_backend *backend_nb,
> +                     const struct svc_monitors_map_data
> *svc_mons_data)
> +{
> +    struct service_monitor_info *mon_info =
> +        get_service_mon(svc_mons_data->local_svc_monitors_map,
> +                        svc_mons_data->ic_learned_svc_monitors_map,
> +                        backend->ip_str,
> +                        backend_nb->logical_port,
> +                        backend->port,
> +                        lb->nlb->protocol && lb->nlb->protocol[0] ?
> +                        lb->nlb->protocol : "tcp");
> +
> +    if (!mon_info) {
> +        return false;
> +    }
> +
> +    ovs_assert(mon_info->sbrec_mon);
> +
> +    return  mon_info->sbrec_mon->status &&
> +            strcmp(mon_info->sbrec_mon->status, "online") ?
> +            false : true;

Wouldn't this function return `true` if `mon_info->sbrec_mon->status`
is NULL? Maybe I'm just misunderstanding the way this ternary
expression is evaluated; perhaps additional parentheses would help to
better draw the individual "groups" of conditions that are evaluated.

nit: There are two spaces between "return" and "mon_info->"

> +}
> +
> +static inline void
> +append_lb_backend_to_action(const struct ovn_lb_backend *backend,
> +                            const struct ovn_northd_lb_backend
> *backend_nb,
> +                            bool distributed_mode,
> +                            struct ds *action)
> +{
> +    bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip);
> +    if (distributed_mode) {
> +        ds_put_format(action, "\"%s\":", backend_nb->logical_port);
> +    }
> +    ds_put_format(action, ipv6 ? "[%s]:%"PRIu16"," :
> "%s:%"PRIu16",",
> +                  backend->ip_str, backend->port);
> +}
> +
>  static bool
>  build_lb_vip_actions(const struct ovn_northd_lb *lb,
>                       const struct ovn_lb_vip *lb_vip,
> @@ -3340,9 +3381,11 @@ build_lb_vip_actions(const struct
> ovn_northd_lb *lb,
>          }
>      }
>  
> -    if (lb_vip_nb->lb_health_check) {
> -        ds_put_cstr(action, "ct_lb_mark(backends=");
> +    ds_put_format(action, "%s", lb->is_distributed
> +                  ? "ct_lb_mark_local(backends="
> +                  : "ct_lb_mark(backends=");
>  
> +    if (lb_vip_nb->lb_health_check || lb->is_distributed) {
>          size_t i = 0;
>          size_t n_active_backends = 0;
>          const struct ovn_lb_backend *backend;
> @@ -3350,45 +3393,37 @@ build_lb_vip_actions(const struct
> ovn_northd_lb *lb,
>              struct ovn_northd_lb_backend *backend_nb =
>                  &lb_vip_nb->backends_nb[i++];
>  
> -            if (!backend_nb->health_check) {
> +            if (lb_vip_nb->lb_health_check &&
> +                !backend_nb->health_check) {

nit: This condition is a bit confusing to me. Could you please add a
short comment that explains when this situation can happen?
>                  continue;
>              }
>  
> -            const char *protocol = lb->nlb->protocol;
> -            if (!protocol || !protocol[0]) {
> -                protocol = "tcp";
> -            }
> -
> -            struct service_monitor_info *mon_info =
> -                get_service_mon(svc_mons_data-
> >local_svc_monitors_map,
> -                                svc_mons_data-
> >ic_learned_svc_monitors_map,
> -                                backend->ip_str,
> -                                backend_nb->logical_port,
> -                                backend->port,
> -                                protocol);
> -
> -            if (!mon_info) {
> +            if (lb->is_distributed &&
> +                !backend_nb->distributed_backend) {
>                  continue;
>              }
>  
> -            ovs_assert(mon_info->sbrec_mon);
> -            if (mon_info->sbrec_mon->status &&
> -                    strcmp(mon_info->sbrec_mon->status, "online")) {
> +
> +            if (backend_nb->health_check &&
> +                !backend_is_available(lb,
> +                                      backend,
> +                                      backend_nb,
> +                                      svc_mons_data)) {
>                  continue;
>              }
>  
>              n_active_backends++;
> -            bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip);
> -            ds_put_format(action, ipv6 ? "[%s]:%"PRIu16"," :
> "%s:%"PRIu16",",
> -                          backend->ip_str, backend->port);
> +            append_lb_backend_to_action(backend,
> +                                        backend_nb,
> +                                        backend_nb-
> >distributed_backend,
> +                                        action);
>          }
>          ds_chomp(action, ',');
>  
>          drop = !n_active_backends && !lb_vip->empty_backend_rej;
>          reject = !n_active_backends && lb_vip->empty_backend_rej;
>      } else {
> -        ds_put_format(action, "ct_lb_mark(backends=%s",
> -                      lb_vip_nb->backend_ips);
> +        ds_put_format(action, "%s", lb_vip_nb->backend_ips);
>      }
>  
>      if (reject) {
> @@ -3425,6 +3460,20 @@ build_lb_vip_actions(const struct
> ovn_northd_lb *lb,
>      return reject;
>  }
>  
> +static inline void
> +handle_od_lb_datapath_modes(struct ovn_datapath *od,
> +                            struct ovn_lb_datapaths *lb_dps,
> +                            bool od_is_switch)
> +{

nit/question: Couldn't you use the NULL value of either `od->nbr` or
`od->nbs` to determine whether the OD is a switch or a router, instead
of using an explicit function argument?

> +    if (od_is_switch && od->lb_with_stateless_mode) {
> +        hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od);
> +    }
> +
> +    if (!od_is_switch && lb_dps->lb->is_distributed) {
> +        od->is_distributed = true;
> +    }
> +}
> +
>  static void
>  build_lb_datapaths(const struct hmap *lbs, const struct hmap
> *lb_groups,
>                     struct ovn_datapaths *ls_datapaths,
> @@ -3467,9 +3516,7 @@ build_lb_datapaths(const struct hmap *lbs,
> const struct hmap *lb_groups,
>              lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
> lb_uuid);
>              ovs_assert(lb_dps);
>              ovn_lb_datapaths_add_ls(lb_dps, 1, &od,
> ods_size(ls_datapaths));
> -            if (od->lb_with_stateless_mode) {
> -                hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od);
> -            }
> +            handle_od_lb_datapath_modes(od, lb_dps, true);
>          }
>  
>          for (size_t i = 0; i < od->nbs->n_load_balancer_group; i++)
> {
> @@ -3503,6 +3550,7 @@ build_lb_datapaths(const struct hmap *lbs,
> const struct hmap *lb_groups,
>              lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
> lb_uuid);
>              ovs_assert(lb_dps);
>              ovn_lb_datapaths_add_lr(lb_dps, 1, &od,
> ods_size(lr_datapaths));
> +            handle_od_lb_datapath_modes(od, lb_dps, false);
>          }
>      }
>  
> @@ -3855,6 +3903,7 @@ sync_pb_for_lrp(struct ovn_port *op,
>          smap_add(&new, "distributed-port", op->primary_port->key);
>  
>          bool always_redirect =
> +            !lr_stateful_rec->has_distributed_lb &&
>              !lr_stateful_rec->lrnat_rec->has_distributed_nat &&
>              !l3dgw_port_has_associated_vtep_lports(op-
> >primary_port);
>  
> @@ -5444,10 +5493,7 @@ northd_handle_lb_data_changes(struct
> tracked_lb_data *trk_lb_data,
>              lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
> &uuidnode->uuid);
>              ovs_assert(lb_dps);
>              ovn_lb_datapaths_add_ls(lb_dps, 1, &od,
> ods_size(ls_datapaths));
> -
> -            if (od->lb_with_stateless_mode) {
> -                hmapx_add(&lb_dps->ls_lb_with_stateless_mode, od);
> -            }
> +            handle_od_lb_datapath_modes(od, lb_dps, true);
>  
>              /* Add the lb to the northd tracked data. */
>              hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps);
> @@ -5486,6 +5532,7 @@ northd_handle_lb_data_changes(struct
> tracked_lb_data *trk_lb_data,
>              lb_dps = ovn_lb_datapaths_find(lb_datapaths_map,
> &uuidnode->uuid);
>              ovs_assert(lb_dps);
>              ovn_lb_datapaths_add_lr(lb_dps, 1, &od,
> ods_size(lr_datapaths));
> +            handle_od_lb_datapath_modes(od, lb_dps, false);
>  
>              /* Add the lb to the northd tracked data. */
>              hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps);
> @@ -9689,22 +9736,15 @@ build_lswitch_arp_chassis_resident(const
> struct ovn_datapath *od,
>  {
>      struct sset distributed_nat_ports =
>          SSET_INITIALIZER(&distributed_nat_ports);
> -    struct sset resident_ports = SSET_INITIALIZER(&resident_ports);
> -    struct sset inports = SSET_INITIALIZER(&inports);
> +    struct hmapx resident_ports =
> HMAPX_INITIALIZER(&resident_ports);
>      struct ds match = DS_EMPTY_INITIALIZER;
>  
> -    struct hmapx_node *node;
> -    HMAPX_FOR_EACH (node, &od->phys_ports) {
> -        struct ovn_port *op = node->data;
> -        sset_add(&inports, op->json_key);
> -    }
> -
>      struct ovn_port *op;
>      VECTOR_FOR_EACH (&od->router_ports, op) {
>          struct ovn_port *op_r = op->peer;
>  
>          if (lrp_is_l3dgw(op_r)) {
> -            sset_add(&resident_ports, op_r->cr_port->json_key);
> +            hmapx_add(&resident_ports, op_r);
>          }
>      }
>  
> @@ -9720,23 +9760,32 @@ build_lswitch_arp_chassis_resident(const
> struct ovn_datapath *od,
>          }
>      }
>  
> -    if (!sset_is_empty(&inports) && !sset_is_empty(&resident_ports))
> {
> +    if (!hmapx_is_empty(&od->phys_ports) &&
> !hmapx_is_empty(&resident_ports)) {
> +        struct hmapx_node *node;
>          const char *port_name;
>  
> -        SSET_FOR_EACH (port_name, &inports) {
> +        HMAPX_FOR_EACH (node, &od->phys_ports) {
> +            op = node->data;
> +
>              ds_clear(&match);
> -            ds_put_format(&match, "arp.op == 1 && inport == %s",
> -                          port_name);
> +            ds_put_format(&match, "(arp.op == 1 || arp.op == 2) "
> +                                  "&& inport == %s",
> +                          op->json_key);
>              ovn_lflow_add(lflows, od, S_SWITCH_IN_CHECK_PORT_SEC,
> 75,
>                            ds_cstr(&match), REGBIT_EXT_ARP " = 1;
> next;",
>                            ar->lflow_ref);
>          }
>  
> -        SSET_FOR_EACH (port_name, &resident_ports) {
> +        HMAPX_FOR_EACH (node, &resident_ports) {
> +            op = node->data;
> +
>              ds_clear(&match);
> -            ds_put_format(&match, REGBIT_EXT_ARP" == 1 "
> -                                  "&& is_chassis_resident(%s)",
> -                          port_name);
> +            ds_put_format(&match, REGBIT_EXT_ARP" == 1");
> +            if (od_is_centralized(op->od)) {
> +                ds_put_format(&match, " && is_chassis_resident(%s)",
> +                              op
> +                              ->cr_port->json_key);

nit: the "->cr_port" probably doesn't need to be on its own line. 

> +            }
>              ovn_lflow_add(lflows, od, S_SWITCH_IN_APPLY_PORT_SEC,
> 75,
>                            ds_cstr(&match), "next;", ar->lflow_ref);
>          }
> @@ -9755,8 +9804,7 @@ build_lswitch_arp_chassis_resident(const struct
> ovn_datapath *od,
>      }
>  
>      sset_destroy(&distributed_nat_ports);
> -    sset_destroy(&resident_ports);
> -    sset_destroy(&inports);
> +    hmapx_destroy(&resident_ports);
>      ds_destroy(&match);
>  }
>  
> @@ -10769,8 +10817,13 @@ build_lswitch_ip_unicast_lookup(struct
> ovn_port *op,
>                           : debug_drop_action();
>  
>      if (lsp_is_router(op->nbsp) && op->peer && op->peer->nbrp) {
> +        /* Distributed gateway ports default to centralized mode.
> +         * They operate in distributed mode only when configured
> +         * on their bound router. */
> +        bool peer_lrp_is_centralized = od_is_centralized(op->peer-
> >od);
> +
>          /* For ports connected to logical routers add flows to
> bypass the
> -         * broadcast flooding of ARP/ND requests in table 19. We
> direct the
> +         * broadcast flooding of ARP/ND requests in table 22. We
> direct the
>           * requests only to the router port that owns the IP
> address.
>           */
>          build_lswitch_rport_arp_req_flows(op->peer, op->od, op,
> lflows,
> @@ -10785,7 +10838,9 @@ build_lswitch_ip_unicast_lookup(struct
> ovn_port *op,
>              ds_put_format(match, "eth.dst == %s", op->peer-
> >lrp_networks.ea_s);
>          }
>  
> -        if (!vector_is_empty(&op->peer->od->l3dgw_ports) &&
> +
> +        if (peer_lrp_is_centralized &&
> +            !vector_is_empty(&op->peer->od->l3dgw_ports) &&
>              !vector_is_empty(&op->od->localnet_ports)) {
>              add_lrp_chassis_resident_check(op->peer, match);
>          } else if (op->cr_port) {
> @@ -12635,6 +12690,13 @@ build_distr_lrouter_nat_flows_for_lb(struct
> lrouter_nat_lb_flows_ctx *ctx,
>      size_t new_match_len = ctx->new_match->length;
>      size_t undnat_match_len = ctx->undnat_match->length;
>  
> +    bool lb_is_centralized = !ctx->lb->is_distributed;
> +
> +    /* If load balancer is distributed, then the response traffic
> +     * must be returned through the distributed port.*/
> +    const char *gw_outport = lb_is_centralized ? dgp->cr_port-
> >json_key
> +                                               : dgp->json_key;
> +
>      const char *meter = NULL;
>  
>      if (ctx->reject) {
> @@ -12646,8 +12708,9 @@ build_distr_lrouter_nat_flows_for_lb(struct
> lrouter_nat_lb_flows_ctx *ctx,
>                                                      dgp, meter);
>      }
>  
> -    if (!vector_is_empty(&ctx->lb_vip->backends) ||
> -        !ctx->lb_vip->empty_backend_rej) {
> +    if (lb_is_centralized &&
> +        (!vector_is_empty(&ctx->lb_vip->backends) ||
> +        !ctx->lb_vip->empty_backend_rej)) {
>          ds_put_format(ctx->new_match, " && is_chassis_resident(%s)",
>                        dgp->cr_port->json_key);
>      }
> @@ -12684,18 +12747,21 @@ build_distr_lrouter_nat_flows_for_lb(struct
> lrouter_nat_lb_flows_ctx *ctx,
>       * the undnat stage.
>       */
>      ds_put_format(ctx->undnat_match, ") && outport == %s", dgp-
> >json_key);
> -    ds_clear(ctx->gw_redir_action);
> -    ds_put_format(ctx->gw_redir_action, "outport = %s; next;",
> -                  dgp->cr_port->json_key);
> +    ds_put_format(ctx->gw_redir_action,
> +                  "outport = %s; next;", gw_outport);
>  
>      ovn_lflow_add(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT, 200,
>                    ds_cstr(ctx->undnat_match), ds_cstr(ctx-
> >gw_redir_action),
>                    lflow_ref, WITH_HINT(&ctx->lb->nlb->header_));
>      ds_truncate(ctx->undnat_match, undnat_match_len);
>  
> -    ds_put_format(ctx->undnat_match, ") && (inport == %s || outport
> == %s)"
> -                  " && is_chassis_resident(%s)", dgp->json_key, dgp-
> >json_key,
> -                  dgp->cr_port->json_key);
> +    ds_put_format(ctx->undnat_match, ") && (inport == %s || outport
> == %s)",
> +                  dgp->json_key, dgp->json_key);
> +
> +    if (lb_is_centralized) {
> +        ds_put_format(ctx->undnat_match, " &&
> is_chassis_resident(%s)",
> +                      dgp->cr_port->json_key);
> +    }
>      ovn_lflow_add(ctx->lflows, od, S_ROUTER_OUT_UNDNAT, 120,
>                    ds_cstr(ctx->undnat_match),
> ds_cstr(&undnat_action),
>                    lflow_ref, WITH_HINT(&ctx->lb->nlb->header_));
> @@ -14022,6 +14088,10 @@ build_gateway_mtu_flow(struct lflow_table
> *lflows, struct ovn_port *op,
>  static bool
>  consider_l3dgw_port_is_centralized(struct ovn_port *op)
>  {
> +    if (!od_is_centralized(op->od)) {
> +        return false;
> +    }
> +
>      if (l3dgw_port_has_associated_vtep_lports(op)) {
>          return false;
>      }
> @@ -16224,7 +16294,7 @@ build_ipv6_input_flows_for_lrouter_port(
>       * router's own IP address. */
>      for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
>          ds_clear(match);
> -        if (lrp_is_l3dgw(op)) {
> +        if (lrp_is_l3dgw(op) && od_is_centralized(op->od)) {
>              /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
>               * should only be sent from the gateway chassi, so that
>               * upstream MAC learning points to the gateway chassis.
> @@ -16500,7 +16570,8 @@ build_lrouter_ipv4_ip_input(struct ovn_port
> *op,
>                        op->lrp_networks.ipv4_addrs[i].network_s,
>                        op->lrp_networks.ipv4_addrs[i].plen);
>  
> -        if (!vector_is_empty(&op->od->l3dgw_ports) && op->peer
> +        if (od_is_centralized(op->od) &&
> +            !vector_is_empty(&op->od->l3dgw_ports) && op->peer
>              && !vector_is_empty(&op->peer->od->localnet_ports)) {
>              add_lrp_chassis_resident_check(op, match);
>          }
> diff --git a/northd/northd.h b/northd/northd.h
> index e4d7de9f9..84b76f886 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -450,6 +450,11 @@ struct ovn_datapath {
>      /* Indicates that the LS has valid vni associated with it. */
>      bool has_evpn_vni;
>  
> +    /* True if the datapath has distributed dependencies.
> +     * Currently, this only applies to load balancers attached to the
> +     * datapath with distributed mode enabled. */
> +    bool is_distributed;
> +
>      /* OVN northd only needs to know about logical router gateway
> ports for
>       * NAT/LB on a distributed router.  The "distributed gateway
> ports" are
>       * populated only when there is a gateway chassis or ha chassis
> group
> @@ -1151,6 +1156,18 @@ ovn_port_must_learn_route(const struct
> ovn_port *op,
>      return true;
>  }
>  
> +/* Returns true if datapath 'od' operates in centralized mode on the
> + * gateway chassis.
> + *
> + * Returns false when the datapath is distributed. A datapath is
> + * distributed only when configured with the 'distributed' option
> + * enabled. In distributed mode, ARP/ND processing is handled locally
> + * on each node.
> + */
> +static inline bool
> +od_is_centralized(const struct ovn_datapath *od)
> +{
> +    return !od->is_distributed;
> +}
> +

nit/personal opinion: Having a function that takes a struct just to
return the inverted value of one of its members feels a bit
redundant/overkill. But I might be wrong.
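For concreteness, the pattern in question can be sketched with a
stripped-down stand-in for the real struct (illustration only, not the
actual OVN sources):

```c
#include <assert.h>
#include <stdbool.h>

/* Stripped-down stand-in for the real struct ovn_datapath; only the
 * member relevant to this discussion is kept. */
struct ovn_datapath {
    bool is_distributed;
};

/* The helper from the patch: it just inverts a single member. */
static inline bool
od_is_centralized(const struct ovn_datapath *od)
{
    return !od->is_distributed;
}
```

i.e. every call site could equally spell this as `!od->is_distributed`;
the helper mostly buys a documented name for what "centralized" means.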

 
>  struct ovn_port *ovn_port_find(const struct hmap *ports, const char
> *name);
>  
>  void build_igmp_lflows(struct hmap *igmp_groups,
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index e74c0d010..02a864151 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -2399,13 +2399,15 @@
>          <p>
>            Maps from endpoint IP to a colon-separated pair of logical
> port name
>            and source IP,
> -          e.g. <code><var>port_name</var>:<var>sourc_ip</var></code>
> for IPv4.
> +          e.g.
> <code><var>port_name</var>:<var>source_ip</var></code> for IPv4.
>            Health checks are sent to this port with the specified
> source IP.
>            For IPv6 square brackets must be used around IP address,
> e.g:
> -          <code><var>port_name</var>:<var>[sourc_ip]</var></code>
> +          <code><var>port_name</var>:<var>[source_ip]</var></code>
>            Remote endpoint:
>            Specify :target_zone_name at the end of the above syntax
> to create
>            remote health checks in a specific zone.
> +          For distributed load balancers, ip_port_mappings is
> +          required. In the absence of health checks, source_ip is
> +          optional.
>          </p>
>  
>          <p>
> @@ -2608,6 +2610,16 @@ or
>          traffic may be dropped in scenarios where we have different
> chassis
>          for each DGP. This option is set to <code>false</code> by
> default.
>        </column>
> +
> +      <column name="options" key="distributed">
> +        This option enables distributed load balancing across compute
> +        nodes, ensuring traffic is always routed to local backends and
> +        eliminating east-west traffic between nodes.
> +        Required configuration: <ref column="ip_port_mappings"/>.
> +        NOTE: The addressing of the underlay network must not overlap
> +        with the addressing of the Load Balancer VIP.

Thank you for adding this note/expansion. I wonder if we could also add
a bit more explanation to it, e.g.:

NOTE: The addressing of the underlay network must not overlap with the
addressing of the Load Balancer VIP. If the Load Balancer is attached
to a router that is directly connected to the underlay network and the
VIP belongs to the same subnet as used on the underlay network, the
traffic won't be spread across all chassis! Instead, it will be
concentrated only on the chassis that hosts the Distributed Gateway
Port of the router.

It doesn't have to be word-for-word, but I believe we should mention
the potential effect that this misconfiguration could cause. I'm sure
it will help someone in the future when debugging a mysterious LB
behavior :D
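The failure mode is easier to reason about if the overlap condition is
written out. A minimal sketch, assuming a hypothetical helper
vip_overlaps_underlay() that is not part of OVN:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if 'vip' falls inside the IPv4
 * subnet 'net'/'plen', i.e. the misconfiguration described in the
 * NOTE above.  Illustration only; not part of the OVN sources. */
static bool
vip_overlaps_underlay(const char *vip, const char *net, int plen)
{
    struct in_addr vip_addr, net_addr;

    if (inet_pton(AF_INET, vip, &vip_addr) != 1 ||
        inet_pton(AF_INET, net, &net_addr) != 1) {
        return false;
    }
    /* Build the netmask in network byte order; plen == 0 matches all. */
    uint32_t mask = plen ? htonl(~0u << (32 - plen)) : 0;
    return (vip_addr.s_addr & mask) == (net_addr.s_addr & mask);
}
```

With the cover-letter example, a VIP of 1.1.1.1 is fine against an
underlay like 203.0.113.0/24, but would trip the check against
1.1.1.0/24.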



> +      </column>
> +
>      </group>
>    </table>
>  
> diff --git a/ovn-sb.xml b/ovn-sb.xml
> index 623aaeffd..19c7a6877 100644
> --- a/ovn-sb.xml
> +++ b/ovn-sb.xml
> @@ -2130,6 +2130,17 @@
>            </p>
>          </dd>
>  
> +        <dt><code>ct_lb_mark_local;</code></dt>
> +        <dt><code>ct_lb_mark_local(backends=<var>lport_name</var>:<var>ip</var>[:<var>port</var>][,...][;
> hash_fields=<var>field1</var>,<var>field2</var>,...][;
> ct_flag]);</code></dt>
> +        <dd>
> +          <p>
> +              Same as <code>ct_lb_mark</code>, with the key
> +              difference that it implements local-only load balancing.
> +              This mode selects backends only from those running on
> +              the current chassis, preventing traffic from being
> +              forwarded to backends on remote nodes.
> +          </p>
> +        </dd>
> +
>          <dt>
>            <code><var>R</var> = dns_lookup();</code>
>          </dt>
> diff --git a/tests/multinode-macros.at b/tests/multinode-macros.at
> index ad09ac562..24b15f64b 100644
> --- a/tests/multinode-macros.at
> +++ b/tests/multinode-macros.at
> @@ -22,6 +22,20 @@ m4_define([M_NS_CHECK_EXEC],
>      [ AT_CHECK([M_NS_EXEC([$1], [$2], [$3])],
> m4_shift(m4_shift(m4_shift($@)))) ]
>  )
>  
> +# M_FAKE_NODE_EXEC([fake_node], [command])
> +#
> +# Execute 'command' in 'fake_node'
> +m4_define([M_FAKE_NODE_EXEC],
> +    [podman exec $1 $2])
> +
> +# M_FAKE_NODE_CHECK_EXEC([fake_node], [command], other_params...)
> +#
> +# Wrapper for AT_CHECK that executes 'command' inside 'fake_node'.
> +# 'other_params' are passed as they are to AT_CHECK.
> +m4_define([M_FAKE_NODE_CHECK_EXEC],
> +    [ AT_CHECK([M_FAKE_NODE_EXEC([$1], [$2])],
> m4_shift(m4_shift($@))) ]
> +)
> +
>  # M_DAEMONIZE([fake_node],[command],[pidfile])
>  m4_define([M_DAEMONIZE],
>      [podman exec $1 $2 & echo $! > $3
> diff --git a/tests/multinode.at b/tests/multinode.at
> index 83cdd6c01..b1336c31c 100644
> --- a/tests/multinode.at
> +++ b/tests/multinode.at
> @@ -4296,3 +4296,162 @@ M_NS_CHECK_EXEC([ovn-chassis-2], [ovn-ext2],
> [ping6 -q -c 3 -i 0.3 -w 2 6812:86:
>  m_wait_row_count mac_binding 1 ip="6812\:86\:\:102"
> logical_port="lr1-pub"
>  
>  AT_CLEANUP
> +
> +AT_SETUP([Distributed load balancing: IPv4])
> +#
> +# ┌──────────────────────┐
> +# │ fabric (leaf switch) │
> +# │       gw-1           │
> +# └─────────┬────────────┘
> +#           │
> +#    ┌──────┴──────┐
> +#    │             │
> +#    ▼             ▼
> +#  route          route
> +#  weight 1      weight 2 (2 backends)
> +#    │             │
> +#    ▼             ▼
> +# ┌───────┐      ┌───────┐
> +# │Chassis│      │Chassis│
> +# │   1   │      │   2   │
> +# └───────┘      └───────┘
> +#
> +
> +check_fake_multinode_setup
> +cleanup_multinode_resources
> +for c in ovn-chassis-1 ovn-chassis-2
> +do
> +    m_as $c ip link del lb-host
> +    m_as $c ovs-vsctl del-br br-lb
> +    m_as $c ovs-vsctl del-port lb-ovs
> +done
> +
> +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip link show | grep -q
> genev_sys])
> +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip link show | grep -q
> genev_sys])
> +
> +check multinode_nbctl ls-add pub                \
> +    -- lsp-add-router-port pub pub-lr1 lr1-pub  \
> +    -- lsp-add-localnet-port pub pub-ln public
> +
> +check multinode_nbctl lr-add lr1 \
> +    -- lrp-add lr1 lr1-pub 00:00:00:00:00:01 169.254.1.254/24 \
> +    -- lrp-add lr1 lr1-down 00:00:00:00:00:02 192.168.1.254/24
> +
> +check multinode_nbctl ls-add ls1
> +check multinode_nbctl lsp-add ls1 ls1p1
> +check multinode_nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02
> 192.168.1.1"
> +check multinode_nbctl lsp-add ls1 ls1p2
> +check multinode_nbctl lsp-set-addresses ls1p2 "00:00:00:01:02:02
> 192.168.1.2"
> +check multinode_nbctl lsp-add ls1 ls1p3
> +check multinode_nbctl lsp-set-addresses ls1p3 "00:00:00:01:03:02
> 192.168.1.3"
> +check multinode_nbctl lsp-add-router-port ls1 ls1-lr1 lr1-down
> +
> +check multinode_nbctl lrp-set-gateway-chassis lr1-pub ovn-chassis-2
> +
> +# Create default route
> +check multinode_nbctl lr-route-add lr1 0.0.0.0/0 169.254.1.253 lr1-
> pub
> +
> +m_as ovn-chassis-1 /data/create_fake_vm.sh ls1p1 ls1p1
> 00:00:00:01:01:02 1500 192.168.1.1 24 192.168.1.254 2001::1/64
> 2001::a
> +m_as ovn-chassis-2 /data/create_fake_vm.sh ls1p2 ls1p2
> 00:00:00:01:02:02 1500 192.168.1.2 24 192.168.1.254 2001::2/64
> 2001::a
> +m_as ovn-chassis-2 /data/create_fake_vm.sh ls1p3 ls1p3
> 00:00:00:01:03:02 1500 192.168.1.3 24 192.168.1.254 2001::3/64
> 2001::a
> +
> +# Create load balancer
> +lb_vip="1.1.1.1"

nit: I think we should stay away from using real/routable IP addresses
in the tests (in this and other tests in this change). Let's use either
a private block or one of the blocks reserved for documentation (RFC
5737).
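For reference, RFC 5737 reserves 192.0.2.0/24, 198.51.100.0/24, and
203.0.113.0/24 for documentation; a small sketch with a hypothetical
is_documentation_ip() helper (not part of the test suite) shows which
addresses qualify:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, purely illustrative: returns true if 'ip' is
 * in one of the three IPv4 blocks RFC 5737 reserves for
 * documentation. */
static bool
is_documentation_ip(const char *ip)
{
    static const char *blocks[] = {
        "192.0.2.0",     /* TEST-NET-1 */
        "198.51.100.0",  /* TEST-NET-2 */
        "203.0.113.0",   /* TEST-NET-3 */
    };
    struct in_addr addr, net;
    uint32_t mask = htonl(0xffffff00);  /* all three blocks are /24s */

    if (inet_pton(AF_INET, ip, &addr) != 1) {
        return false;
    }
    for (size_t i = 0; i < sizeof blocks / sizeof blocks[0]; i++) {
        inet_pton(AF_INET, blocks[i], &net);
        if ((addr.s_addr & mask) == (net.s_addr & mask)) {
            return true;
        }
    }
    return false;
}
```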

Thanks again,
Martin.

> +check multinode_nbctl lb-add lb1 $lb_vip:80
> 192.168.1.1:10880,192.168.1.2:10880,192.168.1.3:10880
> +check multinode_nbctl lr-lb-add lr1 lb1
> +check multinode_nbctl set Load_Balancer lb1
> ip_port_mappings:192.168.1.1=ls1p1:192.168.1.199
> +check multinode_nbctl set Load_Balancer lb1
> ip_port_mappings:192.168.1.2=ls1p2:192.168.1.199
> +check multinode_nbctl set Load_Balancer lb1
> ip_port_mappings:192.168.1.3=ls1p3:192.168.1.199
> +check multinode_nbctl set load_balancer lb1 options:distributed=true
> +
> +ip_ch1=$(m_as ovn-chassis-1 ip a show dev eth1 | grep "inet " | awk
> '{print $2}'| cut -d '/' -f1)
> +ip_ch2=$(m_as ovn-chassis-2 ip a show dev eth1 | grep "inet " | awk
> '{print $2}'| cut -d '/' -f1)
> +
> +# Add multipath route to load balancer VIP with weighted nexthops on
> "fabric" host:
> +check m_as ovn-gw-1 ip route flush $lb_vip
> +check m_as ovn-gw-1 ip route add $lb_vip  nexthop via $ip_ch1 dev
> eth1 weight 1 nexthop via $ip_ch2 dev eth1 weight 2
> +
> +# Set kernel multipath hash policy to L3/L4 (source/destination
> IP+port)
> +# Policy 1 = Layer 3/4 hash (src/dst IP+port)
> +M_FAKE_NODE_CHECK_EXEC([ovn-gw-1], [sysctl -w
> net.ipv4.fib_multipath_hash_policy=1], \
> +[0], [dnl
> +net.ipv4.fib_multipath_hash_policy = 1
> +])
> +
> +# Check OpenFlow group filling: it should only contain local
> backends
> +M_FAKE_NODE_CHECK_EXEC([ovn-chassis-1], [ovs-ofctl dump-groups br-
> int | sed -e 's/table=[[0-9]]*/table=<cleared>/g'], [0], [dnl
> +NXST_GROUP_DESC reply (xid=0x2):
> +
> group_id=1,type=select,selection_method=dp_hash,bucket=bucket_id:0,we
> ight:100,actions=ct(commit,table=<cleared>,zone=NXM_NX_REG11[[0..15]]
> ,nat(dst=192.168.1.1:10880),exec(load:0x1->NXM_NX_CT_MARK[[1]]))
> +])
> +
> +M_FAKE_NODE_CHECK_EXEC([ovn-chassis-2], [ovs-ofctl dump-groups br-
> int | sed -e 's/table=[[0-9]]*/table=<cleared>/g'], [0], [dnl
> +NXST_GROUP_DESC reply (xid=0x2):
> +
> group_id=2,type=select,selection_method=dp_hash,bucket=bucket_id:1,we
> ight:100,actions=ct(commit,table=<cleared>,zone=NXM_NX_REG11[[0..15]]
> ,nat(dst=192.168.1.2:10880),exec(load:0x1-
> >NXM_NX_CT_MARK[[1]])),bucket=bucket_id:2,weight:100,actions=ct(commi
> t,table=<cleared>,zone=NXM_NX_REG11[[0..15]],nat(dst=192.168.1.3:1088
> 0),exec(load:0x1->NXM_NX_CT_MARK[[1]]))
> +])
> +
> +physical_gw_mac_address="30:42:f5:a7:46:65"
> +
> +# Configure infrastructure on chassis hosts:
> +# lb-host (physical_gw_mac_address) - (veth) lb-ovs - br-lb - br-int
> +for c in ovn-chassis-1 ovn-chassis-2
> +do
> +    check m_as $c ip link add lb-host type veth peer lb-ovs
> +    on_exit "m_as $c ip link del lb-host"
> +
> +    check m_as $c ip link set dev lb-host address
> $physical_gw_mac_address
> +    check m_as $c ip addr add 169.254.1.253/24 dev lb-host
> +    check m_as $c ip link set lb-host up
> +    check m_as $c ip link set lb-ovs up
> +
> +    check m_as $c ovs-vsctl add-br br-lb
> +    on_exit "m_as $c ovs-vsctl del-br br-lb"
> +    check m_as $c ovs-vsctl add-port br-lb lb-ovs
> +    on_exit "m_as $c ovs-vsctl del-port lb-ovs"
> +    check m_as $c ovs-vsctl set open . external-ids:ovn-bridge-
> mappings=public:br-lb
> +
> +    check m_as $c ip route flush $lb_vip
> +    check m_as $c ip r add $lb_vip via 169.254.1.254 dev lb-host
> +    on_exit "m_as $c ip route flush $lb_vip"
> +done
> +
> +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ovs-vsctl show | grep -q patch-
> pub-ln-to-br-int])
> +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ovs-vsctl show | grep -q patch-
> pub-ln-to-br-int])
> +
> +M_NS_DAEMONIZE([ovn-chassis-1], [ls1p1], [$PYTHON3
> /data/tcp_metadata_server.py \
> +                                                   --bind-host
> 192.168.1.1 \
> +                                                   --bind-port 10880
> \
> +                                                   --reply-string
> ls1p1 >/dev/null 2>&1], [nc1.pid])
> +M_NS_DAEMONIZE([ovn-chassis-2], [ls1p2], [$PYTHON3
> /data/tcp_metadata_server.py \
> +                                                   --bind-host
> 192.168.1.2 \
> +                                                   --bind-port 10880
> \
> +                                                   --reply-string
> ls1p2 >/dev/null 2>&1], [nc2.pid])
> +M_NS_DAEMONIZE([ovn-chassis-2], [ls1p3], [$PYTHON3
> /data/tcp_metadata_server.py \
> +                                                   --bind-host
> 192.168.1.3 \
> +                                                   --bind-port 10880
> \
> +                                                   --reply-string
> ls1p3 >/dev/null 2>&1], [nc3.pid])
> +
> +# Running tcpdump on chassis nodes to ensure there's no east-west
> traffic during balancing,
> +# which means balancing is happening directly on the node.
> +M_START_TCPDUMP([ovn-chassis-1], [-c 2 -neei genev_sys_6081 port
> 10880], [ch1_genev])
> +M_START_TCPDUMP([ovn-chassis-1], [-c 2 -neei eth2 port 10880],
> [ch1_eth2])
> +M_START_TCPDUMP([ovn-chassis-2], [-c 2 -neei genev_sys_6081 port
> 10880], [ch2_genev])
> +M_START_TCPDUMP([ovn-chassis-2], [-c 2 -neei eth2 port 10880],
> [ch2_eth2])
> +
> +M_FAKE_NODE_CHECK_EXEC([ovn-gw-1], [/bin/bash -c 'for i in $(seq
> 500); \
> +                          do curl --http0.9
> http://1.1.1.1:80 2>/dev/null ; \
> +                          echo ; done | sort | uniq -c ' > reply],
> [0], [])
> +
> +# Check that requests are distributed among all backends.
> +AT_CHECK([grep -q ls1p1 reply && grep -q ls1p2 reply && grep -q
> ls1p3 reply], [0], [])
> +
> +# Check east-west traffic is clean
> +AT_CHECK([cat ch1_genev.tcpdump], [0], [dnl
> +])
> +AT_CHECK([cat ch1_eth2.tcpdump], [0], [dnl
> +])
> +AT_CHECK([cat ch2_genev.tcpdump], [0], [dnl
> +])
> +AT_CHECK([cat ch2_eth2.tcpdump], [0], [dnl
> +])
> +
> +AT_CLEANUP
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 25655c456..0d5af3568 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -18289,6 +18289,9 @@ ovn_start
>  # ip_port_mappings syntax: ip:lport_name:src_ip:<az_name>(for remote
> lports)
>  
>  check ovn-nbctl ls-add ls1
> +check ovn-nbctl lr-add lr1
> +
> +ovn-appctl -t ovn-northd vlog/disable-rate-limit
>  
>  check ovn-nbctl lb-add lb1_ipv4 1.1.1.1:80
> 192.168.0.1:10880,192.168.0.2:10880,192.168.0.3:10880
>  AT_CHECK([ovn-nbctl --wait=sb \
> @@ -18358,6 +18361,154 @@ check ovn-nbctl set load_balancer lb1_ipv4
> ip_port_mappings:192.168.0.1=lport1:1
>  check_row_count sb:Service_Monitor 0
>  
>  OVS_WAIT_UNTIL([grep "Empty AZ name specified" northd/ovn-
> northd.log])
> +
> +check ovn-nbctl lb-del lb1_ipv4
> +
> +# Check correct setup of distributed load balancers.
> +echo > northd/ovn-northd.log
> +check ovn-nbctl lb-add lb_distubuted 1.1.1.1:80
> 192.168.0.1:10880,192.168.0.2:10880
> +check ovn-nbctl lr-lb-add lr1 lb_distubuted
> +check ovn-nbctl set load_balancer lb_distubuted
> options:distributed=true
> +
> +# Check that the load balancer does not work in distributed mode
> since ip_port_mappings is not set
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80), action=(drop;)
> +])
> +
> +# Check that the load balancer has only one backend available since
> only one backend has ip_port_mappings set
> +check ovn-nbctl set load_balancer lb_distubuted
> ip_port_mappings:192.168.0.1=lport1
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":192.168.0.1:10880);)
> +])
> +
> +check ovn-nbctl set load_balancer lb_distubuted
> ip_port_mappings:192.168.0.2=lport2
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":192.168.0.1:10880,"lport2"
> :192.168.0.2:10880);)
> +])
> +
> +# Check that if a health check is configured, a source IP must be
> provided in ip_port_mappings.
> +AT_CHECK([ovn-nbctl --wait=sb \
> +          -- --id=@hc create Load_Balancer_Health_Check
> vip="1.1.1.1\:80" \
> +             options:failure_count=100 \
> +          -- add Load_Balancer lb_distubuted health_check @hc |
> uuidfilt], [0], [<0>
> +])
> +
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +OVS_WAIT_UNTIL([grep "Expected ':' separator for:" northd/ovn-
> northd.log])
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80), action=(drop;)
> +])
> +
> +check ovn-nbctl lb-del lb_distubuted
> +echo > northd/ovn-northd.log
> +
> +# ipv6 configuration
> +check ovn-nbctl lb-add lb1 [[2001::a]]:80
> [[2001::3]]:80,[[2002::3]]:80
> +
> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
> +Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" -- add
> Load_Balancer . \
> +health_check @hc
> +
> +check_row_count sb:Service_Monitor 0
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]\"
> +
> +check_row_count sb:Service_Monitor 1
> +ovn-sbctl list service_monitor
> +check_column "2001::3" sb:Service_Monitor ip logical_port=lport1
> +check_column 80 sb:Service_Monitor port logical_port=lport1
> +check_column tcp sb:Service_Monitor protocol logical_port=lport1
> +check_column "2001::2" sb:Service_Monitor src_ip logical_port=lport1
> +check_column false sb:Service_Monitor ic_learned logical_port=lport1
> +check_column false sb:Service_Monitor remote logical_port=lport1
> +check_column "" sb:Service_Monitor logical_input_port
> logical_port=lport1
> +
> +# Empty src_ip.
> +check ovn-nbctl clear load_balancer lb1 ip_port_mappings
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1:\"
> +OVS_WAIT_UNTIL([grep "Invalid svc mon src IP" northd/ovn-
> northd.log])
> +check_row_count sb:Service_Monitor 0
> +echo > northd/ovn-northd.log
> +
> +# Incorrect IP address.
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[invalid]]\"=\"lport1:\"
> +OVS_WAIT_UNTIL([grep "bad IP address" northd/ovn-northd.log])
> +echo > northd/ovn-northd.log
> +
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1:invalid\"
> +OVS_WAIT_UNTIL([grep "bad IP address" northd/ovn-northd.log])
> +echo > northd/ovn-northd.log
> +
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]:az_name\"
> +check_row_count sb:Service_Monitor 1
> +ovn-sbctl list service_monitor
> +check_column "2001::3" sb:Service_Monitor ip logical_port=lport1
> +check_column 80 sb:Service_Monitor port logical_port=lport1
> +check_column tcp sb:Service_Monitor protocol logical_port=lport1
> +check_column "2001::2" sb:Service_Monitor src_ip logical_port=lport1
> +check_column false sb:Service_Monitor ic_learned logical_port=lport1
> +check_column true sb:Service_Monitor remote logical_port=lport1
> +check_column "" sb:Service_Monitor logical_input_port
> logical_port=lport1
> +
> +uuid=$(ovn-sbctl -d bare --no-headings --columns _uuid find
> Service_Monitor logical_port=lport1)
> +
> +# Check az_name presence in options.
> +AT_CHECK([ovn-sbctl get Service_Monitor ${uuid} options:az-name],
> +[0], [az_name
> +])
> +
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]:\"
> +check_row_count sb:Service_Monitor 0
> +OVS_WAIT_UNTIL([grep "Empty AZ name specified" northd/ovn-
> northd.log])
> +
> +echo > northd/ovn-northd.log
> +check ovn-nbctl lb-del lb1
> +
> +# Check correct setup of distributed load balancers.
> +check ovn-nbctl lb-add lb_distubuted [[2001::a]]:80
> [[2001::3]]:80,[[2002::3]]:80
> +check ovn-nbctl lr-lb-add lr1 lb_distubuted
> +check ovn-nbctl set load_balancer lb_distubuted
> options:distributed=true
> +
> +# Check that the load balancer does not work in distributed mode
> since ip_port_mappings is not set
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80), action=(drop;)
> +])
> +
> +echo > northd/ovn-northd.log
> +check ovn-nbctl set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1\"
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80);)
> +])
> +
> +echo > northd/ovn-northd.log
> +check ovn-nbctl set load_balancer .
> ip_port_mappings:\"[[2002::3]]\"=\"lport2\"
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80,"lport2":[[
> 2002::3]]:80);)
> +])
> +
> +echo > northd/ovn-northd.log
> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
> +Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" -- add
> Load_Balancer . \
> +health_check @hc
> +OVS_WAIT_UNTIL([grep "Expected ':' separator for:" northd/ovn-
> northd.log])
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80), action=(drop;)
> +])
> +
> +check ovn-nbctl --wait=sb set load_balancer .
> ip_port_mappings:\"[[2001::3]]\"=\"lport1:[[2001::2]]\"
> +ovn-sbctl lflow-list lr1 > lr1_lflow
> +AT_CHECK([cat lr1_lflow | grep lr_in_dnat | grep priority=120 |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2001::a && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":[[2001::3]]:80);)
> +])
> +
> +check ovn-nbctl lb-del lb_distubuted
> +
>  OVN_CLEANUP_NORTHD
>  AT_CLEANUP
>  ])
> @@ -19251,3 +19402,216 @@ AT_CHECK([grep "lr_in_policy[[^_]]"
> lr0flows | ovn_strip_lflows | sort], [0], [d
>  OVN_CLEANUP_NORTHD
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([Distributed lb: logical-flow test - IPv4/IPv6 case])
> +ovn_start
> +
> +check ovn-nbctl ls-add outside
> +
> +check ovn-nbctl lsp-add outside outside \
> +      -- lsp-set-addresses outside unknown \
> +      -- lsp-set-type outside localnet
> +
> +check ovn-nbctl --wait=sb set Logical_Switch_Port outside
> tag_request=2
> +
> +check ovn-nbctl lsp-add outside outside-down \
> +      -- lsp-set-type outside-down router \
> +      -- lsp-set-addresses outside-down router \
> +      -- lsp-set-options outside-down router-port=lr1-up
> +
> +check ovn-nbctl lr-add lr1 \
> +      -- lrp-add lr1 lr1-up 11:11:11:11:11:11 169.254.0.1/24
> 2001:db8:abcd:0002::bad/64 \
> +      -- lrp-add lr1 lr1-down 12:12:12:12:12:12 192.168.0.1/24
> 2001:db8:abcd:0001::c0fe/64
> +
> +check ovn-nbctl ls-add ls1 \
> +      -- lsp-add ls1 lport1 \
> +      -- lsp-set-addresses lport1 "13:13:13:13:13:13 192.168.0.101"
> \
> +      -- lsp-add ls1 lport2 \
> +      -- lsp-set-addresses lport2 "14:14:14:14:14:14 192.168.0.102"
> +
> +check ovn-nbctl lsp-add ls1 ls1-up \
> +      -- lsp-set-type ls1-up router \
> +      -- lsp-set-addresses ls1-up router \
> +      -- lsp-set-options ls1-up router-port=lr1-down
> +
> +check ovn-nbctl --wait=sb sync
> +
> +check ovn-nbctl ha-chassis-group-add gateway
> +check ovn-nbctl ha-chassis-group-add-chassis gateway hv1 1
> +ha_g_uuid=$(fetch_column nb:HA_Chassis_Group _uuid name=gateway)
> +lr1_up_uuid=$(fetch_column nb:Logical_Router_Port _uuid name=lr1-up)
> +check ovn-nbctl set logical_router_port $lr1_up_uuid
> ha_chassis_group=$ha_g_uuid
> +
> +check ovn-nbctl --wait=sb sync
> +
> +check ovn-nbctl lb-add lb1_ipv4 1.1.1.1:80
> 192.168.0.101:10880,192.168.0.102:10880
> +check ovn-nbctl set Load_Balancer lb1_ipv4
> ip_port_mappings:192.168.0.101=lport1:192.168.0.199
> +check ovn-nbctl set Load_Balancer lb1_ipv4
> ip_port_mappings:192.168.0.102=lport2:192.168.0.199
> +check ovn-nbctl lr-lb-add lr1 lb1_ipv4
> +check ovn-nbctl --wait=sb sync
> +
> +check ovn-nbctl lb-add lb1_ipv6 [[2000::1]]:80 
> [[2001:db8:abcd:1::2]]:10882
> +check ovn-nbctl set Load_Balancer lb1_ipv6
> ip_port_mappings:\"[[2001:db8:abcd:1::2]]\"=\"lport1\"
> +check ovn-nbctl lr-lb-add lr1 lb1_ipv6
> +check ovn-nbctl --wait=sb sync
> +
> +ovn-sbctl lflow-list lr1 > lr1_lflows_before
> +ovn-sbctl lflow-list outside > outside_lflows_before
> +
> +AT_CHECK([cat outside_lflows_before | grep ls_in_l2_lkup | grep
> priority=50 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_l2_lkup      ), priority=50   , match=(eth.dst ==
> 11:11:11:11:11:11 && is_chassis_resident("cr-lr1-up")),
> action=(outport = "outside-down"; output;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_before | grep lr_in_ip_input | grep
> priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport ==
> "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa ==
> 169.254.0.0/24 && is_chassis_resident("cr-lr1-up")), action=(eth.dst
> = eth.src; eth.src = xreg0[[0..47]]; arp.op = 2; /* ARP reply */
> arp.tha = arp.sha; arp.sha = xreg0[[0..47]]; arp.tpa <-> arp.spa;
> outport = inport; flags.loopback = 1; output;)
> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip4.dst ==
> 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst
> <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next;
> )
> +])
> +
> +AT_CHECK([cat lr1_lflows_before | grep lr_in_ip_input | grep
> priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0],
> [dnl
> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport ==
> "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} &&
> nd_ns && nd.target == 2001:db8:abcd:2::bad &&
> is_chassis_resident("cr-lr1-up")), action=(nd_na_router { eth.src =
> xreg0[[0..47]]; ip6.src = nd.target; nd.tll = xreg0[[0..47]]; outport
> = inport; flags.loopback = 1; output; };)
> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip6.dst ==
> {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type ==
> 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255;
> icmp6.type = 129; flags.loopback = 1; next; )
> +])
> +
> +AT_CHECK([cat lr1_lflows_before | grep lr_in_admission | grep 
> priority=50 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst ==
> 11:11:11:11:11:11 && inport == "lr1-up" && is_chassis_resident("cr-
> lr1-up")), action=(xreg0[[0..47]] = 11:11:11:11:11:11; next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst ==
> 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] =
> 12:12:12:12:12:12; next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast &&
> inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12;
> next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast &&
> inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11;
> next;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_before | grep lr_out_undnat | grep
> priority=120 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_out_undnat      ), priority=120  , match=(ip4 &&
> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src ==
> 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport
> == "lr1-up") && is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
> +  table=??(lr_out_undnat      ), priority=120  , match=(ip6 &&
> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport ==
> "lr1-up" || outport == "lr1-up") &&
> is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_before | grep lr_in_gw_redirect |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_gw_redirect  ), priority=0    , match=(1),
> action=(next;)
> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 &&
> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src ==
> 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"),
> action=(outport = "cr-lr1-up"; next;)
> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 &&
> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport ==
> "lr1-up"), action=(outport = "cr-lr1-up"; next;)
> +  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport ==
> "lr1-up"), action=(outport = "cr-lr1-up"; next;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_before | grep lr_in_dnat | grep
> priority=120 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")),
> action=(ct_lb_mark(backends=192.168.0.101:10880,192.168.0.102:10880);
> )
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")),
> action=(ct_lb_mark(backends=[[2001:db8:abcd:1::2]]:10882);)
> +])
> +
> +AT_CHECK([cat outside_lflows_before | grep ls_in_check_port_sec |
> grep priority=75 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_check_port_sec), priority=75   , match=((arp.op ==
> 1 || arp.op == 2) && inport == "outside"), action=(reg0[[22]] = 1;
> next;)
> +])
> +
> +AT_CHECK([cat outside_lflows_before | grep ls_in_apply_port_sec |
> grep priority=75 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]]
> == 1 && is_chassis_resident("cr-lr1-up")), action=(next;)
> +])
> +
> +check ovn-nbctl clear logical_router_port $lr1_up_uuid
> ha_chassis_group
> +check ovn-nbctl ha-chassis-group-del gateway
> +check ovn-nbctl ha-chassis-group-add gateway2
> +check ovn-nbctl ha-chassis-group-add-chassis gateway2 test 1
> +ha_g_uuid=$(fetch_column nb:HA_Chassis_Group _uuid name=gateway2)
> +lr1_up_uuid=$(fetch_column nb:Logical_Router_Port _uuid name=lr1-up)
> +check ovn-nbctl set logical_router_port $lr1_up_uuid
> ha_chassis_group=$ha_g_uuid
> +
> +check ovn-nbctl set load_balancer lb1_ipv4 options:distributed=true
> +check ovn-nbctl --wait=hv sync
> +
> +ovn-sbctl lflow-list outside > outside_lflows_after
> +ovn-sbctl lflow-list lr1 > lr1_lflows_after
> +
> +AT_CHECK([cat outside_lflows_after | grep ls_in_l2_lkup | grep
> priority=50 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_l2_lkup      ), priority=50   , match=(eth.dst ==
> 11:11:11:11:11:11), action=(outport = "outside-down"; output;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep
> priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport ==
> "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa ==
> 169.254.0.0/24), action=(eth.dst = eth.src; eth.src = xreg0[[0..47]];
> arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha =
> xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback
> = 1; output;)
> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip4.dst ==
> 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst
> <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next;
> )
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep
> priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0],
> [dnl
> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport ==
> "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} &&
> nd_ns && nd.target == 2001:db8:abcd:2::bad), action=(nd_na_router {
> eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll =
> xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip6.dst ==
> {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type ==
> 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255;
> icmp6.type = 129; flags.loopback = 1; next; )
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_admission | grep
> priority=50 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst ==
> 11:11:11:11:11:11 && inport == "lr1-up"), action=(xreg0[[0..47]] =
> 11:11:11:11:11:11; next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst ==
> 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] =
> 12:12:12:12:12:12; next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast &&
> inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12;
> next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast &&
> inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11;
> next;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_out_undnat | grep
> priority=120 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_out_undnat      ), priority=120  , match=(ip4 &&
> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src ==
> 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport
> == "lr1-up")), action=(ct_dnat;)
> +  table=??(lr_out_undnat      ), priority=120  , match=(ip6 &&
> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport ==
> "lr1-up" || outport == "lr1-up") &&
> is_chassis_resident("cr-lr1-up")), action=(ct_dnat;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_gw_redirect |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_gw_redirect  ), priority=0    , match=(1),
> action=(next;)
> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 &&
> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src ==
> 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"),
> action=(outport = "lr1-up"; next;)
> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 &&
> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport ==
> "lr1-up"), action=(outport = "cr-lr1-up"; next;)
> +  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport ==
> "lr1-up"), action=(outport = "cr-lr1-up"; next;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_dnat | grep priority=120
> | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":192.168.0.101:10880,
> "lport2":192.168.0.102:10880);)
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80 && is_chassis_resident("cr-lr1-up")),
> action=(ct_lb_mark(backends=[[2001:db8:abcd:1::2]]:10882);)
> +])
> +
> +AT_CHECK([cat outside_lflows_after | grep ls_in_check_port_sec |
> grep priority=75 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_check_port_sec), priority=75   , match=((arp.op ==
> 1 || arp.op == 2) && inport == "outside"), action=(reg0[[22]] = 1;
> next;)
> +])
> +
> +AT_CHECK([cat outside_lflows_after | grep ls_in_apply_port_sec |
> grep priority=75 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]]
> == 1), action=(next;)
> +])
> +
> +check ovn-nbctl set load_balancer lb1_ipv6 options:distributed=true
> +check ovn-nbctl --wait=hv sync
> +
> +ovn-sbctl lflow-list outside > outside_lflows_after
> +ovn-sbctl lflow-list lr1 > lr1_lflows_after
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_out_undnat | grep
> priority=120 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_out_undnat      ), priority=120  , match=(ip4 &&
> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src ==
> 192.168.0.102 && tcp.src == 10880)) && (inport == "lr1-up" || outport
> == "lr1-up")), action=(ct_dnat;)
> +  table=??(lr_out_undnat      ), priority=120  , match=(ip6 &&
> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && (inport ==
> "lr1-up" || outport == "lr1-up")), action=(ct_dnat;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_gw_redirect |
> ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_gw_redirect  ), priority=0    , match=(1),
> action=(next;)
> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 &&
> ((ip4.src == 192.168.0.101 && tcp.src == 10880) || (ip4.src ==
> 192.168.0.102 && tcp.src == 10880)) && outport == "lr1-up"),
> action=(outport = "lr1-up"; next;)
> +  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 &&
> ((ip6.src == 2001:db8:abcd:1::2 && tcp.src == 10882)) && outport ==
> "lr1-up"), action=(outport = "lr1-up"; next;)
> +  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport ==
> "lr1-up"), action=(outport = "cr-lr1-up"; next;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_dnat | grep priority=120
> | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip4 && ip4.dst == 1.1.1.1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":192.168.0.101:10880,
> "lport2":192.168.0.102:10880);)
> +  table=??(lr_in_dnat         ), priority=120  , match=(ct.new &&
> !ct.rel && ip6 && ip6.dst == 2000::1 && reg1[[16..23]] == 6 &&
> reg1[[0..15]] == 80),
> action=(ct_lb_mark_local(backends="lport1":[[2001:db8:abcd:1::2]]:10882);)
> +])
> +
> +check ovn-nbctl set load_balancer lb1_ipv6 options:distributed=false
> +check ovn-nbctl --wait=hv sync
> +
> +AT_CHECK([cat outside_lflows_after | grep ls_in_l2_lkup | grep
> priority=50 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_l2_lkup      ), priority=50   , match=(eth.dst ==
> 11:11:11:11:11:11), action=(outport = "outside-down"; output;)
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep
> priority=90 | grep 169.254.0.1 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport ==
> "lr1-up" && arp.op == 1 && arp.tpa == 169.254.0.1 && arp.spa ==
> 169.254.0.0/24), action=(eth.dst = eth.src; eth.src = xreg0[[0..47]];
> arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha =
> xreg0[[0..47]]; arp.tpa <-> arp.spa; outport = inport; flags.loopback
> = 1; output;)
> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip4.dst ==
> 169.254.0.1 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst
> <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next;
> )
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_ip_input | grep
> priority=90 | grep 2001:db8:abcd:2::bad | ovn_strip_lflows], [0],
> [dnl
> +  table=??(lr_in_ip_input     ), priority=90   , match=(inport ==
> "lr1-up" && ip6.dst == {2001:db8:abcd:2::bad, ff02::1:ff00:bad} &&
> nd_ns && nd.target == 2001:db8:abcd:2::bad), action=(nd_na_router {
> eth.src = xreg0[[0..47]]; ip6.src = nd.target; nd.tll =
> xreg0[[0..47]]; outport = inport; flags.loopback = 1; output; };)
> +  table=??(lr_in_ip_input     ), priority=90   , match=(ip6.dst ==
> {2001:db8:abcd:2::bad, fe80::1311:11ff:fe11:1111} && icmp6.type ==
> 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255;
> icmp6.type = 129; flags.loopback = 1; next; )
> +])
> +
> +AT_CHECK([cat lr1_lflows_after | grep lr_in_admission | grep
> priority=50 | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst ==
> 11:11:11:11:11:11 && inport == "lr1-up"), action=(xreg0[[0..47]] =
> 11:11:11:11:11:11; next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.dst ==
> 12:12:12:12:12:12 && inport == "lr1-down"), action=(xreg0[[0..47]] =
> 12:12:12:12:12:12; next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast &&
> inport == "lr1-down"), action=(xreg0[[0..47]] = 12:12:12:12:12:12;
> next;)
> +  table=??(lr_in_admission    ), priority=50   , match=(eth.mcast &&
> inport == "lr1-up"), action=(xreg0[[0..47]] = 11:11:11:11:11:11;
> next;)
> +])
> +
> +OVN_CLEANUP_NORTHD
> +AT_CLEANUP
> +])
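As a reader aid (not part of the patch): the NB configuration the test above
drives reduces to roughly the sequence below. The load balancer and port names
are the test's own fixtures, and the ip_port_mappings line is only a sketch of
requirement 1 from the cover letter — check ovn-nb(5) for the exact mapping
syntax before relying on it.

```shell
# Backend-to-logical-port mapping is a prerequisite for distributed
# mode (cover letter, requirement 1). The value format here is an
# assumption; consult ovn-nb(5) Load_Balancer for the real syntax.
ovn-nbctl set load_balancer lb1_ipv4 \
    ip_port_mappings:192.168.0.101=lport1 \
    ip_port_mappings:192.168.0.102=lport2

# Enable distributed mode: northd then replaces the chassis-gated
# ct_lb_mark(...) lr_in_dnat flows with ct_lb_mark_local(...) flows
# carrying per-backend logical port names, as the checks above verify.
ovn-nbctl set load_balancer lb1_ipv4 options:distributed=true
ovn-nbctl --wait=hv sync
```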