On Thu, Jul 20, 2023 at 5:02 PM Lorenzo Bianconi <
[email protected]> wrote:

> When using VLAN backed networks and OVN routers leveraging the
> 'ovn-chassis-mac-mappings' option for east-west traffic, the eth.src field
> is
> replaced by the chassis mac address in order to not expose the router mac
> address from different nodes and confuse the TOR switch. However doing so
> the TOR switch is not able to learn the port/mac bindings for routed E/W
> traffic and it is force to always flood it. Fix this issue adding the
> capability to configure a given timeout for garp sent by ovn-controller
> and not disable it after the exponential backoff in order to keep
> refreshing the entries in TOR swtich fdb table.
> More into about the issue can be found here [0].
>
> [0]
> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-September/050678.html
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087779
>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
>

Hi Lorenzo,

I have a couple of comments below.

---
> Changes since v1:
> - add uni-test
> - add documentation
> ---
>  controller/ovn-controller.8.xml | 12 ++++++++
>  controller/ovn-controller.c     |  5 +++-
>  controller/pinctrl.c            | 52 ++++++++++++++++++++++++++++-----
>  controller/pinctrl.h            |  4 ++-
>  tests/ovn.at                    | 17 +++++++++++
>  5 files changed, 81 insertions(+), 9 deletions(-)
>
> diff --git a/controller/ovn-controller.8.xml
> b/controller/ovn-controller.8.xml
> index 0883d8da9..f2893f7ee 100644
> --- a/controller/ovn-controller.8.xml
> +++ b/controller/ovn-controller.8.xml
> @@ -365,6 +365,18 @@
>          heplful to pin source outer IP for the tunnel when multiple
> interfaces
>          are used on the host for overlay traffic.
>        </dd>
> +      <dt><code>external_ids:garp-max-timeout</code></dt>
>

The name is very confusing, it doesn't limit the value as specified, but
instead it's just timeout. I would suggest dropping the "max" from the name
or treat it as "max" and let the backoff do the work, capping it at the
specified value.


> +      <dd>
> +        When used, this configuration value specifies the maximum timeout
> +        (in seconds) between two consecutive GARP packets sent by
> +        <code>ovn-controller</code>.
> +        <code>ovn-controller</code> by default sends just 4 GARP packets
> +        with an exponential backoff timeout.
> +        Setting <code>external_ids:garp-max-timeout</code> allows to
> +        continue sending GARPs with a given timeout.
> +        Setting <code>external_ids:garp-max-timeout</code> to 0 will
> +        reset the default behaviour.
> +      </dd>
>      </dl>
>
>      <p>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 236974f4f..b6d044bd0 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1040,6 +1040,7 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
>       * calls are after the "non-track" calls. */
>      ovsdb_idl_add_table(ovs_idl, &ovsrec_table_open_vswitch);
>      ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_other_config);
> +    ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_external_ids);
>

Why do we need this condition? We have other config values in this table
that are acted upon during the ovn-controller.


>      ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_bridges);
>      ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_datapaths);
>      ovsdb_idl_add_table(ovs_idl, &ovsrec_table_interface);
> @@ -5363,7 +5364,9 @@ main(int argc, char *argv[])
>                                      &runtime_data->local_datapaths,
>                                      &runtime_data->active_tunnels,
>
>  &runtime_data->local_active_ports_ipv6_pd,
> -
> &runtime_data->local_active_ports_ras);
> +                                    &runtime_data->local_active_ports_ras,
> +                                    ovsrec_open_vswitch_table_get(
> +                                            ovs_idl_loop.idl));
>                          stopwatch_stop(PINCTRL_RUN_STOPWATCH_NAME,
>                                         time_msec());
>                          mirror_run(ovs_idl_txn,
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index bed90fe0b..9e724d259 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -165,6 +165,7 @@ VLOG_DEFINE_THIS_MODULE(pinctrl);
>  static struct ovs_mutex pinctrl_mutex = OVS_MUTEX_INITIALIZER;
>  static struct seq *pinctrl_handler_seq;
>  static struct seq *pinctrl_main_seq;
> +static long long int garp_rarp_max_timeout = LLONG_MAX;
>
>  static void *pinctrl_handler(void *arg);
>
> @@ -227,7 +228,8 @@ static void send_garp_rarp_prepare(
>      const struct ovsrec_bridge *,
>      const struct sbrec_chassis *,
>      const struct hmap *local_datapaths,
> -    const struct sset *active_tunnels)
> +    const struct sset *active_tunnels,
> +    const struct ovsrec_open_vswitch_table *ovs_table)
>      OVS_REQUIRES(pinctrl_mutex);
>  static void send_garp_rarp_run(struct rconn *swconn,
>                                 long long int *send_garp_rarp_time)
> @@ -3492,7 +3494,8 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>              const struct hmap *local_datapaths,
>              const struct sset *active_tunnels,
>              const struct shash *local_active_ports_ipv6_pd,
> -            const struct shash *local_active_ports_ras)
> +            const struct shash *local_active_ports_ras,
> +            const struct ovsrec_open_vswitch_table *ovs_table)
>  {
>      ovs_mutex_lock(&pinctrl_mutex);
>      run_put_mac_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key,
> @@ -3503,7 +3506,7 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      send_garp_rarp_prepare(ovnsb_idl_txn, sbrec_port_binding_by_datapath,
>                             sbrec_port_binding_by_name,
>                             sbrec_mac_binding_by_lport_ip, br_int, chassis,
> -                           local_datapaths, active_tunnels);
> +                           local_datapaths, active_tunnels, ovs_table);
>      prepare_ipv6_ras(local_active_ports_ras, sbrec_port_binding_by_name);
>      prepare_ipv6_prefixd(ovnsb_idl_txn, sbrec_port_binding_by_name,
>                           local_active_ports_ipv6_pd, chassis,
> @@ -4424,7 +4427,8 @@ send_garp_rarp_update(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>                        struct ovsdb_idl_index
> *sbrec_mac_binding_by_lport_ip,
>                        const struct hmap *local_datapaths,
>                        const struct sbrec_port_binding *binding_rec,
> -                      struct shash *nat_addresses)
> +                      struct shash *nat_addresses,
> +                      long long int garp_max_timeout)
>  {
>      volatile struct garp_rarp_data *garp_rarp = NULL;
>
> @@ -4450,6 +4454,11 @@ send_garp_rarp_update(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>                  if (garp_rarp) {
>                      garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
>                      garp_rarp->port_key = binding_rec->tunnel_key;
> +                    if (garp_max_timeout != garp_rarp_max_timeout) {
> +                        /* reset backoff */
> +                        garp_rarp->announce_time = time_msec() + 1000;
> +                        garp_rarp->backoff = 1;
> +                    }
>                  } else {
>                      add_garp_rarp(name, laddrs->ea,
>                                    laddrs->ipv4_addrs[i].addr,
> @@ -4474,6 +4483,11 @@ send_garp_rarp_update(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>                      if (garp_rarp) {
>                          garp_rarp->dp_key =
> binding_rec->datapath->tunnel_key;
>                          garp_rarp->port_key = binding_rec->tunnel_key;
> +                        if (garp_max_timeout != garp_rarp_max_timeout) {
> +                            /* reset backoff */
> +                            garp_rarp->announce_time = time_msec() + 1000;
> +                            garp_rarp->backoff = 1;
> +                        }
>                      } else {
>                          add_garp_rarp(name, laddrs->ea,
>                                        0,
> binding_rec->datapath->tunnel_key,
> @@ -4493,6 +4507,11 @@ send_garp_rarp_update(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>      if (garp_rarp) {
>          garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
>          garp_rarp->port_key = binding_rec->tunnel_key;
> +        if (garp_max_timeout != garp_rarp_max_timeout) {
> +            /* reset backoff */
> +            garp_rarp->announce_time = time_msec() + 1000;
> +            garp_rarp->backoff = 1;
> +        }
>          return;
>      }
>
> @@ -4582,6 +4601,8 @@ send_garp_rarp(struct rconn *swconn, struct
> garp_rarp_data *garp_rarp,
>      if (garp_rarp->backoff < 16) {
>          garp_rarp->backoff *= 2;
>          garp_rarp->announce_time = current_time + garp_rarp->backoff *
> 1000;
> +    } else if (garp_rarp_max_timeout != LLONG_MAX) {
> +        garp_rarp->announce_time = current_time + garp_rarp_max_timeout;
>      } else {
>          garp_rarp->announce_time = LLONG_MAX;
>      }
> @@ -5881,13 +5902,26 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>                         const struct ovsrec_bridge *br_int,
>                         const struct sbrec_chassis *chassis,
>                         const struct hmap *local_datapaths,
> -                       const struct sset *active_tunnels)
> +                       const struct sset *active_tunnels,
> +                       const struct ovsrec_open_vswitch_table *ovs_table)
>      OVS_REQUIRES(pinctrl_mutex)
>  {
>      struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
>      struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
>      struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
>      struct shash nat_addresses;
> +    unsigned long long garp_max_timeout = LLONG_MAX;
> +    const struct ovsrec_open_vswitch *cfg =
> +        ovsrec_open_vswitch_table_first(ovs_table);
> +    if (cfg) {
> +        garp_max_timeout = smap_get_ullong(
> +                &cfg->external_ids, "garp-max-timeout", LLONG_MAX);
>




> +        if (!garp_max_timeout) {
> +            garp_max_timeout = LLONG_MAX;
> +        } else if (garp_max_timeout != LLONG_MAX) {
> +            garp_max_timeout *= 1000; /* sec */
> +        }
> +    }
>
>      shash_init(&nat_addresses);
>
> @@ -5918,7 +5952,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>          if (pb) {
>              send_garp_rarp_update(ovnsb_idl_txn,
>                                    sbrec_mac_binding_by_lport_ip,
> -                                  local_datapaths, pb, &nat_addresses);
> +                                  local_datapaths, pb, &nat_addresses,
> +                                  garp_max_timeout);
>          }
>      }
>
> @@ -5929,7 +5964,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>              = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
>          if (pb) {
>              send_garp_rarp_update(ovnsb_idl_txn,
> sbrec_mac_binding_by_lport_ip,
> -                                  local_datapaths, pb, &nat_addresses);
> +                                  local_datapaths, pb, &nat_addresses,
> +                                  garp_max_timeout);
>          }
>      }
>
> @@ -5947,6 +5983,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>      shash_destroy(&nat_addresses);
>
>      sset_destroy(&nat_ip_keys);
> +
> +    garp_rarp_max_timeout = garp_max_timeout;
>  }
>
>  static bool
> diff --git a/controller/pinctrl.h b/controller/pinctrl.h
> index 279a49fbc..23343f097 100644
> --- a/controller/pinctrl.h
> +++ b/controller/pinctrl.h
> @@ -30,6 +30,7 @@ struct ovsdb_idl;
>  struct ovsdb_idl_index;
>  struct ovsdb_idl_txn;
>  struct ovsrec_bridge;
> +struct ovsrec_open_vswitch_table;
>  struct sbrec_chassis;
>  struct sbrec_dns_table;
>  struct sbrec_controller_event_table;
> @@ -57,7 +58,8 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                   const struct hmap *local_datapaths,
>                   const struct sset *active_tunnels,
>                   const struct shash *local_active_ports_ipv6_pd,
> -                 const struct shash *local_active_ports_ras);
> +                 const struct shash *local_active_ports_ras,
> +                 const struct ovsrec_open_vswitch_table *ovs_table);
>  void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
>  void pinctrl_destroy(void);
>  void pinctrl_set_br_int_name(const char *br_int_name);
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 882a548db..3124cef3b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -9094,6 +9094,7 @@ AT_CLEANUP
>
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([send gratuitous arp for l3gateway only on selected chassis])
> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>  ovn_start
>
>  # Create logical switch
> @@ -9183,6 +9184,22 @@ sleep 2
>  OVN_CHECK_PACKETS_CONTAIN([hv2/snoopvif-tx.pcap], [arp_expected])
>  OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [empty_expected])
>
> +# Temporarily remove lr0 chassis
> +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
>

We should sync on hv so ovn-controller can catch that.


> +
> +as hv1 reset_pcap_file snoopvif hv1/snoopvif
> +as hv2 reset_pcap_file snoopvif hv2/snoopvif
> +
> +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
>

Same here.


> +# set garp max timeout to 32s
> +AT_CHECK([as hv1 ovs-vsctl set Open_vSwitch .
> external-ids:garp-max-timeout=32])
>

Is there any reason why we don't test 1 sec value? We wouldn't have to wait
for more than 180 s for this test to finish, but just ~6 sec instead which
is still quite long, but more reasonable.


> +
> +OVS_CTL_TIMEOUT=180
> +OVS_WAIT_UNTIL([
> +n_arp=$(tcpdump -c 6 -ner hv1/snoopvif-tx.pcap arp | wc -l)
> +test "$n_arp" = 6
> +])
> +
>  OVN_CLEANUP([hv1],[hv2])
>
>  AT_CLEANUP
> --
> 2.41.0
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Thanks,
Ales
-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to