On Thu, Jul 20, 2023 at 5:02 PM Lorenzo Bianconi <[email protected]> wrote:

> When using VLAN backed networks and OVN routers leveraging the
> 'ovn-chassis-mac-mappings' option for east-west traffic, the eth.src field
> is replaced by the chassis mac address in order to not expose the router
> mac address from different nodes and confuse the TOR switch. However doing
> so the TOR switch is not able to learn the port/mac bindings for routed E/W
> traffic and it is forced to always flood it. Fix this issue adding the
> capability to configure a given timeout for garp sent by ovn-controller
> and not disable it after the exponential backoff in order to keep
> refreshing the entries in TOR switch fdb table.
> More info about the issue can be found here [0].
>
> [0] https://mail.openvswitch.org/pipermail/ovs-discuss/2020-September/050678.html
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087779
>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
>

Hi Lorenzo,

I have a couple of comments below.

> ---
> Changes since v1:
> - add unit-test
> - add documentation
> ---
>  controller/ovn-controller.8.xml | 12 ++++++++
>  controller/ovn-controller.c     |  5 +++-
>  controller/pinctrl.c            | 52 ++++++++++++++++++++++++++++-----
>  controller/pinctrl.h            |  4 ++-
>  tests/ovn.at                    | 17 +++++++++++
>  5 files changed, 81 insertions(+), 9 deletions(-)
>
> diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
> index 0883d8da9..f2893f7ee 100644
> --- a/controller/ovn-controller.8.xml
> +++ b/controller/ovn-controller.8.xml
> @@ -365,6 +365,18 @@
>          heplful to pin source outer IP for the tunnel when multiple interfaces
>          are used on the host for overlay traffic.
>        </dd>
> +      <dt><code>external_ids:garp-max-timeout</code></dt>
>

The name is confusing: the option does not cap the timeout as the name
suggests, it simply is the timeout. I would suggest either dropping the
"max" from the name, or treating it as a real maximum and letting the
backoff do the work, capped at the specified value.

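To illustrate the second option, here is a minimal sketch of a capped
backoff. It is a standalone illustration, not code from the patch:
garp_next_delay() and its parameters are hypothetical names, delays are in
seconds, and a cap of 0 is assumed to mean "option not set", which keeps
the current behaviour of stopping once the backoff reaches 16.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of the patch.
 *
 * '*backoff' is the previous delay in seconds (it starts at 1, like the
 * "reset backoff" code in the patch).  'max_timeout' is the configured cap
 * in seconds; 0 is assumed to mean "not configured".
 *
 * Returns the delay in seconds before the next GARP, or LLONG_MAX when no
 * further GARP should be scheduled. */
static long long int
garp_next_delay(long long int *backoff, long long int max_timeout)
{
    assert(*backoff >= 1 && max_timeout >= 0);

    if (max_timeout > 0) {
        /* Cap configured: double the delay until it reaches the cap, then
         * keep announcing at the cap.  Comparing against max_timeout / 2
         * avoids overflowing the multiplication. */
        if (*backoff > max_timeout / 2) {
            *backoff = max_timeout;
        } else {
            *backoff *= 2;
        }
        return *backoff;
    }

    /* No cap: keep the existing behaviour, i.e. stop once the backoff has
     * reached 16 seconds. */
    if (*backoff < 16) {
        *backoff *= 2;
        return *backoff;
    }
    return LLONG_MAX;
}
```

With a cap of 32 the delays would be 2, 4, 8, 16, 32, 32, ... seconds, so
the option name would match what it does and the first announcements would
still go out quickly.
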
> +      <dd>
> +        When used, this configuration value specifies the maximum timeout
> +        (in seconds) between two consecutive GARP packets sent by
> +        <code>ovn-controller</code>.
> +        <code>ovn-controller</code> by default sends just 4 GARP packets
> +        with an exponential backoff timeout.
> +        Setting <code>external_ids:garp-max-timeout</code> allows to
> +        continue sending GARPs with a given timeout.
> +        Setting <code>external_ids:garp-max-timeout</code> to 0 will
> +        reset the default behaviour.
> +      </dd>
>      </dl>
>
>      <p>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 236974f4f..b6d044bd0 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1040,6 +1040,7 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
>       * calls are after the "non-track" calls. */
>      ovsdb_idl_add_table(ovs_idl, &ovsrec_table_open_vswitch);
>      ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_other_config);
> +    ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_external_ids);
>

Why do we need this condition? We have other config values in this table
that are acted upon during the ovn-controller.

>      ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_bridges);
>      ovsdb_idl_add_column(ovs_idl, &ovsrec_open_vswitch_col_datapaths);
>      ovsdb_idl_add_table(ovs_idl, &ovsrec_table_interface);
> @@ -5363,7 +5364,9 @@ main(int argc, char *argv[])
>                                      &runtime_data->local_datapaths,
>                                      &runtime_data->active_tunnels,
>                                      &runtime_data->local_active_ports_ipv6_pd,
> -                                    &runtime_data->local_active_ports_ras);
> +                                    &runtime_data->local_active_ports_ras,
> +                                    ovsrec_open_vswitch_table_get(
> +                                        ovs_idl_loop.idl));
>                          stopwatch_stop(PINCTRL_RUN_STOPWATCH_NAME,
>                                         time_msec());
>                          mirror_run(ovs_idl_txn,
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index bed90fe0b..9e724d259 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -165,6 +165,7 @@ VLOG_DEFINE_THIS_MODULE(pinctrl);
>  static struct ovs_mutex pinctrl_mutex = OVS_MUTEX_INITIALIZER;
>  static struct seq *pinctrl_handler_seq;
>  static struct seq *pinctrl_main_seq;
> +static long long int garp_rarp_max_timeout = LLONG_MAX;
>
>  static void *pinctrl_handler(void *arg);
>
> @@ -227,7 +228,8 @@ static void send_garp_rarp_prepare(
>      const struct ovsrec_bridge *,
>      const struct sbrec_chassis *,
>      const struct hmap *local_datapaths,
> -    const struct sset *active_tunnels)
> +    const struct sset *active_tunnels,
> +    const struct ovsrec_open_vswitch_table *ovs_table)
>      OVS_REQUIRES(pinctrl_mutex);
>  static void send_garp_rarp_run(struct rconn *swconn,
>                                 long long int *send_garp_rarp_time)
> @@ -3492,7 +3494,8 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>              const struct hmap *local_datapaths,
>              const struct sset *active_tunnels,
>              const struct shash *local_active_ports_ipv6_pd,
> -            const struct shash *local_active_ports_ras)
> +            const struct shash *local_active_ports_ras,
> +            const struct ovsrec_open_vswitch_table *ovs_table)
>  {
>      ovs_mutex_lock(&pinctrl_mutex);
>      run_put_mac_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key,
> @@ -3503,7 +3506,7 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      send_garp_rarp_prepare(ovnsb_idl_txn, sbrec_port_binding_by_datapath,
>                             sbrec_port_binding_by_name,
>                             sbrec_mac_binding_by_lport_ip, br_int, chassis,
> -                           local_datapaths, active_tunnels);
> +                           local_datapaths, active_tunnels, ovs_table);
>      prepare_ipv6_ras(local_active_ports_ras, sbrec_port_binding_by_name);
>      prepare_ipv6_prefixd(ovnsb_idl_txn, sbrec_port_binding_by_name,
>                           local_active_ports_ipv6_pd, chassis,
> @@ -4424,7 +4427,8 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                        struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip,
>                        const struct hmap *local_datapaths,
>                        const struct sbrec_port_binding *binding_rec,
> -                      struct shash *nat_addresses)
> +                      struct shash *nat_addresses,
> +                      long long int garp_max_timeout)
>  {
>      volatile struct garp_rarp_data *garp_rarp = NULL;
>
> @@ -4450,6 +4454,11 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                  if (garp_rarp) {
>                      garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
>                      garp_rarp->port_key = binding_rec->tunnel_key;
> +                    if (garp_max_timeout != garp_rarp_max_timeout) {
> +                        /* reset backoff */
> +                        garp_rarp->announce_time = time_msec() + 1000;
> +                        garp_rarp->backoff = 1;
> +                    }
>                  } else {
>                      add_garp_rarp(name, laddrs->ea,
>                                    laddrs->ipv4_addrs[i].addr,
> @@ -4474,6 +4483,11 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                  if (garp_rarp) {
>                      garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
>                      garp_rarp->port_key = binding_rec->tunnel_key;
> +                    if (garp_max_timeout != garp_rarp_max_timeout) {
> +                        /* reset backoff */
> +                        garp_rarp->announce_time = time_msec() + 1000;
> +                        garp_rarp->backoff = 1;
> +                    }
>                  } else {
>                      add_garp_rarp(name, laddrs->ea,
>                                    0, binding_rec->datapath->tunnel_key,
> @@ -4493,6 +4507,11 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
>          if (garp_rarp) {
>              garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
>              garp_rarp->port_key = binding_rec->tunnel_key;
> +            if (garp_max_timeout != garp_rarp_max_timeout) {
> +                /* reset backoff */
> +                garp_rarp->announce_time = time_msec() + 1000;
> +                garp_rarp->backoff = 1;
> +            }
>              return;
>          }
>
> @@ -4582,6 +4601,8 @@ send_garp_rarp(struct rconn *swconn, struct garp_rarp_data *garp_rarp,
>      if (garp_rarp->backoff < 16) {
>          garp_rarp->backoff *= 2;
>          garp_rarp->announce_time = current_time + garp_rarp->backoff * 1000;
> +    } else if (garp_rarp_max_timeout != LLONG_MAX) {
> +        garp_rarp->announce_time = current_time + garp_rarp_max_timeout;
>      } else {
>          garp_rarp->announce_time = LLONG_MAX;
>      }
> @@ -5881,13 +5902,26 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                         const struct ovsrec_bridge *br_int,
>                         const struct sbrec_chassis *chassis,
>                         const struct hmap *local_datapaths,
> -                       const struct sset *active_tunnels)
> +                       const struct sset *active_tunnels,
> +                       const struct ovsrec_open_vswitch_table *ovs_table)
>      OVS_REQUIRES(pinctrl_mutex)
>  {
>      struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
>      struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
>      struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
>      struct shash nat_addresses;
> +    unsigned long long garp_max_timeout = LLONG_MAX;
> +    const struct ovsrec_open_vswitch *cfg =
> +        ovsrec_open_vswitch_table_first(ovs_table);
> +    if (cfg) {
> +        garp_max_timeout = smap_get_ullong(
> +                &cfg->external_ids, "garp-max-timeout", LLONG_MAX);
>
> +        if (!garp_max_timeout) {
> +            garp_max_timeout = LLONG_MAX;
> +        } else if (garp_max_timeout != LLONG_MAX) {
> +            garp_max_timeout *= 1000; /* sec */
> +        }
> +    }
>
>      shash_init(&nat_addresses);
>
> @@ -5918,7 +5952,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn,
>          if (pb) {
>              send_garp_rarp_update(ovnsb_idl_txn,
>                                    sbrec_mac_binding_by_lport_ip,
> -                                  local_datapaths, pb, &nat_addresses);
> +                                  local_datapaths, pb, &nat_addresses,
> +                                  garp_max_timeout);
>          }
>      }
>
> @@ -5929,7 +5964,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn,
>              = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
>          if (pb) {
>              send_garp_rarp_update(ovnsb_idl_txn,
>                                    sbrec_mac_binding_by_lport_ip,
> -                                  local_datapaths, pb, &nat_addresses);
> +                                  local_datapaths, pb, &nat_addresses,
> +                                  garp_max_timeout);
>          }
>      }
>
> @@ -5947,6 +5983,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      shash_destroy(&nat_addresses);
>
>      sset_destroy(&nat_ip_keys);
> +
> +    garp_rarp_max_timeout = garp_max_timeout;
>  }
>
>  static bool
> diff --git a/controller/pinctrl.h b/controller/pinctrl.h
> index 279a49fbc..23343f097 100644
> --- a/controller/pinctrl.h
> +++ b/controller/pinctrl.h
> @@ -30,6 +30,7 @@ struct ovsdb_idl;
>  struct ovsdb_idl_index;
>  struct ovsdb_idl_txn;
>  struct ovsrec_bridge;
> +struct ovsrec_open_vswitch_table;
>  struct sbrec_chassis;
>  struct sbrec_dns_table;
>  struct sbrec_controller_event_table;
> @@ -57,7 +58,8 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                   const struct hmap *local_datapaths,
>                   const struct sset *active_tunnels,
>                   const struct shash *local_active_ports_ipv6_pd,
> -                 const struct shash *local_active_ports_ras);
> +                 const struct shash *local_active_ports_ras,
> +                 const struct ovsrec_open_vswitch_table *ovs_table);
>  void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
>  void pinctrl_destroy(void);
>  void pinctrl_set_br_int_name(const char *br_int_name);
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 882a548db..3124cef3b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -9094,6 +9094,7 @@ AT_CLEANUP
>
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([send gratuitous arp for l3gateway only on selected chassis])
> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>  ovn_start
>
>  # Create logical switch
> @@ -9183,6 +9184,22 @@ sleep 2
>  OVN_CHECK_PACKETS_CONTAIN([hv2/snoopvif-tx.pcap], [arp_expected])
>  OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [empty_expected])
>
> +# Temporarily remove lr0 chassis
> +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
>

We should sync on the hv here so that ovn-controller can catch that change.

> +
> +as hv1 reset_pcap_file snoopvif hv1/snoopvif
> +as hv2 reset_pcap_file snoopvif hv2/snoopvif
> +
> +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
>

Same here.

> +# set garp max timeout to 32s
> +AT_CHECK([as hv1 ovs-vsctl set Open_vSwitch . external-ids:garp-max-timeout=32])
>

Is there any reason why we don't test with a 1 s value? Then we wouldn't
have to wait up to 180 s for this test to finish, just ~6 s, which is
still fairly long but more reasonable.

> +
> +OVS_CTL_TIMEOUT=180
> +OVS_WAIT_UNTIL([
> +n_arp=$(tcpdump -c 6 -ner hv1/snoopvif-tx.pcap arp | wc -l)
> +test "$n_arp" = 6
> +])
> +
>  OVN_CLEANUP([hv1],[hv2])
>
>  AT_CLEANUP
> --
> 2.41.0
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>

Thanks,
Ales

--
Ales Musil
Senior Software Engineer - OVN Core
Red Hat EMEA <https://www.redhat.com>
[email protected]    IM: amusil
<https://red.ht/sig>
