Hi Xavier,

I looked only at the ovn-controller changes, to make sure I could
understand everything. I left some comments in the areas that confused
me. I'll take a deeper look at the rest of the changes once those
areas of confusion are cleared up.

On Tue, Mar 3, 2026 at 4:21 AM Xavier Simonart via dev
<[email protected]> wrote:
>
> If a server unexpectedly rebooted, OVS, when restarted, sets BFD
> UP on bfd-enabled geneve tunnels.
> However, if it takes time to restart OVN, an HA gw chassis
> would attract the traffic while being unable to handle it
> (as no flows), resulting in traffic loss.
>
> This is fixed by re-using ovs flow-restore-wait.
> If set, OVS waits (prevents upcalls, ignores bfd, ...) until reset.
> Once OVS receives the notification of flow-restore-wait being false,
> it restarts handling upcalls, bfd... and ignores any new change to
> flow-restore-wait.
>
> Hence OVN toggles flow-restore-wait: set it to false, waits for ack
> from OVS and then sets it back to true.
> If server reboots, OVS will see flow-restore-wait being true.
>
> "ovs-ctl restart" also uses flow-restore-wait.

This sentence is ambiguous. Does `ovs-ctl restart` set
flow-restore-wait to true or to false? I tried to answer this by
looking at ovs-ctl.in, but it's not explicitly stated in the script.

> So OVS will wait either "ovs-ctl restart" or OVN sets flow-restore-wait
> to false.
>
> Reported-at: https://issues.redhat.com/browse/FDP-3075
> Signed-off-by: Xavier Simonart <[email protected]>
> ---
>  controller/ovn-controller.c | 133 +++++++++-
>  tests/multinode-macros.at   |  22 ++
>  tests/multinode.at          | 504 ++++++++++++++++++++++++------------
>  3 files changed, 488 insertions(+), 171 deletions(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 4353f6094..c59c4d44d 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -211,6 +211,119 @@ static char *get_file_system_id(void)
>      free(filename);
>      return ret;
>  }
> +
> +/* Set/unset flow-restore-wait, and inc ovs next_cfg if false */
> +static void set_flow_restore_wait(struct ovsdb_idl_txn *ovs_idl_txn,
> +                                  const struct ovsrec_open_vswitch *cfg,
> +                                  const struct smap *other_config,
> +                                  const char *val)
> +{
> +    struct smap new_config;
> +    smap_clone(&new_config, other_config);
> +    smap_replace(&new_config, "flow-restore-wait", val);
> +    ovsrec_open_vswitch_set_other_config(cfg, &new_config);
> +    ovsdb_idl_txn_increment(ovs_idl_txn, &cfg->header_,
> +                            &ovsrec_open_vswitch_col_next_cfg, true);
> +    smap_destroy(&new_config);
> +}
> +
> +static void
> +manage_flow_restore_wait(struct ovsdb_idl_txn *ovs_idl_txn,
> +                         const struct ovsrec_open_vswitch *cfg,
> +                         uint64_t ofctrl_cur_cfg, uint64_t ovs_next_cfg,
> +                         int ovs_txn_status)
> +{
> +    enum flow_restore_wait_state {
> +        FRW_INIT,              /* Initial state */
> +        FRW_WAIT_TXN_COMPLETE, /* Sent false, waiting txn to complete */
> +        FRW_TXN_SUCCESS,       /* Txn completed. Waiting for OVS Ack. */
> +        FRW_DONE               /* Everything completed */
> +    };
> +
> +    static int64_t frw_next_cfg;
> +    static enum flow_restore_wait_state frw_state;
> +    static bool ofctrl_was_connected = false;
> +
> +    bool ofctrl_connected = ofctrl_is_connected();
> +
> +    if (!ovs_idl_txn || !cfg) {
> +        return;
> +    }
> +
> +    /* If OVS is stopped/started, make sure flow-restore-wait is toggled */
> +    if (ofctrl_connected && !ofctrl_was_connected) {
> +        frw_state = FRW_INIT;
> +    }
> +    ofctrl_was_connected = ofctrl_connected;
> +
> +    if (!ofctrl_connected) {
> +        return;
> +    }
> +
> +    bool frw = smap_get_bool(&cfg->other_config, "flow-restore-wait", false);
> +    switch (frw_state) {
> +    case FRW_INIT:
> +        if (ofctrl_cur_cfg > 0) {
> +            set_flow_restore_wait(ovs_idl_txn, cfg, &cfg->other_config,
> +                                  "false");
> +            frw_state = FRW_WAIT_TXN_COMPLETE;
> +            VLOG_INFO("Setting flow-restore-wait=false "
> +                      "(cur_cfg=%"PRIu64")", ofctrl_cur_cfg);
> +        }
> +        break;
> +
> +    case FRW_WAIT_TXN_COMPLETE:
> +        /* ovs_idl_txn != NULL => transaction completed */

This may seem like a nitpick, but this comment confused me at first:
manage_flow_restore_wait() is only called when ovs_idl_txn is
non-NULL, so the comment made me mix up ovs_idl_txn and
ovs_txn_status. I almost commented that the if statement below can
never evaluate to true, but that's not the case.

> +        if (ovs_txn_status == 0) {
> +            /* Previous transaction failed. */
> +            set_flow_restore_wait(ovs_idl_txn, cfg, &cfg->other_config,
> +                                  "false");
> +            break;
> +        }
> +        /* txn succeeded, get next_cfg */
> +        frw_next_cfg = ovs_next_cfg;
> +        frw_state = FRW_TXN_SUCCESS;
> +        /* fall through */
> +
> +    case FRW_TXN_SUCCESS:
> +        if (ovs_next_cfg < frw_next_cfg) {
> +            /* DB was reset, next_cfg went backwards */
> +            VLOG_INFO("OVS DB reset (next_cfg %"PRId64" -> %"PRIu64"), "
> +                      "resetting state to FRW_INIT",

The log message says that we are resetting the state to FRW_INIT, but
the code actually sets it to FRW_WAIT_TXN_COMPLETE. In this case, I
think the code is correct and the log message is incorrect.

> +                      frw_next_cfg, ovs_next_cfg);
> +            set_flow_restore_wait(ovs_idl_txn, cfg, &cfg->other_config,
> +                                  "false");
> +            frw_state = FRW_WAIT_TXN_COMPLETE;
> +            break;
> +        }
> +
> +        if (!frw) {
> +            if (cfg->cur_cfg >= frw_next_cfg) {
> +                set_flow_restore_wait(ovs_idl_txn, cfg, &cfg->other_config,
> +                                      "true");
> +                frw_state = FRW_DONE;
> +                VLOG_INFO("Setting flow-restore-wait=true");
> +            }
> +        } else {
> +            /* The transaction to false succeeded but frw is true.
> +             * So, another task already set it to true */
> +            frw_state = FRW_DONE;
> +            VLOG_INFO("flow-restore-wait was already true");
> +        }
> +        break;
> +    case FRW_DONE:
> +        if (!frw) {
> +            /* frw has been cleared (e.g. by ovs-ctl restart) or txn failed. */

Does "cleared" here mean that flow-restore-wait is no longer present
in the OVS DB, or that it was explicitly set to "false"?

> +            set_flow_restore_wait(ovs_idl_txn, cfg, &cfg->other_config,
> +                                  "false");
> +            frw_state = FRW_WAIT_TXN_COMPLETE;
> +            VLOG_INFO("OVS frw cleared, restarting flow-restore-wait sequence "
> +                      "(cur_cfg=%"PRIu64")", ofctrl_cur_cfg);
> +        }
> +        break;
> +    }
> +}
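To double-check my reading of the state machine (and maybe help the
next reviewer), here is a toy model of the transitions as I understand
them. This is purely my own sketch, not code from the patch: frw_step()
collapses the DB writes into state changes (the "wrote frw=..." comments
stand in for the set_flow_restore_wait() calls), and the txn_status
convention follows how main() uses ovsdb_idl_loop_commit_and_wait()
(0 = failed, -1 = still in progress, positive = success).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the FRW state machine; the enum names mirror the patch
 * but nothing here is the real ovn-controller code. */
enum frw_state { FRW_INIT, FRW_WAIT_TXN_COMPLETE, FRW_TXN_SUCCESS, FRW_DONE };

struct frw_model {
    enum frw_state state;
    int64_t next_cfg;      /* snapshot taken when the txn succeeds */
};

/* One iteration of the state machine, ignoring the actual DB writes. */
static void
frw_step(struct frw_model *m, uint64_t ofctrl_cur_cfg, bool frw,
         int txn_status, int64_t ovs_next_cfg, int64_t cur_cfg)
{
    switch (m->state) {
    case FRW_INIT:
        if (ofctrl_cur_cfg > 0) {
            m->state = FRW_WAIT_TXN_COMPLETE;   /* wrote frw=false */
        }
        break;

    case FRW_WAIT_TXN_COMPLETE:
        if (txn_status == 0) {
            break;                              /* txn failed, retry frw=false */
        }
        m->next_cfg = ovs_next_cfg;
        m->state = FRW_TXN_SUCCESS;
        /* fall through */

    case FRW_TXN_SUCCESS:
        if (ovs_next_cfg < m->next_cfg) {
            m->state = FRW_WAIT_TXN_COMPLETE;   /* DB reset, start over */
            break;
        }
        if (!frw) {
            if (cur_cfg >= m->next_cfg) {
                m->state = FRW_DONE;            /* wrote frw=true */
            }
        } else {
            m->state = FRW_DONE;                /* someone else set it */
        }
        break;

    case FRW_DONE:
        if (!frw) {
            m->state = FRW_WAIT_TXN_COMPLETE;   /* frw cleared, redo sequence */
        }
        break;
    }
}
```

If my model above is right, the happy path is INIT -> WAIT_TXN_COMPLETE
-> TXN_SUCCESS -> DONE, and both the DB-reset case and the "frw cleared
while DONE" case loop back to WAIT_TXN_COMPLETE, never to FRW_INIT.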
> +
>  /* Only set monitor conditions on tables that are available in the
>   * server schema.
>   */
> @@ -7117,6 +7230,7 @@ main(int argc, char *argv[])
>      struct unixctl_server *unixctl;
>      struct ovn_exit_args exit_args = {0};
>      struct br_int_remote br_int_remote = {0};
> +    static uint64_t next_cfg = 0;
>      int retval;
>
>      /* Read from system-id-override file once on startup. */
> @@ -7444,6 +7558,7 @@ main(int argc, char *argv[])
>
>      /* Main loop. */
>      int ovnsb_txn_status = 1;
> +    int ovs_txn_status = 1;
>      bool sb_monitor_all = false;
>      struct tracked_acl_ids *tracked_acl_ids = NULL;
>      while (!exit_args.exiting) {
> @@ -7545,6 +7660,11 @@ main(int argc, char *argv[])
>          pinctrl_update_swconn(br_int_remote.target,
>                                br_int_remote.probe_interval);
>
> +        if (cfg && ovs_idl_txn && ovs_txn_status == -1) {
> +            /* txn was in progress and is now completed */
> +            next_cfg = cfg->next_cfg;
> +        }
> +
>          /* Enable ACL matching for double tagged traffic. */
>          if (ovs_idl_txn && cfg) {
>              int vlan_limit = smap_get_int(
> @@ -7894,6 +8014,12 @@ main(int argc, char *argv[])
>                      stopwatch_start(OFCTRL_SEQNO_RUN_STOPWATCH_NAME,
>                                      time_msec());
>                      ofctrl_seqno_run(ofctrl_get_cur_cfg());
> +                    if (ovs_idl_txn) {
> +                        manage_flow_restore_wait(ovs_idl_txn, cfg,
> +                                                 ofctrl_get_cur_cfg(),
> +                                                 next_cfg, ovs_txn_status);
> +                    }
> +
>                      stopwatch_stop(OFCTRL_SEQNO_RUN_STOPWATCH_NAME,
>                                     time_msec());
>                      stopwatch_start(IF_STATUS_MGR_RUN_STOPWATCH_NAME,
> @@ -7993,7 +8119,7 @@ main(int argc, char *argv[])
>              OVS_NOT_REACHED();
>          }
>
> -        int ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
> +        ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
>          if (!ovs_txn_status) {
>              /* The transaction failed. */
>              vif_plug_clear_deleted(
> @@ -8012,6 +8138,9 @@ main(int argc, char *argv[])
>                      &vif_plug_deleted_iface_ids);
>              vif_plug_finish_changed(
>                      &vif_plug_changed_iface_ids);
> +            if (cfg) {
> +                next_cfg = cfg->next_cfg;
> +            }
>          } else if (ovs_txn_status == -1) {
>              /* The commit is still in progress */
>          } else {
> @@ -8085,7 +8214,7 @@ loop_done:
>              }
>
>              ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop);
> -            int ovs_txn_status = 
> ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
> +            ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
>              if (!ovs_txn_status) {
>                  /* The transaction failed. */
>                  vif_plug_clear_deleted(
> diff --git a/tests/multinode-macros.at b/tests/multinode-macros.at
> index c4415ce1c..071b01890 100644
> --- a/tests/multinode-macros.at
> +++ b/tests/multinode-macros.at
> @@ -41,6 +41,28 @@ m4_define([M_START_TCPDUMP],
>      ]
>  )
>
> +m4_define([_M_START_TCPDUMPS_RECURSIVE], [
> +     m4_if(m4_eval($# > 3), [1], [dnl
> +        names="$names $3"
> +        echo "Running podman exec $1 tcpdump -l $2 >$3.tcpdump 2>$3.stderr"
> +        podman exec $1 tcpdump -l $2 >$3.tcpdump 2>$3.stderr &
> +        echo "podman exec $1 ps -ef | grep -v grep | grep tcpdump && podman 
> exec $1 killall tcpdump" >> cleanup
> +        _M_START_TCPDUMPS_RECURSIVE(m4_shift(m4_shift(m4_shift($@))))
> +        ])
> +    ]
> +)
> +
> +# Start Multiple tcpdump. Useful to speed up when many tcpdump
> +# must be started as waiting for "listening" takes usually 1 second.
> +m4_define([M_START_TCPDUMPS],
> +    [
> +     names=""
> +     _M_START_TCPDUMPS_RECURSIVE($@)
> +     for name in $names; do
> +         OVS_WAIT_UNTIL([grep -q "listening" ${name}.stderr])
> +     done
> +    ]
> +)
>
>  # M_FORMAT_CT([ip-addr])
>  #
> diff --git a/tests/multinode.at b/tests/multinode.at
> index e02bd6f07..2396a7247 100644
> --- a/tests/multinode.at
> +++ b/tests/multinode.at
> @@ -2986,42 +2986,42 @@ AT_CLEANUP
>
>  AT_SETUP([HA: Check for missing garp on leader when BFD goes back up])
>  # Network topology
> -#    
> ┌────────────────────────────────────────────────────────────────────────────────────────────────────────┐
> -#    │                                                                       
>                                  │
> -#    │    ┌───────────────────┐    ┌───────────────────┐    
> ┌───────────────────┐    ┌───────────────────┐    │
> -#    │    │   ovn-chassis-1   │    │  ovn-gw-1         │    │  ovn-gw-2      
>    │    │  ovn-chassis-2    │    │
> -#    │    └─────────┬─────────┘    └───────────────────┘    
> └───────────────────┘    └───────────────────┘    │
> -#    │    ┌─────────┴─────────┐                                              
>                                  │
> -#    │    │       inside1     │                                              
>                                  │
> -#    │    │   192.168.1.1/24  │                                              
>                                  │
> -#    │    └─────────┬─────────┘                                              
>                                  │
> -#    │    ┌─────────┴─────────┐                                              
>                                  │
> -#    │    │       inside      │                                              
>                                  │
> -#    │    └─────────┬─────────┘                                              
>                                  │
> -#    │    ┌─────────┴─────────┐                                              
>                                  │
> -#    │    │    192.168.1.254  │                                              
>                                  │
> -#    │    │         R1        │                                              
>                                  │
> -#    │    │    192.168.0.254  │                                              
>                                  │
> -#    │    └─────────┬─────────┘                                              
>                                  │
> -#    │              └------eth1---------------┬--------eth1-----------┐      
>                                  │
> -#    │                             ┌──────────┴────────┐    
> ┌─────────┴─────────┐                             │
> -#    │                             │    192.168.1.254  │    │   
> 192.168.1.254   │                             │
> -#    │                             │         R1        │    │         R1     
>    │                             │
> -#    │                             │    192.168.0.254  │    │   
> 192.168.0.254   │                             │
> -#    │                             └─────────┬─────────┘    
> └─────────┬─────────┘                             │
> -#    │                                       │                        │      
>         ┌───────────────────┐    │
> -#    │                             ┌─────────┴─────────┐    
> ┌─────────┴─────────┐    │    192.168.0.1    │    │
> -#    │                             │       outside     │    │       outside  
>    │    │        ext1       │    │
> -#    │                             └─────────┬─────────┘    
> └─────────┬─────────┘    └─────────┬─────────┘    │
> -#    │                             ┌─────────┴─────────┐    
> ┌─────────┴─────────┐    ┌─────────┴─────────┐    │
> -#    │                             │    ln-outside     │    │    ln-outside  
>    │    │       ln-ext1     │    │
> -#    │                             └─────────┬─────────┘    
> └─────────┬─────────┘    └─────────┬─────────┘    │
> -#    │                             ┌─────────┴─────────┐    
> ┌─────────┴─────────┐    ┌─────────┴─────────┐    │
> -#    │                             │       br-ex       │    │       br-ex    
>    │    │       br-ex       │    │
> -#    │                             └─────────┬─────────┘    
> └─────────┬─────────┘    └─────────┬─────────┘    │
> -#    │                                       
> └---------eth2-----------┴-------eth2-------------┘              │
> -#    │                                                                       
>                                  │
> -#    
> └────────────────────────────────────────────────────────────────────────────────────────────────────────┘
> +#    
> ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
> +#    │                                                                       
>                                                          │
> +#    │   ┌───────────────────┐    ┌───────────────────┐    
> ┌───────────────────┐    ┌───────────────────┐    ┌───────────────────┐    │
> +#    │   │   ovn-chassis-1   │    │   ovn-chassis-2   │    │  ovn-gw-1       
>   │    │  ovn-gw-2         │    │  ovn-chassis-3    │    │
> +#    │   └─────────┬─────────┘    └─────────┬─────────┘    
> └───────────────────┘    └───────────────────┘    └───────────────────┘    │
> +#    │   ┌─────────┴─────────┐    ┌─────────┴─────────┐                      
>                                                          │
> +#    │   │       inside1     │    │       inside2     │                      
>                                                          │
> +#    │   │   192.168.1.1/24  │    │   192.168.1.2/24  │                      
>                                                          │
> +#    │   └─────────┬─────────┘    └─────────┬─────────┘                      
>                                                          │
> +#    │           ┌─┴────────────────────────┴─┐                              
>                                                          │
> +#    │           │           inside           │                              
>                                                          │
> +#    │           └──────────────┬─────────────┘                              
>                                                          │
> +#    │                ┌─────────┴─────────┐                                  
>                                                          │
> +#    │                │    192.168.1.254  │                                  
>                                                          │
> +#    │                │         R1        │                                  
>                                                          │
> +#    │                │    192.168.0.254  │                                  
>                                                          │
> +#    │                └─────────┬─────────┘                                  
>                                                          │
> +#    │                          
> └------eth1---------------------------┬--------eth1-----------┐               
>                         │
> +#    │                                                     
> ┌──────────┴────────┐    ┌─────────┴─────────┐                             │
> +#    │                                                     │    
> 192.168.1.254  │    │   192.168.1.254   │                             │
> +#    │                                                     │         R1      
>   │    │         R1        │                             │
> +#    │                                                     │    
> 192.168.0.254  │    │   192.168.0.254   │                             │
> +#    │                                                     
> └─────────┬─────────┘    └─────────┬─────────┘                             │
> +#    │                                                               │       
>                  │              ┌───────────────────┐    │
> +#    │                                                     
> ┌─────────┴─────────┐    ┌─────────┴─────────┐    │    192.168.0.1    │    │
> +#    │                                                     │       outside   
>   │    │       outside     │    │        ext1       │    │
> +#    │                                                     
> └─────────┬─────────┘    └─────────┬─────────┘    └─────────┬─────────┘    │
> +#    │                                                     
> ┌─────────┴─────────┐    ┌─────────┴─────────┐    ┌─────────┴─────────┐    │
> +#    │                                                     │    ln-outside   
>   │    │    ln-outside     │    │       ln-ext1     │    │
> +#    │                                                     
> └─────────┬─────────┘    └─────────┬─────────┘    └─────────┬─────────┘    │
> +#    │                                                     
> ┌─────────┴─────────┐    ┌─────────┴─────────┐    ┌─────────┴─────────┐    │
> +#    │                                                     │       br-ex     
>   │    │       br-ex       │    │       br-ex       │    │
> +#    │                                                     
> └─────────┬─────────┘    └─────────┬─────────┘    └─────────┬─────────┘    │
> +#    │                                                               
> └---------eth2-----------┴-------eth2-------------┘              │
> +#    │                                                                       
>                                                          │
> +#    
> └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
>
>  # The goal of this test is the check that GARP are properly generated by 
> higest priority traffic when
>  # BFD goes down, and back up, and this whether the BFD event is due either 
> to some bfd packet lost
> @@ -3030,6 +3030,12 @@ AT_SETUP([HA: Check for missing garp on leader when 
> BFD goes back up])
>  # So gw3 should in this test neither send garp or receive packets.
>  #
>  # Enable vconn so we can check the GARP from a log perspective.
> +on_exit "podman exec ovn-gw-1 ovn-appctl vlog/set info"
> +on_exit "podman exec ovn-gw-1 ovn-appctl vlog/enable-rate-limit"
> +on_exit "podman exec ovn-gw-2 ovn-appctl vlog/set info"
> +on_exit "podman exec ovn-gw-2 ovn-appctl vlog/enable-rate-limit"
> +on_exit "podman exec ovn-gw-3 ovn-appctl vlog/set info"
> +on_exit "podman exec ovn-gw-3 ovn-appctl vlog/enable-rate-limit"
>  m_as ovn-gw-1 ovn-appctl vlog/set vconn:dbg
>  m_as ovn-gw-2 ovn-appctl vlog/set vconn:dbg
>  m_as ovn-gw-3 ovn-appctl vlog/set vconn:dbg
> @@ -3037,12 +3043,17 @@ m_as ovn-gw-1 ovn-appctl vlog/disable-rate-limit
>  m_as ovn-gw-2 ovn-appctl vlog/disable-rate-limit
>  m_as ovn-gw-3 ovn-appctl vlog/disable-rate-limit
>
> +# Decrease revalidation time on ovs switch simulating ToR.
> +on_exit "OVS_RUNDIR= ovs-vsctl set Open_vSwitch . 
> other_config:max-revalidator=500"
> +OVS_RUNDIR= ovs-vsctl set Open_vSwitch . other_config:max-revalidator=100
> +
>  check_fake_multinode_setup
>
>  # Delete the multinode NB and OVS resources before starting the test.
>  cleanup_multinode_resources
>
>  ip_ch1=$(m_as ovn-chassis-1 ip a show dev eth1 | grep "inet " | awk '{print 
> $2}'| cut -d '/' -f1)
> +ip_ch2=$(m_as ovn-chassis-2 ip a show dev eth1 | grep "inet " | awk '{print 
> $2}'| cut -d '/' -f1)
>  ip_gw1=$(m_as ovn-gw-1 ip a show dev eth1 | grep "inet " | awk '{print $2}'| 
> cut -d '/' -f1)
>  ip_gw2=$(m_as ovn-gw-2 ip a show dev eth1 | grep "inet " | awk '{print $2}'| 
> cut -d '/' -f1)
>  ip_gw3=$(m_as ovn-gw-3 ip a show dev eth1 | grep "inet " | awk '{print $2}'| 
> cut -d '/' -f1)
> @@ -3050,25 +3061,35 @@ ip_gw3=$(m_as ovn-gw-3 ip a show dev eth1 | grep 
> "inet " | awk '{print $2}'| cut
>  from_gw1_to_gw2=$(m_as ovn-gw-1 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw2)
>  from_gw1_to_gw3=$(m_as ovn-gw-1 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw3)
>  from_gw1_to_ch1=$(m_as ovn-gw-1 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_ch1)
> +from_gw1_to_ch2=$(m_as ovn-gw-1 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_ch2)
>  from_gw2_to_gw1=$(m_as ovn-gw-2 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw1)
>  from_gw2_to_gw3=$(m_as ovn-gw-2 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw3)
>  from_gw2_to_ch1=$(m_as ovn-gw-2 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_ch1)
> +from_gw2_to_ch2=$(m_as ovn-gw-2 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_ch2)
>  from_ch1_to_gw1=$(m_as ovn-chassis-1 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw1)
>  from_ch1_to_gw2=$(m_as ovn-chassis-1 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw2)
> +from_ch2_to_gw1=$(m_as ovn-chassis-2 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw1)
> +from_ch2_to_gw2=$(m_as ovn-chassis-2 ovs-vsctl --bare --columns=name find 
> interface options:remote_ip=$ip_gw2)
>
>  m_as ovn-chassis-1 ip link del hv1-vif1-p
> -m_as ovn-chassis-2 ip link del ext1-p
> +m_as ovn-chassis-2 ip link del hv2-vif1-p
> +m_as ovn-chassis-3 ip link del ext1-p
>
>  OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip link show | grep -q genev_sys])
>  OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip link show | grep -q genev_sys])
> +OVS_WAIT_UNTIL([m_as ovn-chassis-3 ip link show | grep -q genev_sys])
>  OVS_WAIT_UNTIL([m_as ovn-gw-1 ip link show | grep -q genev_sys])
>  OVS_WAIT_UNTIL([m_as ovn-gw-2 ip link show | grep -q genev_sys])
>  OVS_WAIT_UNTIL([m_as ovn-gw-3 ip link show | grep -q genev_sys])
>
> +# Use "aggressive" bfd parameters
> +check multinode_nbctl set NB_Global . options:"bfd-min-rx"=500
> +check multinode_nbctl set NB_Global . options:"bfd-min-tx"=100
>  check multinode_nbctl ls-add inside
>  check multinode_nbctl ls-add outside
>  check multinode_nbctl ls-add ext
>  check multinode_nbctl lsp-add inside inside1 -- lsp-set-addresses inside1 
> "f0:00:c0:a8:01:01 192.168.1.1"
> +check multinode_nbctl lsp-add inside inside2 -- lsp-set-addresses inside2 
> "f0:00:c0:a8:01:02 192.168.1.2"
>  check multinode_nbctl lsp-add ext ext1 -- lsp-set-addresses ext1 
> "00:00:c0:a8:00:01 192.168.0.1"
>
>  multinode_nbctl create Logical_Router name=R1
> @@ -3100,12 +3121,14 @@ m_as ovn-gw-3 ovs-vsctl remove open . external_ids 
> garp-max-timeout-sec
>
>  m_as ovn-chassis-1 ovs-vsctl set open . 
> external-ids:ovn-bridge-mappings=public:br-ex
>  m_as ovn-chassis-2 ovs-vsctl set open . 
> external-ids:ovn-bridge-mappings=public:br-ex
> +m_as ovn-chassis-3 ovs-vsctl set open . 
> external-ids:ovn-bridge-mappings=public:br-ex
>  m_as ovn-gw-1 ovs-vsctl set open . 
> external-ids:ovn-bridge-mappings=public:br-ex
>  m_as ovn-gw-2 ovs-vsctl set open . 
> external-ids:ovn-bridge-mappings=public:br-ex
>  m_as ovn-gw-3 ovs-vsctl set open . 
> external-ids:ovn-bridge-mappings=public:br-ex
>
>  m_as ovn-chassis-1 /data/create_fake_vm.sh inside1 hv1-vif1 
> f0:00:c0:a8:01:01 1500 192.168.1.1 24 192.168.1.254 2000::1/64 2000::a
> -m_as ovn-chassis-2 /data/create_fake_vm.sh ext1 ext1 00:00:c0:a8:00:01 1500 
> 192.168.0.1 24 192.168.0.254 1000::3/64 1000::a
> +m_as ovn-chassis-2 /data/create_fake_vm.sh inside2 hv2-vif1 
> f0:00:c0:a8:01:02 1500 192.168.1.2 24 192.168.1.254 2000::2/64 2000::a
> +m_as ovn-chassis-3 /data/create_fake_vm.sh ext1 ext1 00:00:c0:a8:00:01 1500 
> 192.168.0.1 24 192.168.0.254 1000::3/64 1000::a
>
>  # There should be one ha_chassis_group with the name "R1_outside"
>  m_check_row_count HA_Chassis_Group 1 name=R1_outside
> @@ -3160,53 +3183,67 @@ for chassis in $from_ch1_to_gw1 $from_ch1_to_gw2; do
>      wait_bfd_enabled ovn-chassis-1 $chassis
>  done
>
> +# check BFD enablement on tunnel ports from ovn-chassis-2 ###########
> +for chassis in $from_ch2_to_gw1 $from_ch2_to_gw2; do
> +    echo "checking ovn-chassis-2 -> $chassis"
> +    wait_bfd_enabled ovn-chassis-2 $chassis
> +done
> +
>  # Make sure there is no nft table left. Do not use nft directly as might not 
> be installed in container.
>  gw1_pid=$(podman inspect -f '{{.State.Pid}}' ovn-gw-1)
>  nsenter --net=/proc/$gw1_pid/ns/net nft list tables | grep ovn-test && 
> nsenter --net=/proc/$gw1_pid/ns/net nft delete table ip ovn-test
> -on_exit "nsenter --net=/proc/$gw1_pid/ns/net nft list tables | grep ovn-test 
> && nsenter --net=/proc/$gw1_pid/ns/net nft delete table ip ovn-test"
> +on_exit "if [[ -d "/proc/$gw1_pid" ]]; then nsenter 
> --net=/proc/$gw1_pid/ns/net nft list tables | grep ovn-test && nsenter 
> --net=/proc/$gw1_pid/ns/net nft delete table ip ovn-test; fi"
>
> -for chassis in $from_gw1_to_gw2 $from_gw1_to_gw3 $from_gw1_to_ch1; do
> +for chassis in $from_gw1_to_gw2 $from_gw1_to_gw3 $from_gw1_to_ch1 
> $from_gw1_to_ch2; do
>      wait_bfd_up ovn-gw-1 $chassis
>  done
> -for chassis in $from_gw2_to_gw1 $from_gw2_to_gw3 $from_gw2_to_ch1; do
> +for chassis in $from_gw2_to_gw1 $from_gw2_to_gw3 $from_gw2_to_ch1 
> $from_gw2_to_ch2; do
>      wait_bfd_up ovn-gw-2 $chassis
>  done
>  for chassis in $from_ch1_to_gw1 $from_ch1_to_gw2; do
>      wait_bfd_up ovn-chassis-1 $chassis
>  done
> +for chassis in $from_ch2_to_gw1 $from_ch2_to_gw2; do
> +    wait_bfd_up ovn-chassis-2 $chassis
> +done
>
>  m_wait_row_count Port_Binding 1 logical_port=cr-R1_outside 
> chassis=$gw1_chassis
>  check multinode_nbctl --wait=hv sync
>
>  start_tcpdump() {
>      echo "$(date +%H:%M:%S.%03N) Starting tcpdump"
> -    M_START_TCPDUMP([ovn-chassis-1], [-neei hv1-vif1-p], [ch1])
> -    M_START_TCPDUMP([ovn-chassis-2], [-neei eth2], [ch2])
> -    M_START_TCPDUMP([ovn-gw-1], [-neei eth2], [gw1])
> -    M_START_TCPDUMP([ovn-gw-1], [-neei eth2 -Q out], [gw1_out])
> -    M_START_TCPDUMP([ovn-gw-2], [-neei eth2], [gw2])
> -    M_START_TCPDUMP([ovn-gw-2], [-neei eth2 -Q out], [gw2_out])
> -    M_START_TCPDUMP([ovn-gw-3], [-neei eth2], [gw3])
> -    M_START_TCPDUMP([ovn-gw-3], [-neei eth2 -Q out], [gw3_out])
> +    M_START_TCPDUMPS([ovn-chassis-1], [-neei hv1-vif1-p], [ch1],
> +                    [ovn-chassis-2], [-neei hv2-vif1-p], [ch2],
> +                    [ovn-chassis-3], [-neei eth2], [ch3],
> +                    [ovn-gw-1], [-neei eth2], [gw1],
> +                    [ovn-gw-1], [-neei eth2 -Q out], [gw1_out],
> +                    [ovn-gw-2], [-neei eth2], [gw2],
> +                    [ovn-gw-2], [-neei eth2 -Q out], [gw2_out],
> +                    [ovn-gw-3], [-neei eth2], [gw3],
> +                    [ovn-gw-3], [-neei eth2 -Q out], [gw3_out],
> +                    [ovn-gw-1], [-neei eth1], [gw1_eth1],
> +                    [ovn-gw-2], [-neei eth1], [gw2_eth1],
> +                    [ovn-chassis-1], [-neei eth1], [ch1_eth1],
> +                    [ovn-chassis-2], [-neei eth1], [ch2_eth1])
>  }
>
>  stop_tcpdump() {
>      echo "$(date +%H:%M:%S.%03N) Stopping tcpdump"
> -    m_kill 'ovn-gw-1 ovn-gw-2 ovn-gw-3 ovn-chassis-1 ovn-chassis-2' tcpdump
> +    m_kill 'ovn-gw-1 ovn-gw-2 ovn-gw-3 ovn-chassis-1 ovn-chassis-2 
> ovn-chassis-3' tcpdump
>  }
>
> -# Send packets from chassis2 (ext1) to chassis1
> +# Send packets from ovn-chassis-3 (ext1) to ovn-chassis-1
>  send_background_packets() {
>      echo "$(date +%H:%M:%S.%03N) Sending packets in Background"
>      start_tcpdump
> -    M_NS_DAEMONIZE([ovn-chassis-2], [ext1], [ping -f -i 0.1 192.168.1.1], 
> [ping.pid])
> +    M_NS_DAEMONIZE([ovn-chassis-3], [ext1], [ping -f -i 0.1 192.168.1.1], 
> [ping.pid])
>  }
>
>  stop_sending_background_packets() {
>      echo "$(date +%H:%M:%S.%03N) Stopping Background process"
>      m_as ovn-chassis-1 ps -ef | grep -v grep | grep -q ping && \
>          m_as ovn-chassis-1 echo "Stopping ping on ovn-chassis-1" && killall 
> ping
> -    m_as ovn-chassis-2 ps -ef | grep -v grep | grep -q ping && \
> +    m_as ovn-chassis-3 ps -ef | grep -v grep | grep -q ping && \
>          m_as ovn-chassis-2 echo "Stopping ping on ovn-chassis-2" && killall 
> ping
>      stop_tcpdump
>  }
> @@ -3216,8 +3253,8 @@ check_for_new_garps() {
>      expecting_garp=$2
>      n_new_garps=$(cat ${hv}_out.tcpdump | grep -c "f0:00:c0:a8:00:fe > 
> Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.254 
> tell 192.168.0.254, length 28")
>
> -    if [ "$expecting_garp" == "true" ]; then
> -        AS_BOX([$(date +%H:%M:%S.%03N) Waiting/checking for garp from $hv - 
> Starting with $n_new_garps])
> +    if [[ "$expecting_garp" == "true" ]]; then
> +        echo "$(date +%H:%M:%S.%03N) Waiting/checking for garp from $hv - 
> Starting with $n_new_garps"
>          OVS_WAIT_UNTIL([
>              n_garps=$n_new_garps
>              n_new_garps=$(cat ${hv}_out.tcpdump | grep -c "f0:00:c0:a8:00:fe 
> > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.254 
> tell 192.168.0.254, length 28")
> @@ -3225,7 +3262,7 @@ check_for_new_garps() {
>              test "$n_garps" -ne "$n_new_garps"
>          ])
>      else
> -        AS_BOX([$(date +%H:%M:%S.%03N) Checking no garp from ${hv}])
> +        echo "$(date +%H:%M:%S.%03N) Checking no garp from ${hv}"
>          # Waiting a few seconds to get a chance to see unexpected garps.
>          sleep 3
>          n_garps=$(cat ${hv}_out.tcpdump | grep -c "f0:00:c0:a8:00:fe > 
> Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.254 
> tell 192.168.0.254, length 28")
> @@ -3241,8 +3278,8 @@ check_for_new_echo_pkts() {
>      n_new_echo_req=$(cat ${hv}.tcpdump | grep -c "$mac_src > $mac_dst, 
> ethertype IPv4 (0x0800), length 98: 192.168.0.1 > 192.168.1.1: ICMP echo 
> request")
>      n_new_echo_rep=$(cat ${hv}.tcpdump | grep -c "$mac_dst > $mac_src, 
> ethertype IPv4 (0x0800), length 98: 192.168.1.1 > 192.168.0.1: ICMP echo 
> reply")
>
> -    if [ "$expecting_pkts" == "true" ]; then
> -        AS_BOX([$(date +%H:%M:%S.%03N) Waiting/checking for echo pkts 
> through ${hv}])
> +    if [[ "$expecting_pkts" == "true" ]]; then
> +        echo "$(date +%H:%M:%S.%03N) Waiting/checking for echo pkts through 
> ${hv}"
>          echo "Starting with $n_new_echo_req requests and $n_new_echo_rep 
> replies so far on ${hv}."
>          OVS_WAIT_UNTIL([
>              n_echo_req=$n_new_echo_req
> @@ -3253,7 +3290,7 @@ check_for_new_echo_pkts() {
>              test "$n_echo_req" -ne "$n_new_echo_req" && test "$n_echo_rep" 
> -ne "$n_new_echo_rep"
>          ])
>      else
> -        AS_BOX([$(date +%H:%M:%S.%03N) Checking no pkts from ${hv}])
> +        echo "$(date +%H:%M:%S.%03N) Checking no pkts from ${hv}"
>          # Waiting a few seconds to get a chance to see unexpected pkts.
>          sleep 3
>          n_echo_req=$(cat ${hv}.tcpdump | grep -c "$mac_src > $mac_dst, 
> ethertype IPv4 (0x0800), length 98: 192.168.0.1 > 192.168.1.1: ICMP echo 
> request")
> @@ -3271,22 +3308,44 @@ dump_statistics() {
>      ch1_rep=$(grep -c "ICMP echo reply" ch1.tcpdump)
>      ch2_req=$(grep -c "ICMP echo request" ch2.tcpdump)
>      ch2_rep=$(grep -c "ICMP echo reply" ch2.tcpdump)
> +    ch3_req=$(grep -c "ICMP echo request" ch3.tcpdump)
> +    ch3_rep=$(grep -c "ICMP echo reply" ch3.tcpdump)
>      gw1_req=$(grep -c "ICMP echo request" gw1.tcpdump)
>      gw1_rep=$(grep -c "ICMP echo reply" gw1.tcpdump)
>      gw2_req=$(grep -c "ICMP echo request" gw2.tcpdump)
>      gw2_rep=$(grep -c "ICMP echo reply" gw2.tcpdump)
>      gw3_req=$(grep -c "ICMP echo request" gw3.tcpdump)
>      gw3_rep=$(grep -c "ICMP echo reply" gw3.tcpdump)
> -    echo "$n1 claims in gw1, $n2 in gw2 and $n3 on gw3"
> -    echo "ch2_request=$ch2_req gw1_request=$gw1_req gw2_request=$gw2_req 
> gw3_request=$gw3_req ch1_request=$ch1_req ch1_reply=$ch1_rep 
> gw1_reply=$gw1_rep gw2_reply=$gw2_rep gw3_reply=$gw3_rep ch2_reply=$ch2_rep"
> +    echo "$n1 claims in gw1, $n2 in gw2 and $n3 on gw3" >&2
> +    echo "ch3_req=$ch3_req gw_req=($gw1_req + $gw2_req + $gw3_req) ch1_req=$ch1_req ch1_rep=$ch1_rep gw_rep=($gw1_rep + $gw2_rep + $gw3_rep) ch3_rep=$ch3_rep ch2=($ch2_req + $ch2_rep)" >&2
> +    echo "$((ch3_req - ch3_rep))"
>  }
>
> -check_migration_between_gw1_and_gw2() {
> -    action=$1
> -    send_background_packets
> +add_port() {
> +    bridge=$1
> +    interface=$2
> +    address=$3
> +    echo "Adding $bridge $interface $address"
> +
> +    pid=$(podman inspect -f '{{.State.Pid}}' ovn-gw-1)
> +    ln -sf /proc/$pid/ns/net /var/run/netns/$pid
> +    port=$(OVS_RUNDIR= ovs-vsctl --data=bare --no-heading --columns=name 
> find interface \
> +           external_ids:container_id=ovn-gw-1 
> external_ids:container_iface="$interface")
> +    port="${port:0:13}"
> +    ip link add "${port}_l" type veth peer name "${port}_c"
> +    ip link set "${port}_l" up
> +    ip link set "${port}_c" netns $pid
> +    ip netns exec $pid ip link set dev "${port}_c" name "$interface"
> +    ip netns exec $pid ip link set "$interface" up
> +    if [[ -n "$address" ]]; then
> +        ip netns exec $pid ip addr add "$address" dev "$interface"
> +    fi
> +}
>
> +prepare() {
> +    send_background_packets
>      # We make sure gw1 is leader since enough time that it generated all its 
> garps.
> -    AS_BOX([$(date +%H:%M:%S.%03N) Waiting all garps sent by gw1])
> +    echo "$(date +%H:%M:%S.%03N) Waiting for all garps sent by gw1"
>      n_new_garps=$(cat gw1_out.tcpdump | grep -c "f0:00:c0:a8:00:fe > 
> Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.254 
> tell 192.168.0.254, length 28")
>      OVS_WAIT_UNTIL([
>          n_garps=$n_new_garps
> @@ -3302,130 +3361,229 @@ check_migration_between_gw1_and_gw2() {
>      check_for_new_echo_pkts gw2 "00:00:c0:a8:00:01" "f0:00:c0:a8:00:fe" 
> "false"
>      check_for_new_echo_pkts gw3 "00:00:c0:a8:00:01" "f0:00:c0:a8:00:fe" 
> "false"
>
> +    # All packets should go through gw1, and none through gw2 or gw3.
> +    check_packets "true" "false" "false" "true"
>      flap_count_gw_1=$(m_as ovn-gw-1 ovs-vsctl get interface $from_gw1_to_gw2 
> bfd_status | sed 's/.*flap_count=\"\([[0-9]]*\).*/\1/g')
>      flap_count_gw_2=$(m_as ovn-gw-2 ovs-vsctl get interface $from_gw2_to_gw1 
> bfd_status | sed 's/.*flap_count=\"\([[0-9]]*\).*/\1/g')
> +}
>
> -    if [ test "$action" == "stop_bfd" ]; then
> -        AS_BOX([$(date +%H:%M:%S.%03N) Blocking bfd on gw1 (from $ip_gw1 to 
> $ip_gw2)])
> -        nsenter --net=/proc/$gw1_pid/ns/net nft add table ip ovn-test
> -        nsenter --net=/proc/$gw1_pid/ns/net nft 'add chain ip ovn-test INPUT 
> { type filter hook input priority 0; policy accept; }'
> -        # Drop BFD from gw-1 to gw-2: geneve port (6081), inner port 3784 
> (0xec8), Session state Up, Init, Down.
> -        nsenter --net=/proc/$gw1_pid/ns/net nft add rule ip ovn-test INPUT 
> ip daddr $ip_gw1 ip saddr $ip_gw2 udp dport 6081 '@th,416,16 == 0x0ec8 
> @th,472,8 == 0xc0  counter drop'
> -        nsenter --net=/proc/$gw1_pid/ns/net nft add rule ip ovn-test INPUT 
> ip daddr $ip_gw1 ip saddr $ip_gw2 udp dport 6081 '@th,416,16 == 0x0ec8 
> @th,472,8 == 0x80  counter drop'
> -        nsenter --net=/proc/$gw1_pid/ns/net nft add rule ip ovn-test INPUT 
> ip daddr $ip_gw1 ip saddr $ip_gw2 udp dport 6081 '@th,416,16 == 0x0ec8 
> @th,472,8 == 0x40  counter drop'
> -
> -        # We do not check that packets go through gw2 as BFD between 
> chassis-2 and gw1 is still up
> -    fi
> +check_loss_after_flap()
> +{
> +    dead=$1
> +    max_expected_loss=$2
>
> -    if [ test "$action" == "kill_gw2" ]; then
> -        AS_BOX([$(date +%H:%M:%S.%03N) Killing gw2 ovn-controller])
> -        on_exit 'm_as ovn-gw-2 /usr/share/openvswitch/scripts/ovs-ctl status 
> ||
> -                 m_as ovn-gw-2 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-1'
> -        on_exit 'm_as ovn-gw-2 /usr/share/ovn/scripts/ovn-ctl 
> status_controller ||
> -                 m_as ovn-gw-2 /usr/share/ovn/scripts/ovn-ctl 
> start_controller ${CONTROLLER_SSL_ARGS}'
> -
> -        m_as ovn-gw-2 kill -9 $(m_as ovn-gw-2 cat 
> /run/ovn/ovn-controller.pid)
> -        m_as ovn-gw-2 kill -9 $(m_as ovn-gw-2 cat 
> /run/openvswitch/ovs-vswitchd.pid)
> -        m_as ovn-gw-2 kill -9 $(m_as ovn-gw-2 cat 
> /run/openvswitch/ovsdb-server.pid)
> -        # Also delete datapath (flows)
> -        m_as ovn-gw-2 ovs-dpctl del-dp system@ovs-system
> -    fi
> -
> -    if [ test "$action" == "kill_gw1" ]; then
> -        AS_BOX([$(date +%H:%M:%S.%03N) Killing gw1 ovn-controller])
> -        on_exit 'm_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl status 
> ||
> -                 m_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-1'
> -        on_exit 'm_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl 
> status_controller ||
> -                 m_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl 
> start_controller ${CONTROLLER_SSL_ARGS}'
> -
> -        m_as ovn-gw-1 kill -9 $(m_as ovn-gw-1 cat 
> /run/ovn/ovn-controller.pid)
> -        m_as ovn-gw-1 kill -9 $(m_as ovn-gw-1 cat 
> /run/openvswitch/ovs-vswitchd.pid)
> -        m_as ovn-gw-1 kill -9 $(m_as ovn-gw-1 cat 
> /run/openvswitch/ovsdb-server.pid)
> -        # Also delete datapath (flows)
> -        m_as ovn-gw-1 ovs-dpctl del-dp system@ovs-system
> -    fi
> -
> -    if [ test "$action" == "kill_gw2" ]; then
> -        AS_BOX([$(date +%H:%M:%S.%03N) Waiting for flap count between gw1 
> and gw2 to increase])
> +    if [[ "$dead" == "gw2" ]]; then
> +        echo "$(date +%H:%M:%S.%03N) Waiting for flap count between gw1 and 
> gw2 to increase"
>          OVS_WAIT_UNTIL([
>              new_flap_count=$(m_as ovn-gw-1 ovs-vsctl get interfac 
> $from_gw1_to_gw2 bfd_status | sed 's/.*flap_count=\"\([[0-9]]*\).*/\1/g')
>              echo "Comparing $new_flap_count versus $flap_count_gw_1"
>              test "$new_flap_count" -gt "$((flap_count_gw_1))"
>          ])
>      else
> -        AS_BOX([$(date +%H:%M:%S.%03N) Waiting for flap count between gw2 
> and gw1 to increase])
> +        echo "$(date +%H:%M:%S.%03N) Waiting for flap count between gw2 and gw1 to increase"
>          OVS_WAIT_UNTIL([
>              new_flap_count=$(m_as ovn-gw-2 ovs-vsctl get interfac 
> $from_gw2_to_gw1 bfd_status | sed 's/.*flap_count=\"\([[0-9]]*\).*/\1/g')
>              echo "Comparing $new_flap_count versus $flap_count_gw_2"
>              test "$new_flap_count" -gt "$((flap_count_gw_2))"
>          ])
> -
>      fi
> -    AS_BOX([$(date +%H:%M:%S.%03N) Flapped!])
>
> +    echo "$(date +%H:%M:%S.%03N) Flapped!"
>      # Wait a few more second for the fight.
> +    sleep 4
> +
> +    echo "$(date +%H:%M:%S.%03N) Statistics after flapping"
> +    lost=$(dump_statistics)
> +    echo "===> $lost packets lost while handling migration"
> +    AT_CHECK([test "$lost" -le "$max_expected_loss"])
> +}
> +
> +final_check()
> +{
> +    action=$1
> +    lost=$2
> +    max_expected_loss_after_restoration=$3
> +
> +    # Wait a little longer to capture packets while the network is restored.
>      sleep 2
> -    AS_BOX([$(date +%H:%M:%S.%03N) Statistics after flapping])
> -    dump_statistics
> -
> -    if [ test "$action" == "stop_bfd" ]; then
> -        # gw1 still alive and gw2 tried to claim => gw1 should restart 
> generating garps.
> -        check_for_new_garps gw1 "true"
> -        check_for_new_garps gw2 "false"
> -        check_for_new_garps gw3 "false"
> -        check_for_new_echo_pkts gw1 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "true"
> -        check_for_new_echo_pkts gw2 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "false"
> -        check_for_new_echo_pkts gw3 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "false"
> -        check_for_new_echo_pkts ch1 f0:00:c0:a8:01:fe f0:00:c0:a8:01:01 
> "true"
> -        AS_BOX([$(date +%H:%M:%S.%03N) Unblocking bfd on gw1])
> -        nsenter --net=/proc/$gw1_pid/ns/net nft -a list ruleset
> -        nsenter --net=/proc/$gw1_pid/ns/net nft delete table ip ovn-test
> -    fi
> +    echo "$(date +%H:%M:%S.%03N) Statistics after network restored (after 
> $action)"
> +    new_lost=$(dump_statistics)
> +    echo "===> $((new_lost - lost)) packets lost during network restoration"
> +    AT_CHECK([test "$((new_lost - lost))" -le 
> "$max_expected_loss_after_restoration"])
> +    stop_sending_background_packets
> +}
>
> -    if [ test "$action" == "kill_gw2" ]; then
> -        # gw1 still alive, but gw2 did not try to claim => gw1 should not 
> generate new garps.
> -        check_for_new_garps gw1 "false"
> -        check_for_new_garps gw2 "false"
> -        check_for_new_garps gw3 "false"
> -        check_for_new_echo_pkts gw1 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "true"
> -        check_for_new_echo_pkts gw2 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "false"
> -        check_for_new_echo_pkts gw3 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "false"
> -        check_for_new_echo_pkts ch1 f0:00:c0:a8:01:fe f0:00:c0:a8:01:01 
> "true"
> -        AS_BOX([$(date +%H:%M:%S.%03N) Restarting gw2 ovn-vswitchd])
> -        m_as ovn-gw-2 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-2
> -
> -        AS_BOX([$(date +%H:%M:%S.%03N) Restarting gw2 ovn-controller])
> -        m_as ovn-gw-2 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}
> -    fi
> +check_garps()
> +{
> +    check_for_new_garps gw1 "$1"
> +    check_for_new_garps gw2 "$2"
> +    check_for_new_garps gw3 "$3"
> +}
>
> -    if [ test "$action" == "kill_gw1" ]; then
> -        # gw1 died => gw2 should generate garps.
> -        check_for_new_garps gw1 "false"
> -        check_for_new_garps gw2 "true"
> -        check_for_new_garps gw3 "false"
> -        check_for_new_echo_pkts gw1 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "false"
> -        check_for_new_echo_pkts gw2 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "true"
> -        check_for_new_echo_pkts gw3 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe 
> "false"
> -        check_for_new_echo_pkts ch1 f0:00:c0:a8:01:fe f0:00:c0:a8:01:01 
> "true"
> -        AS_BOX([$(date +%H:%M:%S.%03N) Restarting gw1 ovn-vswitchd])
> -        m_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-1
> -
> -        AS_BOX([$(date +%H:%M:%S.%03N) Restarting gw1 ovn-controller])
> -        m_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}
> -    fi
> +check_packets()
> +{
> +    check_for_new_echo_pkts gw1 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe "$1"
> +    check_for_new_echo_pkts gw2 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe "$2"
> +    check_for_new_echo_pkts gw3 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe "$3"
> +    check_for_new_echo_pkts ch1 f0:00:c0:a8:01:fe f0:00:c0:a8:01:01 "$4"
> +}
> +
> +check_migration_between_gw1_and_gw2_bfd_stop()
> +{
> +    AS_BOX([$(date +%H:%M:%S.%03N) Testing migration after bfd_stop])
> +    loss1=$1
> +    loss2=$2
> +    prepare
> +
> +    echo "$(date +%H:%M:%S.%03N) Blocking bfd on gw1 (from $ip_gw1 to 
> $ip_gw2)"
> +    nsenter --net=/proc/$gw1_pid/ns/net nft add table ip ovn-test
> +    nsenter --net=/proc/$gw1_pid/ns/net nft 'add chain ip ovn-test INPUT { 
> type filter hook input priority 0; policy accept; }'
> +    # Drop BFD from gw-1 to gw-2: geneve port (6081), inner port 3784 
> (0xec8), Session state Up, Init, Down.
> +    nsenter --net=/proc/$gw1_pid/ns/net nft add rule ip ovn-test INPUT ip 
> daddr $ip_gw1 ip saddr $ip_gw2 udp dport 6081 '@th,416,16 == 0x0ec8 @th,472,8 
> == 0xc0  counter drop'
> +    nsenter --net=/proc/$gw1_pid/ns/net nft add rule ip ovn-test INPUT ip 
> daddr $ip_gw1 ip saddr $ip_gw2 udp dport 6081 '@th,416,16 == 0x0ec8 @th,472,8 
> == 0x80  counter drop'
> +    nsenter --net=/proc/$gw1_pid/ns/net nft add rule ip ovn-test INPUT ip 
> daddr $ip_gw1 ip saddr $ip_gw2 udp dport 6081 '@th,416,16 == 0x0ec8 @th,472,8 
> == 0x40  counter drop'
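The raw payload offsets here confused me at first, so for other reviewers: they check out, assuming a Geneve header without options and an untagged inner Ethernet frame (nft's '@th' is relative to the start of the outer UDP header). A quick sketch of the arithmetic, not part of the patch:

```shell
# Offsets of the inner BFD fields relative to the outer UDP header ('@th'),
# assuming a Geneve header without options and an untagged inner frame.
outer_udp=8; geneve=8; inner_eth=14; inner_ip=20; inner_udp=8

# Inner UDP destination port sits 2 bytes into the inner UDP header.
dport_bits=$(( (outer_udp + geneve + inner_eth + inner_ip + 2) * 8 ))
test "$dport_bits" -eq 416          # '@th,416,16 == 0x0ec8'
test $((0x0ec8)) -eq 3784           # 3784 = BFD control port

# BFD session state is the top 2 bits of the second BFD byte:
# Up = 0xc0, Init = 0x80, Down = 0x40.
state_bits=$(( (outer_udp + geneve + inner_eth + inner_ip + inner_udp + 1) * 8 ))
test "$state_bits" -eq 472          # '@th,472,8' masks the session state
echo "dport_bits=$dport_bits state_bits=$state_bits"
```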
> +
> +    check_loss_after_flap "gw1" $loss1
> +
> +    # gw1 still alive and gw2 tried to claim => gw1 should restart 
> generating garps.
> +    check_garps "true" "false" "false"
> +    check_packets "true" "false" "false" "true"
> +
> +    echo "$(date +%H:%M:%S.%03N) Unblocking bfd on gw1"
> +    nsenter --net=/proc/$gw1_pid/ns/net nft -a list ruleset
> +    nsenter --net=/proc/$gw1_pid/ns/net nft delete table ip ovn-test
>
>      # The network is now restored => packets should go through gw1 and reach 
> chassis-1.
> -    check_for_new_echo_pkts gw1 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe "true"
> -    check_for_new_echo_pkts gw2 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe "false"
> -    check_for_new_echo_pkts gw3 00:00:c0:a8:00:01 f0:00:c0:a8:00:fe "false"
> -    check_for_new_echo_pkts ch1 f0:00:c0:a8:01:fe f0:00:c0:a8:01:01 "true"
> -    AS_BOX([$(date +%H:%M:%S.%03N) Statistics after network restored])
> -    dump_statistics
> -    stop_sending_background_packets
> +    check_packets "true" "false" "false" "true"
> +    final_check "bfd_stop" $lost $loss2
> +}
> +
> +check_migration_between_gw1_and_gw2_kill_gw2() {
> +    AS_BOX([$(date +%H:%M:%S.%03N) Check migration after killing gw2 
> ovn-controller & vswitchd])
> +    loss1=$1
> +    loss2=$2
> +    prepare
> +
> +    on_exit 'm_as ovn-gw-2 /usr/share/openvswitch/scripts/ovs-ctl status ||
> +             m_as ovn-gw-2 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-2'
> +    on_exit 'm_as ovn-gw-2 /usr/share/ovn/scripts/ovn-ctl status_controller 
> ||
> +             m_as ovn-gw-2 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}'
> +
> +    m_as ovn-gw-2 kill -9 $(m_as ovn-gw-2 cat /run/ovn/ovn-controller.pid)
> +    m_as ovn-gw-2 kill -9 $(m_as ovn-gw-2 cat 
> /run/openvswitch/ovs-vswitchd.pid)
> +    m_as ovn-gw-2 kill -9 $(m_as ovn-gw-2 cat 
> /run/openvswitch/ovsdb-server.pid)
> +    m_as ovn-gw-2 ovs-dpctl del-dp system@ovs-system
> +
> +    check_loss_after_flap "gw2" $loss1
> +
> +    # gw1 still alive, but gw2 did not try to claim => gw1 should not 
> generate new garps.
> +    check_garps "false" "false" "false"
> +    check_packets "true" "false" "false" "true"
> +
> +    echo "$(date +%H:%M:%S.%03N) Restarting gw2 ovs-vswitchd"
> +    m_as ovn-gw-2 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-2
> +
> +    echo "$(date +%H:%M:%S.%03N) Restarting gw2 ovn-controller"
> +    m_as ovn-gw-2 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}
> +
> +    # The network is now restored => packets should go through gw1 and reach 
> chassis-1.
> +    check_packets "true" "false" "false" "true"
> +    final_check "kill_gw2" $lost $loss2
> +}
> +
> +check_migration_between_gw1_and_gw2_update_ovs() {
> +    AS_BOX([$(date +%H:%M:%S.%03N) Check migration after restarting gw1 
> ovs-vswitchd ("update")])
> +    loss1=$1
> +    loss2=$2
> +    prepare
> +
> +    m_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl restart 
> --system-id=ovn-gw-1
> +
> +    check_loss_after_flap "gw1" $loss1
> +
> +    # The network is now restored => packets should go through gw1 and reach 
> chassis-1.
> +    check_packets "true" "false" "false" "true"
> +    final_check "ovs_update" $lost $loss2
> +}
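Regarding my flow-restore-wait question above: if I read the scripts correctly, `ovs-ctl restart` implements the upgrade sequence documented for the Open_vSwitch table, i.e. it sets the flag to true before starting the new ovs-vswitchd and clears it only after restoring the saved flows. Roughly this (a sketch of the documented procedure, not the literal ovs-ctl code; `br-int` assumed):

```shell
# Documented flow-restore-wait sequence; 'ovs-ctl restart' automates
# roughly these steps.

# Save flows and tell the next ovs-vswitchd to wait before processing packets.
/usr/share/openvswitch/scripts/ovs-save save-flows br-int > flows.sh
ovs-vsctl set Open_vSwitch . other_config:flow-restore-wait="true"

# ... stop and start ovs-vswitchd here; while the flag is set it neither
# handles upcalls nor runs BFD/LACP/CFM ...

# Restore the saved flows, then clear the flag to resume normal processing.
sh flows.sh
ovs-vsctl remove Open_vSwitch . other_config flow-restore-wait
```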
> +
> +check_migration_between_gw1_and_gw2_kill_gw1() {
> +    AS_BOX([$(date +%H:%M:%S.%03N) Killing gw1 ovn-controller and 
> ovs-vswitchd])
> +    loss1=$1
> +    loss2=$2
> +    prepare
> +
> +    on_exit 'm_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl status ||
> +             m_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-1'
> +    on_exit 'm_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl status_controller 
> ||
> +             m_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}'
> +
> +    m_as ovn-gw-1 kill -9 $(m_as ovn-gw-1 cat /run/ovn/ovn-controller.pid)
> +    m_as ovn-gw-1 kill -9 $(m_as ovn-gw-1 cat 
> /run/openvswitch/ovs-vswitchd.pid)
> +    m_as ovn-gw-1 kill -9 $(m_as ovn-gw-1 cat 
> /run/openvswitch/ovsdb-server.pid)
> +    # Also delete datapath (flows)
> +    m_as ovn-gw-1 ovs-dpctl del-dp system@ovs-system
> +
> +    check_loss_after_flap "gw1" $loss1
> +
> +    # gw1 died => gw2 should generate garps.
> +    check_garps "false" "true" "false"
> +    check_packets "false" "true" "false" "true"
> +    echo "$(date +%H:%M:%S.%03N) Restarting gw1 ovs-vswitchd after killing gw1"
> +    m_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-1
> +
> +    # Wait a while before restarting ovn-controller.
> +    sleep 10
> +
> +    # gw2 should still be handling packets as ovn-controller has not yet been restarted on gw1.
> +    check_packets "false" "true" "false" "true"
> +
> +    echo "$(date +%H:%M:%S.%03N) Restarting gw1 ovn-controller after killing 
> gw1"
> +    m_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}
> +
> +    # The network is now restored => packets should go through gw1 and reach 
> chassis-1.
> +    check_packets "true" "false" "false" "true"
> +    final_check "kill_gw1" $lost $loss2
> +}
> +
> +check_migration_between_gw1_and_gw2_reboot_gw1() {
> +    ip_gw1_eth1=$(podman exec ovn-gw-1 ip -brief address show eth1 | awk 
> '{print $3}' | cut -d/ -f1)
> +    cidr=$(podman exec ovn-gw-1 ip -brief address show eth1 | awk '{print 
> $3}' | cut -d/ -f2)
> +    AS_BOX([$(date +%H:%M:%S.%03N) Rebooting ovn-gw-1 with 
> $ip_gw1_eth1/$cidr])
> +    loss1=$1
> +    loss2=$2
> +    prepare
> +
> +    podman stop -t 0 ovn-gw-1
> +    (exec 3>&- 4>&- 5>&- 6>&-; podman start ovn-gw-1)
> +
> +    add_port br-ovn eth1 $ip_gw1_eth1/$cidr
> +    add_port br-ovn-ext eth2
> +    M_START_TCPDUMPS([ovn-gw-1], [-neei eth2], [gw1], [ovn-gw-1], [-neei 
> eth1], [gw1_eth1], [ovn-gw-1], [-neei eth2 -Q out], [gw1_out])
> +    check_loss_after_flap "gw1" $loss1
> +
> +    # gw1 died => gw2 should generate garps.
> +    check_garps "false" "true" "false"
> +    check_packets "false" "true" "false" "true"
> +
> +    echo "$(date +%H:%M:%S.%03N) Restarting gw1 ovs-vswitchd after rebooting gw1"
> +    m_as ovn-gw-1 /usr/share/openvswitch/scripts/ovs-ctl start 
> --system-id=ovn-gw-1
> +
> +    # Wait a while before restarting ovn-controller.
> +    sleep 10
> +
> +    # gw2 should still be handling packets as ovn-controller has not yet been restarted on gw1.
> +    check_packets "false" "true" "false" "true"
> +
> +    echo "$(date +%H:%M:%S.%03N) Restarting gw1 ovn-controller after 
> rebooting gw1"
> +    m_as ovn-gw-1 /usr/share/ovn/scripts/ovn-ctl start_controller 
> ${CONTROLLER_SSL_ARGS}
> +
> +    # The network is now restored => packets should go through gw1 and reach 
> chassis-1.
> +    check_packets "true" "false" "false" "true"
> +    final_check "reboot_gw1" $lost $loss2
>  }
>
>  start_tcpdump
> -AS_BOX([$(date +%H:%M:%S.%03N) Sending packet from hv1-vif1(inside1) to 
> ext1])
> +echo "$(date +%H:%M:%S.%03N) Sending packet from hv1-vif1(inside1) to ext1"
>  M_NS_CHECK_EXEC([ovn-chassis-1], [hv1-vif1], [ping -c3 -q -i 0.1 192.168.0.1 
> | FORMAT_PING],
>  [0], [dnl
>  3 packets transmitted, 3 received, 0% packet loss, time 0ms
> @@ -3433,7 +3591,7 @@ M_NS_CHECK_EXEC([ovn-chassis-1], [hv1-vif1], [ping -c3 
> -q -i 0.1 192.168.0.1 | F
>  stop_tcpdump
>
>  # It should have gone through gw1 and not gw2
> -AS_BOX([$(date +%H:%M:%S.%03N) Checking it went through gw1 and not gw2])
> +echo "$(date +%H:%M:%S.%03N) Checking it went through gw1 and not gw2"
>  AT_CHECK([cat gw2.tcpdump | grep "ICMP echo"], [1], [dnl
>  ])
>
> @@ -3446,17 +3604,25 @@ f0:00:c0:a8:00:fe > 00:00:c0:a8:00:01, ethertype IPv4 
> (0x0800), length 98: 192.1
>  00:00:c0:a8:00:01 > f0:00:c0:a8:00:fe, ethertype IPv4 (0x0800), length 98: 
> 192.168.0.1 > 192.168.1.1: ICMP echo reply,
>  ])
>
> -# We stop bfd between gw1 & gw2, but keep gw1 & gw2 running.
> -check_migration_between_gw1_and_gw2 "stop_bfd"
> +# We stop bfd between gw1 & gw2, but keep gw1 & gw2 running. We should not 
> lose packets.
> +check_migration_between_gw1_and_gw2_bfd_stop 1 1
>
>  # We simulate death of gw2. It should not have any effect.
> -check_migration_between_gw1_and_gw2 "kill_gw2"
> +check_migration_between_gw1_and_gw2_kill_gw2 1 1
> +
> +# We simulate an OVS update on gw1. While OVS is stopped, flows should still be
> +# handled by the kernel datapath. When OVS is restarted, BFD should go down
> +# immediately, and gw2 will start handling packets. There will be some packet
> +# loss as gw2 will usually see BFD from gw1 up (and hence release the port)
> +# before gw1 sees BFD up (and claims the port).
> +check_migration_between_gw1_and_gw2_update_ovs 20 1
> +
> +# We simulate restart of both OVS and OVN on gw1. gw2 should take over.
> +check_migration_between_gw1_and_gw2_kill_gw1 40 20
>
> -# We simulate death of gw1. gw2 should take over.
> -check_migration_between_gw1_and_gw2 "kill_gw1"
> +# We simulate reboot of gw1. gw2 should take over.
> +check_migration_between_gw1_and_gw2_reboot_gw1 40 20
>
>  AT_CLEANUP
> -])
>
>  AT_SETUP([ovn multinode bgp L2 EVPN])
>  check_fake_multinode_setup
> --
> 2.47.1
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
