On 5/28/26 10:16 PM, Mark Michelson via dev wrote:
> In commit 621f85e92437 ("controller: Fix bfd up too early after
> unexpected reboot."), OVN would manipulate the flow-restore-wait flag in
> order to attempt to synchronize with OVS. The previous commit reverted
> this change due to a potential deadlock between OVS and OVN.
> 
> If users were using a version of OVN that had commit 621f85e92437, then
> the revert will help to ensure that OVS and OVN won't deadlock any
> longer. However, the revert on its own does not resolve the deadlock if
> the previously running version of ovn-controller set flow-restore-wait
> to true. It requires manual intervention to get flow-restore-wait set to
> false or deleted from the Open_vSwitch table.
> 
> This commit seeks to assist by clearing the flow-restore-wait key in the
> Open_vSwitch table. This way, on upgrade, the deadlock is resolved
> without manually setting any OVS database values.
> 
> Reported-at: https://redhat.atlassian.net/browse/FDP-3862
> Fixes: 621f85e92437 ("controller: Fix bfd up too early after unexpected reboot.")
> Signed-off-by: Mark Michelson <[email protected]>
> ---

Hi Mark,

This looks good to me, thanks!

Acked-by: Dumitru Ceara <[email protected]>

>  controller/ovn-controller.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 7e6c9e69a..ad094a454 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -7895,6 +7895,24 @@ main(int argc, char *argv[])
>                  ovsrec_open_vswitch_update_other_config_setkey(
>                      cfg, "vlan-limit", "0");
>              }
> +            /* Clear flow-restore-wait. OVN at one point would set
> +             * flow-restore-wait in order to try to synchronize with
> +             * OVS. However, that resulted in a bug, so that behavior
> +             * was reverted. If upgrading from a version where OVN
> +             * manipulated flow-restore-wait, then flow-restore-wait
> +             * needs to be cleared in order for OVS to function
> +             * properly. This is (hopefully) a temporary measure until
> +             * a more reliable method of synchronizing with OVS is
> +             * devised.
> +             */
> +            if (smap_get_bool(&cfg->external_ids,
> +                              "ovn-managed-flow-restore-wait", false) &&
> +                smap_get(&cfg->other_config, "flow-restore-wait")) {
> +                ovsrec_open_vswitch_update_other_config_delkey(
> +                    cfg, "flow-restore-wait");
> +                ovsrec_open_vswitch_update_external_ids_delkey(
> +                    cfg, "ovn-managed-flow-restore-wait");
> +            }
>          }
>  
>          static bool chassis_idx_stored = false;
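
For anyone following the thread without an OVS tree at hand, the guard in
the hunk above can be sketched as a standalone program. This is only an
illustration under stated assumptions: `struct kv_map`, `struct toy_cfg`,
the `kv_*` helpers and `clear_stale_flow_restore_wait()` are hypothetical
stand-ins for the real smap and ovsrec calls, and `kv_get_bool()` is a
simplified version of the boolean lookup (exact match on "true" only).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define KV_MAX 8

/* Hypothetical stand-in for an smap: a tiny fixed-size key/value list. */
struct kv { const char *key; const char *value; };
struct kv_map { struct kv items[KV_MAX]; size_t n; };

/* Hypothetical stand-in for the Open_vSwitch row used in the patch. */
struct toy_cfg {
    struct kv_map external_ids;
    struct kv_map other_config;
};

/* Append a pair. Duplicate keys are not handled in this sketch. */
static void kv_put(struct kv_map *m, const char *key, const char *value)
{
    assert(m->n < KV_MAX);
    m->items[m->n].key = key;
    m->items[m->n].value = value;
    m->n++;
}

/* Return the value stored for 'key', or NULL if the key is absent. */
static const char *kv_get(const struct kv_map *m, const char *key)
{
    for (size_t i = 0; i < m->n; i++) {
        if (!strcmp(m->items[i].key, key)) {
            return m->items[i].value;
        }
    }
    return NULL;
}

/* Simplified boolean lookup: 'def' if the key is absent, otherwise
 * true only when the stored value is exactly "true". */
static bool kv_get_bool(const struct kv_map *m, const char *key, bool def)
{
    const char *value = kv_get(m, key);
    return value ? !strcmp(value, "true") : def;
}

/* Remove 'key' if present. Returns true if something was removed. */
static bool kv_del(struct kv_map *m, const char *key)
{
    for (size_t i = 0; i < m->n; i++) {
        if (!strcmp(m->items[i].key, key)) {
            m->items[i] = m->items[--m->n];
            return true;
        }
    }
    return false;
}

/* Mirror of the guard in the patch: delete flow-restore-wait only when
 * the ovn-managed-flow-restore-wait marker is true and the key exists,
 * then drop the marker as well. Returns true if anything was cleared. */
static bool clear_stale_flow_restore_wait(struct toy_cfg *cfg)
{
    if (kv_get_bool(&cfg->external_ids,
                    "ovn-managed-flow-restore-wait", false)
        && kv_get(&cfg->other_config, "flow-restore-wait")) {
        kv_del(&cfg->other_config, "flow-restore-wait");
        kv_del(&cfg->external_ids, "ovn-managed-flow-restore-wait");
        return true;
    }
    return false;
}
```

In this sketch, a flow-restore-wait key that is present without the
ovn-managed-flow-restore-wait marker is left alone, which matches the
condition in the hunk.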


Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
