On Fri, Jan 31, 2025 at 11:16 AM Xavier Simonart <xsimo...@redhat.com> wrote:
> bfd_chassis node was recomputed when sb_global columns such as nb_cfg > were updated. If, in addition, SB was read only at that moment (i.e. an > transactiuon was in fly), then the update was cancelled and resulted > in a full recompute. > > This was highlighted as causing random failures in tests such as > "ACL with Port Group conjunction flow efficiency". > > Signed-off-by: Xavier Simonart <xsimo...@redhat.com> > --- > Hi Xavier, thank you for the patch. I'm slightly confused, it doesn't seem like we need the SB Global dependency at all for the BFD node. It is needed only for the bfd_run() which takes it directly from DB txn so IMO there is no need for this node to have this dependency would you agree? > controller/ovn-controller.c | 15 +++++++- > tests/system-ovn.at | 69 +++++++++++++++++++++++++++++++++++++ > 2 files changed, 83 insertions(+), 1 deletion(-) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index 63c787bde..41214ac61 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -3400,6 +3400,18 @@ en_bfd_chassis_run(struct engine_node *node, void > *data OVS_UNUSED) > engine_set_node_state(node, EN_UPDATED); > } > > +static bool > +en_bfd_chassis_sb_sb_global_handler(struct engine_node *node, > + void *data OVS_UNUSED) > +{ > + const struct sbrec_sb_global_table *sb_global_table = > + EN_OVSDB_GET(engine_get_input("SB_sb_global", node)); > + const struct sbrec_sb_global *sb_global = > + sbrec_sb_global_table_first(sb_global_table); > + /* Only recompute bfd_chassis is the options column get updated */ > + return !sbrec_sb_global_is_updated(sb_global, > SBREC_SB_GLOBAL_COL_OPTIONS); > +} > + > static void > en_bfd_chassis_cleanup(void *data OVS_UNUSED){ > struct ed_type_bfd_chassis *bfd_chassis = data; > @@ -5244,7 +5256,8 @@ main(int argc, char *argv[]) > engine_add_input(&en_if_status_mgr, &en_ovs_interface, > if_status_mgr_ovs_interface_handler); > engine_add_input(&en_bfd_chassis, &en_ovs_open_vswitch, NULL); > - engine_add_input(&en_bfd_chassis, &en_sb_sb_global, NULL); > + engine_add_input(&en_bfd_chassis, &en_sb_sb_global, > + en_bfd_chassis_sb_sb_global_handler); > engine_add_input(&en_bfd_chassis, &en_sb_chassis, NULL); > engine_add_input(&en_bfd_chassis, &en_sb_ha_chassis_group, NULL); > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > index c7379bedb..0a9771018 100644 > --- a/tests/system-ovn.at > +++ b/tests/system-ovn.at > @@ -15465,3 +15465,72 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d > /.*terminating with signal 15.*/d"]) > AT_CLEANUP > ]) > + > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([bfd induced recomputes]) > +ovn_start --use-tcp-to-sb > +OVS_TRAFFIC_VSWITCHD_START() > +ADD_BR([br-int]) > +ADD_BR([br-phys]) > + > +dnl Set external-ids in br-int needed for ovn-controller > +PARSE_LISTENING_PORT([$ovs_base/ovn-sb/ovsdb-server.log], [TCP_PORT]) > + > +check ovs-vsctl \ > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > + -- set Open_vSwitch . external-ids:ovn-remote=tcp:127.0.0.1:$TCP_PORT > \ > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > + -- set bridge br-int fail-mode=secure > other-config:disable-in-band=true > + > +start_daemon ovn-controller > + > +check ovn-nbctl ls-add ls > +check ovn-nbctl lsp-add ls lsp1 -- lsp-set-addresses lsp1 > "00:00:10:01:02:01 10.0.0.1" > +check ovs-vsctl set open . > external_ids:ovn-bridge-mappings=physnet1:br-phys > +check ovs-vsctl add-port br-int vif1 -- set Interface vif1 > external-ids:iface-id=lsp1 -- set Interface vif1 type=internal > + > +check ovn-nbctl --wait=hv sync > +wait_for_ports_up > + > +# Stop updates from ovn-controller to sb > +stop_ovsdb_controller_updates $TCP_PORT > + > +# Change sb_global:options:nb_cfg; this will trigger updates to > (Chassis_Private in ) sb by ovn-controller. > +check ovn-nbctl --wait=sb sync > +# sb should now be read-only. > + > +lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) > + > +# Change again nb_cfg. > +check ovn-nbctl --wait=sb sync > + > +# Re-enable notifications from sb. > +restart_ovsdb_controller_updates $TCP_PORT > + > +# Make sure ovn-controller run. > +check ovn-nbctl --wait=hv sync > + > +lflow_run_end=$(ovn-appctl -t ovn-controller coverage/read-counter > lflow_run) > + > +n_recomputes=`expr $lflow_run_end - $lflow_run` > +echo "$n_recomputes recomputes" > + > +AT_CHECK([test $lflow_run_end == $lflow_run]) > + > +OVN_CLEANUP_CONTROLLER([hv1]) > + > +as ovn-sb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as ovn-nb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as northd > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > + > +as > +OVS_TRAFFIC_VSWITCHD_STOP() > + > +AT_CLEANUP > +]) > -- > 2.47.1 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > Thanks, Ales _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev