On 6/17/26 11:54 PM, Odintsov Vladislav wrote:
> Hi Lorenzo,
>

Hi Vladislav, Lorenzo,

Adding some of the RH OpenStack folks (Yatin, Rodolfo) to the thread as
they're the ones needing this feature.

> I’ve quickly read through the commit message and the code and a bit concerned
> about the approach for ensuring that migration is safe to run based on the
> information of openflow flows installation is finished or not only on the
> destination for VM migration node.

The original request from Red Hat OpenStack devs [0] was to have a way to
determine when the ovn-controller handling the additional chassis of a port
binding has installed the required openflow rules for that port binding,
and to delay triggering the live migration until then.

IIUC, the concern they had was that without such a mechanism the following
could happen:

0. live migration is needed (not started yet)
1. neutron sets additional chassis in the VM's NB LSP
2. neutron assumes ovn-controller on "source" and "target" chassis
   processed the NB update.
3. live migration starts

The problem is at step "2".  The actual flow of configuration is:

neutron -> NB -> ovn-northd -> SB -> ovn-controllers.

If any of these steps takes "longer" then live migration might be
triggered (step "3") before ovn-controller on the "target" has managed to
set up the openflow rules that are required to deliver traffic to the "new"
version of the VM during live migration.

> From my perspective having a converged state only on the destination node is
> not enough. Imagine situation, where node A (source node) hosts a running VM
> M, node B is a destination node for VM M migration and node C is another
> node, which hosts VM N. If CMS uses “additional-chassis” feature for VM
> migration, so while VM M is migrating, traffic from VM N (node C) towards VM
> M must be cloned and delivered to both nodes (A and B).
> And in case we’ve finished awaiting of node B (dest) OF flows
> computation/installation, we still may have an outdated state on node C and
> finalized migration can break connectivity for some time.
>
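
To make the scenario above concrete, here is a minimal Python sketch.  It
is not OVN or neutron code: the function names, the node names and the
"applied_cfg" mapping (chassis name -> last configuration sequence number
that chassis reports having applied) are assumptions made up for
illustration, and how a CMS would actually collect those values from the
databases is left out.  It only contrasts a destination-only readiness
check with a check across all chassis.

```python
from typing import List, Mapping, Optional


def destination_ready(ready_option: Optional[str], dest: str) -> bool:
    """Check whether `dest` is listed in a comma-separated
    additional-chassis-ready value (None means the key is unset)."""
    if not ready_option:
        return False
    return dest in {name.strip() for name in ready_option.split(",")}


def lagging_chassis(applied_cfg: Mapping[str, int],
                    target_cfg: int) -> List[str]:
    """Return the chassis whose applied sequence number is below
    target_cfg, sorted by name."""
    return sorted(name for name, cfg in applied_cfg.items()
                  if cfg < target_cfg)


# Hypothetical snapshot: the additional chassis was requested at sequence 7.
# node-b (dest) has caught up, node-c (hosting VM N) has not.
ready_option = "node-b"
applied_cfg = {"node-a": 7, "node-b": 7, "node-c": 6}

print(destination_ready(ready_option, "node-b"))  # True
print(lagging_chassis(applied_cfg, 7))            # ['node-c']
```

In this snapshot the destination-only check passes while node-c still
lags, which is the window described above.
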

The goal of this patch was to inform neutron about node B (dest) being
ready to process incoming traffic and only trigger live migration then.

But you make a good point here.  If node C is slow then it might not have
processed the additional chassis and might not yet have set up the flows to
duplicate traffic to both node A and node B.

> So, I’ve got some questions:
> 1. Shouldn’t CMS just either bump NB nb_cfg and monitor summarized hv_cfg
> value in NB DB or bump NB nb_cfg and monitor per-chassis hv_cfg value in SB
> DB instead of this approach?

I think that would work, but I'm not sure how much effort that would be on
the neutron side.  Yatin, Rodolfo, what do you guys think?

> 2. Do we need a new mechanism, which covers only nodes’ state, which act as
> additional chassis, but not as other nodes, which interact with it?
>

Are you asking if we need this patch? :)

In any case, Lorenzo, maybe it's good to wait a bit with merging this patch
until we clarify whether it fully meets the needs of OpenStack and if
neutron has ways to mitigate Vladislav's concerns.

Regards,
Dumitru

[0] https://redhat.atlassian.net/browse/FDP-2903

> regards,
> Vladislav Odintsov
>
>> On 17 Jun 2026, at 20:13, Lorenzo Bianconi via dev <[email protected]>
>> wrote:
>>
>> During VM live migration, a CMS needs to know when the destination
>> chassis has finished installing OpenFlow flows before it can safely
>> start the VM. Currently, ovn-controller sets ovn-installed on the
>> local OVS interface, but this information is not reflected back to
>> the Southbound DB, requiring a per-chassis agent to monitor readiness.
>> Add a new Port_Binding options:additional-chassis-ready key that
>> contains a comma-separated list of chassis names that have completed
>> flow installation as additional chassis. ovn-controller sets this
>> when local_binding_set_up() is called for an additional chassis
>> binding, and clears it when the chassis is released.
>> The option is preserved by northd during Port_Binding option rebuilds,
>> gated on requested_additional_chassis being set, so it is automatically
>> cleaned up when migration completes.
>> This differs from the existing additional-chassis-activated option
>> which is traffic-triggered (RARP/GARP/NA via activation strategy).
>> The new option is flow-installation-triggered and always-on.
>>
>> Assisted-by: Claude Opus 4.6, Claude Code
>> Signed-off-by: Lorenzo Bianconi <[email protected]>
>> ---
>> Changes in v2:
>> - Added NEWS entry.
>> - Replaced custom is_chassis_in_list() and remove_chassis_from_list()
>>   with sset_from_delimited_string()/sset_contains()/
>>   sset_find_and_delete()/sset_join(), avoiding double-parsing in
>>   the release path. [Dumitru]
>> - Renamed LIST_FOR_EACH iterator in local_binding_set_up() to avoid
>>   confusing reuse of b_lport. [Dumitru]
>> - Tests: use fetch_column, remove TAG_UNSTABLE, add --wait=hv
>>   synchronization, convert wait_column to check_column where
>>   applicable, fix comment style. [Dumitru]
>> ---
>>  NEWS                 |   2 +
>>  controller/binding.c |  54 ++++++++++++++--
>>  northd/northd.c      |   5 ++
>>  ovn-sb.xml           |  12 ++++
>>  tests/ovn.at         | 143 +++++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 212 insertions(+), 4 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 748ae30eb..5bb727c8a 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -1,5 +1,7 @@
>>  Post v26.03.0
>>  -------------
>> +  - Added Port_Binding options:additional-chassis-ready to report per-chassis
>> +    flow installation readiness to the Southbound DB during live migration.
>>    - Added ability to set any "ipsec_*" NB_Global option to configure the
>>      IPsec backend.
>>    - Documented missing ovn-nbctl commands: "mirror-rule-add",
>> diff --git a/controller/binding.c b/controller/binding.c
>> index de51be823..b14cf020a 100644
>> --- a/controller/binding.c
>> +++ b/controller/binding.c
>> @@ -1031,11 +1031,35 @@ local_binding_set_up(struct shash *local_bindings, const char *pb_name,
>>                                     ts_now_str);
>>      }
>>
>> -    if (!sb_readonly && lbinding && b_lport && b_lport->pb->n_up &&
>> -            !b_lport->pb->up[0] && b_lport->pb->chassis == chassis_rec) {
>> -        binding_lport_set_up(b_lport, sb_readonly);
>> -        LIST_FOR_EACH (b_lport, list_node, &lbinding->binding_lports) {
>> +    if (!sb_readonly && lbinding && b_lport) {
>> +        if (b_lport->pb->n_up && !b_lport->pb->up[0] &&
>> +            b_lport->pb->chassis == chassis_rec) {
>>              binding_lport_set_up(b_lport, sb_readonly);
>> +            struct binding_lport *iter;
>> +            LIST_FOR_EACH (iter, list_node, &lbinding->binding_lports) {
>> +                binding_lport_set_up(iter, sb_readonly);
>> +            }
>> +        }
>> +
>> +        if (is_additional_chassis(b_lport->pb, chassis_rec)) {
>> +            const char *current = smap_get(&b_lport->pb->options,
>> +                                           "additional-chassis-ready");
>> +            if (!current) {
>> +                sbrec_port_binding_update_options_setkey(
>> +                    b_lport->pb, "additional-chassis-ready",
>> +                    chassis_rec->name);
>> +            } else {
>> +                struct sset ready_set;
>> +                sset_from_delimited_string(&ready_set, current, ",");
>> +                if (!sset_contains(&ready_set, chassis_rec->name)) {
>> +                    char *val = xasprintf("%s,%s", current,
>> +                                          chassis_rec->name);
>> +                    sbrec_port_binding_update_options_setkey(
>> +                        b_lport->pb, "additional-chassis-ready", val);
>> +                    free(val);
>> +                }
>> +                sset_destroy(&ready_set);
>> +            }
>>          }
>>      }
>>  }
>> @@ -1570,6 +1594,28 @@ release_lport_additional_chassis(const struct sbrec_port_binding *pb,
>>          remove_additional_chassis(pb, chassis_rec);
>>      }
>>
>> +    const char *ready = smap_get(&pb->options, "additional-chassis-ready");
>> +    if (ready) {
>> +        struct sset ready_set;
>> +        sset_from_delimited_string(&ready_set, ready, ",");
>> +        if (sset_find_and_delete(&ready_set, chassis_rec->name)) {
>> +            if (sb_readonly) {
>> +                sset_destroy(&ready_set);
>> +                return false;
>> +            }
>> +            if (!sset_is_empty(&ready_set)) {
>> +                char *updated = sset_join(&ready_set, ",", "");
>> +                sbrec_port_binding_update_options_setkey(
>> +                    pb, "additional-chassis-ready", updated);
>> +                free(updated);
>> +            } else {
>> +                sbrec_port_binding_update_options_delkey(
>> +                    pb, "additional-chassis-ready");
>> +            }
>> +        }
>> +        sset_destroy(&ready_set);
>> +    }
>> +
>>      VLOG_INFO("Releasing lport %s from this additional chassis.",
>>                pb->logical_port);
>>      return true;
>> diff --git a/northd/northd.c b/northd/northd.c
>> index 0dbf17426..71b3ca9c1 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -2871,6 +2871,11 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
>>                      smap_add(&options, "additional-chassis-activated",
>>                               activated_str);
>>                  }
>> +                const char *ready_str = smap_get(&op->sb->options,
>> +                                                 "additional-chassis-ready");
>> +                if (ready_str) {
>> +                    smap_add(&options, "additional-chassis-ready", ready_str);
>> +                }
>>              }
>>
>>              /* Preserve virtual port options. */
>> diff --git a/ovn-sb.xml b/ovn-sb.xml
>> index e45b63d73..5175c523a 100644
>> --- a/ovn-sb.xml
>> +++ b/ovn-sb.xml
>> @@ -3855,6 +3855,18 @@ tcp.flags = RST;
>>          that the port was activated using the strategy specified.
>>        </column>
>>
>> +      <column name="options" key="additional-chassis-ready">
>> +        A comma-separated list of chassis names that have finished installing
>> +        OpenFlow flows for this port binding as an additional chassis.  Set by
>> +        <code>ovn-controller</code> when the interface reaches the
>> +        <code>ovn-installed</code> state on the additional chassis.  This
>> +        allows a CMS to monitor the Southbound DB for migration readiness
>> +        without requiring an agent on each chassis.  The option is
>> +        automatically cleaned up when the chassis is removed from
>> +        <ref column="additional_chassis"/> or when
>> +        <ref column="requested_additional_chassis"/> is cleared.
>> +      </column>
>> +
>>        <column name="options" key="iface-id-ver">
>>          If set, this port will be bound by <code>ovn-controller</code>
>>          only if this same key and value is configured in the
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index 522c1c90d..c19227e98 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -17336,6 +17336,149 @@ OVN_CLEANUP([hv1],[hv2])
>>  AT_CLEANUP
>>  ])
>>
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([options:additional-chassis-ready for logical port])
>> +AT_KEYWORDS([multi-chassis])
>> +ovn_start
>> +
>> +net_add n1
>> +
>> +sim_add hv1
>> +as hv1
>> +check ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.11
>> +
>> +sim_add hv2
>> +as hv2
>> +check ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.12
>> +
>> +check ovn-nbctl ls-add ls0
>> +check ovn-nbctl lsp-add ls0 lsp0
>> +
>> +# Allow only chassis hv1 to bind logical port lsp0.
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1
>> +
>> +as hv1 check ovs-vsctl -- add-port br-int lsp0 -- \
>> +    set Interface lsp0 external-ids:iface-id=lsp0
>> +as hv2 check ovs-vsctl -- add-port br-int lsp0 -- \
>> +    set Interface lsp0 external-ids:iface-id=lsp0
>> +
>> +wait_row_count Chassis 1 name=hv1
>> +wait_row_count Chassis 1 name=hv2
>> +hv1_uuid=$(fetch_column Chassis _uuid name=hv1)
>> +hv2_uuid=$(fetch_column Chassis _uuid name=hv2)
>> +
>> +wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
>> +wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=lsp0
>> +wait_column "" Port_Binding additional_chassis logical_port=lsp0
>> +wait_column "" Port_Binding requested_additional_chassis logical_port=lsp0
>> +
>> +pb_uuid=$(fetch_column Port_Binding _uuid logical_port=lsp0)
>> +
>> +# additional-chassis-ready should not be set yet.
>> +AT_CHECK([ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready 2>/dev/null], [1], [ignore], [ignore])
>> +
>> +# Request port binding at an additional chassis (simulate migration start).
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 \
>> +    requested-chassis=hv1,hv2
>> +
>> +check_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
>> +check_column "$hv2_uuid" Port_Binding additional_chassis logical_port=lsp0
>> +check_column "$hv2_uuid" Port_Binding requested_additional_chassis logical_port=lsp0
>> +
>> +# Verify additional-chassis-ready=hv2 is set in Port_Binding options.
>> +OVS_WAIT_UNTIL([test xhv2 = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready | tr -d '""')])
>> +
>> +# Complete migration: move binding to hv2.
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv2
>> +
>> +check_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
>> +check_column "$hv2_uuid" Port_Binding requested_chassis logical_port=lsp0
>> +check_column "" Port_Binding additional_chassis logical_port=lsp0
>> +check_column "" Port_Binding requested_additional_chassis logical_port=lsp0
>> +
>> +# Verify additional-chassis-ready is cleared after migration completes.
>> +OVS_WAIT_UNTIL([test x = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready 2>/dev/null)])
>> +
>> +OVN_CLEANUP([hv1
>> +ignored_dp=ls0],[hv2])
>> +
>> +AT_CLEANUP
>> +])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([options:additional-chassis-ready with multiple additional chassis])
>> +AT_KEYWORDS([multi-chassis])
>> +ovn_start
>> +
>> +net_add n1
>> +
>> +for i in 1 2 3; do
>> +    sim_add hv$i
>> +    as hv$i
>> +    check ovs-vsctl add-br br-phys
>> +    ovn_attach n1 br-phys 192.168.0.1$i
>> +done
>> +
>> +check ovn-nbctl ls-add ls0
>> +check ovn-nbctl lsp-add ls0 lsp0
>> +
>> +# Bind the port to hv1 initially.
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1
>> +
>> +for i in 1 2 3; do
>> +    as hv$i check ovs-vsctl -- add-port br-int lsp0 -- \
>> +        set Interface lsp0 external-ids:iface-id=lsp0
>> +done
>> +
>> +for i in 1 2 3; do
>> +    wait_row_count Chassis 1 name=hv$i
>> +done
>> +hv1_uuid=$(fetch_column Chassis _uuid name=hv1)
>> +hv2_uuid=$(fetch_column Chassis _uuid name=hv2)
>> +hv3_uuid=$(fetch_column Chassis _uuid name=hv3)
>> +
>> +wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
>> +
>> +pb_uuid=$(fetch_column Port_Binding _uuid logical_port=lsp0)
>> +
>> +# Request binding at two additional chassis.
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1,hv2,hv3
>> +
>> +check_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
>> +check_column "$hv2_uuid $hv3_uuid" Port_Binding additional_chassis logical_port=lsp0
>> +
>> +# Verify additional-chassis-ready contains both hv2 and hv3.
>> +OVS_WAIT_UNTIL([
>> +    ready=$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready | tr -d '""')
>> +    echo "$ready" | grep -q hv2 && echo "$ready" | grep -q hv3
>> +])
>> +
>> +# Remove hv3 from additional chassis.
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1,hv2
>> +
>> +check_column "$hv2_uuid" Port_Binding additional_chassis logical_port=lsp0
>> +
>> +# Verify hv3 is removed from additional-chassis-ready but hv2 remains.
>> +OVS_WAIT_UNTIL([test xhv2 = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready | tr -d '""')])
>> +
>> +# Complete migration.
>> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv2
>> +
>> +check_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
>> +check_column "" Port_Binding additional_chassis logical_port=lsp0
>> +
>> +# Verify additional-chassis-ready is fully cleaned up.
>> +OVS_WAIT_UNTIL([test x = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready 2>/dev/null)])
>> +
>> +OVN_CLEANUP([hv1
>> +ignored_dp=ls0],[hv2],[hv3
>> +ignored_dp=ls0])
>> +
>> +AT_CLEANUP
>> +])
>> +
>>  OVN_FOR_EACH_NORTHD([
>>  AT_SETUP([options:requested-chassis for logical port])
>>  ovn_start
>> --
>> 2.54.0
>>
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
