Hello Dumitru, Odintsov and Lorenzo: First of all, thank you very much for working on this topic.
We (Neutron) didn't realize we could monitor the NB.nb_cfg and the
destination Chassis_Private.nb_cfg. Dumitru explained to us that the
destination host ovn-controller will update the Chassis_Private.nb_cfg
once the OF rules have been written in OVS. This is actually the trigger
we are looking for: it will be just a matter of reading the specific
NB.nb_cfg and waiting for the dest Chassis_Private.nb_cfg to catch up.

We'll work on a solution based on this information. Thank you folks!

On Thu, Jun 18, 2026 at 9:30 AM Dumitru Ceara <[email protected]> wrote:

> On 6/17/26 11:54 PM, Odintsov Vladislav wrote:
> > Hi Lorenzo,
> >
>
> Hi Vladislav, Lorenzo,
>
> Adding some of the RH OpenStack folks (Yatin, Rodolfo) to the thread as
> they're the ones needing this feature.
>
> > I’ve quickly read through the commit message and the code and a bit
> > concerned about the approach for ensuring that migration is safe to run
> > based on the information of openflow flows installation is finished or
> > not only on the destination for VM migration node.
>
> The original request from Red Hat OpenStack devs [0] was to have a way
> to determine when ovn-controller handling the additional chassis of a
> port binding has installed the required openflow rules for that port
> binding.
>
> And delay triggering the live migration.
>
> IIUC, the concern they had was that without such a mechanism the
> following could happen:
>
> 0. live migration is needed (not started yet)
> 1. neutron sets additional chassis in the VMs NB LSP
> 2. neutron assumes ovn-controller on "source" and "target" chassis
> processed the NB update.
> 3. live migration starts
>
> The problem is at step "2". The actual flow of configuration is:
> neutron -> NB -> ovn-northd -> SB -> ovn-controllers.
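
Inline note on the configuration flow above, since this is the point we
plan to synchronize on. Below is a minimal sketch of the wait we have in
mind, assuming the caller already knows the NB.nb_cfg value that covers
the additional-chassis update. The function name and the
read_chassis_nb_cfg callback are hypothetical placeholders, not Neutron or
OVN APIs; the real code would read the destination Chassis_Private.nb_cfg
through whatever OVSDB access Neutron already has.

```python
import time
from typing import Callable


def wait_for_chassis_nb_cfg(
    read_chassis_nb_cfg: Callable[[], int],
    target_nb_cfg: int,
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the destination chassis reports nb_cfg >= target_nb_cfg.

    read_chassis_nb_cfg is a hypothetical callback that returns the current
    Chassis_Private.nb_cfg of the destination chassis. Returns True once the
    chassis has caught up, or False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        # The chassis may already be past the target if later NB changes
        # were processed too, so compare with >= rather than ==.
        if read_chassis_nb_cfg() >= target_nb_cfg:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Live migration would only be triggered when this returns True. If we also
need to cover the case of a slow third node, the same loop could presumably
be fed the summarized hv_cfg from the NB DB instead of a single chassis
value.
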
>
> If any of these steps takes "longer" then live migration might be
> triggered (step "3") before ovn-controller on the "target" has managed
> to set up the openflow rules that are required to deliver traffic to the
> "new" version of the VM during live migration.
>
> > From my perspective having a converged state only on the destination
> > node is not enough. Imagine situation, where node A (source node) hosts
> > a running VM M, node B is a destination node for VM M migration and node
> > C is another node, which hosts VM N. If CMS uses “additional-chassis”
> > feature for VM migration, so while VM M is migrating, traffic from VM N
> > (node C) towards VM M must be cloned and delivered to both nodes (A and
> > B).
> > And in case we’ve finished awaiting of node B (dest) OF flows
> > computation/installation, we still may have an outdated state on node C
> > and finalized migration can break connectivity for some time.
> >
>
> The goal of this patch was to inform neutron about node B (dest) being
> ready to process incoming traffic and only trigger live migration then.
>
> But you make a good point here. If node C is slow then it might not
> have processed the additional chassis and might have not yet set up the
> flows to duplicate traffic to both node A and node B.
>
> > So, I’ve got some questions:
> > 1. Shouldn’t CMS just either bump NB nb_cfg and monitor summarized
> > hv_cfg value in NB DB or bump NB nb_cfg and monitor per-chassis hv_cfg
> > value in SB DB instead of this approach?
>
> I think that would work, I'm not sure how much effort that would be on
> the neutron side. Yatin, Rodolfo, what do you guys think?
>
> > 2. Do we need a new mechanism, which covers only nodes’ state, which
> > act as additional chassis, but not as other nodes, which interact with
> > it?
> >
>
> Are you asking if we need this patch? :)
>
> In any case, Lorenzo, maybe it's good to wait a bit with merging this
> patch until we clarify whether it fully meets the needs of OpenStack and
> if neutron has ways to mitigate Vladislav's concerns.
>
> Regards,
> Dumitru
>
> [0] https://redhat.atlassian.net/browse/FDP-2903
>
> > regards,
> > Vladislav Odintsov
> >
> >> On 17 Jun 2026, at 20:13, Lorenzo Bianconi via dev <[email protected]> wrote:
> >>
> >> During VM live migration, a CMS needs to know when the destination
> >> chassis has finished installing OpenFlow flows before it can safely
> >> start the VM. Currently, ovn-controller sets ovn-installed on the
> >> local OVS interface, but this information is not reflected back to
> >> the Southbound DB, requiring a per-chassis agent to monitor readiness.
> >> Add a new Port_Binding options:additional-chassis-ready key that
> >> contains a comma-separated list of chassis names that have completed
> >> flow installation as additional chassis. ovn-controller sets this
> >> when local_binding_set_up() is called for an additional chassis
> >> binding, and clears it when the chassis is released.
> >> The option is preserved by northd during Port_Binding option rebuilds,
> >> gated on requested_additional_chassis being set, so it is automatically
> >> cleaned up when migration completes.
> >> This differs from the existing additional-chassis-activated option
> >> which is traffic-triggered (RARP/GARP/NA via activation strategy).
> >> The new option is flow-installation-triggered and always-on.
> >>
> >> Assisted-by: Claude Opus 4.6, Claude Code
> >> Signed-off-by: Lorenzo Bianconi <[email protected]>
> >> ---
> >> Changes in v2:
> >> - Added NEWS entry.
> >> - Replaced custom is_chassis_in_list() and remove_chassis_from_list()
> >>   with sset_from_delimited_string()/sset_contains()/
> >>   sset_find_and_delete()/sset_join(), avoiding double-parsing in
> >>   the release path. [Dumitru]
> >> - Renamed LIST_FOR_EACH iterator in local_binding_set_up() to avoid
> >>   confusing reuse of b_lport. [Dumitru]
> >> - Tests: use fetch_column, remove TAG_UNSTABLE, add --wait=hv
> >>   synchronization, convert wait_column to check_column where
> >>   applicable, fix comment style. [Dumitru]
> >> ---
> >>  NEWS                 |   2 +
> >>  controller/binding.c |  54 ++++++++++++++--
> >>  northd/northd.c      |   5 ++
> >>  ovn-sb.xml           |  12 ++++
> >>  tests/ovn.at         | 143 +++++++++++++++++++++++++++++++++++++++++++
> >>  5 files changed, 212 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/NEWS b/NEWS
> >> index 748ae30eb..5bb727c8a 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -1,5 +1,7 @@
> >>  Post v26.03.0
> >>  -------------
> >> +  - Added Port_Binding options:additional-chassis-ready to report per-chassis
> >> +    flow installation readiness to the Southbound DB during live migration.
> >>    - Added ability to set any "ipsec_*" NB_Global option to configure the
> >>      IPsec backend.
> >>    - Documented missing ovn-nbctl commands: "mirror-rule-add",
> >> diff --git a/controller/binding.c b/controller/binding.c
> >> index de51be823..b14cf020a 100644
> >> --- a/controller/binding.c
> >> +++ b/controller/binding.c
> >> @@ -1031,11 +1031,35 @@ local_binding_set_up(struct shash *local_bindings, const char *pb_name,
> >>                                                      ts_now_str);
> >>      }
> >>
> >> -    if (!sb_readonly && lbinding && b_lport && b_lport->pb->n_up &&
> >> -            !b_lport->pb->up[0] && b_lport->pb->chassis == chassis_rec) {
> >> -        binding_lport_set_up(b_lport, sb_readonly);
> >> -        LIST_FOR_EACH (b_lport, list_node, &lbinding->binding_lports) {
> >> +    if (!sb_readonly && lbinding && b_lport) {
> >> +        if (b_lport->pb->n_up && !b_lport->pb->up[0] &&
> >> +            b_lport->pb->chassis == chassis_rec) {
> >>              binding_lport_set_up(b_lport, sb_readonly);
> >> +            struct binding_lport *iter;
> >> +            LIST_FOR_EACH (iter, list_node, &lbinding->binding_lports) {
> >> +                binding_lport_set_up(iter, sb_readonly);
> >> +            }
> >> +        }
> >> +
> >> +        if (is_additional_chassis(b_lport->pb, chassis_rec)) {
> >> +            const char *current = smap_get(&b_lport->pb->options,
> >> +                                           "additional-chassis-ready");
> >> +            if (!current) {
> >> +                sbrec_port_binding_update_options_setkey(
> >> +                    b_lport->pb, "additional-chassis-ready",
> >> +                    chassis_rec->name);
> >> +            } else {
> >> +                struct sset ready_set;
> >> +                sset_from_delimited_string(&ready_set, current, ",");
> >> +                if (!sset_contains(&ready_set, chassis_rec->name)) {
> >> +                    char *val = xasprintf("%s,%s", current,
> >> +                                          chassis_rec->name);
> >> +                    sbrec_port_binding_update_options_setkey(
> >> +                        b_lport->pb, "additional-chassis-ready", val);
> >> +                    free(val);
> >> +                }
> >> +                sset_destroy(&ready_set);
> >> +            }
> >>          }
> >>      }
> >>  }
> >> @@ -1570,6 +1594,28 @@ release_lport_additional_chassis(const struct sbrec_port_binding *pb,
> >>          remove_additional_chassis(pb, chassis_rec);
> >>      }
> >>
> >> +    const char *ready = smap_get(&pb->options, "additional-chassis-ready");
> >> +    if (ready) {
> >> +        struct sset ready_set;
> >> +        sset_from_delimited_string(&ready_set, ready, ",");
> >> +        if (sset_find_and_delete(&ready_set, chassis_rec->name)) {
> >> +            if (sb_readonly) {
> >> +                sset_destroy(&ready_set);
> >> +                return false;
> >> +            }
> >> +            if (!sset_is_empty(&ready_set)) {
> >> +                char *updated = sset_join(&ready_set, ",", "");
> >> +                sbrec_port_binding_update_options_setkey(
> >> +                    pb, "additional-chassis-ready", updated);
> >> +                free(updated);
> >> +            } else {
> >> +                sbrec_port_binding_update_options_delkey(
> >> +                    pb, "additional-chassis-ready");
> >> +            }
> >> +        }
> >> +        sset_destroy(&ready_set);
> >> +    }
> >> +
> >>      VLOG_INFO("Releasing lport %s from this additional chassis.",
> >>                pb->logical_port);
> >>      return true;
> >> diff --git a/northd/northd.c b/northd/northd.c
> >> index 0dbf17426..71b3ca9c1 100644
> >> --- a/northd/northd.c
> >> +++ b/northd/northd.c
> >> @@ -2871,6 +2871,11 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
> >>                      smap_add(&options, "additional-chassis-activated",
> >>                               activated_str);
> >>                  }
> >> +                const char *ready_str = smap_get(&op->sb->options,
> >> +                                                 "additional-chassis-ready");
> >> +                if (ready_str) {
> >> +                    smap_add(&options, "additional-chassis-ready", ready_str);
> >> +                }
> >>              }
> >>
> >>              /* Preserve virtual port options. */
> >> diff --git a/ovn-sb.xml b/ovn-sb.xml
> >> index e45b63d73..5175c523a 100644
> >> --- a/ovn-sb.xml
> >> +++ b/ovn-sb.xml
> >> @@ -3855,6 +3855,18 @@ tcp.flags = RST;
> >>          that the port was activated using the strategy specified.
> >>        </column>
> >>
> >> +      <column name="options" key="additional-chassis-ready">
> >> +        A comma-separated list of chassis names that have finished installing
> >> +        OpenFlow flows for this port binding as an additional chassis. Set by
> >> +        <code>ovn-controller</code> when the interface reaches the
> >> +        <code>ovn-installed</code> state on the additional chassis. This
> >> +        allows a CMS to monitor the Southbound DB for migration readiness
> >> +        without requiring an agent on each chassis. The option is
> >> +        automatically cleaned up when the chassis is removed from
> >> +        <ref column="additional_chassis"/> or when
> >> +        <ref column="requested_additional_chassis"/> is cleared.
> >> +      </column>
> >> +
> >>        <column name="options" key="iface-id-ver">
> >>          If set, this port will be bound by <code>ovn-controller</code>
> >>          only if this same key and value is configured in the
> >> diff --git a/tests/ovn.at b/tests/ovn.at
> >> index 522c1c90d..c19227e98 100644
> >> --- a/tests/ovn.at
> >> +++ b/tests/ovn.at
> >> @@ -17336,6 +17336,149 @@ OVN_CLEANUP([hv1],[hv2])
> >>  AT_CLEANUP
> >>  ])
> >>
> >> +OVN_FOR_EACH_NORTHD([
> >> +AT_SETUP([options:additional-chassis-ready for logical port])
> >> +AT_KEYWORDS([multi-chassis])
> >> +ovn_start
> >> +
> >> +net_add n1
> >> +
> >> +sim_add hv1
> >> +as hv1
> >> +check ovs-vsctl add-br br-phys
> >> +ovn_attach n1 br-phys 192.168.0.11
> >> +
> >> +sim_add hv2
> >> +as hv2
> >> +check ovs-vsctl add-br br-phys
> >> +ovn_attach n1 br-phys 192.168.0.12
> >> +
> >> +check ovn-nbctl ls-add ls0
> >> +check ovn-nbctl lsp-add ls0 lsp0
> >> +
> >> +# Allow only chassis hv1 to bind logical port lsp0.
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1
> >> +
> >> +as hv1 check ovs-vsctl -- add-port br-int lsp0 -- \
> >> +    set Interface lsp0 external-ids:iface-id=lsp0
> >> +as hv2 check ovs-vsctl -- add-port br-int lsp0 -- \
> >> +    set Interface lsp0 external-ids:iface-id=lsp0
> >> +
> >> +wait_row_count Chassis 1 name=hv1
> >> +wait_row_count Chassis 1 name=hv2
> >> +hv1_uuid=$(fetch_column Chassis _uuid name=hv1)
> >> +hv2_uuid=$(fetch_column Chassis _uuid name=hv2)
> >> +
> >> +wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
> >> +wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=lsp0
> >> +wait_column "" Port_Binding additional_chassis logical_port=lsp0
> >> +wait_column "" Port_Binding requested_additional_chassis logical_port=lsp0
> >> +
> >> +pb_uuid=$(fetch_column Port_Binding _uuid logical_port=lsp0)
> >> +
> >> +# additional-chassis-ready should not be set yet.
> >> +AT_CHECK([ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready 2>/dev/null], [1], [ignore], [ignore])
> >> +
> >> +# Request port binding at an additional chassis (simulate migration start).
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 \
> >> +    requested-chassis=hv1,hv2
> >> +
> >> +check_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
> >> +check_column "$hv2_uuid" Port_Binding additional_chassis logical_port=lsp0
> >> +check_column "$hv2_uuid" Port_Binding requested_additional_chassis logical_port=lsp0
> >> +
> >> +# Verify additional-chassis-ready=hv2 is set in Port_Binding options.
> >> +OVS_WAIT_UNTIL([test xhv2 = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready | tr -d '""')])
> >> +
> >> +# Complete migration: move binding to hv2.
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv2
> >> +
> >> +check_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
> >> +check_column "$hv2_uuid" Port_Binding requested_chassis logical_port=lsp0
> >> +check_column "" Port_Binding additional_chassis logical_port=lsp0
> >> +check_column "" Port_Binding requested_additional_chassis logical_port=lsp0
> >> +
> >> +# Verify additional-chassis-ready is cleared after migration completes.
> >> +OVS_WAIT_UNTIL([test x = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready 2>/dev/null)])
> >> +
> >> +OVN_CLEANUP([hv1
> >> +ignored_dp=ls0],[hv2])
> >> +
> >> +AT_CLEANUP
> >> +])
> >> +
> >> +OVN_FOR_EACH_NORTHD([
> >> +AT_SETUP([options:additional-chassis-ready with multiple additional chassis])
> >> +AT_KEYWORDS([multi-chassis])
> >> +ovn_start
> >> +
> >> +net_add n1
> >> +
> >> +for i in 1 2 3; do
> >> +    sim_add hv$i
> >> +    as hv$i
> >> +    check ovs-vsctl add-br br-phys
> >> +    ovn_attach n1 br-phys 192.168.0.1$i
> >> +done
> >> +
> >> +check ovn-nbctl ls-add ls0
> >> +check ovn-nbctl lsp-add ls0 lsp0
> >> +
> >> +# Bind the port to hv1 initially.
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1
> >> +
> >> +for i in 1 2 3; do
> >> +    as hv$i check ovs-vsctl -- add-port br-int lsp0 -- \
> >> +        set Interface lsp0 external-ids:iface-id=lsp0
> >> +done
> >> +
> >> +for i in 1 2 3; do
> >> +    wait_row_count Chassis 1 name=hv$i
> >> +done
> >> +hv1_uuid=$(fetch_column Chassis _uuid name=hv1)
> >> +hv2_uuid=$(fetch_column Chassis _uuid name=hv2)
> >> +hv3_uuid=$(fetch_column Chassis _uuid name=hv3)
> >> +
> >> +wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
> >> +
> >> +pb_uuid=$(fetch_column Port_Binding _uuid logical_port=lsp0)
> >> +
> >> +# Request binding at two additional chassis.
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1,hv2,hv3
> >> +
> >> +check_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
> >> +check_column "$hv2_uuid $hv3_uuid" Port_Binding additional_chassis logical_port=lsp0
> >> +
> >> +# Verify additional-chassis-ready contains both hv2 and hv3.
> >> +OVS_WAIT_UNTIL([
> >> +    ready=$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready | tr -d '""')
> >> +    echo "$ready" | grep -q hv2 && echo "$ready" | grep -q hv3
> >> +])
> >> +
> >> +# Remove hv3 from additional chassis.
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv1,hv2
> >> +
> >> +check_column "$hv2_uuid" Port_Binding additional_chassis logical_port=lsp0
> >> +
> >> +# Verify hv3 is removed from additional-chassis-ready but hv2 remains.
> >> +OVS_WAIT_UNTIL([test xhv2 = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready | tr -d '""')])
> >> +
> >> +# Complete migration.
> >> +check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=hv2
> >> +
> >> +check_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
> >> +check_column "" Port_Binding additional_chassis logical_port=lsp0
> >> +
> >> +# Verify additional-chassis-ready is fully cleaned up.
> >> +OVS_WAIT_UNTIL([test x = x$(ovn-sbctl get Port_Binding $pb_uuid options:additional-chassis-ready 2>/dev/null)])
> >> +
> >> +OVN_CLEANUP([hv1
> >> +ignored_dp=ls0],[hv2],[hv3
> >> +ignored_dp=ls0])
> >> +
> >> +AT_CLEANUP
> >> +])
> >> +
> >>  OVN_FOR_EACH_NORTHD([
> >>  AT_SETUP([options:requested-chassis for logical port])
> >>  ovn_start
> >> --
> >> 2.54.0
> >>
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
