This is a proposed update to the VM live migration workflow with OVN.

Currently, when doing live migration, you must not add the iface-id to the
port for the destination VM until migration is complete.  Otherwise, while
migration is in progress, ovn-controller on two different chassis will
fight over the port binding.

This workflow is problematic for libvirt-based live migration (at least),
because libvirt creates an identical VM on the destination host, including
all of its configuration, such as the OVS port iface-id.  As a result,
ovn-controller on both hosts fights over the port binding for the duration
of the migration.


Proposed new workflow for a migration from host A to host B:

1) The CMS sets a new option on Logical_Switch_Port called
"migration-destination".  The value would be the name of the destination
chassis of the upcoming live migration (host B in this case).

2) While this option is set, if host B claims the port binding, host A will
not try to re-claim it.

3) While this option is set, if host B sees the new port appear, it will
not immediately update the port binding.  Instead, it will set up flows
watching for a GARP from the VM.  GARP packets would be forwarded to
ovn-controller.  All other packets would be dropped.  If a GARP is seen,
then host B will update the port binding to reflect that the port is now
active on host B.

At least for KVM VMs, QEMU already generates a GARP when migration is
complete.  I'm not familiar with Xen or other virtualization technologies,
but it seems like this would be a common requirement for VM migration.

4) When the migration is either completed or aborted, the CMS will remove
the "migration-destination" option from the Logical_Switch_Port in
OVN_Northbound.  At this point, ovn-controller will resume normal
behavior.  If for some reason a GARP was not seen, host B would update the
port binding at this point.
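
To make steps 1-4 concrete, here is a rough sketch of the per-chassis
decision in Python.  It is only a model of the rules above, not
ovn-controller code: PortState, should_own_binding() and the garp_seen flag
are hypothetical names made up for illustration, and the case where the
binding is held by some third chassis is not covered by the proposal, so
the sketch simply treats it like the normal case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortState:
    """Hypothetical view of one logical port, as seen by any chassis."""
    bound_chassis: Optional[str]          # chassis currently in the port binding
    migration_destination: Optional[str]  # proposed option value, None if unset


def should_own_binding(local: str, has_iface: bool, port: PortState,
                       garp_seen: bool = False) -> bool:
    """Return True if chassis `local` should hold (or claim) the binding."""
    if not has_iface:
        # No local OVS interface carries this iface-id.
        return False

    dest = port.migration_destination
    if dest is None:
        # Option unset: normal behavior.  This also covers step 4, where
        # host B claims the port even if no GARP was ever seen.
        return True

    if local == dest:
        # Step 3: the destination waits for a GARP before claiming.
        return garp_seen

    # Step 2: any other chassis (host A) does not re-claim the port once
    # the destination has taken the binding.
    return port.bound_chassis != dest


# Migration from host A to host B, option set to "B".
during = PortState(bound_chassis="A", migration_destination="B")
print(should_own_binding("A", True, during))                  # host A keeps it
print(should_own_binding("B", True, during))                  # host B waits
print(should_own_binding("B", True, during, garp_seen=True))  # host B claims
```

The point of writing it this way is that both hosts evaluate the same
function with the same inputs from the database, so at no point during the
migration do they both conclude that they should own the binding.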


An alternative approach would be to specify port bindings explicitly
through OVN_Northbound.  One goal of using the GARP packet as the trigger
is to enable OVN to react as quickly as possible when migration is complete
and the VM is active on host B.
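
Since the GARP is the trigger, it may help to spell out what would count as
one.  The sketch below checks a raw, untagged Ethernet frame for an
IPv4-over-Ethernet ARP packet whose sender and target protocol addresses
are equal.  In OVN the match would presumably be expressed as flows rather
than in Python; is_gratuitous_arp() and build_arp() are hypothetical
helpers for illustration only, and VLAN-tagged frames are not handled.

```python
import struct

ETH_P_ARP = 0x0806   # EtherType for ARP
ETH_HLEN = 14        # dst MAC (6) + src MAC (6) + EtherType (2)
ARP_LEN = 28         # ARP payload size for Ethernet/IPv4


def is_gratuitous_arp(frame: bytes) -> bool:
    """Return True if `frame` is an ARP packet announcing its own IP."""
    if len(frame) < ETH_HLEN + ARP_LEN:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_ARP:
        return False
    htype, ptype, hlen, plen, oper = struct.unpack_from("!HHBBH", frame, ETH_HLEN)
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return False                 # not Ethernet/IPv4 ARP
    if oper not in (1, 2):
        return False                 # neither request nor reply
    spa = frame[28:32]               # sender protocol address
    tpa = frame[38:42]               # target protocol address
    return spa == tpa


def build_arp(src_mac: bytes, spa: bytes, tpa: bytes, oper: int = 1) -> bytes:
    """Build a broadcast ARP frame (test helper, hypothetical)."""
    eth = b"\xff" * 6 + src_mac + struct.pack("!H", ETH_P_ARP)
    arp = (struct.pack("!HHBBH", 1, 0x0800, 6, 4, oper)
           + src_mac + spa + b"\x00" * 6 + tpa)
    return eth + arp


mac = bytes.fromhex("525400123456")
vm_ip = bytes([10, 0, 0, 5])
print(is_gratuitous_arp(build_arp(mac, vm_ip, vm_ip)))               # GARP
print(is_gratuitous_arp(build_arp(mac, vm_ip, bytes([10, 0, 0, 1]))))  # plain ARP
```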

A future improvement would be to also migrate conntrack state.  I don't
think this is a big problem in practice, because existing connections would
just be seen as new connections on the destination host, OVN ACLs would be
re-applied, and the connections would be allowed to stay open.

-- 
Russell Bryant