Public bug reported:

Live-migration doesn't work with ml2/OVN. After spending some time
debugging this issue, I found that it is potentially more complicated and
not related to OVN itself.

Here is the full story behind live-migration not working with OVN
on the latest upstream master.

To speed up live-migration, double binding was introduced in neutron [1] and nova 
[2]. It implements this blueprint [3]. In short, it creates two bindings 
(ACTIVE and INACTIVE) to verify that the port can be bound on the 
destination host before live-migration starts (so no time is wasted in case of 
a rollback).
This mechanism became the default in Stein [4]. So before the actual qemu 
live-migration, neutron should send 'network-vif-plugged' to nova, and only then 
is the migration run.
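
The double-binding idea above can be sketched roughly as follows. This is only
a toy model, not neutron code: the Port class, its methods and the host names
are hypothetical, and only the ACTIVE/INACTIVE binding states come from the
description above.

```python
from dataclasses import dataclass, field

ACTIVE = "ACTIVE"
INACTIVE = "INACTIVE"


@dataclass
class Port:
    """Toy model of a port with one binding per host (hypothetical class)."""
    port_id: str
    # Maps host name -> binding status; at most one binding is ACTIVE.
    bindings: dict = field(default_factory=dict)

    def create_inactive_binding(self, host):
        # Before migration: bind on the destination without touching the
        # source binding. A failure here would abort the migration early.
        if host in self.bindings:
            raise ValueError(f"binding for {host} already exists")
        self.bindings[host] = INACTIVE

    def activate(self, host):
        # After migration: the destination binding becomes ACTIVE and
        # every other binding becomes INACTIVE.
        if host not in self.bindings:
            raise KeyError(host)
        for other in self.bindings:
            self.bindings[other] = INACTIVE
        self.bindings[host] = ACTIVE


port = Port("port-1", {"source-host": ACTIVE})
port.create_inactive_binding("dest-host")
before = dict(port.bindings)   # state while the guest is still on the source
port.activate("dest-host")
after = dict(port.bindings)    # state once the guest runs on the destination
```

The point of the extra INACTIVE binding is that a binding failure on the
destination is detected before qemu starts copying memory.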

With OVN this mechanism doesn't work. The 'network-vif-plugged' notification
is never sent, so live-migration is stuck at the beginning.

Let's check how those notifications are sent. On every change of the 'status'
field (sqlalchemy event) in a neutron.ports row [5], function [6] is
executed; it is responsible for sending the 'network-vif-unplugged' and
'network-vif-plugged' notifications.
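
As a rough illustration of that hook (a simplified stand-in, not the code at
[6]): the Port class below replaces the sqlalchemy attribute event with a plain
Python property, and event_for_status_change() is a hypothetical helper whose
transition rules are my assumption about how status changes map to the two
notifications.

```python
ACTIVE, DOWN, BUILD = "ACTIVE", "DOWN", "BUILD"


def event_for_status_change(previous, current):
    """Return the nova event name for a status transition, or None."""
    if previous == ACTIVE and current == DOWN:
        return "network-vif-unplugged"
    if previous in (None, DOWN, BUILD) and current == ACTIVE:
        return "network-vif-plugged"
    return None


class Port:
    """Stand-in for a neutron.ports row with a hook on 'status' writes."""

    def __init__(self, port_id, notify):
        self.port_id = port_id
        self._status = None
        self._notify = notify  # callable(event_name, port_id)

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, value):
        # Every write to 'status' is inspected, like the attribute event.
        event = event_for_status_change(self._status, value)
        self._status = value
        if event is not None:
            self._notify(event, self.port_id)


sent = []
port = Port("port-1", lambda name, pid: sent.append((name, pid)))
port.status = DOWN     # initial state: no notification
port.status = ACTIVE   # DOWN -> ACTIVE: 'network-vif-plugged'
port.status = DOWN     # ACTIVE -> DOWN: 'network-vif-unplugged'
```

The consequence that matters here: if the status never leaves DOWN, no
notification is produced at all.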

During the pre_live_migration tasks, two bindings and binding levels are created. 
At the end of this process I found that commit_port_binding() is executed [7]. 
At this time the neutron port status in the DB is DOWN. 
I found that with ml2/ovs, at the end of commit_port_binding() [8], after the 
neutron_lib.callbacks.registry notification is sent, the port status moves to 
UP. For ml2/OVN it stays DOWN. This is the first difference that I found 
between ml2/ovs and ml2/ovn.

After a bit of digging I figured out how 'network-vif-plugged' is triggered in 
ml2/ovs.
Let's see how this is done.

1. The list of registered callbacks in ml2/ovs [8] includes a
callback from class ovo_rpc._ObjectChangeHandler [9], and at the end of
commit_port_binding() this callback is invoked.

-------------------------------------------------------------
neutron.plugins.ml2.ovo_rpc._ObjectChangeHandler.handle_event
-------------------------------------------------------------

2. It is responsible for pushing new port object revisions to agents,
like:

----------------------------------------------------------------------------
Jun 24 10:01:01 test-migrate-1 neutron-server[3685]: DEBUG 
neutron.api.rpc.handlers.resources_rpc [None 
req-1430f349-d644-4d33-8833-90fad0124dcd service neutron] Pushing event updated 
for resources: {'Port': 
['ID=3704a567-ef4c-4f6d-9557-a1191de07c4a,revision_number=10']} {{(pid=3697) 
push /opt/stack/neutron/neutron/api/rpc/handlers/resources_rpc.py:243}}
----------------------------------------------------------------------------

3. The OVS agent consumes it and sends an RPC back to the neutron server saying 
that the port is actually UP (on the source node!):
------------------------------------------------------------------------------------------------------------
Jun 24 10:01:01 test-migrate-1 neutron-openvswitch-agent[18660]: DEBUG 
neutron.agent.resource_cache [None req-1430f349-d644-4d33-8833-90fad0124dcd 
service neutron] Resource Port 3704a567-ef4c-4f6d-9557-a1191de07c4a updated 
(revision_number 8->10). Old fields: {'status': u'ACTIVE', 'bindings': 
[PortBinding(host='test-migrate-1',port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,profile={},status='INACTIVE',vif_details={"port_filter":
 true, "bridge_name": "br-int", "datapath_type": "system", "ovs_hybrid_plug": 
false},vif_type='ovs',vnic_type='normal'), 
PortBinding(host='test-migrate-2',port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,profile={"migrating_to":
 "test-migrate-1"},status='ACTIVE',vif_details={"port_filter": true, 
"bridge_name": "br-int", "datapath_type": "system", "ovs_hybrid_plug": 
false},vif_type='ovs',vnic_type='normal')], 'binding_levels': 
[PortBindingLevel(driver='openvswitch',host='test-migrate-1',level=0,port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,segment=NetworkSegment(c6866834-4577-497f-a6c8-ff9724a82e59),segment_id=c6866834-4577-497f-a6c8-ff9724a82e59),
 
PortBindingLevel(driver='openvswitch',host='test-migrate-2',level=0,port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,segment=NetworkSegment(c6866834-4577-497f-a6c8-ff9724a82e59),segment_id=c6866834-4577-497f-a6c8-ff9724a82e59)]}
 New fields: {'status': u'DOWN', 'bindings': 
[PortBinding(host='test-migrate-1',port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,profile={},status='ACTIVE',vif_details={"port_filter":
 true, "bridge_name": "br-int", "datapath_type": "system", "ovs_hybrid_plug": 
false},vif_type='ovs',vnic_type='normal'), 
PortBinding(host='test-migrate-2',port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,profile={"migrating_to":
 
"test-migrate-1"},status='INACTIVE',vif_details=None,vif_type='unbound',vnic_type='normal')],
 'binding_levels': 
[PortBindingLevel(driver='openvswitch',host='test-migrate-1',level=0,port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,segment=NetworkSegment(c6866834-4577-497f-a6c8-ff9724a82e59),segment_id=c6866834-4577-497f-a6c8-ff9724a82e59)]}
 {{(pid=18660) 
record_resource_update /opt/stack/neutron/neutron/agent/resource_cache.py:186}}
...

Jun 24 10:01:02 test-migrate-1 neutron-openvswitch-agent[18660]: DEBUG 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [None 
req-9daaf112-57f4-49bb-8390-4b65a5c5e674 None None] Setting status for 
3704a567-ef4c-4f6d-9557-a1191de07c4a to UP {{(pid=18660) _bind_devices 
/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1088}}
------------------------------------------------------------------------------------------------------------

4. Neutron server consumes it:
------------------------------------------------------------------------------------------------------------
Jun 24 10:01:02 test-migrate-1 neutron-server[3685]: DEBUG 
neutron.plugins.ml2.rpc [None req-62e69669-fa7e-4f70-9e38-38cb3e2c30a7 None 
None] Device 3704a567-ef4c-4f6d-9557-a1191de07c4a up at agent 
ovs-agent-test-migrate-1 {{(pid=3698) update_device_up 
/opt/stack/neutron/neutron/plugins/ml2/rpc.py:269}}
...
Jun 24 10:01:02 test-migrate-1 neutron-server[3685]: DEBUG 
neutron.db.provisioning_blocks [None req-62e69669-fa7e-4f70-9e38-38cb3e2c30a7 
None None] Provisioning for port 3704a567-ef4c-4f6d-9557-a1191de07c4a completed 
by entity L2. {{(pid=3698) provisioning_complete 
/opt/stack/neutron/neutron/db/provisioning_blocks.py:133}}
...
Jun 24 10:01:02 test-migrate-1 neutron-server[3685]: DEBUG 
neutron.db.provisioning_blocks [None req-62e69669-fa7e-4f70-9e38-38cb3e2c30a7 
None None] Provisioning complete for port 3704a567-ef4c-4f6d-9557-a1191de07c4a 
triggered by entity L2. {{(pid=3698) provisioning_complete 
/opt/stack/neutron/neutron/db/provisioning_blocks.py:140}}
------------------------------------------------------------------------------------------------------------

and then generates the internal event "PROVISIONING_COMPLETE" [10]. This
event is consumed by [11], and port_provisioned() updates the port status in
the DB to UP [12]. Finally it emits the 'network-vif-plugged' notification
and nova continues the migration.


In ml2/ovn we don't have agents, so we don't use ovo_rpc. Nothing reports the 
port as up, the status never changes, and 'network-vif-plugged' is never sent. 
That's why migration for ml2/ovn doesn't work.
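
Steps 1-4 and the ml2/ovn case can be put side by side in a small simulation.
This is a sketch under heavy simplification: the Server and OvsAgent classes
are hypothetical, the method names only mirror the functions seen in the logs
above, and the whole RPC and provisioning-block machinery is collapsed into
direct calls.

```python
DOWN, ACTIVE = "DOWN", "ACTIVE"


class Server:
    """Toy neutron server (hypothetical); agents is empty for ml2/ovn."""

    def __init__(self, agents=()):
        self.agents = list(agents)
        self.port_status = {}
        self.nova_events = []

    def commit_port_binding(self, port_id):
        self.port_status[port_id] = DOWN
        # Steps 1-2: the registered callback pushes the new port
        # revision to every agent.
        for agent in self.agents:
            agent.on_port_updated(self, port_id)

    def update_device_up(self, port_id):
        # Step 4: the agent's RPC arrives and provisioning completes.
        self.provisioning_complete(port_id)

    def provisioning_complete(self, port_id):
        # The status change is what triggers the nova notification.
        if self.port_status.get(port_id) != ACTIVE:
            self.port_status[port_id] = ACTIVE
            self.nova_events.append(("network-vif-plugged", port_id))


class OvsAgent:
    """Toy OVS agent (hypothetical)."""

    def on_port_updated(self, server, port_id):
        # Step 3: the agent reports the device as up.
        server.update_device_up(port_id)


with_agent = Server([OvsAgent()])       # ml2/ovs-like setup
with_agent.commit_port_binding("port-1")

without_agent = Server()                # ml2/ovn-like setup: no agents
without_agent.commit_port_binding("port-1")
```

In the agentless run the event list stays empty and the port stays DOWN, which
is the behaviour described above for ml2/ovn.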

It looks like a general bug somewhere between nova and neutron. Neutron shouldn't 
send the 'network-vif-plugged' notification from the source host while the double 
binding is being configured, as it does now (step 3).
Maybe we could consider using a more specific event name, like 
'neutron-vif-inactive-binding-set'?
Or maybe nova could watch for the inactive binding being created [13] and then start 
live-migration instead of waiting for the neutron notification?
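
To make the last suggestion a bit more concrete, here is a minimal sketch of
the check nova could do on the result of the list-bindings call [13]. The
helper name is hypothetical, the response shape ({"bindings": [...]} with
'host', 'status' and 'vif_type' keys) is my assumption based on the spec, and
no HTTP call is made here.

```python
def has_usable_inactive_binding(bindings_response, dest_host):
    """Return True if dest_host has an INACTIVE binding that was bound.

    bindings_response is assumed to be the decoded JSON body of the
    list-bindings call for one port.
    """
    for binding in bindings_response.get("bindings", []):
        if binding.get("host") != dest_host:
            continue
        if binding.get("status") != "INACTIVE":
            continue
        # A binding that could not be bound is of no use as a target.
        if binding.get("vif_type") in ("unbound", "binding_failed"):
            continue
        return True
    return False


# Hand-written sample data, not taken from a real deployment.
sample = {
    "bindings": [
        {"host": "source-host", "status": "ACTIVE", "vif_type": "ovs"},
        {"host": "dest-host", "status": "INACTIVE", "vif_type": "ovs"},
    ]
}
ready = has_usable_inactive_binding(sample, "dest-host")
not_ready = has_usable_inactive_binding(sample, "other-host")
```

With a check like this nova would not depend on a port status change that
ml2/ovn never produces at this stage.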


Thanks,
Maciej


[1] 
https://review.opendev.org/#/q/topic:bp/live-migration-portbinding+(status:open+OR+status:merged)
[2] https://review.opendev.org/#/c/558001/
[3] https://blueprints.launchpad.net/nova/+spec/neutron-new-port-binding-api 
[4] https://review.opendev.org/#/c/635360/
[5] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/db/db_base_plugin_v2.py#L173
[6] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/notifiers/nova.py#L182
[7] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/plugins/ml2/plugin.py#L505
[8] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/plugins/ml2/plugin.py#L713
[9] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/plugins/ml2/ovo_rpc.py#L51
[10] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/db/provisioning_blocks.py#L140
[11] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/plugins/ml2/plugin.py#L285
[12] 
https://github.com/openstack/neutron/blob/0e2508c8b1a3706a2ade0517f5c5359af2f8bc78/neutron/plugins/ml2/plugin.py#L316
[13] 
https://specs.openstack.org/openstack/neutron-specs/specs/backlog/pike/portbinding_information_for_nova.html#list-bindings

** Affects: networking-ovn
     Importance: Undecided
         Status: New

** Affects: nova
     Importance: Undecided
         Status: New

** Affects: neutron (Ubuntu)
     Importance: Undecided
         Status: New

** Also affects: nova
   Importance: Undecided
       Status: New

** Also affects: networking-ovn
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/1834045

Title:
  Live-migration double binding doesn't work with OVN
