[Yahoo-eng-team] [Bug 1834045] Re: Live-migration double binding doesn't work with OVN
Fix already released: https://review.opendev.org/#/c/673803/

** Changed in: networking-ovn
   Status: New => Fix Released

** Changed in: neutron
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1834045

Title:
  Live-migration double binding doesn't work with OVN

Status in networking-ovn: Fix Released
Status in neutron: Fix Released
Status in OpenStack Compute (nova): Incomplete
Status in neutron package in Ubuntu: Fix Released

Bug description:

  With ml2/OVN, live-migration doesn't work. After spending some time debugging this issue I found that it is potentially more complicated and not related to OVN itself. Here is the full story behind the broken live-migration with OVN on the latest upstream master.

  To speed up live-migration, double binding was introduced in neutron [1] and nova [2], implementing this blueprint [3]. In short, it creates two port bindings (ACTIVE and INACTIVE) to verify that a network binding can be done on the destination host before the live-migration is started (so no time is wasted in case of a rollback). This mechanism became the default in Stein [4]. So before the actual qemu live-migration, neutron should send 'network-vif-plugged' to nova, and only then is the migration run.

  With OVN this mechanism doesn't work: the 'network-vif-plugged' notification is never sent, so the live-migration is stuck at the beginning.

  Let's check how those notifications are sent. On every change of the 'status' field in a neutron.ports row (a sqlalchemy event) [5], function [6] is executed; it is responsible for sending the 'network-vif-unplugged' and 'network-vif-plugged' notifications. During the pre_live_migration tasks, two bindings and binding levels are created. At the end of this process commit_port_binding() is executed [7]. At this point the neutron port status in the db is DOWN.
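The status-change trigger described above can be sketched roughly as follows. This is not neutron's actual code — the listener name and the simplified Port model are hypothetical — but it shows the mechanism: a SQLAlchemy attribute event on the ports.status column fires a callback that decides which nova notification ('network-vif-plugged' on DOWN->ACTIVE, 'network-vif-unplugged' on ACTIVE->DOWN) to send.

```python
# Minimal sketch (hypothetical names) of the sqlalchemy-event-driven
# notification mechanism: a 'set' event on the mapped status attribute
# fires whenever the column value changes.
import sqlalchemy as sa
from sqlalchemy import event, orm

Base = orm.declarative_base()

class Port(Base):
    __tablename__ = 'ports'
    id = sa.Column(sa.String(36), primary_key=True)
    status = sa.Column(sa.String(16), default='DOWN')

sent = []  # stand-in for the notifier that talks to nova

def record_port_status_change(target, value, oldvalue, initiator):
    # DOWN -> ACTIVE ("UP" in the logs) maps to 'network-vif-plugged',
    # ACTIVE -> DOWN maps to 'network-vif-unplugged'.
    if oldvalue == 'DOWN' and value == 'ACTIVE':
        sent.append(('network-vif-plugged', target.id))
    elif oldvalue == 'ACTIVE' and value == 'DOWN':
        sent.append(('network-vif-unplugged', target.id))

# active_history=True makes SQLAlchemy load the old value so the
# listener can compare the before/after statuses.
event.listen(Port.status, 'set', record_port_status_change,
             active_history=True)

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)
with orm.Session(engine) as session:
    port = Port(id='p1', status='DOWN')
    session.add(port)
    session.commit()
    port.status = 'ACTIVE'  # fires the listener
    session.commit()
```

The key consequence for this bug: if nothing ever flips the status column, the listener never fires and nova never receives 'network-vif-plugged' — which is exactly what happens with ml2/OVN below.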
  I found that at the end of commit_port_binding() [8], after the neutron_lib.callbacks.registry notification is sent, the port status moves to UP. For ml2/OVN it stays DOWN. This is the first difference I found between ml2/ovs and ml2/ovn.

  After a bit of digging I figured out how 'network-vif-plugged' is triggered in ml2/ovs. Let's see how this is done:

  1. In the list of callbacks registered in ml2/ovs [8] there is a callback from the class ovo_rpc._ObjectChangeHandler [9] (neutron.plugins.ml2.ovo_rpc._ObjectChangeHandler.handle_event), and at the end of commit_port_binding() this callback is invoked.

  2. It is responsible for pushing new port object revisions to agents, like:

  Jun 24 10:01:01 test-migrate-1 neutron-server[3685]: DEBUG neutron.api.rpc.handlers.resources_rpc [None req-1430f349-d644-4d33-8833-90fad0124dcd service neutron] Pushing event updated for resources: {'Port': ['ID=3704a567-ef4c-4f6d-9557-a1191de07c4a,revision_number=10']} {{(pid=3697) push /opt/stack/neutron/neutron/api/rpc/handlers/resources_rpc.py:243}}

  3. The OVS agent consumes it and sends an RPC back to the neutron server saying that the port is actually UP (on the source node!):

  Jun 24 10:01:01 test-migrate-1 neutron-openvswitch-agent[18660]: DEBUG neutron.agent.resource_cache [None req-1430f349-d644-4d33-8833-90fad0124dcd service neutron] Resource Port 3704a567-ef4c-4f6d-9557-a1191de07c4a updated (revision_number 8->10).
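The revision-number handshake in steps 2-3 can be sketched as below. The class and method names here are illustrative, not neutron's real resource-cache API: the server pushes updated Port objects with a revision_number, the agent's cache keeps only the newest revision, and a genuinely newer revision is what makes the agent rewire the port locally and report it UP back to the server.

```python
# Hypothetical sketch of an agent-side resource cache keyed by
# revision_number, mirroring the 'revision_number 8->10' log line above.
from dataclasses import dataclass, field

@dataclass
class Port:
    id: str
    revision_number: int
    status: str

@dataclass
class AgentResourceCache:
    _cache: dict = field(default_factory=dict)
    reported_up: list = field(default_factory=list)

    def record_resource_update(self, port: Port):
        cached = self._cache.get(port.id)
        if cached is not None and cached.revision_number >= port.revision_number:
            return  # stale push: ignore it
        self._cache[port.id] = port
        # after (re)wiring the port locally, tell the server it is UP
        self.reported_up.append(port.id)

cache = AgentResourceCache()
cache.record_resource_update(Port('3704a567', 8, 'ACTIVE'))
cache.record_resource_update(Port('3704a567', 10, 'DOWN'))  # rev 8 -> 10
cache.record_resource_update(Port('3704a567', 9, 'DOWN'))   # stale, dropped
```

That agent-to-server "port is UP" RPC is the missing link in ml2/OVN: there is no per-node agent playing this role, so nothing moves the port status back to UP.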
  Old fields: {'status': u'ACTIVE',
    'bindings': [
      PortBinding(host='test-migrate-1',port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,profile={},status='INACTIVE',vif_details={"port_filter": true, "bridge_name": "br-int", "datapath_type": "system", "ovs_hybrid_plug": false},vif_type='ovs',vnic_type='normal'),
      PortBinding(host='test-migrate-2',port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,profile={"migrating_to": "test-migrate-1"},status='ACTIVE',vif_details={"port_filter": true, "bridge_name": "br-int", "datapath_type": "system", "ovs_hybrid_plug": false},vif_type='ovs',vnic_type='normal')],
    'binding_levels': [
      PortBindingLevel(driver='openvswitch',host='test-migrate-1',level=0,port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,segment=NetworkSegment(c6866834-4577-497f-a6c8-ff9724a82e59),segment_id=c6866834-4577-497f-a6c8-ff9724a82e59),
      PortBindingLevel(driver='openvswitch',host='test-migrate-2',level=0,port_id=3704a567-ef4c-4f6d-9557-a1191de07c4a,segment=NetworkSegment(c6866834-4577-497f-a6c8-ff9724a82e59),segment_id=c6866834-4577-497f-a6c8-ff9724a82e59)]}

  New fields: {'status': u'DOWN',
    'bindings': [PortBinding(host='test-migrate-1',port
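The two-binding state visible in the log above follows the double-binding lifecycle from the blueprint. A minimal sketch, with hypothetical helper names (commit_inactive_binding/activate_binding are illustrative, not neutron's actual functions): pre_live_migration commits an INACTIVE binding on the destination host alongside the ACTIVE one on the source, and after the migration the destination binding is activated by swapping the two statuses.

```python
# Illustrative sketch of the double-binding lifecycle: source host
# 'test-migrate-2' holds the ACTIVE binding, destination
# 'test-migrate-1' first gets an INACTIVE one, then is promoted.
from dataclasses import dataclass

@dataclass
class PortBinding:
    host: str
    status: str  # 'ACTIVE' or 'INACTIVE'

def commit_inactive_binding(bindings, dest_host):
    # pre_live_migration: prove the port can bind on the destination
    # without disturbing the ACTIVE binding on the source.
    bindings.append(PortBinding(host=dest_host, status='INACTIVE'))

def activate_binding(bindings, dest_host):
    # post-migration: promote the destination binding, demote the source.
    for b in bindings:
        b.status = 'ACTIVE' if b.host == dest_host else 'INACTIVE'

bindings = [PortBinding('test-migrate-2', 'ACTIVE')]
commit_inactive_binding(bindings, 'test-migrate-1')
activate_binding(bindings, 'test-migrate-1')
```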
[Yahoo-eng-team] [Bug 1834045] Re: Live-migration double binding doesn't work with OVN
Fix released: https://review.opendev.org/#/c/673803/

** Changed in: neutron (Ubuntu)
   Status: Confirmed => Fix Released
[Yahoo-eng-team] [Bug 1834045] Re: Live-migration double binding doesn't work with OVN
** Also affects: neutron
   Importance: Undecided
   Status: New