[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1869808 Title: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1869808/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
This bug was fixed in the package neutron - 2:12.1.1-0ubuntu4~cloud0
---

 neutron (2:12.1.1-0ubuntu4~cloud0) xenial-queens; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 neutron (2:12.1.1-0ubuntu4) bionic; urgency=medium
 .
   * Fix interrupt of VLAN traffic on reboot of neutron-ovs-agent:
     - d/p/0001-ovs-agent-signal-to-plugin-if-tunnel-refresh-needed.patch (LP: #1853613)
     - d/p/0002-Do-not-block-connection-between-br-int-and-br-phys-o.patch (LP: #1869808)
     - d/p/0003-Ensure-that-stale-flows-are-cleaned-from-phys_bridge.patch (LP: #1864822)
     - d/p/0004-DVR-Reconfigure-re-created-physical-bridges-for-dvr-.patch (LP: #1864822)
     - d/p/0005-Ensure-drop-flows-on-br-int-at-agent-startup-for-DVR.patch (LP: #1887148)
     - d/p/0006-Don-t-check-if-any-bridges-were-recrected-when-OVS-w.patch (LP: #1864822)
     - d/p/0007-Not-remove-the-running-router-when-MQ-is-unreachable.patch (LP: #1871850)

** Changed in: cloud-archive/queens
       Status: Fix Committed => Fix Released
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
This bug was fixed in the package neutron - 2:12.1.1-0ubuntu4
---

neutron (2:12.1.1-0ubuntu4) bionic; urgency=medium

  * Fix interrupt of VLAN traffic on reboot of neutron-ovs-agent:
    - d/p/0001-ovs-agent-signal-to-plugin-if-tunnel-refresh-needed.patch (LP: #1853613)
    - d/p/0002-Do-not-block-connection-between-br-int-and-br-phys-o.patch (LP: #1869808)
    - d/p/0003-Ensure-that-stale-flows-are-cleaned-from-phys_bridge.patch (LP: #1864822)
    - d/p/0004-DVR-Reconfigure-re-created-physical-bridges-for-dvr-.patch (LP: #1864822)
    - d/p/0005-Ensure-drop-flows-on-br-int-at-agent-startup-for-DVR.patch (LP: #1887148)
    - d/p/0006-Don-t-check-if-any-bridges-were-recrected-when-OVS-w.patch (LP: #1864822)
    - d/p/0007-Not-remove-the-running-router-when-MQ-is-unreachable.patch (LP: #1871850)

 -- Edward Hope-Morley Mon, 22 Feb 2021 16:55:40 +

** Changed in: neutron (Ubuntu Bionic)
       Status: Fix Committed => Fix Released
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Verified Xenial queens (uca) using [Test Plan] with output as follows:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:12.1.1-0ubuntu4~cloud0
  Candidate: 2:12.1.1-0ubuntu4~cloud0
  Version table:
 *** 2:12.1.1-0ubuntu4~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-proposed/queens/main amd64 Packages
        100 /var/lib/dpkg/status
     2:8.4.0-0ubuntu7.5 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
     2:8.4.0-0ubuntu7.4 500
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
     2:8.0.0-0ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

I ran a ping for the duration of restarting both rabbit and neutron-openvswitch-agent and did not see any interruption.

** Tags removed: verification-queens-needed
** Tags added: verification-queens-done
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Tags removed: verification-needed
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Verified Bionic rocky (uca) using [Test Plan] with output as follows:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:13.0.7-0ubuntu1~cloud3
  Candidate: 2:13.0.7-0ubuntu1~cloud3
  Version table:
 *** 2:13.0.7-0ubuntu1~cloud3 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-proposed/rocky/main amd64 Packages
        100 /var/lib/dpkg/status
     2:12.1.1-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
     2:12.0.1-0ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

I ran a ping for the duration of restarting both rabbit and neutron-openvswitch-agent and did not see any interruption.

** Tags removed: verification-rocky-needed
** Tags added: verification-rocky-done
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Verified Bionic queens using [Test Plan] with output as follows:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:12.1.1-0ubuntu4
  Candidate: 2:12.1.1-0ubuntu4
  Version table:
 *** 2:12.1.1-0ubuntu4 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2:12.1.1-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
     2:12.0.1-0ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

I ran a ping for the duration of restarting both rabbit and neutron-openvswitch-agent and did not see any interruption.

** Description changed:

  (SRU template copied from comment 42)

  [Impact]

  - When there is a RabbitMQ or neutron-api outage, the neutron-openvswitch-agent undergoes a "resync" process and temporarily blocks all VM traffic. This always happens for a short time period (maybe <1 second) but in some high scale environments this lasts for minutes. If RabbitMQ is down again during the re-sync, traffic will also be blocked until it can connect which may be for a long period. This also affects situations where neutron-openvswitch-agent is intentionally restarted while RabbitMQ is down. Bug #1869808 addresses this issue and Bug #1887148 is a fix for that fix to prevent network loops during DVR startup.

  - In the same situation, the neutron-l3-agent can delete the L3 router (Bug #1871850), or may need to refresh the tunnel (Bug #1853613), or may need to update flows or reconfigure bridges (Bug #1864822)

  [Test Case]

  (1) Deploy Openstack Bionic-Queens with DVR and a *VLAN* tenant network (VXLAN or FLAT will not reproduce the issue). With a standard deployment, simply enabling DHCP on the ext_net subnet will allow VMs to be booted directly on the ext_net provider network. "openstack subnet set --dhcp ext_net and then deploy the VM directly to ext_net"

  (2) Deploy a VM to the VLAN network

  (3) Start pinging the VM from an external network

  (4) Stop all RabbitMQ servers

  (5) Restart neutron-openvswitch-agent

- (6) Ping traffic should cease and not recover
+ (6) Ping traffic should NOT see interruption

  (7) Start all RabbitMQ servers

- (8) Ping traffic will recover after 30-60 seconds
+ (8) Ping traffic should still be fine

  [Where problems could occur]

  These patches are all cherry-picked from the upstream stable branches, and have existed upstream including the stable/queens branch for many months and in Ubuntu all supported subsequent releases (Stein onwards) have also had these patches for many months with the exception of Queens.

  There is a chance that not installing these drop flows during startup could have traffic go somewhere that's not expected when the network is in a partially setup case, this was the case for DVR and in setups where more than 1 DVR external network port existed a network loop was possibly temporarily created. This was already addressed with the included patch for Bug #1869808. Checked and could not locate any other merged changes to this drop_port logic that also need to be backported.

  [Other Info]

  [original description]

  We are using Openstack Neutron 13.0.6 and it is deployed using OpenStack-helm.

  I test ping servers in the same vlan while rebooting neutron-ovs-agent. The result shows

  root@mgt01:~# openstack server list
  +--------------------------------------+----------+--------+----------------------+---------------------+---------+
  | ID                                   | Name     | Status | Networks             | Image               | Flavor  |
  +--------------------------------------+----------+--------+----------------------+---------------------+---------+
  | 22d55077-b1b5-452e-8eba-cbcd2d1514a8 | test-1-1 | ACTIVE | vlan105=172.31.10.4  | Cirros 0.4.0 64-bit | m1.tiny |
  | 726bc888-7767-44bc-b68a-7a1f3a6babf1 | test-1-2 | ACTIVE | vlan105=172.31.10.18 | Cirros 0.4.0 64-bit | m1.tiny |

  $ ping 172.31.10.4
  PING 172.31.10.4 (172.31.10.4): 56 data bytes
  ..
  64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms
  64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms <
  64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms
  64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms
  64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms
  64 bytes from 172.31.10.4: seq=65 ttl=64 time=0.441 ms
  64 bytes from 172.31.10.4: seq=66 ttl=64 time=0.376 ms
  64 bytes from 172.31.10.4: seq=67 ttl=64 time=0.481 ms

  As one can see, packet seq 62 is lost, I believe, during rebooting ovs agent.

  Right now, I am suspecting
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Bionic-queens regression tests ran successfully:

23:25:18 ======
23:25:18 Totals
23:25:18 ======
23:25:18 Ran: 97 tests in 742.1731 sec.
23:25:18  - Passed: 89
23:25:18  - Skipped: 8
23:25:18  - Expected Fail: 0
23:25:18  - Unexpected Success: 0
23:25:18  - Failed: 0
23:25:18 Sum of execute time for each test: 586.7572 sec.
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
I have specifically verified that this bug (VLAN traffic interruption during restart when RabbitMQ is down) is fixed by the package in bionic-proposed.

I followed my reproduction steps per the Test Case: all traffic to instances stops on 12.1.1-0ubuntu3 and does not stop on 12.1.1-0ubuntu4.

I am not completing verification yet, as we still need to perform more general testing on the package for regressions etc.
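For anyone repeating this check, the ping comparison can be automated rather than eyeballed. Below is a minimal Python sketch, not part of the SRU or the neutron package: it scans captured ping output for gaps in the sequence numbers. The function name missing_sequences and the sample text are illustrative only. It assumes reply lines carry either seq= (busybox ping, as in the original report) or icmp_seq= (iputils ping), and it ignores sequence wrap-around.

```python
import re

# Sequence field in ping reply lines: busybox prints "seq=", iputils "icmp_seq=".
SEQ_RE = re.compile(r"\b(?:icmp_)?seq=(\d+)")


def missing_sequences(ping_output):
    """Return the sorted sequence numbers that are absent between the
    first and last reply found in the captured ping output."""
    seen = {int(m.group(1)) for m in SEQ_RE.finditer(ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


if __name__ == "__main__":
    # Sample lines taken from the original description of this bug.
    sample = """\
64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms
64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms
64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms
64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms
"""
    print(missing_sequences(sample))  # prints [62]
```

An empty result over the whole restart window corresponds to "does not stop"; a long run of missing numbers corresponds to the behaviour seen on 12.1.1-0ubuntu3.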
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Hello norman, or anyone else affected,

Accepted neutron into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/neutron/2:12.1.1-0ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

** Changed in: neutron (Ubuntu Bionic)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-bionic
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Description changed: (SRU template copied from comment 42) [Impact] - When there is a RabbitMQ or neutron-api outage, the neutron- openvswitch-agent undergoes a "resync" process and temporarily blocks all VM traffic. This always happens for a short time period (maybe <1 second) but in some high scale environments this lasts for minutes. If RabbitMQ is down again during the re-sync, traffic will also be blocked until it can connect which may be for a long period. This also affects situations where neutron-openvswitch-agent is intentionally restarted while RabbitMQ is down. Bug #1869808 addresses this issue and Bug #1887148 is a fix for that fix to prevent network loops during DVR startup. - In the same situation, the neutron-l3-agent can delete the L3 router - (Bug #1871850) + (Bug #1871850), or may need to refresh the tunnel (Bug #1853613), or may + need to update flows or reconfigure bridges (Bug #1864822) [Test Case] (1) Deploy Openstack Bionic-Queens with DVR and a *VLAN* tenant network (VXLAN or FLAT will not reproduce the issue). With a standard deployment, simply enabling DHCP on the ext_net subnet will allow VMs to be booted directly on the ext_net provider network. "openstack subnet set --dhcp ext_net and then deploy the VM directly to ext_net" (2) Deploy a VM to the VLAN network (3) Start pinging the VM from an external network (4) Stop all RabbitMQ servers (5) Restart neutron-openvswitch-agent (6) Ping traffic should cease and not recover (7) Start all RabbitMQ servers (8) Ping traffic will recover after 30-60 seconds [Where problems could occur] These patches are all cherry-picked from the upstream stable branches, and have existed upstream including the stable/queens branch for many months and in Ubuntu all supported subsequent releases (Stein onwards) have also had these patches for many months with the exception of Queens. 
There is a chance that not installing these drop flows during startup could have traffic go somewhere that's not expected when the network is in a partially setup case, this was the case for DVR and in setups where more than 1 DVR external network port existed a network loop was possibly temporarily created. This was already addressed with the included patch for Bug #1869808. Checked and could not locate any other merged changes to this drop_port logic that also need to be backported. [Other Info] - [original description] We are using Openstack Neutron 13.0.6 and it is deployed using OpenStack-helm. I test ping servers in the same vlan while rebooting neutron-ovs-agent. The result shows root@mgt01:~# openstack server list +--+-++--+--+---+ | ID | Name| Status | Networks | Image| Flavor| +--+-++--+--+---+ | 22d55077-b1b5-452e-8eba-cbcd2d1514a8 | test-1-1| ACTIVE | vlan105=172.31.10.4 | Cirros 0.4.0 64-bit | m1.tiny | | 726bc888-7767-44bc-b68a-7a1f3a6babf1 | test-1-2| ACTIVE | vlan105=172.31.10.18 | Cirros 0.4.0 64-bit | m1.tiny | $ ping 172.31.10.4 PING 172.31.10.4 (172.31.10.4): 56 data bytes .. 64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms 64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms < 64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms 64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms 64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms 64 bytes from 172.31.10.4: seq=65 ttl=64 time=0.441 ms 64 bytes from 172.31.10.4: seq=66 ttl=64 time=0.376 ms 64 bytes from 172.31.10.4: seq=67 ttl=64 time=0.481 ms As one can see, packet seq 62 is lost, I believe, during rebooting ovs agent. Right now, I am suspecting https://github.com/openstack/neutron/blob/6d619ea7c13e89ec575295f04c63ae316759c50a/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py#L229 this code is refreshing flow table rules even though it is not necessary. 
Because when I dump flows on phys bridge, I can see duration is rewinding to 0 which suggests flow has been deleted and created again """ duration=secs The time, in seconds, that the entry has been in the table. secs includes as much precision as the switch provides, possibly to nanosecond resolution. """ root@compute01:~# ovs-ofctl dump-flows br-floating ... cookie=0x673522f560f5ca4f, duration=323.852s, table=2, n_packets=1100, n_bytes=103409,
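The duration check described above can be sketched in a few lines of Python. This is only an illustration, not code from neutron or Open vSwitch: it parses text captured from ovs-ofctl dump-flows and lists the entries whose duration is below a threshold, which is one way to spot flows that were deleted and re-created around an agent restart. The function name young_flows, the threshold and the second sample line are hypothetical; the first sample line is the one quoted in this report, and the field layout (cookie=, duration=, table=) is assumed from it.

```python
import re

# Field layout assumed from the dump-flows line quoted in this report.
FLOW_RE = re.compile(
    r"cookie=(0x[0-9a-fA-F]+),\s*duration=([0-9.]+)s,\s*table=(\d+)"
)


def young_flows(dump_text, max_age_s):
    """Return (cookie, table, duration) for every flow entry whose
    duration is strictly below max_age_s seconds."""
    result = []
    for cookie, duration, table in FLOW_RE.findall(dump_text):
        age = float(duration)
        if age < max_age_s:
            result.append((cookie, int(table), age))
    return result


if __name__ == "__main__":
    # First line: from the report. Second line: made up for illustration.
    dump = """\
 cookie=0x673522f560f5ca4f, duration=323.852s, table=2, n_packets=1100, n_bytes=103409,
 cookie=0x673522f560f5ca4f, duration=1.204s, table=0, n_packets=3, n_bytes=210,
"""
    for cookie, table, age in young_flows(dump, max_age_s=10):
        print(f"table={table} cookie={cookie} installed {age:.3f}s ago")
```

Taking one dump shortly after restarting the agent and passing the seconds elapsed since the restart as max_age_s would list the entries that were reinstalled during that window.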
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Looking to get this approved so that we can verify it, as we ideally need this released by the weekend of March 27th for some maintenance activity. Is something holding back the approval?
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
New build has been uploaded to bionic unapproved queue - https://launchpadlibrarian.net/526055760/neutron_12.1.1-0ubuntu4_source.changes. Waiting for approval and then will verify.
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
The new set of patches is as follows:

d/p/0001-ovs-agent-signal-to-plugin-if-tunnel-refresh-needed.patch (LP: #1853613)
d/p/0002-Do-not-block-connection-between-br-int-and-br-phys-o.patch (LP: #1869808)
d/p/0003-Ensure-that-stale-flows-are-cleaned-from-phys_bridge.patch (LP: #1864822)
d/p/0004-DVR-Reconfigure-re-created-physical-bridges-for-dvr-.patch (LP: #1864822)
d/p/0005-Ensure-drop-flows-on-br-int-at-agent-startup-for-DVR.patch (LP: #1887148)
d/p/0006-Don-t-check-if-any-bridges-were-recrected-when-OVS-w.patch (LP: #1864822)
d/p/0007-Not-remove-the-running-router-when-MQ-is-unreachable.patch (LP: #1871850)

Same test case etc. as the current SRU.
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Patch added: "lp1869808-bionic-queens.debdiff"
   https://bugs.launchpad.net/neutron/+bug/1869808/+attachment/5471546/+files/lp1869808-bionic-queens.debdiff
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
After a further review of the patches surrounding this issue I decided to pull what looks more like a complete set of the associated patches from stable/queens, and I have been testing a build that I am now happy with. It behaves no differently to the current upload but hopefully supports all the edge cases around ovs agent restart and resync that have been resolved in stable/queens. I will attach a debdiff and would like to request that it replace the currently uploaded package.
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
neutron 2:12.1.1-0ubuntu4 is in the bionic unapproved queue: https://launchpad.net/ubuntu/bionic/+queue?queue_state=1_text=neutron

neutron 2:13.0.7-0ubuntu1~cloud3 has been uploaded to rocky-staging
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Correction, stable/queens patches are the same as the attached debdiff. I created stable/rocky patches based on upstream stable/stein.
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
@Trent, thanks for the patches. I've modified them slightly to be cherry-picked from upstream's stable/rocky branch.

** Changed in: cloud-archive/rocky
       Status: New => Triaged

** Changed in: cloud-archive/rocky
   Importance: Undecided => Critical

** Changed in: cloud-archive/queens
   Importance: Undecided => Critical

** Changed in: cloud-archive/queens
       Status: New => Triaged
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Changed in: neutron (Ubuntu Bionic)
   Importance: Undecided => Critical

** Changed in: neutron (Ubuntu Bionic)
       Status: New => In Progress

** Changed in: neutron (Ubuntu Bionic)
     Assignee: (unassigned) => Trent Lloyd (lathiat)
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Description changed: + (SRU template copied from comment 42) + + [Impact] + + - When there is a RabbitMQ or neutron-api outage, the neutron- + openvswitch-agent undergoes a "resync" process and temporarily blocks + all VM traffic. This always happens for a short time period (maybe <1 + second) but in some high scale environments this lasts for minutes. If + RabbitMQ is down again during the re-sync, traffic will also be blocked + until it can connect which may be for a long period. This also affects + situations where neutron-openvswitch-agent is intentionally restarted + while RabbitMQ is down. Bug #1869808 addresses this issue and Bug + #1887148 is a fix for that fix to prevent network loops during DVR + startup. + + - In the same situation, the neutron-l3-agent can delete the L3 router + (Bug #1871850) + + [Test Case] + + (1) Deploy Openstack Bionic-Queens with DVR and a *VLAN* tenant network + (VXLAN or FLAT will not reproduce the issue). With a standard + deployment, simply enabling DHCP on the ext_net subnet will allow VMs to + be booted directly on the ext_net provider network. "openstack subnet + set --dhcp ext_net and then deploy the VM directly to ext_net" + + (2) Deploy a VM to the VLAN network + + (3) Start pinging the VM from an external network + + (4) Stop all RabbitMQ servers + + (5) Restart neutron-openvswitch-agent + + (6) Ping traffic should cease and not recover + + (7) Start all RabbitMQ servers + + (8) Ping traffic will recover after 30-60 seconds + + [Where problems could occur] + + These patches are all cherry-picked from the upstream stable branches, + and have existed upstream including the stable/queens branch for many + months and in Ubuntu all supported subsequent releases (Stein onwards) + have also had these patches for many months with the exception of + Queens. 
+ + There is a chance that not installing these drop flows during startup + could have traffic go somewhere that's not expected when the network is + in a partially setup case, this was the case for DVR and in setups where + more than 1 DVR external network port existed a network loop was + possibly temporarily created. This was already addressed with the + included patch for Bug #1869808. Checked and could not locate any other + merged changes to this drop_port logic that also need to be backported. + + [Other Info] + + + [original description] + We are using Openstack Neutron 13.0.6 and it is deployed using OpenStack-helm. I test ping servers in the same vlan while rebooting neutron-ovs-agent. The result shows root@mgt01:~# openstack server list +--+-++--+--+---+ | ID | Name| Status | Networks | Image| Flavor| +--+-++--+--+---+ | 22d55077-b1b5-452e-8eba-cbcd2d1514a8 | test-1-1| ACTIVE | vlan105=172.31.10.4 | Cirros 0.4.0 64-bit | m1.tiny | | 726bc888-7767-44bc-b68a-7a1f3a6babf1 | test-1-2| ACTIVE | vlan105=172.31.10.18 | Cirros 0.4.0 64-bit | m1.tiny | $ ping 172.31.10.4 PING 172.31.10.4 (172.31.10.4): 56 data bytes .. 64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms 64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms < 64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms 64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms 64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms 64 bytes from 172.31.10.4: seq=65 ttl=64 time=0.441 ms 64 bytes from 172.31.10.4: seq=66 ttl=64 time=0.376 ms 64 bytes from 172.31.10.4: seq=67 ttl=64 time=0.481 ms As one can see, packet seq 62 is lost, I believe, during rebooting ovs agent. Right now, I am suspecting https://github.com/openstack/neutron/blob/6d619ea7c13e89ec575295f04c63ae316759c50a/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py#L229 this code is refreshing flow table rules even though it is not necessary. 
Because when I dump flows on phys bridge, I can see duration is rewinding to 0 which suggests flow has been deleted and created again """ duration=secs - The time, in seconds, that the entry has been in the table. - secs includes as much precision as the switch provides, possibly - to nanosecond resolution. + The time, in seconds, that the entry has been in the table. + secs includes as much precision as the switch provides, possibly + to nanosecond resolution. """ root@compute01:~# ovs-ofctl dump-flows br-floating ... - cookie=0x673522f560f5ca4f,
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Changed in: cloud-archive/victoria
       Status: New => Fix Released

** Changed in: cloud-archive/ussuri
       Status: New => Fix Released

** Changed in: cloud-archive/train
       Status: New => Fix Released

** Changed in: cloud-archive/stein
       Status: New => Fix Released

** Changed in: neutron (Ubuntu Hirsute)
       Status: New => Fix Released

** Changed in: neutron (Ubuntu Groovy)
       Status: New => Fix Released

** Changed in: neutron (Ubuntu Focal)
       Status: New => Fix Released
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Attaching revised SRU patch for Ubuntu Bionic, no code content changes but fixed the changelog to list all 3 bug numbers correctly.

** Patch added: "neutron SRU patch for Ubuntu Bionic (new version)"
   https://bugs.launchpad.net/neutron/+bug/1869808/+attachment/5464699/+files/lp1869808-bionic.debdiff

** Patch removed: "debdiff for ubuntu cloud archive (queens)"
   https://bugs.launchpad.net/neutron/+bug/1869808/+attachment/5464416/+files/lp1869808-queens.debdiff
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Ubuntu SRU Justification

[Impact]

- When there is a RabbitMQ or neutron-api outage, the neutron-openvswitch-agent undergoes a "resync" process and temporarily blocks all VM traffic. This always happens for a short time period (maybe <1 second) but in some high scale environments this lasts for minutes. If RabbitMQ is down again during the re-sync, traffic will also be blocked until it can connect, which may be for a long period. This also affects situations where neutron-openvswitch-agent is intentionally restarted while RabbitMQ is down. Bug #1869808 addresses this issue and Bug #1887148 is a fix for that fix to prevent network loops during DVR startup.

- In the same situation, the neutron-l3-agent can delete the L3 router (Bug #1871850)

[Test Case]

(1) Deploy Openstack Bionic-Queens with DVR and a *VLAN* tenant network (VXLAN or FLAT will not reproduce the issue). With a standard deployment, simply enabling DHCP on the ext_net subnet will allow VMs to be booted directly on the ext_net provider network: run "openstack subnet set --dhcp ext_net" and then deploy the VM directly to ext_net.
(2) Deploy a VM to the VLAN network
(3) Start pinging the VM from an external network
(4) Stop all RabbitMQ servers
(5) Restart neutron-openvswitch-agent
(6) Ping traffic should cease and not recover
(7) Start all RabbitMQ servers
(8) Ping traffic will recover after 30-60 seconds

[Where problems could occur]

These patches are all cherry-picked from the upstream stable branches and have existed upstream, including in the stable/queens branch, for many months. In Ubuntu, all supported subsequent releases (Stein onwards) have also had these patches for many months; Queens is the exception.

There is a chance that not installing these drop flows during startup could have traffic go somewhere that's not expected when the network is only partially set up. This was the case for DVR: in setups where more than 1 DVR external network port existed, a network loop could temporarily be created. This was already addressed with the included patch for Bug #1869808. Checked and could not locate any other merged changes to this drop_port logic that also need to be backported.

[Other Info]
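
For steps (3) to (8) of the test case, the length of the outage can be read off the ping output by looking for gaps in the sequence numbers. The following is a minimal sketch under stated assumptions: the output contains `seq=<n>` fields (as in the BusyBox-style ping shown in the original description; iputils `icmp_seq=<n>` also matches), and the helper name `missing_sequences` is hypothetical. The sample lines are copied from the original description.

```python
import re

# Matches "seq=61" as well as "icmp_seq=61".
SEQ_RE = re.compile(r"seq=(\d+)")


def missing_sequences(ping_output):
    """Return the sequence numbers that never got a reply.

    Only gaps between received replies are reported, so replies lost
    before the first or after the last received packet are not counted.
    """
    seqs = [int(m.group(1)) for m in SEQ_RE.finditer(ping_output)]
    missing = []
    for prev, cur in zip(seqs, seqs[1:]):
        missing.extend(range(prev + 1, cur))
    return missing


# Ping replies as quoted in the original description of this bug.
SAMPLE = """\
64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms
64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms
64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms
64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms
64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms
"""

if __name__ == "__main__":
    lost = missing_sequences(SAMPLE)
    print(f"lost {len(lost)} packet(s): {lost}")
```

With ping's default one-second interval, the number of missing sequence numbers is roughly the outage duration in seconds, so an unpatched agent in step (8) would be expected to show a gap of about 30-60 entries.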
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
SRU proposed for Ubuntu Bionic + Cloud Archive (Queens) for the following 3 bugs:

Bug #1869808 reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Bug #1887148 Network loop between physical networks with DVR (Fix for fix to Bug #1869808)
Bug #1871850 [L3] existing router resources are partial deleted unexpectedly when MQ is gone

SRU is only required for Bionic + Queens Cloud Archive, all other releases already have these patches.

== reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
https://bugs.launchpad.net/neutron/+bug/1869808

pike    1f4f888ad34d54ec968d9c9f9f80c388f3ca0d12  stable/pike    [EOL]
queens  131bbc9a53411033cf27664d8f1fd7afc72c57bf  stable/queens  [Needed]
rocky   cc48edf85cf66277423b0eb52ae6353f8028d2a6  stable/rocky   [EOL]
stein   6dfc35680fcc885d9ad449ca2b39225fb1bca898  14.3.0         [Already done]
train   4f501f405d1c44e00784df8450cbe83129da1ea7  15.2.0         [Already done]
ussuri  88e70a520acaca37db645c3ef1124df8c7d778d5  16.1.0         [Already done]
master  90212b12cdf62e92d811997ebba699cab431d696  17.0.0         [Already done]

== [L3] existing router resources are partial deleted unexpectedly when MQ is gone
https://bugs.launchpad.net/neutron/+bug/1871850

queens  ec6c98060d78c97edf6382ede977209f007fdb81  stable/queens  [Needed]
rocky   5ee377952badd94d08425aab41853916092acd07  stable/rocky   [EOL]
stein   71f22834f2240834ca591e27a920f9444bac9689  14.4.0         [Already done]
train   a96ad52c7e57664c63e3675b64718c5a288946fb  15.3.0         [Already done]
ussuri  5eeb98cdb51dc0dadd43128d1d0ed7d497606ded  16.2.0         [Already done]
master  12b9149e20665d80c11f1ef3d2283e1fa6f3b693  17.0.0         [Already done]

== Network loop between physical networks with DVR (Fix for 1869808)
https://bugs.launchpad.net/neutron/+bug/1887148

pike    00466f41d690ca7c7a918bfd861878ef620bbec9  stable/pike    [EOL]
queens  8a173ec29ac1819c3d28c191814cd1402d272bb9  stable/queens  [Needed]
rocky   47ec363f5faefd85dfa33223c0087fafb5b9      stable/rocky   [EOL]
stein   8181c5dbfe799ac6c832ab67b7eab3bcef4098b9  14.3.1         [Already done]
train   17eded13595b18ab60af5256e0f63c57c3702296  15.2.0         [Already done]
ussuri  143fe8ff89ba776618ed6291af9d5e28e4662bdb  16.1.0         [Already done]
master  c1a77ef8b74bb9b5abbc5cb03fb3201383122eb8  17.0.0         [Already done]
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
The attachment "debdiff for ubuntu cloud archive (queens)" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

** Tags added: patch
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
** Also affects: neutron (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Hirsute)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Bionic)
   Importance: Undecided
       Status: New