[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Reviewed: https://review.opendev.org/c/openstack/oslo.privsep/+/819996
Committed: https://opendev.org/openstack/oslo.privsep/commit/c223dbced7d5a8d1920fe764cbce42cf844538e1
Submitter: "Zuul (22348)"
Branch: master

commit c223dbced7d5a8d1920fe764cbce42cf844538e1
Author: Mohammed Naser
Date: Wed Dec 1 11:19:26 2021 +0400

    Bump max_buffer_size for Deserializer

    Since msgpack 0.6.0, some limits were introduced for the deserializer
    which were put in to avoid any denial of service attacks using msgpack.
    These limits were raised to 100MiB in the release of msgpack 1.0.0.

    The default buffer sizes that were implemented were quite low and when
    running certain `privsep` commands, especially for Neutron when using
    linux bridge, where there is a large amount of netdevs, privsep would
    crash since msgpack would fail to decode the message since it considers
    it too big:

        ValueError: 1174941 exceeds max_str_len(1048576)

    In this commit, the `max_buffer_size` is bumped to the value that ships
    with msgpack==1.0.0 to allow for users who don't have that to continue
    to function. Also, since `msgpack` is only being used by the internal
    API, we're not worried about a third party coming in and overwhelming
    the system by deserializing calls.

    This fix also addresses some weird behaviour where privsep will die and
    certain OpenStack agents would start to behave in a strange way once
    they hit a certain number of ports (since any privsep calls would start
    to fail).

    Closes-Bug: #1844822
    Closes-Bug: #1896734
    Related-Bug: #1928764
    Closes-Bug: #1952611
    Change-Id: I135917522daff95377d07566317ef0fc0d16e7cb

** Changed in: oslo.privsep
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1896734 Title: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1896734/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
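The limit described in the commit message can be reproduced with a short sketch. This is not the oslo.privsep code: the helper `try_decode` is a hypothetical name, and `max_str_len` is set explicitly here only to imitate the old 1 MiB default, whereas the commit itself talks about raising `max_buffer_size`. The sizes are the ones quoted in the commit message.

```python
import msgpack

# Sizes taken from the commit message above.
OLD_LIMIT = 1024 * 1024        # 1048576, the limit quoted in the ValueError
NEW_LIMIT = 100 * 1024 * 1024  # 100 MiB, the value that ships with msgpack 1.0.0
PAYLOAD_LEN = 1174941          # string length quoted in the ValueError


def try_decode(packed, max_buffer_size, max_str_len):
    """Feed one packed message to an Unpacker and return the decoded object.

    Hypothetical helper for illustration; max_str_len is passed explicitly
    only to imitate the old default limit.
    """
    unpacker = msgpack.Unpacker(
        raw=False,
        max_buffer_size=max_buffer_size,
        max_str_len=max_str_len,
    )
    unpacker.feed(packed)
    return unpacker.unpack()


# One large string, standing in for an oversized RPC reply.
packed = msgpack.packb("x" * PAYLOAD_LEN)

try:
    try_decode(packed, NEW_LIMIT, OLD_LIMIT)
    small_limit_ok = True
except ValueError as exc:
    # With the 1 MiB string limit the decoder refuses the message.
    small_limit_ok = False
    print("rejected:", exc)

# With the 100 MiB limits the same message decodes.
decoded = try_decode(packed, NEW_LIMIT, NEW_LIMIT)
print("decoded length:", len(decoded))
```

The same payload is rejected under the small limit and accepted under the large one, which is the behaviour change the commit message describes.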
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/oslo.privsep/+/819996

** Changed in: oslo.privsep
       Status: New => In Progress
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
I logged it locally, and this is how much data was put out:

root@tctrko1:~# cat /tmp/debug | wc -c
1179958
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
The stack trace is the following:

Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: -- Green Thread--
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/bin/neutron-linuxbridge-agent:8 in
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `sys.exit(main())`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py:21 in main
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `agent_main.main()`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:1055 in main
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `manager = LinuxBridgeManager(bridge_mappings, interface_mappings)`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:93 in __init__
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `self.check_vxlan_support()`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:745 in check_vxlan_sup
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `if self.vxlan_ucast_supported():`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:706 in vxlan_ucast_sup
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `ip_lib.vxlan_in_use(seg_id)):`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/agent/linux/ip_lib.py:741 in vxlan_in_use
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `interfaces = get_devices_info(namespace)`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/neutron/agent/linux/ip_lib.py:1379 in get_devices_info
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `devices = privileged.get_link_devices(namespace, **kwargs)`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/oslo_privsep/priv_context.py:247 in _wrap
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `return self.channel.remote_call(name, args, kwargs)`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/oslo_privsep/daemon.py:214 in remote_call
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `result = self.send_recv((Message.CALL.value, name, args, kwargs))`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/oslo_privsep/comm.py:172 in send_recv
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `reply = future.result()`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/oslo_privsep/comm.py:109 in result
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `self.condvar.wait()`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /usr/lib/python3.6/threading.py:295 in wait
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `waiter.acquire()`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/eventlet/semaphore.py:115 in acquire
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `hubs.get_hub().switch()`
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: /openstack/venvs/neutron-21.2.6/lib/python3.6/site-packages/eventlet/hubs/hub.py:298 in switch
Dec 01 07:19:38 tctrko1 neutron-linuxbridge-agent[18892]: `return self.greenlet.switch()`

In this case it looks like `privileged.get_link_devices` is what needs to be optimized, which was not covered by the fix.
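The trace ends in `future.result()` blocked on `self.condvar.wait()`. A toy model of that pattern, using only the standard library, shows why a caller waits forever when no reply is ever delivered. The class `ReplyFuture` and its optional `timeout` are hypothetical and are not the oslo.privsep implementation; the assumption that the reply never arrives because decoding failed on the reading side is one plausible reading of this thread, not something the trace proves.

```python
import threading


class ReplyFuture:
    """Toy future (hypothetical): result() blocks until set_result() is called."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False
        self._value = None

    def set_result(self, value):
        with self._cond:
            self._value = value
            self._done = True
            self._cond.notify_all()

    def result(self, timeout=None):
        with self._cond:
            # With timeout=None this waits forever if set_result() is never called.
            if not self._cond.wait_for(lambda: self._done, timeout=timeout):
                raise TimeoutError("no reply received")
            return self._value


future = ReplyFuture()
# Simulate a reply that never arrives: nothing ever calls future.set_result().
try:
    future.result(timeout=0.1)
    timed_out = False
except TimeoutError:
    timed_out = True
print("timed out:", timed_out)
```

Without the timeout argument the call in this sketch would simply never return, which matches the hang visible from the caller's side.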
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Unfortunately, I'm running into this issue right now in an environment that is running `linuxbridge`. I assume the growing number of `VXLAN` interfaces is what causes it, since linuxbridge creates both `brq` and `VXLAN` interfaces. I've got around 219 interfaces in this scenario:

root@tctrko1:~# ip link | grep qlen | wc -l
219
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
The Groovy Gorilla has reached end of life, so this bug will not be fixed for that release.

** Changed in: python-oslo.privsep (Ubuntu Groovy)
       Status: New => Won't Fix
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
This bug was fixed in the package neutron - 2:17.1.0+git2021012815.0fb63f7297-0ubuntu5~cloud0
---

 neutron (2:17.1.0+git2021012815.0fb63f7297-0ubuntu5~cloud0) focal-wallaby; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 neutron (2:17.1.0+git2021012815.0fb63f7297-0ubuntu5) hirsute; urgency=medium
 .
   * d/p/improve-get-devices-with-ip-performance.patch: Performance of
     get_devices_with_ip is improved to limit the amount of information
     to be sent and reduce the number of syscalls. (LP: #1896734).
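The changelog entry says the patch limits the amount of information sent. The general idea can be sketched as trimming each device record to the fields the caller needs before it is serialized. Everything here is hypothetical: the record layout, the field names in `WANTED`, and the count of 400 devices are made up for illustration, and `json` merely stands in for whatever serializer carries the reply. This is not the content of the actual patch.

```python
import json

# Fields the caller is assumed to need (hypothetical choice).
WANTED = ("index", "name", "addresses")


def make_device(index):
    """Build a made-up device record with bulky attributes the caller ignores."""
    return {
        "index": index,
        "name": f"tap{index:04d}",
        "addresses": [f"10.0.{index // 256}.{index % 256}"],
        "stats": {f"counter_{n}": 0 for n in range(50)},
        "attrs": {f"attr_{n}": "x" * 20 for n in range(30)},
    }


def trim(device):
    """Keep only the wanted fields of one device record."""
    return {key: device[key] for key in WANTED}


devices = [make_device(i) for i in range(400)]
full_size = len(json.dumps(devices))
trimmed_size = len(json.dumps([trim(d) for d in devices]))
print("full:", full_size, "trimmed:", trimmed_size)
```

Trimming on the sending side keeps the reply small no matter how large the decoder limit is, which is why this kind of change complements raising the limit.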
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Changed in: cloud-archive
       Status: Fix Committed => Fix Released
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
This bug was fixed in the package neutron - 2:16.3.0-0ubuntu3~cloud0
---

 neutron (2:16.3.0-0ubuntu3~cloud0) bionic-ussuri; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 neutron (2:16.3.0-0ubuntu3) focal; urgency=medium
 .
   * d/p/revert-dvr-remove-control-plane-arp-updates.patch: Cherry-picked
     from https://review.opendev.org/c/openstack/neutron/+/777903 to
     prevent permanent arp entries that never get deleted (LP: #1916761).
   * d/p/improve-get-devices-with-ip-performance.patch: Performance of
     get_devices_with_ip is improved to limit the amount of information
     to be sent and reduce the number of syscalls. (LP: #1896734).

** Changed in: cloud-archive/ussuri
       Status: Fix Committed => Fix Released
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Verified Bionic Ussuri using [Test Plan]:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:16.3.0-0ubuntu3~cloud0
  Candidate: 2:16.3.0-0ubuntu3~cloud0
  Version table:
 *** 2:16.3.0-0ubuntu3~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-proposed/ussuri/main amd64 Packages
        100 /var/lib/dpkg/status
     2:12.1.1-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
     2:12.0.1-0ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

** Tags removed: verification-ussuri-needed
** Tags added: verification-ussuri-done
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Comment 11 seems to have added incorrect tags for verification for the Ussuri cloud archive. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision.

** Tags added: verification-ussuri-needed
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
ah, wrong bug, ignore the last comment :)
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
The verification of the Stable Release Update for neutron has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

This bug was released into the Victoria cloud archive but a hiccup in the automatic backport process from distro caused this change notification to get lost.
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
This bug was fixed in the package neutron - 2:17.1.0-0ubuntu3~cloud0
---

 neutron (2:17.1.0-0ubuntu3~cloud0) focal-victoria; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 neutron (2:17.1.0-0ubuntu3) groovy; urgency=medium
 .
   * d/p/revert-dvr-remove-control-plane-arp-updates.patch: Cherry-picked
     from https://review.opendev.org/c/openstack/neutron/+/777903 to
     prevent permanent arp entries that never get deleted (LP: #1916761).
   * d/p/improve-get-devices-with-ip-performance.patch: Performance of
     get_devices_with_ip is improved to limit the amount of information
     to be sent and reduce the number of syscalls. (LP: #1896734).

** Changed in: cloud-archive/victoria
       Status: Fix Committed => Fix Released
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
This bug was fixed in the package neutron - 2:17.1.0-0ubuntu3
---

neutron (2:17.1.0-0ubuntu3) groovy; urgency=medium

  * d/p/revert-dvr-remove-control-plane-arp-updates.patch: Cherry-picked
    from https://review.opendev.org/c/openstack/neutron/+/777903 to
    prevent permanent arp entries that never get deleted (LP: #1916761).
  * d/p/improve-get-devices-with-ip-performance.patch: Performance of
    get_devices_with_ip is improved to limit the amount of information
    to be sent and reduce the number of syscalls. (LP: #1896734).

neutron (2:17.1.0-0ubuntu2) groovy; urgency=medium

  * Backport fix for dvr-snat missing rfp interfaces (LP: #1894843)
    - d/p/0001-Fix-deletion-of-rfp-interfaces-when-router-is-re-ena.patch

neutron (2:17.1.0-0ubuntu1) groovy; urgency=medium

  * d/watch: Fix typo in watch URL, add trailing slash.
  * New stable point release for OpenStack Victoria (LP: #1915785).
  * d/p/fix-removal-of-dvr-src-mac-flows.patch,
    d/p/ovn-fix-inconsistent-igmp-configuration.patch: Removed after fix
    landed upstream.

 -- Corey Bryant Mon, 08 Mar 2021 13:00:21 -0500

** Changed in: neutron (Ubuntu Groovy)
       Status: Fix Committed => Fix Released
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
This bug was fixed in the package neutron - 2:16.3.0-0ubuntu3
---

neutron (2:16.3.0-0ubuntu3) focal; urgency=medium

  * d/p/revert-dvr-remove-control-plane-arp-updates.patch: Cherry-picked
    from https://review.opendev.org/c/openstack/neutron/+/777903 to
    prevent permanent arp entries that never get deleted (LP: #1916761).
  * d/p/improve-get-devices-with-ip-performance.patch: Performance of
    get_devices_with_ip is improved to limit the amount of information
    to be sent and reduce the number of syscalls. (LP: #1896734).

neutron (2:16.3.0-0ubuntu2) focal; urgency=medium

  * Backport fix for dvr-snat missing rfp interfaces (LP: #1894843)
    - d/p/0001-Fix-deletion-of-rfp-interfaces-when-router-is-re-ena.patch

neutron (2:16.3.0-0ubuntu1) focal; urgency=medium

  * d/watch: Add trailing slash to Neutron URL.
  * New stable point release for OpenStack Ussuri (LP: #1915786).
  * d/p/fix-removal-of-dvr-src-mac-flows.patch,
    d/p/ovn-fix-inconsistent-igmp-configuration.patch: Removed after
    patch landed upstream.

 -- Corey Bryant Mon, 08 Mar 2021 13:26:42 -0500

** Changed in: neutron (Ubuntu Focal)
       Status: Fix Committed => Fix Released
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Tags removed: verification-needed
** Tags added: verification-done
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Verified Focal Ussuri using [Test Plan]:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:16.3.0-0ubuntu3
  Candidate: 2:16.3.0-0ubuntu3
  Version table:
 *** 2:16.3.0-0ubuntu3 500
        500 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2:16.2.0-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
     2:16.0.0~b3~git2020041516.5f42488a9a-0ubuntu2 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal/main amd64 Packages

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Verified Focal Victoria (UCA) using [Test Plan]:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:17.1.0-0ubuntu3~cloud0
  Candidate: 2:17.1.0-0ubuntu3~cloud0
  Version table:
 *** 2:17.1.0-0ubuntu3~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu focal-proposed/victoria/main amd64 Packages
        100 /var/lib/dpkg/status
     2:16.2.0-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
     2:16.0.0~b3~git2020041516.5f42488a9a-0ubuntu2 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal/main amd64 Packages

** Tags removed: verification-victoria-needed
** Tags added: verification-victoria-done
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Description changed:

  [Impact]
  When there is a large amount of netdevs registered in the kernel and debug logging is enabled, neutron-openvswitch-agent and the privsep daemon spawned by it hang since the RPC call result sent by the privsep daemon over a unix socket exceeds the message sizes that the msgpack library can handle.

  The impact of this is that enabling debug logging on the cloud completely stalls neutron-openvswitch-agents and makes them "dead" from the Neutron server perspective.

  The issue is summarized in detail in comment #5 https://bugs.launchpad.net/oslo.privsep/+bug/1896734/comments/5

  [Test Plan]

- * deploy Openstack Train/Ussuri/Victoria
- * need at least one compute host
- * enable neutron debug logging
- * create a load of interfaces on your compute host to create a large 'ip addr show' output
- * for ((i=0;i<400;i++)); do ip tuntap add mode tap tap-`uuidgen| cut -c1-11`; done
- * create a single vm
- * add floating ip
- * ping fip
- * create 20 ports and attach them to the vm
- * for ((i=0;i<20;i++)); do id=`uuidgen`; openstack port create --network private --security-group ab698dc1-4b87-4b46-9c38-cd53e3ec7492 X-$id; openstack server add port X-$id; done
- * attaching ports should not result in errors
+ * deploy Openstack Train/Ussuri/Victoria
+ * need at least one compute host
+ * enable neutron debug logging
+ * create a load of interfaces on your compute host to create a large 'ip addr show' output
+ * for ((i=0;i<400;i++)); do ip tuntap add mode tap tap-`uuidgen| cut -c1-11`; done
+ * create a single vm
+ * add floating ip
+ * ping fip
+ * create 20 ports and attach them to the vm
+ * for ((i=0;i<20;i++)); do id=`uuidgen`; openstack port create --network private --security-group __SG__ X-$id; openstack server add port __VM__ X-$id; done
+ * attaching ports should not result in errors

  [Where problems could occur]

  No problems anticipated this patchset.
-
- When there is a large amount of netdevs registered in the kernel and debug logging is enabled, neutron-openvswitch-agent and the privsep daemon spawned by it hang since the RPC call result sent by the privsep daemon over a unix socket exceeds the message sizes that the msgpack library can handle.
+ When there is a large amount of netdevs registered in the kernel and
+ debug logging is enabled, neutron-openvswitch-agent and the privsep
+ daemon spawned by it hang since the RPC call result sent by the privsep
+ daemon over a unix socket exceeds the message sizes that the msgpack
+ library can handle.

  The impact of this is that enabling debug logging on the cloud completely stalls neutron-openvswitch-agents and makes them "dead" from the Neutron server perspective.

  The issue is summarized in detail in comment #5 https://bugs.launchpad.net/oslo.privsep/+bug/1896734/comments/5

  Old Description

  While trying to debug a different issue, I encountered a situation where privsep hangs in the process of handling a request from neutron-openvswitch-agent when debug logging is enabled (juju debug-log neutron-openvswitch=true):

  https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1895652/comments/11
  https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1895652/comments/12

  The issue gets reproduced reliably in the environment where I encountered it on all units. As a result, neutron-openvswitch-agent services hang while waiting for a response from the privsep daemon and do not progress past basic initialization. They never post any state back to the Neutron server and thus are marked dead by it.

  The processes though are shown as "active (running)" by systemd which adds to the confusion since they do indeed start from the systemd's perspective.
  systemctl --no-pager status neutron-openvswitch-agent.service
  ● neutron-openvswitch-agent.service - Openstack Neutron Open vSwitch Plugin Agent
     Loaded: loaded (/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2020-09-23 08:28:41 UTC; 25min ago
   Main PID: 247772 (/usr/bin/python)
      Tasks: 4 (limit: 9830)
     CGroup: /system.slice/neutron-openvswitch-agent.service
             ├─247772 /usr/bin/python3 /usr/bin/neutron-openvswitch-agent --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/openvswitch_…og
             └─248272 /usr/bin/python3 /usr/bin/privsep-helper --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini -…ck

  An strace shows that the privsep daemon tries to receive input from fd 3 which is the unix socket it uses to
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Verified Groovy using [Test Plan]:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:17.1.0-0ubuntu3
  Candidate: 2:17.1.0-0ubuntu3
  Version table:
 *** 2:17.1.0-0ubuntu3 500
        500 http://archive.ubuntu.com/ubuntu groovy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2:17.0.0-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu groovy-updates/main amd64 Packages
     2:17.0.0-0ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu groovy/main amd64 Packages

** Description changed:

+ [Impact]
+ 
  When there is a large amount of netdevs registered in the kernel and debug logging is enabled, neutron-openvswitch-agent and the privsep daemon spawned by it hang since the RPC call result sent by the privsep daemon over a unix socket exceeds the message sizes that the msgpack library can handle.
+ 
+ The impact of this is that enabling debug logging on the cloud
+ completely stalls neutron-openvswitch-agents and makes them "dead" from
+ the Neutron server perspective.
+ 
+ The issue is summarized in detail in comment #5
+ https://bugs.launchpad.net/oslo.privsep/+bug/1896734/comments/5
+ 
+ [Test Plan]
+ 
+ * deploy Openstack Train/Ussuri/Victoria
+ * need at least one compute host
+ * enable neutron debug logging
+ * create a load of interfaces on your compute host to create a large 'ip addr show' output
+ * for ((i=0;i<400;i++)); do ip tuntap add mode tap tap-`uuidgen| cut -c1-11`; done
+ * create a single vm
+ * add floating ip
+ * ping fip
+ * create 20 ports and attach them to the vm
+ * for ((i=0;i<20;i++)); do id=`uuidgen`; openstack port create --network private --security-group ab698dc1-4b87-4b46-9c38-cd53e3ec7492 X-$id; openstack server add port X-$id; done
+ * attaching ports should not result in errors
+ 
+ [Where problems could occur]
+ 
+ No problems are anticipated with this patchset.
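The tap-creation step of the test plan can be previewed as a dry run before touching a compute host. The sketch below is an illustration only: the helper names `tap_name` and `tuntap_commands` are made up for this example, and it builds the same `ip tuntap add mode tap tap-<11 chars of a UUID>` commands as the shell loop without executing them. It also checks that each generated name fits the 15-character limit Linux places on interface names.

```python
import uuid
from typing import List

# Linux interface names can hold at most 15 visible characters.
IFNAME_MAX = 15


def tap_name() -> str:
    """Mimic the test plan's tap-`uuidgen | cut -c1-11` naming (hypothetical helper)."""
    return "tap-" + str(uuid.uuid4())[:11]


def tuntap_commands(count: int = 400) -> List[List[str]]:
    """Build, but do not run, one `ip tuntap add` command per tap device."""
    commands = []
    for _ in range(count):
        name = tap_name()
        if len(name) > IFNAME_MAX:
            raise ValueError(f"interface name too long: {name}")
        commands.append(["ip", "tuntap", "add", "mode", "tap", name])
    return commands


if __name__ == "__main__":
    # Print a few commands for inspection; running them requires root.
    for argv in tuntap_commands(3):
        print(" ".join(argv))
```

Each command list could be passed to `subprocess.run(argv, check=True)` as root to actually create the devices; that step is left out here on purpose.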
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Hello Dmitrii, or anyone else affected,

Accepted neutron into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/neutron/2:16.3.0-0ubuntu3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

** Changed in: neutron (Ubuntu Focal)
       Status: Triaged => Fix Committed

** Tags added: verification-needed-focal
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Hello Dmitrii, or anyone else affected,

Accepted neutron into groovy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/neutron/2:17.1.0-0ubuntu3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-groovy to verification-done-groovy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-groovy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

** Changed in: neutron (Ubuntu Groovy)
       Status: Triaged => Fix Committed

** Tags added: verification-needed verification-needed-groovy
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
This bug was fixed in the package neutron - 2:17.1.0+git2021012815.0fb63f7297-0ubuntu5

---
neutron (2:17.1.0+git2021012815.0fb63f7297-0ubuntu5) hirsute; urgency=medium

  * d/p/improve-get-devices-with-ip-performance.patch: Performance of
    get_devices_with_ip is improved to limit the amount of information
    to be sent and reduce the number of syscalls. (LP: #1896734).

 -- Corey Bryant Tue, 09 Mar 2021 16:01:26 -0500

** Changed in: neutron (Ubuntu Hirsute)
       Status: Triaged => Fix Released
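The changelog entry says the patch limits "the amount of information to be sent". As a rough illustration of why that helps, the sketch below trims a verbose per-device record down to the fields a caller needs before it is serialized. Everything in it is an assumption made for the example: the record layout, the field names, and the use of JSON as a stand-in wire format are hypothetical and are not Neutron's or oslo.privsep's actual schema or code.

```python
import json
from typing import Dict, Iterable, List

# Fields the (hypothetical) caller actually needs from each device record.
WANTED = ("name", "addresses")


def fake_device(index: int) -> Dict:
    """Build a verbose, made-up device record; not Neutron's real schema."""
    return {
        "index": index,
        "name": f"tap{index:04d}",
        "addresses": [f"10.0.{index // 256}.{index % 256}/24"],
        "stats": {f"counter_{i}": 0 for i in range(40)},
        "flags": ["BROADCAST", "MULTICAST", "UP", "LOWER_UP"],
    }


def trim(device: Dict) -> Dict:
    """Keep only the wanted fields so the reply stays small."""
    return {key: device[key] for key in WANTED}


def payload_size(devices: Iterable[Dict]) -> int:
    """Size in bytes of the serialized reply (JSON used as a stand-in)."""
    return len(json.dumps(list(devices)).encode("utf-8"))


if __name__ == "__main__":
    devices: List[Dict] = [fake_device(i) for i in range(400)]
    full = payload_size(devices)
    trimmed = payload_size(trim(d) for d in devices)
    print(f"full reply: {full} bytes, trimmed reply: {trimmed} bytes")
```

Filtering on the sending side keeps the reply size roughly proportional to what the caller uses, rather than to everything known about every device.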
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Also affects: neutron (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: python-oslo.privsep (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Hirsute)
   Importance: Undecided
       Status: New

** Also affects: python-oslo.privsep (Ubuntu Hirsute)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Also affects: python-oslo.privsep (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Changed in: neutron (Ubuntu Focal)
   Importance: Undecided => Medium

** Changed in: neutron (Ubuntu Focal)
       Status: New => Triaged

** Changed in: neutron (Ubuntu Groovy)
   Importance: Undecided => Medium

** Changed in: neutron (Ubuntu Groovy)
       Status: New => Triaged

** Changed in: neutron (Ubuntu Hirsute)
   Importance: Undecided => Medium

** Changed in: neutron (Ubuntu Hirsute)
       Status: New => Triaged

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/victoria
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/ussuri
   Importance: Undecided => Medium

** Changed in: cloud-archive/ussuri
       Status: New => Triaged

** Changed in: cloud-archive/victoria
   Importance: Undecided => Medium

** Changed in: cloud-archive/victoria
       Status: New => Triaged
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Neutron fix is released upstream and backported all the way to Train.

** Changed in: neutron
       Status: New => Fix Released
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
Neutron patch:
https://review.opendev.org/q/I97ada62484023b9833ed12afd68eb4c8d337fd1f
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Also affects: neutron
   Importance: Undecided
       Status: New

** Changed in: neutron
   Importance: Undecided => Medium

** Changed in: neutron
     Assignee: (unassigned) => Rodolfo Alonso (rodolfo-alonso-hernandez)
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Tags added: seg
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
ip address show output on one of the affected nodes (for reference on how large it is): https://paste.ubuntu.com/p/cRf655Y8kt/

** Also affects: oslo.privsep
   Importance: Undecided
       Status: New

** Description changed:

+ When there is a large amount of netdevs registered in the kernel and
+ debug logging is enabled, neutron-openvswitch-agent and the privsep
+ daemon spawned by it hang since the RPC call result sent by the privsep
+ daemon over a unix socket exceeds the message sizes that the msgpack
+ library can handle.
+ 
+ The impact of this is that enabling debug logging on the cloud
+ completely stalls neutron-openvswitch-agents and makes them "dead" from
+ the Neutron server perspective.
+ 
+ The issue is summarized in detail in comment #5
+ https://bugs.launchpad.net/oslo.privsep/+bug/1896734/comments/5
+ 
+ 
+ Old Description
+ 
  While trying to debug a different issue, I encountered a situation
  where privsep hangs in the process of handling a request from
  neutron-openvswitch-agent when debug logging is enabled (juju debug-log
  neutron-openvswitch=true):

  https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1895652/comments/11
  https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1895652/comments/12

  The issue gets reproduced reliably in the environment where I
  encountered it on all units. As a result, neutron-openvswitch-agent
  services hang while waiting for a response from the privsep daemon and
  do not progress past basic initialization. They never post any state
  back to the Neutron server and thus are marked dead by it.

  The processes though are shown as "active (running)" by systemd which
  adds to the confusion since they do indeed start from systemd's
  perspective.

  systemctl --no-pager status neutron-openvswitch-agent.service
  ● neutron-openvswitch-agent.service - Openstack Neutron Open vSwitch Plugin Agent
     Loaded: loaded (/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2020-09-23 08:28:41 UTC; 25min ago
   Main PID: 247772 (/usr/bin/python)
      Tasks: 4 (limit: 9830)
     CGroup: /system.slice/neutron-openvswitch-agent.service
             ├─247772 /usr/bin/python3 /usr/bin/neutron-openvswitch-agent --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/openvswitch_…og
             └─248272 /usr/bin/python3 /usr/bin/privsep-helper --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini -…ck

  An strace shows that the privsep daemon tries to receive input from fd
  3 which is the unix socket it uses to communicate with the client.
  However, this is just one thread out of many spawned by the privsep
  daemon so it is unlikely to be the root cause (there are 65 threads
  there in total, see https://paste.ubuntu.com/p/fbGvN2P8rP/)

  # there is one extra neutron-openvswitch-agent running in a LXD container which can be ignored here (there is an octavia unit on the node which has a neutron-openvswitch subordinate)
  root@node2:~# ps -eo pid,user,args --sort user | grep -P 'privsep.*openvswitch'
  860690 10 /usr/bin/python3 /usr/bin/privsep-helper --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmp910qakfk/privsep.sock
  248272 root /usr/bin/python3 /usr/bin/privsep-helper --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmpcmwn7vom/privsep.sock
  363905 root grep --color=auto -P privsep.*openvswitch

  root@node2:~# strace -f -p 248453 2>&1
  [pid 248786] futex(0x7f6a6401c1d0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0x
  [pid 248475] futex(0x7f6a6c024590, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0x
  [pid 248473] futex(0x7f6a746d9fd0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0x
  [pid 248453] recvfrom(3,

  root@node2:~# lsof -p 248453 | grep 3u
  privsep-h 248453 root    3u  unix 0x8e6d8abdec00      0t0 356522977 type=STREAM

  root@node2:~# ss -pax | grep 356522977
  u_str ESTAB 0 0 /tmp/tmp2afa3enn/privsep.sock 356522978 * 356522977 users:(("/usr/bin/python",pid=247567,fd=16))
  u_str ESTAB 0 0 * 356522977 * 356522978
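The symptoms above (one side blocked in recvfrom on the unix socket, the agent waiting for a reply that never arrives) are consistent with a reply that the receiving side refuses to decode. The toy below shows one way that can turn into a hang; it is a sketch under stated assumptions and not oslo.privsep or msgpack code. The length-prefixed framing, the `LIMIT` value (scaled down so the demo fits in a socket buffer), and the names `read_frame` and `call` are all hypothetical. A reader thread rejects an oversized frame, nothing is handed to the caller, and the caller only notices because the demo uses a timeout.

```python
import queue
import socket
import struct
import threading

# Scaled-down, hypothetical stand-in for a decoder's maximum message size.
LIMIT = 1024


def read_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes from a stream socket."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the socket")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def read_frame(sock: socket.socket, limit: int = LIMIT) -> bytes:
    """Read one length-prefixed frame, refusing frames larger than `limit`."""
    (size,) = struct.unpack("!I", read_exact(sock, 4))
    if size > limit:
        raise ValueError(f"{size} exceeds limit({limit})")
    return read_exact(sock, size)


def call(reply_size: int, timeout: float = 0.5) -> str:
    """Simulate one RPC whose reply is `reply_size` bytes long."""
    client, daemon = socket.socketpair()
    replies: queue.Queue = queue.Queue()

    def reader() -> None:
        # The decode error stays inside this thread: nothing is queued,
        # so the caller below is left waiting for a reply.
        try:
            replies.put(read_frame(client))
        except ValueError:
            pass

    with client, daemon:
        thread = threading.Thread(target=reader, daemon=True)
        thread.start()
        daemon.sendall(struct.pack("!I", reply_size) + b"x" * reply_size)
        try:
            replies.get(timeout=timeout)
            outcome = "reply received"
        except queue.Empty:
            # Without the timeout this wait would never end.
            outcome = "caller still waiting"
        thread.join(timeout=1.0)
    return outcome


if __name__ == "__main__":
    print(call(100))    # reply fits under the limit
    print(call(2048))   # reply exceeds the limit
```

In this toy, raising the limit or shrinking the reply both make the second call succeed, which mirrors the two fixes discussed in this bug: a larger deserializer buffer in oslo.privsep and a smaller reply from Neutron.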