[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
*** This bug is a duplicate of bug 1839592 ***
    https://bugs.launchpad.net/bugs/1839592

I think that's a fairly safe assumption @lathiat so I'm going to dupe this bug against that one.

** This bug has been marked a duplicate of bug 1839592
   Open vSwitch (Version 2.9.2) goes into deadlocked state

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1827264

Title:
  ovs-vswitchd thread consuming 100% CPU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1827264/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
There seems to be a good chance that at least some of the reports in this bug are duplicates of Bug #1839592, which is essentially a libc6 bug that meant threads weren't woken up when they should have been. It was fixed by the libc6 upgrade to 2.27-3ubuntu1.3 in bionic.
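For anyone wanting to check a node against that version, here is a minimal sketch. It only compares version strings with `dpkg --compare-versions`; the function name `libc6_has_fix` is made up for illustration, and a newer version number is assumed (not verified here) to carry the fix.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given libc6 version string is at or
# above 2.27-3ubuntu1.3, the bionic version named in this thread.
# This is a pure version-ordering check, nothing more.
libc6_has_fix() {
    dpkg --compare-versions "$1" ge "2.27-3ubuntu1.3"
}
```

On a node it could be called as `libc6_has_fix "$(dpkg-query -W -f='${Version}\n' libc6 | head -n1)" && echo ok`. ovs-vswitchd presumably also needs a restart after the upgrade so that it loads the new library.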
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
** Information type changed from Public Security to Public
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
yes,

dpkg -l | grep libc6
ii  libc6:amd64      2.29-0ubuntu2  amd64  GNU C Library: Shared libraries
ii  libc6-dev:amd64  2.29-0ubuntu2  amd64  GNU C Library: Development Libraries and Header Files
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
libc6?? ubuntu 1904 source?
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
We managed to solve the problem by installing the libc version from the ubuntu disco repository.
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
** Information type changed from Public to Public Security
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
I know this has been open for a year now, but for what it's worth, we're seeing the same issue on two of our three Neutron nodes. A whole lot of entries in /var/log/openvswitch/ovs-vswitchd.log like the following:

2020-05-22T00:19:25.581Z|18410|poll_loop(handler74)|INFO|Dropped 2973816 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
2020-05-22T00:19:25.581Z|18411|poll_loop(handler74)|INFO|wakeup due to [POLLIN] on fd 35 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (98% CPU usage)

We're running Neutron in Rocky on Ubuntu 18.04:

# uname -r
4.15.0-99-generic
# dpkg -l | grep openvswitch-
ii  neutron-openvswitch-agent  2:13.0.6-0ubuntu1~cloud0  all    Neutron is a virtual network service for Openstack - Open vSwitch plugin agent
ii  openvswitch-common         2.10.0-0ubuntu2~cloud0    amd64  Open vSwitch common components
ii  openvswitch-switch         2.10.0-0ubuntu2~cloud0    amd64  Open vSwitch switch implementations

Looking at the ovs-vswitchd process itself:

# timeout 10 strace -c -p 1227
strace: Process 1227 attached
strace: Process 1227 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.28    9.059391       28852       314           poll
  3.03    0.284833      284833         1           restart_syscall
  0.28    0.026164          13      1968           sendmsg
  0.27    0.025726          67       385       208 recvfrom
  0.07    0.006302           3      2220       252 recvmsg
  0.02    0.002327           7       315        69 read
  0.02    0.002157           2       930       171 futex
  0.02    0.002075           5       378       378 accept
  0.01    0.000532           2       316           getrusage
  0.00    0.000238          14        17           sendto
  0.00    0.000034          11         3           write
------ ----------- ----------- --------- --------- ----------------
100.00    9.409779                  6847      1078 total
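To see which handler threads are producing these poll_loop messages and how many were rate-limited away, something like the following sketch may help. It assumes the `timestamp|seq|module(thread)|level|message` layout seen in the log lines in this thread; the function name `summarize_poll_loop` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: tally poll_loop "wakeup" lines and the counts from
# "Dropped N log messages" lines per thread.
# Reads ovs-vswitchd.log lines on stdin; prints "<thread> <wakeups> <dropped>".
summarize_poll_loop() {
    awk -F'|' '
        $3 ~ /^poll_loop\(/ {
            # Field 3 looks like "poll_loop(handler74)"; keep the thread name.
            thread = $3
            sub(/^poll_loop\(/, "", thread)
            sub(/\)$/, "", thread)
            seen[thread] = 1
            if ($5 ~ /^wakeup due to/) {
                wakeups[thread]++
            } else if ($5 ~ /^Dropped [0-9]+ log messages/) {
                # Second word of the message is the suppressed-message count.
                split($5, words, " ")
                dropped[thread] += words[2]
            }
        }
        END {
            for (t in seen)
                printf "%s %d %d\n", t, wakeups[t], dropped[t]
        }
    ' | sort
}
```

Usage would be `summarize_poll_loop < /var/log/openvswitch/ovs-vswitchd.log`. A thread with millions of dropped messages is most likely one of the spinning handlers.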
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
We are also seeing this in bionic with rocky neutron-openvswitch-agents. We haven't really found any pattern in when it occurs, but even after emptying out a compute node with live migrations and rebooting it, we keep seeing it. The amount of traffic doesn't seem to be related.

What we are seeing is an increasing number of handler threads using 100% CPU, which in our setup can mean several 1000% in total being used by ovs-vswitchd.

While investigating, we found that the handler threads seemed to spend quite a lot of time in native_queued_spin_lock_slowpath, which made us believe there might be too many threads competing for the same locks. So we tried lowering other_config:n-handler-threads, which at first seemed to fix the issue. It appears to have been only a temporary fix, though, as we still see the issue. It may just be that modifying other_config:n-handler-threads kills the handlers that were stuck at 100% CPU and spins up fresh ones.

I find it strange, however, that a rebooted node doesn't see the same, which makes me think there is something going on during startup of either OVS or openstack-neutron-agent that in some circumstances causes this.

Investigation is ongoing; we just wanted to share our observations so far.
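For reference, a rough sketch of how the busy handler threads can be picked out of per-thread `ps` output. The function name `busy_threads` and the default threshold of 90 are made up for illustration, and it assumes `ps -L` reports OVS thread names such as handler68 in the comm column.

```shell
#!/bin/bash
# Hypothetical helper: print threads whose CPU share is at or above a
# threshold (default 90). Expects "TID %CPU NAME" rows on stdin, e.g. from:
#   ps -L -p "$(cat /var/run/openvswitch/ovs-vswitchd.pid)" -o tid=,pcpu=,comm=
busy_threads() {
    local threshold="${1:-90}"
    # "+ 0" forces a numeric comparison in awk.
    awk -v min="$threshold" '$2 + 0 >= min + 0 { print $1, $2, $3 }'
}

# The setting discussed above can be read and changed with ovs-vsctl
# (the value 8 is only an example):
#   ovs-vsctl get Open_vSwitch . other_config:n-handler-threads
#   ovs-vsctl set Open_vSwitch . other_config:n-handler-threads=8
```

Note that the %CPU reported by `ps` is averaged over the thread's lifetime, so a handler that only recently started spinning may still look idle there; `top -H -p <pid>` gives a current view.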
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
We hit this problem a couple of times a week and have found a pretty non-trivial way to fix it quickly.

In crontab: watchdog_openvswitch.sh

#!/bin/bash
# If ovs-appctl does not answer within 10 seconds, assume ovs-vswitchd is stuck.
if ! timeout 10 ovs-appctl version &>/dev/null; then
    echo "run strace to fix openvswitch"
    # Briefly attaching strace to all threads is what unsticks it for us.
    timeout 5 strace -f -p "$(cat /var/run/openvswitch/ovs-vswitchd.pid)" &>/dev/null
fi
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
Same on bionic with stein and ovs 2.11.0.
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
A bit of follow up on this: strace on the thread in question shows the following:

13:35:47 poll([{fd=23, events=POLLIN}], 1, 0) = 0 (Timeout) <0.000018>
13:35:47 epoll_wait(42, [{EPOLLIN, {u32=3, u64=3}}], 9, 0) = 1 <0.000018>
13:35:47 recvmsg(417, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000018>
13:35:47 poll([{fd=11, events=POLLIN}, {fd=42, events=POLLIN}, {fd=23, events=POLLIN}], 3, 2147483647) = 1 ([{fd=42, revents=POLLIN}]) <0.000019>
13:35:47 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=490842, tv_usec=749026}, ru_stime={tv_sec=710657, tv_usec=442946}, ...}) = 0 <0.000018>
13:35:47 poll([{fd=23, events=POLLIN}], 1, 0) = 0 (Timeout) <0.000018>
13:35:47 epoll_wait(42, [{EPOLLIN, {u32=3, u64=3}}], 9, 0) = 1 <0.000019>
13:35:47 recvmsg(417, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000018>
13:35:47 poll([{fd=11, events=POLLIN}, {fd=42, events=POLLIN}, {fd=23, events=POLLIN}], 3, 2147483647) = 1 ([{fd=42, revents=POLLIN}]) <0.000019>
13:35:47 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=490842, tv_usec=749108}, ru_stime={tv_sec=710657, tv_usec=442946}, ...}) = 0 <0.000017>

And if I strace with -c to collect a summary, after 4-5 seconds it shows the following:

sudo strace -c -p 1658344
strace: Process 1658344 attached
^Cstrace: Process 1658344 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         9           write
  0.00    0.000000           0     56397           poll
  0.00    0.000000           0     28198     28198 recvmsg
  0.00    0.000000           0     28199           getrusage
  0.00    0.000000           0       148        56 futex
  0.00    0.000000           0     28199           epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                141150     28254 total
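The pattern in the trace (epoll reports the fd as readable, recvmsg then fails with EAGAIN, and the loop repeats) can be counted from a saved trace with a sketch like this. It assumes one `HH:MM:SS name(args) = result` record per line, as in the output above; the function name `tally_strace` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count calls and EAGAIN failures per syscall in
# timestamped strace output read from stdin.
tally_strace() {
    awk '
        {
            # Second field starts with the syscall name, e.g. "poll([{fd=23,".
            name = $2
            sub(/\(.*/, "", name)
            calls[name]++
            if ($0 ~ /= -1 EAGAIN/) eagain[name]++
        }
        END {
            for (n in calls)
                printf "%s calls=%d eagain=%d\n", n, calls[n], eagain[n]
        }
    ' | sort
}
```

If nearly every recvmsg shows up as EAGAIN while poll and epoll_wait keep returning immediately, that would be consistent with the handler spinning on a readiness event that never yields any data.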
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
Status changed to 'Confirmed' because the bug affects multiple users. ** Changed in: openvswitch (Ubuntu) Status: New => Confirmed
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
I have the same issue with high CPU usage on a network node.

Openstack Rocky
Ubuntu 18.04
Kernel 4.15.0-48-generic #51-Ubuntu

Packages:
ii  neutron-openvswitch-agent  2:13.0.2-0ubuntu3.1~cloud0  all    Neutron is a virtual network service for Openstack - Open vSwitch plugin agent
ii  openvswitch-common         2.10.0-0ubuntu2~cloud0      amd64  Open vSwitch common components
ii  openvswitch-switch         2.10.0-0ubuntu2~cloud0      amd64  Open vSwitch switch implementations

The system has not been rebooted. From ovs-vswitchd.log:

2019-05-08T15:30:13.280Z|00726|connmgr|INFO|br-tun<->tcp:127.0.0.1:6633: 12 flow_mods in the 9 s starting 10 s ago (9 adds, 3 deletes)
2019-05-08T15:30:14.232Z|00727|bridge|INFO|bridge br-tun: added interface vxlan-0ac84a99 on port 55
2019-05-08T15:30:17.456Z|00002|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00003|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00004|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00005|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00006|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00007|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00008|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00009|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00010|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:17.456Z|00011|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)
2019-05-08T15:30:23.456Z|00012|poll_loop(handler68)|INFO|Dropped 2841636 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
2019-05-08T15:30:23.456Z|00013|poll_loop(handler68)|INFO|wakeup due to [POLLIN] on fd 29 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)