Felix,

Thanks for the advice. I'm looking into the relay and came across this:
https://www.openvswitch.org/support/ovscon2019/day1/1436-OVSCON-Nouman.pdf
which was very useful.

Gav

On Mon, 1 May 2023 at 23:58, Felix Hüttner <felix.huettner@mail.schwarz> wrote:

> Hi Gavin,
>
> we saw similar issues after reaching a certain number of hypervisors. This
> happened because our ovsdb processes ran at 100% CPU utilization (and they
> are not multithreaded).
>
> Our solutions were:
>
> 1. If you use SSL on your north-/southbound DB, disable it and add a
>    TLS-terminating reverse proxy (like traefik) in front.
> 2. Increase the inactivity probe significantly (you might need to
>    change it on the ovn-controller and ovsdb side, not sure anymore).
> 3. Introduce ovsdb relays and connect the ovn-controllers there.
>
> --
> Felix Huettner
>
> *From:* discuss <ovs-discuss-boun...@openvswitch.org> *On Behalf Of* Gavin McKee via discuss
> *Sent:* Monday, May 1, 2023 9:20 PM
> *To:* ovs-discuss <ovs-discuss@openvswitch.org>
> *Subject:* [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db unstable
>
> Hi,
>
> I'm having a pretty bad issue with OVN controller on the hypervisors being
> unable to connect to the OVS SB DB:
>
> 2023-05-01T19:13:33.969Z|00541|reconnect|ERR|tcp:10.193.1.2:6642: no response to inactivity probe after 5 seconds, disconnecting
> 2023-05-01T19:13:33.969Z|00542|reconnect|INFO|tcp:10.193.1.2:6642: connection dropped
> 2023-05-01T19:13:43.043Z|00543|reconnect|INFO|tcp:10.193.1.2:6642: connected
> 2023-05-01T19:13:56.115Z|00544|reconnect|ERR|tcp:10.193.1.2:6642: no response to inactivity probe after 5 seconds, disconnecting
> 2023-05-01T19:13:56.115Z|00545|reconnect|INFO|tcp:10.193.1.2:6642: connection dropped
> 2023-05-01T19:14:36.177Z|00546|reconnect|INFO|tcp:10.193.1.2:6642: connected
> 2023-05-01T19:14:44.996Z|00547|jsonrpc|WARN|tcp:10.193.1.2:6642: receive error: Connection reset by peer
> 2023-05-01T19:14:44.996Z|00548|reconnect|WARN|tcp:10.193.1.2:6642: connection dropped (Connection reset by peer)
> 2023-05-01T19:15:44.131Z|00549|reconnect|INFO|tcp:10.193.1.2:6642: connected
> 2023-05-01T19:15:54.137Z|00550|reconnect|ERR|tcp:10.193.1.2:6642: no response to inactivity probe after 5 seconds, disconnecting
> 2023-05-01T19:15:54.137Z|00551|reconnect|INFO|tcp:10.193.1.2:6642: connection dropped
> 2023-05-01T19:16:02.184Z|00552|reconnect|INFO|tcp:10.193.1.2:6642: connected
> 2023-05-01T19:16:14.488Z|00553|reconnect|ERR|tcp:10.193.1.2:6642: no response to inactivity probe after 5 seconds, disconnecting
> 2023-05-01T19:16:14.488Z|00554|reconnect|INFO|tcp:10.193.1.2:6642: connection dropped
>
> This happened after pushing a configuration to the northbound DB for around 250
> logical switch ports.
>
> Once I turn on the VMs, everything goes bad very quickly:
>
> 2023-05-01T04:27:09.294Z|01947|poll_loop|INFO|wakeup due to [POLLOUT] on fd 66 (10.193.200.6:6642<->10.193.0.102:48794) at ../lib/stream-fd.c:153 (100% CPU usage)
>
> Can anyone provide any guidance on how to run down an issue like this?
>
> This e-mail may contain confidential content and is intended only for
> use by the intended recipient. If you are not the intended recipient,
> please inform the sender immediately and delete this e-mail. Information
> on data protection can be found here <https://www.datenschutz.schwarz/>.
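
For anyone running down a similar issue, here is a minimal sketch (not from the thread) of how the reconnect lines quoted above could be tallied to see how often the southbound connection flaps and how long each session survives. It assumes the `timestamp|sequence|module|level|message` layout seen in the quoted ovn-controller log; the names `parse_line` and `summarize` are hypothetical, and the sample is the first eight quoted log lines.

```python
from datetime import datetime

# First eight log lines from the quoted message, with the mail wrapping removed.
SAMPLE_LOG = """\
2023-05-01T19:13:33.969Z|00541|reconnect|ERR|tcp:10.193.1.2:6642: no response to inactivity probe after 5 seconds, disconnecting
2023-05-01T19:13:33.969Z|00542|reconnect|INFO|tcp:10.193.1.2:6642: connection dropped
2023-05-01T19:13:43.043Z|00543|reconnect|INFO|tcp:10.193.1.2:6642: connected
2023-05-01T19:13:56.115Z|00544|reconnect|ERR|tcp:10.193.1.2:6642: no response to inactivity probe after 5 seconds, disconnecting
2023-05-01T19:13:56.115Z|00545|reconnect|INFO|tcp:10.193.1.2:6642: connection dropped
2023-05-01T19:14:36.177Z|00546|reconnect|INFO|tcp:10.193.1.2:6642: connected
2023-05-01T19:14:44.996Z|00547|jsonrpc|WARN|tcp:10.193.1.2:6642: receive error: Connection reset by peer
2023-05-01T19:14:44.996Z|00548|reconnect|WARN|tcp:10.193.1.2:6642: connection dropped (Connection reset by peer)
"""


def parse_line(line):
    """Split one log line into (timestamp, module, level, message), or None."""
    parts = line.strip().split("|", 4)
    if len(parts) != 5:
        return None
    stamp, _seq, module, level, message = parts
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return when, module, level, message


def summarize(log_text):
    """Count probe timeouts and drops, and record how long each session lasted."""
    probe_timeouts = 0
    drops = 0
    uptimes = []          # seconds between "connected" and the next drop
    connected_at = None   # timestamp of the most recent "connected" line
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, module, _level, message = parsed
        if module != "reconnect":
            continue
        if "no response to inactivity probe" in message:
            probe_timeouts += 1
        elif message.endswith(": connected"):
            connected_at = when
        elif "connection dropped" in message:
            drops += 1
            if connected_at is not None:
                uptimes.append((when - connected_at).total_seconds())
                connected_at = None
    return {"probe_timeouts": probe_timeouts, "drops": drops, "uptimes": uptimes}


if __name__ == "__main__":
    print(summarize(SAMPLE_LOG))
```

If the sessions consistently end a few seconds after connecting, that would be consistent with Felix's point that the ovsdb process is too busy to answer the probe in time, but the numbers alone do not prove it.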