ovs-vswitchd is multi-threaded. ovsdb-server is single-threaded. (You did not answer my question about which file the logs in your email came from.)
Who is at 127.0.0.1:45928 and 127.0.0.1:45930?

On Thu, 27 Sep 2018 at 11:14, Jean-Philippe Méthot <[email protected]> wrote:

> Thank you for your reply.
>
> This is Openstack with ml2 plugin. There’s no other 3rd party application
> used with our network, so no OVN or anything of the sort. Essentially, to
> give a quick idea of the topology, we have our vms on our compute nodes
> going through GRE tunnels toward network nodes where they are routed in
> network namespace toward a flat external network.
>
> Generally, the above indicates that a daemon fronting an Open vSwitch
> database hasn't been able to connect to its client. Usually happens when
> CPU consumption is very high.
>
> Our network nodes' CPUs are literally sleeping. Is openvswitch single-thread
> or multi-thread though? If ovs overloaded a single thread, it’s possible I
> may have missed it.
>
> Jean-Philippe Méthot
> Openstack system administrator
> Administrateur système Openstack
> PlanetHoster inc.
>
> On 27 Sep 2018 at 14:04, Guru Shetty <[email protected]> wrote:
>
> On Wed, 26 Sep 2018 at 12:59, Jean-Philippe Méthot via discuss
> <[email protected]> wrote:
>
>> Hi,
>>
>> I’ve been using openvswitch for my networking backend on openstack for
>> several years now. Lately, as our network has grown, we’ve started noticing
>> some intermittent packet drop accompanied with the following error message
>> in openvswitch:
>>
>> 2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: no response to inactivity probe after 5 seconds, disconnecting
>
> Open vSwitch is a project with multiple daemons. Since you are using
> OpenStack, it is not clear from your message, what type of networking
> plugin you are using. Do you use OVN?
> Also, you did not mention from which file you have gotten the above errors.
>
> Generally, the above indicates that a daemon fronting an Open vSwitch
> database hasn't been able to connect to its client. Usually happens when
> CPU consumption is very high.
>
>> 2018-09-26T04:15:30.409Z|00007|reconnect|ERR|tcp:127.0.0.1:45874: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:15:33.661Z|00008|reconnect|ERR|tcp:127.0.0.1:45934: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:15:33.847Z|00009|reconnect|ERR|tcp:127.0.0.1:45894: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:03.247Z|00010|reconnect|ERR|tcp:127.0.0.1:45958: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:21.534Z|00011|reconnect|ERR|tcp:127.0.0.1:45956: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:21.786Z|00012|reconnect|ERR|tcp:127.0.0.1:45974: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:47.085Z|00013|reconnect|ERR|tcp:127.0.0.1:45988: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:49.618Z|00014|reconnect|ERR|tcp:127.0.0.1:45982: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:53.321Z|00015|reconnect|ERR|tcp:127.0.0.1:45964: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:17:15.543Z|00016|reconnect|ERR|tcp:127.0.0.1:45986: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:17:24.767Z|00017|reconnect|ERR|tcp:127.0.0.1:45990: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:17:31.735Z|00018|reconnect|ERR|tcp:127.0.0.1:45998: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:20:12.593Z|00019|reconnect|ERR|tcp:127.0.0.1:46014: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:23:51.996Z|00020|reconnect|ERR|tcp:127.0.0.1:46028: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:25:12.187Z|00021|reconnect|ERR|tcp:127.0.0.1:46022: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:25:28.871Z|00022|reconnect|ERR|tcp:127.0.0.1:46056: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:27:11.663Z|00023|reconnect|ERR|tcp:127.0.0.1:46046: no response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: receive error: Connection reset by peer
>> 2018-09-26T04:29:56.161Z|00025|reconnect|WARN|tcp:127.0.0.1:46018: connection dropped (Connection reset by peer)
>>
>> This definitely kills the connection for a few seconds before it
>> reconnects. So, I’ve been wondering, what is this probe and what is really
>> happening here? What’s the cause and is there a way to fix this?
>>
>> Openvswitch version is 2.9.0-3 on CentOS 7 with Openstack Pike running on
>> it (but the issues show up on Queens too).
>>
>> Jean-Philippe Méthot
>> Openstack system administrator
>> Administrateur système Openstack
>> PlanetHoster inc.
>>
>> _______________________________________________
>> discuss mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
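
Since the peer port is different on almost every line of the quoted log, it may help to tally the probe timeouts per peer before trying to work out who each client is. Below is a minimal Python sketch of that. It assumes the `timestamp|sequence|module|level|message` layout seen in the quoted lines. The names `LOG_LINE`, `PROBE_TIMEOUT` and `probe_timeouts` are made up for this illustration and are not part of Open vSwitch.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the quoted log lines:
# timestamp|sequence|module|level|message
LOG_LINE = re.compile(
    r"^(?P<ts>[^|]+)\|(?P<seq>\d+)\|(?P<module>[^|]+)\|(?P<level>[^|]+)\|(?P<msg>.*)$"
)

# Message text of the inactivity-probe timeout, as it appears in the thread.
PROBE_TIMEOUT = re.compile(
    r"^(?P<peer>tcp:[\d.]+:\d+): no response to inactivity probe "
    r"after (?P<secs>\d+) seconds, disconnecting$"
)


def probe_timeouts(lines):
    """Return (timestamp, peer, seconds) for each probe-timeout line."""
    events = []
    for line in lines:
        entry = LOG_LINE.match(line.strip())
        # Skip anything that is not a 'reconnect' module message.
        if entry is None or entry.group("module") != "reconnect":
            continue
        timeout = PROBE_TIMEOUT.match(entry.group("msg"))
        if timeout:
            events.append(
                (entry.group("ts"), timeout.group("peer"), int(timeout.group("secs")))
            )
    return events


# Three lines copied from the quoted log; the last one is not a probe timeout.
sample = [
    "2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: "
    "receive error: Connection reset by peer",
]

events = probe_timeouts(sample)
peers = Counter(peer for _, peer, _ in events)
for peer, count in sorted(peers.items()):
    print(peer, count)
```

This only summarizes what the log already says. It does not tell you which process owns each port; that has to be checked on the host while the connection is still open.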
