This specific error is triggered as follows. When a client connects to the 
OVSDB JSON-RPC server, it has to follow a certain protocol. In this case, the 
server sends inactivity probes (echo requests), and the client must acknowledge 
each one by echoing its contents back to the server. If a client does not do 
that in time, the server drops the client.
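
A minimal sketch of the acknowledgement described above, assuming the probe is 
the JSON-RPC "echo" request defined for OVSDB in RFC 7047, where the reply 
returns the request's params as its result under the same id. The function 
name echo_reply is hypothetical and is not taken from the OpenStack driver or 
from the OVS sources; a real client would also need to read and write these 
messages on its socket within the probe interval.

```python
import json
from typing import Optional


def echo_reply(raw: str) -> Optional[str]:
    """Build the reply to an OVSDB "echo" probe.

    Returns the serialized reply, or None if the message is not an
    echo request (hypothetical helper, for illustration only).
    """
    msg = json.loads(raw)
    # A request carries a "method" and a non-null "id".
    if msg.get("method") != "echo" or msg.get("id") is None:
        return None
    # The reply echoes the params back as the result, with the same id.
    return json.dumps({
        "id": msg["id"],
        "result": msg.get("params", []),
        "error": None,
    })


probe = '{"method": "echo", "params": [], "id": "echo"}'
reply = echo_reply(probe)
```

If the client is busy and never gets around to sending such a reply before the 
probe timeout expires, the server logs the "no response to inactivity probe" 
error and closes the connection.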

So something in this OpenStack driver is broken, because it does not respond to 
server probes.

Best Regards,
Paul Greenberg

________________________________
From: 20230277700n behalf of
Sent: Thursday, September 27, 2018 3:40 PM
To: [email protected]
Cc: [email protected]
Subject: Re: [ovs-discuss] "ovs|01253|reconnect|ERR|tcp:127.0.0.1:50814: no 
response to inactivity probe after 5.01 seconds, disconnecting" messages and 
lost packets


ovs-vswitchd is multi-threaded. ovsdb-server is single-threaded.
(In your email, you did not answer my question about which file the logs were 
printed from.)

Who is at 127.0.0.1:45928 and 127.0.0.1:45930?

On Thu, 27 Sep 2018 at 11:14, Jean-Philippe Méthot 
<[email protected]> wrote:
Thank you for your reply.

This is OpenStack with the ML2 plugin. No other third-party application is used 
with our network, so no OVN or anything of the sort. Essentially, to give a 
quick idea of the topology, we have our VMs on our compute nodes going through 
GRE tunnels toward network nodes, where they are routed in network namespaces 
toward a flat external network.

Generally, the above indicates that a daemon fronting an Open vSwitch database 
hasn't been able to connect to its client. This usually happens when CPU 
consumption is very high.

Our network nodes' CPUs are practically idle. Is Open vSwitch single-threaded 
or multi-threaded, though? If OVS overloaded a single thread, it's possible I 
may have missed it.

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




On 27 Sep 2018 at 14:04, Guru Shetty <[email protected]> wrote:



On Wed, 26 Sep 2018 at 12:59, Jean-Philippe Méthot via discuss 
<[email protected]> wrote:
Hi,

I’ve been using Open vSwitch as my networking backend on OpenStack for several 
years now. Lately, as our network has grown, we’ve started noticing some 
intermittent packet drops accompanied by the following error messages in 
Open vSwitch:

2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: no response to inactivity probe after 5 seconds, disconnecting

Open vSwitch is a project with multiple daemons. Since you are using OpenStack, 
it is not clear from your message what type of networking plugin you are 
using. Do you use OVN?
Also, you did not mention which file you got the above errors from.

Generally, the above indicates that a daemon fronting an Open vSwitch database 
hasn't been able to connect to its client. This usually happens when CPU 
consumption is very high.


2018-09-26T04:15:30.409Z|00007|reconnect|ERR|tcp:127.0.0.1:45874: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:15:33.661Z|00008|reconnect|ERR|tcp:127.0.0.1:45934: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:15:33.847Z|00009|reconnect|ERR|tcp:127.0.0.1:45894: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:16:03.247Z|00010|reconnect|ERR|tcp:127.0.0.1:45958: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:16:21.534Z|00011|reconnect|ERR|tcp:127.0.0.1:45956: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:16:21.786Z|00012|reconnect|ERR|tcp:127.0.0.1:45974: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:16:47.085Z|00013|reconnect|ERR|tcp:127.0.0.1:45988: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:16:49.618Z|00014|reconnect|ERR|tcp:127.0.0.1:45982: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:16:53.321Z|00015|reconnect|ERR|tcp:127.0.0.1:45964: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:17:15.543Z|00016|reconnect|ERR|tcp:127.0.0.1:45986: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:17:24.767Z|00017|reconnect|ERR|tcp:127.0.0.1:45990: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:17:31.735Z|00018|reconnect|ERR|tcp:127.0.0.1:45998: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:20:12.593Z|00019|reconnect|ERR|tcp:127.0.0.1:46014: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:23:51.996Z|00020|reconnect|ERR|tcp:127.0.0.1:46028: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:25:12.187Z|00021|reconnect|ERR|tcp:127.0.0.1:46022: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:25:28.871Z|00022|reconnect|ERR|tcp:127.0.0.1:46056: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:27:11.663Z|00023|reconnect|ERR|tcp:127.0.0.1:46046: no response to inactivity probe after 5 seconds, disconnecting
2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: receive error: Connection reset by peer
2018-09-26T04:29:56.161Z|00025|reconnect|WARN|tcp:127.0.0.1:46018: connection dropped (Connection reset by peer)

This definitely kills the connection for a few seconds before it reconnects. 
So, I’ve been wondering, what is this probe and what is really happening here? 
What’s the cause and is there a way to fix this?

Openvswitch version is 2.9.0-3 on CentOS 7 with Openstack Pike running on it 
(but the issues show up on Queens too).


Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
