Hello all,
In the OVN Kubernetes project, we are seeing a ton of following errors in
ovsdb-server-nb.log and ovsdb-server-sb.log.
---------8<------------8<------
2020-07-07T06:30:53.320Z|20733|jsonrpc|WARN|unix#3145: send error: Broken
pipe
2020-07-07T06:30:53.320Z|20734|reconnect|WARN|unix#3145: connection dropped
(Broken pipe)
---------8<------------8<------
(Note: The N in uniX#N used for unbound unix sockets is monotonically
increasing number. So, with unix#3145 there are more than 3000 such errors
in the log file)
I am going to be vague in explaining why it might be happening, so sincere
apologies for that. So, this is what we are seeing and how to reproduce
the issue.
1. Unhandled reply case
========================
(a) Client
~~~~~~~~~~~
$ ovn-sbctl -vjsonrpc -vstream --no-leader-only
--db=unix:/var/run/openvswitch/ovnsb_db.sock get SB_Global .
options:e2e_timestamp
<output snipped>
2020-07-07T04:54:22Z|00006|jsonrpc|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
send request, method="set_db_change_aware", params=[true], id=3
<output snipped>
"1594097605" (<-- returned value of e2e_timestamp)
(b) Server
~~~~~~~~~~~
<output snipped>
2020-07-07T06:30:53.320Z|20731|jsonrpc|DBG|unix#3145: received request,
method="set_db_change_aware", params=[true], id=3
2020-07-07T06:30:53.320Z|20732|jsonrpc|DBG|unix#3145: send reply,
result={}, id=3
2020-07-07T06:30:53.320Z|20733|jsonrpc|WARN|unix#3145: send error: Broken
pipe
2020-07-07T06:30:53.320Z|20734|reconnect|WARN|unix#3145: connection dropped
(Broken pipe)
<output snipped>
So, the OVSDB server did try to send the reply. However, before it could
send it through stream-fd.c`fd_send() the client program exits and closed
the Unix Socket and we see broken pipe on the sender
2. Unexpected reply case
=========================
(a) Client
~~~~~~~~~~~
$ ovn-sbctl -v --no-leader-only
--db=unix:/var/run/openvswitch/ovnsb_db.sock get datapath_binding
3b735851-db34-489a-a54c-8bb947431a74 external_ids:logical-switch
<output snipped>
2020-07-07T06:43:18Z|00021|ovsdb_idl|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
DATA_MONITOR_COND_SINCE_REQUESTED -> MONITORING at lib/ovsdb-idl.c:764
2020-07-07T06:43:18Z|00022|jsonrpc|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
received reply, result={}, id=3
2020-07-07T06:43:18Z|00023|ovsdb_idl|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
received unexpected reply message: {"error":null,"id":3,"result":{}}
<output snipped>
(b) Server
~~~~~~~~~~~
<output snipped>
2020-07-07T06:43:18.402Z|21691|jsonrpc|DBG|unix#3281: received request,
method="set_db_change_aware", params=[true], id=3
2020-07-07T06:43:18.402Z|21692|jsonrpc|DBG|unix#3281: send reply,
result={}, id=3
<output snipped>
So, in the 2nd case the Server's reply makes it all the way to the Client,
but the client is not expecting it.
So, looking at the lib/ovsdb-idl.c's ovsdb_idl_process_response()....
ovsdb_idl_process_response() {
case IDL_S_SERVER_MONITOR_COND_REQUESTED:
if (ok) {
idl->server.monitoring = OVSDB_IDL_MONITORING_COND;
ovsdb_idl_db_parse_monitor_reply(&idl->server, msg->result,
OVSDB_IDL_MM_MONITOR_COND);
if (ovsdb_idl_check_server_db(idl)) {
ovsdb_idl_send_db_change_aware(idl);
}
} else {
}
... I see that we call the OVSDB set_db_change_aware method through
ovsdb_idl_send_db_change_aware(), but there is no state machine to process
the reply for it. In the 1st case, the client exits early and connection is
broken? Whilst, in the 2nd case the reply comes through but the code is not
expecting it.
Regards,
~Girish
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss