gRPC GitHub issue: https://github.com/grpc/grpc/issues/18554
StackOverflow post:
https://stackoverflow.com/questions/55460086/client-channel-unusable-after-a-network-reset
Summary: If a client channel is in a READY state and the network is
disconnected, the channel becomes unusable and the client will not attempt
to reconnect to the server once the network connection is re-established. *The
channel does not transition from a READY state to a TRANSIENT_FAILURE on a
DEADLINE_EXCEEDED error (deadline set by my client application).*
What version of gRPC and what language are you using?
1.17.2
Same issue experienced in version 1.11.x
C++
What operating system (Linux, Windows,...) and version?
Client running on Ubuntu 16.04.
Server running Windows Enterprise.
What did you do?
Server and client are both started on a connected network. I can
successfully make calls and receive responses from the server. When the
network is turned off, the server receives a "Disconnected client -
Endpoint read failed" error. Some other relevant fields in this debug
message - "grpc_status":14 (UNAVAILABLE), "occured_during_write":0,
"description":"An established connection was aborted by the software in
your host machine".
At the time of network disconnect, the client does not print out any logs
at all (using
GRPC_TRACE=connectivity_state,call_error,op_failure,server_channel,client_channel,channel
GRPC_VERBOSITY=DEBUG).
Once the network is turned on again, no logs appear on either the server
or the client. Attempting to make a call using the client (sending a launch
request) repeatedly results in a DEADLINE_EXCEEDED error. Turning off the
network connection at this point does not result in a server-side
"Disconnected client" error.
The client context is set to use a deadline (tested with 2 and 10 seconds).
Synchronous calls are used in this case.
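For reference, a minimal sketch of how such a per-call deadline can be
expressed as an absolute time with std::chrono. DeadlineFromNow is a
hypothetical helper name, not part of gRPC; the resulting time point is the
kind of value that ClientContext::set_deadline accepts as a
std::chrono::system_clock::time_point.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not a gRPC API): converts a relative timeout into the
// absolute point in time by which a call must have finished.
std::chrono::system_clock::time_point DeadlineFromNow(
    std::chrono::milliseconds timeout) {
  assert(timeout.count() >= 0);  // a negative timeout would already be expired
  return std::chrono::system_clock::now() + timeout;
}
```

With the 2 second setting from the tests above, this would be called as
DeadlineFromNow(std::chrono::seconds(2)).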
*Code snippets:*
*/rpc_service.proto*
syntax = "proto3";
import "google/rpc/status.proto";
message RpcRequest {
}
message RpcResponse {
}
service RpcService{
rpc Call(RpcRequest) returns (RpcResponse);
}
*/client.cc*
Initialization:
std::unique_ptr<RpcService::Stub> stub_ = RpcService::NewStub(
    ::grpc::CreateChannel(
        server_endpoint, ::grpc::InsecureChannelCredentials()));
*Sending a rpc request:*
::grpc::ClientContext context;
context.set_deadline(
gpr_time_from_micros(call_timeout_.InMicroseconds(), GPR_TIMESPAN));
RpcRequest request;
RpcResponse response;
::grpc::Status grpc_status = stub_->Call(&context, request, &response);
*/server.cc*
grpc::ServerBuilder builder;
builder.AddListeningPort(endpoint, ::grpc::InsecureServerCredentials());
builder.RegisterService(&rpc_service);
std::unique_ptr<grpc::Server> grpc_server_ = builder.BuildAndStart();
What did you expect to see?
Client should make a successful call after a network reset.
What did you see instead?
Client fails to receive a response from the server.
Anything else we should know about your project / environment?
When the network connection is re-established and the client fails to
receive a response from the server, tcpdump captures the client sending out
some packets.
Starting up both client and server with network ON, and then unplugging the
network does not result in any error messages until a call is attempted.
This is the same result as when starting both client and server with the
network disconnected. Once a call is attempted the client will transition
from IDLE to CONNECTING and then begin to bounce back and forth between
CONNECTING and TRANSIENT_FAILURE states (attempting to reconnect with
exponential back-off) until the connection is re-established.
If the client is started with the network connected but doesn't send a
request, and the network is then disconnected, the server doesn't get a
"Disconnected client" error. Until a call is made, the client stays in the
IDLE state.
If a client is initialized and a call is made on a disconnected network,
then the client enters the CONNECTING state and retries with exponential
backoff (up to a max of 2 min), staying in the TRANSIENT_FAILURE state
between attempts. Once the network is connected, the connection is
re-established the next time the channel enters the CONNECTING state, and
the client enters the READY state. After this, each call succeeds until the
network is reset.
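The backoff described above can be sketched as follows. This is only an
illustration: the 1 second initial delay and the 1.6 multiplier are the
values proposed in gRPC's connection backoff document and are assumptions
here, since only the 2 min cap was observed in my tests. The jitter that
gRPC applies is left out, and BackoffScheduleSec is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumed parameters (from gRPC's connection backoff document, not measured
// in this report). Only the 120 s cap matches what was observed.
constexpr double kInitialBackoffSec = 1.0;
constexpr double kMultiplier = 1.6;
constexpr double kMaxBackoffSec = 120.0;

// Returns the wait (in seconds) before each of the first `attempts`
// reconnect attempts, ignoring jitter.
std::vector<double> BackoffScheduleSec(int attempts) {
  assert(attempts >= 0);
  std::vector<double> delays;
  double current = kInitialBackoffSec;
  for (int i = 0; i < attempts; ++i) {
    delays.push_back(current);
    // Grow geometrically, but never beyond the cap.
    current = std::min(current * kMultiplier, kMaxBackoffSec);
  }
  return delays;
}
```

Under these assumed values the delay reaches the 2 min cap after about a
dozen failed attempts and stays there.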
Disconnecting the network after the client is in a READY state will not
transition the client out of a READY state.
In summary: Until a call is made, the client will stay in an "IDLE" state
no matter the network status. Once a call is made, the client will attempt
to make a connection by entering the CONNECTING state. If no connection is
found, it will bounce between CONNECTING and
TRANSIENT_FAILURE states. Once a connection is found, the client will go
into a READY state. From here, if a connection is lost, the client will not
attempt to enter a CONNECTING state again.
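The transitions summarized above can be written down as a small state
machine. This is a sketch of the behaviour seen in this report, not of
gRPC's implementation: the event names and the Observed, Expected and
Replay functions are hypothetical, and Expected only adds the READY to
TRANSIENT_FAILURE transition that I assumed would happen when the
connection is lost.

```cpp
#include <cassert>
#include <initializer_list>

// Connectivity states named in this report.
enum class State { kIdle, kConnecting, kTransientFailure, kReady };

// Hypothetical event labels for this sketch (not gRPC API names).
enum class Event {
  kCallAttempted, kConnectOk, kConnectFailed, kBackoffExpired, kNetworkLost
};

// Transitions as observed in the tests described above.
State Observed(State s, Event e) {
  switch (s) {
    case State::kIdle:
      // Stays IDLE, whatever the network does, until a call is made.
      return e == Event::kCallAttempted ? State::kConnecting : s;
    case State::kConnecting:
      if (e == Event::kConnectOk) return State::kReady;
      if (e == Event::kConnectFailed) return State::kTransientFailure;
      return s;
    case State::kTransientFailure:
      return e == Event::kBackoffExpired ? State::kConnecting : s;
    case State::kReady:
      // The reported problem: nothing moves the channel out of READY.
      return s;
  }
  return s;
}

// Same table plus the transition I expected after a lost connection.
State Expected(State s, Event e) {
  if (s == State::kReady && e == Event::kNetworkLost) {
    return State::kTransientFailure;
  }
  return Observed(s, e);
}

using Step = State (*)(State, Event);

// Applies a sequence of events and returns the final state.
State Replay(Step step, State start, std::initializer_list<Event> events) {
  assert(step != nullptr);
  State s = start;
  for (Event e : events) s = step(s, e);
  return s;
}
```

Replaying a network loss from READY leaves the Observed table stuck in
READY, while the Expected table moves to TRANSIENT_FAILURE and can then
reconnect through CONNECTING.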
*Similar issue (closed) to the one I’m having:*
https://github.com/grpc/grpc/issues/16974
Known fix
Create a new channel on each call.
Failed fix attempts
Set GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA = 0
Questions
Should the client be able to use the already created channel after a
network reset?
Does the channel have to be restarted when a network is reset?