We are using gRPC (version 1.37.1) for inter-process communication 
between a C# process and a C++ process. Each process acts as both server 
and client to the other, and both run on the same machine over localhost 
using the HTTP/2 transport. All calls are blocking, synchronous unary 
calls; we do not use bi-directional streaming. Some average(ish) stats:

From C++->C#: 0-2 calls per second, 0-40 calls per minute

From C#->C++: 0-5 calls per second, 0-200 calls per minute

Intermittently, we were getting one of three issues:

   - C# client call to C++ server comes back with an RpcException, usually 
   “HTTP2/Parse Error”, “Endpoint Read Failed”, or “Transport Closed” 
   - C++ client call to C# server comes back with Unavailable or Unknown 
   - C++ client WaitForConnected call to check the channel fails after 500ms 

 

The first one is the most frequent and the one we have the most 
information about. Usually, what we’ll see is that the client receives the 
RPC call and runs into an unknown frame type. Then the subchannel goes into 
shutdown and everything usually reconnects fine. We also generally see an 
embedded error like the following (note that we replaced all __FILE__ 
instances with __FUNCTION__ in our gRPC source):

win_read","file_line":307,"os_error":"The system detected an invalid pointer address in attempting to use a pointer argument in a call.\r\n","syscall":"WSARecv","wsa_error":10014}]},{"created":"@1622120588.494000000","description":"frame of size 262404 overflows local window of 65535","file":"grpc_core::chttp2::TransportFlowControl::ValidateRecvData","file_line":213}]}
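
For readers unfamiliar with the second error in that log, here is a minimal sketch of the kind of receive-window check that produces a message like "frame of size 262404 overflows local window of 65535". It is a simplified model, not gRPC's actual implementation; the class name RecvWindow and its methods are made up for illustration. The only fact taken from the HTTP/2 spec is that 65535 is the default initial flow-control window.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified receive-side flow-control window.
// This is NOT gRPC's TransportFlowControl; it only models the check.
class RecvWindow {
 public:
  explicit RecvWindow(std::int64_t initial) : window_(initial) {}

  // Called for each incoming DATA frame. Returns an empty string when the
  // frame fits in the window, otherwise a description of the violation.
  std::string OnDataFrame(std::int64_t frame_size) {
    if (frame_size > window_) {
      return "frame of size " + std::to_string(frame_size) +
             " overflows local window of " + std::to_string(window_);
    }
    window_ -= frame_size;  // the peer has used up part of our window
    return {};
  }

  // Called after we send a WINDOW_UPDATE granting the peer more credit.
  void OnWindowUpdateSent(std::int64_t increment) { window_ += increment; }

  std::int64_t window() const { return window_; }

 private:
  std::int64_t window_;
};
```

With a fresh window of 65535, a reported frame size of 262404 fails this check immediately. A well-behaved peer should not send such a frame, so a bogus length field (for example, one read from the wrong place in the buffer) is one possible explanation, though that is only a guess.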

What we’ve seen with the unknown frame type is that it parses the HEADERS, 
WINDOW_UPDATE, DATA, WINDOW_UPDATE, then gets a TCP: on_read without a 
corresponding READ, and then tries to parse again. On this parse, the 
parser appears to be at the wrong offset in the buffer: the unknown frame 
type, incoming frame size, and incoming stream_id all map to bytes in the 
middle of the RPC call that it just parsed.
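
To illustrate why a wrong offset shows up as an "unknown frame type", here is a small standalone sketch of decoding the 9-byte HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit plus 31-bit stream id, per RFC 7540 section 4.1). The names FrameHeader, ParseFrameHeader and IsKnownFrameType are hypothetical and this is not gRPC's parser; "known" here just means one of the ten frame types (0x0 through 0x9) defined in RFC 7540.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// HTTP/2 frame header fields as laid out in RFC 7540, section 4.1.
struct FrameHeader {
  std::uint32_t length;     // 24-bit payload length
  std::uint8_t type;        // 0x0 DATA, 0x1 HEADERS, ... 0x9 CONTINUATION
  std::uint8_t flags;
  std::uint32_t stream_id;  // 31 bits; the top bit is reserved
};

constexpr std::size_t kFrameHeaderSize = 9;
constexpr std::uint8_t kMaxKnownFrameType = 0x9;

// Decodes one frame header starting at `offset`. Returns std::nullopt if
// fewer than 9 bytes remain in the buffer.
std::optional<FrameHeader> ParseFrameHeader(const std::uint8_t* buf,
                                            std::size_t size,
                                            std::size_t offset) {
  if (offset > size || size - offset < kFrameHeaderSize) return std::nullopt;
  const std::uint8_t* p = buf + offset;
  FrameHeader h;
  h.length = (std::uint32_t{p[0]} << 16) | (std::uint32_t{p[1]} << 8) |
             std::uint32_t{p[2]};
  h.type = p[3];
  h.flags = p[4];
  // Mask off the reserved high bit of the stream identifier.
  h.stream_id = ((std::uint32_t{p[5]} & 0x7F) << 24) |
                (std::uint32_t{p[6]} << 16) | (std::uint32_t{p[7]} << 8) |
                std::uint32_t{p[8]};
  return h;
}

// True if the type is one of the frame types defined in RFC 7540.
bool IsKnownFrameType(const FrameHeader& h) {
  return h.type <= kMaxKnownFrameType;
}
```

If the buffer holds a DATA frame whose payload is ordinary message bytes, calling ParseFrameHeader at offset 0 gives a sensible header, while calling it at an offset that lands inside the payload turns payload bytes into a huge length, an arbitrary type, and an arbitrary stream id. That matches the symptom of all three fields mapping to the middle of the previous message.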

 

The above is what we were encountering before we changed to creating a new 
channel for each RPC call. While we realize this is not great from a 
performance standpoint, we have seen increased stability since making the 
change. However, we still occasionally get RPC exceptions. Now the most 
common is “Unknown”/“Stream Removed” rather than the ones listed above.


Any ideas on what might be going wrong are appreciated.

 
