This is pretty strange. It is possible that we are being blocked on flow 
control, so I would first check that the application layer is actually 
reading. If I am not mistaken, `perform_stream_op[s=0x7f0e16937290]:  
RECV_MESSAGE` is logged at the start of a read operation, so its absence 
means that the HTTP/2 layer hasn't yet been instructed to read a message 
(or that a previous read on the stream hasn't finished yet). Given 
that you are only updating the gRPC version from 1.20 to 1.36.1, I do not 
have an answer as to why you would see this without any application 
changes. 

A few questions:
Do the two streams use the same underlying channel/transport?
Are the clients and the server in the same process?
Is there anything special about the environment this is being run in?

(One way to make sure that the read op is being propagated to the transport 
layer is to check the logs with the "channel" tracer.)

On Friday, March 19, 2021 at 12:59:30 PM UTC-7 Bryan Schwerer wrote:

> Hello,
>
> I'm in the long-overdue process of updating gRPC from 1.20 to 1.36.1.  I am 
> running into an issue where the streaming replies from the server are not 
> reaching the client in about 50% of the instances.  This is binary, either 
> the streaming call works perfectly or it doesn't work at all.  After 
> debugging a bit, I turned on the http tracing and from what I can tell, the 
> http messages are received in the client thread.  In the working case, 
> perform_stream_op[s=0x7f0e16937290]:  RECV_MESSAGE is logged, but in 
> the broken case it isn't.  No error messages occur.
>
> I've tried various tracers, but haven't hit anything.  The code is pretty 
> much the same pattern as the example and there's no indication any 
> disconnect has occurred which would cause the call to terminate.  Using gdb 
> to look at the thread, it is still in epoll_wait.
>
> The process in which this runs makes 2 different synchronous server 
> streaming calls to the same server in separate threads.  The process is 
> also a gRPC server.  Everything is run over the internal 'lo' interface.  
> Any ideas on where to look to debug this?
>
> Thanks,
>
> Bryan
>
