The following minimized (pseudo-)code describes the reproduction scenario:

// Thread 1:                                            Thread 2:
void* got_tag;                                          void* got_tag;
bool ok;                                                bool ok;

                                                        printf("Waiting for stream events");
while (cq1->Next(&got_tag, &ok))                        while (cq2->Next(&got_tag, &ok))
{                                                       {
  static_cast<CallData*>(got_tag)->Update(ok);            printf("New stream event");
  printf("Sleeping 10s");                                 static_cast<StreamData*>(got_tag)->Update(ok);
  std::this_thread::sleep_for(10s);                       printf("Stream event finished");
}                                                       }

In the error situation, the following output is generated:

Waiting for stream events
  (...)
Sleeping 10s
  (10 seconds delay)
New stream event
Stream event finished
New stream event
Stream event finished
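
For comparison, here is a minimal stdlib-only sketch of the behavior I would expect from two independent queues: events on the second queue are handled while the first thread is still busy in its handler. This is only a model under my own assumptions, not gRPC code. EventQueue and RunModel are hypothetical names, and the 10s sleep is replaced by a wait on a future so the run is deterministic.

```cpp
#include <atomic>
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for a completion queue; NOT part of gRPC.
class EventQueue {
 public:
  void Push(int tag) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      events_.push(tag);
    }
    cv_.notify_one();
  }

  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_all();
  }

  // Blocks until an event is available. Returns false once the queue
  // has been shut down and fully drained.
  bool Next(int* tag) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return shutdown_ || !events_.empty(); });
    if (events_.empty()) return false;
    *tag = events_.front();
    events_.pop();
    return true;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<int> events_;
  bool shutdown_ = false;
};

// Returns how many stream events thread 2 handled while thread 1 was
// still blocked inside its request handler.
int RunModel(int stream_events) {
  EventQueue cq1, cq2;
  std::atomic<bool> t1_busy{false};
  std::atomic<int> handled_while_busy{0};
  std::promise<void> release;
  std::future<void> released = release.get_future();

  std::thread t1([&] {
    int tag;
    while (cq1.Next(&tag)) {
      t1_busy = true;
      released.wait();  // models the long sleep in thread 1
      t1_busy = false;
    }
  });
  std::thread t2([&] {
    int tag;
    while (cq2.Next(&tag)) {
      if (t1_busy) ++handled_while_busy;
    }
  });

  cq1.Push(0);                                  // one request for thread 1
  while (!t1_busy) std::this_thread::yield();   // wait until thread 1 is blocked
  for (int i = 0; i < stream_events; ++i) cq2.Push(i);
  cq2.Shutdown();
  t2.join();            // thread 2 drains its queue without waiting for thread 1
  release.set_value();  // only now let thread 1 continue
  cq1.Shutdown();
  t1.join();
  return handled_while_busy.load();
}
```

In this model RunModel(3) handles all three stream events before thread 1 is released. If the two queues were coupled the way my server appears to be, t2.join() would never return.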


On Monday, November 12, 2018 at 2:30:52 PM UTC+1, [email protected] wrote:
>
> I am working on an asynchronous server-side integration of gRPC in C++. 
> I have already fixed quite a few mistakes and misunderstandings, and overall it 
> is very stable. Just one issue in the startup behavior is making my life 
> difficult for the time being.
>
> *Introduction:*
> I wrote a test that starts 1 client and restarts the server side multiple 
> times. By restarting I mean shutting down the completion queues and their 
> attached threads, as well as the 'grpc::Server', and re-creating them. 
> The client is never restarted and just reconnects. This consistently 
> happens without any lockups or complaints from gRPC.
>
> Server-side there are 2 CompletionQueues, handled in 2 separate threads:
> 1. accepts requests from the client and responds using 
> ServerAsyncResponseWriter.
> 2. accepts streams from the client and sends updates from server to 
> client using ServerAsyncWriter.
>
> Client-side there is 1 CompletionQueue to handle ClientAsyncReader events 
> in its own thread. Requests to the server are implemented synchronously.
> The backoff algorithm is configured to reconnect to the server within 1s 
> +-0.2s. The client monitors the channel status using (async) 
> NotifyOnStateChange with a timeout of 2 seconds and sends the stream 
> requests as soon as the channel is up.
>
> I've separated the client and server implementation into 2 separate 
> processes to ensure there is no interference whatsoever.
>
> *The issue:*
> Sometimes the server seems to hold back all events in the 'stream 
> CompletionQueue' (thread 2) while the request thread (thread 1) is blocked. More 
> specifically: Thread 2 is blocked until ::grpc::CompletionQueue::Next is 
> called in thread 1. I've deliberately added a long sleep just before 
> calling cq1->Next in thread 1 to ensure the issue still reproduces and it 
> does. The printf in thread 2 just after cq2->Next is not triggered until 
> the sleep finishes.
>
> While sleeping, multiple stream connection attempts arrive from the client 
> (supposedly in the second CompletionQueue). I verified this by capturing 
> the TCP stream. These events arrive directly after the request message. As 
> soon as 'Next' is called in thread 1, these connection attempts in thread 2 
> are handled immediately.
>
> For me it reproduces about once every 5-10 cycles. Is there a proper way to 
> debug this behavior in gRPC? Which verbosity flags should I enable? Am I 
> making any incorrect assumptions about multiple CompletionQueues?
>
