The following minimized (pseudo)-code describes the reproduction scenario:

// Thread 1 (request queue):
void* got_tag;
bool ok;
while (cq1->Next(&got_tag, &ok))
{
    static_cast<CallData*>(got_tag)->Update(ok);
    printf("Sleeping 10s");
    std::this_thread::sleep_for(10s);
}

// Thread 2 (stream queue):
void* got_tag;
bool ok;
printf("Waiting for stream events");
while (cq2->Next(&got_tag, &ok))
{
    printf("New stream event");
    static_cast<StreamData*>(got_tag)->Update(ok);
    printf("Stream event finished");
}

In the error situation, the following output is generated:
Waiting for stream events
(...)
Sleeping 10s
(10 seconds delay)
New stream event
Stream event finished
New stream event
Stream event finished
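
For comparison, the behavior I would expect from two independent queues can be sketched without gRPC at all. FakeQueue below is a hypothetical stand-in and NOT grpc::CompletionQueue; it only mimics the blocking Next() contract (return false once shut down and drained). RunScenario, Timings and the shortened, parameterized sleep are likewise made up for illustration.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for a completion queue; this is NOT the gRPC API.
class FakeQueue {
 public:
  void Push(int tag) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      tags_.push(tag);
    }
    cv_.notify_one();
  }
  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_all();
  }
  // Blocks until a tag is available; returns false once shut down and drained.
  bool Next(int* tag) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return shutdown_ || !tags_.empty(); });
    if (tags_.empty()) return false;
    *tag = tags_.front();
    tags_.pop();
    return true;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<int> tags_;
  bool shutdown_ = false;
};

// Milliseconds since start at which each thread finished its last event.
struct Timings {
  long long request_ms = 0;
  long long stream_ms = 0;
};

Timings RunScenario(std::chrono::milliseconds sleep_time) {
  using Clock = std::chrono::steady_clock;
  FakeQueue cq1, cq2;
  const auto start = Clock::now();
  auto elapsed_ms = [start] {
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               Clock::now() - start).count();
  };
  Timings t;

  std::thread t1([&] {  // thread 1: request queue, sleeps after each event
    int tag;
    while (cq1.Next(&tag)) {
      std::this_thread::sleep_for(sleep_time);
      t.request_ms = elapsed_ms();
    }
  });
  std::thread t2([&] {  // thread 2: stream queue, handles events immediately
    int tag;
    while (cq2.Next(&tag)) {
      t.stream_ms = elapsed_ms();
    }
  });

  cq1.Push(1);  // one "request" event
  cq2.Push(2);  // two "stream" events arriving right after it
  cq2.Push(3);
  cq1.Shutdown();
  cq2.Shutdown();
  t1.join();
  t2.join();
  return t;
}
```

With truly independent queues, stream_ms should come out well below request_ms, because thread 2 never waits for thread 1's sleep. In the failing gRPC runs, the output above shows the opposite ordering.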
On Monday, November 12, 2018 at 2:30:52 PM UTC+1, [email protected] wrote:
>
> I am working on an asynchronous server-side integration of gRPC in C++.
> I have already resolved quite a few mistakes and misunderstandings, and
> overall it is very stable. Just one issue in the startup behavior is making
> my life difficult for the time being.
>
> *Introduction:*
> I wrote a test that starts 1 client and restarts the server side multiple
> times. By restarting I mean shutting down the completion queues and their
> attached threads, as well as the 'grpc::Server', and re-creating them.
> The client is never restarted and just reconnects. The reconnect itself
> works consistently, without any lockups or complaints from gRPC.
>
> Server-side there are 2 CompletionQueues, handled in 2 separate threads:
> 1. accepts requests from the client and responds using
> ServerAsyncResponseWriter.
> 2. accepts streams from the client and sends updates from server to
> client using ServerAsyncWriter.
>
> Client-side there is 1 CompletionQueue to handle ClientAsyncReader events
> in its own thread. Requests to the server are implemented synchronously.
> The backoff algorithm is configured to reconnect to the server within 1s
> +-0.2s. The client monitors the channel status using (async)
> NotifyOnStateChange with a timeout of 2 seconds and sends the stream
> requests as soon as the channel is up.
>
> I've separated the client and server implementation into 2 separate
> processes to ensure there is no interference whatsoever.
>
> *The issue:*
> Sometimes, the server seems to hold back all events in the 'stream
> CompletionQueue' (thread 2) while the request thread (1) is blocked. More
> specifically: thread 2 is blocked until ::grpc::CompletionQueue::Next is
> called in thread 1. I've deliberately added a long sleep just before
> calling cq1->Next in thread 1 to check that the issue still reproduces, and
> it does. The printf in thread 2 just after cq2->Next is not triggered until
> the sleep finishes.
>
> While sleeping, multiple stream connection attempts arrive from the client
> (supposedly in the second CompletionQueue). I verified this by capturing
> the TCP stream. These events arrive directly after the request message. As
> soon as 'Next' is called in thread 1, these connection attempts in thread 2
> are handled immediately.
>
> For me it reproduces about once every 5-10 restart cycles. Is there a
> proper way to debug this behavior in gRPC? Which verbosity flags should I
> enable? Am I making any incorrect assumptions about multiple CompletionQueues?
>
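
On the verbosity question, one option is gRPC's logging environment variables, GRPC_VERBOSITY and GRPC_TRACE. The sketch below sets them from inside the server process before anything else touches gRPC. The helper name EnableGrpcDebugLogging is made up, and which tracers are actually useful for this problem is a guess; "api" and "tcp" are tracer names from gRPC's environment-variable documentation as far as I know, and the available set depends on the gRPC version. setenv is POSIX, so this assumes Linux or similar.

```cpp
#include <stdlib.h>  // setenv (POSIX)

#include <string>

// Hypothetical helper: sets gRPC's logging environment variables for this
// process. Assumption: it is called before the first gRPC object is created,
// since gRPC is expected to read these variables during initialization.
bool EnableGrpcDebugLogging(const std::string& tracers) {
  // The third argument (1) overwrites any value already present.
  if (setenv("GRPC_VERBOSITY", "DEBUG", 1) != 0) return false;
  if (setenv("GRPC_TRACE", tracers.c_str(), 1) != 0) return false;
  return true;
}
```

Calling EnableGrpcDebugLogging("api,tcp") at the top of main() should be equivalent to exporting both variables in the shell before starting the server.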