Are the client and server on different machines over a WAN? Are you certain the server side isn’t blocking? I would jstack the server as soon as a delay is detected during the test. I would also run a constant ping from client to server during the test to make sure there are no network failures.
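If it is hard to run jstack at exactly the right moment, roughly the same information can be captured in-process with the standard `ThreadMXBean` API. The following is only a sketch under some assumptions: `StallWatchdog`, `watch`, and the threshold value are hypothetical names that do not appear anywhere in this thread, and the dump covers only the JVM the code runs in, so to see the server's threads it would have to wrap the server-side handler rather than the client call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: prints a jstack-like dump while a call is still stalled.
final class StallWatchdog {
    // Single daemon timer thread so the watchdog never keeps the JVM alive.
    private static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stall-watchdog");
            t.setDaemon(true);
            return t;
        });

    // Returns name, state, and full stack of every live thread in this JVM.
    public static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // Lock details are requested only when the JVM supports them.
        ThreadInfo[] infos = mx.dumpAllThreads(
            mx.isObjectMonitorUsageSupported(), mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    // Runs the call; if it is still running after thresholdMillis,
    // a thread dump is written to stderr while the stall is in progress.
    public static <T> T watch(String label, long thresholdMillis, Callable<T> call)
            throws Exception {
        ScheduledFuture<?> pending = TIMER.schedule(() -> {
            System.err.println(label + " still running after " + thresholdMillis
                + " ms; thread dump:");
            System.err.println(dumpAllThreads());
        }, thresholdMillis, TimeUnit.MILLISECONDS);
        try {
            return call.call();
        } finally {
            // The call finished, so a dump that has not started yet is skipped.
            pending.cancel(false);
        }
    }
}
```

A blocking call could then be wrapped as `StallWatchdog.watch("query", 1000, () -> doCall())`, where `doCall()` stands in for whatever issues the request. The dump is taken while the call is still blocked, which is the point: a dump taken after the delay has passed would show only idle threads.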
Other than that, I think you have to enable the gRPC tracing on both sides to isolate where the hang is happening.

> On Nov 21, 2018, at 9:18 AM, [email protected] wrote:
>
> final NettyChannelBuilder channelBuilder =
>     NettyChannelBuilder.forAddress(getHost(), getPort())
>         .usePlaintext(grpcProperties.isUsePlainText())
>         .loadBalancerFactory(RoundRobinLoadBalancerFactory.getInstance())
>         .intercept(getClientInterceptors());
>
> addConnectionPersistenceConfig(channelBuilder);
>
> if (grpcProperties.isEnableClientFixedConcurrency()) {
>     channelBuilder.executor(Executors.newFixedThreadPool(grpcProperties.getClientThreadNumber()));
> }
>
> this.channel = channelBuilder.build();
> ...
>
> private void addConnectionPersistenceConfig(final NettyChannelBuilder channelBuilder) {
>     if (grpcProperties.getClientKeepAlive() != 0) {
>         channelBuilder
>             .keepAliveTime(grpcProperties.getClientKeepAlive(), SECONDS) // 5
>             .keepAliveWithoutCalls(grpcProperties.isClientKeepAliveWithoutCalls()) // true
>             .keepAliveTimeout(grpcProperties.getClientKeepAliveTimeout(), SECONDS); // 60
>     }
>
>     if (grpcProperties.getClientIdle() != 0) {
>         channelBuilder.idleTimeout(grpcProperties.getClientIdle(), SECONDS); // 60
>     }
> }
>
> I have added the relevant bits of code that build the client; I think it
> should reuse the connection.
>
> Another log from a client:
>
> 112ms:event=Started call
> 113ms:event=Message sent
> 113ms:event=Finished sending messages
> 5.16s:Response headers received=Metadata(content-type=application/grpc,grpc-encoding=identity,grpc-accept-encoding=gzip)
> 5.16s:event=Response received
> 5.16s:event=Call closed
>
> On the server side, there is no request that took more than 50ms.
>
> Regarding file descriptors, both the client and server have about 100 open
> file descriptors.
>
>> On Wednesday, November 21, 2018 at 4:19:59 PM UTC+2, Robert Engels wrote:
>> There are also ways to abort the connection to avoid the close delay.
>>
>>> On Nov 21, 2018, at 8:18 AM, Robert Engels <[email protected]> wrote:
>>>
>>> It could be a wait for a TCP connection. If you are continually creating
>>> new connections, the server will run out of file descriptors since some
>>> connections will remain in a close wait state - so it has to wait for
>>> these to finally close in order to make a new connection.
>>>
>>> You might want to make sure your test is reusing connections.
>>>
>>>> On Nov 21, 2018, at 8:15 AM, Alexandru Keszeg <[email protected]> wrote:
>>>>
>>>> That was my first thought as well, but monitoring doesn't show any long
>>>> GC pauses.
>>>>
>>>> What seems odd is that I have not seen a "pause" between two query
>>>> traces (check the attached image in the first post), only at the "start"
>>>> of a request.
>>>>
>>>>> On Wed, Nov 21, 2018 at 3:34 PM Robert Engels <[email protected]> wrote:
>>>>> Maybe a full GC is occurring on the server? That’s what I would look
>>>>> for.
>>>>>
>>>>>> On Nov 21, 2018, at 2:50 AM, [email protected] wrote:
>>>>>>
>>>>>> Randomly, some gRPC calls which usually complete in 20 milliseconds
>>>>>> take a few seconds.
>>>>>> We have Jaeger in place; traces show a few seconds where the call does
>>>>>> nothing and then processing begins, which seems to imply queuing of
>>>>>> some sort?
>>>>>>
>>>>>> On the server side, we have fixed concurrency, and a thread dump shows
>>>>>> most of the threads idle.
>>>>>> Our environment: Kubernetes 1.9 on Google Cloud, services are exposed
>>>>>> using ClusterIP: None, clients connect using DNS load balancing
>>>>>>
>>>>>> - Is there some built-in queuing on the server side?
>>>>>> - Is there any way to track the queue depth?
>>>>>> - Any other tips on debugging this?
>>>>>> <Selection_120.png>
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "grpc.io" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>>> an email to [email protected].
>>>>>> To post to this group, send email to [email protected].
>>>>>> Visit this group at https://groups.google.com/group/grpc-io.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/grpc-io/3f8075e1-e261-45c9-865f-23285b98cca9%40googlegroups.com.
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>> --
>>>> Alexandru Keszeg
>>>> Developer
>>>> +40 747124216
>>>> Coratim Business Center
>>>> Campul Painii nr.3-5
>>>> Cluj Napoca ROMANIA
