Monitoring on kube-dns, reports a request rate of only about 2req/s. I've also run tcpdump on the kube-dns pods and I haven't managed to catch any problems. I'm not 100%, but DNS doesn't seem to be an issue. Any way to debug this from grpc-java? Is there any way to log how much time DNS takes?
According to the ChannelZ service, the connection is closed after the idle timeout (configured to 60 seconds on the client, 5 minutes on the server). keepAliveWithoutCalls is activated and keep alive packets are sent (tested with tcpdump), but the idle configuration and closes the channel. Is it supposed to do this? I would have expected keep alive packet to the connection alive... On Thu, Nov 22, 2018 at 3:19 PM tommy via grpc.io <[email protected]> wrote: > 5 seconds smells like a DNS timeout. > I don't know why that would affect issuing an RPC on an already > established connection thou. I assume you are not closing the channel > between calls? > > On Wednesday, November 21, 2018 at 4:54:22 PM UTC+1, Alexandru Keszeg > wrote: >> >> > Are the client and server on different machines over a WAN? >> >> Different machines running on Google Kubernetes service in two >> availability zones. >> >> > Are you certain the server side isn’t blocking? I would jstack the >> server as soon as a delay is detected during the test. I would also run a >> constant ping from client to server during the test to make sure there are >> no network failures. >> >> I am not running a test, I am trying to diagnose a problem in production. >> Randomly, 1 request in hundreds takes a few seconds. >> >> >> *To rephrase my question* >> I'm trying to find out what happened between 35ms and 5.07s. The log >> shows that the server receives the message after 5 seconds. Is there any >> way to debug this? Can I log how many requests are waiting for a >> connection? >> >> >> Here is the trace form both sides. The logs are from >> ClientTracingInterceptor >> https://github.com/grpc-ecosystem/grpc-opentracing/blob/master/java/src/main/java/io/opentracing/contrib/ClientTracingInterceptor.java#L180 >> >> Client: >> >> Logs (6) >> 33ms: >> >> - event=Started call >> >> 35ms: >> >> - event=Message sent >> >> 35ms: >> >> - event=Finished sending messages >> >> 5.08s: >> >> - Response headers received= >> >> Metadata(content-type=application/grpc,grpc-encoding=identity,grpc-accept-encoding=gzip) >> >> 5.08s: >> >> - event=Response received >> >> 5.08s: >> >> - event=Call closed >> >> >> >> Server: >> >> Logs (3) >> 5.07s: >> >> - Message received=accountIds: "facd3373-ffda-465b-8d32-50da88a4b3ab" >> accountIds: "f7c878c6-5c76-480b-ae54-d47ad5266344" accountIds: >> "ae6e4599-eb8c-4184-8ded-9317ddb2189d" >> >> 5.07s: >> >> - event=Client finished sending messages >> >> 5.08s: >> >> - event=Call completed >> >> >> There is how the server threads looked at this time(sum for 5 instances): >> >> [image: Capture.PNG] >> >> >> >> >> What I would expect to see is something like this, where the time is >> clearly spent in the server(and there is a span not shown with a "lock" >> operation that takes 5 seconds) >> >> Client: >> >> 21ms: >> >> - event=Started call >> >> 22ms: >> >> - event=Message sent >> >> 22ms: >> >> - event=Finished sending messages >> >> 5.52s: >> >> - Response headers received= >> >> Metadata(content-type=application/grpc,grpc-encoding=identity,grpc-accept-encoding=gzip) >> >> 5.52s: >> >> - event=Response received >> >> 5.52s: >> >> - event=Call closed >> >> >> Server: >> 27ms: >> >> - Message received=username: "123" accountId: "" >> >> 27ms: >> >> - event=Client finished sending messages >> >> 5.52s: >> >> - event=Call completed >> >> >> >> >> >> On Wed, Nov 21, 2018 at 5:25 PM Robert Engels <[email protected]> >> wrote: >> >>> Are the client and server on different machines over a WAN? >>> >>> Are you certain the server side isn’t blocking? I would jstack the >>> server as soon as a delay is detected during the test. I would also run a >>> constant ping from client to server during the test to make sure there are >>> no network failures. >>> >>> Other than that, I think you have to enable the gRPC tracing on both >>> sides to isolate where the hang is happening. >>> >>> On Nov 21, 2018, at 9:18 AM, [email protected] wrote: >>> >>> >>> final NettyChannelBuilder channelBuilder = NettyChannelBuilder. >>> forAddress(getHost(), getPort()) >>> .usePlaintext(grpcProperties.isUsePlainText()) >>> .loadBalancerFactory(RoundRobinLoadBalancerFactory. >>> getInstance()) >>> .intercept(getClientInterceptors()); >>> >>> addConnectionPersistenceConfig(channelBuilder); >>> >>> >>> if (grpcProperties.isEnableClientFixedConcurrency()) { >>> channelBuilder.executor(Executors.newFixedThreadPool( >>> grpcProperties.getClientThreadNumber())); >>> } >>> >>> >>> this.channel = channelBuilder.build(); >>> ... >>> >>> private void addConnectionPersistenceConfig(final >>> NettyChannelBuilder channelBuilder) { >>> if (grpcProperties.getClientKeepAlive() != 0) { >>> channelBuilder >>> .keepAliveTime(grpcProperties.getClientKeepAlive(), >>> SECONDS) // 5 >>> .keepAliveWithoutCalls(grpcProperties. >>> isClientKeepAliveWithoutCalls()) //true >>> .keepAliveTimeout(grpcProperties. >>> getClientKeepAliveTimeout(), SECONDS); //60 >>> } >>> >>> if (grpcProperties.getClientIdle() != 0) { >>> channelBuilder.idleTimeout(grpcProperties.getClientIdle(), >>> SECONDS); //60 >>> } >>> } >>> >>> I have added the relevant bits of code that build the client, i think it >>> should reuse the connection. >>> >>> >>> >>> >>> Another log from a client: >>> >>> 112ms: >>> >>> - event=Started call >>> >>> 113ms: >>> >>> - event=Message sent >>> >>> 113ms: >>> >>> - event=Finished sending messages >>> >>> 5.16s: >>> >>> - Response headers received= >>> >>> Metadata(content-type=application/grpc,grpc-encoding=identity,grpc-accept-encoding=gzip) >>> >>> 5.16s: >>> >>> - event=Response received >>> >>> 5.16s: >>> >>> - event=Call closed >>> >>> >>> >>> On the server side, there is no request that took more than 50ms. >>> >>> >>> Regarding file descriptors, both the client and server have about 100 >>> open file descriptors. >>> >>> >>> On Wednesday, November 21, 2018 at 4:19:59 PM UTC+2, Robert Engels wrote: >>>> >>>> There are also ways to abort the connection to avoid the close delay. >>>> >>>> On Nov 21, 2018, at 8:18 AM, Robert Engels <[email protected]> >>>> wrote: >>>> >>>> It could be a wait for tcp connection. If you are continually creating >>>> new connections, the server will run out of file descriptors since some >>>> connections will remain in a close wait state - so it has to wait for these >>>> to finally close in order to make a new connection. >>>> >>>> You might want to make sure your test is reusing connections. >>>> >>>> On Nov 21, 2018, at 8:15 AM, Alexandru Keszeg <[email protected]> >>>> wrote: >>>> >>>> That was my first thought as well, but monitoring doesn't show any long >>>> GC pauses. >>>> >>>> What seems odd is that I have not seen a "pause" between two query >>>> traces(check the attached image in the first post), only at the "start" of >>>> a request. >>>> >>>> On Wed, Nov 21, 2018 at 3:34 PM Robert Engels <[email protected]> >>>> wrote: >>>> >>>>> Maybe a full GC is occurring on the server ? That’s what I would look >>>>> for. >>>>> >>>>> On Nov 21, 2018, at 2:50 AM, [email protected] wrote: >>>>> >>>>> Randomly, some gRPC calls which usually complete in 20 milliseconds >>>>> take a few seconds. >>>>> We have Jager in place, traces show a few seconds where the call does >>>>> nothing and then processing begins, which seems to imply queuing of some >>>>> sorts? >>>>> >>>>> On the server side, we have fixed concurrency, a thread dump shows >>>>> most of them idle. >>>>> Our environment: Kuberentes 1.9 on Google Cloud, services are exposed >>>>> using ClusterIP: None, clients connect using DNS load balancing >>>>> >>>>> - Is there some build-in queuing on the server side? >>>>> - Is there any way to track the queue depth? >>>>> - Any other tips on debugging this? >>>>> >>>>> <Selection_120.png> >>>>> >>>>> -- >>>>> You received this message because you are subscribed to the Google >>>>> Groups "grpc.io" group. >>>>> To unsubscribe from this group and stop receiving emails from it, send >>>>> an email to [email protected]. >>>>> To post to this group, send email to [email protected]. >>>>> Visit this group at https://groups.google.com/group/grpc-io. >>>>> To view this discussion on the web visit >>>>> https://groups.google.com/d/msgid/grpc-io/3f8075e1-e261-45c9-865f-23285b98cca9%40googlegroups.com >>>>> <https://groups.google.com/d/msgid/grpc-io/3f8075e1-e261-45c9-865f-23285b98cca9%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>> . >>>>> For more options, visit https://groups.google.com/d/optout. >>>>> >>>>> <Selection_120.png> >>>>> >>>>> >>>> >>>> -- >>>> ____________________________________ >>>> >>>> >>>> <http://pitechplus.com> >>>> >>>> Alexandru Keszeg >>>> >>>> Developer >>>> >>>> +40 747124216 >>>> >>>> Coratim Business Center >>>> >>>> Campul Painii nr.3-5 >>>> >>>> Cluj Napoca ROMANIA >>>> >>>> This message (including any attachment(s)) may be copyright-protected >>>> and/or contain privileged and confidential information intended for use by >>>> the above-mentioned recipient only. If you are not the intended recipient >>>> of this message, then please inform the sender immediately via the >>>> telephone number, fax number or e-mail address indicated above and promptly >>>> delete this message from your system. Any unauthorized copying, disclosure >>>> to third parties or use of this message (including any attachment(s)) is >>>> strictly prohibited. It is generally accepted that the security of >>>> electronic communications is not failsafe. Despite our best efforts, we >>>> cannot guarantee that electronic communications received were in fact sent >>>> by the purported sender and we shall not be liable for the improper or >>>> incomplete transmission of the information contained in this communication, >>>> nor for any delay in its receipt or damage to your system. >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "grpc.io" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to [email protected]. >>>> To post to this group, send email to [email protected]. >>>> Visit this group at https://groups.google.com/group/grpc-io. >>>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/grpc-io/CACekFaEPNG6ZhXEZgyY-YwyNSE2V-aAgcsnCGcvN_nb13ovipg%40mail.gmail.com >>>> <https://groups.google.com/d/msgid/grpc-io/CACekFaEPNG6ZhXEZgyY-YwyNSE2V-aAgcsnCGcvN_nb13ovipg%40mail.gmail.com?utm_medium=email&utm_source=footer> >>>> . >>>> For more options, visit https://groups.google.com/d/optout. >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "grpc.io" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to [email protected]. >>>> To post to this group, send email to [email protected]. >>>> Visit this group at https://groups.google.com/group/grpc-io. >>>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/grpc-io/FFD79299-249F-47F3-88D8-3251B378B023%40earthlink.net >>>> <https://groups.google.com/d/msgid/grpc-io/FFD79299-249F-47F3-88D8-3251B378B023%40earthlink.net?utm_medium=email&utm_source=footer> >>>> . >>>> For more options, visit https://groups.google.com/d/optout. >>>> >>>> -- >>> You received this message because you are subscribed to the Google >>> Groups "grpc.io" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> To post to this group, send email to [email protected]. >>> Visit this group at https://groups.google.com/group/grpc-io. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/grpc-io/d3ae5c74-98db-430f-b750-7676170f407c%40googlegroups.com >>> <https://groups.google.com/d/msgid/grpc-io/d3ae5c74-98db-430f-b750-7676170f407c%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> For more options, visit https://groups.google.com/d/optout. >>> >>> -- >>> You received this message because you are subscribed to a topic in the >>> Google Groups "grpc.io" group. >>> To unsubscribe from this topic, visit >>> https://groups.google.com/d/topic/grpc-io/D2lsL82beCs/unsubscribe. >>> To unsubscribe from this group and all its topics, send an email to >>> [email protected]. >>> To post to this group, send email to [email protected]. >>> Visit this group at https://groups.google.com/group/grpc-io. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/grpc-io/092D7CAA-6C58-44C9-8194-E0AA9C75BC41%40earthlink.net >>> <https://groups.google.com/d/msgid/grpc-io/092D7CAA-6C58-44C9-8194-E0AA9C75BC41%40earthlink.net?utm_medium=email&utm_source=footer> >>> . >>> For more options, visit https://groups.google.com/d/optout. >>> >> >> >> -- >> ____________________________________ >> >> >> <http://pitechplus.com> >> >> Alexandru Keszeg >> >> Developer >> >> +40 747124216 >> >> Coratim Business Center >> >> Campul Painii nr.3-5 >> >> Cluj Napoca ROMANIA >> >> This message (including any attachment(s)) may be copyright-protected >> and/or contain privileged and confidential information intended for use by >> the above-mentioned recipient only. If you are not the intended recipient >> of this message, then please inform the sender immediately via the >> telephone number, fax number or e-mail address indicated above and promptly >> delete this message from your system. Any unauthorized copying, disclosure >> to third parties or use of this message (including any attachment(s)) is >> strictly prohibited. It is generally accepted that the security of >> electronic communications is not failsafe. Despite our best efforts, we >> cannot guarantee that electronic communications received were in fact sent >> by the purported sender and we shall not be liable for the improper or >> incomplete transmission of the information contained in this communication, >> nor for any delay in its receipt or damage to your system. >> >> -- > You received this message because you are subscribed to a topic in the > Google Groups "grpc.io" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/grpc-io/D2lsL82beCs/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > [email protected]. > To post to this group, send email to [email protected]. > Visit this group at https://groups.google.com/group/grpc-io. > To view this discussion on the web visit > https://groups.google.com/d/msgid/grpc-io/c82aa2b5-7965-44be-9e19-00430b5a82f7%40googlegroups.com > <https://groups.google.com/d/msgid/grpc-io/c82aa2b5-7965-44be-9e19-00430b5a82f7%40googlegroups.com?utm_medium=email&utm_source=footer> > . > For more options, visit https://groups.google.com/d/optout. > -- ____________________________________ <http://pitechplus.com> Alexandru Keszeg Developer +40 747124216 Coratim Business Center Campul Painii nr.3-5 Cluj Napoca ROMANIA This message (including any attachment(s)) may be copyright-protected and/or contain privileged and confidential information intended for use by the above-mentioned recipient only. If you are not the intended recipient of this message, then please inform the sender immediately via the telephone number, fax number or e-mail address indicated above and promptly delete this message from your system. Any unauthorized copying, disclosure to third parties or use of this message (including any attachment(s)) is strictly prohibited. It is generally accepted that the security of electronic communications is not failsafe. Despite our best efforts, we cannot guarantee that electronic communications received were in fact sent by the purported sender and we shall not be liable for the improper or incomplete transmission of the information contained in this communication, nor for any delay in its receipt or damage to your system. -- You received this message because you are subscribed to the Google Groups "grpc.io" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/grpc-io. To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/CACekFaHdjYdDs60285%2BqR-YbO1ZoDE%2BONuzRO%3Du1HFCHKhSpzg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
