[grpc-io] Re: Detecting bad connection in java

yfertel Wed, 05 Jun 2019 17:24:38 -0700

So it seems I'm hitting the problem addressed in 
https://github.com/grpc/grpc-java/pull/5599 and 
https://github.com/grpc/proposal/blob/master/A18-tcp-user-timeout.md. The 
symptoms match exactly: the 15 min it takes to reset, the write-heavy 
workload, and the default Linux TCP timeout.
Sadly, I'm unable to upgrade to the latest release (1.21.0), which has the 
fix, at the moment. Does anyone know another way around this?
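
For what it's worth, the 15 min figure can be sanity-checked with a bit of arithmetic. This is only a rough sketch, assuming the usual Linux defaults as I understand them: `tcp_retries2` = 15, a retransmission timeout that starts at the 200 ms minimum, doubles on every retry, and is capped at 120 s. The real timeout depends on the measured round-trip time, so treat the result as an estimate. The class and method names are made up for illustration.

```java
import java.util.Locale;

public class TcpRetryTimeout {
    // Sums the waits before the kernel gives up on an unacknowledged segment:
    // the initial RTO plus one doubled (and capped) RTO per retransmission.
    // All three parameters are assumed defaults, not values read from a system.
    public static long totalMillis(int retries, long rtoMinMs, long rtoMaxMs) {
        long total = 0;
        long rto = rtoMinMs;
        for (int i = 0; i <= retries; i++) {
            total += rto;
            rto = Math.min(rto * 2, rtoMaxMs);
        }
        return total;
    }

    public static void main(String[] args) {
        long ms = totalMillis(15, 200, 120_000);
        // prints "924.6 s (about 15.4 min)"
        System.out.println(String.format(Locale.ROOT,
                "%.1f s (about %.1f min)", ms / 1000.0, ms / 60000.0));
    }
}
```

That lands close to the roughly 15 min of failures I see before the `Connection timed out`.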


Thanks, 

Yosef

On Wednesday, June 5, 2019 at 3:01:42 PM UTC-4, Carl Mastrangelo wrote:
>
> You'll need to read the docs of that method I linked (and the other ones 
> with "keepalive" in the name).  It discusses what will happen with regards 
> to active calls.
>
> On Wednesday, June 5, 2019 at 11:02:34 AM UTC-7, [email protected] wrote:
>>
>> Hey Carl, 
>>
>> Thanks for your help with this!
>>
>> Is it correct that if I am submitting calls every 10 seconds, but my 
>> keepAliveTime > 10 seconds, then no keepalives would be sent? Or are 
>> keepalives sent regardless of request activity?
>> Is it only failed keepalive pings that indicate a channel is down, and 
>> not failed regular requests?
>>
>> Thanks again,
>>
>> Yosef
>>
>> On Wednesday, June 5, 2019 at 1:57:25 PM UTC-4, Carl Mastrangelo wrote:
>>>
>>> Try setting the keep alive settings defined here:
>>>
>>>
>>> https://grpc.github.io/grpc-java/javadoc/io/grpc/ManagedChannelBuilder.html#keepAliveTimeout-long-java.util.concurrent.TimeUnit-
>>>
>>> On Wednesday, June 5, 2019 at 10:48:38 AM UTC-7, [email protected] wrote:
>>>>
>>>> Hey folks, 
>>>>
>>>> I'm making calls using a `ManagedChannel` over a VPN connection. When 
>>>> the VPN renegotiates (roughly every 8 hours), a periodic call (issued 
>>>> every 10 seconds) that would ordinarily take 1 ms instead times out 
>>>> repeatedly for about 15 min before the `ManagedChannel` figures out the 
>>>> connection is bad (fails with `RuntimeException: UNAVAILABLE`) and 
>>>> resets. Is there a way to make that happen faster? I've set the deadline 
>>>> for this call to 2 seconds, which is typically more than enough time (it 
>>>> usually comes back in under 10 ms). The logs end up looking something 
>>>> like this:
>>>>
>>>>
>>>> 14:14:05.256 io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 4999901019ns
>>>> 14:14:15.257 io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 4999901019ns
>>>>
>>>> ... call is made every 10 sec and fails like this for 15 min before 
>>>>
>>>> 14:30:42.236 java.io.IOException: Connection timed out
>>>> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>>> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>>>> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>>>> at io.grpc.netty.shaded.io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288)
>>>> at io.grpc.netty.shaded.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
>>>> at io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347)
>>>> at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
>>>> at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656)
>>>> at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:591)
>>>> at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:508)
>>>> at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470)
>>>> at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
>>>> at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>>>> at java.lang.Thread.run(Thread.java:748)
>>>> Wrapped by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
>>>>
>>>> After which everything resumes normally.
>>>>
>>>> How can I get `ManagedChannel` to pick up that the connection is bad 
>>>> and reset in less than 15 min? I've tested outside of gRPC that the VPN 
>>>> renegotiation takes about 2 seconds, nowhere close to 15 min.
>>>>
>>>> I'm using gRPC for Java 1.18.0 with Netty.
>>>>
>>>> Many thanks!
>>>>
>>>> Yosef
>>>>
>>>
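
For anyone reading along, the keepalive settings Carl points to in the thread above can be sketched roughly as follows. This is a minimal sketch, not a recommendation: it assumes grpc-java is on the classpath, the class name and the `host`/`port` parameters are hypothetical, and every duration is a placeholder to be tuned. The three builder methods are the ones with "keepalive" in the name on `ManagedChannelBuilder`.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

public class KeepAliveChannel {
    // Hypothetical helper; the durations below are illustrative only.
    static ManagedChannel build(String host, int port) {
        return ManagedChannelBuilder.forAddress(host, port)
                // Time without read activity before a keepalive ping is sent.
                .keepAliveTime(30, TimeUnit.SECONDS)
                // How long to wait for read activity after the ping before
                // the connection is considered dead.
                .keepAliveTimeout(5, TimeUnit.SECONDS)
                // Also ping when there are no outstanding RPCs.
                .keepAliveWithoutCalls(true)
                .build();
    }
}
```

With values like these, a dead connection would presumably be noticed within about `keepAliveTime` + `keepAliveTimeout` rather than after the TCP retransmission timeout. Note that a server may reject pings it considers too frequent, so the server-side keepalive policy may need to allow the chosen interval.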
