Failed to process selector key (Connection reset by peer)

erathina Fri, 05 Mar 2021 21:19:26 -0800

Hi,

We have an Ignite cluster with two server nodes. At certain times during
the week we get the error below, repeated in sequence, and we believe it is
causing JVM heap usage to grow. We run Ignite 2.8.0 on JDK 11 with both
-Xms and -Xmx set to 2 GB. We know 2 GB is very small, but we do not believe
that increasing the heap size will solve the issue. The exact stack trace
is:



Mar 02, 2021 1:45:20 AM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Failed to process selector key [ses=GridSelectorNioSessionImpl
[worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=0
lim=8192 cap=8192], super=AbstractNioClientWorker [idx=3, bytesRcvd=0,
bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
[name=grid-nio-worker-client-listener-3, igniteInstanceName=null,
finished=false, heartbeatTs=1614667518323, hashCode=92764489,
interrupted=false, runner=grid-nio-worker-client-listener-3-#133]]],
writeBuf=null, readBuf=null, inRecovery=null, outRecovery=null,
closeSocket=true, outboundMessagesQueueSizeMetric=null,
super=GridNioSessionImpl [locAddr=/x.x.x.x:x, rmtAddr=/x.x.x.x:x,
createTime=1614667512243, closeTime=0, bytesSent=0, bytesRcvd=517,
bytesSent0=0, bytesRcvd0=0, sndSchedTime=1614667512243,
lastSndTime=1614667512243, lastRcvTime=1614667512273, readsPaused=false,
filterChain=FilterChain[filters=[GridNioAsyncNotifyFilter,
GridNioCodecFilter [parser=ClientListenerBufferedParser, directMode=false]],
accepted=true, markedForClose=false]]]
java.io.IOException: Connection reset by peer
at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:245)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:358)
at org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:1162)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2449)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2216)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1857)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.base/java.lang.Thread.run(Thread.java:834)
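
For reference, the IOException at the top of this trace is what a
SocketChannel read reports when the remote side aborts the connection with
a TCP RST instead of closing it normally. The following is a minimal
JDK-only sketch of that situation, not Ignite code: the class name
ConnectionResetDemo and the method provokeReset are hypothetical, and
SO_LINGER with a zero timeout is used here only as one way to force a reset.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; it does not use or depend on Ignite.
class ConnectionResetDemo {

    /**
     * Opens a loopback connection, aborts it from the client side and
     * returns the IOException seen by the server-side read
     * (or null if the read completed without an exception).
     */
    static IOException provokeReset() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick any free port.
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            Socket client = new Socket("127.0.0.1", port);
            try (SocketChannel accepted = server.accept()) {
                // SO_LINGER with timeout 0 makes close() abort the
                // connection (RST) instead of closing it gracefully (FIN).
                client.setSoLinger(true, 0);
                client.close();

                // Give the reset time to reach the server side.
                Thread.sleep(200);

                try {
                    accepted.read(ByteBuffer.allocate(8192));
                    return null;
                } catch (IOException e) {
                    return e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(provokeReset());
    }
}
```

The exact exception message depends on the JDK version and operating
system, so the sketch only relies on an IOException being raised.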


The server crashes with a Java OutOfMemoryError. Looking at the .hprof heap
dump for the biggest objects at the time of the OOM, we saw this:

<http://apache-ignite-users.70518.x6.nabble.com/file/t3087/highheapmem.png> 

It looks like the ClientListenerNioServerBuffer alone is consuming 1 GB of
memory at the time of the crash. Shouldn't this buffer be cleared when there
is any issue with NC's?

Other threads suggest increasing the socket timeout or reducing the failure
detection timeout. I will try both, but I am skeptical that those fixes
will work.

Any help is appreciated!

Thanks!



