You must increase the Linux NOFILE ulimit when running Ignite.  The
documentation describes how to do this.
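
As a rough sketch (not taken from the Ignite documentation), the limit and
the current descriptor usage can be inspected from a shell on Linux. The
function names below are hypothetical helpers, and the value 65536 in the
closing comment is only an illustrative number, not a recommendation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the open-file (NOFILE) limit on Linux.

# Print the soft and hard NOFILE limits of the current shell.
nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# Count the open descriptors of a process by listing /proc/<pid>/fd.
# Defaults to the current shell; inspecting another user's process needs sudo.
open_fd_count() {
    ls "/proc/${1:-$$}/fd" | wc -l
}

nofile_limits
echo "open descriptors: $(open_fd_count)"

# To raise the soft limit for this shell and the processes it starts
# (bounded by the hard limit), one would run, for example:
#   ulimit -Sn 65536
```

Passing the Ignite process ID to `open_fd_count` shows how close the node is
to the limit. A limit raised with `ulimit` applies only to that shell and its
children, so it has to be set in the environment that launches Ignite.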

On Sun, Jun 24, 2018, 12:47 PM 胡海麟 <[email protected]> wrote:

> Hi,
>
> Reposting this message because the logs I pasted did not come through.
>
> I have been getting repeated "Too many open files" exceptions for some time.
> ================================
> [11:26:24,493][SEVERE][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
> Failed to process selector key [ses=GridSelectorNioSessionImpl
> [worker=ByteBufferNioClientWorker
> [readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
> super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0,
> bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
> [name=grid-nio-worker-tcp-rest-1, igniteInstanceName=null,
> finished=false, hashCode=1611196193, interrupted=false,
> runner=grid-nio-worker-tcp-rest-1-#57]]], writeBuf=null, readBuf=null,
> inRecovery=null, outRecovery=null, super=GridNioSessionImpl
> [locAddr=/10.1.14.11:11211, rmtAddr=/10.1.252.184:40680,
> createTime=1529666783471, closeTime=0, bytesSent=5, bytesRcvd=1074,
> bytesSent0=0, bytesRcvd0=0, sndSchedTime=1529666783481,
> lastSndTime=1529666783481, lastRcvTime=1529666783481,
> readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter
> [parser=GridTcpRestParser [marsh=JdkMarshaller
> [clsFilter=o.a.i.i.IgniteKernal$5@331b0c4a], routerClient=false],
> directMode=false]], accepted=true]]]
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:1085)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2339)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2110)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
>         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>         at java.lang.Thread.run(Thread.java:748)
>
> [11:26:24,493][WARNING][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
> Closing NIO session because of unhandled exception [cls=class
> o.a.i.i.util.nio.GridNioException, msg=Connection reset by peer]
>
> [11:26:24,493][WARNING][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
> Closed client session due to exception [ses=GridSelectorNioSessionImpl
> [worker=ByteBufferNioClientWorker
> [readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
> super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0,
> bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
> [name=grid-nio-worker-tcp-rest-1, igniteInstanceName=null,
> finished=false, hashCode=1611196193, interrupted=false,
> runner=grid-nio-worker-tcp-rest-1-#57]]], writeBuf=null, readBuf=null,
> inRecovery=null, outRecovery=null, super=GridNioSessionImpl
> [locAddr=/10.1.14.11:11211, rmtAddr=/10.1.252.184:40680,
> createTime=1529666783471, closeTime=1529666784488, bytesSent=5,
> bytesRcvd=1074, bytesSent0=0, bytesRcvd0=0,
> sndSchedTime=1529666783481, lastSndTime=1529666783481,
> lastRcvTime=1529666783481, readsPaused=false,
> filterChain=FilterChain[filters=[GridNioCodecFilter
> [parser=GridTcpRestParser [marsh=JdkMarshaller
> [clsFilter=o.a.i.i.IgniteKernal$5@331b0c4a], routerClient=false],
> directMode=false]], accepted=true]], msg=Connection reset by peer]
> [11:26:24,513][SEVERE][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
> Caught unhandled exception in NIO worker thread (restart the node).
> java.lang.NullPointerException
>         at
> sun.nio.ch.EPollArrayWrapper.isEventsHighKilled(EPollArrayWrapper.java:174)
>         at
> sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:190)
>         at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:239)
>         at
> sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:178)
>         at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:132)
>         at
> java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:212)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.register(GridNioServer.java:2545)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1934)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
>         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>         at java.lang.Thread.run(Thread.java:748)
> [11:26:30,277][SEVERE][nio-acceptor-#55][GridTcpRestProtocol] Failed
> to accept remote connection (will wait for 2000ms).
> class org.apache.ignite.IgniteCheckedException: Failed to accept
> connection: GridWorker [name=nio-acceptor, igniteInstanceName=null,
> finished=false, hashCode=1020662787, interrupted=false,
> runner=nio-acceptor-#55]
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2888)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2822)
>         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at
> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>         at
> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.processSelectedKeys(GridNioServer.java:2938)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2872)
>         ... 3 more
> [11:26:32,284][SEVERE][nio-acceptor-#55][GridTcpRestProtocol] Failed
> to accept remote connection (will wait for 2000ms).
> class org.apache.ignite.IgniteCheckedException: Failed to accept
> connection: GridWorker [name=nio-acceptor, igniteInstanceName=null,
> finished=false, hashCode=1020662787, interrupted=false,
> runner=nio-acceptor-#55]
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2888)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2822)
>         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at
> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>         at
> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.processSelectedKeys(GridNioServer.java:2938)
>         at
> org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2872)
>         ... 3 more
> ================================
>
> My max open files limit is 32768, and the Ignite process does have 32768
> open files.
> ================================
> $ sudo ls -hl /proc/4055/fd/ | wc -l
> 32768
> ================================
>
> Most of them look like this:
> ================================
> ...
> lrwx------ 1 root root 64 Jun 23 12:22 9990 -> socket:[1167798]
> lrwx------ 1 root root 64 Jun 23 12:22 9991 -> socket:[1167799]
> lrwx------ 1 root root 64 Jun 23 12:22 9992 -> socket:[1166839]
> lrwx------ 1 root root 64 Jun 23 12:22 9993 -> socket:[1167800]
> lrwx------ 1 root root 64 Jun 23 12:22 9994 -> socket:[1168762]
> lrwx------ 1 root root 64 Jun 23 12:22 9995 -> socket:[1168763]
> lrwx------ 1 root root 64 Jun 23 12:22 9996 -> socket:[1164109]
> lrwx------ 1 root root 64 Jun 23 12:22 9997 -> socket:[1166840]
> lrwx------ 1 root root 64 Jun 23 12:22 9998 -> socket:[1164110]
> lrwx------ 1 root root 64 Jun 23 12:22 9999 -> socket:[1169810]
> ================================
>
> I haven't found any documentation on how Ignite uses Unix sockets.
> It seems Ignite doesn't close them properly. Any help?
>
> Thanks.
>
>
