Hi, 

We are using Apache Ignite 2.1 as a persistent store in PARTITIONED mode,
with 4 cluster nodes running. The atomicity mode is ATOMIC, the rebalance
mode is ASYNC, and CacheWriteSynchronizationMode is FULL_SYNC.
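
For reference, the cache settings described above correspond roughly to the
Java configuration below. This is only a minimal sketch: the cache name
"exampleCache", the String key/value types, and the class name are
placeholders, and it covers only the four settings mentioned here, not our
full configuration (persistence and discovery settings are left out).

```java
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class CacheSettingsSketch {
    // "exampleCache" and the String key/value types are placeholders,
    // not the names used in the real deployment.
    static CacheConfiguration<String, String> cacheConfig() {
        CacheConfiguration<String, String> cfg =
            new CacheConfiguration<>("exampleCache");

        // Data is split into partitions spread across the server nodes.
        cfg.setCacheMode(CacheMode.PARTITIONED);

        // Non-transactional (atomic) cache updates.
        cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);

        // Rebalancing runs in the background after topology changes.
        cfg.setRebalanceMode(CacheRebalanceMode.ASYNC);

        // A write completes only after primary and backup copies are updated.
        cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

        return cfg;
    }
}
```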

We are experiencing frequent connection issues where a server node gets
disconnected. This happens when we start writing a large amount of data
(close to 1.5 million key-value pairs, about 2 GB in size) into the cache.

The exception trace is below:

2017-12-21 06:43:13,926 WARN
[tcp-comm-worker-#1%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Failed to
connect to a remote node (make sure that destination node is alive and
operating system firewall is disabled on local and remote hosts)
[addrs=[ueu-ip-lapp0002.coresit.xxxxxx.org/xx.zz.216.22:47101,
ueu-ip-lapp0002.mgmt.xxxxxx.org/xx.yy.44.22:47101, /10.62.21.54:47101,
/127.0.0.1:47101]]
2017-12-21 06:43:13,926 ERROR
[tcp-comm-worker-#1%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi -
TcpCommunicationSpi failed to establish connection to node, node will be
dropped from cluster [rmtNode=TcpDiscoveryNode
[id=a9e043b4-c9d2-4922-aa0e-f44397b8dd5a, addrs=[xx.yy.21.54, xx.yy.44.22,
xx.zz.216.22, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0002.mgmt.xxxxxx.org/xx.yy.44.22:0, /xx.yy.21.54:0,
ueu-ip-lapp0002.coresit.xxxxxx.org/xx.zz.216.22:0, /127.0.0.1:0],
discPort=0, order=584, intOrder=294, lastExchangeTime=1513838405068,
loc=false, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=true]] class
org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node
still alive?). Make sure that each ComputeTask and cache Transaction has a
timeout set in order to prevent parties from waiting forever in case of
network issues [nodeId=a9e043b4-c9d2-4922-aa0e-f44397b8dd5a,
addrs=[ueu-ip-lapp0002.coresit.xxxxxx.org/xx.zz.216.22:47101,
ueu-ip-lapp0002.mgmt.xxxxxx.org/xx.yy.44.22:47101, /xx.yy.21.54:47101,
/127.0.0.1:47101]]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3179)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2763)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2655)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$5800(TcpCommunicationSpi.java:244)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:4053)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:3879)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
        Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=ueu-ip-lapp0002.coresit.xxxxxx.org/xx.zz.216.22:47101, err=Connection refused]
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3184)
                ... 6 more
        Caused by: java.net.ConnectException: Connection refused
                at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
                at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
                at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3030)
                ... 6 more
        Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=ueu-ip-lapp0002.mgmt.xxxxxx.org/xx.yy.44.22:47101, err=Connection refused]
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3184)
                ... 6 more
        Caused by: java.net.ConnectException: Connection refused
                at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
                at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
                at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3030)
                ... 6 more
        Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=/xx.yy.21.54:47101, err=Connection refused]
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3184)
                ... 6 more
        Caused by: java.net.ConnectException: Connection refused
                at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
                at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
                at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3030)
                ... 6 more
        Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=/127.0.0.1:47101, err=Connection refused]
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3184)
                ... 6 more
        Caused by: java.net.ConnectException: Connection refused
                at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
                at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
                at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
                at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3030)
                ... 6 more

2017-12-21 06:43:14,080 INFO
[disco-event-worker-#81%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager - Node
left topology: TcpDiscoveryNode [id=a9e043b4-c9d2-4922-aa0e-f44397b8dd5a,
addrs=[xx.yy.21.54, xx.yy.44.22, xx.zz.216.22, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0002.mgmt.xxxxxx.org/xx.yy.44.22:0, /xx.yy.21.54:0,
ueu-ip-lapp0002.coresit.xxxxxx.org/xx.zz.216.22:0, /127.0.0.1:0],
discPort=0, order=584, intOrder=294, lastExchangeTime=1513838405068,
loc=false, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=true]
2017-12-21 06:43:14,279 INFO
[disco-event-worker-#81%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager -
Topology snapshot [ver=585, servers=3, clients=0, CPUs=120, heap=15.0GB]
2017-12-21 06:43:14,280 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.exchange.time - Started exchange init
[topVer=AffinityTopologyVersion [topVer=585, minorTopVer=0], crd=false,
evt=11, node=TcpDiscoveryNode [id=ecc8c5ec-af66-484e-a4dc-041fbbdeb24f,
addrs=[xx.yy.21.26, xx.yy.44.50, xx.zz.216.50, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0003.mgmt.xxxxxx.org/xx.yy.44.50:47500,
/xx.yy.21.26:47500, /127.0.0.1:47500,
ueu-ip-lapp0003.coresit.xxxxxx.org/xx.zz.216.50:47500], discPort=47500,
order=2, intOrder=2, lastExchangeTime=1513838594269, loc=true,
ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], evtNode=TcpDiscoveryNode
[id=ecc8c5ec-af66-484e-a4dc-041fbbdeb24f, addrs=[xx.yy.21.26, xx.yy.44.50,
xx.zz.216.50, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0003.mgmt.xxxxxx.org/xx.yy.44.50:47500,
/xx.yy.21.26:47500, /127.0.0.1:47500,
ueu-ip-lapp0003.coresit.xxxxxx.org/xx.zz.216.50:47500], discPort=47500,
order=2, intOrder=2, lastExchangeTime=1513838594269, loc=true,
ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], customEvt=null]
2017-12-21 06:43:14,280 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture
- Snapshot initialization completed [topVer=AffinityTopologyVersion
[topVer=585, minorTopVer=0], time=0ms]
2017-12-21 06:43:14,280 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.exchange.time - Finished exchange init
[topVer=AffinityTopologyVersion [topVer=585, minorTopVer=0], crd=false]
2017-12-21 06:43:14,281 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager
- Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=585, minorTopVer=0], evt=NODE_LEFT,
node=a9e043b4-c9d2-4922-aa0e-f44397b8dd5a]
2017-12-21 06:43:17,650 INFO
[disco-event-worker-#81%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager - Added
new node to topology: TcpDiscoveryNode
[id=aba97147-ecf1-42b4-b697-1372134c5af1, addrs=[xx.yy.21.54, xx.yy.44.22,
xx.zz.216.22, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0002.mgmt.xxxxxx.org/xx.yy.44.22:0, /xx.yy.21.54:0,
ueu-ip-lapp0002.coresit.xxxxxx.org/xx.zz.216.22:0, /127.0.0.1:0],
discPort=0, order=586, intOrder=295, lastExchangeTime=1513838597629,
loc=false, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=true]
2017-12-21 06:43:17,820 INFO
[disco-event-worker-#81%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager -
Topology snapshot [ver=586, servers=3, clients=1, CPUs=120, heap=45.0GB]
2017-12-21 06:43:17,820 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.exchange.time - Started exchange init
[topVer=AffinityTopologyVersion [topVer=586, minorTopVer=0], crd=false,
evt=10, node=TcpDiscoveryNode [id=ecc8c5ec-af66-484e-a4dc-041fbbdeb24f,
addrs=[xx.yy.21.26, xx.yy.44.50, xx.zz.216.50, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0003.mgmt.xxxxxx.org/xx.yy.44.50:47500,
/xx.yy.21.26:47500, /127.0.0.1:47500,
ueu-ip-lapp0003.coresit.xxxxxx.org/xx.zz.216.50:47500], discPort=47500,
order=2, intOrder=2, lastExchangeTime=1513838597649, loc=true,
ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], evtNode=TcpDiscoveryNode
[id=ecc8c5ec-af66-484e-a4dc-041fbbdeb24f, addrs=[xx.yy.21.26, xx.yy.44.50,
xx.zz.216.50, 127.0.0.1],
sockAddrs=[ueu-ip-lapp0003.mgmt.xxxxxx.org/xx.yy.44.50:47500,
/xx.yy.21.26:47500, /127.0.0.1:47500,
ueu-ip-lapp0003.coresit.xxxxxx.org/xx.zz.216.50:47500], discPort=47500,
order=2, intOrder=2, lastExchangeTime=1513838597649, loc=true,
ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], customEvt=null]
2017-12-21 06:43:17,820 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture
- Snapshot initialization completed [topVer=AffinityTopologyVersion
[topVer=586, minorTopVer=0], time=0ms]
2017-12-21 06:43:17,821 INFO
[exchange-worker-#82%f76e71a5-7941-41a0-aca0-12fdab5f629e%] {}
org.apache.ignite.internal.exchange.time - Finished exchange init
[topVer=AffinityTopologyVersion [topVer=586, minorTopVer=0], crd=false]

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/