Thanks, Vladislav.

I have already tuned the GC parameters according to that link.

It seems that this error happens very frequently when the cache is very
large, e.g. over 15 GB per node off-heap.

BTW, I found that on the remote nodes, the JoinEvent and FailedEvent for the
joining node were received at almost the same time (well under a second
apart in the log below). Any idea why?

[2016.08.11 00:36:34,994 PDT][INFO ][disco-event-worker-#174%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=eeb076c7-2b63-431b-b53d-4acef66e99f2, addrs=[10.183.142.50, 10.65.84.249, 127.0.0.1], sockAddrs=[/127.0.0.1:47500, /10.183.142.50:47500, CO3SCH050520537/10.65.84.249:47500], discPort=47500, order=1158, intOrder=662, lastExchangeTime=1470900973709, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
[2016.08.11 00:36:34,996 PDT][INFO ][disco-event-worker-#174%null%][GridDiscoveryManager] Topology snapshot [ver=1158, servers=20, clients=146, CPUs=1336, heap=360.0GB]
[2016.08.11 00:36:35,439 PDT][INFO ][exchange-worker-#176%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=1157, minorTopVer=0], evt=NODE_FAILED, node=eeb076c7-2b63-431b-b53d-4acef66e99f2]
[2016.08.11 00:36:35,625 PDT][WARN ][disco-event-worker-#174%null%][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=eeb076c7-2b63-431b-b53d-4acef66e99f2, addrs=[10.183.142.50, 10.65.84.249, 127.0.0.1], sockAddrs=[/127.0.0.1:47500, /10.183.142.50:47500, CO3SCH050520537/10.65.84.249:47500], discPort=47500, order=1158, intOrder=662, lastExchangeTime=1470900973709, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
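
To put a number on "almost the same time", here is a rough sketch that computes the gap between the "Added new node to topology" and "Node FAILED" entries above. It assumes the timestamp format shown in these log lines; the helper names (parse_ts, gap_ms) are made up for illustration and are not part of Ignite.

```python
from datetime import datetime

# Timestamps copied from the two GridDiscoveryManager entries in the log above.
JOINED = "2016.08.11 00:36:34,994"  # "Added new node to topology"
FAILED = "2016.08.11 00:36:35,625"  # "Node FAILED"


def parse_ts(text):
    """Parse a log timestamp of the form 'YYYY.MM.DD HH:MM:SS,mmm'."""
    return datetime.strptime(text, "%Y.%m.%d %H:%M:%S,%f")


def gap_ms(start, end):
    """Return the elapsed time between two log timestamps, in milliseconds."""
    delta = parse_ts(end) - parse_ts(start)
    return round(delta.total_seconds() * 1000)


# Interval between the join and failure entries for node eeb076c7-...
print(gap_ms(JOINED, FAILED))
```

So the remote node saw the new node join and fail within roughly 0.6 s.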



