Fail to join topology and repeat join process

Jason Thu, 11 Aug 2016 07:49:10 -0700

Hi Ignite team,

In a cluster with 20 server nodes, I manually restarted one server node to
test partition rebalancing and reliability. The restarted node could not
rejoin the topology and kept logging the error below. This went on for a
few hours without making any progress.


Attached is the log from the other remote server nodes.

FYI, we have a large cache with 20 GB of off-heap memory per node.

Could you take a look and give us some suggestions on how to tune this?

Any suggestions or advice would be appreciated.

Thanks,
-Jason

[TcpDiscoverySpi] Node has not been connected to topology and will repeat
join process. Check remote nodes logs for possible error messages. Note that
large topology may require significant time to start. Increase
'TcpDiscoverySpi.networkTimeout' configuration property if getting this
message on the starting nodes [networkTimeout=30000]
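
For reference, the property named in that message can be set in code on the
discovery SPI before the node starts. The snippet below is only a minimal
sketch: the class name DiscoveryTimeoutSketch and the 60000 ms value are
made up for illustration (the log shows the node currently using 30000 ms),
and a real configuration would also carry the IP finder and cache settings
used by the rest of the cluster. Whether a larger timeout actually resolves
the join problem is not established here.

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

public class DiscoveryTimeoutSketch {
    public static void main(String[] args) {
        TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();

        // Illustrative value only (assumption): double the 30000 ms
        // reported in the log message. Tune for the actual cluster.
        discoverySpi.setNetworkTimeout(60_000L);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discoverySpi);

        // Starts a node with the adjusted discovery settings. The IP finder
        // and cache configuration are omitted and must match the cluster.
        Ignition.start(cfg);
    }
}
```

Running this requires the Ignite core library on the classpath.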

error_log.txt
<http://apache-ignite-users.70518.x6.nabble.com/file/n6987/error_log.txt>  



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Fail-to-join-topology-and-repeat-join-process-tp6987.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
