Hi,

Thanks for the logs, but what I meant was: could you attach the full logs
from all Ignite nodes? These excerpts are not enough to pinpoint the root cause.

From what I have seen so far: first of all, try removing the explicit
timeouts and setting the failureDetectionTimeout property instead. It applies
to both TcpDiscoverySpi and TcpCommunicationSpi, and you can set it directly
on the IgniteConfiguration instance.

Could you also check whether it is the same node that gets kicked out of the
cluster after 10 minutes, or whether a different node fails each time?

From these short logs I can only see that there are nodes with these IDs:
N1. d0daf95e-49c0-4071-b6bc-cd7e279c0582
N2. a0830a00-1970-43b4-b143-6e1947f0059f
N3. 4313271e-8273-4f28-bcab-aac3a43e6722

N1 does not appear in the rcvd= section, so I assume the problem is on this
node. The most probable causes are network issues and GC pauses, so I need
the full logs from all nodes, with GC logging enabled:

Add the following to the JVM options:

-XX:+PrintAdaptiveSizePolicy -XX:+PrintGC
-XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC

Add -DIGNITE_QUIET=false to the node startup command and collect the logs.
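
In case it helps, here is a rough sketch of how the options above could be put together before starting a node. It assumes the node is started through the stock bin/ignite.sh script, which as far as I know picks up the JVM_OPTS environment variable, and that IGNITE_HOME points at your installation. Setting JVM_OPTS replaces the script's default heap settings, so put your own -Xms/-Xmx values in JVM_OPTS first. The GC_LOG path and the -Xloggc option are my additions and are not required; without -Xloggc the GC output goes to stdout. The flags are Java 8 style, so adjust them if you run a newer JVM.

```shell
#!/bin/sh
# GC diagnostics flags, exactly as listed in the message above (Java 8 style).
GC_OPTS="-XX:+PrintAdaptiveSizePolicy -XX:+PrintGC \
-XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime \
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
-XX:+PrintHeapAtGC"

# Assumption: GC_LOG is a hypothetical location; -Xloggc is an optional extra
# that writes GC output to a separate file instead of stdout.
GC_LOG="${GC_LOG:-/tmp/ignite-gc.log}"

# Keep any options that are already set, then add GC logging and verbose
# Ignite logging (-DIGNITE_QUIET=false).
JVM_OPTS="${JVM_OPTS:-} $GC_OPTS -Xloggc:$GC_LOG -DIGNITE_QUIET=false"
export JVM_OPTS

echo "JVM_OPTS=$JVM_OPTS"

# Assumption: IGNITE_HOME points at the Ignite installation directory.
if [ -n "${IGNITE_HOME:-}" ] && [ -x "$IGNITE_HOME/bin/ignite.sh" ]; then
    "$IGNITE_HOME/bin/ignite.sh"
else
    echo "IGNITE_HOME is not set or bin/ignite.sh was not found; node not started"
fi
```

If you start the nodes from your own application instead of ignite.sh, pass the same flags on the java command line.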

Regards,
Anton



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
