RE: How long Ignite retries upon NODE_FAILED events

HEWA WIDANA GAMAGE, SUBASH Mon, 02 Jul 2018 08:37:58 -0700

Yes, failureDetectionTimeout determines how long it waits before marking a node 
as failed. But my question is: after a node has been marked as failed, what 
happens when that failed node becomes reachable on the network again (less than 
failureDetectionTimeout)?


From: Evgenii Zhuravlev [mailto:[email protected]]
Sent: Monday, July 02, 2018 11:05 AM
To: [email protected]
Subject: Re: How long Ignite retries upon NODE_FAILED events

Hi,

By default, Ignite uses a mechanism that can be configured using 
failureDetectionTimeout: 
https://apacheignite.readme.io/v2.5/docs/tcpip-discovery#section-failure-detection-timeout

Evgenii
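
For readers of the archive, a minimal sketch of where this setting lives when 
a node is configured programmatically. It assumes the Java API of Ignite 2.x; 
the class name and the 10-second value are illustrative only and are not taken 
from this thread.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical example class; not part of the original thread.
public class FailureDetectionTimeoutExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // How long (in milliseconds) the cluster waits for an unresponsive
        // node before treating it as failed. The value here is illustrative.
        cfg.setFailureDetectionTimeout(10_000L);

        // Start a node with this configuration; it is stopped when the
        // try-with-resources block exits.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("failureDetectionTimeout = "
                + ignite.configuration().getFailureDetectionTimeout() + " ms");
        }
    }
}
```

This only shows how the timeout is set. It does not answer what happens once a 
node that was marked as failed becomes reachable again.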

2018-07-02 16:40 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH 
<[email protected]<mailto:[email protected]>>:
Hi team,

For example, let’s say one of the nodes is not down (the JVM is up), but the 
network is not reachable from/to it. The rest of the nodes will then see 
NODE_FAILED and continue working as normal with a reduced cluster size. Suppose 
the network from/to that failed node becomes normal again after X minutes. Then:
- Will the other nodes discover it, or will that node be able to figure it out?
- How long can X be at most? Is there a max retry count or timeout? (I have seen 
the joinTimeout param in discovery, but that seems applicable only at startup, 
i.e. how long a starting node should wait to join the others.)
