Finishing work on IGNITE-752 (Speed up failure detection)

Denis Magda Thu, 23 Jul 2015 03:33:08 -0700

Igniters,

During this week I've been working on an improvement that lets us detect failures at the discovery, communication, and network levels of cluster nodes as quickly as possible, and lets the user tune this behavior with a single configuration parameter.

Failure detection has, of course, existed in Ignite for a long time, and the user is able to tune it, BUT there are around *10* configuration parameters that have to be set up to achieve the desired result.

Once IGNITE-752 is merged into the main development branch, all of this behavior can be controlled with a single parameter: IgniteConfiguration.failureDetectionThreshold.

Setting the failure detection threshold for a server node makes it possible to detect failed nodes in the cluster topology within a time equal to the threshold's value and to switch to, or keep working with, only the alive nodes. Setting the threshold for a client node lets us detect connection failures between the client and its router node (a server node that is part of the topology).
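To make the idea concrete, here is a minimal sketch of threshold-based detection. It is only an illustration, not the code from the IGNITE-752 branch: the class ThresholdFailureDetector and its methods are hypothetical names, and it assumes a node counts as failed once nothing has been heard from it for longer than the threshold.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; this is NOT how TcpDiscoverySpi is implemented. */
class ThresholdFailureDetector {
    /** The single tuning knob, in milliseconds. */
    private final long thresholdMs;

    /** Time of the last message received from each node, keyed by node id. */
    private final Map<String, Long> lastSeenMs = new ConcurrentHashMap<>();

    ThresholdFailureDetector(long thresholdMs) {
        if (thresholdMs <= 0)
            throw new IllegalArgumentException("Threshold must be positive: " + thresholdMs);
        this.thresholdMs = thresholdMs;
    }

    /** Records that the node was heard from at the given time. */
    void onHeartbeat(String nodeId, long nowMs) {
        lastSeenMs.put(nodeId, nowMs);
    }

    /** A node is failed if it is unknown or has been silent longer than the threshold. */
    boolean isFailed(String nodeId, long nowMs) {
        Long seen = lastSeenMs.get(nodeId);
        return seen == null || nowMs - seen > thresholdMs;
    }

    /** Returns the sorted ids of the nodes that are still considered alive. */
    List<String> aliveNodes(long nowMs) {
        List<String> alive = new ArrayList<>();
        for (String id : lastSeenMs.keySet()) {
            if (!isFailed(id, nowMs))
                alive.add(id);
        }
        Collections.sort(alive);
        return alive;
    }
}
```

With a threshold of 1000 ms, a node last heard from at time 0 is still alive at time 1000 and is reported as failed from time 1001 on, so the rest of the topology can continue with the alive nodes only.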

In addition, a bunch of other improvements and simplifications were made at the level of TcpDiscoverySpi and TcpCommunicationSpi. The changes are aggregated here:

https://issues.apache.org/jira/browse/IGNITE-752

The general review has passed. However, if anyone else wants to review the changes or has any thoughts or suggestions, don't hesitate to share them.

Dmitriy S, I would like to ask you to review the documentation changes in any case before I do the merge.



Regards,
Denis
