Pavel Kovalenko created IGNITE-9494:
---------------------------------------
Summary: Communication error resolver may be invoked when topology
is under construction
Key: IGNITE-9494
URL: https://issues.apache.org/jira/browse/IGNITE-9494
Project: Ignite
Issue Type: Bug
Components: cache
Affects Versions: 2.5
Reporter: Pavel Kovalenko
Fix For: 2.7
ZooKeeper Discovery.
When many nodes start and join the topology at the same time, communication
errors can occur and trigger the communication error resolver.
The communication error resolver initiates a peer-to-peer ping process on all
alive nodes. The youngest nodes in the cluster may not yet have a complete
picture of the alive nodes. As a result, a young node may not ping every
available node, and the coordinator may decide that the unpinged nodes have
an unstable network and unexpectedly kill them.
We should throttle the communication error resolver during a massive node join
to give the joining nodes time to get a complete picture of the topology.
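
One possible shape for such throttling, as a minimal sketch only: postpone the
resolver until no node has joined for a configurable quiet period. The class
ResolverThrottle, its methods, and the quiet-period parameter are hypothetical
names chosen for illustration and are not part of the Ignite API; the actual
fix may work differently.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not an Ignite class): defers communication error
 * resolution until the topology has been stable for a quiet period.
 */
final class ResolverThrottle {
    /** How long the topology must stay unchanged before resolving, in ms. */
    private final long quietPeriodMs;

    /** Time source in ms; injected so the logic can be tested without sleeping. */
    private final LongSupplier clock;

    /** Timestamp of the most recent node join. */
    private long lastJoinMs;

    /** Whether any join has been observed yet. */
    private boolean joinSeen;

    ResolverThrottle(long quietPeriodMs, LongSupplier clock) {
        if (quietPeriodMs < 0)
            throw new IllegalArgumentException("quietPeriodMs must be >= 0");

        this.quietPeriodMs = quietPeriodMs;
        this.clock = clock;
    }

    /** Call from the (assumed) discovery listener on every node join event. */
    synchronized void onNodeJoined() {
        lastJoinMs = clock.getAsLong();
        joinSeen = true;
    }

    /** Returns how long the resolver should still wait; 0 means "run now". */
    synchronized long delayBeforeResolveMs() {
        if (!joinSeen)
            return 0L;

        long elapsed = clock.getAsLong() - lastJoinMs;

        return Math.max(0L, quietPeriodMs - elapsed);
    }

    /** Convenience check for callers that poll instead of scheduling. */
    boolean canResolveNow() {
        return delayBeforeResolveMs() == 0L;
    }
}
```

A caller would invoke onNodeJoined() for each join event and, on a
communication error, schedule the resolver after delayBeforeResolveMs()
instead of starting the ping process immediately. Every new join restarts the
quiet period, so a sustained burst of joins keeps postponing resolution; a real
implementation would likely also need an upper bound on the total delay.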