[
https://issues.apache.org/jira/browse/IGNITE-22377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Bessonov updated IGNITE-22377:
-----------------------------------
Reviewer: Aditya Mukhopadhyay (was: Filipp Shergalis)
> Choose node to fail on a refused handshake
> ------------------------------------------
>
> Key: IGNITE-22377
> URL: https://issues.apache.org/jira/browse/IGNITE-22377
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.2
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, if a node is refused during a handshake because it is stale from
> the point of view of the node to which it connects, the refused node notifies
> its FailureHandler to force a node restart.
> If a network partition happens, this might cause problems once the partition
> disappears: nodes from different segments will start sniping each other. In
> the worst case, a single segmented node might make the whole cluster (except
> itself) restart.
> It is suggested that the refusing node sends the following information about
> the physical topology (PT), as it sees it, to the refused node:
> # Number of nodes in the PT
> # Min ID of nodes in the PT
> The refused node will only restart if the number of nodes in the PT, as it
> sees it, is less than the number of nodes in the PT of the refusing node; if
> the sizes are equal, comparing the Min IDs of nodes in the two PTs makes the
> decision deterministic.
> This idea needs to be thought through and improved (or rejected).
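
The proposed rule can be sketched roughly as follows. This is only an illustration in Python, not Ignite code: `TopologySummary` and `refused_node_should_restart` are hypothetical names, and node IDs are modelled as plain comparable strings. The description does not say which side restarts when the PT sizes are equal, so the tie-break direction here (the side whose PT has the larger Min ID restarts) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopologySummary:
    """Hypothetical summary of the physical topology (PT) as seen by one node."""

    node_count: int   # number of nodes in the PT
    min_node_id: str  # smallest node ID in the PT (modelled as a string here)


def refused_node_should_restart(local: TopologySummary,
                                remote: TopologySummary) -> bool:
    """Decide whether the refused node should restart.

    `local` is the PT as seen by the refused node; `remote` is the PT
    reported by the refusing node.
    """
    if local.node_count != remote.node_count:
        # The node that sees the smaller PT is the one that restarts.
        return local.node_count < remote.node_count
    # Assumption: on equal sizes, the side with the larger Min ID restarts.
    # If both summaries are identical, the refused node does not restart.
    return local.min_node_id > remote.min_node_id
```

With this rule a single segmented node that sees a PT of size 1 restarts when refused by a node in a larger segment, while a node in the larger segment does not restart when refused by the segmented node. Because both sides compare the same two summaries, at most one of them decides to restart.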