[ https://issues.apache.org/jira/browse/IGNITE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vyacheslav Koptilin updated IGNITE-20076:
-----------------------------------------
Labels: ignite-3 (was: igntie-3)
> Improve networking shutdown implementation
> ------------------------------------------
>
> Key: IGNITE-20076
> URL: https://issues.apache.org/jira/browse/IGNITE-20076
> Project: Ignite
> Issue Type: Bug
> Reporter: Roman Puchkovskiy
> Assignee: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Currently, when initiating an Ignite node's shutdown, we first stop
> ScaleCube's cluster (so that it sends a LEAVING message) and only when it's
> completely shut down do we shut down the connection manager. As a result, there is
> some interval when the node's networking thinks it's still alive (and hence
> it tries to restore connections with other nodes), but other nodes think the
> node has already left (as they received that LEAVING message from it), so
> they don't let it establish connections. The first node sees that it is
> rejected and tries to handle this as a critical failure. Currently, it just
> logs a scary message, but, when we implement a proper failure handler, this
> will kill the node. This is not ok for a graceful stop scenario.
> The idea is to first (before stopping the ScaleCube local cluster) tell the
> connection manager that it is now in the 'stopping' state. In this state, it
> does not try to establish new connections (and does not attempt to reconnect)
> and does not allow any incoming connections; also, it does not handle
> rejections by other nodes as critical failures in this state.
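
The 'stopping' state described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Ignite connection manager: the class name StoppableConnectionManager, the RejectionOutcome enum and all method names are hypothetical, and real connection handling is left out so that only the state checks remain.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a connection manager with a 'stopping' state; not Ignite's real class. */
class StoppableConnectionManager {
    /** How a rejection by a remote node is treated (hypothetical enum). */
    enum RejectionOutcome { CRITICAL_FAILURE, IGNORED }

    /** Set once, before the ScaleCube local cluster is stopped. */
    private final AtomicBoolean stopping = new AtomicBoolean(false);

    /**
     * Switches the manager to the 'stopping' state.
     * Returns true only for the call that actually performed the switch.
     */
    boolean initiateStopping() {
        return stopping.compareAndSet(false, true);
    }

    /** New outgoing connections and reconnect attempts are allowed only while not stopping. */
    boolean mayOpenOutgoingConnection() {
        return !stopping.get();
    }

    /** Incoming connections are refused once stopping has begun. */
    boolean mayAcceptIncomingConnection() {
        return !stopping.get();
    }

    /** A rejection by a remote node is critical only if this node is not already stopping. */
    RejectionOutcome onRejectedByRemote() {
        return stopping.get() ? RejectionOutcome.IGNORED : RejectionOutcome.CRITICAL_FAILURE;
    }
}
```

With such a flag, a graceful stop would call initiateStopping() first, then stop the ScaleCube local cluster (which sends the LEAVING message), and only then fully stop the connection manager, so a rejection arriving in between is not escalated.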