[ 
https://issues.apache.org/jira/browse/IGNITE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin updated IGNITE-20076:
-----------------------------------------
    Labels: ignite-3  (was: igntie-3)

> Improve networking shutdown implementation
> ------------------------------------------
>
>                 Key: IGNITE-20076
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20076
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, when initiating an Ignite node shutdown, we first stop 
> ScaleCube's cluster (so that it sends a LEAVING message) and only when it's 
> completely shut down do we shut down the connection manager. As a result, there is 
> some interval when the node's networking thinks it's still alive (and hence 
> it tries to restore connections with other nodes), but other nodes think the 
> node has already left (as they received that LEAVING message from it), so 
> they don't let it establish connections. The first node sees that it is 
> rejected and tries to handle this as a critical failure. Currently, it just 
> logs a scary message, but, when we implement a proper failure handler, this 
> will kill the node. This is not ok for a graceful stop scenario.
> The idea is to first (before stopping the ScaleCube local cluster) tell the 
> connection manager that it is now in the 'stopping' state. In this state, it 
> does not try to establish new connections (and does not attempt to reconnect) 
> and does not allow any incoming connections; also, it does not handle 
> rejections by other nodes as critical failures in this state.
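
The 'stopping' state described above could be sketched roughly as follows. This is only an illustration under assumptions, not Ignite's actual connection manager: the class name StoppingAwareConnectionManager, its methods, and the RejectionOutcome enum are all hypothetical names invented for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical sketch of a connection manager with a 'stopping' state.
 * None of these names are taken from the Ignite code base.
 */
final class StoppingAwareConnectionManager {
    /** How a rejection by a remote node is treated. */
    enum RejectionOutcome { IGNORED, CRITICAL_FAILURE }

    /** Set once, before the ScaleCube local cluster is stopped. */
    private final AtomicBoolean stopping = new AtomicBoolean(false);

    /** Moves the manager to the 'stopping' state; calling it again has no further effect. */
    void initiateStopping() {
        stopping.set(true);
    }

    /** New outgoing connections and reconnect attempts are allowed only before stopping. */
    boolean mayOpenOutgoingConnection() {
        return !stopping.get();
    }

    /** Incoming connections are refused once stopping has begun. */
    boolean mayAcceptIncomingConnection() {
        return !stopping.get();
    }

    /** A rejection by another node is critical only while the node is not stopping. */
    RejectionOutcome onRejectedByRemote() {
        return stopping.get() ? RejectionOutcome.IGNORED : RejectionOutcome.CRITICAL_FAILURE;
    }
}
```

In such a design, the shutdown sequence would call initiateStopping() first, then stop the ScaleCube cluster, and only then shut down the connection manager itself, so that a rejection received after the LEAVING message is no longer escalated.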



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
