[
https://issues.apache.org/jira/browse/IGNITE-22876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roman Puchkovskiy reassigned IGNITE-22876:
------------------------------------------
Assignee: Roman Puchkovskiy
> Initiate cluster reset
> ----------------------
>
> Key: IGNITE-22876
> URL: https://issues.apache.org/jira/browse/IGNITE-22876
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Assignee: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
>
> Either a specific manager (like CmgDisasterRecoveryManager) or a method in
> ClusterManagementGroupManager needs to be added.
> The method signature should look like
> CompletableFuture<Void> resetCluster(List<String> newCmgConsistentIds)
> # It accepts consistent IDs of the members forming the new CMG
> # It builds a ResetClusterMessage containing newCmgConsistentIds, the
> Metastorage Group (MG) nodes (taken from the local CMG storage), the existing
> cluster name, a newly generated random cluster ID and the history of previous
> cluster IDs (taken from the local CMG storage)
> # It then sends this message to each node in the physical topology
> (including itself)
> # Receiving this message, a node saves its contents to the Vault, responds
> with OK and initiates a restart of itself
> # The node on which the method is called (the Conductor) waits for each node
> to either respond with OK or leave the physical topology without having
> received the message
> # If the wait finishes successfully without triggering a timeout, the
> Conductor restarts itself
> # If the timeout elapses before the wait above has finished, then:
> ## If a majority of the new CMG nodes have replied with OK, then the
> Conductor restarts itself
> ## Otherwise, the result is indeterminate, an exception is returned, and the
> user has to manually restart the nodes that did not respond.
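
The Conductor's final decision in the steps above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the Ignite implementation: the class ResetClusterDecisionSketch, the enum Outcome and the method decide are hypothetical names, "majority" is assumed to mean a strict majority (more than half) of the distinct new CMG consistent IDs, and the messaging, Vault and restart machinery are left out entirely.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the Conductor's decision; not part of Ignite. */
final class ResetClusterDecisionSketch {
    /** Illustrative outcomes: restart the Conductor, or report an indeterminate result. */
    enum Outcome { RESTART_CONDUCTOR, INDETERMINATE }

    /**
     * Decides what the Conductor does once the wait has ended.
     *
     * @param newCmgConsistentIds consistent IDs of the nodes forming the new CMG
     * @param respondedOk consistent IDs of the nodes that replied with OK
     * @param waitCompleted true if every node replied with OK or left before the timeout
     */
    static Outcome decide(List<String> newCmgConsistentIds, Set<String> respondedOk, boolean waitCompleted) {
        // The wait finished without a timeout: the Conductor restarts itself.
        if (waitCompleted) {
            return Outcome.RESTART_CONDUCTOR;
        }

        // Timeout case: count how many distinct new CMG nodes have replied with OK.
        Set<String> newCmg = new HashSet<>(newCmgConsistentIds);
        long okFromNewCmg = newCmg.stream().filter(respondedOk::contains).count();

        // Assumption: a strict majority, i.e. more than half of the new CMG nodes.
        boolean majority = okFromNewCmg * 2 > newCmg.size();

        // Without a majority the result is indeterminate and the caller would
        // complete the returned future exceptionally.
        return majority ? Outcome.RESTART_CONDUCTOR : Outcome.INDETERMINATE;
    }

    public static void main(String[] args) {
        // Timeout, but 2 of the 3 new CMG nodes replied with OK.
        System.out.println(decide(List.of("n1", "n2", "n3"), Set.of("n1", "n2"), false));
        // Timeout, and only 1 of the 3 new CMG nodes replied with OK.
        System.out.println(decide(List.of("n1", "n2", "n3"), Set.of("n1", "n9"), false));
    }
}
```

Keeping the decision free of I/O makes the timeout branches easy to unit-test; in a real resetCluster the INDETERMINATE outcome would presumably map to a CompletableFuture<Void> completed exceptionally, as the last step describes.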
--
This message was sent by Atlassian Jira
(v8.20.10#820010)