Hi All,
We are using Ignite 2.10.0 and have a question about partition backups,
baselineAutoAdjust, and disaster recovery.
We have three nodes, A, B, and C, with baseline auto-adjustment enabled and no
backups. If one of them goes down, for example node C, the baseline topology
cannot be adjusted automatically because of partition loss. If we then stop
nodes A and B and restart them, the cluster waits for all baseline nodes to be
online.
By analogy, if we have 100 nodes and one node shuts down and cannot be
restarted, the same situation occurs. Without manual intervention, the
baseline topology will never be adjusted to the remaining 99 nodes. When the
cluster restarts, it will still wait for this node to join, so the cluster
cannot be activated.
So we must exclude the failed node that cannot be restarted from the baseline
before attempting to reset the state of lost partitions, and then execute the
command "--cache reset_lost_partitions" on each cache.
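For reference, the manual steps described above might look roughly like this
with control.sh. This is only a sketch: the consistent ID "node-C" and the
cache names "cache1,cache2" are placeholders, and it assumes control.sh is run
from the Ignite bin directory against a reachable node of the cluster.

```shell
# Print the current baseline topology, including the consistent IDs of its nodes.
./control.sh --baseline

# Remove the failed (offline) node from the baseline.
# "node-C" is a placeholder for that node's consistent ID.
./control.sh --baseline remove node-C --yes

# Reset the lost-partition state for the affected caches.
# "cache1,cache2" are placeholder cache names (comma-separated list).
./control.sh --cache reset_lost_partitions cache1,cache2
```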
If we stop the nodes in the order A, B, C and wait for the baseline topology
to change after each stop, then to restart the cluster we must start the nodes
in the order A, B, C, right?