[
https://issues.apache.org/jira/browse/IGNITE-23486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890853#comment-17890853
]
Kirill Gusakov commented on IGNITE-23486:
-----------------------------------------
{code}
metastoreTopologyChangeEventReceived:
    onNodeLeft:
        initPartitionDistributionResetTimer(revision)
        initDataNodesAutoAdjustScaleDownTimer()
    onNodeJoined:
        initDataNodesAutoAdjustScaleUpTimer()
    onTopologyLeap:
        initPartitionDistributionResetTimer(revision)
        initDataNodesAutoAdjustScaleUpTimer()
        initDataNodesAutoAdjustScaleDownTimer()

onPartitionDistributionResetTimeout:
    revision = getRevisionFromEvent()
    for t in zoneTables:
        for p in tablePartitions:
            if !checkMajority(p, revision):
                resetPartition(p)

nodeStarted:
    (stable, pending) = (
        getCurrentStableAssignmentsFromMs(),
        getCurrentPendingAssignmentsFromMs())
    // pending.force flag is a result of resetPartition,
    // see disaster recovery protocol
    // https://docs.google.com/document/u/0/d/1g_TgBdxMBF3rGmY-zwD4SqRkJ9u4Ejvw1P-TPloViLg/edit
    if currentNode in union(stable, pending) && !pending.force:
        startRaftNode()
    else:
        if (pending.force):
            assert pending.size == 1 // see resetPartition requirements
            startRaftNode()
        else:
            skip

// Definitions.
// Functions that are not defined here are assumed to have trivial behaviour.

initPartitionDistributionResetTimer(revision):
    // If a timer is set concurrently, the new call must reschedule
    // the old timer with the new revision.
    saveTimerIntent(revision, currentTime())

checkMajority(partition, revision):
    assignments = getCurrentAssignmentsFromStableMetastoreKey(partition)
    aliveAssignments = assignments.intersect(getLogicalTopology(revision))
    // Integer division: a strict majority of the stable assignments must be alive.
    return aliveAssignments.size() >= 1 + assignments.size() / 2

resetPartition(partition):
    /*
    Requirements:
    - Leader hijack protection must already be implemented. An analogue of
      IGNITE-22904 must be implemented for non-metastore partitions.
    - Reset must work in 2 phases (planned=targetTopology,
      pending=singleAliveNode), where targetTopology is the set of alive
      nodes from the current stable assignments.
    - Only alive nodes must be placed in the planned key.
    */

initDataNodesAutoAdjustScaleUpTimer() // see https://cwiki.apache.org/confluence/display/IGNITE/IEP-101%3A+Distribution+Zones
initDataNodesAutoAdjustScaleDownTimer() // see https://cwiki.apache.org/confluence/display/IGNITE/IEP-101%3A+Distribution+Zones
{code}
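
As an illustration of the majority check and the reset scan, here is a minimal
Python sketch. It is not Ignite code. It assumes assignments and the logical
topology are plain sets of node names, and the names check_majority and
partitions_to_reset are hypothetical stand-ins for checkMajority and the loop
in onPartitionDistributionResetTimeout.

```python
from typing import Dict, List, Set


def check_majority(assignments: Set[str], logical_topology: Set[str]) -> bool:
    """Return True if a strict majority of the assigned nodes is alive.

    Mirrors the pseudocode: alive.size() >= 1 + assignments.size() / 2,
    where "/" is integer division.
    """
    alive = assignments & logical_topology
    return len(alive) >= 1 + len(assignments) // 2


def partitions_to_reset(
    stable_by_partition: Dict[int, Set[str]],
    logical_topology: Set[str],
) -> List[int]:
    """Hypothetical helper: ids of partitions that lost their majority."""
    return [
        partition
        for partition, assignments in stable_by_partition.items()
        if not check_majority(assignments, logical_topology)
    ]


if __name__ == "__main__":
    topology = {"A", "B"}  # nodes C and D have left
    stable = {
        0: {"A", "B", "C"},       # 2 of 3 alive: majority kept
        1: {"A", "C", "D"},       # 1 of 3 alive: majority lost
        2: {"A", "B", "C", "D"},  # 2 of 4 alive: a tie is not a majority
    }
    print(partitions_to_reset(stable, topology))
```

With an even number of assignments a tie does not count as a majority, so
partition 2 above would be reset together with partition 1. An empty
assignment set also fails the check, because 0 >= 1 is false.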
> Formalize algorithm for HA mode
> -------------------------------
>
> Key: IGNITE-23486
> URL: https://issues.apache.org/jira/browse/IGNITE-23486
> Project: Ignite
> Issue Type: Improvement
> Reporter: Kirill Gusakov
> Assignee: Kirill Gusakov
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> We need to describe the algorithm of HA mode behavior in some formal form
> for the epic IGNITE-23438.
> *Definition of done*
> - The whole algorithm is described in a formal form that is suitable for
> validating use cases.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)