[
https://issues.apache.org/jira/browse/HDDS-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296960#comment-17296960
]
Prashant Pogde commented on HDDS-4914:
--------------------------------------
The goal is to write a comprehensive framework that will
* drive SCM finalization
* inject failures in both the DataNodes and the SCM at every state change in
both the SCM and the DataNodes
* validate that the SCM and the DataNodes eventually finalize and that the
upgrade is successful.
The HDDS upgrade model can be thought of as a state machine {states,
transitions}, where
* states are specific stages in upgrade finalization, either on the SCM node
or on the individual DataNodes
* transitions are events that trigger a state change
The HDDS upgrade stages, for both the DataNodes and the SCM, are defined as
* BeforePreFinalizeUpgrade
* AfterPreFinalizeUpgrade
* BeforeCompleteFinalization
* AfterCompleteFinalization
* AfterPostFinalizeUpgrade
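To make the state-machine view concrete, here is a minimal sketch of how the stages could be modelled with one optional failure hook per (component, stage) pair. All names here (UpgradeStageSketch, InjectionPoints, runFinalization) are hypothetical and are not taken from the Ozone code base; the sketch also assumes the stages are visited in the order listed above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Ozone implementation.
final class UpgradeStageSketch {

  /** The five stages from the list above (assumed to occur in this order). */
  enum Stage {
    BEFORE_PRE_FINALIZE_UPGRADE,
    AFTER_PRE_FINALIZE_UPGRADE,
    BEFORE_COMPLETE_FINALIZATION,
    AFTER_COMPLETE_FINALIZATION,
    AFTER_POST_FINALIZE_UPGRADE
  }

  /** The two kinds of node that go through finalization. */
  enum Component { SCM, DATANODE }

  /** Holds at most one failure callback per (component, stage). */
  static final class InjectionPoints {
    private final Map<Component, Map<Stage, Runnable>> hooks =
        new EnumMap<>(Component.class);

    void register(Component c, Stage s, Runnable failure) {
      hooks.computeIfAbsent(c, k -> new EnumMap<Stage, Runnable>(Stage.class))
          .put(s, failure);
    }

    /** Called when a component reaches a stage; runs the hook, if any. */
    void reached(Component c, Stage s) {
      Map<Stage, Runnable> perStage = hooks.get(c);
      Runnable hook = (perStage == null) ? null : perStage.get(s);
      if (hook != null) {
        hook.run();
      }
    }
  }

  /** Walks one component through every stage, firing hooks on the way. */
  static List<Stage> runFinalization(Component c, InjectionPoints points) {
    List<Stage> visited = new ArrayList<>();
    for (Stage s : Stage.values()) {
      points.reached(c, s);
      visited.add(s);
    }
    return visited;
  }

  public static void main(String[] args) {
    InjectionPoints points = new InjectionPoints();
    points.register(Component.SCM, Stage.BEFORE_COMPLETE_FINALIZATION,
        () -> System.out.println("inject: fail SCM here"));
    System.out.println(runFinalization(Component.SCM, points));
  }
}
```

In a real test the hook would restart or kill a node rather than print a message; the point of the sketch is only that each state change is a place where a failure can be attached.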
This validation framework will trigger all possible combinations of failures
while the nodes are in their different possible states. The combinations
will include:
* One-node failure - Fail the SCM in the middle of the SCM upgrade while the
SCM is at a specific state.
** Try this for all possible SCM-upgrade states
* One-node failure - Fail a DataNode in the middle of the SCM upgrade while the
SCM is at a specific state.
** Try this for all possible SCM-upgrade states
* One-node failure - Fail the SCM in the middle of a DataNode upgrade while the
DataNode is at a specific state.
** Try this for all possible DataNode-upgrade states
* One-node failure - Fail a DataNode in the middle of its own upgrade while
that DataNode is at a specific state.
** Try this for all possible DataNode-upgrade states
* Two-node failure - Fail the SCM as well as a DataNode in the middle of the
SCM upgrade while the SCM is at a specific state.
** Try this for all possible SCM-upgrade states
* Two-node failure - Fail the SCM as well as a DataNode in the middle of the
DataNode upgrade while that DataNode is at a specific state.
** Try this for all possible DataNode-upgrade states
* Two-node failure - Fail the SCM at a specific upgrade state in the SCM
thread context, and fail a DataNode at a specific upgrade state in the
DataNode upgrade thread context.
** Try this for all combinations of SCM-upgrade states and DataNode-upgrade
states
* Multi-node failure - Fail all the DataNodes at a specific SCM-upgrade state
** Try this for all possible SCM-upgrade states
* Multi-node failure - Fail all the DataNodes at a specific DataNode-upgrade
state
** Try this for all possible DataNode-upgrade states
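The combinations listed above can be enumerated mechanically. The following is a rough sketch, under the assumption that each scenario is fully described by which node(s) are failed and which SCM and/or DataNode stage triggers the failure. The names UpgradeFailureMatrix, Victim and Scenario are illustrative only and do not come from the Ozone code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the failure matrix, not the Ozone implementation.
final class UpgradeFailureMatrix {

  enum Stage {
    BEFORE_PRE_FINALIZE_UPGRADE,
    AFTER_PRE_FINALIZE_UPGRADE,
    BEFORE_COMPLETE_FINALIZATION,
    AFTER_COMPLETE_FINALIZATION,
    AFTER_POST_FINALIZE_UPGRADE
  }

  /** Which node(s) the scenario fails. */
  enum Victim { SCM, ONE_DATANODE, SCM_AND_ONE_DATANODE, ALL_DATANODES }

  static final class Scenario {
    final Victim victim;
    final Stage scmStage; // null when the failure is not keyed on an SCM stage
    final Stage dnStage;  // null when it is not keyed on a DataNode stage

    Scenario(Victim victim, Stage scmStage, Stage dnStage) {
      this.victim = victim;
      this.scmStage = scmStage;
      this.dnStage = dnStage;
    }

    @Override
    public String toString() {
      return victim + " @ scm=" + scmStage + ", dn=" + dnStage;
    }
  }

  static List<Scenario> allScenarios() {
    List<Scenario> out = new ArrayList<>();
    // One-node, two-node and multi-node failures keyed on a single stage,
    // first in the SCM upgrade context, then in the DataNode upgrade context.
    for (Stage stage : Stage.values()) {
      for (Victim victim : Victim.values()) {
        out.add(new Scenario(victim, stage, null));
        out.add(new Scenario(victim, null, stage));
      }
    }
    // Two-node failures where the SCM and a DataNode each fail in their own
    // upgrade thread context: every pairing of SCM stage and DataNode stage.
    for (Stage scm : Stage.values()) {
      for (Stage dn : Stage.values()) {
        out.add(new Scenario(Victim.SCM_AND_ONE_DATANODE, scm, dn));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Scenario> all = allScenarios();
    all.forEach(System.out::println);
    System.out.println(all.size() + " scenarios");
  }
}
```

With five stages this yields 65 scenarios: eight single-stage families of five each, plus the 25 SCM-stage and DataNode-stage pairings. A test driver could loop over this list and run one finalization attempt per entry.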
> Validating HDDS upgrade in presence of failures
> -----------------------------------------------
>
> Key: HDDS-4914
> URL: https://issues.apache.org/jira/browse/HDDS-4914
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: Ozone Datanode, SCM, upgrade
> Reporter: Prashant Pogde
> Assignee: Prashant Pogde
> Priority: Major
>