[ 
https://issues.apache.org/jira/browse/HDDS-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296960#comment-17296960
 ] 

Prashant Pogde commented on HDDS-4914:
--------------------------------------

The goal is to write a comprehensive framework that will
 * Drive SCM finalization.
 * Inject failures in both DataNodes and SCM at every state change in 
both SCM and DataNodes.
 * Validate that SCM and DataNodes eventually finalize and that the upgrade 
is successful.

The HDDS upgrade model can be thought of as a state machine {states, 
transitions}, where
 * states are specific stages in upgrade finalization, either on the SCM node 
or on the individual DataNodes
 * transitions are events that trigger a state change

The HDDS upgrade stages, for both DataNodes and SCM, are defined as
 * BeforePreFinalizeUpgrade
 * AfterPreFinalizeUpgrade
 * BeforeCompleteFinalization
 * AfterCompleteFinalization
 * AfterPostFinalizeUpgrade
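
The goals and stages above can be sketched as a small simulation. This is only 
an illustration, not Ozone code: the stage names come from the list above, 
but their linear order, the SimulatedNode class, the hook mechanism and the 
restart-resumes-from-the-same-stage behaviour are all assumptions made for 
the example.

```python
from enum import IntEnum
from typing import Callable, List, Optional


class UpgradeStage(IntEnum):
    """Stage names from the list above; the numeric order is an assumption."""
    BEFORE_PRE_FINALIZE_UPGRADE = 0
    AFTER_PRE_FINALIZE_UPGRADE = 1
    BEFORE_COMPLETE_FINALIZATION = 2
    AFTER_COMPLETE_FINALIZATION = 3
    AFTER_POST_FINALIZE_UPGRADE = 4


class InjectedFailure(Exception):
    """Raised by a hook to simulate a node crash at a state change."""


class SimulatedNode:
    """Hypothetical stand-in for an SCM or a DataNode (not an Ozone class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.stage = UpgradeStage.BEFORE_PRE_FINALIZE_UPGRADE
        self.restarts = 0

    def finalize(self, hook: Optional[Callable[["SimulatedNode"], None]] = None) -> None:
        """Walk the stages in order, calling the hook at every stage reached."""
        while True:
            if hook is not None:
                hook(self)  # may raise InjectedFailure
            if self.stage == UpgradeStage.AFTER_POST_FINALIZE_UPGRADE:
                return
            self.stage = UpgradeStage(self.stage + 1)


def fail_once_at(target: UpgradeStage) -> Callable[[SimulatedNode], None]:
    """Build a hook that crashes the node the first time it reaches target."""
    fired: List[bool] = []

    def hook(node: SimulatedNode) -> None:
        if node.stage == target and not fired:
            fired.append(True)
            raise InjectedFailure(f"{node.name} failed at {target.name}")

    return hook


def finalize_with_restarts(node: SimulatedNode,
                           hook: Callable[[SimulatedNode], None],
                           max_restarts: int = 3) -> bool:
    """Retry finalization after each injected failure; True once finalized."""
    while node.restarts <= max_restarts:
        try:
            node.finalize(hook)
            return True
        except InjectedFailure:
            # Assumption: a restarted node resumes from its last reached stage.
            node.restarts += 1
    return False
```

Running finalize_with_restarts(SimulatedNode("scm"), fail_once_at(stage)) for 
every stage is the single-node analogue of "inject a failure at every state 
change, then validate that the node eventually finalizes".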

This validation framework will trigger all possible combinations of failures 
while the nodes are in different possible states. The combinations include:
 * One-node failure - Fail SCM in the middle of the SCM upgrade while the SCM 
is at a specific state.
 ** Try this for all possible SCM-upgrade states
 * One-node failure - Fail a DataNode in the middle of the SCM upgrade while 
the SCM is at a specific state.
 ** Try this for all possible SCM-upgrade states
 * One-node failure - Fail SCM in the middle of a DataNode upgrade while the 
DataNode is at a specific state.
 ** Try this for all possible DataNode-upgrade states
 * One-node failure - Fail a DataNode in the middle of its own upgrade while 
that DataNode is at a specific state.
 ** Try this for all possible DataNode-upgrade states
 * Two-node failure - Fail SCM as well as a DataNode in the middle of the SCM 
upgrade while the SCM is at a specific state.
 ** Try this for all possible SCM-upgrade states
 * Two-node failure - Fail SCM as well as a DataNode in the middle of the 
DataNode upgrade while that DataNode is at a specific state.
 ** Try this for all possible DataNode-upgrade states
 * Two-node failure - Fail SCM at a specific upgrade state in the SCM thread 
context. Fail a DataNode at a specific upgrade state in the DataNode upgrade 
thread context.
 ** Try this for all combinations of SCM-upgrade states and 
DataNode-upgrade states
 * Multi-node failure - Fail all the DataNodes at a specific SCM-upgrade state
 ** Try this for all possible SCM-upgrade states
 * Multi-node failure - Fail all the DataNodes at a specific DataNode-upgrade 
state
 ** Try this for all possible DataNode-upgrade states
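
To get a feel for the size of this test matrix, the combinations listed above 
can be enumerated mechanically. The sketch below is an illustration only: the 
Scenario tuple, the build_scenarios helper and the node labels are 
hypothetical names, and it assumes SCM and DataNodes each pass through the 
same five stages.

```python
from itertools import product
from typing import List, NamedTuple, Optional, Tuple

STAGES = [
    "BeforePreFinalizeUpgrade",
    "AfterPreFinalizeUpgrade",
    "BeforeCompleteFinalization",
    "AfterCompleteFinalization",
    "AfterPostFinalizeUpgrade",
]


class Scenario(NamedTuple):
    """One failure-injection test case (hypothetical structure)."""
    kind: str                      # "one-node", "two-node" or "multi-node"
    failed_nodes: Tuple[str, ...]  # which nodes are failed
    scm_stage: Optional[str]       # SCM stage that triggers the failure, if any
    datanode_stage: Optional[str]  # DataNode stage that triggers it, if any


def build_scenarios() -> List[Scenario]:
    scenarios: List[Scenario] = []

    # Failures triggered by a single observed stage, on either SCM or a DataNode.
    single_trigger = [
        ("one-node", ("SCM",)),
        ("one-node", ("DataNode",)),
        ("two-node", ("SCM", "DataNode")),
        ("multi-node", ("all DataNodes",)),
    ]
    for kind, failed in single_trigger:
        for stage in STAGES:
            scenarios.append(Scenario(kind, failed, stage, None))
            scenarios.append(Scenario(kind, failed, None, stage))

    # SCM and a DataNode each fail in their own thread context:
    # every pairing of an SCM stage with a DataNode stage.
    for scm_stage, dn_stage in product(STAGES, STAGES):
        scenarios.append(
            Scenario("two-node", ("SCM", "DataNode"), scm_stage, dn_stage))

    return scenarios


if __name__ == "__main__":
    all_scenarios = build_scenarios()
    print(len(all_scenarios), "scenarios")
```

Under these assumptions, eight of the nine bullet items contribute one 
scenario per stage and the pairwise item contributes one per pair of stages, 
so the list grows quickly if more stages or node types are added.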

 

> Validating HDDS upgrade in presence of failures
> -----------------------------------------------
>
>                 Key: HDDS-4914
>                 URL: https://issues.apache.org/jira/browse/HDDS-4914
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Ozone Datanode, SCM, upgrade
>            Reporter: Prashant Pogde
>            Assignee: Prashant Pogde
>            Priority: Major
>




