prashantpogde opened a new pull request #1998:
URL: https://github.com/apache/ozone/pull/1998


   ## What changes were proposed in this pull request?
   
   The goal of this PR is to write a comprehensive framework that will
   
   - drive SCM finalization
   - inject failures in both DataNodes and SCM at every state change in both SCM and DataNodes
   - validate that SCM and DataNodes eventually finalize and that the upgrade is successful
   
   The HDDS upgrade model can be thought of as a state machine {states, transitions}, where
   
   - states are specific stages in upgrade finalization, either on the SCM node or on the individual DataNodes
   - transitions are events that trigger a state change
   
   The HDDS upgrade stages, for both DataNodes and SCM, are defined as
   
   - BeforePreFinalizeUpgrade
   - AfterPreFinalizeUpgrade
   - BeforeCompleteFinalization
   - AfterCompleteFinalization
   - AfterPostFinalizeUpgrade
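
The stage list above can be pictured as an ordered enum with an injection hook per stage. This is a minimal sketch under stated assumptions, not the code in this PR: only the five stage names come from the description, while `UpgradeStageSketch`, `injectAt`, and `runFinalization` are hypothetical names chosen for illustration, and the stages are assumed to be visited in the order listed.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Illustrative only: class and method names are hypothetical, not Ozone APIs. */
final class UpgradeStageSketch {

  /** Stage names taken from the PR description; the ordering is an assumption. */
  enum Stage {
    BEFORE_PRE_FINALIZE_UPGRADE,
    AFTER_PRE_FINALIZE_UPGRADE,
    BEFORE_COMPLETE_FINALIZATION,
    AFTER_COMPLETE_FINALIZATION,
    AFTER_POST_FINALIZE_UPGRADE
  }

  // One optional failure hook per stage; a hook returns true to simulate a failure.
  private final Map<Stage, BooleanSupplier> hooks = new EnumMap<>(Stage.class);

  void injectAt(Stage stage, BooleanSupplier hook) {
    hooks.put(stage, hook);
  }

  /**
   * Walks the stages in declaration order and stops at the first stage whose
   * hook reports a failure. Returns that stage, or null if finalization
   * passed through every stage without an injected failure.
   */
  Stage runFinalization() {
    for (Stage stage : Stage.values()) {
      BooleanSupplier hook = hooks.get(stage);
      if (hook != null && hook.getAsBoolean()) {
        return stage;
      }
    }
    return null;
  }
}
```

A test would register a hook at one stage, run finalization, restart the failed node, and then check that finalization eventually completes.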
   
   This validation framework will trigger all possible combinations of failures while the nodes are in each of their possible states. The combinations include:
   
   - One-node failure: fail SCM in the middle of SCM upgrade while the SCM is at a specific state.
     - Try this for all possible SCM-upgrade states.
   - One-node failure: fail a DataNode in the middle of SCM upgrade while the SCM is at a specific state.
     - Try this for all possible SCM-upgrade states.
   - One-node failure: fail SCM in the middle of DataNode upgrade while the DataNode is at a specific state.
     - Try this for all possible DataNode-upgrade states.
   - One-node failure: fail a DataNode in the middle of DataNode upgrade while the same DataNode is at a specific state.
     - Try this for all possible DataNode-upgrade states.
   - Two-node failure: fail SCM as well as a DataNode in the middle of SCM upgrade while the SCM is at a specific state.
     - Try this for all possible SCM-upgrade states.
   - Two-node failure: fail SCM as well as a DataNode in the middle of DataNode upgrade while the same DataNode is at a specific state.
     - Try this for all possible DataNode-upgrade states.
   - Two-node failure: fail SCM at a specific upgrade state in the SCM thread context, and fail a DataNode at a specific upgrade state in the DataNode upgrade thread context.
     - Try this for all permutations of SCM-upgrade states and DataNode-upgrade states.
   - Multi-node failure: fail all the DataNodes at a specific SCM-upgrade state.
     - Try this for all possible SCM-upgrade states.
   - Multi-node failure: fail all the DataNodes at a specific DataNode-upgrade state.
     - Try this for all possible DataNode-upgrade states.
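
To give a sense of the size of this test matrix, the combinations above can be enumerated mechanically. The following is a rough sketch, not the framework's actual code: `FailureMatrixSketch`, `Target`, and `Context` are hypothetical names, and it assumes each case is tried once per stage, using the five stages listed earlier.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: all names here are hypothetical, not Ozone APIs. */
final class FailureMatrixSketch {

  enum Stage {
    BEFORE_PRE_FINALIZE_UPGRADE,
    AFTER_PRE_FINALIZE_UPGRADE,
    BEFORE_COMPLETE_FINALIZATION,
    AFTER_COMPLETE_FINALIZATION,
    AFTER_POST_FINALIZE_UPGRADE
  }

  /** Which node(s) are failed. */
  enum Target { SCM, DATANODE, SCM_AND_DATANODE, ALL_DATANODES }

  /** Whose upgrade thread observes the stage that triggers the failure. */
  enum Context { SCM_UPGRADE, DATANODE_UPGRADE }

  /** Cases where one upgrade context and one stage decide when to fail. */
  static List<String> singleContextScenarios() {
    List<String> out = new ArrayList<>();
    for (Target target : Target.values()) {
      for (Context context : Context.values()) {
        for (Stage stage : Stage.values()) {
          out.add("fail " + target + " during " + context + " at " + stage);
        }
      }
    }
    return out;
  }

  /** The two-node case: SCM and a DataNode each fail at their own stage. */
  static List<String> crossContextScenarios() {
    List<String> out = new ArrayList<>();
    for (Stage scmStage : Stage.values()) {
      for (Stage dnStage : Stage.values()) {
        out.add("fail SCM at " + scmStage + ", fail DATANODE at " + dnStage);
      }
    }
    return out;
  }
}
```

Under these assumptions the eight single-context cases yield 4 x 2 x 5 = 40 scenarios and the cross-context case yields 5 x 5 = 25, for 65 in total.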
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-4914
   
   ## How was this patch tested?
   
   By running the newly introduced integration tests.




