swamirishi opened a new pull request, #7862:
URL: https://github.com/apache/ozone/pull/7862

   ## What changes were proposed in this pull request?
   Currently, when an apply transaction fails on ContainerStateMachine, the 
[isStateMachineHealthy](https://github.com/apache/ozone/blob/bf19af946aa08d2bea5064d79513a196b9bbf646/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L568)
 flag is set to false. However, this flag is not checked when the next set of 
transactions is applied, so all the transactions in the pipeline can still be 
applied to the state machine, which could leave the state machine in an 
inconsistent state. For instance, if a write chunk fails and the next 
putBlock transaction succeeds, the container ends up in an inconsistent 
state. 
   
   This patch adds an unhealthyContainerSet to ensure that no further 
transactions are applied to a container once it is present in the 
unhealthyContainerSet. A container is added to the unhealthyContainerSet if 
any error is seen while writing StateMachine data, reading StateMachine data, 
or applying transactions to the container.
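
The gating idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the class `UnhealthyContainerGate`, the `ContainerOp` interface, and the method names are hypothetical, and the real ContainerStateMachine works with Ratis transaction contexts rather than plain callbacks.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: skip transactions for containers that already failed. */
public class UnhealthyContainerGate {

  /** Hypothetical stand-in for a write chunk, putBlock, etc. on a container. */
  public interface ContainerOp {
    void run() throws Exception;
  }

  // Assumed shape of the unhealthyContainerSet: a concurrent set of container IDs.
  private final Set<Long> unhealthyContainerSet = ConcurrentHashMap.newKeySet();

  /**
   * Applies the operation unless the container is already marked unhealthy.
   * Returns true if the operation ran successfully, false if it was
   * skipped or if it failed (in which case the container is marked unhealthy).
   */
  public boolean applyTransaction(long containerId, ContainerOp op) {
    if (unhealthyContainerSet.contains(containerId)) {
      // A previous transaction failed on this container; do not apply more.
      return false;
    }
    try {
      op.run();
      return true;
    } catch (Exception e) {
      // Any failure marks the container so later transactions are rejected.
      unhealthyContainerSet.add(containerId);
      return false;
    }
  }

  public boolean isUnhealthy(long containerId) {
    return unhealthyContainerSet.contains(containerId);
  }
}
```

With this shape, a failed write chunk on one container blocks a later putBlock on that same container, while transactions for other containers in the pipeline are unaffected.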
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-12236
   
   ## How was this patch tested?
   Added unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

