[GitHub] [ozone] Xushaohong opened a new pull request, #3657: HDDS-7099. Provide a configurable way to cleanup closed unrecoverable container

GitBox Thu, 04 Aug 2022 20:30:23 -0700


Xushaohong opened a new pull request, #3657:
URL: https://github.com/apache/ozone/pull/3657


   
   ## What changes were proposed in this pull request?
   
   **Background:**
   Asynchronous writes are still not robust enough: when the cluster load is 
too high, some containers can become unrecoverable (no healthy replicas).
   
   Currently, such an unrecoverable Ratis container goes through the 
following process:
   
   - DN will mark the container as unhealthy and report it to the SCM.
   
   - SCM then tries to close the container, and the container state becomes 
CLOSING.
   
   - DN won't close an unhealthy replica.
   
   - The SCM ReplicationManager (RM) will not send close commands to those 
unhealthy containers.
   
   Hence, the unrecoverable container will be stuck in the CLOSING state.
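
The stuck condition described above can be sketched as a small predicate. This is only an illustration under simplifying assumptions: the enums and the class `StuckContainerSketch` are hypothetical stand-ins, not the actual Ozone types, and the rule "all replicas unhealthy" is taken directly from the description above.

```java
import java.util.List;

// Hypothetical, simplified model; these are NOT the real Ozone classes.
final class StuckContainerSketch {
    enum ContainerState { OPEN, CLOSING, CLOSED }
    enum ReplicaState { OPEN, CLOSING, CLOSED, UNHEALTHY }

    // Number of replicas that would still receive a close command,
    // assuming unhealthy replicas are skipped as described above.
    static long closeCommandTargets(List<ReplicaState> replicas) {
        return replicas.stream()
                .filter(r -> r != ReplicaState.UNHEALTHY)
                .count();
    }

    // A container is stuck when it is CLOSING and every known replica is
    // UNHEALTHY: no replica can act on a close command, so the container
    // never reaches CLOSED on its own.
    static boolean isStuckInClosing(ContainerState state,
                                    List<ReplicaState> replicas) {
        return state == ContainerState.CLOSING
                && !replicas.isEmpty()
                && closeCommandTargets(replicas) == 0;
    }

    public static void main(String[] args) {
        List<ReplicaState> allBad = List.of(
                ReplicaState.UNHEALTHY, ReplicaState.UNHEALTHY,
                ReplicaState.UNHEALTHY);
        List<ReplicaState> oneGood = List.of(
                ReplicaState.UNHEALTHY, ReplicaState.CLOSING,
                ReplicaState.UNHEALTHY);
        System.out.println(isStuckInClosing(ContainerState.CLOSING, allBad));  // true
        System.out.println(isStuckInClosing(ContainerState.CLOSING, oneGood)); // false
    }
}
```

With at least one replica that is not unhealthy, a close command still has a target, so the container is not considered stuck in this model.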
   
   After the admin recovers whatever data is still available in such 
containers, or simply abandons them, the containers need to be closed 
deliberately.
   
   Under such circumstances, we should provide a configurable way to clean up 
these closed containers.
   Once the unhealthy container has been closed, an unrecoverable container 
with only unhealthy replicas can be deleted.
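
A minimal sketch of the deletion guard this implies, assuming a boolean switch controls the cleanup. The parameter `cleanupEnabled` is a hypothetical placeholder for whatever configuration key the patch introduces (the description does not name it), and the enums are simplified stand-ins rather than the real Ozone types. Treating a container with no reported replicas as not deletable is also an assumption made here for safety.

```java
import java.util.List;

// Hypothetical, simplified model; these are NOT the real Ozone classes.
final class UnrecoverableCleanupSketch {
    enum ContainerState { OPEN, CLOSING, CLOSED }
    enum ReplicaState { OPEN, CLOSING, CLOSED, UNHEALTHY }

    // cleanupEnabled: hypothetical flag read from configuration.
    // Deletion is allowed only when the feature is switched on, the
    // container is already CLOSED, and every replica is UNHEALTHY.
    static boolean canDelete(boolean cleanupEnabled,
                             ContainerState state,
                             List<ReplicaState> replicas) {
        if (!cleanupEnabled || state != ContainerState.CLOSED) {
            return false;
        }
        // Assumption: with no replica reports we cannot tell whether the
        // container is unrecoverable, so it is left alone.
        return !replicas.isEmpty()
                && replicas.stream()
                        .allMatch(r -> r == ReplicaState.UNHEALTHY);
    }
}
```

Keeping the switch as the first check means that with the feature off the behaviour is unchanged: nothing is ever deleted by this path.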
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7099
   
   
   ## How was this patch tested?
   
   Unit tests (UT), and verified in a production environment.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

