hanishakoneru opened a new pull request #3258:
URL: https://github.com/apache/ozone/pull/3258


   ## What changes were proposed in this pull request?
   
   Currently, a container is marked UNHEALTHY for one of the following 
reasons:
   
   - An operation fails on an open or closing container. The container is 
marked unhealthy so that subsequent write transactions also fail.
   - Container Scrubber is enabled and ContainerMetadataScanner detects an 
error during KeyValueContainerCheck#fastCheck():
     - The metadata path or chunks path is not accessible as a directory.
     - Container checksum verification fails.
     - The on-disk container YAML data does not match the in-memory container 
data (ContainerType, ContainerID, Container DBType, Metadata Path).
   - Container Scrubber is enabled and ContainerDataScanner (which runs only on 
closed and quasi-closed containers) detects a block with a missing or corrupted 
chunk file.
   
   If a container in “open” state in SCM is marked unhealthy (in the container 
report), SCM asks the DNs to close the container. But for a “closing” container 
with an “unhealthy” replica, SCM leaves the container replica as is.
   
   If ReplicationManager does not find a healthy replica for a container, it 
does not replicate that container. So if there is only 1 replica of a container 
and it is unhealthy, SCM will never replicate it and there is potential for 
data loss if that single replica is lost for any reason (for example: disk 
failure).
   
   If there is a quasi-closed replica and an unhealthy replica, SCM will 
delete the unhealthy replica. In this scenario, SCM should not delete the 
unhealthy replica if it can be recovered, as it is possible that the unhealthy 
replica is ahead of the quasi-closed replica.
   
   SCM should be more conservative with deleting unhealthy containers as they 
could possibly be recovered. This Jira proposes to let SCM replicate an 
unhealthy container if there is no other replica. Also, if there is only a 
quasi-closed replica and an unhealthy replica, SCM should not delete the 
unhealthy replica.
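   
   The two proposed rules can be sketched roughly as follows. This is an 
illustration only, not the code in this patch: the class, enum, and method 
names are hypothetical and do not correspond to Ozone's ReplicationManager 
API. Treating a CLOSED replica as the only safe basis for deleting an 
unhealthy one is also an assumption made for the sketch.

```java
import java.util.List;

// Hypothetical sketch of the proposed policy; none of these names are Ozone APIs.
class UnhealthyReplicaPolicy {

    // Simplified replica states taken from the description above.
    enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

    // Rule 1: replicate an unhealthy replica only when no other kind of
    // replica exists, i.e. every known replica is UNHEALTHY.
    static boolean mayReplicateUnhealthy(List<ReplicaState> replicas) {
        return !replicas.isEmpty()
            && replicas.stream().allMatch(s -> s == ReplicaState.UNHEALTHY);
    }

    // Rule 2: do not delete an unhealthy replica when the only alternative
    // is quasi-closed, because the unhealthy replica may be ahead of it.
    // Assumption: deletion is allowed only if a CLOSED replica exists.
    static boolean mayDeleteUnhealthy(List<ReplicaState> replicas) {
        return replicas.contains(ReplicaState.UNHEALTHY)
            && replicas.contains(ReplicaState.CLOSED);
    }

    public static void main(String[] args) {
        List<ReplicaState> onlyUnhealthy = List.of(ReplicaState.UNHEALTHY);
        List<ReplicaState> quasiAndUnhealthy =
            List.of(ReplicaState.QUASI_CLOSED, ReplicaState.UNHEALTHY);

        System.out.println("replicate lone unhealthy replica: "
            + mayReplicateUnhealthy(onlyUnhealthy));
        System.out.println("delete unhealthy next to quasi-closed: "
            + mayDeleteUnhealthy(quasiAndUnhealthy));
    }
}
```

   Keeping the two checks as separate predicates mirrors the two independent 
changes proposed here: one widens what may be replicated, the other narrows 
what may be deleted.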
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6447
   
   ## How was this patch tested?
   
   (Please explain how this patch was tested. Ex: unit tests, manual tests)
   (If this patch involves UI changes, please attach a screen-shot; otherwise, 
remove this)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


