Aravindan Vijayan created HDDS-4539:
---------------------------------------

             Summary: Container Health Task should not run until Recon has 
reached steady state.
                 Key: HDDS-4539
                 URL: https://issues.apache.org/jira/browse/HDDS-4539
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: Ozone Recon
            Reporter: Aravindan Vijayan


On a cluster with millions of containers or hundreds of Datanodes, it takes 
some time for Recon to reach a steady state (all active DNs and Containers 
reported). If the container health task runs before this, it can incorrectly 
flag most of the containers as missing. This also leads to another problem, 
mentioned in HDDS-4402. We need to make sure the container health task does 
not run before the cluster has reached a steady state. This could be done with 
a fixed wait time (~10 mins) or by checking Recon's SCM state.
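
A minimal sketch of the two options above, for discussion only. The class 
SteadyStateGate, its constructor arguments, and the steady-state check are all 
hypothetical and are not existing Recon APIs. It also assumes that the two 
options are combined, so the task may run once either the check reports steady 
state or the fixed wait has elapsed; the real fix may pick only one of them.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/**
 * Hypothetical gate deciding whether the container health task may run.
 * Not part of Recon; names and wiring are assumptions for illustration.
 */
final class SteadyStateGate {
  private final LongSupplier clockMillis;        // injectable clock, eases testing
  private final BooleanSupplier steadyStateCheck; // e.g. "all DNs and containers reported"
  private final long startMillis;
  private final long maxWaitMillis;

  SteadyStateGate(Duration maxWait,
                  LongSupplier clockMillis,
                  BooleanSupplier steadyStateCheck) {
    this.clockMillis = clockMillis;
    this.steadyStateCheck = steadyStateCheck;
    this.startMillis = clockMillis.getAsLong();
    this.maxWaitMillis = maxWait.toMillis();
  }

  /** True once steady state is reported or the fixed wait has elapsed. */
  boolean canRun() {
    long elapsed = clockMillis.getAsLong() - startMillis;
    return steadyStateCheck.getAsBoolean() || elapsed >= maxWaitMillis;
  }
}
```

The health task would call canRun() at the start of each iteration and skip 
the run while it returns false. Passing () -> false as the check reduces this 
to the plain fixed-wait variant.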



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

