xBis7 commented on PR #5651: URL: https://github.com/apache/ozone/pull/5651#issuecomment-1845226355
@devmadhuu Thanks for the comments. > isMissing() still considers total replicas irrespective of healthy/unhealthy. Shouldn't we consider only HealthyReplicas here ? I changed that because I think it's a discrepancy. Missing container for the SCM means an empty replica list, regardless of healthy/unhealthy. If a container has only 1 replica and that replica is unhealthy, it won't be considered missing for SCM but Recon would be reporting it as such. I got that from these places in the code * that's where `ReplicationManager` marks a container missing [here](https://github.com/apache/ozone/blob/master/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/ClosingContainerHandler.java#L74-L78), the container replica list has ALL known replicas. * that's where `LegacyReplicationManager` marks a container missing [here](https://github.com/apache/ozone/blob/master/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/LegacyReplicationManager.java#L535-L539) and [here](https://github.com/apache/ozone/blob/master/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/LegacyReplicationManager.java#L1834-L1837). Please correct me if I misunderstood and the list checked contain only healthy containers. > And we had this JIRA [HDDS-9633](https://issues.apache.org/jira/browse/HDDS-9633) to sync Recon container stats with RM container report. I wasn't aware of that. Thanks. > Is this PR covering all ? This is covering `over-replication`, `under-replication` and `missing`. I think we can get rid of `replicaDelta` in Recon entirely and reuse SCM code for all calculations. I didn't want to move ahead with it because the scope will grow out of hand. We will have to modify all the Recon tests for containers. I also haven't change anything about `mis-replication`. I have a PR open for an issue related to that. > Can you add test matching admin CLI RM container report numbers matching for various states with Recon showing container stats or matching. Yes, let me look into an integration test or a robot test. I don't know which is easier at the moment. We should be able to use `curl` to get the data from Recon UI. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
