errose28 commented on PR #9472: URL: https://github.com/apache/ozone/pull/9472#issuecomment-3712024658
> There is something about storing the health state in ContainerInfo that doesn't feel correct to me. The state is captured at a point in time and then it's stale soon afterwards. It doesn't get updated until the next run of RM. The thing about the report object was that it captured the stats of a complete run of RM, but in the new way, the container infos get changed as RM runs. I guess it should be pretty fast, but it could lead to kind of unstable numbers.

This is true of almost all in-memory state that SCM has. Many of the metrics we track to follow the cluster state are derived from container report information. The same goes for Recon, which also has an API to list all unhealthy containers.

I think the container report still has merit as a point-in-time snapshot of the replication manager and we should leave it as is, but I don't think that should exclude us from adding functionality like querying SCM's current in-memory view of the cluster. If there's a different place to maintain this information that still supports querying containers by health state, we can use that instead, but to me the in-memory `ContainerInfo` object looks like the best place to store it.
