devmadhuu commented on PR #9472: URL: https://github.com/apache/ozone/pull/9472#issuecomment-3695400905
> There is something about storing the health state in ContainerInfo that doesn't feel correct to me. The state is captured at a point in time and is stale soon afterwards. It doesn't get updated until the next run of RM. The thing about the report object was that it captured the stats of a complete run of RM, but in the new way, the container infos get changed as RM runs. I guess it should be pretty fast, but it could lead to somewhat unstable numbers.
>
> I cannot really come up with a concrete reason why I think using ContainerInfo for this is wrong, aside from it only being updated by RM on its periodic runs. Perhaps I am overthinking it and it's fine.
>
> I think there are also some places that use RM in a read-only mode to check container states (decommission maybe), so that may update the ContainerInfo states between RM runs. I am not sure if that is a problem or not. Probably not, as it can only make the state more current.
>
> Aside from the above, scanning the PR quickly, the thing I am not sure about is multiplying up the states, like under_replicated, unhealthy_under_replicate, qc_under_replicated ... It leads to a lot more states that may be more confusing than helpful.
>
> In the RM report, we tried to only have a container in a single state, but it can be unhealthy and under / over replicated. It can be missing and under-replicated, I think. Missing is kind of an extreme version of under-replicated. The only way to capture these "double states" with a single field is to multiply up the states, I guess.

Yes, with this new approach the ContainerInfo object holds its own state (single or combined), and that state can change between two RM runs. In practice that may be a benefit, since it reflects the current state even in read-only mode.
Could you please summarize your points so that I can better understand how the current behavior might be a problem? As I understand it, this should not be an issue, but I would like to understand the reasoning behind your concern in more depth.
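To make the "double states" point concrete, here is a minimal Java sketch of what a single state field implies. It is not the code from this PR: the class `HealthStateSketch`, the enums `Condition` and `SingleState`, and the method `toSingleState` are hypothetical names chosen only for illustration. It assumes a container can carry several independent conditions at once and shows that a single field needs one constant per combination that should be representable.

```java
import java.util.EnumSet;
import java.util.Set;

public class HealthStateSketch {

  /** Hypothetical independent conditions (illustrative names only). */
  enum Condition { UNHEALTHY, UNDER_REPLICATED, OVER_REPLICATED, MISSING }

  /** A single state field: every supported combination is its own constant. */
  enum SingleState {
    HEALTHY, UNHEALTHY, UNDER_REPLICATED, UNHEALTHY_UNDER_REPLICATED
  }

  /**
   * Maps a set of conditions onto the single-field enum.
   * Any combination without a dedicated constant is rejected.
   */
  static SingleState toSingleState(Set<Condition> conditions) {
    if (conditions.isEmpty()) {
      return SingleState.HEALTHY;
    }
    if (conditions.equals(EnumSet.of(Condition.UNHEALTHY))) {
      return SingleState.UNHEALTHY;
    }
    if (conditions.equals(EnumSet.of(Condition.UNDER_REPLICATED))) {
      return SingleState.UNDER_REPLICATED;
    }
    if (conditions.equals(
        EnumSet.of(Condition.UNHEALTHY, Condition.UNDER_REPLICATED))) {
      return SingleState.UNHEALTHY_UNDER_REPLICATED;
    }
    throw new IllegalArgumentException(
        "No single state defined for " + conditions);
  }

  public static void main(String[] args) {
    // A container that is both unhealthy and under-replicated.
    Set<Condition> both =
        EnumSet.of(Condition.UNHEALTHY, Condition.UNDER_REPLICATED);
    System.out.println(toSingleState(both));

    // A combination with no dedicated constant cannot be expressed.
    Set<Condition> unsupported =
        EnumSet.of(Condition.MISSING, Condition.UNDER_REPLICATED);
    try {
      toSingleState(unsupported);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch the set of conditions composes freely, while the single field grows by one constant for each combination that must be reported, which is the trade-off between simpler consumers and a larger list of states.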
