errose28 commented on PR #5523: URL: https://github.com/apache/ozone/pull/5523#issuecomment-1787640980
Thanks for iterating on this @sodonnel, I do feel better about this approach vs. the original one in #5504. However, I came up with another idea to solve this problem. Let me know what you think:

- When SCM first creates the container, it knows the datanode replicas that are supposed to have the container. It should track this information until it gets reports that the container is created, even after the pipeline is closed.
- When the pipeline is either closed gracefully by SCM or fails on the datanode, SCM should send close commands for all affected containers, including these empty ones.
- When a datanode gets a close container command for a container it does not have, it can ack back to the SCM that the container is closed with BCSID=0, block count=0, empty, etc. If the container has data then the normal container flow still applies.
- If the container was never created, SCM will now see it as empty and can then move this container through the regular close and delete flow. A datanode getting a delete command for a container it does not have should be ok.

With this approach, we can re-use the normal delete flow and safely clean the containers out of the system, because it requires one round of back and forth between SCM and datanodes. There are some things we may need to look into though:

- What could happen if decommissioning a node while it still "has" one of these empty containers from SCM's point of view?
- What is the retry mechanism for datanodes letting SCM know about the container if the initial close ack is dropped? The replica doesn't exist so it won't show up in container reports. I think SCM would keep sending the close command and the datanode could keep sending the ack that the "replica" is closed and empty.
- May need some SCM change to persist the set of nodes that is supposed to have a container, even if it is never created or reported. This would need to survive restarts, so it may involve leaving the container's pipeline in SCM DB until all the containers are closed. This may happen anyways but need to check.

Let me know if you think there are issues with this approach that make it not viable.
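
To make the proposed exchange concrete, here is a rough, self-contained sketch of the close/ack round trip. Every name in it (`EmptyContainerCloseSketch`, `CloseAck`, `Datanode`, `Scm`, `handleClose`, `onCloseAck`, `canDelete`) is hypothetical and made up for illustration; none of this is existing Ozone code, and it leaves out persistence, decommissioning, and the real command and report protocol.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the idea above; these types do not exist in Ozone. */
final class EmptyContainerCloseSketch {

  /** What a datanode sends back in response to a close command. */
  static final class CloseAck {
    final long containerId;
    final String datanode;
    final long bcsId;
    final long blockCount;
    final boolean empty;

    CloseAck(long containerId, String datanode, long bcsId, long blockCount) {
      this.containerId = containerId;
      this.datanode = datanode;
      this.bcsId = bcsId;
      this.blockCount = blockCount;
      this.empty = blockCount == 0;
    }
  }

  /** Datanode side: answers close commands, even for containers it never created. */
  static final class Datanode {
    private final String name;
    // containerId -> {bcsId, blockCount} for replicas that actually exist locally.
    private final Map<Long, long[]> replicas = new HashMap<>();

    Datanode(String name) {
      this.name = name;
    }

    void addReplica(long containerId, long bcsId, long blockCount) {
      replicas.put(containerId, new long[] {bcsId, blockCount});
    }

    CloseAck handleClose(long containerId) {
      long[] replica = replicas.get(containerId);
      if (replica == null) {
        // No local replica: report it as closed and empty (BCSID=0, block count=0).
        return new CloseAck(containerId, name, 0L, 0L);
      }
      // Replica exists: the normal flow applies, so report its real state.
      return new CloseAck(containerId, name, replica[0], replica[1]);
    }
  }

  /** SCM side: remembers which nodes should hold each container and collects acks. */
  static final class Scm {
    private final Map<Long, Set<String>> expectedNodes = new HashMap<>();
    private final Map<Long, Map<String, CloseAck>> acks = new HashMap<>();

    void allocate(long containerId, Set<String> datanodes) {
      expectedNodes.put(containerId, new HashSet<>(datanodes));
    }

    // Idempotent: a repeated ack from the same node just overwrites the old one,
    // so SCM can keep resending the close command until every node has answered.
    void onCloseAck(CloseAck ack) {
      acks.computeIfAbsent(ack.containerId, id -> new HashMap<>()).put(ack.datanode, ack);
    }

    /** True once every expected node has acked and every ack says "empty". */
    boolean canDelete(long containerId) {
      Set<String> nodes = expectedNodes.get(containerId);
      if (nodes == null) {
        return false;
      }
      Map<String, CloseAck> received =
          acks.getOrDefault(containerId, Collections.<String, CloseAck>emptyMap());
      for (String node : nodes) {
        CloseAck ack = received.get(node);
        if (ack == null || !ack.empty) {
          return false;
        }
      }
      return true;
    }
  }
}
```

In this sketch a container only becomes eligible for deletion after every node that SCM expected to hold it has acked as empty, and a dropped ack just means the close command is sent again. The `expectedNodes` map here is in memory only; as noted above, the real set would have to survive SCM restarts.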
