slfan1989 commented on PR #7249: URL: https://github.com/apache/ozone/pull/7249#issuecomment-2412641383
> @slfan1989 Thanks for the improvement.
>
> > We then send the deletion command to these 6 DNs, and the command executes normally, successfully deleting 6 blocks. However, if DN7 to DN9 are not selected, our deletion process will get stuck.
>
> In your case, was any DN slow or unhealthy, causing it not to be selected for deletion? With the current behaviour, SCM keeps track of the remaining DN7 to DN9 and waits until their blocks are successfully deleted. After your change, all of DN1 to DN9 must be in a healthy and normal state so that SCM can send the deletion request at one time and process it. But in certain cases (which may have happened in your environment as well) some DNs are unhealthy or slow. In that case SCM will keep waiting, and space reclamation from all DNs will halt for some time because of one unhealthy DN. Do we need to consider this?

@ashishkumar50 Thank you for your question! In my previous response, I provided some detailed thoughts on this matter. Overall, I believe the slow deletion might be related to some unhealthy DNs, but their unhealthiness is only one of the factors involved. In this PR, we skip containers that do not meet the conditions, meaning that if a certain DN is unhealthy, no deletions will be executed for any replica of the containers on that DN.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
