deepujain opened a new pull request, #8308: URL: https://github.com/apache/hadoop/pull/8308
### Summary

On a standby NameNode, a DataNode can remain in `DECOMMISSION_INPROGRESS` indefinitely when a timing race causes a new replica (created during re-replication) to be marked as **excess** instead of **live**. The standby's decommission monitor then sees too few live replicas and never considers the block sufficiently replicated, so decommission never completes.

This branch merges [apache/hadoop#8295](https://github.com/apache/hadoop/pull/8295) with current trunk so the fix is up to date.

### Change

- **DatanodeAdminManager.isSufficient()**: For blocks that are not under construction, count **excess** replicas together with live replicas when deciding whether the block is sufficiently replicated for decommission. Retain the existing guard (`hasMinStorage(block, numLive)`) so decommission does not proceed when there are zero live replicas.
- **TestDatanodeAdminManagerIsSufficient**: New unit tests for the sufficiency logic (excess replicas count toward sufficiency, zero live replicas blocks decommission, etc.).

### JIRA

Fixes HDFS-17722
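The sufficiency rule described under "Change" can be illustrated with a minimal, standalone sketch. This is not the code from the patch: the class `SufficiencySketch`, the constant `MIN_LIVE_REPLICAS`, and the integer-only signatures are hypothetical simplifications that model only blocks not under construction, and the real `DatanodeAdminManager.isSufficient()` works on Hadoop's block and replica-count types.

```java
// Simplified model of the decommission sufficiency check described above.
// All names here are hypothetical; this is not the Hadoop implementation.
final class SufficiencySketch {

    // Assumed minimum number of live replicas required by the guard.
    static final int MIN_LIVE_REPLICAS = 1;

    // Stand-in for the hasMinStorage(block, numLive) guard: at least the
    // minimum number of replicas must be live.
    static boolean hasMinStorage(int numLive) {
        return numLive >= MIN_LIVE_REPLICAS;
    }

    // For a block that is not under construction: excess replicas count
    // toward sufficiency, but only if the live-replica guard passes.
    static boolean isSufficient(int numExpected, int numLive, int numExcess) {
        if (!hasMinStorage(numLive)) {
            return false; // never decommission with zero live replicas
        }
        return numLive + numExcess >= numExpected;
    }

    public static void main(String[] args) {
        // A new replica was misclassified as excess: 2 live + 1 excess, 3 expected.
        System.out.println(isSufficient(3, 2, 1)); // true
        // Only excess replicas and no live ones: the guard blocks decommission.
        System.out.println(isSufficient(3, 0, 3)); // false
    }
}
```

In this model, the race described in the summary corresponds to the first call: counting live replicas alone would give 2 of 3 and leave the node stuck, while counting excess replicas as well lets the block qualify.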
