ivandika3 opened a new pull request, #9926: URL: https://github.com/apache/ozone/pull/9926
## What changes were proposed in this pull request?

`DeadNodeHandler` and `HealthyReadOnlyNodeHandler` run on separate `SingleThreadExecutor`s, which can lead to a race condition where a resurrected datanode is removed from the `NetworkTopology` after being re-added. This leaves the node reachable but invisible to the placement policy.

Fix:
- `DeadNodeHandler` now checks the current node state before removing the node from the topology, and skips the removal if the node is no longer DEAD.
- `HealthyReadOnlyNodeHandler` uses an unconditional (idempotent) add instead of a contains-then-add check, closing the TOCTOU gap.

Made-with: Cursor

Alternative approaches considered:
- Use a shared `SingleThreadExecutor` for both `DeadNodeHandler` and `HealthyReadOnlyNodeHandler`: this requires a large change in the SCM event framework.
- Use a lock to guard the access: the concurrency guarantees would need to be checked to prevent deadlocks.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14834

## How was this patch tested?

Unit tests.
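
For readers who want to see the interleaving concretely, the race and the two guards described in this pull request can be sketched roughly as follows. This is a minimal illustration and not the actual patch: the class `TopologyRaceSketch`, its `NodeState` enum, the string node IDs, and the use of a concurrent `Set` in place of `NetworkTopology` are all hypothetical stand-ins, and the handlers are called sequentially to replay the problematic ordering rather than on real executors.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the SCM node state and topology; not Ozone's API.
final class TopologyRaceSketch {
    enum NodeState { HEALTHY_READONLY, DEAD }

    // Current state per node, as the node state manager would track it.
    private final Map<String, NodeState> states = new ConcurrentHashMap<>();
    // Stand-in for NetworkTopology: the set of nodes visible to placement.
    private final Set<String> topology = ConcurrentHashMap.newKeySet();

    void setState(String node, NodeState state) {
        states.put(node, state);
    }

    boolean inTopology(String node) {
        return topology.contains(node);
    }

    int topologySize() {
        return topology.size();
    }

    // Dead-node handler: re-check the current state before removing, so a
    // stale DEAD event does not evict a node that has since come back.
    void onDeadNode(String node) {
        if (states.get(node) != NodeState.DEAD) {
            return; // node is no longer DEAD, skip the removal
        }
        topology.remove(node);
    }

    // Healthy-read-only handler: unconditional add. Set.add is idempotent,
    // so no contains-then-add check (and no gap between the two) is needed.
    void onHealthyReadOnlyNode(String node) {
        topology.add(node);
    }

    public static void main(String[] args) {
        TopologyRaceSketch scm = new TopologyRaceSketch();

        // The node goes DEAD; assume its DEAD event is queued but not yet handled.
        scm.setState("dn1", NodeState.DEAD);

        // The node comes back before the DEAD event is processed.
        scm.setState("dn1", NodeState.HEALTHY_READONLY);
        scm.onHealthyReadOnlyNode("dn1");

        // The stale DEAD event is finally processed on the other executor.
        scm.onDeadNode("dn1");

        System.out.println("dn1 in topology: " + scm.inTopology("dn1"));
    }
}
```

Without the state check in `onDeadNode`, the last call would remove `dn1` even though it has already been re-added. Note that the check and the removal in this sketch are still two separate steps, so it illustrates the ordering the PR targets rather than proving that every interleaving is safe.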
