ivandika3 opened a new pull request, #9926:
URL: https://github.com/apache/ozone/pull/9926

   ## What changes were proposed in this pull request?
   
   DeadNodeHandler and HealthyReadOnlyNodeHandler run on separate 
SingleThreadExecutors, which can lead to a race condition where a resurrected 
datanode is removed from the NetworkTopology after being re-added. This leaves 
the node reachable but invisible to the placement policy.
   
   Fix: DeadNodeHandler now checks the current node state before removing the 
node from the topology and skips the removal if the node is no longer DEAD. 
HealthyReadOnlyNodeHandler now uses an unconditional (idempotent) add instead 
of a contains-then-add check, closing the TOCTOU gap.
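
To make the ordering concrete, here is a minimal sketch of the two guards described above. It is not the actual Ozone code: `TopologyRaceSketch`, `nodeStates`, `topology`, `onDeadNode` and `onHealthyReadOnlyNode` are hypothetical stand-ins for the SCM node state tracking, the NetworkTopology and the two handlers. It also replays the problematic interleaving sequentially instead of using real executors.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the fix; names do not match the real SCM classes.
class TopologyRaceSketch {
    enum NodeState { HEALTHY_READONLY, DEAD }

    // Stand-in for the node state tracked by SCM.
    final Map<String, NodeState> nodeStates = new ConcurrentHashMap<>();
    // Stand-in for NetworkTopology membership.
    final Set<String> topology = ConcurrentHashMap.newKeySet();

    // Dead-node handling: re-check the current state before removing.
    void onDeadNode(String datanode) {
        if (nodeStates.get(datanode) != NodeState.DEAD) {
            return; // node was resurrected in the meantime, keep it
        }
        topology.remove(datanode);
    }

    // Healthy-read-only handling: unconditional add, a no-op if present.
    void onHealthyReadOnlyNode(String datanode) {
        topology.add(datanode);
    }

    public static void main(String[] args) {
        TopologyRaceSketch scm = new TopologyRaceSketch();
        scm.topology.add("dn1");

        // dn1 is marked DEAD and a dead-node event is queued, not yet handled.
        scm.nodeStates.put("dn1", NodeState.DEAD);

        // dn1 comes back before the queued dead-node event is processed.
        scm.nodeStates.put("dn1", NodeState.HEALTHY_READONLY);
        scm.onHealthyReadOnlyNode("dn1");

        // The stale dead-node event runs last; the state check skips removal.
        scm.onDeadNode("dn1");

        System.out.println("dn1 in topology: " + scm.topology.contains("dn1"));
    }
}
```

Without the state check in `onDeadNode`, the last call in this ordering would remove `dn1` even though it is healthy again, which is the symptom described above.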
   
   Made-with: Cursor
   
   Alternative approaches considered
   - Use a shared `SingleThreadExecutor` for both `DeadNodeHandler` and 
`HealthyReadOnlyNodeHandler`: this requires a large change in the SCM event 
framework
   - Use a lock to guard the access: the concurrency guarantees need to be 
checked to prevent deadlocks
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-14834
   
   ## How was this patch tested?
   
   Unit tests (UT)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

