Ivan Andika created HDDS-14950:
----------------------------------

             Summary: Revisit SCM NodeManager concurrency
                 Key: HDDS-14950
                 URL: https://issues.apache.org/jira/browse/HDDS-14950
             Project: Apache Ozone
          Issue Type: Improvement
            Reporter: Ivan Andika
            Assignee: Ivan Andika
         Attachments: image-2026-04-01-10-55-37-370.png

We recently found a Datanode lifecycle race condition (HDDS-14834) that leaves 
a node's state inconsistent between SCM NodeStateManager and NetworkTopology. 
HDDS-14834 narrows the window enough to make the race highly improbable, but 
it can still theoretically happen. 

!image-2026-04-01-10-55-37-370.png|width=738,height=219!

The practical risk is near zero, but not zero:
  • The window is a handful of JVM bytecode instructions (nanoseconds of CPU 
time).
  • For the race to manifest, the HealthyReadOnlyNodeHandler must fully 
complete (including closePipeline() I/O for every pipeline on that node) within 
that window.
  • A thread can be paused at any point for an arbitrary duration (GC pause, 
OS scheduling), so theoretically the DeadNodeHandler thread could be paused 
after the state check long enough for that entire handler sequence to play out.
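
To make the window concrete, here is a minimal single-threaded replay of the interleaving described above. It is a simplified model and not Ozone code: the class, the State enum, and the method name are hypothetical, and the topology is reduced to a plain set. It also assumes the healthy path re-adds the node to the topology.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the check-then-act window; not the SCM classes. */
class CheckThenActReplay {
  enum State { HEALTHY_READONLY, DEAD }

  private State state = State.DEAD;
  private final Set<String> topology = new HashSet<>();

  /** Returns true if the node ends up non-DEAD yet missing from the topology. */
  boolean replayBadInterleaving(String node) {
    // Starting point: the node is DEAD but still present in the topology.
    topology.add(node);

    // Step 1 (dead-node path): the state check passes.
    boolean sawDead = (state == State.DEAD);

    // Step 2 (healthy path): runs to completion while the dead-node path
    // is paused between its check and its remove.
    state = State.HEALTHY_READONLY;
    topology.add(node);

    // Step 3 (dead-node path): resumes and acts on the now-stale check.
    if (sawDead) {
      topology.remove(node);
    }

    return state != State.DEAD && !topology.contains(node);
  }
}
```

In this model the replay always ends with the two views disagreeing. In a real deployment, step 2 would have to fit entirely inside the pause between steps 1 and 3.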

We need to revisit the SCM datanode management concurrency model to prevent 
similar race conditions. We also need to revisit the whole SCM event framework.

Possible approaches:
 # A lock that makes the state-check + topology-remove atomic.
 # A double-check: re-read the state after nt.remove() and call nt.add() back 
if the node is no longer DEAD.
 # A version/epoch on the topology mutation so stale removes are rejected.
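
As a rough illustration of approaches 1 and 3, the sketch below guards a toy state map and topology set with one lock and a per-node epoch. Everything in it is hypothetical (NodeTopologyGuard, markHealthy, markDead, removeIfDead, removeIfEpochMatches); it is not the NodeStateManager or NetworkTopology API, only one way the idea might be shaped.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the state manager + topology pair; not Ozone code. */
class NodeTopologyGuard {
  enum State { HEALTHY, DEAD }

  private final Object lock = new Object();
  private final Map<String, State> states = new HashMap<>();
  private final Map<String, Long> epochs = new HashMap<>();
  private final Set<String> topology = new HashSet<>();

  /** Registers or revives a node; state change and topology add share one lock. */
  void markHealthy(String node) {
    synchronized (lock) {
      states.put(node, State.HEALTHY);
      epochs.merge(node, 1L, Long::sum); // every transition bumps the epoch
      topology.add(node);
    }
  }

  /** Marks a node DEAD and returns the epoch a later remove must present. */
  long markDead(String node) {
    synchronized (lock) {
      states.put(node, State.DEAD);
      return epochs.merge(node, 1L, Long::sum);
    }
  }

  /** Approach 1: the state check and the topology remove are one atomic step. */
  boolean removeIfDead(String node) {
    synchronized (lock) {
      if (states.get(node) != State.DEAD) {
        return false;
      }
      return topology.remove(node);
    }
  }

  /** Approach 3: reject the remove if the node changed after the caller observed it. */
  boolean removeIfEpochMatches(String node, long observedEpoch) {
    synchronized (lock) {
      Long current = epochs.get(node);
      if (current == null || current != observedEpoch) {
        return false; // stale remove
      }
      return topology.remove(node);
    }
  }

  boolean inTopology(String node) {
    synchronized (lock) {
      return topology.contains(node);
    }
  }
}
```

The epoch variant lets a handler do slow work (such as closing pipelines) outside the lock and still have its final remove rejected if the node was revived in the meantime. The lock-only variant is simpler but requires the check and the remove to sit in the same critical section.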



