Ivan Andika created HDDS-14950:
----------------------------------
Summary: Revisit SCM NodeManager concurrency
Key: HDDS-14950
URL: https://issues.apache.org/jira/browse/HDDS-14950
Project: Apache Ozone
Issue Type: Improvement
Reporter: Ivan Andika
Assignee: Ivan Andika
Attachments: image-2026-04-01-10-55-37-370.png
We recently found a Datanode lifecycle race condition (HDDS-14834) that
causes node inconsistency between SCM NodeStateManager and NetworkTopology.
HDDS-14834 narrows the window enough to make the race highly improbable, but
it can still theoretically happen.
!image-2026-04-01-10-55-37-370.png|width=738,height=219!
Practical risk is near-zero but not zero:
• The window is a handful of JVM bytecode instructions (nanoseconds of CPU
time).
• For the race to manifest, the HealthyReadOnlyNodeHandler must fully
complete (including closePipeline() I/O for every pipeline on that node) within
that window.
• The JVM can preempt a thread at any point for an arbitrary duration (GC
pause, OS scheduling), so theoretically the DeadNodeHandler thread could be
paused after the state check long enough for the entire
pipeline to play out.
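To make the window concrete, here is a toy replay of the interleaving described above, run on a single thread so the outcome is deterministic. The class, the NodeState enum, and the state/topology fields are hypothetical stand-ins for NodeStateManager and NetworkTopology, and the exact handler steps are an assumption based on this description, not the real Ozone code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: these names are hypothetical and do not exist in Ozone.
class CheckThenActRace {
  enum NodeState { HEALTHY, HEALTHY_READONLY, DEAD }

  static final AtomicReference<NodeState> state =
      new AtomicReference<>(NodeState.DEAD);
  static final Set<String> topology = ConcurrentHashMap.newKeySet();

  /**
   * Replays the bad interleaving step by step.
   * Returns true if the node ends up non-DEAD but missing from the topology.
   */
  static boolean replay(String dn) {
    topology.add(dn);
    state.set(NodeState.DEAD);

    // Dead-node handler: the state check passes.
    boolean sawDead = state.get() == NodeState.DEAD;

    // The handler thread is paused here (GC pause, OS scheduling).
    // Meanwhile the node comes back and the read-only handler completes.
    state.set(NodeState.HEALTHY_READONLY);
    topology.add(dn);

    // Dead-node handler resumes and acts on its stale check.
    if (sawDead) {
      topology.remove(dn);
    }

    return state.get() != NodeState.DEAD && !topology.contains(dn);
  }
}
```

Calling CheckThenActRace.replay("dn-1") returns true: each individual operation is thread-safe, but the check and the remove are not atomic as a pair, so the stale remove still wins.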
We need to revisit the SCM datanode management concurrency model to prevent
similar race conditions. We also need to revisit the whole SCM event framework.
Possible approaches
# A lock that makes the state-check + topology-remove atomic.
# A double-check: re-read the state after nt.remove() and call nt.add() back
if the node is no longer DEAD.
# A version/epoch on the topology mutation so stale removes are rejected.
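A minimal sketch of the first and third options, assuming a single monitor guards both the node state and the topology. GuardedTopology, transition, removeIfDead, and removeIfCurrent are hypothetical names for illustration only and are not existing Ozone APIs; the real fix would need to fit the NodeStateManager and NetworkTopology locking that already exists.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Ozone.
class GuardedTopology {
  enum NodeState { HEALTHY, HEALTHY_READONLY, DEAD }

  private final Map<String, NodeState> states = new HashMap<>();
  private final Map<String, Long> epochs = new HashMap<>();
  private final Set<String> topology = new HashSet<>();

  /** Applies a state transition and returns the node's new epoch. */
  synchronized long transition(String dn, NodeState next) {
    states.put(dn, next);
    if (next != NodeState.DEAD) {
      topology.add(dn);
    }
    return epochs.merge(dn, 1L, Long::sum);
  }

  /** Option 1: the state check and the topology remove share one lock. */
  synchronized boolean removeIfDead(String dn) {
    return states.get(dn) == NodeState.DEAD && topology.remove(dn);
  }

  /** Option 3: a remove carrying a stale epoch is rejected. */
  synchronized boolean removeIfCurrent(String dn, long expectedEpoch) {
    Long current = epochs.get(dn);
    return current != null && current == expectedEpoch && topology.remove(dn);
  }

  synchronized boolean inTopology(String dn) {
    return topology.contains(dn);
  }
}
```

In this sketch a dead-node handler would capture the epoch returned by the DEAD transition and pass it to removeIfCurrent; if the node has since moved to HEALTHY_READONLY, the epoch no longer matches and the remove is a no-op. A single coarse monitor is used only to keep the example short; per-node locks or striping would likely be needed to avoid serializing all node events.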