[ https://issues.apache.org/jira/browse/HDDS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shashikant Banerjee updated HDDS-5122:
--------------------------------------
Description:
During SCM reinitialization, a Ratis server is spun up to check whether a
Ratis group already exists, and the server is then closed without ever being
started. In Ratis, the segmented Raft log worker threads are started during
init() itself, but raftServer.close() stops them only if the server has
transitioned to the RUNNING state. A server that is closed before being
started therefore leaks these threads.
{code:java}
Attaching to process ID 266710, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.232-b09
Deadlock Detection:

No deadlocks found.

Thread 266745: (state = BLOCKED)

Locked ownable synchronizers:
    - None

Thread 266783: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=215 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) @bci=78, line=2078 (Compiled frame)
 - org.apache.ratis.util.DataBlockingQueue.poll(org.apache.ratis.util.TimeDuration) @bci=134, line=137 (Compiled frame)
 - org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run() @bci=16, line=292 (Interpreted frame)
 - org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$$Lambda$161.run() @bci=4 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=748 (Interpreted frame)

Locked ownable synchronizers:
    - None

Thread 266761: (state = BLOCKED)

Locked ownable synchronizers:
    - None

Thread 266760: (state = BLOCKED)

Locked ownable synchronizers:
    - None

Thread 266759: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=59, line=144 (Compiled frame)
 - java.lang.ref.ReferenceQueue.remove() @bci=2, line=165 (Compiled frame)
 - java.lang.ref.Finalizer$FinalizerThread.run() @bci=36, line=216 (Interpreted frame)

Locked ownable synchronizers:
    - None

Thread 266758: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
 - java.lang.ref.Reference.tryHandlePending(boolean) @bci=54, line=191 (Compiled frame)
 - java.lang.ref.Reference$ReferenceHandler.run() @bci=1, line=153 (Interpreted frame)

Locked ownable synchronizers:
    - None
{code}
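The lifecycle problem described above can be illustrated with a minimal,
self-contained sketch. This is not the Ratis or Ozone code: the class
SketchServer, its closeOnlyIfRunning flag, and the helper
workerSurvivesClose are hypothetical names chosen for illustration. The sketch
assumes only what the description states, namely that a worker thread is
started during initialization and that close() stops it only when the server
reached RUNNING.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Ratis or Ozone.
class LifecycleLeakSketch {
  enum State { NEW, RUNNING, CLOSED }

  static class SketchServer {
    private final boolean closeOnlyIfRunning;
    private volatile State state = State.NEW;
    private volatile boolean stopWorker = false;
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    // The worker thread is started during construction (the "init" step),
    // before the server is ever started.
    SketchServer(boolean closeOnlyIfRunning) {
      this.closeOnlyIfRunning = closeOnlyIfRunning;
      worker = new Thread(this::runWorker, "sketch-log-worker");
      worker.setDaemon(true); // lets this demo JVM exit even when the worker leaks
      worker.start();
    }

    // Polls the queue with a timeout until asked to stop or interrupted.
    private void runWorker() {
      while (!stopWorker) {
        try {
          Runnable task = queue.poll(100, TimeUnit.MILLISECONDS);
          if (task != null) {
            task.run();
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }

    void start() {
      state = State.RUNNING;
    }

    void close() {
      boolean wasRunning = state == State.RUNNING;
      state = State.CLOSED;
      if (closeOnlyIfRunning && !wasRunning) {
        return; // leaky variant: the worker started in the constructor keeps polling
      }
      // Unconditional variant: always stop the worker, whatever the prior state.
      stopWorker = true;
      worker.interrupt();
      try {
        worker.join(2000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }

    boolean workerAlive() {
      return worker.isAlive();
    }
  }

  // Creates a server, closes it without starting it, and reports whether
  // the worker thread is still alive afterwards.
  static boolean workerSurvivesClose(boolean closeOnlyIfRunning) {
    SketchServer server = new SketchServer(closeOnlyIfRunning);
    server.close();
    return server.workerAlive();
  }

  public static void main(String[] args) {
    System.out.println("state-dependent close leaks worker: " + workerSurvivesClose(true));
    System.out.println("unconditional close leaks worker: " + workerSurvivesClose(false));
  }
}
```

With the state-dependent close(), the worker is still alive after the
server is closed, which matches the parked SegmentedRaftLogWorker thread in the
dump above. Stopping the worker unconditionally in close() is one possible
remedy; whether the actual fix takes that form is not stated in this message.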
was:
During SCM reinitialization, a Ratis server is spun up to check whether a
Ratis group already exists, and the server is then closed without ever being
started. In Ratis, the segmented Raft log worker threads are started during
init() itself, but raftServer.close() stops them only if the server has
transitioned to the RUNNING state. A server that is closed before being
started therefore leaks these threads.
> SCM Reinitialization can end up leaking Ratis Segmented RaftLogWorker threads
> -----------------------------------------------------------------------------
>
> Key: HDDS-5122
> URL: https://issues.apache.org/jira/browse/HDDS-5122
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM HA
> Reporter: István Fajth
> Assignee: Shashikant Banerjee
> Priority: Major