[ 
https://issues.apache.org/jira/browse/HDDS-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412384#comment-17412384
 ] 

Tsz-wo Sze commented on HDDS-5721:
----------------------------------

[~Sammi], thanks for filing this JIRA.  In the stack trace below, the readLock 
is acquired only for the toString() method.  Let me fix it.
{code}
"Command processor thread" #138 daemon prio=5 os_prio=0 tid=0x00007f5b5c3f8000 nid=0x3da6 waiting on condition [0x00007f5ad51dd000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000851c8260> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
        at org.apache.ratis.util.AutoCloseableLock.acquire(AutoCloseableLock.java:43)
        at org.apache.ratis.util.AutoCloseableLock.acquire(AutoCloseableLock.java:39)
        at org.apache.ratis.server.raftlog.RaftLogBase.readLock(RaftLogBase.java:339)
        at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.toString(SegmentedRaftLog.java:519)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.ratis.server.impl.ServerState.toString(ServerState.java:370)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.ratis.server.impl.RaftServerImpl.toString(RaftServerImpl.java:591)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.toString(RaftServerProxy.java:173)
        at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.remove(RaftServerProxy.java:106)
        - locked <0x00000000844c0568> (a org.apache.ratis.server.impl.RaftServerProxy$ImplMap)
        at org.apache.ratis.server.impl.RaftServerProxy.groupRemoveAsync(RaftServerProxy.java:501)
        at org.apache.ratis.server.impl.RaftServerProxy.groupManagementAsync(RaftServerProxy.java:460)
        at org.apache.ratis.server.impl.RaftServerProxy.groupManagement(RaftServerProxy.java:440)
        at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.removeGroup(XceiverServerRatis.java:775)
        at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler.handle(ClosePipelineCommandHandler.java:74)
        at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:99)
        at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$2(DatanodeStateMachine.java:555)
        at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine$$Lambda$191/1341681543.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:748)
{code}
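
The trace shows toString() acquiring a read lock while the caller already holds the ImplMap monitor. One possible way to avoid that is to have toString() read a volatile snapshot instead of taking the lock. The following is only a minimal sketch of that idea and not the actual Ratis change; the class SketchLog and its fields (name, lastIndex) are hypothetical names made up for illustration.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a log whose toString() must never block. */
class SketchLog {
  // Fair lock, as in the trace (ReentrantReadWriteLock$FairSync).
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private final String name;
  // Modified only under the write lock; volatile so that toString()
  // can read the latest published value without taking any lock.
  private volatile long lastIndex = -1;

  SketchLog(String name) {
    this.name = name;
  }

  /** A state change still takes the write lock. */
  void append() {
    lock.writeLock().lock();
    try {
      lastIndex++;
    } finally {
      lock.writeLock().unlock();
    }
  }

  @Override
  public String toString() {
    // No readLock here: a slightly stale index is acceptable in a log
    // message, and the caller may already hold another monitor.
    return name + ":lastIndex=" + lastIndex;
  }
}
```

With this shape, a thread that calls toString() while holding some other monitor cannot park on the log's lock, so a stuck lock holder no longer propagates to callers that only want a printable name.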


> DN failed to create new pipeline due to RaftServerProxy$ImplMap LOCK not 
> available
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-5721
>                 URL: https://issues.apache.org/jira/browse/HDDS-5721
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Priority: Blocker
>         Attachments: jstack.tar.gz
>
>
> The DN did not respond to /jmx requests.  Further investigation showed that 
> this DN also failed to join the new pipeline at the same time. 
> According to the attached jstack, the DN is waiting for the 
> RaftServerProxy$ImplMap lock to add a new group, while that lock is held by 
> another thread that is removing the old group.  That thread in turn is waiting 
> for the readLock of its SegmentedRaftLog, which is held by a writeChunk 
> thread that is itself waiting for stateMachineDataCache quota. 


