[
https://issues.apache.org/jira/browse/HDDS-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hongbing Wang updated HDDS-8366:
--------------------------------
Description:
OM all rpc handlers hang when calling
`OzoneManagerRatisServer#submitRequestToRatis`, the key stack as follows:
{noformat}
"IPC Server handler 99 on 9862" #187 daemon prio=5 os_prio=0
tid=0x00007f1897b4c000 nid=0x10fa63 waiting on condition [0x00007f05a5b48000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007f08a185e050> (a
java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
at
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at
org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.submitRequestToRatis(OzoneManagerRatisServer.java:285)
at
org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.submitRequest(OzoneManagerRatisServer.java:247)
at
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestToRatis(OzoneManagerProtocolServerSideTranslatorPB.java:217)
at
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:198)
at
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB$$Lambda$696/251832800.apply(Unknown
Source)
at
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
at
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:147)
at
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:886)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:828)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2716)
Locked ownable synchronizers:
- None
{noformat}
The complete stack see: [^om.abnormal.jstack]
was:OM all rpc handlers hang when calling
`OzoneManagerRatisServer#submitRequestToRatis`.
> OzoneManager hangs when submitRequestToRatis
> --------------------------------------------
>
> Key: HDDS-8366
> URL: https://issues.apache.org/jira/browse/HDDS-8366
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM, Ozone Manager
> Affects Versions: 1.3.0
> Reporter: Hongbing Wang
> Priority: Major
> Attachments: om.abnormal.jstack, om.normal.jstack, om_rpc_callqueue_
> accumulation.png
>
>
> OM all rpc handlers hang when calling
> `OzoneManagerRatisServer#submitRequestToRatis`, the key stack as follows:
> {noformat}
> "IPC Server handler 99 on 9862" #187 daemon prio=5 os_prio=0
> tid=0x00007f1897b4c000 nid=0x10fa63 waiting on condition [0x00007f05a5b48000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00007f08a185e050> (a
> java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
> at
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
> at
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> at
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.submitRequestToRatis(OzoneManagerRatisServer.java:285)
> at
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.submitRequest(OzoneManagerRatisServer.java:247)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestToRatis(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:198)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB$$Lambda$696/251832800.apply(Unknown
> Source)
> at
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
> at
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:147)
> at
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:886)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:828)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2716)
> Locked ownable synchronizers:
> - None
> {noformat}
> The complete stack see: [^om.abnormal.jstack]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]