[ 
https://issues.apache.org/jira/browse/HDDS-10918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-10918:
------------------------------
    Summary: NPE in OM when OM leader transfers  (was: NPE when OM leader 
transfers)

> NPE in OM when OM leader transfers
> ----------------------------------
>
>                 Key: HDDS-10918
>                 URL: https://issues.apache.org/jira/browse/HDDS-10918
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>
> Current OM doesn't handle the LeaderSteppingDownException at all. 
> Stack trace,
> {noformat}
> 2024-05-28 07:00:47,206 [IPC Server handler 67 on default port 9862] WARN 
> ipc.Server: IPC Server handler 67 on default port 9862, call Call#25821 
> Retry#0 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> scm3.org:53692 / 172.25.0.118:53692
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.ozone.om.helpers.OMRatisHelper.getOMResponseFromRaftClientReply(OMRatisHelper.java:69)
>         at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.createOmResponseImpl(OzoneManagerRatisServer.java:527)
>         at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.lambda$2(OzoneManagerRatisServer.java:285)
>         at 
> org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
>         at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.createOmResponse(OzoneManagerRatisServer.java:283)
>         at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.submitRequest(OzoneManagerRatisServer.java:263)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestToRatis(OzoneManagerProtocolServerSideTranslatorPB.java:252)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.internalProcessRequest(OzoneManagerProtocolServerSideTranslatorPB.java:226)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:161)
>         at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:89)
>         at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:152)
>         at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.processCall(ProtobufRpcEngine.java:484)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:595)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017)
>         at java.base/java.security.AccessController.doPrivileged(Native 
> Method)
>         at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048)
> {noformat}
> Way to reproduce,
> 1. start a docker cluster with OM HA enabled 
> 2. start a command to transfer OM leader repeatedly
> 3. start a "ozone freon rk" command
> 4. watch the OM logs, and freon ouputs, there are multiple above stack 
> traces. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to