Aswin Shakil Balasubramanian created HDDS-5611:
--------------------------------------------------
Summary: NullPointerException in ContainerStateMachine during Pipeline Close.
Key: HDDS-5611
URL: https://issues.apache.org/jira/browse/HDDS-5611
Project: Apache Ozone
Issue Type: Bug
Reporter: Aswin Shakil Balasubramanian
Assignee: Aswin Shakil Balasubramanian
{code:java}
2021-06-22 05:43:07,590 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler: Can't close pipeline PipelineID=0bcbc90b-5982-4502-a5d6-ba14a461c307
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.removeGroup(XceiverServerRatis.java:782)
    at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler.handle(ClosePipelineCommandHandler.java:74)
    at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:99)
    at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$2(DatanodeStateMachine.java:497)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
    at org.apache.ratis.server.impl.RaftServerImpl.waitForReply(RaftServerImpl.java:862)
    at org.apache.ratis.server.impl.RaftServerProxy.groupManagement(RaftServerProxy.java:432)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.removeGroup(XceiverServerRatis.java:780)
    ... 4 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.markContainerForClose(ContainerController.java:83)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.notifyGroupRemove(ContainerStateMachine.java:883)
    at org.apache.ratis.server.impl.RaftServerImpl.groupRemove(RaftServerImpl.java:362)
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$groupRemoveAsync$14(RaftServerProxy.java:499)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:680)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:658)
    at java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2094)
    at org.apache.ratis.server.impl.RaftServerProxy.groupRemoveAsync(RaftServerProxy.java:498)
    at org.apache.ratis.server.impl.RaftServerProxy.groupManagementAsync(RaftServerProxy.java:452)
    ... 6 more
{code}
During pipeline close, we iterate over the list of containers in the Ratis
snapshot and close them if needed. This causes an NPE when a container in that
list is missing on the datanode (see ContainerController.markContainerForClose
in the trace above). We need to check for missing containers as well, and skip
them in this step.
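
The proposed skip could look roughly like the sketch below. This is not the
actual Ozone code or the eventual patch: the class MissingContainerSkipSketch,
the nested Container type, and the method markContainersForClose are
hypothetical stand-ins, and the container set and missing-container set are
modeled as a plain Map and Set. It only illustrates the idea of checking each
container ID from the snapshot before marking it for close.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names here are illustrative, not Ozone APIs.
final class MissingContainerSkipSketch {

  /** Hypothetical stand-in for a container held by the datanode. */
  static final class Container {
    private boolean markedForClose;

    void markForClose() {
      markedForClose = true;
    }

    boolean isMarkedForClose() {
      return markedForClose;
    }
  }

  /**
   * Marks every container listed in the snapshot for close, skipping IDs
   * that are known to be missing or that cannot be found in the container set.
   *
   * @return the container IDs that were skipped
   */
  static List<Long> markContainersForClose(
      Iterable<Long> snapshotContainerIds,
      Map<Long, Container> containerSet,
      Set<Long> missingContainerIds) {
    List<Long> skipped = new ArrayList<>();
    for (Long id : snapshotContainerIds) {
      Container container = containerSet.get(id);
      // Guard against the case that triggers the NPE: the snapshot still
      // lists the container, but the datanode no longer has it.
      if (missingContainerIds.contains(id) || container == null) {
        skipped.add(id);
        continue;
      }
      container.markForClose();
    }
    return skipped;
  }

  public static void main(String[] args) {
    Map<Long, Container> containerSet = new HashMap<>();
    containerSet.put(1L, new Container());

    // Container 2 is in the snapshot but missing on the datanode.
    List<Long> skipped =
        markContainersForClose(List.of(1L, 2L), containerSet, Set.of(2L));
    System.out.println("skipped = " + skipped);
  }
}
```

The null check is kept alongside the missing-set check so that a container
absent from both collections is still skipped rather than dereferenced.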