Aswin Shakil Balasubramanian created HDDS-5611:
--------------------------------------------------

             Summary: NullPointerException in ContainerStateMachine during Pipeline Close.
                 Key: HDDS-5611
                 URL: https://issues.apache.org/jira/browse/HDDS-5611
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Aswin Shakil Balasubramanian
            Assignee: Aswin Shakil Balasubramanian


{code:java}
2021-06-22 05:43:07,590 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler: Can't close pipeline PipelineID=0bcbc90b-5982-4502-a5d6-ba14a461c307
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.removeGroup(XceiverServerRatis.java:782)
        at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler.handle(ClosePipelineCommandHandler.java:74)
        at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:99)
        at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$2(DatanodeStateMachine.java:497)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: java.lang.NullPointerException
        at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
        at org.apache.ratis.server.impl.RaftServerImpl.waitForReply(RaftServerImpl.java:862)
        at org.apache.ratis.server.impl.RaftServerProxy.groupManagement(RaftServerProxy.java:432)
        at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.removeGroup(XceiverServerRatis.java:780)
        ... 4 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.markContainerForClose(ContainerController.java:83)
        at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.notifyGroupRemove(ContainerStateMachine.java:883)
        at org.apache.ratis.server.impl.RaftServerImpl.groupRemove(RaftServerImpl.java:362)
        at org.apache.ratis.server.impl.RaftServerProxy.lambda$groupRemoveAsync$14(RaftServerProxy.java:499)
        at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:680)
        at java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:658)
        at java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2094)
        at org.apache.ratis.server.impl.RaftServerProxy.groupRemoveAsync(RaftServerProxy.java:498)
        at org.apache.ratis.server.impl.RaftServerProxy.groupManagementAsync(RaftServerProxy.java:452)
        ... 6 more
{code}
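During pipeline close, we iterate over the containers recorded in the Ratis
snapshot and mark them for close if needed. If a container in that list is
missing on the datanode, this step throws an NPE, as in the trace above
(ContainerController.markContainerForClose, called from
ContainerStateMachine.notifyGroupRemove). We need to check for missing
containers as well and skip them in this step.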
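
The proposed skip could look roughly like the sketch below. This is not the actual Ozone code or the eventual patch: Container, ContainerSet and markForCloseSkippingMissing are hypothetical stand-ins, and the sketch assumes that looking up a missing container yields null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MissingContainerSkipSketch {

  /** Hypothetical stand-in for a container; not the Ozone Container class. */
  static final class Container {
    private boolean markedForClose;

    void markForClose() {
      markedForClose = true;
    }

    boolean isMarkedForClose() {
      return markedForClose;
    }
  }

  /** Hypothetical lookup by container ID; get() returns null when missing. */
  static final class ContainerSet {
    private final Map<Long, Container> containers = new HashMap<>();

    void add(long id) {
      containers.put(id, new Container());
    }

    Container get(long id) {
      return containers.get(id);
    }
  }

  /**
   * Marks every container named in the snapshot for close, skipping IDs
   * that are no longer present instead of dereferencing a null lookup.
   * Returns the skipped IDs so the caller can log them.
   */
  static List<Long> markForCloseSkippingMissing(
      ContainerSet set, Iterable<Long> snapshotIds) {
    List<Long> skipped = new ArrayList<>();
    for (long id : snapshotIds) {
      Container container = set.get(id);
      if (container == null) {
        // Assumption: a missing container shows up as a null lookup.
        skipped.add(id);
        continue;
      }
      container.markForClose();
    }
    return skipped;
  }

  public static void main(String[] args) {
    ContainerSet set = new ContainerSet();
    set.add(1L);
    set.add(3L);
    // Container 2 is in the snapshot but missing on this datanode.
    List<Long> skipped =
        markForCloseSkippingMissing(set, List.of(1L, 2L, 3L));
    System.out.println("skipped=" + skipped); // prints "skipped=[2]"
  }
}
```

Returning the skipped IDs rather than silently dropping them keeps the missing containers visible to the caller, which may matter when diagnosing why a container is absent.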



