[
https://issues.apache.org/jira/browse/HDDS-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shashikant Banerjee updated HDDS-1589:
--------------------------------------
Description:
Currently, while trying to close an unhealthy container over Ratis, it fails
with INTERNAL_ERROR which leads to exception as follow:
{code:java}
2019-05-19 22:00:48,386 ERROR commandhandler.CloseContainerCommandHandler
(CloseContainerCommandHandler.java:handle(124)) - Can't close container #125
org.apache.ratis.protocol.StateMachineException:
java.util.concurrent.CompletionException from Server
faea26b0-9c60-4b4c-a0df-bf7c67cc5b48: java.lang.IllegalStateException
at
org.apache.ratis.server.impl.RaftServerImpl.lambda$replyPendingRequest$24(RaftServerImpl.java:1221)
at
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException:
java.lang.IllegalStateException
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592)
... 3 more
Caused by: java.lang.IllegalStateException
at
com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:300)
at
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:149)
at
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:347)
at
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:354)
at
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$5(ContainerStateMachine.java:613)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
{code}
This happens when , it tries to mark the container unhealthy as the transaction
has failed and tries to mark the container unhealthy where it expects the
container to be in OPEN or CLOSIG state ad hence asserts. It should ideally
fail with CONTAINER_UNHEALTHY so as to not retry to not change the state to be
UNHEALTHY.
was:
Currently, while trying to close an unhealthy container over Ratis, it fails
with INTERNAL_ERROR which leads to exception as follow:
{code:java}
2019-05-19 22:00:48,386 ERROR commandhandler.CloseContainerCommandHandler
(CloseContainerCommandHandler.java:handle(124)) - Can't close container #125
org.apache.ratis.protocol.StateMachineException:
java.util.concurrent.CompletionException from Server
faea26b0-9c60-4b4c-a0df-bf7c67cc5b48: java.lang.IllegalStateException
at
org.apache.ratis.server.impl.RaftServerImpl.lambda$replyPendingRequest$24(RaftServerImpl.java:1221)
at
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException:
java.lang.IllegalStateException
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592)
... 3 more
Caused by: java.lang.IllegalStateException
at
com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:300)
at
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:149)
at
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:347)
at
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:354)
at
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$5(ContainerStateMachine.java:613)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
{code}
This happens when , it tries to mark the container unhealthy as the transaction
has failed and tries to mark the container unhealthy where it expects the
container to be in OPE or CLOSIG state ad hence asserts. It should ideally fail
with CONTAINER_UNHEATHY so as to not retry to not change the state to be
UNNHEATHY.
> CloseContainer transaction on unhealthy replica should fail with
> CONTAINER_UNHEALTHY exception
> ----------------------------------------------------------------------------------------------
>
> Key: HDDS-1589
> URL: https://issues.apache.org/jira/browse/HDDS-1589
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Major
>
> Currently, while trying to close an unhealthy container over Ratis, it fails
> with INTERNAL_ERROR which leads to exception as follow:
> {code:java}
> 2019-05-19 22:00:48,386 ERROR commandhandler.CloseContainerCommandHandler
> (CloseContainerCommandHandler.java:handle(124)) - Can't close container #125
> org.apache.ratis.protocol.StateMachineException:
> java.util.concurrent.CompletionException from Server
> faea26b0-9c60-4b4c-a0df-bf7c67cc5b48: java.lang.IllegalStateException
> at
> org.apache.ratis.server.impl.RaftServerImpl.lambda$replyPendingRequest$24(RaftServerImpl.java:1221)
> at
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.CompletionException:
> java.lang.IllegalStateException
> at
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
> at
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
> at
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592)
> ... 3 more
> Caused by: java.lang.IllegalStateException
> at
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:300)
> at
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:149)
> at
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:347)
> at
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:354)
> at
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$5(ContainerStateMachine.java:613)
> at
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> {code}
> This happens when , it tries to mark the container unhealthy as the
> transaction has failed and tries to mark the container unhealthy where it
> expects the container to be in OPEN or CLOSIG state ad hence asserts. It
> should ideally fail with CONTAINER_UNHEALTHY so as to not retry to not change
> the state to be UNHEALTHY.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]