[
https://issues.apache.org/jira/browse/HDDS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647562#comment-16647562
]
Shashikant Banerjee commented on HDDS-579:
------------------------------------------
Thanks [~jnp], for the review.
{code:java}
>From the patch it seems, the container is just marked unhealthy and close
>action is initiated, subsequent transactions are not really failed promptly,
>until the container is marked for close. I think unhealthy replica should just
>stop applying any more transactions.
{code}
We will fail all transactions after marking the container state unhealthy. All
"Write" transactions in the container will fail with the check introduced here:
{code:java}
private void checkContainerOpen(KeyValueContainer kvContainer)
    throws StorageContainerException {
  ContainerLifeCycleState containerState = kvContainer.getContainerState();
  if (containerState == ContainerLifeCycleState.OPEN) {
    return;
  } else {
    String msg = "Requested operation not allowed as ContainerState is " +
        containerState;
    ContainerProtos.Result result = null;
    switch (containerState) {
    case CLOSING:
    case CLOSED:
      result = CLOSED_CONTAINER_IO;
      break;
    case UNHEALTHY:
      result = CONTAINER_UNHEALTHY;
      break;
    case INVALID:
      result = INVALID_CONTAINER_STATE;
      break;
    default:
      result = CONTAINER_INTERNAL_ERROR;
    }
    throw new StorageContainerException(msg, result);
  }
}
{code}
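To make the effect of this check easier to see in isolation, here is a minimal, self-contained sketch of the same state-to-result mapping. It is only an illustration: the class ContainerOpenCheckSketch, its nested State and Result enums, the OTHER state, the ContainerException type, and the tryWrite helper are all hypothetical stand-ins and are not the real HDDS classes from the patch.

```java
// Simplified illustration only; every type here is a hypothetical stand-in,
// not the actual HDDS KeyValueContainer / ContainerProtos classes.
final class ContainerOpenCheckSketch {

  // OTHER is an assumed extra state, used only to exercise the default branch.
  enum State { OPEN, CLOSING, CLOSED, UNHEALTHY, INVALID, OTHER }

  enum Result {
    CLOSED_CONTAINER_IO,
    CONTAINER_UNHEALTHY,
    INVALID_CONTAINER_STATE,
    CONTAINER_INTERNAL_ERROR
  }

  // Stand-in for StorageContainerException: carries the result code.
  static final class ContainerException extends Exception {
    final Result result;

    ContainerException(String msg, Result result) {
      super(msg);
      this.result = result;
    }
  }

  // Mirrors the shape of the check above: only OPEN containers accept writes.
  static void checkContainerOpen(State state) throws ContainerException {
    if (state == State.OPEN) {
      return;
    }
    Result result;
    switch (state) {
    case CLOSING:
    case CLOSED:
      result = Result.CLOSED_CONTAINER_IO;
      break;
    case UNHEALTHY:
      result = Result.CONTAINER_UNHEALTHY;
      break;
    case INVALID:
      result = Result.INVALID_CONTAINER_STATE;
      break;
    default:
      result = Result.CONTAINER_INTERNAL_ERROR;
    }
    throw new ContainerException(
        "Requested operation not allowed as ContainerState is " + state,
        result);
  }

  // Returns null if a write would be allowed, else the failure result code.
  static Result tryWrite(State state) {
    try {
      checkContainerOpen(state);
      return null;
    } catch (ContainerException e) {
      return e.result;
    }
  }

  public static void main(String[] args) {
    for (State s : State.values()) {
      System.out.println(s + " -> " + tryWrite(s));
    }
  }
}
```

In this sketch, once a container's state is UNHEALTHY, every write attempt is rejected with CONTAINER_UNHEALTHY before any data is touched, which is the behavior described in the reply above.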
> ContainerStateMachine should fail subsequent transactions per container in
> case one fails
> -----------------------------------------------------------------------------------------
>
> Key: HDDS-579
> URL: https://issues.apache.org/jira/browse/HDDS-579
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Major
> Labels: recovery
> Attachments: HDDS-579.000.patch
>
>
> ContainerStateMachine will keep track of the last successfully applied
> transaction index and, on restart, inform Ratis of that index so that the
> subsequent transactions can be reapplied from there.
> Moreover, in case one transaction fails, all the subsequent transactions on
> the container should fail in the containerStateMachine and a container close
> action to SCM needs to be initiated to close the container.
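The behavior described in the issue summary above can be sketched roughly as follows. This is a toy model under stated assumptions, not the ContainerStateMachine implementation: the class FailFastStateMachineSketch, its apply method, the boolean writeSucceeds flag, and the in-memory list of close actions are all hypothetical. In particular, how the real state machine treats the applied index across failed entries is not specified in this message; the sketch simply records the index of the most recent successful apply.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only; all names are hypothetical and do not come from the patch.
final class FailFastStateMachineSketch {

  // Index of the most recent successfully applied transaction (-1 = none).
  private long lastAppliedIndex = -1;

  // Containers that have seen a failed transaction.
  private final Set<Long> unhealthyContainers = new HashSet<>();

  // Stand-in for close actions that would be sent to SCM.
  private final List<Long> closeActions = new ArrayList<>();

  /**
   * Applies one transaction to one container.
   *
   * @param index         log index of the transaction
   * @param containerId   target container
   * @param writeSucceeds simulated outcome of the underlying write
   * @return true if the transaction was applied
   */
  boolean apply(long index, long containerId, boolean writeSucceeds) {
    // After a failure, every later transaction on that container fails.
    if (unhealthyContainers.contains(containerId)) {
      return false;
    }
    if (!writeSucceeds) {
      // First failure: mark unhealthy and queue a single close action.
      unhealthyContainers.add(containerId);
      closeActions.add(containerId);
      return false;
    }
    lastAppliedIndex = index;
    return true;
  }

  // The index that would be reported on restart in this simplified model.
  long lastAppliedIndex() {
    return lastAppliedIndex;
  }

  List<Long> closeActions() {
    return closeActions;
  }

  public static void main(String[] args) {
    FailFastStateMachineSketch sm = new FailFastStateMachineSketch();
    System.out.println(sm.apply(1, 7L, true));   // healthy write
    System.out.println(sm.apply(2, 7L, false));  // write fails
    System.out.println(sm.apply(3, 7L, true));   // rejected: container 7 unhealthy
    System.out.println(sm.apply(4, 8L, true));   // other container unaffected
    System.out.println("close actions: " + sm.closeActions());
  }
}
```

The point of the sketch is that the failure is sticky per container: transactions on other containers keep being applied, while the failed container accepts nothing further and is queued for close exactly once.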