[
https://issues.apache.org/jira/browse/HDDS-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719799#comment-17719799
]
Stephen O'Donnell commented on HDDS-8179:
-----------------------------------------
I mistakenly assumed that SCM thought the container was empty while the DN did
not, but that is not the case here. It is clear from the logs that the replicas
on SCM report keyCount=1, so the DN is correct to refuse the delete. The
question is then how the container got into a DELETING state with a keyCount !=
0. This is what I believe happened, or at least something along these lines:
# Container opened for a write
# Client starts writing a small key, smaller than a single stripe.
# Around the same time the client starts the write, the node is
decommissioned. The first step of decommissioning is to close the pipelines.
# The pipeline close moves the container into the CLOSING state and sends
close-container commands to the DNs. The DNs don't receive these until their
next heartbeat.
# One DN gets the close command, closes the container and sends an ICR. If
this replica is not data index 1 or one of the parity indexes, it can report
a closed container with zero keys. This ICR transitions the container to
CLOSED in SCM.
# RM runs, sees the container is CLOSED with a single known replica reporting
zero keys, and transitions the container to DELETING (see the sketch after
this list).
# The other replicas check in with a keyCount of 1, which updates the
container key count in SCM and leaves it in this unexpected state, but at this
stage RM has no way to transition the container back to CLOSED.
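To make the RM step above concrete, here is a minimal sketch (hypothetical
names, not the actual ReplicationManager code) of the kind of "all known
replicas are empty" check that can send a CLOSED container to DELETING when
the only replica reported so far happens to be an empty EC replica:
{code:java}
// Hypothetical sketch, not the real SCM code: if only one (empty) replica
// has reported, the container trivially "looks" empty.
import java.util.Set;

final class EmptyContainerCheckSketch {

  /** Minimal stand-in for the replica details SCM keeps per container. */
  record ReplicaView(int replicaIndex, long keyCount, long bytesUsed) {}

  /**
   * Returns true if every replica SCM currently knows about reports zero
   * keys. With a single empty non-key EC replica reported, this passes,
   * which is how the container in this issue was sent to DELETING.
   */
  static boolean looksEmpty(Set<ReplicaView> knownReplicas) {
    return !knownReplicas.isEmpty()
        && knownReplicas.stream().allMatch(r -> r.keyCount() == 0);
  }

  public static void main(String[] args) {
    // Only replica index 4 (a non-key EC replica) has reported so far,
    // and it holds no block files, so the container "looks" empty.
    Set<ReplicaView> reported = Set.of(new ReplicaView(4, 0, 0));
    System.out.println("Container looks empty: " + looksEmpty(reported));
  }
}
{code}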
One part of the problem is that we let the container go from CLOSING to CLOSED
too easily. For Ratis, there are checks so that only a replica with the latest
BCSID can transition the container to CLOSED. For EC, I think we need to add
logic so that only a "key" replica can drive the close. A key replica is
replicaIndex = 1 or any parity index, as those are guaranteed to hold a block
file for every block in the container; other replicas may be empty. We already
have logic to ensure keyCount and bytesUsed are only updated by a "key"
replica, so this would extend the same approach to the CLOSING to CLOSED
transition in AbstractContainerReportHandler. It should stop this incorrect
DELETING state from happening.
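As a rough sketch of that gate (illustrative names only, not the existing
AbstractContainerReportHandler API), only replicaIndex 1 or a parity index
would be allowed to drive CLOSING to CLOSED for an EC container:
{code:java}
// Hypothetical sketch of the proposed "key replica" gate.
final class EcCloseTransitionGateSketch {

  /**
   * @param replicaIndex index reported by the datanode (1-based)
   * @param dataNum      number of data replicas in the EC scheme, e.g. 3 for RS-3-2
   * @param parityNum    number of parity replicas, e.g. 2 for RS-3-2
   * @return true if this replica is guaranteed to hold a block file for every
   *         block in the container and so may transition CLOSING -> CLOSED
   */
  static boolean canTransitionToClosed(int replicaIndex, int dataNum, int parityNum) {
    boolean isFirstData = replicaIndex == 1;
    boolean isParity = replicaIndex > dataNum && replicaIndex <= dataNum + parityNum;
    return isFirstData || isParity;
  }

  public static void main(String[] args) {
    // For an RS-3-2 container: indexes 1, 4 and 5 are key replicas.
    for (int idx = 1; idx <= 5; idx++) {
      System.out.println("replicaIndex=" + idx + " may close: "
          + canTransitionToClosed(idx, 3, 2));
    }
  }
}
{code}
Assuming an RS-3-2 layout, as the five replica indexes in the logs below
suggest, an empty non-key replica like the one in the scenario above could no
longer trigger the premature CLOSED/DELETING transitions.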
> Datanode decommissioning blocked due to non-empty replica of deleting
> container
> -------------------------------------------------------------------------------
>
> Key: HDDS-8179
> URL: https://issues.apache.org/jira/browse/HDDS-8179
> Project: Apache Ozone
> Issue Type: Bug
> Components: ECOfflineRecovery, SCM
> Reporter: Varsha Ravi
> Assignee: Stephen O'Donnell
> Priority: Major
> Labels: pull-request-available
>
> The Replication Manager is sending a delete container command to a non-empty
> container due to HDDS-7775. The container is not deleted, but *subsequent
> decommissioning calls to any of the DNs do not complete* because the
> container is in an under-replicated as well as an unhealthy state.
> *SCM.log:*
> {noformat}
> 2023-03-14 21:53:26,413 INFO
> org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: Sending
> command [deleteContainerCommand: containerID: 15019, replicaIndex: 1, force:
> false] for container ContainerInfo{id=#15019, state=DELETING,
> pipelineID=PipelineID=e3fb8629-89ee-472a-9c43-3962629bd7a9,
> stateEnterTime=2023-03-14T19:17:07.315Z, owner=om2} to
> 1ca038f8-c505-47ca-b701-d542b85bb75b
> 2023-03-14 21:53:26,413 INFO
> org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: Sending
> command [deleteContainerCommand: containerID: 15019, replicaIndex: 5, force:
> false] for container ContainerInfo{id=#15019, state=DELETING,
> pipelineID=PipelineID=e3fb8629-89ee-472a-9c43-3962629bd7a9,
> stateEnterTime=2023-03-14T19:17:07.315Z, owner=om2} to
> 1ac8e090-7eb7-4dab-93b7-97e4845f7b49
> 2023-03-14 23:19:12,206 INFO
> org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: Sending
> command [deleteContainerCommand: containerID: 15019, replicaIndex: 3, force:
> false] for container ContainerInfo{id=#15019, state=DELETING,
> pipelineID=PipelineID=e3fb8629-89ee-472a-9c43-3962629bd7a9,
> stateEnterTime=2023-03-14T19:17:07.315Z, owner=om2} to
> c5c3948e-1296-4313-8c4e-9e6e50424280
> 2023-03-14 23:19:53,296 INFO
> org.apache.hadoop.hdds.scm.node.NodeDecommissionManager: Starting
> Decommission for node c5c3948e-1296-4313-8c4e-9e6e50424280
> 2023-03-14 23:22:38,512 INFO
> org.apache.hadoop.hdds.scm.node.DatanodeAdminMonitorImpl: Under Replicated
> Container #15019
> org.apache.hadoop.hdds.scm.container.replication.ECContainerReplicaCount@2bd10f2f;
> Replicas{
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=ba62c66a-a342-4147-8344-3ce91726c2dc,
> placeOfBirth=ba62c66a-a342-4147-8344-3ce91726c2dc, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=5},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=15af7526-8376-45c4-97a5-7a74b7abc678,
> placeOfBirth=15af7526-8376-45c4-97a5-7a74b7abc678, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=4},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=1ca038f8-c505-47ca-b701-d542b85bb75b,
> placeOfBirth=1ca038f8-c505-47ca-b701-d542b85bb75b, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=1},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=c5c3948e-1296-4313-8c4e-9e6e50424280,
> placeOfBirth=c5c3948e-1296-4313-8c4e-9e6e50424280, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=3},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=f689fc55-e0e3-4785-9f2a-f799e18f0578,
> placeOfBirth=f689fc55-e0e3-4785-9f2a-f799e18f0578, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=1},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=1ac8e090-7eb7-4dab-93b7-97e4845f7b49,
> placeOfBirth=1ac8e090-7eb7-4dab-93b7-97e4845f7b49, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=5}}
> 2023-03-14 23:22:38,512 INFO
> org.apache.hadoop.hdds.scm.node.DatanodeAdminMonitorImpl: Unhealthy Container
> #15019
> org.apache.hadoop.hdds.scm.container.replication.ECContainerReplicaCount@2bd10f2f;
> Replicas{
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=ba62c66a-a342-4147-8344-3ce91726c2dc,
> placeOfBirth=ba62c66a-a342-4147-8344-3ce91726c2dc, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=5},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=15af7526-8376-45c4-97a5-7a74b7abc678,
> placeOfBirth=15af7526-8376-45c4-97a5-7a74b7abc678, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=4},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=1ca038f8-c505-47ca-b701-d542b85bb75b,
> placeOfBirth=1ca038f8-c505-47ca-b701-d542b85bb75b, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=1},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=c5c3948e-1296-4313-8c4e-9e6e50424280,
> placeOfBirth=c5c3948e-1296-4313-8c4e-9e6e50424280, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=3},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=f689fc55-e0e3-4785-9f2a-f799e18f0578,
> placeOfBirth=f689fc55-e0e3-4785-9f2a-f799e18f0578, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=1},
> ContainerReplica{containerID=#15019, state=CLOSED,
> datanodeDetails=1ac8e090-7eb7-4dab-93b7-97e4845f7b49,
> placeOfBirth=1ac8e090-7eb7-4dab-93b7-97e4845f7b49, sequenceId=0, keyCount=1,
> bytesUsed=102400,replicaIndex=5}}
> 2023-03-14 23:22:38,512 INFO
> org.apache.hadoop.hdds.scm.node.DatanodeAdminMonitorImpl:
> c5c3948e-1296-4313-8c4e-9e6e50424280 has 60 sufficientlyReplicated, 1
> underReplicated and 1 unhealthy containers{noformat}
> *DN.log:*
> {noformat}
> 2023-03-14 21:53:32,032 ERROR
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: Received
> container deletion command for container 15019 but the container is not empty
> with blockCount 1
> 2023-03-14 21:53:32,035 ERROR
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler:
> Exception occurred while deleting the container.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
> Non-force deletion of non-empty container is not allowed.
> at
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.deleteInternal(KeyValueHandler.java:1303)
> at
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.deleteContainer(KeyValueHandler.java:1160)
> at
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.deleteContainer(ContainerController.java:182)
> at
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler.handleInternal(DeleteContainerCommandHandler.java:108)
> at
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler.lambda$handle$0(DeleteContainerCommandHandler.java:78)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834){noformat}