[
https://issues.apache.org/jira/browse/HDDS-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Kumar reassigned HDDS-8838:
----------------------------------
Assignee: Ashish Kumar
> Orphan blocks can leave empty container stuck deleting forever
> --------------------------------------------------------------
>
> Key: HDDS-8838
> URL: https://issues.apache.org/jira/browse/HDDS-8838
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Stephen O'Donnell
> Assignee: Ashish Kumar
> Priority: Major
> Labels: pull-request-available
>
> A container can get stuck in the DELETING state in the following scenario:
> Shut down a node and write an EC key shortly afterwards, before the node has
> been marked stale. In this case, the client can do the following:
> 1. The client attempts to write to a pipeline that includes the now stopped
> node.
> 2. It fails to write the stripe and abandons it. The client will seamlessly
> rewrite the stripe to a new container.
> 3. The abandoned container could be a new one with no other data. It will now
> have some chunks written on some of the replicas, but no put block will have
> arrived.
> 4. On the DN, RocksDB will not have an entry for this block, but there
> will be a block file on disk. The DN will therefore refuse to delete the
> container if it is asked to.
> 5. The stale node handler triggers the close of the pipeline. The replicas
> are reported to SCM with zero bytes and zero keys, and hence the container
> goes to DELETING after it is CLOSED.
> 6. SCM sends delete commands, which are dropped by the DN because of the
> chunks written, and the container is stuck deleting forever (see the sketch
> after this list).
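> To make step 6 concrete, here is a minimal, hypothetical Java sketch of the
> kind of emptiness check a DN applies before honouring a non-forced delete.
> The class, method and parameter names are assumptions for illustration, not
> the actual Ozone datanode code:
> {code}
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.util.stream.Stream;
>
> // Hedged sketch: an orphan chunk file with no matching RocksDB entry makes
> // this check fail every time, so the delete command is always dropped.
> public class OrphanBlockCheck {
>   static boolean safeToDelete(Path chunksDir, long dbBlockCount)
>       throws IOException {
>     if (dbBlockCount > 0) {
>       return false;               // RocksDB still references blocks
>     }
>     try (Stream<Path> files = Files.list(chunksDir)) {
>       // A leftover *.block file (the orphan) keeps the container
>       // "non-empty" even though the DB reports zero blocks.
>       return files.noneMatch(p -> p.toString().endsWith(".block"));
>     }
>   }
> }
> {code}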
> The following logs show this happening. The key should have gone into
> container 1, as this was a fresh cluster with no previous writes, but node 6
> was stopped. You can see the logs for the pipeline that includes node_6
> being closed for container 1:
> {code}
> scm_1  2023-06-13 10:56:07,420
> [EventQueue-StaleNodeForStaleNodeHandler] INFO node.StaleNodeHandler:
> Datanode
> 7c8bf940-3520-41e4-a1b3-672359713c5b(ozone_datanode_6.ozone_default/172.30.0.12)
> moved to stale state. Finalizing its pipelines
> [PipelineID=2f228c83-dad4-40cd-8415-b153564738de,
> PipelineID=59d63d29-64f1-4a95-94cc-e75c65abb94b,
> PipelineID=dac2f9a5-f274-4baa-9409-3d203b794905]
> scm_1  2023-06-13 10:56:07,435
> [EventQueue-StaleNodeForStaleNodeHandler] INFO pipeline.PipelineManagerImpl:
> Container #1 closed for
> pipeline=PipelineID=2f228c83-dad4-40cd-8415-b153564738de
> scm_1  2023-06-13 10:56:07,437
> [EventQueue-CloseContainerForCloseContainerEventHandler] INFO
> container.CloseContainerEventHandler: Close container Event triggered for
> container : #1, current state: CLOSING
> {code}
> Container 1 has moved to CLOSING. Then we see it transition to CLOSED and
> then to DELETING:
> {code}
> scm_1  2023-06-13 10:56:16,091
> [FixedThreadPoolWithAffinityExecutor-9-0] INFO
> container.IncrementalContainerReportHandler: Moving container #1 to CLOSED
> state, datanode
> 9c7eab65-20fb-49b6-85cb-e0d4df91b14e(ozone_datanode_3.ozone_default/172.30.0.3)
> reported CLOSED replica with index 5.
> scm_1  2023-06-13 10:56:28,492 [ReplicationMonitor] INFO
> replication.ReplicationManager: Sending command [deleteContainerCommand:
> containerID: 1, replicaIndex: 5, force: false] for container
> ContainerInfo{id=#1, state=CLOSED,
> stateEnterTime=2023-06-13T10:56:16.095763Z,
> pipelineID=PipelineID=2f228c83-dad4-40cd-8415-b153564738de, owner=om1} to
> 9c7eab65-20fb-49b6-85cb-e0d4df91b14e(ozone_datanode_3.ozone_default/172.30.0.3)
> with datanode deadline 1686653848492 and scm deadline 1686653878492
> scm_1  2023-06-13 10:56:28,493 [ReplicationMonitor] INFO
> replication.ReplicationManager: Sending command [deleteContainerCommand:
> containerID: 1, replicaIndex: 4, force: false] for container
> ContainerInfo{id=#1, state=CLOSED,
> stateEnterTime=2023-06-13T10:56:16.095763Z,
> pipelineID=PipelineID=2f228c83-dad4-40cd-8415-b153564738de, owner=om1} to
> 02968363-81b4-44b9-bd9b-bfb03eb96193(ozone_datanode_1.ozone_default/172.30.0.10)
> with datanode deadline 1686653848493 and scm deadline 1686653878493
> {code}
> Getting the key info, we can see the key is actually in container 2. What
> likely happened is that the client tried to write data to datanode_6 and the
> parity to datanodes 3 and 1. The stripe was abandoned and re-written to
> container 2, so the put block to update the length likely never made it
> onto the parity nodes, but some block data did. Now the DNs will not delete
> the container as it is not empty, and it remains stuck in DELETING forever.
> {code}
> ozone sh key info /vol1/bucket1/key1
> {
>   "volumeName" : "vol1",
>   "bucketName" : "bucket1",
>   "name" : "key1",
>   "dataSize" : 175,
>   "creationTime" : "2023-06-13T10:56:04.158Z",
>   "modificationTime" : "2023-06-13T10:56:35.186Z",
>   "replicationConfig" : {
>     "data" : 3,
>     "parity" : 2,
>     "ecChunkSize" : 1048576,
>     "codec" : "RS",
>     "replicationType" : "EC",
>     "requiredNodes" : 5
>   },
>   "metadata" : { },
>   "ozoneKeyLocations" : [ {
>     "containerID" : 2,
>     "localID" : 111677748019200002,
>     "length" : 175,
>     "offset" : 0,
>     "keyOffset" : 0
>   } ]
> }
> {code}
> Checking the DN storage directories:
> {code}
> bash-4.2$ pwd
> /data/hdds/hdds/CID-684c8137-b002-47d8-b134-7831df54e517/current/containerDir0/1
> bash-4.2$ ls -al chunks/
> total 12
> drwxr-xr-x 2 hadoop users 4096 Jun 13 10:56 .
> drwxr-xr-x 4 hadoop users 4096 Jun 13 10:56 ..
> -rw-r--r-- 1 hadoop users 175 Jun 13 10:56 111677748019200001.block
> {code}
> The RocksDB was dumped with:
> {code}
> ozone debug ldb --db=container.db scan --column_family=block_data -l=-1
> {code}
> The blockID for the file above is not present in the dump, so the container
> is "empty" but has an orphan block in it.