Ethan Rose created HDDS-8141:
--------------------------------
Summary: Exception "Non-force deletion of non-empty container is
not allowed" in datanode logs
Key: HDDS-8141
URL: https://issues.apache.org/jira/browse/HDDS-8141
Project: Apache Ozone
Issue Type: Sub-task
Reporter: Ethan Rose
This exception has been noticed a few times in datanode logs:
{code:java}
2023-02-16 14:57:11,330 ERROR org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: Received container deletion command for container 54652 but the container is not empty.
2023-02-16 14:57:11,330 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler: Exception occurred while deleting the container.
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Non-force deletion of non-empty container is not allowed.
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.deleteInternal(KeyValueHandler.java:1133)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.deleteContainer(KeyValueHandler.java:1094)
    at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.deleteContainer(ContainerController.java:182)
    at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteContainerCommandHandler.lambda$handle$0(DeleteContainerCommandHandler.java:75)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}
This is a defensive code path that checks the block count metadata in RocksDB
to determine if the container is empty. It is not expected to be hit.
The last delete block command for this container was logged about 5 minutes
before this message. When we checked the disks for a few of the containers
where this happened, no block files were present. Logs show that SCM retried
the delete but got the same result every time.
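
To illustrate why retries keep failing, here is a minimal sketch of a non-force delete guarded by a block count check. It is not the actual KeyValueHandler code: the class, method, and exception names are hypothetical, and it assumes the emptiness test is "stored block count equals zero". Under that assumption, any non-zero count in the metadata (including a negative one) causes the delete to be rejected no matter what is on disk, so every retry hits the same result.

```java
public class EmptyCheckSketch {

    /** Hypothetical stand-in for the StorageContainerException seen in the log. */
    public static class NonEmptyContainerException extends Exception {
        public NonEmptyContainerException(String message) {
            super(message);
        }
    }

    /**
     * Assumption: the container counts as empty only when the block count
     * stored in its metadata is exactly zero.
     */
    public static boolean isEmptyByMetadata(long blockCount) {
        return blockCount == 0;
    }

    /** Rejects a non-force delete when the metadata says the container is not empty. */
    public static void deleteContainer(long containerId, long blockCount, boolean force)
            throws NonEmptyContainerException {
        if (!force && !isEmptyByMetadata(blockCount)) {
            throw new NonEmptyContainerException(
                "Non-force deletion of non-empty container is not allowed.");
        }
        // The real removal of container data would happen here.
    }

    public static void main(String[] args) {
        // A negative block count is not zero, so the delete is rejected on every attempt.
        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                deleteContainer(54652L, -6L, false);
                System.out.println("attempt " + attempt + ": deleted");
            } catch (NonEmptyContainerException e) {
                System.out.println("attempt " + attempt + ": rejected: " + e.getMessage());
            }
        }
    }
}
```

Because the check depends only on the stored count, the outcome does not change until the metadata is repaired or the delete is forced.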
Later on, the container inspector was run on this cluster and it reported that
there was only one copy of this container in the whole cluster. It had the
following metadata:
{code:java}
{
  "containerID": 54652,
  "schemaVersion": "2",
  "containerState": "CLOSED",
  "currentDatanodeID": "a160b3e2-a450-446d-a75c-898241a1ff7a",
  "originDatanodeID": "a160b3e2-a450-446d-a75c-898241a1ff7a",
  "dBMetadata": {
    "#BLOCKCOUNT": -6,
    "#BYTESUSED": -1431232412,
    "#PENDINGDELETEBLOCKCOUNT": 0,
    "#delTX": 46312,
    "#BCSID": 1548650
  },
  "aggregates": {
    "blockCount": 0,
    "usedBytes": 0,
    "pendingDeleteBlocks": 0,
    "pendingDeleteBytes": 0
  },
  "chunksDirectory": {
    "path": "/data/qssufn48/hadoop-ozone/datanode/data/hdds/CID-30dff43d-34c2-4855-991f-797164dcb259/current/containerDir106/54652/chunks",
    "present": true,
    "fileCount": 0
  },
  "dBMetadataDeleteCount_minus_aggregatedDeleteCount": 0,
  "correct": false,
  "errors": [
    {
      "property": "dBMetadata.#BLOCKCOUNT",
      "expected": 0,
      "actual": -6,
      "repaired": false
    },
    {
      "property": "dBMetadata.#BYTESUSED",
      "expected": 0,
      "actual": -1431232412,
      "repaired": false
    }
  ]
}
{code}
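
The two entries under "errors" are consistent with a comparison of the stored counters in "dBMetadata" against the values in "aggregates". The sketch below shows such a comparison under that assumption. It is not the container inspector's implementation: the class and method names are hypothetical, and only the property names and numbers are taken from the report above.

```java
import java.util.ArrayList;
import java.util.List;

public class InspectorSketch {

    /**
     * Compares stored metadata counters with aggregated values and returns
     * one message per mismatch. Assumption: the aggregated value is treated
     * as the expected one, as the report above suggests.
     */
    public static List<String> findErrors(long dbBlockCount, long dbBytesUsed,
                                          long aggBlockCount, long aggUsedBytes) {
        List<String> errors = new ArrayList<>();
        if (dbBlockCount != aggBlockCount) {
            errors.add("dBMetadata.#BLOCKCOUNT: expected " + aggBlockCount
                + ", actual " + dbBlockCount);
        }
        if (dbBytesUsed != aggUsedBytes) {
            errors.add("dBMetadata.#BYTESUSED: expected " + aggUsedBytes
                + ", actual " + dbBytesUsed);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Values from the inspector output for container 54652.
        List<String> errors = findErrors(-6L, -1431232412L, 0L, 0L);
        for (String error : errors) {
            System.out.println(error);
        }
    }
}
```

With the values reported for container 54652, both counters mismatch, which matches the two errors listed in the inspector output.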