Ivan Andika created HDDS-13728:
----------------------------------
Summary: DECOMMISSIONED datanodes might cause deletion to be slow
Key: HDDS-13728
URL: https://issues.apache.org/jira/browse/HDDS-13728
Project: Apache Ozone
Issue Type: Bug
Reporter: Ivan Andika
We recently encountered an issue where we decommissioned four datanodes but did
not turn them off immediately. We saw that this caused a large backlog of
pending deletions in both the OM and SCM.
From the OM logs, we saw that deleting 80,000 keys took more than 5 minutes,
where normally it takes only a few seconds. As a result, the OM deletedTable
entries kept increasing (to hundreds of millions).
{code:java}
KeyDeletingService Background task execution took 303525629747ns >
300000000000ns(timeout)
KeyDeletingService Background task execution took 399958041329ns >
300000000000ns(timeout){code}
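For readability, the nanosecond values in the log lines above can be converted to seconds. This is only a small arithmetic sketch; the class and method names (TimeoutLogSketch, overrunMillis) are hypothetical and not part of Ozone.

```java
import java.time.Duration;

public class TimeoutLogSketch {
  // Hypothetical helper: how far a task ran past its timeout, in milliseconds.
  static long overrunMillis(long elapsedNanos, long timeoutNanos) {
    return Duration.ofNanos(elapsedNanos - timeoutNanos).toMillis();
  }

  public static void main(String[] args) {
    // Values copied from the KeyDeletingService log lines above.
    long timeoutNanos = 300_000_000_000L;
    long[] elapsedNanos = {303_525_629_747L, 399_958_041_329L};

    for (long elapsed : elapsedNanos) {
      System.out.printf("elapsed=%ds timeout=%ds overrun=%dms%n",
          Duration.ofNanos(elapsed).getSeconds(),
          Duration.ofNanos(timeoutNanos).getSeconds(),
          overrunMillis(elapsed, timeoutNanos));
    }
  }
}
```

In other words, the two runs took roughly 303 and 399 seconds against a 300-second timeout.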
From the SCM logs, we saw that a lot of deletion transactions timed out (logs
are truncated for visibility).
{code:java}
SCM BlockDeletionCommand
ScmTxStateMachine{dnId=72e9966d-428d-4d05-ab92-f11dccc14d92,
scmTxID=1757003087471, deletedBlocksTxIds=[9800591406, ...],
updateTime=2025-09-30T06:53:29.994Z, status=SENT} for Datanode:
72e9966d-428d-4d05-ab92-f11dccc14d92 was removed after 300000ms without update
{code}
DeletedBlockTransactionScanner also became very slow:
{code:java}
Totally added 406081 blocks to be deleted for 88 datanodes / REDACTED
totalnodes: [REDACTED], task elapsed time: 381931ms {code}
The SCM deletedBlocksTable also kept increasing.
We suspect this is due to DeletedBlockLogImpl lock contention between
addTransactions and getTransactions.
However, when we turned off the DECOMMISSIONED datanodes, the deletion
performance improved significantly. This is odd, since a DECOMMISSIONED
datanode should not trigger any deletion (the machine is going to be
decommissioned anyway).
This leads me to suspect that there might be an issue in our deletion
implementation. I suspect that the use of
ContainerManager#getContainerReplicas (which returns Set<ContainerReplica>) in
SCMDeletedBlockTransactionStatusManager#commitTransactions might not be
correct, since the returned set also includes the DECOMMISSIONED datanodes,
which never receive any deletion commands. This might cause commitTransactions
to never remove entries from the deletedBlocksTable. We might need to exclude
DECOMMISSIONED nodes.
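To make the suspicion concrete, here is a minimal, self-contained sketch of the mechanism described above. It is not the Ozone code: the class CommitTransactionSketch, the OpState enum, and both methods are hypothetical stand-ins, and the assumption that a transaction is only removed once every replica holder has acknowledged it is my reading of the behaviour, not something confirmed from the source.

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class CommitTransactionSketch {
  // Hypothetical stand-in for a datanode's operational state.
  enum OpState { IN_SERVICE, DECOMMISSIONED }

  // Assumed current behaviour: the transaction can be removed only when
  // every datanode holding a replica has acknowledged the deletion.
  static boolean canCommitAllReplicas(
      Map<String, OpState> replicaHolders, Set<String> acked) {
    return acked.containsAll(replicaHolders.keySet());
  }

  // Possible fix: ignore DECOMMISSIONED holders, because they never
  // receive deletion commands and therefore can never acknowledge.
  static boolean canCommitExcludingDecommissioned(
      Map<String, OpState> replicaHolders, Set<String> acked) {
    Set<String> expected = replicaHolders.entrySet().stream()
        .filter(e -> e.getValue() != OpState.DECOMMISSIONED)
        .map(Map.Entry::getKey)
        .collect(Collectors.toSet());
    return acked.containsAll(expected);
  }

  public static void main(String[] args) {
    // Three in-service replicas plus one replica on a decommissioned node.
    Map<String, OpState> holders = Map.of(
        "dn1", OpState.IN_SERVICE,
        "dn2", OpState.IN_SERVICE,
        "dn3", OpState.IN_SERVICE,
        "dn4", OpState.DECOMMISSIONED);
    // Only the in-service datanodes were sent the command and acked it.
    Set<String> acked = Set.of("dn1", "dn2", "dn3");

    System.out.println("all replicas required: "
        + canCommitAllReplicas(holders, acked));
    System.out.println("decommissioned excluded: "
        + canCommitExcludingDecommissioned(holders, acked));
  }
}
```

Under these assumptions, the first check never succeeds while the decommissioned node still reports its replica, which would match the deletedBlocksTable growing until the nodes were turned off.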
Our version is based on 1.4.1, so there may be improvements we have not yet
incorporated.