Ivan Andika created HDDS-13728:
----------------------------------

             Summary: DECOMMISSIONED datanodes might cause deletion to be slow
                 Key: HDDS-13728
                 URL: https://issues.apache.org/jira/browse/HDDS-13728
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Ivan Andika


We recently encountered an issue where we decommissioned four datanodes but did 
not turn them off immediately. We saw that this caused a large backlog of 
pending deletions in both the OM and SCM.

From the OM logs, we saw that deleting 80,000 keys took more than 5 minutes, 
where it normally takes only a few seconds. As a result, the OM deletedTable 
entries kept increasing (to hundreds of millions).

 
{code:java}
KeyDeletingService Background task execution took 303525629747ns > 300000000000ns(timeout)
KeyDeletingService Background task execution took 399958041329ns > 300000000000ns(timeout){code}
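For a rough sense of scale, the figures above can be turned into a throughput estimate. The sketch below assumes the 80,000-key batch corresponds to one run lasting about the 300 s timeout, and uses a backlog of 100,000,000 keys purely as an illustrative value for "hundreds of millions". DeletionRateSketch and its methods are hypothetical helpers, not Ozone code.

```java
import java.time.Duration;

// Hypothetical helper for back-of-the-envelope estimates; not part of Ozone.
final class DeletionRateSketch {

  // Keys deleted per second for a batch that took elapsedNanos to process.
  static double keysPerSecond(long keys, long elapsedNanos) {
    return keys / (elapsedNanos / 1_000_000_000.0);
  }

  // Time needed to drain a backlog at a constant rate, rounded to whole seconds.
  static Duration timeToDrain(long backlogKeys, double keysPerSecond) {
    return Duration.ofSeconds(Math.round(backlogKeys / keysPerSecond));
  }

  public static void main(String[] args) {
    // Assumption: 80,000 keys per run, each run lasting the 300 s timeout.
    double rate = keysPerSecond(80_000L, 300_000_000_000L);
    // Assumption: 100,000,000 keys is an illustrative backlog size only.
    Duration drain = timeToDrain(100_000_000L, rate);
    System.out.println("rate (keys/s): " + rate);
    System.out.println("drain time (hours): " + drain.toHours());
  }
}
```

Under these assumptions the rate is about 267 keys per second, so a 100,000,000-key backlog would need roughly 104 hours to drain even if no new deletions arrived.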
From the SCM logs, we saw that a lot of deletion transactions timed out (logs 
are truncated for visibility).
{code:java}
SCM BlockDeletionCommand 
ScmTxStateMachine{dnId=72e9966d-428d-4d05-ab92-f11dccc14d92, 
scmTxID=1757003087471, deletedBlocksTxIds=[9800591406, ...], 
updateTime=2025-09-30T06:53:29.994Z, status=SENT} for Datanode: 
72e9966d-428d-4d05-ab92-f11dccc14d92 was removed after 300000ms without update 
{code}
DeletedBlockTransactionScanner also became very slow:
{code:java}
Totally added 406081 blocks to be deleted for 88 datanodes / REDACTED 
totalnodes: [REDACTED], task elapsed time: 381931ms {code}
The SCM deletedBlocksTable also kept increasing.

We suspect this is due to lock contention in DeletedBlockLogImpl between 
addTransactions and getTransactions.
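To illustrate the kind of contention suspected here, the following is a minimal sketch of a hypothetical in-memory log guarded by a single lock. It is not DeletedBlockLogImpl; SingleLockLog, tryAddTransaction, and the latch-driven getTransactions are invented for this example. While a slow scan holds the lock, a writer cannot acquire it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of a transaction log protected by one lock.
final class SingleLockLog {
  private final ReentrantLock lock = new ReentrantLock();
  private final List<Long> txIds = new ArrayList<>();

  // Writer path: gives up if the lock is not free within timeoutMs.
  boolean tryAddTransaction(long txId, long timeoutMs) throws InterruptedException {
    if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      return false;
    }
    try {
      txIds.add(txId);
      return true;
    } finally {
      lock.unlock();
    }
  }

  // Reader path: holds the lock for the whole scan. scanStarted signals that
  // the lock is held; releaseScan models how long the scan takes.
  List<Long> getTransactions(CountDownLatch scanStarted, CountDownLatch releaseScan)
      throws InterruptedException {
    lock.lock();
    try {
      scanStarted.countDown();
      releaseScan.await();
      return new ArrayList<>(txIds);
    } finally {
      lock.unlock();
    }
  }
}

final class LockContentionDemo {
  // Returns {addedDuringScan, addedAfterScan}.
  static boolean[] run() {
    SingleLockLog log = new SingleLockLog();
    CountDownLatch scanStarted = new CountDownLatch(1);
    CountDownLatch releaseScan = new CountDownLatch(1);
    Thread scanner = new Thread(() -> {
      try {
        log.getTransactions(scanStarted, releaseScan);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    try {
      scanner.start();
      scanStarted.await();                          // scan now holds the lock
      boolean during = log.tryAddTransaction(1L, 50);
      releaseScan.countDown();                      // let the scan finish
      scanner.join();
      boolean after = log.tryAddTransaction(2L, 50);
      return new boolean[] {during, after};
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = run();
    // prints "added during scan: false, after scan: true"
    System.out.println("added during scan: " + r[0] + ", after scan: " + r[1]);
  }
}
```

If the real scan is slow for some other reason, a single shared lock like this would turn that slowness into blocked writers, which would be consistent with both sides degrading together.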

However, when we turned off the DECOMMISSIONED datanodes, the deletion 
performance improved significantly. This is odd, since a DECOMMISSIONED 
datanode should not trigger any deletion (the machine is going to be 
decommissioned anyway). This leads me to suspect that there might be an issue 
in our deletion implementation. I suspect that the use of 
ContainerManager#getContainerReplicas (which returns Set<ContainerReplica>) in 
SCMDeletedBlockTransactionStatusManager#commitTransactions might not be 
correct, since the returned set also includes the DECOMMISSIONED datanodes, 
which never receive any deletion commands. This might cause commitTransactions 
to never remove entries from the deletedBlocksTable. We might need to exclude 
DECOMMISSIONED nodes.
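A minimal sketch of the suspected problem and the proposed exclusion, using hypothetical stand-in types rather than the real Ozone classes. It assumes a transaction is treated as committed once every expected replica's datanode has acknowledged the deletion. NodeState, Replica, and both isCommitted... methods are illustrative names, not the actual SCMDeletedBlockTransactionStatusManager code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;

// Hypothetical stand-ins for illustration only; not the real Ozone types.
enum NodeState { IN_SERVICE, DECOMMISSIONED }

final class Replica {
  final UUID datanodeId;
  final NodeState state;

  Replica(UUID datanodeId, NodeState state) {
    this.datanodeId = datanodeId;
    this.state = state;
  }
}

final class CommitCheckSketch {

  // Suspected behaviour: every replica must acknowledge, including replicas
  // on DECOMMISSIONED nodes that are never sent a deletion command.
  static boolean isCommittedAllReplicas(Set<Replica> replicas, Set<UUID> acked) {
    return replicas.stream().allMatch(r -> acked.contains(r.datanodeId));
  }

  // Proposed behaviour: only replicas on nodes that can receive deletion
  // commands are expected to acknowledge. Note that if every replica is
  // DECOMMISSIONED, this returns true without any acknowledgement.
  static boolean isCommittedExcludingDecommissioned(Set<Replica> replicas, Set<UUID> acked) {
    Set<UUID> expected = replicas.stream()
        .filter(r -> r.state != NodeState.DECOMMISSIONED)
        .map(r -> r.datanodeId)
        .collect(Collectors.toSet());
    return acked.containsAll(expected);
  }

  public static void main(String[] args) {
    UUID dn1 = UUID.randomUUID();
    UUID dn2 = UUID.randomUUID();
    UUID decommissioned = UUID.randomUUID();

    Set<Replica> replicas = new HashSet<>(Arrays.asList(
        new Replica(dn1, NodeState.IN_SERVICE),
        new Replica(dn2, NodeState.IN_SERVICE),
        new Replica(decommissioned, NodeState.DECOMMISSIONED)));

    // Only the in-service datanodes ever acknowledge the deletion.
    Set<UUID> acked = new HashSet<>(Arrays.asList(dn1, dn2));

    // prints "all replicas: false, excluding DECOMMISSIONED: true"
    System.out.println("all replicas: " + isCommittedAllReplicas(replicas, acked)
        + ", excluding DECOMMISSIONED: "
        + isCommittedExcludingDecommissioned(replicas, acked));
  }
}
```

Whether the real check behaves like the first method is exactly what needs to be confirmed against the 1.4.1 code. The sketch only shows why counting a replica that is never asked to delete would keep a transaction pending for as long as that replica is reported.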

Our version is based on 1.4.1, so there might be some improvements we have not 
yet incorporated.

 



