slfan1989 commented on PR #4988: URL: https://github.com/apache/ozone/pull/4988#issuecomment-2371660310
@xichen01 @adoroszlai While using deletion, I noticed that it can be very slow, especially after we switched to the EC policy.

Our Ozone01 cluster currently has about 1K machines. Initially, we used the `Ratis-3Replica` strategy, but for cost reasons we gradually switched to the `EC-6-3` strategy in July.

The following chart shows the deletion speed for `Ratis-3Replica`.

[Chart: deletion speed for `Ratis-3Replica`]

The following chart shows the deletion speed for `EC-6-3`.

[Chart: deletion speed for `EC-6-3`]

By reviewing the code and analyzing the logs, we found that the following situation can make deletion very slow. We illustrate it with an example.

> Background

We want to delete data from an EC container with ContainerId = 1000. Since it is EC-6-3, there are 9 replicas (DN1, DN2, DN3, ... DN9).

> Process

Before deletion, we first select a batch of DNs; at this point, we may select only DN1 to DN6. We then send the deletion command to these 6 DNs, and the command executes normally, successfully deleting 6 blocks. However, if DN7 to DN9 are not selected, the deletion process for this container gets stuck.

> Code

https://github.com/apache/ozone/blob/1f86ce80bd775fc4403617f17aa272a0d1297c7f/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java#L294-L313

I came up with a possible solution to eliminate this stuck situation: we require that all replicas of the container to be deleted are present in the selected DN list at the same time. Otherwise, we skip that container.
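To make the rule concrete before the actual change, here is a minimal standalone sketch of the all-or-nothing check for the ContainerId = 1000 example. It uses plain strings for datanodes; the class `ReplicaSelectionSketch` and its methods are hypothetical names for illustration only, not Ozone APIs.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the SCM-side check; not an Ozone class. */
final class ReplicaSelectionSketch {

  /** True only if every replica's datanode is in the selected set. */
  static boolean allReplicasSelected(Set<String> replicaDns,
      Set<String> selectedDns) {
    return selectedDns.containsAll(replicaDns);
  }

  /** Datanodes that get the delete command: all replicas, or none. */
  static Set<String> targetsFor(Set<String> replicaDns,
      Set<String> selectedDns) {
    if (!allReplicasSelected(replicaDns, selectedDns)) {
      // At least one replica is outside the batch: skip this container
      // for now instead of deleting only part of it.
      return Set.of();
    }
    return new LinkedHashSet<>(replicaDns);
  }

  public static void main(String[] args) {
    // EC-6-3 container: 9 replicas on DN1..DN9.
    Set<String> replicas = new LinkedHashSet<>(List.of(
        "DN1", "DN2", "DN3", "DN4", "DN5", "DN6", "DN7", "DN8", "DN9"));
    // The batch from the example: only DN1..DN6 were selected.
    Set<String> partialBatch =
        Set.of("DN1", "DN2", "DN3", "DN4", "DN5", "DN6");

    System.out.println(targetsFor(replicas, partialBatch).size()); // 0
    System.out.println(targetsFor(replicas, replicas).size());     // 9
  }
}
```

With the partial batch, no command is sent for the container, so it is never left with 6 replicas deleted and 3 remaining. The `getTransaction` change below applies the same check to `ContainerReplica` and `DatanodeDetails`.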
```
private void getTransaction(DeletedBlocksTransaction tx,
    DatanodeDeletedBlockTransactions transactions,
    Set<DatanodeDetails> dnList, Set<ContainerReplica> replicas,
    Map<UUID, Map<Long, CmdStatus>> commandStatus) {
  DeletedBlocksTransaction updatedTxn =
      DeletedBlocksTransaction.newBuilder(tx)
          .setCount(transactionStatusManager.getOrDefaultRetryCount(
              tx.getTxID(), 0))
          .build();

  // Requiring that replicas must be present in the DN list simultaneously
  // ensures that the deletion commands for all replicas of the same
  // container can be issued at once, avoiding situations where some
  // replicas of the container are deleted while others are not.
  for (ContainerReplica replica : replicas) {
    DatanodeDetails datanodeDetails = replica.getDatanodeDetails();
    if (!dnList.contains(datanodeDetails)) {
      return;
    }
  }

  for (ContainerReplica replica : replicas) {
    DatanodeDetails details = replica.getDatanodeDetails();
    if (!dnList.contains(details)) {
      continue;
    }
    if (!transactionStatusManager.isDuplication(
        details, updatedTxn.getTxID(), commandStatus)) {
      transactions.addTransactionToDN(details.getUuid(), updatedTxn);
    }
  }
}
```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
