Xiangyi Zhu created HDFS-16043:
----------------------------------

             Summary: HDFS : Delete performance optimization
                 Key: HDFS-16043
                 URL: https://issues.apache.org/jira/browse/HDFS-16043
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs, namanode
    Affects Versions: 3.4.0
            Reporter: Xiangyi Zhu


Deleting a large directory caused the NameNode (NN) to hold the lock for too 
long, which led to our NameNode being killed by ZKFC.
A flame graph shows that the time is spent mainly in two places: 
removeBlocks(toRemovedBlocks), and the QuotaCount calculation performed when 
deleting inodes. Of the two, removeBlocks(toRemovedBlocks) takes the larger 
share of the time.
h3. Solution:

1. Process removeBlocks asynchronously. A thread is started in the 
BlockManager to process the deleted blocks and to limit how long the lock is 
held.
2. Optimize the QuotaCount calculation, similar to the optimization in 
[HDFS-16000|https://issues.apache.org/jira/browse/HDFS-16000].
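
As a rough illustration of item 1, the sketch below shows one way a background thread could remove queued blocks in bounded batches, releasing the lock between batches. It is not the actual patch: the class AsyncBlockRemover, the batchSize parameter, the ReentrantLock standing in for the NameNode lock, and the counter that replaces real block removal are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; none of these names come from the HDFS code base. */
class AsyncBlockRemover {
    // Stand-in for the NameNode lock (assumption for this example).
    private final ReentrantLock lock = new ReentrantLock();
    private final BlockingQueue<List<Long>> pending = new LinkedBlockingQueue<>();
    private final AtomicLong removed = new AtomicLong();
    private final int batchSize; // max blocks removed per lock hold (hypothetical knob)
    private final Thread worker;
    private volatile boolean running = true;

    AsyncBlockRemover(int batchSize) {
        this.batchSize = batchSize;
        this.worker = new Thread(this::drain, "async-block-remover");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Called from the delete path: queues the blocks and returns immediately. */
    void enqueue(List<Long> toRemovedBlocks) {
        pending.add(new ArrayList<>(toRemovedBlocks));
    }

    /** Worker loop: keeps draining until shutdown is requested and the queue is empty. */
    private void drain() {
        while (running || !pending.isEmpty()) {
            List<Long> blocks;
            try {
                blocks = pending.poll(100, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (blocks == null) {
                continue; // nothing queued yet
            }
            for (int start = 0; start < blocks.size(); start += batchSize) {
                int end = Math.min(start + batchSize, blocks.size());
                lock.lock();
                try {
                    for (long blockId : blocks.subList(start, end)) {
                        removeBlock(blockId);
                    }
                } finally {
                    lock.unlock(); // released between batches so other operations can run
                }
            }
        }
    }

    /** Placeholder for the real per-block removal work; here it only counts. */
    private void removeBlock(long blockId) {
        removed.incrementAndGet();
    }

    long removedCount() {
        return removed.get();
    }

    /** Stops accepting the idle wait and blocks until queued work is finished. */
    void shutdown() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape, the delete call only pays for the enqueue, and each lock hold is bounded by batchSize blocks rather than by the size of the deleted directory.
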
h3. Comparison before and after optimization:


Test: delete 10 million inodes and 10 million blocks.
*before:*
remove inode elapsed time: 7691 ms
remove block elapsed time: 11107 ms
*after:*
remove inode elapsed time: 4149 ms
remove block elapsed time: 0 ms



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
