[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangyi Zhu updated HDFS-16043:
-------------------------------
    Description:
Deleting a large directory caused the NameNode (NN) to hold the lock for too long, which in turn caused our NameNode to be killed by ZKFC.

The flame graph shows that most of the time is spent in removeBlocks(toRemovedBlocks) and in the QuotaCount calculation performed while deleting inodes, with removeBlocks(toRemovedBlocks) accounting for the larger share.

h3. Solution:
1. Process removeBlocks asynchronously. A thread is started in the BlockManager to process the deleted blocks and to limit how long the lock is held.
2. Optimize the QuotaCount calculation, similar to the optimization in HDFS-16000.

h3. Comparison before and after optimization:
Test: delete 10 million (1000w) inodes and 10 million (1000w) blocks.

*before:*
remove inode elapsed time: 7691 ms
remove block elapsed time: 11107 ms

*after:*
remove inode elapsed time: 4149 ms
remove block elapsed time: 0 ms

> HDFS : Delete performance optimization
> --------------------------------------
>
>                 Key: HDFS-16043
>                 URL: https://issues.apache.org/jira/browse/HDFS-16043
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namanode
>    Affects Versions: 3.4.0
>            Reporter: Xiangyi Zhu
>            Priority: Major
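The asynchronous block removal described in the solution can be illustrated with a minimal sketch. This is not the HDFS-16043 patch: the class `AsyncBlockRemover`, its methods, the `batchSize` parameter, and the use of a plain `ReentrantReadWriteLock` in place of the namesystem lock are all assumptions made for illustration. The idea it shows is that the delete call only enqueues the collected blocks, while a background thread removes them in small batches and releases the write lock between batches so that other operations are not starved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of batched, asynchronous block removal. */
class AsyncBlockRemover implements Runnable {
  // Each queue entry is the block list collected by one delete operation.
  private final BlockingQueue<List<Long>> pending = new LinkedBlockingQueue<>();
  // Stand-in for the namesystem lock (assumption, not the real HDFS lock class).
  private final ReentrantReadWriteLock nsLock;
  // Maximum number of blocks removed per lock acquisition.
  private final int batchSize;
  private final AtomicLong removed = new AtomicLong();
  private volatile boolean running = true;

  AsyncBlockRemover(ReentrantReadWriteLock nsLock, int batchSize) {
    this.nsLock = nsLock;
    this.batchSize = batchSize;
  }

  /** Called by the delete path: returns immediately without removing anything. */
  void enqueue(List<Long> blockIds) {
    pending.add(new ArrayList<>(blockIds));
  }

  /** Asks the worker to exit once the queue has been drained. */
  void stop() {
    running = false;
  }

  long removedCount() {
    return removed.get();
  }

  @Override
  public void run() {
    while (running || !pending.isEmpty()) {
      List<Long> blocks;
      try {
        blocks = pending.poll(100, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      if (blocks == null) {
        continue; // nothing queued yet; re-check the stop flag
      }
      for (int start = 0; start < blocks.size(); start += batchSize) {
        int end = Math.min(start + batchSize, blocks.size());
        nsLock.writeLock().lock();
        try {
          for (long blockId : blocks.subList(start, end)) {
            removeBlock(blockId);
          }
        } finally {
          // Released after every batch so waiting operations can proceed.
          nsLock.writeLock().unlock();
        }
      }
    }
  }

  // Placeholder for the real per-block bookkeeping; here it only counts.
  private void removeBlock(long blockId) {
    removed.incrementAndGet();
  }
}
```

With a design like this, the time spent removing blocks moves off the delete call itself, which would be consistent with the "remove block elapsed time: 0 ms" figure reported after the optimization; the blocks are still removed, only later and in bounded lock intervals.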