[ https://issues.apache.org/jira/browse/HDFS-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241420#comment-17241420 ]
Frank Top Frank edited comment on HDFS-13831 at 12/1/20, 10:14 AM:
-------------------------------------------------------------------

[~linyiqun], [~jianliang.wu] [~weichiu] Hello. Recently, in a production environment with more than 1000 nodes (HDP 3.1.0; Hadoop 3.1.0), RPC response times were particularly high when a large number of files were deleted. I observed the phenomenon shown in the following figure, captured with Arthas (Alibaba's diagnostic tool): [Arthas screenshot|https://img-blog.csdnimg.cn/20201201181311356.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM3ODY1NDIw,size_16,color_FFFFFF,t_70]

The HDP code is as follows:

{code:java}
static int BLOCK_DELETION_INCREMENT = 1000;
...

void removeBlocks(BlocksMapUpdateInfo blocks) {
  // After writing the edit log, iterate over the blocks collected above and
  // call blockManager.removeBlock() for each block to be deleted.
  List<BlockInfo> toDeleteList = blocks.getToDeleteList();
  Iterator<BlockInfo> iter = toDeleteList.iterator();
  while (iter.hasNext()) {
    writeLock();
    try {
      // Iterate over the collected blocks under two limits: the constant
      // batch limit (BLOCK_DELETION_INCREMENT) and the number of remaining
      // blocks (iter.hasNext()). By default at most BLOCK_DELETION_INCREMENT
      // (1000) blocks are deleted per write-lock acquisition.
      for (int i = 0; i < BLOCK_DELETION_INCREMENT && iter.hasNext(); i++) {
        blockManager.removeBlock(iter.next());
      }
    } finally {
      writeUnlock("removeBlocks");
    }
  }
}
{code}

Should I apply your patch and increase the parameter ("dfs.namenode.block.deletion.increment")?

> Make block increment deletion number configurable
> -------------------------------------------------
>
>                 Key: HDFS-13831
>                 URL: https://issues.apache.org/jira/browse/HDFS-13831
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.1.0
>            Reporter: Yiqun Lin
>            Assignee: Ryan Wu
>            Priority: Major
>             Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2
>
>         Attachments: HDFS-13831.001.patch, HDFS-13831.002.patch, HDFS-13831.003.patch, HDFS-13831.004.patch, HDFS-13831.branch-3.0.001.patch
>
>
> When the NameNode deletes a large directory, it holds the write lock for a long time. To improve this, the blocks are removed in batches, so that other waiters have a chance to get the lock. But right now, the batch number is a hard-coded value.
> {code}
> static int BLOCK_DELETION_INCREMENT = 1000;
> {code}
> We can make this value configurable, so that we can control how frequently other waiters get a chance at the lock.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
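The batch-and-release pattern in removeBlocks() can be sketched outside HDFS. The following is a minimal, self-contained illustration of the technique, not the real NameNode code: BatchedRemoval, blockDeletionIncrement, and the Long "block ids" are hypothetical stand-ins for FSNamesystem, BLOCK_DELETION_INCREMENT, and BlockInfo.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the batched-deletion pattern: hold the write lock only for a
// bounded batch of removals, then release it so other waiters can run.
public class BatchedRemoval {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final int blockDeletionIncrement; // stand-in for BLOCK_DELETION_INCREMENT
    int lockAcquisitions = 0;                 // counts write-lock holds, for illustration

    BatchedRemoval(int increment) {
        this.blockDeletionIncrement = increment;
    }

    /** Remove every id in blockIds from store, at most blockDeletionIncrement per lock hold. */
    void removeAll(List<Long> blockIds, List<Long> store) {
        Iterator<Long> iter = blockIds.iterator();
        while (iter.hasNext()) {
            lock.writeLock().lock();
            lockAcquisitions++;
            try {
                // Bounded by both the batch size and the remaining items,
                // mirroring the double limit in removeBlocks().
                for (int i = 0; i < blockDeletionIncrement && iter.hasNext(); i++) {
                    store.remove(iter.next());
                }
            } finally {
                lock.writeLock().unlock(); // waiters get a chance between batches
            }
        }
    }

    public static void main(String[] args) {
        List<Long> store = new ArrayList<>();
        List<Long> toDelete = new ArrayList<>();
        for (long b = 0; b < 2500; b++) { store.add(b); toDelete.add(b); }

        BatchedRemoval r = new BatchedRemoval(1000);
        r.removeAll(toDelete, store);
        // 2500 blocks with a batch size of 1000 -> 3 lock acquisitions
        System.out.println(r.lockAcquisitions + " acquisitions, " + store.size() + " left");
    }
}
```

A larger increment means fewer lock acquisitions (less per-batch overhead) but a longer continuous lock hold per batch, which is exactly the trade-off the configurable setting exposes.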
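If the patch is backported, the batch size becomes a regular setting. A minimal sketch of the entry in hdfs-site.xml, assuming the patched release reads the property at NameNode startup; the value 5000 is only an example, not a recommendation:

{code:xml}
<property>
  <name>dfs.namenode.block.deletion.increment</name>
  <value>5000</value>
  <description>Number of blocks removed per NameNode write-lock hold when
  deleting a large directory. Larger values reduce lock churn but hold the
  write lock longer per batch, which can worsen RPC latency.</description>
</property>
{code}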