[ 
https://issues.apache.org/jira/browse/HDFS-16043?focusedWorklogId=709008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709008
 ]

ASF GitHub Bot logged work on HDFS-16043:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Jan/22 12:29
            Start Date: 14/Jan/22 12:29
    Worklog Time Spent: 10m 
      Work Description: zhuxiangyi commented on pull request #3882:
URL: https://github.com/apache/hadoop/pull/3882#issuecomment-1013077777


   `[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 19.417 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized
   [ERROR] testWithKerberizedCluster(org.apache.hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized)  Time elapsed: 19.326 s  <<< ERROR!
   java.io.IOException: DestHost:destPort localhost:10476 , LocalHost:localPort 04339ec2a237/172.17.0.2:0. Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   
   [ERROR] Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 205.883 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
   [ERROR] testBalancerWithObserverWithFailedNode(org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes)  Time elapsed: 180.186 s  <<< ERROR!
   org.junit.runners.model.TestTimedOutException: test timed out after 180000 milliseconds`
   
   
   @Hexiaoqiao The build failed again, for the same reason as last time. Both 
of the tests above pass in my local build.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 709008)
    Time Spent: 11.5h  (was: 11h 20m)

> Add markedDeleteBlockScrubberThread to delete blocks asynchronously
> -------------------------------------------------------------------
>
>                 Key: HDFS-16043
>                 URL: https://issues.apache.org/jira/browse/HDFS-16043
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namanode
>    Affects Versions: 3.4.0
>            Reporter: Xiangyi Zhu
>            Assignee: Xiangyi Zhu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>         Attachments: 20210527-after.svg, 20210527-before.svg
>
>          Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Deleting a large directory caused the NN to hold the lock for too long, 
> which caused our NameNode to be killed by ZKFC.
> The flame graph shows that the main cost is the QuotaCount calculation 
> during removeBlocks(toRemovedBlocks) and inode deletion, and that 
> removeBlocks(toRemovedBlocks) accounts for the larger share of the time.
> h3. Solution:
> 1. Process removeBlocks asynchronously. A thread is started in the 
> BlockManager to process the deleted blocks and control the lock hold time.
> 2. Optimize the QuotaCount calculation, similar to the optimization in 
> HDFS-16000.
> h3. Comparison before and after optimization:
> Test: delete 1000w (10 million) inodes and 1000w (10 million) blocks.
> *before:*
> remove inode elapsed time: 7691 ms
> remove block elapsed time: 11107 ms
> *after:*
> remove inode elapsed time: 4149 ms
> remove block elapsed time: 0 ms
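
The first item of the solution above (hand the removed blocks to a background thread that deletes them in bounded chunks) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the class name, addToMarkedDeleteQueue, blocksPerLockHold and the ReentrantLock standing in for the namesystem lock are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of an asynchronous block scrubber; not the HDFS-16043 patch. */
class MarkedDeleteScrubberSketch {
  // Batches of block IDs handed over by the delete path.
  private final BlockingQueue<List<Long>> pending = new LinkedBlockingQueue<>();
  // Stand-in for the namesystem write lock (assumption for this sketch).
  private final ReentrantLock namesystemLock = new ReentrantLock();
  private final AtomicLong removed = new AtomicLong();
  private final int blocksPerLockHold;
  private volatile boolean running = true;
  private final Thread scrubber;

  MarkedDeleteScrubberSketch(int blocksPerLockHold) {
    if (blocksPerLockHold <= 0) {
      throw new IllegalArgumentException("blocksPerLockHold must be positive");
    }
    this.blocksPerLockHold = blocksPerLockHold;
    this.scrubber = new Thread(this::scrubLoop, "markedDeleteBlockScrubber-sketch");
    this.scrubber.setDaemon(true);
    this.scrubber.start();
  }

  /** Called by the delete path: only enqueues, so the caller returns quickly. */
  void addToMarkedDeleteQueue(List<Long> toRemovedBlocks) {
    pending.add(new ArrayList<>(toRemovedBlocks));
  }

  private void scrubLoop() {
    // Keep draining after stop() is requested until the queue is empty.
    while (running || !pending.isEmpty()) {
      List<Long> batch;
      try {
        batch = pending.poll(100, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      if (batch == null) {
        continue;
      }
      // Bound each lock hold to at most blocksPerLockHold blocks.
      for (int from = 0; from < batch.size(); from += blocksPerLockHold) {
        int to = Math.min(from + blocksPerLockHold, batch.size());
        namesystemLock.lock();
        try {
          for (Long blockId : batch.subList(from, to)) {
            removeBlock(blockId);
          }
        } finally {
          // Released between chunks so other operations can take the lock.
          namesystemLock.unlock();
        }
      }
    }
  }

  // Placeholder for the real per-block removal work.
  private void removeBlock(long blockId) {
    removed.incrementAndGet();
  }

  long removedCount() {
    return removed.get();
  }

  /** Requests shutdown and waits for already queued batches to be processed. */
  void stop() {
    running = false;
    try {
      scrubber.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

In this sketch the delete path pays only for the enqueue, which would be consistent with the "remove block elapsed time" dropping to 0 ms in the comparison above, while the chunk size limits how long any single lock hold can last.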



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
