[ 
https://issues.apache.org/jira/browse/HDFS-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887213#comment-13887213
 ] 

Suresh Srinivas commented on HDFS-5790:
---------------------------------------

I know that many HDFS restarts hit this issue when running jobs have many 
files open. In the past I fixed a bug where the namenode synced the edit log 
while holding the lock. Even with that fix, I see that this issue slows down 
lease recovery, and the namenode becomes unresponsive during such restarts. 
That said, I am okay with not putting this into 2.3.

> LeaseManager.findPath is very slow when many leases need recovery
> -----------------------------------------------------------------
>
>                 Key: HDFS-5790
>                 URL: https://issues.apache.org/jira/browse/HDFS-5790
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, performance
>    Affects Versions: 2.3.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 3.0.0, 2.4.0
>
>         Attachments: hdfs-5790.txt, hdfs-5790.txt
>
>
> We recently saw an issue where the NN restarted while tens of thousands of 
> files were open. The NN then spent multiple seconds on each 
> commitBlockSynchronization() call, most of it inside 
> LeaseManager.findPath(). findPath currently works by looping over all files 
> held for a given writer, and traversing the filesystem for each one. This 
> takes way too long when tens of thousands of files are open by a single 
> writer.
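
To make the cost described above concrete, here is a minimal Java sketch of the
two lookup shapes. It is not the HDFS code and not necessarily what the attached
patch does: the class LeasePathLookupSketch, its Node type, the traversals
counter, and the id-to-path index are all hypothetical names introduced for
illustration. The scan variant resolves every path held by the writer through
the namespace tree until it finds the target file; the indexed variant assumes
an extra map is maintained when a file is opened.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LeasePathLookupSketch {
    /** Hypothetical namespace node; a stand-in, not HDFS's INode class. */
    static final class Node {
        final long id;
        final Map<String, Node> children = new HashMap<>();
        Node(long id) { this.id = id; }
    }

    private final Node root = new Node(0);
    private long nextId = 1;
    // Paths held open by each writer (a simplified "lease").
    private final Map<String, List<String>> pathsByWriter = new HashMap<>();
    // Assumed extra index from file id to path, filled in when a file is opened.
    private final Map<Long, String> pathById = new HashMap<>();
    // Counts namespace traversals so the cost of the scan is visible.
    long traversals = 0;

    /** Creates the file (and any missing parents) and records it under the writer. */
    long open(String writer, String path) {
        Node cur = root;
        for (String part : path.substring(1).split("/")) {
            cur = cur.children.computeIfAbsent(part, k -> new Node(nextId++));
        }
        pathsByWriter.computeIfAbsent(writer, k -> new ArrayList<>()).add(path);
        pathById.put(cur.id, path);
        return cur.id;
    }

    /** Walks the tree component by component; one call is one traversal. */
    private Node resolve(String path) {
        traversals++;
        Node cur = root;
        for (String part : path.substring(1).split("/")) {
            cur = cur.children.get(part);
            if (cur == null) return null;
        }
        return cur;
    }

    /** Scan shape: one namespace traversal per file held by the writer. */
    String findPathByScan(String writer, long targetId) {
        for (String path : pathsByWriter.getOrDefault(writer, new ArrayList<>())) {
            Node n = resolve(path);
            if (n != null && n.id == targetId) return path;
        }
        return null;
    }

    /** Indexed shape: a single map lookup, independent of the number of open files. */
    String findPathByIndex(long targetId) {
        return pathById.get(targetId);
    }

    public static void main(String[] args) {
        LeasePathLookupSketch ns = new LeasePathLookupSketch();
        long last = -1;
        for (int i = 0; i < 20000; i++) {
            last = ns.open("writer-1", "/jobs/out/part-" + i);
        }
        System.out.println(ns.findPathByScan("writer-1", last)
                + " found after " + ns.traversals + " traversals");
        System.out.println(ns.findPathByIndex(last) + " found by index");
    }
}
```

With N files open by one writer, the scan does up to N traversals per lookup,
so recovering all N leases costs on the order of N^2 traversals. That is the
growth pattern the description attributes to findPath.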



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
