[
https://issues.apache.org/jira/browse/HADOOP-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603117#action_12603117
]
Tsz Wo (Nicholas), SZE commented on HADOOP-3507:
------------------------------------------------
I agree that the performance is worse than before when you rename a directory
containing many files that are being created, especially when the number of
such files is 1,000,000. However, is this a rare case?
> Rename of a directory with many opened files blocks name-node for a long
> time. changeLease() to blame.
> ------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-3507
> URL: https://issues.apache.org/jira/browse/HADOOP-3507
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.17.0
> Reporter: Konstantin Shvachko
> Fix For: 0.18.0
>
>
> I am creating a directory containing 200,000 files and then renaming it.
> The rename operation takes twice as long as the total time for creating all
> of those files.
> The worst thing is that the rename blocks the name-node for minutes. I tried
> it with a bigger directory containing 1 million files - it blocks for 30 minutes.
> The rename itself is fast; it's the changeLease() called after renaming that
> takes all the time.
> As I can see from the code, changeLease() gets the tailMap() of the directory
> being renamed and scans the whole tail.
> If the number of open files is large, as in my case, this takes forever because
> the tailMap includes all files in the subtree.
> Simple way to reproduce it is to run
> {code}
> NNThroughputBenchmark -op open -files N
> {code}
> with a large N. This will first create N files in directory
> "/NNThroughputBenchmark/create" and then rename it to
> "/NNThroughputBenchmark/open".
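The tail-scan behavior described in the quoted report can be sketched as
follows. This is a minimal illustration of the pattern, not the actual
FSNamesystem/LeaseManager code; the class and field names here
(`LeaseRenameSketch`, `leasesByPath`) are assumptions for the example:

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class LeaseRenameSketch {
    // Leases keyed by file path, kept sorted lexicographically.
    private final TreeMap<String, String> leasesByPath = new TreeMap<>();

    public void addLease(String path, String holder) {
        leasesByPath.put(path, holder);
    }

    public boolean hasLease(String path) {
        return leasesByPath.containsKey(path);
    }

    // Sketch of the problematic pattern: tailMap(src) returns EVERY entry
    // whose key sorts >= src, not just entries under the src directory, so
    // the loop below visits all open files that sort after src.
    public int changeLease(String src, String dst) {
        int visited = 0;
        SortedMap<String, String> tail = leasesByPath.tailMap(src);
        // Copy the keys first so we can rename entries while iterating.
        for (String path : tail.keySet().toArray(new String[0])) {
            visited++;                      // every tail entry is scanned
            if (path.startsWith(src)) {     // only these actually match
                String holder = leasesByPath.remove(path);
                leasesByPath.put(dst + path.substring(src.length()), holder);
            }
        }
        return visited;  // grows with the total number of open files
    }
}
```

Renaming a path that sorts early (such as "/NNThroughputBenchmark/create")
makes the tail view cover nearly every open file in the namespace, which is
why the scan dominates the rename cost.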