[
https://issues.apache.org/jira/browse/HDFS-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886194#action_12886194
]
Konstantin Shvachko commented on HDFS-1140:
-------------------------------------------
Todd, thanks for looking.
[Here|http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/423/testReport/]
is another report in which the same test cases also failed. I don't see the
failure on my box now either, but you can check the logs to understand what is
going on, and probably model it. The message "The directory is already locked."
means that the previous DataNode (DN) is still running or did not release the
lock on the directory.
If we could isolate TestFileAppend4 into a separate jira, then this one can be
closed.
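
For readers unfamiliar with that message, here is a minimal sketch of how a per-directory lock of this kind can behave. It is not taken from the Hadoop source: the class name DirLockSketch, the helper names, and the lock file name "in_use.lock" are assumptions made for illustration. The only point is that a second attempt to lock the same directory fails until the first holder releases it.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical sketch of an exclusive lock on a storage directory. */
final class DirLockSketch {
    /**
     * Tries to lock the directory via a lock file inside it.
     * Returns the lock, or null if someone else already holds it.
     */
    static FileLock tryLock(File dir) throws IOException {
        // Assumed lock file name; chosen for illustration only.
        File lockFile = new File(dir, "in_use.lock");
        RandomAccessFile raf = new RandomAccessFile(lockFile, "rws");
        try {
            // Returns null if another process holds the lock.
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null) {
                raf.close();
            }
            return lock;
        } catch (OverlappingFileLockException e) {
            // The lock is already held inside this JVM, e.g. by a
            // previous instance that was not shut down cleanly.
            raf.close();
            return null;
        } catch (IOException e) {
            raf.close();
            throw e;
        }
    }

    /** Releases the lock and closes the underlying lock file. */
    static void release(FileLock lock) throws IOException {
        lock.release();
        lock.channel().close();
    }
}
```

In a test that restarts a DN inside one JVM, a null result from the second tryLock call corresponds to the situation described above: the earlier instance has not yet called the equivalent of release.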
> Speedup INode.getPathComponents
> -------------------------------
>
> Key: HDFS-1140
> URL: https://issues.apache.org/jira/browse/HDFS-1140
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: name-node
> Affects Versions: 0.22.0
> Reporter: Dmytro Molkov
> Assignee: Dmytro Molkov
> Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-1140.2.patch, HDFS-1140.3.patch, HDFS-1140.4.patch,
> HDFS-1140.patch
>
>
> When the namenode loads the image, a significant amount of time is spent in
> DFSUtil.string2Bytes. We have a very specific workload here: the path that
> the namenode calls getPathComponents for shares N - 1 components with the
> previous path this method was called for (assuming the current path has N
> components).
> Hence we can improve the image load time by caching the result of the
> previous conversion.
> We thought of using a simple LRU cache for components, but in practice
> String.getBytes gets optimized at runtime and an LRU cache doesn't perform
> as well. However, keeping just the latest path components and their
> translation to bytes in two arrays gives quite a performance boost.
> I could get another 20% off the time to load the image on our cluster (30
> seconds vs. 24), and I wrote a simple benchmark that tests performance with
> and without caching.
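
The caching idea in the quoted description can be sketched roughly as follows. This is not the code from the attached patches. The class name LastPathCache, the conversions counter, splitting on "/", and UTF-8 encoding are all assumptions made for illustration. It keeps the components of the most recent path and their byte encodings in two parallel arrays, and reuses an encoding when the component at the same position is unchanged.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: cache only the last path's components. */
final class LastPathCache {
    // Components of the previously converted path.
    private String[] lastNames = new String[0];
    // Byte encodings of those components, at the same indices.
    private byte[][] lastBytes = new byte[0][];
    // Number of real String-to-bytes conversions (for demonstration).
    int conversions = 0;

    /**
     * Splits the path on "/" and returns each component as bytes.
     * Returned arrays may be shared with earlier results, so callers
     * must treat them as read-only.
     */
    byte[][] getPathComponents(String path) {
        // "/a/b" yields ["", "a", "b"]; the empty first entry stands
        // for the root in this sketch.
        String[] names = path.split("/");
        byte[][] bytes = new byte[names.length][];
        for (int i = 0; i < names.length; i++) {
            if (i < lastNames.length && names[i].equals(lastNames[i])) {
                // Same component as last time: reuse its encoding.
                bytes[i] = lastBytes[i];
            } else {
                bytes[i] = names[i].getBytes(StandardCharsets.UTF_8);
                conversions++;
            }
        }
        lastNames = names;
        lastBytes = bytes;
        return bytes;
    }
}
```

With consecutive paths such as /user/data/file1 and /user/data/file2, only the final component is converted on the second call, which matches the "shares N - 1 components" workload described above. The cache helps only when consecutive paths share components, which is the access pattern the description reports for image loading.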