Speedup INode.getPathComponents
-------------------------------
Key: HDFS-1140
URL: https://issues.apache.org/jira/browse/HDFS-1140
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Dmytro Molkov
When the namenode loads the image, a significant amount of time is spent in
DFSUtil.string2Bytes. The workload here is very specific: the path that the
namenode calls getPathComponents for shares N - 1 components with the path
from the previous call (assuming the current path has N components).
Hence we can improve the image load time by caching the result of the previous
conversion.
We considered a simple LRU cache for components, but in practice
String.getBytes gets optimized at runtime, so an LRU cache does not perform as
well. Keeping just the latest path components and their byte translations in
two arrays, however, gives quite a performance boost.
With this change I could take another 20% off the time to load the image on our
cluster (30 seconds vs 24). I also wrote a simple benchmark that tests
performance with and without caching.
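The caching idea described above can be sketched roughly as follows. This is an
illustration only, not the actual patch: the class name PathComponentCache, the
getReusedCount counter, and the simplified split on "/" are all hypothetical,
and the sketch assumes callers treat the returned byte arrays as read-only and
that it is used from a single thread.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical sketch of a "last path" cache for path components.
 * It remembers only the components of the most recent path and their
 * byte translations, in two parallel arrays.
 */
public class PathComponentCache {
    // Components of the previously converted path.
    private String[] lastComponents = new String[0];
    // UTF-8 bytes for each entry of lastComponents (same indices).
    private byte[][] lastBytes = new byte[0][];
    // Number of components served from the cache (for illustration only).
    private long reused = 0;

    /**
     * Splits the path on "/" and converts each component to bytes,
     * reusing the previous conversion wherever the component at the
     * same position is unchanged.
     */
    public byte[][] getPathComponents(String path) {
        // Simplified splitting; "/a/b" yields ["", "a", "b"].
        String[] components = path.split("/");
        byte[][] bytes = new byte[components.length][];
        for (int i = 0; i < components.length; i++) {
            if (i < lastComponents.length
                    && components[i].equals(lastComponents[i])) {
                // Same component as last time: reuse the cached bytes.
                // The array is shared, so callers must not modify it.
                bytes[i] = lastBytes[i];
                reused++;
            } else {
                bytes[i] = components[i].getBytes(StandardCharsets.UTF_8);
            }
        }
        // Remember this path for the next call.
        lastComponents = components;
        lastBytes = bytes;
        return bytes;
    }

    public long getReusedCount() {
        return reused;
    }
}
```

In this sketch, converting "/user/data/part-0" and then "/user/data/part-1"
converts only the final component on the second call; the leading components
are reused from the previous call. Sharing the cached arrays avoids copying,
at the cost of requiring that callers never mutate them.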