[
https://issues.apache.org/jira/browse/HDFS-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daryn Sharp updated HDFS-10616:
-------------------------------
Attachment: 2.6-2.7.1-heap.png
Here's an illustration of how the GC characteristics changed on a moderately
sized and lightly loaded NN (by Y! standards) when we upgraded to 2.7 early
this year. These path changes and the forthcoming IPC changes are the primary
optimizations for returning to 2.6 behavior. (Note that we still had to
increase heap sizes when upgrading to 2.7, as seen at the tail of the graph.)
> Improve performance of path handling
> ------------------------------------
>
> Key: HDFS-10616
> URL: https://issues.apache.org/jira/browse/HDFS-10616
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.0.0-alpha
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: 2.6-2.7.1-heap.png
>
>
> Path handling in the namesystem and directory is very inefficient. The path
> is repeatedly resolved, decomposed into path components, recombined into a
> full path, and parsed again throughout the system. This hurts general
> performance directly, and indirectly through unnecessary pressure on young
> gen GC. The namesystem should only operate on paths, parsing each one once
> into inodes, and the directory should only operate on inodes.
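
The "parse once, then operate on inodes" idea in the description can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the actual patch: PathSketch, Node, components, resolve and
depth are hypothetical names that do not come from the HDFS code base.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the HDFS code base.
class PathSketch {
    /** Minimal stand-in for an inode: a name plus links to children. */
    static final class Node {
        final String name;
        final Map<String, Node> children = new HashMap<>();

        Node(String name) { this.name = name; }

        /** Returns the named child, creating it if it does not exist. */
        Node child(String childName) {
            return children.computeIfAbsent(childName, Node::new);
        }
    }

    /** Splits an absolute path into its non-empty components. */
    static List<String> components(String path) {
        List<String> parts = new ArrayList<>();
        for (String p : path.split("/")) {
            if (!p.isEmpty()) {
                parts.add(p);
            }
        }
        return parts;
    }

    /**
     * "Namesystem" side: parses the path string once and resolves it to the
     * chain of nodes from the root down. Stops at the first missing component.
     */
    static List<Node> resolve(Node root, String path) {
        List<Node> resolved = new ArrayList<>();
        resolved.add(root);
        Node cur = root;
        for (String name : components(path)) {
            cur = cur.children.get(name);
            if (cur == null) {
                break;
            }
            resolved.add(cur);
        }
        return resolved;
    }

    /** "Directory" side: takes resolved nodes only, never a path string. */
    static int depth(List<Node> resolved) {
        return resolved.size() - 1;
    }

    public static void main(String[] args) {
        Node root = new Node("");
        root.child("user").child("daryn").child("file");

        // The string is parsed here, once; later calls reuse the node list.
        List<Node> nodes = resolve(root, "/user/daryn/file");
        System.out.println(depth(nodes));
        System.out.println(nodes.get(nodes.size() - 1).name);
    }
}
```

The point of the split is that only resolve() touches the path string, so
later operations do not re-split or rebuild it and allocate no further
short-lived strings.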
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)