[
https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900910#action_12900910
]
Suresh Srinivas commented on HDFS-1052:
---------------------------------------
Ryan, this is already discussed in the proposal. Let me summarize:
# Increasing the namenode heap does not increase the namenode throughput.
# Currently the NN takes 30 mins to start up with a 50G heap. The startup time
would go to 2.5 hrs. There are a couple of jiras improving the NN startup time.
Even with those, startup time would be > 1 hr for such a large heap.
# While debugging memory leaks in the NN, I could not get a lot of tools to work
with a 40G heap dump, especially jhat. I am not sure how well the tools can
support a 250G heap dump.
# This solution does not work for installations where the NN needs to support
more than 4x scaling. This is needed in clusters that might want to store
smaller files instead of depending on large files to reduce object count.
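The startup figures in the second point are consistent with startup time growing roughly in proportion to heap size. A minimal sketch of that arithmetic follows, assuming linear scaling and taking 250G (the heap dump size mentioned in the third point) as the target heap. Both are assumptions rather than measurements, and the function name is hypothetical:

```python
def estimated_startup_minutes(heap_gb, baseline_heap_gb=50.0, baseline_minutes=30.0):
    """Estimate NN startup time, assuming it scales linearly with heap size.

    The baseline (30 minutes at 50G) comes from the comment above; the linear
    model itself is an assumption, not a measured result.
    """
    return baseline_minutes * (heap_gb / baseline_heap_gb)


# Assumed target heap of 250G, i.e. 5x the 50G baseline.
minutes = estimated_startup_minutes(250)
print(f"{minutes / 60:.1f} hrs")  # 2.5 hrs
```

Under this model a 5x larger heap gives 5 x 30 = 150 minutes, which matches the 2.5 hrs quoted above.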
The solution proposed here does not preclude one from using a single namenode
and vertically scaling it.
I am also curious about your experience with, and the challenges of, running a
namenode with such a large heap. We could have that discussion offline.
> HDFS scalability with multiple namenodes
> ----------------------------------------
>
> Key: HDFS-1052
> URL: https://issues.apache.org/jira/browse/HDFS-1052
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: name-node
> Affects Versions: 0.22.0
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
> Attachments: Block pool proposal.pdf, Mulitple Namespaces5.pdf
>
>
> HDFS currently uses a single namenode that limits scalability of the cluster.
> This jira proposes an architecture to scale the nameservice horizontally
> using multiple namenodes.