Aaron T. Myers updated HDFS-4465:
---------------------------------
Attachment: HDFS-4465.patch
Here's an updated patch which should address all of your feedback, Suresh.
# Good thinking. I did some back-of-the-envelope math which suggested that
even 1% was probably higher than necessary for a typical DN, so I switched
this to 0.5%.
# Per previous discussion, left it extending Block and added a comment.
# Good thinking. I moved the parsing code into a separate static function and
added a test for it (a rough sketch of the shape of that refactor follows
this list).
# In my testing with a DN hosting ~1MM blocks, this patch takes each replica
from using ~635 bytes to ~250 bytes, so about a 2.5x improvement. At that
block count, that's roughly 635 MB of heap down to roughly 250 MB, saving
close to 385 MB.
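For anyone following along, here's a rough sketch of the kind of refactor
described in item 3. To be clear, the class, method, and directory layout
below are illustrative assumptions on my part, not the actual names from the
patch; the point is just that a static, side-effect-free parsing function is
trivially unit-testable:
{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;
import static org.junit.Assert.assertArrayEquals;

// Illustrative sketch only: names and layout are hypothetical, not the
// ones in the patch.
public class TestReplicaDirParsing {

  private static final Pattern SUB_DIRS =
      Pattern.compile("subdir(\\d+)/subdir(\\d+)$");

  /**
   * Parses the trailing "subdirN/subdirM" components of a replica's
   * directory path into their integer indices, so callers can keep two
   * ints rather than the full path string.
   */
  static int[] parseSubDirs(String dirPath) {
    Matcher m = SUB_DIRS.matcher(dirPath);
    if (!m.find()) {
      throw new IllegalArgumentException("unexpected dir layout: " + dirPath);
    }
    return new int[] { Integer.parseInt(m.group(1)),
                       Integer.parseInt(m.group(2)) };
  }

  // Because the function is static and side-effect free, the test needs no
  // ReplicaInfo, no datanode, and no filesystem.
  @Test
  public void testParseSubDirs() {
    int[] ids = parseSubDirs(
        "/data/1/dfs/dn/current/finalized/subdir12/subdir34");
    assertArrayEquals(new int[] { 12, 34 }, ids);
  }
}
{code}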
Note that to address the findbugs warning I had to add an exception to the
findbugs exclude file, since in this patch I am very deliberately using the
String(String) constructor so as to trim the underlying char[] array.
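For context on why the String(String) call is deliberate: on the JDKs we
target (JDK 6, and anything before 7u6), String.substring() returns a view
that shares the parent string's backing char[], so a short suffix can pin a
much longer path in memory. A minimal sketch of the effect, with an
illustrative path rather than one from the patch:
{code:java}
// Pre-JDK-7u6 String semantics: substring() shares the parent's backing
// char[] (an offset/length view), so this small suffix keeps the entire
// path's char[] alive for as long as the suffix is retained.
String fullPath = "/data/1/dfs/dn/current/finalized/subdir12/subdir34";
String suffix = fullPath.substring(fullPath.lastIndexOf('/') + 1);

// new String(String) copies the characters into a right-sized char[]
// whenever the backing array is oversized, detaching the suffix from the
// large shared array. FindBugs flags this as DM_STRING_CTOR ("inefficient
// new String(String) constructor"), hence the exclude-file entry.
String trimmed = new String(suffix);
{code}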
> Optimize datanode ReplicasMap and ReplicaInfo
> ---------------------------------------------
>
> Key: HDFS-4465
> URL: https://issues.apache.org/jira/browse/HDFS-4465
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.0.5-alpha
> Reporter: Suresh Srinivas
> Assignee: Aaron T. Myers
> Attachments: dn-memory-improvements.patch, HDFS-4465.patch,
> HDFS-4465.patch
>
>
> In Hadoop, a lot of optimization has been done to make the namenode data
> structures memory efficient. Similar optimizations are necessary for the
> datanode process. With the growth in storage per datanode and in the number
> of blocks hosted on each datanode, this jira intends to optimize the
> long-lived ReplicasMap and ReplicaInfo objects.