Aaron T. Myers updated HDFS-4465:
---------------------------------
Attachment: HDFS-4465.patch
Here's an updated patch which should address all of your feedback, Suresh.
# Good thinking. I did some back-of-the-envelope math which suggested that
even 1% was probably higher than necessary for a typical DN, so I switched
this to 0.5%.
# Per previous discussion, left it extending Block and added a comment.
# Good thinking. I moved the parsing code into a separate static function and
added a test for it (a rough sketch of the shape of that refactor follows
this list).
# In my testing with a DN hosting ~1MM blocks, this patch takes each replica
from using ~635 bytes to ~250 bytes, so about a 2.5x improvement. At that
block count, that's roughly 635 MB of heap down to roughly 250 MB, saving
close to 385 MB.
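For anyone following along, here's a rough sketch of the kind of refactor
described in item 3. To be clear, the class, method, and directory layout
below are illustrative assumptions on my part, not the actual names from the
patch; the point is just that a static, side-effect-free parsing function is
trivially unit-testable:
{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;
import static org.junit.Assert.assertArrayEquals;

// Illustrative sketch only: names and layout are hypothetical, not the
// ones in the patch.
public class TestReplicaDirParsing {

  private static final Pattern SUB_DIRS =
      Pattern.compile("subdir(\\d+)/subdir(\\d+)$");

  /**
   * Parses the trailing "subdirN/subdirM" components of a replica's
   * directory path into their integer indices, so callers can keep two
   * ints rather than the full path string.
   */
  static int[] parseSubDirs(String dirPath) {
    Matcher m = SUB_DIRS.matcher(dirPath);
    if (!m.find()) {
      throw new IllegalArgumentException("unexpected dir layout: " + dirPath);
    }
    return new int[] { Integer.parseInt(m.group(1)),
                       Integer.parseInt(m.group(2)) };
  }

  // Because the function is static and side-effect free, the test needs no
  // ReplicaInfo, no datanode, and no filesystem.
  @Test
  public void testParseSubDirs() {
    int[] ids = parseSubDirs(
        "/data/1/dfs/dn/current/finalized/subdir12/subdir34");
    assertArrayEquals(new int[] { 12, 34 }, ids);
  }
}
{code}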
Note that to address the findbugs warning I had to add an exception to the
findbugs exclude file, since in this patch I am very deliberately using the
String(String) constructor so as to trim the underlying char[] array.
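For context on why the String(String) call is deliberate: on the JDKs we
target (JDK 6, and anything before 7u6), String.substring() returns a view
that shares the parent string's backing char[], so a short suffix can pin a
much longer path in memory. A minimal sketch of the effect, with an
illustrative path rather than one from the patch:
{code:java}
// Pre-JDK-7u6 String semantics: substring() shares the parent's backing
// char[] (an offset/length view), so this small suffix keeps the entire
// path's char[] alive for as long as the suffix is retained.
String fullPath = "/data/1/dfs/dn/current/finalized/subdir12/subdir34";
String suffix = fullPath.substring(fullPath.lastIndexOf('/') + 1);

// new String(String) copies the characters into a right-sized char[]
// whenever the backing array is oversized, detaching the suffix from the
// large shared array. FindBugs flags this as DM_STRING_CTOR ("inefficient
// new String(String) constructor"), hence the exclude-file entry.
String trimmed = new String(suffix);
{code}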
> Optimize datanode ReplicasMap and ReplicaInfo
> ---------------------------------------------
>
> Key: HDFS-4465
> URL: https://issues.apache.org/jira/browse/HDFS-4465
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.0.5-alpha
> Reporter: Suresh Srinivas
> Assignee: Aaron T. Myers
> Attachments: dn-memory-improvements.patch, HDFS-4465.patch,
> HDFS-4465.patch
>
>
> In Hadoop, a lot of optimization has been done to make the namenode data
> structures memory efficient. Similar optimizations are necessary for the
> datanode process. With the growth in storage per datanode and in the number
> of blocks hosted on each datanode, this jira intends to optimize the
> long-lived ReplicasMap and ReplicaInfo objects.