[ 
https://issues.apache.org/jira/browse/HDFS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875211#action_12875211
 ] 

Suresh Srinivas commented on HDFS-1114:
---------------------------------------

# Not using a supplemental hash function will result in severe clustering when 
we move to sequential block IDs (as only the higher bits are used for the hash).
# Why do we need to make the choice between java HashMap and this new 
implementation configurable? 
#* With the new impl, BlockInfo implements the LinkedElement interface. On 
switching to java HashMap, would it continue to implement this interface and 
incur the cost of the {{next}} member in BlockInfo?
# In the "Arrays" section, the description of GC behavior is not clear. How is 
the GC behavior better with arrays?
# A static array size for the map simplifies the code, but pushes complexity to 
the cluster admin by adding one more configuration. This configuration is an 
internal implementation detail which a cluster admin may not understand or get 
right. If it is configured wrong and the cluster continues to work, the cluster 
admin may not be aware of the performance degradation.
# I feel we should implement resizing to avoid introducing a config param. 
Resizing is a rare event on a stable cluster. The NN has enough heap head room 
to account for floating garbage and the YG (young generation) guarantee. Hence 
availability of memory should not be an issue. In the worst case, a resize may 
trigger a full GC. 
# If we implement resizing, we should also think carefully about a 2^N table 
size, as it has the potential to waste a lot of memory during doubling, 
especially considering the millions of entries in the table.
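
To illustrate the clustering concern in the first point, here is a minimal 
sketch. It assumes a hypothetical scheme in which the table index is taken from 
the high bits of the hash code, and it uses a multiplicative (Fibonacci) mixing 
step as one possible supplemental hash. The class name, method names and 
TABLE_BITS are made up for illustration and are not taken from the patch.

```java
import java.util.HashSet;
import java.util.Set;

public class SupplementalHashSketch {
    // Hypothetical table of 2^10 buckets.
    static final int TABLE_BITS = 10;

    // Assumed scheme: fold the long ID to an int (same formula as
    // Long.hashCode) and use only the high bits as the bucket index.
    static int indexWithoutMixing(long blockId) {
        int h = (int) (blockId ^ (blockId >>> 32));
        return h >>> (32 - TABLE_BITS);
    }

    // Same scheme, but with a multiplicative mixing step first, so that
    // differences in the low bits propagate into the high bits.
    static int indexWithMixing(long blockId) {
        int h = (int) (blockId ^ (blockId >>> 32));
        h *= 0x9E3779B9; // 2^32 / golden ratio; overflow wraps by design
        return h >>> (32 - TABLE_BITS);
    }

    // Counts how many distinct buckets the sequential IDs 1..n occupy.
    static int distinctBuckets(boolean mix, int n) {
        Set<Integer> used = new HashSet<>();
        for (long id = 1; id <= n; id++) {
            used.add(mix ? indexWithMixing(id) : indexWithoutMixing(id));
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println("without mixing: " + distinctBuckets(false, 1000));
        System.out.println("with mixing:    " + distinctBuckets(true, 1000));
    }
}
```

Under these assumptions, the sequential IDs 1..1000 all fall into a single 
bucket without mixing, while the mixed version spreads them over many buckets.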


> Reducing NameNode memory usage by an alternate hash table
> ---------------------------------------------------------
>
>                 Key: HDFS-1114
>                 URL: https://issues.apache.org/jira/browse/HDFS-1114
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: GSet20100525.pdf
>
>
> NameNode uses a java.util.HashMap to store BlockInfo objects.  When there are 
> many blocks in HDFS, this map uses a lot of memory in the NameNode.  We may 
> optimize the memory usage by a light weight hash table implementation.
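
As a rough sketch of what a light weight hash table of this kind could look 
like: the elements carry their own {{next}} reference, so the table needs no 
per-entry wrapper objects. The LinkedElement name comes from the comment above, 
but its methods, the Element class (standing in for BlockInfo) and the fixed 
capacity are assumptions for illustration, not the attached design.

```java
public class IntrusiveSetSketch {
    /** Hypothetical shape of the LinkedElement interface. */
    public interface LinkedElement {
        LinkedElement getNext();
        void setNext(LinkedElement next);
    }

    /** Hypothetical element type; stands in for BlockInfo. */
    public static final class Element implements LinkedElement {
        public final long id;
        private LinkedElement next; // chain pointer stored in the element

        public Element(long id) { this.id = id; }
        public LinkedElement getNext() { return next; }
        public void setNext(LinkedElement next) { this.next = next; }
    }

    // Fixed-size bucket array; no resizing in this sketch.
    private final LinkedElement[] buckets;
    private int size;

    public IntrusiveSetSketch(int capacity) {
        buckets = new LinkedElement[capacity];
    }

    private int indexOf(long id) {
        int h = (int) (id ^ (id >>> 32));
        return (h & 0x7fffffff) % buckets.length;
    }

    /** Walks the chain of the bucket; returns null if the id is absent. */
    public Element get(long id) {
        for (LinkedElement e = buckets[indexOf(id)]; e != null; e = e.getNext()) {
            Element el = (Element) e;
            if (el.id == id) {
                return el;
            }
        }
        return null;
    }

    /** Inserts at the head of the chain; assumes the id is not yet present. */
    public void put(Element e) {
        int i = indexOf(e.id);
        e.setNext(buckets[i]);
        buckets[i] = e;
        size++;
    }

    public int size() { return size; }

    public static void main(String[] args) {
        IntrusiveSetSketch set = new IntrusiveSetSketch(4);
        for (long id = 0; id < 10; id++) {
            set.put(new Element(id));
        }
        System.out.println(set.size() + " elements, get(7) found: "
            + (set.get(7L) != null));
    }
}
```

Compared with java.util.HashMap, the saving in such a design would come from 
dropping the separate entry object per element, at the cost of one reference 
field in every element.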

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
