[ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380725#comment-14380725
 ] 

Daryn Sharp commented on HDFS-6658:
-----------------------------------

bq. It would be nice if this could help with the goals of HDFS-7836 ... Right 
now I don't see a path from this patch to there but very possibly I'm missing 
something.

I think if anything this patch will make your goals easier to achieve due to 
better abstractions.

Currently the block control logic is diffused throughout the nodes, storages, 
blockinfos, BM, etc.  With this patch, the BM becomes the focal control object 
for all block manipulations, and the storages, blockinfos, etc. become dumb 
model objects.

The BM isn't really aware of the special data structures, which are hidden from 
it behind the BlocksMap's storage & block iterators.  In fact the rest of the 
BlocksMap also relies on those iterators to hide the implementation details.  
It's not until you go into the BlocksMap's BlockReplicaMap that things get 
interesting.

If I can clear up a few dependency issues with the storage/block iterators, 
moving them into BlockReplicaMap should make the changes invisible to everything 
outside it.  At that point it should be much easier to swap in an impl that 
meets your needs, assuming we can't evolve the new data structures to be 
thread-safe.
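
As a rough illustration of the iterator point (a minimal sketch, not code from 
the patch): ReplicaStore, SimpleReplicaStore and ReplicaCounter are hypothetical 
names, and string storage IDs stand in for the real storage objects.  The caller 
only sees iterators, so the backing structure can be replaced without touching 
it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical interface: the only view of replica storage that callers get.
interface ReplicaStore {
    void addReplica(long blockId, String storageId);
    Iterator<String> storagesOf(long blockId);   // storages holding a block
    Iterator<Long> blocksOn(String storageId);   // blocks held by a storage
}

// One possible backing implementation (assumed for illustration only).
// Another impl, e.g. a thread-safe one, could replace it behind the interface.
final class SimpleReplicaStore implements ReplicaStore {
    private final Map<Long, List<String>> byBlock = new HashMap<>();
    private final Map<String, List<Long>> byStorage = new HashMap<>();

    @Override
    public void addReplica(long blockId, String storageId) {
        byBlock.computeIfAbsent(blockId, k -> new ArrayList<>()).add(storageId);
        byStorage.computeIfAbsent(storageId, k -> new ArrayList<>()).add(blockId);
    }

    @Override
    public Iterator<String> storagesOf(long blockId) {
        // Read-only view so callers cannot mutate the internal lists.
        return Collections.unmodifiableList(
            byBlock.getOrDefault(blockId, Collections.emptyList())).iterator();
    }

    @Override
    public Iterator<Long> blocksOn(String storageId) {
        return Collections.unmodifiableList(
            byStorage.getOrDefault(storageId, Collections.emptyList())).iterator();
    }
}

// Stand-in for a caller such as the BM: it depends only on the interface.
final class ReplicaCounter {
    static int countReplicas(ReplicaStore store, long blockId) {
        int n = 0;
        for (Iterator<String> it = store.storagesOf(blockId); it.hasNext(); it.next()) {
            n++;
        }
        return n;
    }
}
```

Because ReplicaCounter never names SimpleReplicaStore, swapping the impl is a 
one-line change at construction time.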

> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Daryn Sharp
>         Attachments: BlockListOptimizationComparison.xlsx, BlocksMap 
> redesign.pdf, HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch, Namenode 
> Memory Optimizations - Block replicas list.docx, New primative indexes.jpg, 
> Old triplets.jpg
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a 
> linked list of block references for every DatanodeStorageInfo (called 
> "triplets"). 
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the 
> memory needed for every block replica (when compressed oops is disabled) and 
> in our new design the list overhead will be per DatanodeStorageInfo and not 
> per block replica.
> See the attached design doc for details and evaluation results.
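
The general idea of replacing object references with primitive integer indexes 
can be sketched as below.  This is a simplified illustration, not the layout 
from the attached design doc: IndexedReplicaList and its fields are hypothetical, 
and unlike the proposed design it still keeps one "next" int per replica.  It 
only shows how a per-storage linked list can live in int arrays.

```java
import java.util.Arrays;

// Hypothetical sketch: per-storage singly linked lists stored in int arrays.
final class IndexedReplicaList {
    private static final int NIL = -1;

    private int[] blockIndex;  // slot -> index of the block held in that slot
    private int[] next;        // slot -> next slot in the same storage's list, or NIL
    private final int[] head;  // storage index -> first slot of its list, or NIL
    private int size;          // number of slots in use

    IndexedReplicaList(int storageCount, int initialCapacity) {
        blockIndex = new int[Math.max(1, initialCapacity)];
        next = new int[blockIndex.length];
        head = new int[storageCount];
        Arrays.fill(head, NIL);
    }

    // Record that `storage` holds a replica of `block` (pushed at the list head).
    void add(int storage, int block) {
        if (size == blockIndex.length) {
            // Grow both arrays together so slot numbers stay aligned.
            blockIndex = Arrays.copyOf(blockIndex, size * 2);
            next = Arrays.copyOf(next, size * 2);
        }
        blockIndex[size] = block;
        next[size] = head[storage];
        head[storage] = size++;
    }

    // Block indexes held by `storage`, most recently added first.
    int[] blocksOn(int storage) {
        int count = 0;
        for (int s = head[storage]; s != NIL; s = next[s]) {
            count++;
        }
        int[] out = new int[count];
        int i = 0;
        for (int s = head[storage]; s != NIL; s = next[s]) {
            out[i++] = blockIndex[s];
        }
        return out;
    }
}
```

Each link here is a 4-byte int regardless of JVM pointer size, which is where 
the saving over object references comes from when compressed oops is disabled.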



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
