[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349489#comment-14349489 ]
Daryn Sharp commented on HDFS-6658:
-----------------------------------
No worries, discussion is good. Will be even better if I can manage to get the
patch up today/tomorrow.
I forgot to add that the #1 complication in removing all forms of back
reference from storage to block is full block reports.
No disagreement that we can do better than the current naive (scalability-wise)
designs of the balancer, decomm, and block reports. My goal is an initial impl
with minimal changes to use the new data structures.
I did experiment with scanning the blocks map last fall. I don't remember the
exact slowdown, but it was abysmal. Even with a mere 60 million blocks, I
gave up waiting for it to start after 20-30 minutes. I thought about
incrementally cycling through the blocks, but I quickly realized that the
bookkeeping and consistency concerns would be a rabbit hole that I could not
spend time on and that nobody would be likely to review in a timely fashion.
> Namenode memory optimization - Block replicas list
> ---------------------------------------------------
>
> Key: HDFS-6658
> URL: https://issues.apache.org/jira/browse/HDFS-6658
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.4.1
> Reporter: Amir Langer
> Assignee: Daryn Sharp
> Attachments: BlockListOptimizationComparison.xlsx, BlocksMap
> redesign.pdf, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas
> list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a
> linked list of block references for every DatanodeStorageInfo (called
> "triplets").
> We propose to change the way we store the list in memory.
> Using primitive integer indexes instead of object references will reduce the
> memory needed for every block replica (when compressed oops is disabled) and
> in our new design the list overhead will be per DatanodeStorageInfo and not
> per block replica.
> See the attached design doc for details and evaluation results.
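
As a rough illustration of the idea in the description above (primitive
integer indexes in place of object references, with the list overhead paid
once per storage rather than once per replica), here is a minimal Java sketch.
It is not the attached patch or the design doc's layout: the class name
StorageReplicaList, the notion of an int "block index" into some global block
table, and the linear-scan removal are all assumptions made for illustration.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch only: one growable int[] per storage holds block
 * indexes, so each replica costs a 4-byte int instead of object references.
 * This is NOT the HDFS-6658 implementation.
 */
final class StorageReplicaList {
    // Indexes into an assumed global block table (not modeled here).
    private int[] blockIndexes;
    private int size;

    StorageReplicaList(int initialCapacity) {
        blockIndexes = new int[Math.max(1, initialCapacity)];
    }

    /** Appends a block index, doubling the backing array when it is full. */
    void add(int blockIndex) {
        if (size == blockIndexes.length) {
            blockIndexes = Arrays.copyOf(blockIndexes, size * 2);
        }
        blockIndexes[size++] = blockIndex;
    }

    /**
     * Removes one occurrence by overwriting it with the last element.
     * Order is not preserved, and the lookup is a linear scan (a
     * simplification; a real design would need a faster way to locate it).
     */
    boolean remove(int blockIndex) {
        for (int i = 0; i < size; i++) {
            if (blockIndexes[i] == blockIndex) {
                blockIndexes[i] = blockIndexes[--size];
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }

    int get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return blockIndexes[i];
    }
}
```

In this sketch the only per-storage overhead is one array header plus unused
capacity. Iterating a storage's replicas, as a full block report needs to do,
is a walk over that one array.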