[ https://issues.apache.org/jira/browse/HDFS-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647349#comment-14647349 ]

Yi Liu commented on HDFS-8792:
------------------------------

[~arpitagarwal], sorry for the late response.

{quote}
do you have any estimates of the memory saved by using LightWeightHashSet?
{quote}
Yes, compared to the Java {{HashSet}}, there are two advantages from a memory 
point of view (see the sketch below):
# Java {{HashSet}} internally wraps a {{HashMap}}, so each entry carries one 
extra reference (4 bytes) compared to {{LightWeightHashSet}}, which means we 
save {{4 * size}} bytes of memory.
# {{LightWeightHashSet}} shrinks its internal array significantly when the 
number of elements drops, so memory is reclaimed.
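
For illustration, here is a rough sketch of the two per-entry layouts. The 
field names are simplified from the JDK and Hadoop sources, not exact copies:

{code:java}
// java.util.HashSet delegates to a HashMap, so each element lives in a
// map node that also carries a dummy value reference (the shared PRESENT
// object), costing one extra reference (~4 bytes) per entry:
class HashMapNode<E> {
  final int hash;
  final E key;            // the set element
  Object value;           // unused dummy -> the wasted reference
  HashMapNode<E> next;
}

// LightWeightHashSet stores the element directly in its own entry,
// with no value slot:
class LinkedElement<E> {
  final E element;
  final int hashCode;     // cached hash of the element
  LinkedElement<E> next;
}
{code}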

So we can see {{LightWeightHashSet}} is clearly better. The main issue is that 
{{LightWeightHashSet#LinkedSetIterator}} doesn't currently support {{remove}}, 
but it's easy to add (similar to the Java {{HashSet}} iterator), as sketched 
below. By the way, currently in Hadoop we use {{LightWeightHashSet}} for all 
big objects that require a hash set, except this one, which needs {{remove}}.
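
To show the pattern, here is a minimal, self-contained toy set whose iterator 
supports the fail-fast {{remove}} in the style of {{java.util.HashSet}}. All 
names here ({{ToyLinkedSet}}, {{Entry}}, {{modification}}, etc.) are 
illustrative assumptions, not the actual {{LightWeightHashSet}} internals:

{code:java}
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Toy linked-bucket set showing the fail-fast iterator remove() pattern
 *  of java.util.HashSet. All names are illustrative; this is not the
 *  actual LightWeightHashSet code. */
public class ToyLinkedSet<T> implements Iterable<T> {
  private static final int NUM_BUCKETS = 16;

  private static class Entry<T> {
    final T element;
    Entry<T> next;
    Entry(T element, Entry<T> next) { this.element = element; this.next = next; }
  }

  @SuppressWarnings("unchecked")
  private final Entry<T>[] buckets = new Entry[NUM_BUCKETS];
  private int modification = 0;  // bumped on every structural change

  private int bucketOf(Object o) { return (o.hashCode() & 0x7fffffff) % NUM_BUCKETS; }

  public boolean add(T element) {
    int b = bucketOf(element);
    for (Entry<T> e = buckets[b]; e != null; e = e.next) {
      if (e.element.equals(element)) return false;  // already present
    }
    buckets[b] = new Entry<>(element, buckets[b]);
    modification++;
    return true;
  }

  public boolean remove(Object element) {
    int b = bucketOf(element);
    for (Entry<T> e = buckets[b], prev = null; e != null; prev = e, e = e.next) {
      if (e.element.equals(element)) {
        if (prev == null) buckets[b] = e.next; else prev.next = e.next;
        modification++;
        return true;
      }
    }
    return false;
  }

  @Override
  public Iterator<T> iterator() {
    return new Iterator<T>() {
      private int expectedModification = modification;
      private int bucket = -1;
      private Entry<T> current = null;
      private Entry<T> nextEntry = advance();

      // Next entry: first along the current chain, then in later buckets.
      private Entry<T> advance() {
        Entry<T> e = (nextEntry == null) ? null : nextEntry.next;
        while (e == null && ++bucket < NUM_BUCKETS) e = buckets[bucket];
        return e;
      }

      @Override public boolean hasNext() { return nextEntry != null; }

      @Override public T next() {
        if (modification != expectedModification) throw new ConcurrentModificationException();
        if (nextEntry == null) throw new NoSuchElementException();
        current = nextEntry;
        nextEntry = advance();
        return current.element;
      }

      // The piece LinkedSetIterator is missing: delegate to the outer
      // remove(), then re-sync the expected modification count.
      @Override public void remove() {
        if (current == null) throw new IllegalStateException();
        if (modification != expectedModification) throw new ConcurrentModificationException();
        ToyLinkedSet.this.remove(current.element);  // bumps modification
        expectedModification = modification;        // iterator stays valid
        current = null;
      }
    };
  }
}
{code}

With this in place, a caller can prune entries while scanning, e.g. 
{{for (Iterator<T> it = set.iterator(); it.hasNext();) { if (stale(it.next())) it.remove(); } }}, 
which is exactly the usage {{postponedMisreplicatedBlocks}} needs.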

> Use LightWeightHashSet for BlockManager#postponedMisreplicatedBlocks
> --------------------------------------------------------------------
>
>                 Key: HDFS-8792
>                 URL: https://issues.apache.org/jira/browse/HDFS-8792
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>         Attachments: HDFS-8792.001.patch
>
>
> {{LightWeightHashSet}} requires less memory than the Java {{HashSet}}.
> Furthermore, for {{excessReplicateMap}}, we can use {{HashMap}} instead of
> {{TreeMap}}, since there is no need for sorting.


