[
https://issues.apache.org/jira/browse/HDFS-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647349#comment-14647349
]
Yi Liu commented on HDFS-8792:
------------------------------
[~arpitagarwal], sorry for the late response.
{quote}
do you have any estimates of the memory saved by using LightWeightHashSet?
{quote}
Yes, compared to the Java {{HashSet}}, there are two advantages from a memory
point of view:
# Java {{HashSet}} internally wraps a {{HashMap}}, so each entry carries one
extra reference (4 bytes) compared to {{LightWeightHashSet}}; that saves
{{4 * size}} bytes of memory.
# In {{LightWeightHashSet}}, when the number of elements drops, the internal
array is shrunk significantly.
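To make the first point concrete, here is a back-of-envelope sketch (not a measurement) of the saving from the extra per-entry reference; the 4-byte reference size assumes compressed oops, and the class and block count are illustrative, not from the patch:

```java
// Rough illustration of the memory saved by dropping one object reference
// per entry, as when replacing java.util.HashSet (backed by HashMap) with
// an open-coded set such as LightWeightHashSet.
public class SetOverheadEstimate {
    // Reference size with compressed oops (an assumption; 8 bytes otherwise).
    static final int REF_BYTES = 4;

    // Bytes saved purely from the removed per-entry reference.
    static long bytesSaved(long numEntries) {
        return REF_BYTES * numEntries;
    }

    public static void main(String[] args) {
        // e.g. 1 million postponed blocks -> ~4 MB saved from that field alone
        System.out.println(bytesSaved(1_000_000L)); // prints 4000000
    }
}
```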
So we can see {{LightWeightHashSet}} is clearly better. The main issue is that
{{LightWeightHashSet#LinkedSetIterator}} doesn't currently support {{remove}},
but it's easy to add (similar to the Java {{HashSet}} iterator). By the way, in
Hadoop we currently use {{LightWeightHashSet}} for all large objects that
require a hash set, except this one, which needs {{remove}}.
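To illustrate the kind of iterator {{remove}} support being proposed, here is a minimal, self-contained sketch of a linked-bucket hash set whose iterator removes the last-returned element through the set itself, in the same spirit as {{java.util.HashSet}}'s iterator; this is a hypothetical stand-in, not Hadoop's actual {{LightWeightHashSet}} code:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Toy linked-bucket hash set (NOT Hadoop's LightWeightHashSet), showing how
// an iterator can support remove() by delegating to the enclosing set.
public class TinyLinkedHashSet<T> implements Iterable<T> {
    private static class Entry<T> {
        final T value;
        Entry<T> next;
        Entry(T value, Entry<T> next) { this.value = value; this.next = next; }
    }

    private final Entry<T>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    public TinyLinkedHashSet(int capacity) {
        buckets = (Entry<T>[]) new Entry[capacity];
    }

    private int index(Object o) {
        return (o.hashCode() & 0x7fffffff) % buckets.length;
    }

    public boolean add(T value) {
        int i = index(value);
        for (Entry<T> e = buckets[i]; e != null; e = e.next) {
            if (e.value.equals(value)) return false;
        }
        buckets[i] = new Entry<>(value, buckets[i]);  // insert at bucket head
        size++;
        return true;
    }

    public boolean contains(Object value) {
        for (Entry<T> e = buckets[index(value)]; e != null; e = e.next) {
            if (e.value.equals(value)) return true;
        }
        return false;
    }

    public boolean remove(Object value) {
        int i = index(value);
        Entry<T> prev = null;
        for (Entry<T> e = buckets[i]; e != null; prev = e, e = e.next) {
            if (e.value.equals(value)) {
                if (prev == null) buckets[i] = e.next; else prev.next = e.next;
                size--;
                return true;
            }
        }
        return false;
    }

    public int size() { return size; }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int bucket = 0;
            private Entry<T> next = firstEntry();
            private T lastReturned;  // needed so remove() knows what to unlink

            private Entry<T> firstEntry() {
                while (bucket < buckets.length && buckets[bucket] == null) bucket++;
                return bucket < buckets.length ? buckets[bucket] : null;
            }

            public boolean hasNext() { return next != null; }

            public T next() {
                if (next == null) throw new NoSuchElementException();
                lastReturned = next.value;
                if (next.next != null) {
                    next = next.next;       // stay within the current bucket
                } else {
                    bucket++;
                    next = firstEntry();    // advance to the next bucket
                }
                return lastReturned;
            }

            public void remove() {
                if (lastReturned == null) throw new IllegalStateException();
                // Delegate to the set, like HashSet's iterator delegates to
                // its backing map; the cursor already points past the entry.
                TinyLinkedHashSet.this.remove(lastReturned);
                lastReturned = null;
            }
        };
    }
}
```

Because the cursor always points past the last-returned entry before {{remove}} runs, unlinking that entry never invalidates the iteration position (assuming, as here, that removal does not rehash).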
> Use LightWeightHashSet for BlockManager#postponedMisreplicatedBlocks
> --------------------------------------------------------------------
>
> Key: HDFS-8792
> URL: https://issues.apache.org/jira/browse/HDFS-8792
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Yi Liu
> Assignee: Yi Liu
> Attachments: HDFS-8792.001.patch
>
>
> {{LightWeightHashSet}} requires less memory than the Java {{HashSet}}.
> Furthermore, for {{excessReplicateMap}}, we can use a {{HashMap}} instead of
> a {{TreeMap}}, since there is no need to sort.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)