[ 
https://issues.apache.org/jira/browse/HDFS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879975#action_12879975
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1114:
----------------------------------------------

Ran some benchmarks.  When the modulus is large, which means that the number 
of collisions is small, LightWeightGSet is much better than GSetByHashMap.  
With the smallest modulus (1025), the two are close.

|| datasize || modulus || GSetByHashMap || LightWeightGSet ||
| 65536 | 1025 | 219 | 234 |
| 65536 | 1048577 | 516 | 296|
| 65536 | 1073741825 | 500 | 281|
| 262144 | 1025 | 1422 | 1531|
| 262144 | 1048577 | 3078 | 2156|
| 262144 | 1073741825 | 3094 | 2281|
| 1048576 | 1025 | 7172 | 7313|
| 1048576 | 1048577 | 13531 | 9844|
| 1048576 | 1073741825 | 14485 | 10718|
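
To illustrate why a larger modulus means fewer collisions, here is a minimal 
sketch.  It assumes each element's hash code is its id reduced by the modulus 
and that ids run sequentially from 0; both are assumptions made for 
illustration, and the class and method names are hypothetical, not taken from 
the patch or the actual benchmark.

```java
import java.util.HashSet;
import java.util.Set;

public class ModulusCollisions {
    /** Hypothetical hash function: the id reduced by the modulus (an assumption). */
    static int hashOf(int id, int modulus) {
        return id % modulus;
    }

    /** Counts how many distinct hash values the ids 0..datasize-1 produce. */
    static int distinctHashes(int datasize, int modulus) {
        Set<Integer> seen = new HashSet<>();
        for (int id = 0; id < datasize; id++) {
            seen.add(hashOf(id, modulus));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int datasize = 65536;
        // The three modulus values used in the table above.
        for (int modulus : new int[] {1025, 1048577, 1073741825}) {
            int distinct = distinctHashes(datasize, modulus);
            // Elements beyond the first one per hash value share a hash with another element.
            System.out.println("modulus=" + modulus
                + " distinctHashes=" + distinct
                + " collidingElements=" + (datasize - distinct));
        }
    }
}
```

Under these assumptions, 65536 elements with modulus 1025 share only 1025 
distinct hash values, while with either of the larger moduli every element 
gets its own hash value.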

> Reducing NameNode memory usage by an alternate hash table
> ---------------------------------------------------------
>
>                 Key: HDFS-1114
>                 URL: https://issues.apache.org/jira/browse/HDFS-1114
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.22.0
>
>         Attachments: GSet20100525.pdf, gset20100608.pdf, 
> h1114_20100607.patch, h1114_20100614b.patch, h1114_20100615.patch, 
> h1114_20100616b.patch, h1114_20100617.patch, h1114_20100617b.patch
>
>
> NameNode uses a java.util.HashMap to store BlockInfo objects.  When there are 
> many blocks in HDFS, this map uses a lot of memory in the NameNode.  We may 
> optimize the memory usage by a light weight hash table implementation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
