Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by izaakrubin:
http://wiki.apache.org/hadoop/Hbase/UsingBloomFilters

------------------------------------------------------------------------------
  Bloom filters can be enabled on a per-column-family basis in HBase.
- There are three bloom filter variants supported:
+ There are four bloom filter variants supported:
   1. A [http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal bloom filter] as defined by Bloom in 1970.
   1. A [http://portal.acm.org/citation.cfm?id=343571.343572 counting bloom filter] as defined by Fan et al. in a ToN 2000 paper.
   1. A [http://www-rp.lip6.fr/site_npa/site_rp/_publications/740-rbf_cameraready.pdf retouched bloom filter] as described in the CoNEXT 2006 paper.
+  1. A [http://www.cse.fau.edu/~jie/research/publications/Publication_files/infocom2006.pdf dynamic bloom filter] as defined in the INFOCOM 2006 paper.
  
+ Bloom filters can be instantiated by specifying the vector size and the 
number of hash functions.  Dynamic bloom filters require an additional 
argument, a threshold for the maximum number of keys to record in a row.  
- There are two ways in which a bloom filter can be instantiated:
-  1. by supplying the estimated number of values, in which case HBase selects 
the number of hash functions to be 4 and computes the vector size from the 
formula
-   {{{size = number-of-values * number-of-hashfunctions / ln(2) }}}
  
+ JUnit tests for these four bloom filters can be found in 
hbase.regionserver.!TestBloomFilters.
-  This formula was presented in [http://www.eecs.harvard.edu/~michaelm/NEWWORK/postscripts/BloomFilterSurvey.pdf Network Applications of Bloom Filters: A Survey, by Broder and Mitzenmacher]
-  1.#2 by specifying the vector size and the number of hash functions 
explicitly.
- 
- Both of these techniques are demonstrated in the Junit test 
hbase.!TestBloomFilters.
  
  '''Additional Resources:'''
  

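The revised page expects the vector size and the number of hash functions to be supplied directly, while the removed text derived the vector size from an estimated number of values. A minimal sketch of that sizing arithmetic is given here, assuming the formula quoted in the removed lines (size = number-of-values * number-of-hashfunctions / ln(2)). The class and method names are hypothetical and are not part of the HBase API, and the false-positive estimate is the standard textbook approximation rather than anything stated on the wiki page.

```java
/** Hypothetical helper, not an HBase class: sizing arithmetic for a bloom filter. */
public class BloomSizing {

    /** Vector size in bits from size = n * k / ln(2), rounded up. */
    public static int vectorSize(int numValues, int numHashFunctions) {
        return (int) Math.ceil(numValues * (double) numHashFunctions / Math.log(2));
    }

    /** Standard approximation of the false-positive rate: (1 - e^(-k*n/m))^k. */
    public static double falsePositiveRate(int vectorSize, int numValues, int numHashFunctions) {
        double exponent = -(double) numHashFunctions * numValues / vectorSize;
        return Math.pow(1.0 - Math.exp(exponent), numHashFunctions);
    }

    public static void main(String[] args) {
        int k = 4;     // hash count the earlier revision said HBase selected
        int n = 1000;  // assumed number of keys, for illustration only
        int m = vectorSize(n, k);
        System.out.println("vector size (bits): " + m);
        System.out.println("estimated false-positive rate: " + falsePositiveRate(m, n, k));
    }
}
```

With 4 hash functions, a vector sized by this formula gives an estimated false-positive rate of roughly 1/16. A caller who now has to pass the vector size explicitly could use a calculation like this to pick a starting value.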