Dear Wiki user, You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by JimKellerman: http://wiki.apache.org/lucene-hadoop/Hbase/UsingBloomFilters The comment on the change is: Initial content ------------------------------------------------------------------------------ - Describe Hbase/UsingBloomFilters here. + Bloom filters can be enabled on a per-column family basis in Hbase. + There are three bloom filter variants supported: + 1. A [http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal bloom filter] as defined by Bloom in 1970. + 1. A [http://portal.acm.org/citation.cfm?id=343571.343572 counting bloom filter] as defined by Fan et al. in a ToN 2000 paper. + 1. A [http://www-rp.lip6.fr/site_npa/site_rp/_publications/740-rbf_cameraready.pdf retouched bloom filter] as described in the CoNEXT 2006 paper. - TBS. + There are two ways in which a bloom filter can be instantiated: + 1. by supplying the estimated number of values, in which case HBase selects the number of hash functions to be 4 and computes the vector size from the formula + {{{size = number-of-values * number-of-hashfunctions / ln(2) }}} + This formula was presented in [http://www.eecs.harvard.edu/~michaelm/NEWWORK/postscripts/BloomFilterSurvey.pdf Network Applications of Bloom Filters: A Survey, by Broder and Mitzenmacher] + 1.#2 by specifying the vector size and the number of hash functions explicitly. + + Both of these techniques are demonstrated in the Junit test hbase.!TestBloomFilters. + + '''Additional Resources:''' + + 1. [http://www.cc.gatech.edu/~manolios/bloom-filters/calculator.html Bloom Filter Calculator] + 1. [http://pages.cs.wisc.edu/~cao/papers/summary-cache/node8.html Bloom Filters - the math] + 1. [http://www.flipcode.com/articles/article_bloomfilters.shtml Coding Bloom Filters] +
