Achim Schneider wrote:
Aaron Denney <[EMAIL PROTECTED]> wrote:

On 2008-05-30, Achim Schneider <[EMAIL PROTECTED]> wrote:
Bryan O'Sullivan <[EMAIL PROTECTED]> wrote:

A Bloom filter is a probabilistic data
structure that provides a fast set membership querying capability.
It does not give false negatives, but has a tunable false positive
rate.  (A false positive arises when the filter claims that an
element is present, but in fact it is not.)

/me squints.

Please tell me that this isn't reversible.
Tell me what you mean by "reversible".  You can't, for instance,
extract the items in the set.

I guess invertible would have been the right word, though it's still
ambiguous.

Turning it into something that does not give false positives, but has a
tunable false negative rate.

Without looking at the algorithm, I imagine it working somewhat like a
hashtable, and this inversion would utterly destroy my intuition.
Without looking at the code to verify that this is how it has really been implemented, a bloom filter is like a series of hash tables, where the hash table entries are one bit. The bit is set if there is an item that hashes to that value in the bloom filter. So, assuming a single hash table where half the bits are set, there's a 50% false positive rate and no false negatives when you do a membership test.

To reduce the false positives, we can add another hash table with a different hash function. Assuming it also is half full, we can check if an item is in both tables, and our false positive rate drops to 25%.

In practice, one might use something like 32 hash tables. This yields a false positive rate of 1/(2^32). Their most obvious application is to store the dictionary for a spell checker in a space-efficient way, though I have a friend who wrote a paper on using them for router caches.

There was a google tech talk on bloom filters awhile ago: http://www.youtube.com/watch?v=947gWqwkhu0

-jim
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to