Paul Rubin <http://[EMAIL PROTECTED]> writes: > Finally, I think to use Bloom filters effectively you have to choose > nlayers and layersize carefully, according to the number of keys you > expect to see, the amount of memory you have available, and/or the > probability of false matches that you're willing to accept. The > standard references (whatever they are) about Bloom filters probably > have some suitable formulas for computing these things.
Heh, there's an online calculator: http://www.cc.gatech.edu/fac/Pete.Manolios/bloom-filters/calculator.html Link is from http://en.wikipedia.org/wiki/Bloom_filter which is a good article. As described in the article, there's only one "layer" in the Bloom filter (you use n different hashes but they all touch the same bit map) which corresponds to how the ancient Unix spell checker worked, I think. It would take some analysis to show which way is better. I may try working out the formulas as an exercise but I'm too sleepy for that right now. -- http://mail.python.org/mailman/listinfo/python-list