On 2019/08/03 20:42:10, Ryan Blue <[email protected]> wrote: 
>    - Should the bloom filters support compression? If the strategy for a
>    lower false-positive rate is to under-fill the multi-block filter, then
>    would compression help reduce the storage cost?

Eventually, it might be useful to support compression. The question of how to 
do so in the unique case of Bloom filters, though, deserves some thought and 
planning that I think would be most useful in a follow-on. Mitzenmacher's 
"Compressed Bloom Filters" [0] has some discussion of the issues.

There are complex tradeoffs between compressed size, uncompressed size, number 
of hash functions, and the false positive rate, and this is only one direction 
we could go to reduce the false positive rate while keeping the storage space 
constant. For instance, there are several other filter algorithms that achieve 
the best possible false positive rate (for any filter, compressed or not), but 
that pay for it with complexity, query time, and construction time. These 
include Golomb filters, XORSAT filters, Porat's matrix filters, Pagh et. al's 
filters, etc.

[0] http://www.eecs.harvard.edu/~michaelm/NEWWORK/postscripts/cbf2.pdf

Reply via email to