[freenet-dev] Can we implement Bloom filter sharing quickly???

Evan Daniel Fri, 1 May 2009 16:46:25 -0400

On Fri, May 1, 2009 at 4:32 PM, Robert Hailey <robert at freenetproject.org> 
wrote:
>
> On May 1, 2009, at 3:02 PM, Matthew Toseland wrote:
>
> [20:37:16] <evanbd> You standardize on a size for the bloom filters; say
> 1MiB.
> Then, if your store has 100GB of data, and needs 18MiB of bloom filters,
> you
> partition the keyspace into 18 chunks of roughly equal population. ?Each
> segment of the keyspace then gets its own bloom filter.
> [20:38:24] <evanbd> Then, if my node has 100MiB of memory to spend on my
> peers' bloom filters, and 20 peers, I just ask each peer for up to 5 bloom
> filters.
> [20:40:11] <toad_> evanbd: it's good to have a partial filter for each node
> yes
> [20:40:24] <toad_> evanbd: however, you end up checking more filters which
> increases false positives
> [20:40:52] <evanbd> No, the fp rate stays the same.
> [20:41:18] <evanbd> Suppose your node has 18 filters, each with a 1E-5 fp
> rate
> [20:41:36] <evanbd> When I get a request, I compare it to your node's
> filter
> set.
> [20:41:57] <evanbd> But only *one* of those filters gets checked, since
> each
> one covers a different portion of the keyspace
>
> If requested to send 5 partial filters, which do you send? I presume those
> closest to the originator's present location.
> Looks like you actually may end up checking?less?filters overall (if the
> blooms from large peers are out-of-range of the request).


Yes, that's a question worth considering.  There are both performance
and security issues involved, I think.  Note that the partition could
be a set of contiguous regions (allowing performance optimization
around which piece of the keyspace you send info about), but it could
just as easily be determined by a hash function instead.

You still check the same number of filters overall -- one per peer.
The difference is that for some peers you may have a partial filter
set, and therefore sometimes check their filters, instead of deciding
you don't have the memory for that peer's filter and never checking
it.

> Nice idea, Evan!

Thanks!

Evan

[freenet-dev] Can we implement Bloom filter sharing quickly???

Reply via email to