On May 1, 2009, at 9:35 AM, Matthew Toseland wrote:

> IMPLEMENTING IT:
> Main tasks:
> - Converting our datastore indexes into a compact, slightly more lossy, 1-bit filter. (Easy)
> - Creating a snapshot once an hour, and keeping it for some period (1 week?). (Easy)
> - Creating an efficient binary diff format for hour-to-hour diffs, and keeping them. (Moderate)
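For concreteness, a 1-bit filter over the store's keys could look something like this (a minimal Python sketch; the SHA-256 slicing scheme, the 8192-bit size, and k=3 hash functions are my assumptions for illustration, not the actual store format):

```python
import hashlib

def bloom_bits(key: bytes, size_bits: int = 8192, k: int = 3):
    """Derive k bit positions for a key (illustrative scheme: 4-byte
    slices of SHA-256, reduced mod the filter size)."""
    digest = hashlib.sha256(key).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % size_bits
            for i in range(k)]

def build_filter(keys, size_bits: int = 8192, k: int = 3) -> bytearray:
    """Fold a set of store keys into a compact, lossy, 1-bit-per-cell filter."""
    filt = bytearray(size_bits // 8)
    for key in keys:
        for pos in bloom_bits(key, size_bits, k):
            filt[pos // 8] |= 1 << (pos % 8)
    return filt

def might_contain(filt, key: bytes, size_bits: int = 8192, k: int = 3) -> bool:
    """True if the key *may* be in the store; false positives possible,
    false negatives not."""
    return all(filt[p // 8] & (1 << (p % 8))
               for p in bloom_bits(key, size_bits, k))
```

The lossiness is the usual Bloom-filter trade: a query can say "maybe present" for a key we don't actually have, but never "absent" for one we do.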

This might actually be quite hard: an efficient Bloom filter scatters even a few updates all over the filter, so the hour-to-hour changes have no locality for a diff format to exploit.
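To see the scattering, compare two hourly snapshots of a filter (same illustrative hashing scheme and sizes as assumed above, not the real format): even ten new keys flip bits spread across the whole structure.

```python
import hashlib

SIZE_BITS = 8192  # illustrative filter size

def positions(key: bytes, k: int = 3):
    digest = hashlib.sha256(key).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % SIZE_BITS
            for i in range(k)]

def set_bits(filt: bytearray, keys):
    for key in keys:
        for p in positions(key):
            filt[p // 8] |= 1 << (p % 8)

old = bytearray(SIZE_BITS // 8)
set_bits(old, [b"key%d" % i for i in range(500)])   # last hour's snapshot

new = bytearray(old)
set_bits(new, [b"new%d" % i for i in range(10)])    # ten keys added since

# XOR-style comparison: which bytes differ between snapshots?  Because the
# hash spreads each key's k bits uniformly, the changed bytes are scattered
# across the whole filter rather than clustered in one region.
changed = [i for i in range(len(old)) if old[i] != new[i]]
print(len(changed), "bytes changed, spread over",
      max(changed) - min(changed), "of", len(old), "bytes")
```

So a diff that records (offset, new byte) pairs still works, but a run-length or region-based diff buys little.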

> - Sending our filters to our darknet peers. (Easy)
> - Updating our darknet peers once an hour with the diffs we create  
> anyway.
> (Easy)
> - Recording what version of the indexes we have sent to each darknet  
> peer.
> (Easy)
> - Updating our darknet peers when they reconnect after downtime with  
> all the
> diffs they missed. (Easy)

IMHO this revision/diff/update system is too exacting a mechanism.

If this is to be used only for failing requests, perhaps a quicker-to-implement & lossy mechanism should be considered.

i.e. a sparse (and decaying?) Bloom filter:
--
initialize to empty (of course)
choose an update period (the 1 week you mentioned?)
send out 'rolling' updates of our Bloom filter
* i.e. start at 'chunk' 0, then 1, then 2, ...
* at such a rate that in one period it will be fully transmitted (one of n chunks every period/n)
whenever we get an update from our peer, it overwrites that chunk of the Bloom filter
* if we don't get an update from our peer, zero out that chunk
* if we have been offline, zero out the chunks we missed
--
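The scheme above can be sketched in a few lines (a hypothetical shape, not an implementation; the chunk count and tick granularity are arbitrary choices for illustration):

```python
NUM_CHUNKS = 8     # assumption: filter split into 8 chunks
PERIOD_TICKS = 8   # one full pass per period -> one chunk per tick

def sender_schedule(tick: int) -> int:
    """Which chunk to transmit at a given tick: chunk 0, then 1, then 2, ..."""
    return tick % NUM_CHUNKS

class RollingFilterView:
    """A peer's lossy view of our Bloom filter, rebuilt chunk by chunk."""

    def __init__(self, filter_bytes: int):
        self.chunk_size = filter_bytes // NUM_CHUNKS
        self.chunks = [None] * NUM_CHUNKS  # None = never received / stale

    def on_update(self, index: int, data: bytes):
        # Whatever arrives simply overwrites that chunk; no diff bookkeeping.
        self.chunks[index] = data

    def on_missed(self, index: int):
        # No update this tick (peer silent, or we were offline): zero the
        # chunk, i.e. treat that region as "unknown, assume absent".
        self.chunks[index] = bytes(self.chunk_size)
```

The appeal is that neither side tracks versions: the sender just cycles through chunks, and the receiver's view decays gracefully toward empty when updates stop.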

We still must decide whether to listen to updates at all (i.e. whether to keep peers' huge Bloom filters in memory).
Transmits are WAY easier.
Receives are easy.
After ~ one period the filter becomes usable (starts producing hits).

The downside is that the more a peer is down, the less likely we are to have a usable filter for it... That is, unless a peer is able to request a chunk of the Bloom filter that it missed/wants (!!?!)

Or even... just make it a pull mechanism! Those peers interested in our Bloom filters can request any part of them; but it might cost them request-capital.

> - Using any filters we have to send GetOfferedKey-like requests for the data, and handling such requests. (Easy)

This is the core of the enhancement, of course; but it may be easier to just request the data directly (making it like a turtled request) rather than introducing a new message.

> - Track success rates for bloom-based fetches. (Optional, can be done later) (Easy)
> - Tracking a large number of opennet peers which we have connected to over the last week, recording their total uptime over the week. (Moderate)
> - Calculating which have sufficient uptime for sharing bloom filters to be worthwhile. (Easy)
> - Sending those nodes the bloom filters, and keeping them up to date, and recording what version we sent them last. (Easy)
> - Limit the total memory usage for Bloom filters, by adding it up and then dropping the biggest filters from active use until we're not over the limit. Tell our peers so they don't send us updates. (Moderate)
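That last item (the memory cap) is simple greedy eviction; a sketch of the shape it might take (the function and its biggest-first policy are my reading of the proposal, not existing code):

```python
def enforce_filter_budget(filters: dict, limit_bytes: int) -> list:
    """Drop the largest peers' filters from active use until the total
    fits under the memory limit.  Returns the peers whose filters were
    dropped, so we can tell them to stop sending us updates."""
    total = sum(len(f) for f in filters.values())
    dropped = []
    # Evict biggest-first, as the proposal suggests: each eviction frees
    # the most memory per peer we stop serving.
    for peer in sorted(filters, key=lambda p: len(filters[p]), reverse=True):
        if total <= limit_bytes:
            break
        total -= len(filters[peer])
        dropped.append(peer)
    for peer in dropped:
        del filters[peer]
    return dropped
```

Usage: with filters of 100, 300, and 50 bytes and a 200-byte limit, only the 300-byte filter is dropped, leaving 150 bytes in use.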
