On Friday 01 May 2009 20:37:38 Robert Hailey wrote:
>
> On May 1, 2009, at 9:35 AM, Matthew Toseland wrote:
>
> > IMPLEMENTING IT:
> > Main tasks:
> > - Converting our datastore indexes into a compact, slightly more
> > lossy, 1-bit filter. (Easy)
> > - Creating a snapshot once an hour, and keeping for some period
> > (1 week?). (Easy)
> > - Creating an efficient binary diff format for hour-to-hour diffs,
> > and keeping them. (Moderate)
>
> This might actually be quite hard as an efficient bloom filter will
> scatter (even a few) updates all over the filter.

No. We can be fairly efficient, in terms of update traffic, even with a
very naive format (just specifying each changed bit as a number). With
arithmetic encoding we can be really efficient.
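
To put rough numbers on it, a minimal sketch of the naive format (the class
and wire layout here are invented for illustration, not a proposal for the
actual message format):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.BitSet;

// Hypothetical hour-to-hour diff: just the indices of the bits that flipped.
public class NaiveBloomDiff {

    // Encode the difference between the previous snapshot and the current one.
    public static byte[] encode(BitSet oldFilter, BitSet newFilter) throws IOException {
        BitSet changed = (BitSet) oldFilter.clone();
        changed.xor(newFilter); // set bits = positions that flipped this hour
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(changed.cardinality()); // how many bits flipped
        for (int i = changed.nextSetBit(0); i >= 0; i = changed.nextSetBit(i + 1))
            out.writeInt(i); // 4 bytes per flipped bit
        out.flush();
        return buf.toByteArray();
    }

    // Apply a received diff in place by flipping the listed bits.
    public static void apply(BitSet filter, byte[] diff) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(diff));
        int count = in.readInt();
        for (int i = 0; i < count; i++)
            filter.flip(in.readInt());
    }
}

Even if an hour of store churn flips a few thousand bits, that is on the
order of 10KB per peer per hour before any compression; and since the
flipped positions are roughly uniform, coding the gaps between them
(Golomb or arithmetic coding) gets close to the theoretical minimum.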

> > - Sending our filters to our darknet peers. (Easy)
> > - Updating our darknet peers once an hour with the diffs we create
> > anyway. (Easy)
> > - Recording what version of the indexes we have sent to each darknet
> > peer. (Easy)
> > - Updating our darknet peers when they reconnect after downtime with
> > all the diffs they missed. (Easy)
>
> IMHO this revision/diff/update system is too exacting a mechanism.
>
> If this is to be used only for failing requests, perhaps a
> quicker-to-implement & lossy mechanism should be considered.
>
> i.e. a sparse (and decaying?) bloom filter
> --
> initialize to empty (of course)
> choose an update period (1 week you mentioned?)
> send out 'rolling' updates of our bloom filter
>  * i.e. start at 'chunk' 0, then 1, then 2
>  * at such a rate that in one period it will be fully transmitted
>    (1/nth chunk every 1/period)
> whenever we get an update from our peer it overwrites that chunk of
> the bloom filter
>  * if we don't get an update from our peer, zero out that chunk
>  * if we have been offline, zero out the chunks we missed

The filter would not be useful until it had been transmitted in its
entirety. Partitioning the keyspace and having separate bloom filters for
each section, of the standard size, could work however; see the IRC log
with evanbd that I posted.

> --
>
> Still must decide if going to listen to updates (i.e. keeping huge
> bloom filters)
> Transmits are WAY easier
> Receives are easy
> After ~ one period it becomes usable (starts producing hits)
>
> The downside is, the more a peer is down, the less likely we are to
> have a usable filter... That is, unless a peer might be able to
> request a chunk of the bloom filter that he missed/wants (!!?!)
>
> Or even... just make it a pull-mechanism!!!! those peers interested
> in our bloom filters can request any part of it; but it might cost
> them request-capital.
>
> > - Using any filters we have to send GetOfferedKey-like requests for
> > the data, and handling such requests. (Easy)
>
> This is the core of the enhancement, of course; but may be easier by
> just requesting the data (making it like a turtled request) rather
> than a new message.

It would work in *exactly* the same way as GetOfferedKey works already. It
is easy.

> > - Track success rates for bloom-based fetches. (Optional, can be
> > done later) (Easy)
> > - Tracking a large number of opennet peers which we have connected
> > to over the last week, recording their total uptime over the week.
> > (Moderate)
> > - Calculating which have sufficient uptime for sharing bloom filters
> > to be worthwhile. (Easy)
> > - Sending those nodes the bloom filters, and keeping them up to
> > date, and recording what version we sent them last. (Easy)
> > - Limit the total memory usage for Bloom filters, by adding it up
> > and then dropping the biggest filters from active use until we're
> > not over the limit. Tell our peers so they don't send us updates.
> > (Moderate)
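
On the last item, the cap itself is straightforward: add up the sizes, drop
the biggest filters first until we are under budget, and notify the affected
peers. A sketch, with an invented interface standing in for whatever the
peer/filter objects end up looking like:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Sketch of the memory cap: drop the biggest peers' filters from active use
// until the total fits, and tell those peers to stop sending us updates.
public class BloomFilterBudget {

    public interface PeerFilter {
        long filterSizeInBytes();
        void dropFromActiveUse(); // frees the filter and notifies the peer
    }

    public static void enforceLimit(List<? extends PeerFilter> filters, long maxBytes) {
        long total = 0;
        for (PeerFilter f : filters)
            total += f.filterSizeInBytes();
        if (total <= maxBytes) return;
        List<PeerFilter> bySize = new ArrayList<PeerFilter>(filters);
        // Biggest first: frees the most memory per dropped filter.
        Collections.sort(bySize, new Comparator<PeerFilter>() {
            public int compare(PeerFilter a, PeerFilter b) {
                return Long.compare(b.filterSizeInBytes(), a.filterSizeInBytes());
            }
        });
        for (PeerFilter f : bySize) {
            if (total <= maxBytes) break;
            f.dropFromActiveUse();
            total -= f.filterSizeInBytes();
        }
    }
}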
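
And going back to the partitioning idea further up: the reason it fixes the
"not useful until fully transmitted" problem is that each keyspace section
has its own small, independently complete filter. Very rough sketch only;
the names and the hashing are made up for the example, and real code would
reuse the key's existing digest bits rather than hashCode():

import java.util.Arrays;
import java.util.BitSet;

// One fixed-size Bloom filter per section of the keyspace, so a section
// starts producing hits as soon as its own filter has been received in
// full, without waiting for the rest.
public class PartitionedPeerFilter {

    private static final int HASHES = 4; // k probes per key (illustrative)

    private final BitSet[] sections;
    private final boolean[] complete; // received this section in full yet?
    private final int bitsPerSection;

    public PartitionedPeerFilter(int numSections, int bitsPerSection) {
        this.sections = new BitSet[numSections];
        this.complete = new boolean[numSections];
        this.bitsPerSection = bitsPerSection;
        for (int i = 0; i < numSections; i++)
            sections[i] = new BitSet(bitsPerSection);
    }

    // Routing keys are uniformly distributed, so the first byte splits the
    // keyspace into roughly equal sections.
    private int sectionFor(byte[] routingKey) {
        return (routingKey[0] & 0xff) % sections.length;
    }

    // Throwaway double hashing, purely for the sketch.
    private int bitFor(byte[] routingKey, int i) {
        int h1 = Arrays.hashCode(routingKey);
        int h2 = (h1 * 0x9e3779b1) | 1; // force odd so probes spread out
        return Math.abs((h1 + i * h2) % bitsPerSection);
    }

    // Bits arrive from the peer (full chunks or diffs) and are applied here.
    public void setBit(int section, int bitIndex) {
        sections[section].set(bitIndex % bitsPerSection);
    }

    public void markSectionComplete(int section) { complete[section] = true; }

    // An incomplete section never produces hits; a complete one can give
    // false positives but not false negatives.
    public boolean mightHave(byte[] routingKey) {
        int s = sectionFor(routingKey);
        if (!complete[s]) return false;
        for (int i = 0; i < HASHES; i++)
            if (!sections[s].get(bitFor(routingKey, i))) return false;
        return true;
    }
}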
