Re: [freenet-dev] Changes to datastore bloom filters

Matthew Toseland Mon, 25 Oct 2010 07:11:41 -0700

On Sunday 24 October 2010 00:08:11 Juiceman wrote:
> On Oct 23, 2010 1:01 PM, "Matthew Toseland" <t...@amphibian.dyndns.org>
> wrote:
> > We are still doing quite a lot of disk I/O, especially on writes. And a
> typical node does a lot of writes. The quadratic probing mechanism in the
> salted hash store means we have to do 5 seeks on a write - 10 on a write to
> the cache, because we also check the store. And counting bloom filters
> aren't all that great a structure, and they have to be rebuilt regularly,
> which seems to cause enough disk I/O to be a problem for usability of the
> whole system.
> 
> I would heartily recommend anything that reduces disk io.  Disks are now the
> slowest part of a PC (except maybe if you have a ssd.)
> 
> > Evan proposed some time back an alternative: We keep a structure on disk,
> memory mapped, with the name number of slots as the datastore, and with each
> slot containing a short hash of the key stored there (with a special value
> to indicate empty). This would never need to be rebuilt (although it could
> be randomly checked and corrected), would have similar performance for the
> same size structure as bloom filters, and most importantly it would allow us
> to do a disk write with only one seek.
> >
> > If we use it for the equivalent of bloom filter sharing, we even get the
> advantage that we can transfer it a byte at a time and have it immediately
> work without having to transfer all of it. Whereas with classic bloom filter
> sharing we'd need to split the keyspace (hashed) into multiple bloom
> filters, each of which only works when the whole thing has been fetched,
> we'd need to make sure that none of them are over-full, and we'd need to
> keep them up to date, meaning they'd be counting filters and they'd have to
> be rebuilt periodically.
> >
> > Unfortunately it opens up a censorship attack: We'd have to send them our
> salt value, and they'd therefore be able to manufacture a key (actually 5
> keys) to overwrite any target block (via e.g. an insert). This might be
> acceptable for darknet connections, but it probably isn't for opennet.
> Whereas with ordinary bloom filter sharing, you have to wait until the data
> has transferred, and then you know whether they have the key - you don't
> know how to get rid of it, short of DoS'ing the node itself.
> 
> Similarly to you other email about not routing to newbies, you could only
> share bloom filters with all darknet peers and opennet nodes with uptime >=
> 30 minutes for example.


Yes, that may be sensible, even in the rare cases where the bandwidth cost 
isn't an issue.

> You could also put some other qualifies in there 
> also like % success, etc.  This would have the added benefit of not wasting
> bandwidth sharing bloom filters with nodes likely to go offline quickly.
> 
> Further security could be had by detecting if inserts coming from a node are
> "too" accurate.

IMHO detecting the attack would be quite difficult, especially if the attacker 
could do it over an extended period; it might force us to keep longer term logs 
that we really don't want to keep. On the other hand, if we can force them to 
stay connected for a longish period that increases the cost of the attack in 
itself...
> 
> > Arguably we shouldn't worry about bloom filter sharing since we won't be
> implementing it until after 0.8 anyway, in which case if we have time we
> should implement the new structure, to limit disk I/O.

signature.asc
Description: This is a digitally signed message part.

_______________________________________________
Devl mailing list
Devl@freenetproject.org
http://freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Changes to datastore bloom filters

Reply via email to