On Oct 23, 2010 1:01 PM, "Matthew Toseland" <t...@amphibian.dyndns.org>
wrote:
> We are still doing quite a lot of disk I/O, especially on writes. And a
typical node does a lot of writes. The quadratic probing mechanism in the
salted hash store means we have to do 5 seeks on a write - 10 on a write to
the cache, because we also check the store. And counting bloom filters
aren't all that great a structure, and they have to be rebuilt regularly,
which seems to cause enough disk I/O to be a problem for usability of the
whole system.

I would heartily recommend anything that reduces disk io.  Disks are now the
slowest part of a PC (except maybe if you have a ssd.)

> Evan proposed some time back an alternative: We keep a structure on disk,
memory mapped, with the name number of slots as the datastore, and with each
slot containing a short hash of the key stored there (with a special value
to indicate empty). This would never need to be rebuilt (although it could
be randomly checked and corrected), would have similar performance for the
same size structure as bloom filters, and most importantly it would allow us
to do a disk write with only one seek.
>
> If we use it for the equivalent of bloom filter sharing, we even get the
advantage that we can transfer it a byte at a time and have it immediately
work without having to transfer all of it. Whereas with classic bloom filter
sharing we'd need to split the keyspace (hashed) into multiple bloom
filters, each of which only works when the whole thing has been fetched,
we'd need to make sure that none of them are over-full, and we'd need to
keep them up to date, meaning they'd be counting filters and they'd have to
be rebuilt periodically.
>
> Unfortunately it opens up a censorship attack: We'd have to send them our
salt value, and they'd therefore be able to manufacture a key (actually 5
keys) to overwrite any target block (via e.g. an insert). This might be
acceptable for darknet connections, but it probably isn't for opennet.
Whereas with ordinary bloom filter sharing, you have to wait until the data
has transferred, and then you know whether they have the key - you don't
know how to get rid of it, short of DoS'ing the node itself.

Similarly to you other email about not routing to newbies, you could only
share bloom filters with all darknet peers and opennet nodes with uptime >=
30 minutes for example.  You could also put some other qualifies in there
also like % success, etc.  This would have the added benefit of not wasting
bandwidth sharing bloom filters with nodes likely to go offline quickly.

Further security could be had by detecting if inserts coming from a node are
"too" accurate.

> Arguably we shouldn't worry about bloom filter sharing since we won't be
implementing it until after 0.8 anyway, in which case if we have time we
should implement the new structure, to limit disk I/O.
_______________________________________________
Devl mailing list
Devl@freenetproject.org
http://freenetproject.org/cgi-bin/mailman/listinfo/devl

Reply via email to