We are still doing quite a lot of disk I/O, especially on writes. And a typical 
node does a lot of writes. The quadratic probing mechanism in the salted hash 
store means we have to do 5 seeks on a write - 10 on a write to the cache, 
because we also check the store. And counting bloom filters aren't all that 
great a structure, and they have to be rebuilt regularly, which seems to cause 
enough disk I/O to be a problem for usability of the whole system.

Evan proposed some time back an alternative: We keep a structure on disk, 
memory mapped, with the name number of slots as the datastore, and with each 
slot containing a short hash of the key stored there (with a special value to 
indicate empty). This would never need to be rebuilt (although it could be 
randomly checked and corrected), would have similar performance for the same 
size structure as bloom filters, and most importantly it would allow us to do a 
disk write with only one seek.

If we use it for the equivalent of bloom filter sharing, we even get the 
advantage that we can transfer it a byte at a time and have it immediately work 
without having to transfer all of it. Whereas with classic bloom filter sharing 
we'd need to split the keyspace (hashed) into multiple bloom filters, each of 
which only works when the whole thing has been fetched, we'd need to make sure 
that none of them are over-full, and we'd need to keep them up to date, meaning 
they'd be counting filters and they'd have to be rebuilt periodically.

Unfortunately it opens up a censorship attack: We'd have to send them our salt 
value, and they'd therefore be able to manufacture a key (actually 5 keys) to 
overwrite any target block (via e.g. an insert). This might be acceptable for 
darknet connections, but it probably isn't for opennet. Whereas with ordinary 
bloom filter sharing, you have to wait until the data has transferred, and then 
you know whether they have the key - you don't know how to get rid of it, short 
of DoS'ing the node itself.

Arguably we shouldn't worry about bloom filter sharing since we won't be 
implementing it until after 0.8 anyway, in which case if we have time we should 
implement the new structure, to limit disk I/O.

Attachment: signature.asc
Description: This is a digitally signed message part.

_______________________________________________
Devl mailing list
Devl@freenetproject.org
http://freenetproject.org/cgi-bin/mailman/listinfo/devl

Reply via email to