We are still doing quite a lot of disk I/O, especially on writes. And a typical node does a lot of writes. The quadratic probing mechanism in the salted hash store means we have to do 5 seeks on a write - 10 on a write to the cache, because we also check the store. And counting bloom filters aren't all that great a structure, and they have to be rebuilt regularly, which seems to cause enough disk I/O to be a problem for usability of the whole system.
Evan proposed some time back an alternative: We keep a structure on disk, memory mapped, with the name number of slots as the datastore, and with each slot containing a short hash of the key stored there (with a special value to indicate empty). This would never need to be rebuilt (although it could be randomly checked and corrected), would have similar performance for the same size structure as bloom filters, and most importantly it would allow us to do a disk write with only one seek. If we use it for the equivalent of bloom filter sharing, we even get the advantage that we can transfer it a byte at a time and have it immediately work without having to transfer all of it. Whereas with classic bloom filter sharing we'd need to split the keyspace (hashed) into multiple bloom filters, each of which only works when the whole thing has been fetched, we'd need to make sure that none of them are over-full, and we'd need to keep them up to date, meaning they'd be counting filters and they'd have to be rebuilt periodically. Unfortunately it opens up a censorship attack: We'd have to send them our salt value, and they'd therefore be able to manufacture a key (actually 5 keys) to overwrite any target block (via e.g. an insert). This might be acceptable for darknet connections, but it probably isn't for opennet. Whereas with ordinary bloom filter sharing, you have to wait until the data has transferred, and then you know whether they have the key - you don't know how to get rid of it, short of DoS'ing the node itself. Arguably we shouldn't worry about bloom filter sharing since we won't be implementing it until after 0.8 anyway, in which case if we have time we should implement the new structure, to limit disk I/O.
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ Devl mailing list Devl@freenetproject.org http://freenetproject.org/cgi-bin/mailman/listinfo/devl