On Thu, Mar 12, 2009 at 1:03 AM, Evan Daniel <[email protected]> wrote:
> 2009/3/4 Matthew Toseland <[email protected]>:
>> STORAGE:
>> There are 6 stores: a store and a cache for each of SSK, CHK and public
>> key (for SSKs; we can avoid sending this over the network in many cases).
>> Every block fetched is stored in the relevant cache. When an insert
>> completes, it will be stored only if there is no node with a location
>> closer to the key's location than this node. But this calculation ignores
>> peers with a lower reported uptime than 40% (this is simply an estimate
>> computed by the node and sent to us on connection), and ignores backed
>> off nodes unless all nodes are backed off. It does not take into account
>> recently failed status etc. Arguably we should send uptime messages more
>> often than on connection: this disadvantages high-uptime nodes that
>> connected when they were first installed. In practice the store fills up
>> *much* slower than the cache.
>
> There is a technique that would make the store fill more quickly than
> it currently does without any drawbacks (aside from a small amount of
> development time ;) ). Right now, there are two hash tables with one
> slot per location. One hash table is the store, one the cache.
> (Obviously I'm only considering a single one of CHK/SSK/pubkey.) This
> is equivalent to a single hash table with two slots per location, with
> rules that decide which slot to use rather than which hash table.
>
> Currently, the rule is that inserts go in the store and fetches in the
> cache. The change is this: when storing a key in the cache, if the
> cache slot is already occupied but the store slot is empty, put it in
> the store instead (and vice versa). Even without the bloom filter,
> this doesn't add any disk reads -- by treating it as one hash table
> with two slots per location, you put those two slots adjacent on disk
> and simply make one larger read to retrieve both keys.
>
> This is a technique I first saw in hash tables for chess programs.
> Evaluation results are cached, with two distinct slots per location.
> One slot stores the most recent evaluation result, the other stores
> the most expensive to recompute. There is a noticeable performance
> improvement in the cache if you are willing to store a result in the
> wrong slot when only one of the two is full already.
>
> Evan Daniel
I think this is the right way to go. The fixed 50%:50% store/cache size
split causes some confusion for users (there was some discussion about it
on Frost months ago), and it wastes part of the allocated disk space.
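
For concreteness, a minimal sketch of the two-slot scheme in Java. All
class, method, and field names here are hypothetical (this is not the
actual Freenet datastore code), and it keeps the table in memory; the real
store would put the two slots adjacent on disk and fetch both with one
larger read, as Evan describes.

import java.util.Arrays;

// Sketch for a single keyspace (one of CHK/SSK/pubkey): one hash table
// with two slots per location instead of separate store and cache tables.
public class TwoSlotStore {
    private static final class Entry {
        final byte[] key;
        final byte[] data;
        Entry(byte[] key, byte[] data) { this.key = key; this.data = data; }
    }

    // slots[2*i] is the "store" slot for location i, slots[2*i + 1] the
    // "cache" slot. Keeping them adjacent is what would let an on-disk
    // version read both with a single larger read.
    private final Entry[] slots;
    private final int numLocations;

    public TwoSlotStore(int numLocations) {
        this.numLocations = numLocations;
        this.slots = new Entry[2 * numLocations];
    }

    private int locationOf(byte[] key) {
        // Placeholder hash; a real table would derive this from the
        // routing key.
        return Math.floorMod(Arrays.hashCode(key), numLocations);
    }

    // preferStore is true for inserts, false for fetched blocks being
    // cached. If the preferred slot is taken but the other is empty,
    // spill into the other slot instead of overwriting -- Evan's change.
    public void put(byte[] key, byte[] data, boolean preferStore) {
        int base = 2 * locationOf(key);
        for (int i = base; i <= base + 1; i++) { // update in place if present
            if (slots[i] != null && Arrays.equals(slots[i].key, key)) {
                slots[i] = new Entry(key, data);
                return;
            }
        }
        int preferred = preferStore ? base : base + 1;
        int other = preferStore ? base + 1 : base;
        if (slots[preferred] != null && slots[other] == null) {
            slots[other] = new Entry(key, data);     // spill to the empty slot
        } else {
            slots[preferred] = new Entry(key, data); // normal rule, may evict
        }
    }

    // A lookup reads both slots for the location together.
    public byte[] get(byte[] key) {
        int base = 2 * locationOf(key);
        for (int i = base; i <= base + 1; i++) {
            Entry e = slots[i];
            if (e != null && Arrays.equals(e.key, key)) return e.data;
        }
        return null;
    }
}

With this rule a key is only evicted when both slots for its location are
occupied, so the store slots should fill up alongside the cache slots
instead of lagging far behind.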
