On Thu, Mar 12, 2009 at 1:03 AM, Evan Daniel <[email protected]> wrote:
> 2009/3/4 Matthew Toseland <[email protected]>:
>
>> STORAGE:
>> There are 6 stores: a store and a cache for each of SSK, CHK and public key
>> (for SSKs; we can avoid sending this over the network in many cases). Every
>> block fetched is stored in the relevant cache. When an insert completes, it
>> will be stored only if there is no node with a location closer to the key's
>> location than this node's. But this calculation ignores peers with a lower
>> reported uptime than 40% (this is simply an estimate computed by the node and
>> sent to us on connection), and ignores backed off nodes unless all nodes are
>> backed off. It does not take into account recently failed status etc.
>> Arguably we should send uptime messages more often than just on connection:
>> the current scheme disadvantages high-uptime nodes that connected when they
>> were first installed, since the estimate they reported then was low. In
>> practice the store fills up *much* slower than the cache.
>
> There is a technique that would make the store fill more quickly than
> it currently does without any drawbacks (aside from a small amount of
> development time ;) ).  Right now, there are two hash tables with one
> slot per location.  One hash table is the store, one the cache.
> (Obviously I'm only considering a single one of CHK/SSK/pubkey.)  This
> is equivalent to a single hash table with two slots per location, with
> rules that decide which slot to use rather than which hash table.
>
> Currently, the rule is that inserts go in the store and fetches in the
> cache.  The change is this: when storing a key in the cache, if the
> cache slot is already occupied but the store slot is empty, put it in
> the store instead (and vice versa).  Even without the bloom filter,
> this doesn't add any disk reads -- by treating it as one hash table
> with two slots per location, you put those two slots adjacent on disk
> and simply make one larger read to retrieve both keys.
>
> This is a technique I first saw in hash tables for chess programs.
> Evaluation results are cached, with two distinct slots per location.
> One slot stores the most recent evaluation result, the other stores
> the most expensive to recompute.  There is a noticeable performance
> improvement in the cache if you are willing to store a result in the
> wrong slot when only one of the two is full already.

I think this is the right way to go.
The fixed 50%:50% store/cache split causes some confusion among users
(there was some discussion about it on Frost a few months ago), and it
wastes part of the allocated disk space while the store stays mostly empty.
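
A minimal Python sketch of the two-slot spill rule Evan describes (names
like `TwoSlotStore` are made up for illustration; the real store works on
fixed-size disk files, but the slot-selection idea is the same):

```python
# One table, two adjacent slots per location, with a rule choosing the
# slot instead of choosing between two separate store/cache tables.
class TwoSlotStore:
    STORE, CACHE = 0, 1

    def __init__(self, n_buckets):
        # Two adjacent slots per bucket; on disk these would sit next
        # to each other, so one larger read retrieves both.
        self.buckets = [[None, None] for _ in range(n_buckets)]

    def put(self, key, block, from_insert):
        slots = self.buckets[hash(key) % len(self.buckets)]
        primary = self.STORE if from_insert else self.CACHE
        other = 1 - primary
        # Spill into the neighbouring slot if the preferred one is
        # taken but its neighbour is free; otherwise overwrite.
        if slots[primary] is not None and slots[other] is None:
            slots[other] = (key, block)
        else:
            slots[primary] = (key, block)

    def get(self, key):
        # One read covers both slots of the bucket.
        for entry in self.buckets[hash(key) % len(self.buckets)]:
            if entry is not None and entry[0] == key:
                return entry[1]
        return None
```

With a single bucket, two successively fetched blocks both survive (the
second spills into the empty store slot), where the old two-table rule
would have evicted the first.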
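
For reference, the store-if-closest decision from the first quoted
paragraph can be sketched similarly (hypothetical names; the thresholds
and structure follow the quoted description, not the actual code, and
`circular_distance` reflects Freenet's circular [0, 1) keyspace):

```python
def circular_distance(a, b):
    # Locations live on a circle in [0, 1).
    d = abs(a - b)
    return min(d, 1.0 - d)

def should_store_insert(my_loc, key_loc, peers):
    """peers: list of (location, reported_uptime, backed_off) tuples."""
    # Ignore peers reporting less than 40% uptime.
    candidates = [p for p in peers if p[1] >= 0.40]
    # Ignore backed-off peers, unless every remaining peer is backed off.
    active = [p for p in candidates if not p[2]]
    if active:
        candidates = active
    my_dist = circular_distance(my_loc, key_loc)
    # Store only if no remaining peer is strictly closer to the key.
    return all(circular_distance(p[0], key_loc) >= my_dist
               for p in candidates)
```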


> Evan Daniel
_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
