It would seem that the "weight" function should be proportional to the
size of the file *multiplied* by the time since last request.  This seems
intuitive since the resources consumed are proportional to both the size and
the time the item is alive.

But an efficient datastructure to manage this is a bit complicated,
since different sized docs will increase in weight at a different rate
over time (e.g. a size 1K doc aged 190 seconds will overtake a size
2K doc aged 90 seconds in another 10 seconds, and from then on should
be deeper in the stack.)

One possibility is to maintain buckets based on size (for example, powers
of two).  Each bucket is a stack, as before.  Then you throw stuff out
from the bottom of the bucket that has the document with largest weight.
When a document is accessed, it goes to the top of its bucket.

I am willing to write this data structure and provide a clean and generic
interface.  umm.... now that I look at CyclicArray, it has scalability
problems (linear searches!).  I will write something better if it's okay
with everyone...

So the requirements are:
        - Remove an element in O(log n) time.
        - Insert an element in O(log n).
        - Retire elements on Insert based on algorithm above.

Should it retire documents to keep the number of documents constant, or keep
the total size (in bytes) constant?


On Mon, Aug 07, 2000 at 03:05:26PM +0100, Ian Clarke wrote:
> 
> > > Ok, how is this for a solution to this exploit:  Rather than using the
> > > position in the datastore found by the method I describe as the insert
> > > point for the data, it is used as a lower-bound for the insert point,
> > > ie. the data is inserted at a random position above this point.
> > 
> > This might help some, but I'm still worried that uselessly small data will
> > provide a good way to choke up the datastores. I think you should at least
> > implement some lowest size threshhold, say 50 kB, so that all data smaller 
> > then
> > that is counted as having that size for this method.
> 
> Done - the exact threshold can be configured in the .freenetrc file.
> 
> Ian.
> 
> _______________________________________________
> Freenet-dev mailing list
> Freenet-dev at lists.sourceforge.net
> http://lists.sourceforge.net/mailman/listinfo/freenet-dev

-- 
Dev Random
Fingerprint: 3ABC FCEF 1BCE 4528 E4FD  15EB 173A 76D2 6959 DAF1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 232 bytes
Desc: not available
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20000807/c87b3cc5/attachment.pgp>

Reply via email to