Re: [freenet-dev] Probabalistic caching with Ps(k)

Scott Young Thu, 27 Jun 2002 11:53:15 -0700


----- Original Message -----
From: "Edgar Friendly" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, June 26, 2002 12:41 AM
Subject: Re: [freenet-dev] Probabalistic caching with Ps(k)

> "Scott Young" <[EMAIL PROTECTED]> writes:
>
> > If data is cached with probability Ps(k), then data would not be
> > likely to be cached where it is not likely to be found.  This would also
> > decrease I/O to the hard drive and increase node specialization.  A function
> > more specific to probabilistic caching could be used, such as Pc(k) =
> > probability of caching ~= 1 - abs(k - average(all keys in datastore))/(max
> > key value), which would tend to make nodes specialize more toward
> > one section of the keyspace.
> > Any thoughts?
> >
> >
> > Scott Young
> >
> Yeah, I think that data isn't being cached in enough places, and
> that's causing requests for data that's in the network to fail.

Is it that it just isn't being cached in enough places, or is it not being
cached enough in places that would improve finding the data?  Increased
specialization should reduce the number of hops needed to find data (since
it brings it closer to log(n) performance instead of linear performance),
reducing the need to have it cached sporadically throughout the network.

> Disk
> IO is not at all any kind of bottleneck in a freenet node, so that's
> an irrelevant bonus.  And increasing specialization at the cost of
> slowing down the rate at which the network redistributes load is a bad
> thing.

Does caching in this way effectively redistribute load?  I have some other
ideas about addressing this.  The way freenet allows plausible deniability
is by allowing any node along the request path to declare itself as the
originator.  If an overloaded node does this only rarely, then the nodes
sending requests to it will find nodes closer to the data they're requesting,
and be able to offset the load from that node in subsequent requests.  To
prevent a malicious node from flooding a node, requesting a file, and using
the name of the originator to determine whether the node actually has that
file, a node could store the source that it itself got the file from, and
use that in its reply.
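
A rough sketch of the idea above, in Python for brevity.  Everything here is
an assumption made for illustration: the function name, the two probabilities,
and the notion that a node keeps one stored source per file.  None of it is
taken from the Freenet code.  The node names itself as the source only
occasionally (and more rarely when overloaded); otherwise it passes along the
source it originally got the file from, so the reply does not reveal whether
the file sits in the local store.

```python
import random

def reply_source(self_addr, stored_source, overloaded,
                 p_claim_normal=0.5, p_claim_overloaded=0.05, rng=random):
    """Pick the source address to put in a reply (hypothetical helper).

    self_addr      -- this node's own address
    stored_source  -- the source recorded when this node received the file
    overloaded     -- True if this node wants to shed load
    The two probabilities are made-up values, not Freenet defaults.
    """
    p_claim = p_claim_overloaded if overloaded else p_claim_normal
    if rng.random() < p_claim:
        # Declare ourselves the originator.
        return self_addr
    # Otherwise reuse the source we stored when the file arrived.
    return stored_source
```

With this shape, an overloaded node mostly points requesters at its upstream
source, which is what lets them route around it on later requests.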

Another way to address load distribution is to look at the way TCP handles
it.  To apply this to freenet though, each node would have to know the load
of the nodes connected to it (this doesn't apply to TCP because each
connection does not share bandwidth with other connections in TCP, unlike
freenet).  Maybe a node could have a special reply to a request that says
the node is overloaded, and then the requesting node could back off the
overloaded node for a while.
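
As a sketch of what "back off for a while" could look like on the requesting
side, here is a small Python class, loosely modeled on the exponential backoff
TCP uses for its retransmission timer.  The "overloaded" reply, the class
name, and the delay values are all assumptions for illustration; no such
message is defined in the text above.

```python
import time

class PeerBackoff:
    """Tracks how long to avoid peers that sent a (hypothetical) 'overloaded' reply."""

    def __init__(self, base_delay=1.0, max_delay=60.0, clock=time.monotonic):
        self.base_delay = base_delay  # seconds to wait after the first overload reply
        self.max_delay = max_delay    # upper bound on the wait
        self.clock = clock            # injectable so the class can be tested
        self._delay = {}              # peer -> current backoff delay in seconds
        self._retry_at = {}           # peer -> earliest time to use the peer again

    def on_overloaded(self, peer):
        # Double the delay on each consecutive overload reply, capped at max_delay.
        previous = self._delay.get(peer, self.base_delay / 2)
        delay = min(previous * 2, self.max_delay)
        self._delay[peer] = delay
        self._retry_at[peer] = self.clock() + delay

    def on_success(self, peer):
        # A normal reply clears the backoff state for that peer.
        self._delay.pop(peer, None)
        self._retry_at.pop(peer, None)

    def usable(self, peer):
        # True if the peer is not currently being avoided.
        return self.clock() >= self._retry_at.get(peer, float("-inf"))
```

A node would call usable(peer) when choosing where to route a request, and
skip to the next-best peer while the answer is False.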

> We don't want nodes specializing towards one section of the keyspace,
> we want nodes being good at whatever the rest of the network throws at
> them in terms of requests, whatever that is.  If the network settles
> to a state where each node gets a section of the keyspace, great, but
> I definitely don't want to limit the network to that one possibility.
>
> Thelema
> --

I wasn't sure if it would be good or bad when I suggested the function.  I
also realized after writing the message that that function would tend to
make nodes specialize toward the middle of the keyspace, which would
definitely be very limiting.  We also don't want nodes choosing their
specialization, so Gianni's Ps(k) function would probably be the best for
caching data where it is most at home on the network (although what's to
prevent a malicious node from only fulfilling requests within a certain
range of the keyspace, thus gradually ensuring its specialization?).  What
I'm basically saying is that data should be cached where it is most useful
for the network (i.e. easily found).
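
For reference, a minimal Python sketch of the Pc(k) function I suggested,
assuming keys are plain integers in a small toy keyspace and the datastore is
just a list of keys (MAX_KEY and the function names are made up for the
example).  It also shows the problem: a store holding keys spread evenly over
the keyspace has its average near the middle, so keys near the middle get the
highest caching probability.

```python
import random

MAX_KEY = 2**16 - 1  # assumed toy keyspace; real Freenet keys are far larger

def pc(key, datastore_keys, max_key=MAX_KEY):
    """Pc(k) ~= 1 - abs(k - average(all keys in datastore)) / (max key value)."""
    if not datastore_keys:
        # Assumption: an empty datastore has no preference and caches everything.
        return 1.0
    average = sum(datastore_keys) / len(datastore_keys)
    return 1.0 - abs(key - average) / max_key

def should_cache(key, datastore_keys, rng=random):
    """Decide whether to cache the key, with probability Pc(k)."""
    return rng.random() < pc(key, datastore_keys)
```

For example, with a store holding only the keys 0 and 100 and a max key value
of 100, the average is 50, so pc(50, [0, 100], max_key=100) is 1.0 while the
keys at either end get 0.5.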

Scott Young

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/devl
