----- Original Message -----
From: "Edgar Friendly" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, June 26, 2002 12:41 AM
Subject: Re: [freenet-dev] Probabalistic caching with Ps(k)

> "Scott Young" <[EMAIL PROTECTED]> writes:
>
> > If data is cached with the probability of Ps(k), then data would not be
> > likely to be cached where it is not likely to be found. This would also
> > decrease I/O to the hard drive and increase node specialization. A function
> > more specific to probabilistic caching could be used, such as Pc(k) =
> > probability of caching ~= 1 - abs(k - average(all keys in datastore))/(max
> > key value), which would tend to make nodes specialize more toward
> > one section of the keyspace.
> > Any thoughts?
> >
> >
> > Scott Young
> >
> Yeah, I think that data isn't being cached in enough places, and
> that's causing requests for data that's in the network to fail.

Is it that it just isn't being cached in enough places, or is it not
being cached enough in the places that would make the data easier to
find? Increased specialization should reduce the number of hops needed
to find data (since it brings it closer to log(n) performance instead
of linear performance), reducing the need to have it cached
sporadically throughout the network.

> Disk
> IO is not at all any kind of bottleneck in a freenet node, so that's
> an irrelevant bonus. And increasing specialization at the cost of
> slowing down the rate at which the network redistributes load is a bad
> thing.

Does caching in this way effectively redistribute load? I have some
other ideas about addressing this. The way freenet allows plausible
deniability is by allowing any node along the request path to declare
itself as the originator. If an overloaded node does this only rarely,
then the nodes sending requests to it will learn of nodes closer to the
data they're requesting, and will be able to take load off that node in
subsequent requests. To prevent a malicious node from flooding a node,
requesting a file, and using the name of the originator to determine
whether the node actually has that file, a node could store the source
that it itself got the file from, and use that in its reply.
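
To make the Pc(k) function quoted above concrete, here is a minimal
sketch in Python. It assumes keys are plain integers in a toy keyspace
0..max_key; the names caching_probability and should_cache, and the
rule that an empty datastore caches everything, are made up for
illustration and are not taken from the freenet code.

```python
import random
from statistics import mean


def caching_probability(key, stored_keys, max_key):
    """Pc(k) ~= 1 - abs(k - average(all keys in datastore)) / max_key."""
    if not stored_keys:
        # Assumption: with nothing stored there is no average to compare
        # against, so cache unconditionally.
        return 1.0
    center = mean(stored_keys)
    p = 1.0 - abs(key - center) / max_key
    # Clamp in case a key outside 0..max_key is passed in.
    return min(1.0, max(0.0, p))


def should_cache(key, stored_keys, max_key, rng=random.random):
    """Decide at random whether to cache, with probability Pc(key)."""
    return rng() < caching_probability(key, stored_keys, max_key)


if __name__ == "__main__":
    datastore = [400, 500, 600]  # toy datastore, average key is 500
    for k in (500, 900, 0):
        print(k, caching_probability(k, datastore, max_key=1000))
```

With stored keys averaging 500 in a 0..1000 keyspace, a key at 500 is
always cached, while a key at 900 is cached about 60% of the time.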

Another way to address load distribution is to look at the way TCP
handles it. To apply this to freenet though, each node would have to
know the load of the nodes connected to it (this doesn't apply to TCP
because each connection does not share bandwidth with other
connections in TCP, unlike freenet). Maybe a node could have a special
reply to a request that says the node is overloaded, and then the
requesting node could back off the overloaded node for a while.

> We don't want nodes specializing towards one section of the keyspace,
> we want nodes being good at whatever the rest of the network throws at
> them in terms of requests, whatever that is. If the network settles
> to a state where each node gets a section of the keyspace, great, but
> I definitely don't want to limit the network to that one possibility.
>
> Thelema
> --

I wasn't sure if it would be good or bad when I suggested the
function. I also realized after writing the message that that function
would tend to make nodes specialize toward the middle of the keyspace,
which would definitely be very limiting. We also don't want nodes
choosing their specialization, so Gianni's Ps(k) function would
probably be the best for caching data where it is most at home on the
network (although what's to prevent a malicious node from only
fulfilling requests within a certain range of the keyspace, thus
gradually ensuring its specialization?). What I'm basically saying is
that data should be cached where it is most useful for the network
(i.e. easily found).

Scott Young

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/devl
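
The "overloaded" reply idea from earlier in this message could look
roughly like the sketch below. Everything here is an assumption for
illustration: the BackoffTable class, its method names, and the choice
to double the waiting time on each repeated overload report (borrowed
loosely from TCP-style exponential back-off) are not part of freenet.
The message itself only says to back off "for a while".

```python
import time


class BackoffTable:
    """Tracks which neighbor nodes should be skipped for a while."""

    def __init__(self, base_delay=1.0, max_delay=60.0, clock=time.monotonic):
        self.base_delay = base_delay   # first back-off period, in seconds
        self.max_delay = max_delay     # cap on the back-off period
        self.clock = clock             # injectable so the logic can be tested
        self._delay = {}               # node id -> current back-off period
        self._retry_at = {}            # node id -> earliest time to retry

    def report_overloaded(self, node):
        """Call when a node answers a request with an 'overloaded' reply."""
        previous = self._delay.get(node)
        if previous is None:
            delay = self.base_delay
        else:
            # Assumption: double the wait on each repeated report, up to a cap.
            delay = min(previous * 2, self.max_delay)
        self._delay[node] = delay
        self._retry_at[node] = self.clock() + delay

    def report_success(self, node):
        """Call when a node handles a request normally; clears its back-off."""
        self._delay.pop(node, None)
        self._retry_at.pop(node, None)

    def is_available(self, node):
        """True if the node is not currently being backed off."""
        return self.clock() >= self._retry_at.get(node, float("-inf"))
```

A requesting node would check is_available() before routing to a
neighbor, and fall back to its next-best choice while that neighbor is
being backed off.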
