On Mon, 18 Feb 2002, Jeff Darcy wrote:

> > Is it better to route data through other nodes even when there is a more
> > direct path between two nodes?
> 
> Sometimes, yes; it's entirely workload-dependent because you can't predict
> where the next access will come from.  It's also important to remember that
> "more direct" can mean different things in the overlay network (such as
> Freenet) and the underlying physical network.  Just yesterday, there was an
> article posted on Slashdot whose author had obviously forgotten that the two
> might not be congruent, and the whole proposal pretty much falls down once
> that is realized.

Well, if I design the network, it will be designed purely for a TCP/IP
network where every node can contact every other node directly, without
having to go through other nodes, only very fast routers.  To me,
caching in a Freenet node only makes sense when the node can sit in the
same place as a router.  For example, suppose data normally has to go
A->B->C->D->E to get from A to E.  Freenet caching, to me, only makes
sense if the data goes A->B->FREENET NODE->D->E without having to
take any extra hops in the process, or at least only a few
extra hops.  However, read on, as there are other reasons I don't want to
always have to route data through other nodes.
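
To make the hop-count argument concrete, here is a minimal sketch.  The
function names and the max_extra threshold are made up for illustration
and are not part of Freenet or DistribNet; paths are just lists of the
physical nodes the data crosses.

```python
def extra_hops(direct_path, overlay_path):
    """Physical hops added by the overlay route compared to the direct route."""
    # A path with n nodes has n - 1 hops.
    return (len(overlay_path) - 1) - (len(direct_path) - 1)


def caching_worthwhile(direct_path, overlay_path, max_extra=0):
    """True if the overlay route costs at most max_extra additional hops.

    max_extra is an assumed tuning knob standing in for "only a few
    extra hops".
    """
    return extra_hops(direct_path, overlay_path) <= max_extra


direct = ["A", "B", "C", "D", "E"]
# The caching node sits where router C would be: no extra hops.
on_path = ["A", "B", "FREENET NODE", "D", "E"]
# The caching node hangs off C, so the data must go out to it and back.
detour = ["A", "B", "C", "FREENET NODE", "C", "D", "E"]
```

With these example paths, on_path adds no hops while detour adds two, so
only the first passes the default check.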

> > In particular do you think that this
> > strategy is flawed:
> >
> >   I also believe that by simply transferring data from one node to another
> >   a lot of the benefits of freenet's caching algorithm can be retained.
> >   Because, once node A transfers a key from node B nodes close to A (say
> >   B,C,D) can transfer data from node A instead of having to download it
> >   from node S which is farther away.
> 
> What you're suggesting is merely a subset of what Freenet already does -
> i.e. the final recipient caches the data rather than all recipients in the
> path.  Why is it better for one to cache than for all to cache?

Well, for one thing, having every node in the path cache the data will
cause not-so-popular data to fall off the network faster, which is one of
the key things I would like to avoid.  I want DistribNet to serve as a
solution for storing long-term data, not data which just happens to be
popular at the moment.
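
Here is a rough sketch of what I mean.  It is a toy model, not Freenet's
actual cache: I am assuming each node has a small fixed-size LRU cache,
and the class and function names are invented for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def put(self, key):
        """Store key; return the evicted key, or None if nothing was evicted."""
        if key in self.items:
            self.items.move_to_end(key)
            return None
        self.items[key] = True
        if len(self.items) > self.capacity:
            evicted, _ = self.items.popitem(last=False)
            return evicted
        return None


def make_caches(nodes, capacity, preload):
    """Give every node a cache that already holds the key `preload`."""
    caches = {node: LRUCache(capacity) for node in nodes}
    for node in nodes:
        caches[node].put(preload)
    return caches


def deliver(path, key, caches, cache_on_path):
    """Deliver key along path (last element is the requester).

    If cache_on_path is true, every node on the path caches the key;
    otherwise only the requester does.  Returns (node, evicted_key) pairs.
    """
    targets = path if cache_on_path else path[-1:]
    evictions = []
    for node in targets:
        evicted = caches[node].put(key)
        if evicted is not None:
            evictions.append((node, evicted))
    return evictions


path = ["B", "C", "A"]  # data passes through B and C on its way to A

all_cache = deliver(path, "hot", make_caches(path, 1, "rare"), True)
one_cache = deliver(path, "hot", make_caches(path, 1, "rare"), False)
```

In this toy setup a single request for "hot" pushes "rare" out of all
three nodes when every node on the path caches, but only out of A when
just the requester caches.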

> 
> >   The more
> >   nodes that download the data the less chance there is that *all* of
> >   them will be inaccessible to other nodes.
> 
> What you should be saying is "the more nodes that *have* the data...".  Why
> limit it to nodes that initiated new requests for the data?

Because it gives more freedom in how data is requested.  The main problem
with always having to route data through other nodes is that it is
incompatible with finding not-so-popular data sitting only on a few distant
nodes.  I don't quite know how else to explain this.

> 
> >   However, if for some reason nodes A,B,C,D are completely inaccessible to
> >   other nodes, node S will notice that it is getting a large number of
> >   requests from far away and will upload the data to some other nodes
> >   which are closer to the origin of the requests.
> 
> I think you'll find that the algorithms by which S "notices" these request
> patterns either require huge amounts of memory and computation or don't work
> very well...or both.  Ditto for the algorithms for deciding where to
> "upload" the data (a.k.a. replica placement).  As I said, a lot of smart
> people have worked on these problems, but it's remarkably difficult to come
> up with anything better than sheer opportunistic caching like Freenet does.

Could you please give me some references?  I want to avoid mistakes others
have made, but I want to know *why* their methods fail and how.  Only then
can I have any hope of creating anything better.

--- 
http://kevin.atkinson.dhs.org



