I'd like to discuss what the considerations are for
network topology.  The particular topology
I mentioned (which I've since been convinced
isn't really a cube or torus after all) was
designed with the idea that it's important to
be able to reliably query the entire network
without sending any nodes duplicate queries.
I'm not sure how important these considerations
really are, though.  I got the impression that
there are huge numbers of duplicate queries sent to the
same node by multiple paths in the current gnutella network,
but this may be less of a problem than I think it is.  Also,
as the number of nodes in the network becomes large,
it clearly becomes impossible for every query to
reach every node, and besides, this isn't really desirable
if you're getting lots of hits.
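
To put a rough number on the duplicate-query worry, here is a minimal
sketch that floods a query over a toy graph and counts how many copies
arrive at nodes that have already seen it.  Everything in it is an
assumption made for illustration: the `flood` function, the adjacency
dict, and the rule that a node drops a repeat copy and never sends a
query straight back to its sender.  It is not the actual gnutella code.

```python
from collections import deque

def flood(adjacency, origin, ttl):
    """Flood a query from origin for up to ttl hops.

    Returns (nodes_reached, duplicate_deliveries), where nodes_reached
    includes the origin itself.
    """
    seen = {origin}
    duplicates = 0
    # Each queue entry: (node holding the query, who sent it, hops left).
    queue = deque([(origin, None, ttl)])
    while queue:
        node, sender, remaining = queue.popleft()
        if remaining == 0:
            continue  # query has expired at this node
        for neighbor in adjacency[node]:
            if neighbor == sender:
                continue  # don't bounce the query back to its sender
            if neighbor in seen:
                duplicates += 1  # wasted copy; the neighbor drops it
            else:
                seen.add(neighbor)
                queue.append((neighbor, node, remaining - 1))
    return len(seen), duplicates

# A tree has exactly one path to each node, so nothing is wasted.
tree = {0: [1, 2], 1: [0], 2: [0]}
# A fully connected group of four has many paths to each node.
clique = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}

print(flood(tree, 0, 2))    # (3, 0)
print(flood(clique, 0, 2))  # (4, 6)
```

In the fully connected case, six wasted copies are sent to reach three
other nodes, which is the kind of overhead a duplicate-free topology
would be trying to avoid.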

I get the impression that network design generally starts off
with the assumption that you've got data that is intended to
go to a particular place, and a "good" design is one where
you can get your data there in a small number of hops while 
avoiding creating any bottlenecks.  But the criteria for
a p2p file sharing network are very different;  you're not trying
to query any particular node, you just want to query a sufficient
number of nodes such that you find what you're looking for
(assuming it's out there).

The implications I get from this are:
1) if you're looking for something pretty common, there's
really very little point in querying much of the network.
<aside>
I think maybe a query packet could include a "hits so far" stat as
well as a time to live (BTW, does anyone else think "time to live"
should be "time to die"?), and nodes could stop forwarding the query
when the hits pass some threshold.  Of course, you're only seeing hits
along one particular branch at a time, so it may turn out that
you very seldom see enough hits on one branch for this number to
be meaningful.
</aside>
2) If we accept the fact that most queries will only reach a small
piece of the network, then if we want to find something relatively
obscure we should either have some way of designating some queries
as being "special" and needing/deserving wider distribution
(vast potential for abuse here) or we'd like to ensure that
if we requery we will hit a distinct (and ideally disjoint)
subset of the network.
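
The "hits so far" idea from the aside can be sketched roughly as
follows.  This is only an illustration under assumptions of my own: the
`query` function, the `has_file` set, and the `hit_threshold` parameter
are hypothetical names, and I assume a node silently drops a query it
has already seen.  Each forwarded copy carries the count of hits along
its own path only.

```python
from collections import deque

def query(adjacency, has_file, origin, ttl, hit_threshold):
    """Flood a query that carries a hits-so-far counter.

    Returns (nodes_reached, hits): how many nodes saw the query
    (origin included) and which nodes answered it.
    """
    seen = {origin}
    hits = []
    # Packet fields: (receiving node, hops left, hits seen on this path).
    queue = deque([(origin, ttl, 0)])
    while queue:
        node, remaining, hits_so_far = queue.popleft()
        if node in has_file:
            hits.append(node)
            hits_so_far += 1
        # Stop forwarding once the packet expires or this branch has
        # collected enough hits.
        if remaining == 0 or hits_so_far >= hit_threshold:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1, hits_so_far))
    return len(seen), hits

# A single chain: the counter cuts the query off after two hits.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(query(chain, {1, 2, 4}, 0, 5, 2))   # (3, [1, 2])

# Two branches with one hit each: neither branch ever sees two hits,
# so the whole network is queried anyway.
fork = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
print(query(fork, {1, 2}, 0, 5, 2))       # (5, [1, 2])
```

The second case shows the caveat from the aside: two hits exist, but
they sit on different branches, so a per-branch counter never reaches
the threshold.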

So...
What are the other important considerations?
What are the implications for network topology?

George
