On Monday 05 January 2009 14:17, vive wrote:
> On Tue, Dec 23, 2008 at 12:48:04PM +0000, Matthew Toseland wrote:
> > > 2. Data availability: the success rate and the number of steps in
> > > successful routes (here: the roundtrip steps) that is seen in simulation.
> > > 
> > > The simulation: this was made by joining both ideal (generated) social
> > > networks as well as networks from actual datasets on social networks
> > > (the pgp WoT). Both are navigable networks, although slightly differing.
> > > It is hoped that this can correspond roughly to the scenario above, by
> > > joining these networks with different numbers of random connections.
> > > 
> > > Simulation:
> > > 0. Load the two networks, each of size 4000. All nodes are set to be
> > > pure darknet.
> > > 1. Connect the networks by joining pairs of random nodes, where one node
> > > is picked from each network. I tried this with a cut of 10, 100, 1000,
> > > 10000 links.
> > > 2. Nodes randomize their positions and swap for a while to stabilize the
> > > locations (successful swaps get more uncommon after this), to start from
> > > a "stable" state.
> > > 3. 1000 keys are inserted by random nodes into the network (both sink
> > > and cache). Nodes have a store of 50 keys and a cache of 50 keys. Both
> > > are LRU as normal.
> > 
> > Actually we use pseudo-random replacement now. This probably doesn't
> > strongly affect the results.
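
(To illustrate the difference, a quick Python sketch of the two replacement
policies; the names are mine, this is not the actual datastore code:)

import random
from collections import OrderedDict

class LRUStore:
    """Fixed-size store that evicts the least recently used key."""
    def __init__(self, capacity=50):
        self.capacity = capacity
        self.data = OrderedDict()

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)      # evict the oldest entry
        self.data[key] = value

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)         # refresh on hit
            return self.data[key]
        return None

class RandomReplaceStore:
    """Fixed-size store that evicts a (pseudo-)randomly chosen key."""
    def __init__(self, capacity=50):
        self.capacity = capacity
        self.data = {}

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self.data.pop(random.choice(list(self.data)))  # evict at random
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)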
> > 
> > > 4. The main phase: nodes swap positions when they can, and request data
> > > for a longer interval. (At the moment there are no new inserts and no
> > > churn; this could be the next thing to simulate, but it needs data
> > > input, see below.) The goal is to see what the data availability is, and
> > > how the positions have been affected after some time. Nodes request data
> > > via keys of differing popularity: a standard Zipf distribution is used
> > > over the keys. The Zipf parameter is also varied in the simulations. I
> > > have typically let the 4000 nodes do 100000 requests in total and
> > > evaluated how that went. It could clearly be evaluated over a longer
> > > time, but how long is reasonable?
> > > 5. Evaluate the two measures above (variation distance always starts
> > > close to 0 when one initially randomizes).
> > > 
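
For concreteness, here is roughly how I read the procedure above, as a Python
sketch. Everything not quoted above is my own simplification: swapping is left
out, store and cache are folded into one set of key locations per node, the
HTL value is my choice, and the routing is a plain greedy walk rather than
Fred's, so treat it as an outline only.

import random

N_KEYS     = 1000       # keys inserted (from the post)
N_REQUESTS = 100000     # total requests in the main phase (from the post)
ZIPF_ALPHA = 1.0        # popularity skew; varied in the simulations
HTL        = 20         # max hops per request -- my choice, not from the post

def circ_dist(a, b):
    """Distance between two locations on the unit circle."""
    d = abs(a - b)
    return min(d, 1.0 - d)

def join_networks(graph, half_a, half_b, cut):
    """Step 1: add `cut` random cross links between the two darknets.
    `graph` holds both networks as one adjacency dict (node -> set of
    neighbours); half_a / half_b list the nodes of each original network."""
    for _ in range(cut):
        u, v = random.choice(half_a), random.choice(half_b)
        graph[u].add(v)
        graph[v].add(u)

def greedy_route(graph, loc, stores, start, key_loc, htl):
    """Crude stand-in for the real routing: walk greedily towards the key
    location, give up at a local minimum or when htl runs out."""
    current = start
    for hops in range(htl):
        if key_loc in stores[current]:
            return hops
        nxt = min(graph[current], key=lambda n: circ_dist(loc[n], key_loc))
        if circ_dist(loc[nxt], key_loc) >= circ_dist(loc[current], key_loc):
            return None
        current = nxt
    return None

def request_phase(graph, loc, stores, nodes, key_locs):
    """Step 4: Zipf-distributed requests; returns success rate and mean hops."""
    weights = [1.0 / (k + 1) ** ZIPF_ALPHA for k in range(N_KEYS)]
    successes, hops_total = 0, 0
    for _ in range(N_REQUESTS):
        requester = random.choice(nodes)
        key_loc = key_locs[random.choices(range(N_KEYS), weights=weights)[0]]
        hops = greedy_route(graph, loc, stores, requester, key_loc, HTL)
        if hops is not None:
            successes += 1
            hops_total += hops
    return successes / N_REQUESTS, hops_total / max(successes, 1)

A run would then be: load the two graphs into one adjacency dict, call
join_networks() with the chosen cut, run an insert phase that fills the
per-node key sets, and call request_phase() for the numbers being plotted.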
> > The real-world case most interesting to me is that of two darknets slowly
> > merging. Does this result in a severe drop in performance in the short
> > run? Hence, an important measurement is the success rate for requests made
> > on the same side of the border as the data was inserted on. Is this
> > dramatically lower with a smaller cut? How does it compare to that darknet
> > running on its own?
> 
> Hi!
> 
> I did some simulations of this. See the attached plots. The model here is as
> described previously, but it tracks data and keys with respect to which part
> of the network they were originally inserted into. Routes where nodes query
> for keys that were originally inserted in the same network part are labelled
> "short routes"; routes where they search for keys inserted in the other part
> are labelled "long routes".

This is great. It demonstrates with a reasonable degree of confidence that two
darknets approaching each other and slowly merging will not have their local
routing trashed, so we do not need any special handling for separate
subnetworks. And, provided we assume the current model of simply doing lots of
requests, the most popular data from the other side will generally become
available eventually, and this improves dramatically as more links are
established.
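
For anyone wanting to reproduce the short/long split, this is the bookkeeping
I understand it to mean, roughly (the names are mine):

# Tag every key with the half of the network its inserter belonged to, then
# classify each request as a "short route" (same half) or a "long route".
def split_success_rates(requests, key_origin, node_half):
    """
    requests:   list of (requesting_node, key, succeeded) tuples
    key_origin: key -> 'A' or 'B', the half where it was originally inserted
    node_half:  node -> 'A' or 'B'
    """
    stats = {'short': [0, 0], 'long': [0, 0]}   # [successes, attempts]
    for node, key, ok in requests:
        kind = 'short' if node_half[node] == key_origin[key] else 'long'
        stats[kind][1] += 1
        if ok:
            stats[kind][0] += 1
    return {k: (s / a if a else 0.0) for k, (s, a) in stats.items()}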
> 
> The impact of this varies depending on the size of the cut and how popular
> certain keys are. The plots illustrate both these dependencies.
> 
> The simulations show at least two things regarding the previous fear. First,
> connecting two darknets loosely does not seem to break routing that much
> within a darknet, but it does shield the nodes from data inserted in the
> other network. (And the data is at least slowly made available again. This
> is not very surprising, but here are the simulations showing some impact.)
> Second, real-world networks seem to be slightly better (than fair models
> where all nodes get the same number of connections) at making data available
> via long routes. This is seen by comparing the two plots where the Zipf
> parameter is varied.
> The last effect is probably due to a different degree distribution than in
> the ideal model (a few high-degree nodes naturally see more queries passing
> through, quickly making remote data available for more nodes in their own
> network).

It seems to be a lot more than slightly better, if I'm reading the plots
right. A small world with fixed degree seems to be a lot slower than the
somewhat-scale-free small-world networks that real-world data tends to
produce. But it's a tradeoff between performance and vulnerability ...
ignoring the question of the capacity of well-connected nodes...
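
One quick way to make the degree-concentration point concrete (my own
throwaway check, not vive's code): measure what share of the edge endpoints
belongs to the highest-degree nodes in each graph.

def edge_share_of_top(graph, top_fraction=0.01):
    """Fraction of all edge endpoints attached to the top `top_fraction`
    highest-degree nodes; graph is an adjacency dict (node -> neighbours)."""
    degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
    top = max(1, int(len(degrees) * top_fraction))
    return sum(degrees[:top]) / sum(degrees)

In a fixed-degree small world the top 1% of nodes hold exactly 1% of the
endpoints; in a heavy-tailed graph like the pgp WoT the share is much larger,
which is one way of seeing why a few hubs pull remote data into their own
side so quickly.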