On Thursday 02 April 2009 23:29:06 David Cabanillas wrote:
> > Yes. Simulations of load management would be very welcome. You should
> > investigate the existing simulators (particularly the event-based ones),
> > which are in the SVN repository; I am hopeful that they can be adapted
> > without massive effort. You will need to more or less reproduce the
> > current load management system (AIMDs on the originator based on timeouts
> > and rejections, plus backoff to deal with individual slow peers,
> > documented on the wiki), and then simulate various proposals from the
> > mailing lists for new systems.
> 
> Reviewing the SVN repository I found
> http://freenet.googlecode.com/svn/trunk/apps/load-balancing-sims/ from
> phase1 to phase7. Are you talking about these simulations?
> 
> 
>    - Simulations from phase1 to phase4 only work with 3 nodes; phase6 and
>    phase7 at least work with 100 nodes.
>    - The phase4 simulation introduces the AIMD algorithm (Peer.java).

On a per-link level, i.e. congestion control. Not on a global load limiting 
level. We use both: nodes use AIMD to decide how much data to send to their 
peers (estimating the capacity of the connection, like TCP does), and they 
also use a separate AIMD to estimate the capacity of the entire network, and 
thus determine how fast to start their own queued requests.

>    - I have not observed the backoff process in any of the simulations.
>    Can we assume that backoff is useful, or is it necessary to verify by
>    means of simulations that it is?

My recollection is these simulations are fairly low level, mostly used to 
validate the node to node congestion control algorithms. IMHO the real 
challenge is in load management, which is about how load is managed at the 
level of the network as a whole. Currently, the main mechanisms are:

Preemptive request rejection:

Nodes estimate their capacity, based mostly on bandwidth, and reject any 
requests they don't think they have the ability to service. 

Purpose: Don't accept so many requests that they end up timing out because we 
can't complete them all in a reasonable time. 

Code: NodeStats.java, shouldRejectRequest, output bandwidth liability is the 
main mechanism here.
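
A minimal sketch of the bandwidth-liability idea (the names, numbers and
signature here are illustrative, not the real NodeStats.shouldRejectRequest,
which takes many more inputs into account):

```java
// Rough sketch of output-bandwidth-liability rejection (hypothetical names,
// not the actual NodeStats code). Accept a request only if all requests
// already in flight, plus this one, could plausibly be sent out before a
// deadline, given our output bandwidth.
class LiabilityCheck {
    static boolean shouldReject(int requestsInFlight, int bytesPerRequest,
                                double outputBytesPerSec, double deadlineSec) {
        // Worst-case bytes we are already committed to sending, plus this request.
        long liability = (long) (requestsInFlight + 1) * bytesPerRequest;
        // Reject if we could not push that much data out within the deadline.
        return liability > outputBytesPerSec * deadlineSec;
    }
}
```

With illustrative numbers (32KB requests, 16KB/sec output, 90 second
deadline), such a node would accept roughly 45 concurrent requests and
preemptively reject the rest.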

Backoff: 
When a request times out, or a node preemptively rejects a request to prevent 
itself getting overloaded, we back off from that node for a period. The 
backoff length starts at 1 second and is doubled on each further timeout or 
rejection, up to a maximum of 3 hours; it is reset to 1 second when a request 
completes without a timeout or a rejection. The actual backoff time is 
randomized between 0 and the current length. There is a separate backoff for 
transfer timeouts, which starts at 30 seconds. When a node is backed off, we 
only route to it as a last resort; we route to non-backed-off nodes first.
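
The backoff rule above can be sketched as follows (illustrative names, not 
the actual node code):

```java
import java.util.Random;

// Sketch of randomized exponential backoff as described above: start at 1
// second, double on each timeout or rejection up to 3 hours, reset on a
// clean completion; the actual wait is uniform in [0, current length).
class Backoff {
    static final long INITIAL_MS = 1_000L;
    static final long MAX_MS = 3L * 60 * 60 * 1_000;  // 3 hours

    private long lengthMs = INITIAL_MS;
    private final Random random = new Random();

    void onTimeoutOrRejection() {
        lengthMs = Math.min(MAX_MS, lengthMs * 2);    // double, capped
    }

    void onSuccess() {
        lengthMs = INITIAL_MS;                        // reset to 1 second
    }

    /** How long to avoid this peer, randomized between 0 and the current length. */
    long nextBackoffMs() {
        return (long) (random.nextDouble() * lengthMs);
    }

    long currentLengthMs() { return lengthMs; }
}
```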

Purpose: Avoid sending requests to nodes that are overloaded or very slow. 
Interacts with the next mechanism to prevent slow nodes from dragging down 
the overall capacity of the network.

AIMD load limiting:
A Freenet node has an AIMD used to estimate the total capacity of the network, 
based on round trip times for each type of request, and on how many requests 
time out or are rejected (by any node on the request chain, because we forward 
the rejections back to the source, flagged as non-local so they don't break 
the request).

Purpose: Propagate load back to the original requestor and get it to slow down.
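
The global AIMD can be sketched roughly like this (illustrative, not 
Freenet's actual code): the window estimates how many of our own requests the 
network can handle at once, and gates how fast we start queued requests:

```java
// Sketch of AIMD-based load limiting (hypothetical names): additive increase
// on success, multiplicative decrease on timeout/rejection. We start a queued
// request only while fewer than floor(window) of our requests are in flight.
class RequestStarter {
    private double window = 1.0;   // AIMD estimate of network capacity
    private int inFlight = 0;      // our own requests currently running

    boolean tryStart() {
        if (inFlight < (int) window) { inFlight++; return true; }
        return false;              // leave it queued; the network looks saturated
    }

    void onCompleted() {           // additive increase: probe for more capacity
        inFlight--;
        window += 1.0;
    }

    void onTimeoutOrRejection() {  // multiplicative decrease: back off quickly
        inFlight--;
        window = Math.max(1.0, window / 2.0);
    }
}
```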

As noted above, these simulations are rather low level, more oriented to 
simulating the exchange of packets between nodes: congestion control, not 
load management. There were some more recent simulations that might be more 
suitable, although you might want to try both; having said that, there may be 
easy ways to strip out unnecessary detail, e.g. by treating data transfers as 
single 32KB packets. 


It should be possible to adapt the low-level simulations to model load 
management; it might simulate too much of the low level detail and thus not 
be fast enough, but alternatively it may be that the low level detail matters 
more than I expect... The other option is simsalabim, a recent simulator by 
vive. In any case, we need a simulator to model the load on and limited 
capacity of each node on the network; bandwidth is generally the scarce 
resource, so focus on that. As a first and very useful milestone, we need 
simulations of the above mechanisms, to sanity check the simulations and the 
systems we have deployed.

Lots more things that can be simulated (but I'm not saying you must implement 
all of these!):
- Turning backoff on and off, given different distributions of node 
capacities: how much difference does it make?
- Turtles. This was implemented recently: when a request takes over 1 minute 
to download the data, the node closest to the data cancels the request 
downstream, downloads the data for itself, and then offers it to the 
downstream nodes. In practice it seems to have increased bandwidth usage 
considerably on nodes with fast connections.
- Bulk vs realtime requests flag. It would be useful to simulate this before 
we implement it.
- Various proposed new load management schemes such as token passing. The 
basic principle of token passing is that we tell our peers when we have the 
capacity for new requests, and then accept some of their requests; when we 
have accepted the requests, they are queued until we can either serve them 
from the datastore or from the results of another request, or we can forward 
them to one of our peers that says it can accept some requests. As I have 
mentioned there are many details that can be tweaked...
- Whether queueing requests (probably only bulk requests) is useful in 
general.
- Attack simulations: how do different schemes react to flooding, for example?
- Opennet and connection churn: Simulate how new connections are added on 
opennet, how nodes are bootstrapped onto the network, evaluate the effects of 
nodes joining and then leaving, or of nodes having low uptimes. (I think the 
existing simulations are mostly darknet).
- Shared datastore Bloom filters and encrypted tunnels: It would be 
interesting to simulate these two 0.9 features, although we will implement 
them regardless. The former allows darknet nodes (and possibly opennet nodes) 
to know what is in their peers' datastores, the latter sends out multiple 
(partially) random routed anchors which rendezvous and create a tunnel so 
that requests don't start close to the node starting them. We are hoping that 
these two features will more or less balance out, but we have no idea without 
simulations. Having said that, these are probably very dependent on file 
popularity patterns (i.e. the Zipf parameter).
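
The token passing principle described in the list above could be sketched as 
follows (a hypothetical variant; as mentioned, many details can be tweaked):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of one possible token passing variant (hypothetical, not an
// implemented scheme): we grant tokens to a peer when we have spare
// capacity; the peer spends one token per request it forwards to us, and
// requests arriving without a token are queued until capacity frees up.
class TokenPeer {
    private int tokens = 0;                        // tokens granted to this peer
    private final Queue<String> queued = new ArrayDeque<>();

    /** Called when we have spare capacity: tell the peer it may send more. */
    void grantTokens(int n) { tokens += n; }

    /** Peer forwards a request; accept immediately only if it holds a token. */
    boolean offerRequest(String requestId) {
        if (tokens > 0) { tokens--; return true; } // accepted for service
        queued.add(requestId);                     // held until we can forward it
        return false;
    }

    /** When capacity frees up, serve the oldest queued request, if any. */
    String dequeue() { return queued.poll(); }
}
```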

I don't expect you to implement everything I've mentioned above! Getting a 
good simulation of the current model working is essential, particularly as it 
relates to load management. I'm not sure exactly how far load-balancing-sims 
went, mailing list traffic suggests phase 7 implements most of the current 
model, but I think preemptive rejection is missing, and IMHO that's a 
critical component of the current architecture. One thing to note is that 
vive has tried out some interesting routing changes without copying the code 
first, so you might need to turn them off. Simulating token passing would 
then be the next big thing, although as I mentioned there are a great many 
variations, most of which are in the mailing list archives; we would need to 
dig through the archives to find them.

I haven't personally written any simulations, the people to talk to if you 
need to discuss things in detail are:
- vive at freenetproject.org, who has done most recent work
- mrogers at freenetproject.org, who wrote the load balancing simulations 
originally
> 
> The proposals for extending the simulations are as follows:
> 
>    - To extend the simulation phases to large scale.
>    - To compare simulations with and without backoff.

See here for mrogers' results from phase7 (these don't include preemptive 
rejection):
http://archives.freenetproject.org/message/20061122.001144.52dbb09d.en.html

>    - To apply peer-to-peer token buckets to track fairness in accepting
>    requests from different peers.

One more issue: Have you contributed any code to Freenet? What is your SVN 
username if so? It is our policy that nobody will be accepted for a SoC 
project unless they prove their coding ability first.