On Thursday 02 April 2009 23:29:06 David Cabanillas wrote:
> > Yes. Simulations of load management would be very welcome. You should
> > investigate the existing simulators (particularly the event-based ones),
> > which are in the SVN repository; I am hopeful that they can be adapted
> > without massive effort. You will need to more or less reproduce the
> > current load management system (AIMDs on the originator based on
> > timeouts and rejections, plus backoff to deal with individual slow
> > peers, documented on the wiki), and then simulate various proposals
> > from the mailing lists for new systems.
>
> Reviewing the SVN repository I found
> http://freenet.googlecode.com/svn/trunk/apps/load-balancing-sims/ from
> phase1 to phase7. Are you talking about these simulations?
>
> - Simulations from phase1 to phase4 only work with 3 nodes. In phases 6
>   and 7 at least they work with 100 nodes.
> - Simulation 4 introduces the AIMD algorithm (Peer.java).
On a per-link level, i.e. congestion control, not on a global load
limiting level. We use both: nodes use an AIMD to decide how much data
to send to each of their peers (estimating the capacity of the
connection, as TCP does), and they also use a separate AIMD to estimate
the capacity of the entire network, and thus to determine how fast to
start their own queued requests.

> - In all the simulations I have observed the backoff process, but can
>   we take it for granted that backoff is useful, or is it necessary to
>   confirm by means of simulations that it is useful?

My recollection is that these simulations are fairly low level, mostly
used to validate the node-to-node congestion control algorithms. IMHO
the real challenge is load management, which is about how load is
managed at the level of the network as a whole. Currently, the main
mechanisms are:

Preemptive request rejection: Nodes estimate their capacity, based
mostly on bandwidth, and reject any requests they don't think they have
the ability to service. Purpose: don't accept so many requests that they
end up timing out because we can't complete them all in a reasonable
time. Code: NodeStats.java, shouldRejectRequest; output bandwidth
liability is the main mechanism here.

Backoff: When a request times out, or a node rejects a request
preemptively to prevent itself getting overloaded, we back off from that
node for a period. The backoff ceiling starts at 1 second and is doubled
every time we get another timeout or rejection, to a maximum of 3 hours;
it is reset back to 1 second when a request completes without a timeout
or a rejection. The actual backoff time is randomized between 0 and the
current ceiling. There is also a separate backoff for transfer timeouts,
which starts at 30 seconds. When a node is backed off, we only route to
it as a last resort; we route to non-backed-off nodes first. Purpose:
avoid sending requests to nodes that are overloaded or very slow.
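In case it helps to make the first mechanism concrete, here is a minimal
sketch of output-bandwidth-liability-style rejection. All class, method
and field names here are illustrative, not the actual NodeStats.java
logic, and the real estimate involves more inputs than this:

```java
// Illustrative sketch of preemptive rejection via output bandwidth
// liability: reject a request if the data we have already committed to
// send, plus this request, could not be sent within the timeout.
public class BandwidthLiability {
    private final double outputBytesPerSecond; // estimated upstream capacity
    private final double timeoutSeconds;       // time a request is allowed to take
    private double committedBytes = 0;         // bytes promised to in-flight requests

    public BandwidthLiability(double outputBytesPerSecond, double timeoutSeconds) {
        this.outputBytesPerSecond = outputBytesPerSecond;
        this.timeoutSeconds = timeoutSeconds;
    }

    /** Accept only if all committed data could still be sent within the
     *  timeout; otherwise reject preemptively (before doing any work). */
    public boolean shouldAccept(double requestBytes) {
        double liability = committedBytes + requestBytes;
        if (liability > outputBytesPerSecond * timeoutSeconds) return false;
        committedBytes = liability;
        return true;
    }

    /** The request finished (or failed), releasing its share. */
    public void requestCompleted(double requestBytes) {
        committedBytes -= requestBytes;
    }
}
```

For example, a node estimating 10,000 bytes/sec of spare output with a
10-second timeout would accept three 32KB requests and preemptively
reject a fourth until one of them completes.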
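The backoff schedule just described might be sketched like this. Again
the names are illustrative rather than the real node code, and I am
assuming the ceiling doubles after each failure; the wiki documentation
is authoritative on the details:

```java
import java.util.Random;

// Illustrative sketch of per-peer routing backoff: ceiling starts at
// 1s, doubles on each timeout or preemptive rejection up to 3 hours,
// resets to 1s on a clean completion; the backoff actually applied is
// uniformly random between 0 and the current ceiling.
public class PeerBackoff {
    private static final long INITIAL_MS = 1000L;            // 1 second
    private static final long MAX_MS = 3L * 60 * 60 * 1000;  // 3 hours
    private long ceilingMs = INITIAL_MS;
    private long backedOffUntil = 0;
    private final Random random = new Random();

    /** A timeout, or a preemptive rejection, from this peer. */
    public void onFailure(long nowMs) {
        long actual = (long) (random.nextDouble() * ceilingMs); // randomized
        backedOffUntil = nowMs + actual;
        ceilingMs = Math.min(ceilingMs * 2, MAX_MS);
    }

    /** A request completed with no timeout and no rejection. */
    public void onSuccess() {
        ceilingMs = INITIAL_MS;
    }

    /** Backed-off peers are only routed to as a last resort. */
    public boolean isBackedOff(long nowMs) {
        return nowMs < backedOffUntil;
    }

    public long getCeilingMs() { return ceilingMs; }
}
```

A simulation would keep one of these per peer (plus a second instance
with a 30-second initial ceiling for transfer timeouts) and prefer
non-backed-off peers when routing.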
Backoff interacts with the next mechanism to prevent slow nodes from
dragging down the overall capacity of the network.

AIMD load limiting: A Freenet node has an AIMD used to estimate the
total capacity of the network, based on round trip times for each type
of request, and on how many requests time out or are rejected (by any
node on the request chain, because we forward the rejections back to the
source, flagged as non-local so they don't break the request). Purpose:
propagate load back to the original requestor and get him to slow down.

If I remember correctly, these simulations are rather low level, more
oriented to simulating the exchange of packets between nodes: congestion
control, not load management. There were some other simulations more
recently that might be more suitable, although you might want to try
both; having said that, there may be easy ways to strip out unnecessary
detail, e.g. by treating data transfers as single 32KB packets. It
should be possible to adapt the low-level simulations to model load
management; they might simulate too much low-level detail and thus not
be fast enough, but alternatively it may be that the low-level detail
matters more than I expect... The other option is simsalabim, a recent
simulator by vive.

In any case, we need a simulator that models the load on, and the
limited capacity of, each node on the network; bandwidth is generally
the scarce resource, so focus on that. As a first and very useful
milestone, we need simulations of the above mechanisms, to sanity check
both the simulations and the systems we have deployed.

Lots more things could be simulated (but I'm not saying you must
implement all of these!):

- Turning backoff on and off, given different distributions of node
  capacities: how much difference does it make?

- Turtles.
  This was implemented recently: when a request takes over 1 minute to
  download the data, the node closest to it will cancel the request
  downstream, download the data for itself, and then offer it to the
  downstream nodes. In practice it seems to have increased bandwidth
  usage considerably on nodes with fast connections.

- Bulk vs realtime request flag. It would be useful to simulate this
  before we implement it.

- Various proposed new load management schemes, such as token passing.
  The basic principle of token passing is that we tell our peers when we
  have the capacity for new requests, and then accept some of their
  requests; the requests we have accepted are queued until we can either
  serve them from the datastore or from the results of another request,
  or forward them to one of our peers that says it can accept some
  requests. As I have mentioned, there are many details that can be
  tweaked...

- Whether queueing requests (probably only bulk requests) is useful in
  general.

- Attack simulations: how do the different schemes react to flooding,
  for example?

- Opennet and connection churn: simulate how new connections are added
  on opennet, how nodes are bootstrapped onto the network, and the
  effects of nodes joining and then leaving, or of nodes having low
  uptimes. (I think the existing simulations are mostly darknet.)

- Shared datastore Bloom filters and encrypted tunnels: it would be
  interesting to simulate these two 0.9 features, although we will
  implement them regardless. The former allows darknet nodes (and
  possibly opennet nodes) to know what is in their peers' datastores;
  the latter sends out multiple (partially) random-routed anchors which
  rendezvous and create a tunnel, so that requests don't start close to
  the node starting them. We are hoping that these two features will
  more or less balance out, but we have no idea without simulations.
  Having said that, these are probably very dependent on file popularity
  patterns (i.e. the Zipf parameter).
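To make the AIMD load limiting described earlier concrete: the
originator keeps a "window" of requests it may have in flight, grows it
additively while requests complete cleanly, and halves it on a timeout
or a forwarded rejection. This is only a sketch; the names and the exact
increase rule are illustrative, not the node's actual algorithm:

```java
// Illustrative AIMD request window on the originator: additive
// increase on clean completion, multiplicative decrease when any node
// on the chain times out or rejects (feedback forwarded to the source).
public class RequestWindow {
    private double window = 1.0; // requests we may have in flight
    private int inFlight = 0;

    /** Start a queued request if the window allows it. */
    public boolean tryStartRequest() {
        if (inFlight >= (int) window) return false; // stays queued
        inFlight++;
        return true;
    }

    /** Completed with no timeout or rejection anywhere on the chain:
     *  additive increase (roughly +1 per window of successes). */
    public void onSuccess() {
        inFlight--;
        window += 1.0 / window;
    }

    /** A timeout, or a non-local rejection forwarded back to us:
     *  multiplicative decrease, as in TCP congestion control. */
    public void onFailureFeedback() {
        inFlight--;
        window = Math.max(1.0, window / 2);
    }

    public double getWindow() { return window; }
}
```

A load management simulation would drive one of these per node from its
queue of local requests, alongside the per-link congestion control AIMD.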
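And a minimal sketch of the token passing idea, seen from one node's
relationship with one downstream peer. Token passing was only ever a
proposal, so everything here (names, the queueing policy) is an
assumption about one possible variant, not deployed code:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative token passing: a peer grants us tokens when it has
// spare capacity; requests we have accepted are queued and only
// forwarded to the peer when we hold a token for it.
public class TokenPeer {
    private int tokens = 0;                                  // grants from this peer
    private final Queue<String> queued = new ArrayDeque<>(); // pending request ids

    /** The peer tells us it can take n more requests. */
    public void grantTokens(int n) {
        tokens += n;
        drain();
    }

    /** We accepted a request (couldn't serve it locally); queue it. */
    public void enqueue(String requestId) {
        queued.add(requestId);
        drain();
    }

    /** Forward queued requests while we hold tokens. */
    private void drain() {
        while (tokens > 0 && !queued.isEmpty()) {
            forward(queued.remove());
            tokens--;
        }
    }

    protected void forward(String requestId) {
        // In a real implementation this would send the request onward.
    }

    public int queuedCount() { return queued.size(); }
    public int tokenCount() { return tokens; }
}
```

The interesting simulation questions are then how many tokens to grant,
to which peers, and how long requests sit queued under load.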
I don't expect you to implement everything I've mentioned above!
Getting a good simulation of the current model working is essential,
particularly as it relates to load management. I'm not sure exactly how
far load-balancing-sims went; mailing list traffic suggests phase 7
implements most of the current model, but I think preemptive rejection
is missing, and IMHO that's a critical component of the current
architecture. One thing to note is that vive has tried out some
interesting routing changes without copying the code first, so you might
need to turn them off.

Simulating token passing would then be the next big thing, although as I
mentioned there are a great many variations, most of which are in the
mailing list archives; we would need to dig through the archives to find
them.

I haven't personally written any simulations; the people to talk to if
you need to discuss things in detail are:

- vive at freenetproject.org, who has done most of the recent work
- mrogers at freenetproject.org, who wrote the load balancing
  simulations originally

> The proposals to extend the simulations are as follows:
>
> - To extend the simulation phases to a large scale.
> - To compare simulations with and without backoff.

See here for mrogers' results from phase7 (these don't include
preemptive rejection):
http://archives.freenetproject.org/message/20061122.001144.52dbb09d.en.html

> - To apply peer-peer token buckets to track fairness between accepting
>   requests from different peers.

One more issue: have you contributed any code to Freenet? What is your
SVN username, if so? It is our policy that nobody will be accepted for a
SoC project unless they prove their coding ability first.