On Mar 24, 2010, at 2:24 PM, Matthew Toseland wrote:

> Currently our load management system works like this:
> - At the level of any node, a request will be preferentially routed
> to the best node by location (or FOAF location), but if many nodes
> are backed off, we can route to the worst node by location. In
> other words there is no limit whatever on misrouting.
> - At the level of the node originating requests, we attempt to
> estimate the capacity of the network in a manner similar to TCP
> (although more vague, as we operate on rates rather than actual
> windows).
>
> Hence we do not limit misrouting, but we do limit load. And
> limiting load on the sender side is very expensive and inflexible,
> costing us a great deal of performance.
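(To make the rule quoted above concrete, here is a minimal Java
sketch: prefer the closest non-backed-off peer by location, and note
that with many peers backed off the closest remaining peer can be
arbitrarily far from the key. The class and field names are made up
for illustration; this is not Freenet's actual code.)

    import java.util.Comparator;
    import java.util.List;

    class RoutingSketch {
        static Peer route(double keyLocation, List<Peer> peers) {
            Comparator<Peer> byDistance = Comparator.comparingDouble(
                    p -> distance(p.location, keyLocation));
            // First choice: the closest peer that is not backed off.
            // With many peers backed off, this may already be the
            // worst peer by location -- unlimited misrouting.
            return peers.stream()
                    .filter(p -> !p.backedOff)
                    .min(byDistance)
                    // If every peer is backed off, route anyway.
                    .orElseGet(() -> peers.stream()
                            .min(byDistance)
                            .orElse(null));
        }

        // Circular distance on Freenet's 0.0-1.0 location ring.
        static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        static class Peer {
            double location;
            boolean backedOff;
        }
    }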
True, yet from a pragmatic perspective... is there really a way
around this? Network theory: you can control what you send but not
what you receive.

> Recently the CCN paper, and since long ago various people on Frost,
> have pointed out that we do not in fact need to limit load. We
> don't need to limit it at the sender side, and arguably we don't
> need to do more at the any-node level than basic limiting of the
> number of requests in flight. This is because every time data is
> transferred it is the result of a request, so congestion in the
> sense that TCP/IP deals with cannot really occur.

That is fascinating; I'll have to think about that. If the node were
always busy, I guess it could not be any worse than at present.

> Clearly we do need to limit the number of requests which are
> pending on any given node. If we don't, we will end up sharing a
> finite and often very small resource (the output bandwidth limit)
> among an unbounded and potentially very large number of requests.

And in order to limit the number of requests in flight, we need to
limit the number of requests which the node accepts. Isn't this
already the case?

> Also it would be good to directly limit misrouting by refusing to
> send requests to a node that is too far from the ideal for the key
> we are requesting. However, some nodes will always be very slow,
> and some nodes will be temporarily very slow. IMHO these are two
> different issues: nodes which are always very slow (e.g. due to
> being on dialup) should publish a very small capacity and be used
> occasionally when they can be used, and the fact that we can't use
> the very slow node for most requests that would in theory match it
> should not be a big concern with regards to misrouting. Whereas
> nodes which are temporarily very slow (e.g. temporarily suffering
> under massive CPU load) should just be ignored - they reject
> requests or time out, and we can back off. Hence backoff should
> only happen in critical cases (e.g. high ping time), and most of
> the time load is limited by the published request limit, which
> takes over from output bandwidth liability preemptive rejection.

I think this is the true problem: slow nodes... heterogeneous
bandwidth limitations... possibly some unrecognized (unmeasured?)
machine-performance issues.

I would like to resubmit my previous suggestion: a "pre-request"
which calls for a node to make a very firm estimate as to the amount
of time it would take to deliver a given key. If a follow-up "actual"
request exceeds the estimate (or is remotely cancelled), the upstream
node is penalized (via backoff), thus temporarily removing it from a
healthy network. (A rough sketch in code follows below.) Plus, it
closely mirrors *real-life*: your boss says "when can you get this
[very important] report to me?", and you can answer "given the amount
of other bulk & important requests, and knowing how fast I can walk
to your desk... 3 minutes". For security we may need to turn a
fraction of these request-estimates into actual (bulk) requests.

> Even with these precautions we will need a heuristic for the degree
> of misrouting we will tolerate. Options:
> - Always route to the ideal node, subject to the above limitations.
> Arguably this is best, provided we can exclude outliers in a
> satisfactory way.
> - Allow routing to the top two or three nodes.
> - Take a proportion of the routing table (e.g. top 25%).

In my idea, if the node receives a request (or a pre-request), we
could query an arbitrary number of nodes and take the most favored
one.
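To make the pre-request concrete, here is a rough Java sketch of what
the answering node might do. All the names are hypothetical, and the
simple "drain the queue linearly at published capacity" model is my
assumption, not anything implemented:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class PreRequestSketch {
        // Outstanding promises: key -> promised delivery deadline
        // (milliseconds since the epoch).
        private final Map<String, Long> promises =
                new ConcurrentHashMap<>();
        private long queuedBytes = 0;      // bytes already committed
        private final long bytesPerSecond; // published capacity

        PreRequestSketch(long bytesPerSecond) {
            this.bytesPerSecond = bytesPerSecond;
        }

        // Answer a pre-request: a firm deadline for delivering this
        // key, assuming everything already queued drains first.
        synchronized long estimateDeadline(String key, long sizeBytes) {
            long deadline = System.currentTimeMillis()
                    + 1000L * (queuedBytes + sizeBytes) / bytesPerSecond;
            promises.put(key, deadline);
            return deadline;
        }

        // The follow-up "actual" request arrives: commit the bytes.
        synchronized void accept(String key, long sizeBytes) {
            queuedBytes += sizeBytes;
        }

        // On completion or remote cancel: did we break our promise?
        // If true, the downstream node penalizes us via backoff.
        synchronized boolean overran(String key, long sizeBytes) {
            queuedBytes -= sizeBytes;
            Long deadline = promises.remove(key);
            return deadline != null
                    && System.currentTimeMillis() > deadline;
        }
    }

Note that the offence here is overrunning the estimate, not being
slow, so a chronically slow node simply publishes long estimates
instead of being backed off.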
For bulk transfers we could take a measure of something other than
the bottom-line time estimate, such as largest-%-capacity or
largest-%-uptime of all links between the requestor and the datum.

> [...snip...]
>
> 6. Realtime vs bulk flag
>
> Add an extra flag to requests. If the flag is set to BULK, we
> proceed as before, with queueing and a relatively long timeout. If
> the flag is set to REALTIME, we need an instant decision, even if
> it means an instant failure. So we try to route to the top nodes -
> a slightly more generous heuristic than with bulk - and if we
> can't, we fail as if we had timed out. Another way to implement
> this would be to have a dedicated quota within the per-node limit
> for realtime requests. A third option would be to allow realtime
> requests to be queued but for a much shorter period, and use stats
> to estimate whether that will be exceeded and if so fail
> immediately.

Another idea... three classifications: realtime, fast, and bulk.
Realtime requests always occur at the full link speed between nodes
and have an embedded "deadline" timestamp (which could be increased
at the origin until it succeeds). A node instantly rejects such a
request if it would make any other transfer go over its time budget;
otherwise, all other transfers are (very temporarily) suspended for
the [one] realtime transfer.

Hmm... on second thought, you would still need some kind of queuing
or token system... or a realtime response could include the narrowest
pipe, and a node could cancel any "slower" realtime requests.
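Roughly, that admission test might look like the following Java
sketch (hypothetical names again, and a crude full-link-speed model;
not a worked-out design):

    import java.util.List;

    class RealtimeAdmissionSketch {
        // Accept a realtime request only if (a) it can meet its own
        // embedded deadline at full link speed, and (b) suspending
        // every other transfer for that long still leaves each of
        // them inside its own time budget.
        static boolean admit(long sizeBytes, long deadlineMillis,
                             long linkBytesPerSecond,
                             List<Transfer> accepted) {
            long now = System.currentTimeMillis();
            long transferMillis =
                    1000L * sizeBytes / linkBytesPerSecond;
            if (now + transferMillis > deadlineMillis)
                return false; // would miss its own deadline: reject
            for (Transfer t : accepted) {
                // The suspension delays everything else by
                // transferMillis.
                if (t.expectedFinishMillis + transferMillis
                        > t.deadlineMillis)
                    return false; // pushes this one over budget
            }
            return true;
        }

        static class Transfer {
            long expectedFinishMillis; // estimate, no suspension
            long deadlineMillis;       // deadline from the origin
        }
    }

--
Robert Hailey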
