OK, let's try it - but unless there is an obvious, marked improvement,
we must *remove* it (rather than just continuing to add complexity).

Ian.

On 10 Aug 2006, at 14:50, Matthew Toseland wrote:

> On Thu, Aug 10, 2006 at 11:43:32AM -0700, Ian Clarke wrote:
>> Why use a median, rather than a mean?
>>
>> I'm concerned about this: it increases (yet again) the complexity of
>> our load balancing, it hasn't been simulated, and I think there will
>> inevitably be unpredictable results.  For example, what happens if
>> *all* nodes have an AcceptedFrac below the median?  This mechanism
>> will just spiral down until all nodes are hardly accepting anything.
>
> The current mechanism is spiraling down, and nodes are accepting hardly
> anything. The median is defined as the middle value if you order the
> values, so half the nodes will be above it and half below it. So we'll
> always be able to send as many requests as we want to half the nodes.
> The reason for the current lack of performance is overcompensation for
> load, as evidenced by the low bandwidth usage. To be precise, it's
> because backoff is used as a load limiting mechanism and not just a
> load balancing mechanism.
>>
>> We can *not* afford to fall back into the trap of adding more and
>> more complexity to the point that nobody understands what is going
>> on, especially in the area of load balancing which is one of the
>> areas that is most difficult to get our heads around.
>
> The theory is simple: We have propagation of overload back to the
> original senders. They will slow down if they get lots of timeouts and
> forwarded RejectedOverload's. That's load *limiting*. The job of load
> *balancing* is to prevent all our traffic being routed to slow nodes
> which then generate lots of RejectedOverload's and slow down the whole
> network. We could take out load balancing completely, but that might
> result in a significant slowdown due to slow nodes. What's even better
> is that this scheme has fewer arbitrary constants than backoff.
>>
>> I really feel that we lack a "theory" of load balancing, a simple
>> conceptual model through which we can think about the problem and
>> find a solution.  Rather, we come up with idea after idea based on
>> vague and invariably flawed notions and assumptions, layering one on
>> top of the other until nobody understands what is happening.
>
> Load limiting: Sender side, determining how fast to send requests into
> the network. If there are too many RejectedOverload's, the sender slows
> down. This will ensure the network is not overloaded, full stop. It
> works for TCP/IP, and it should work for us.
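>
> As a rough illustration only - the class and constant names below are
> made up for this sketch, not Freenet's actual code - the sender-side
> limiter could look like TCP-style AIMD: increase the allowed rate a
> little on each accepted request, and cut it sharply on a timeout or a
> forwarded RejectedOverload.
>
> public class RequestThrottle {
>     private double requestsPerSecond = 1.0;        // arbitrary starting rate
>     private static final double INCREMENT = 0.1;   // additive increase
>     private static final double BACKOFF = 0.5;     // multiplicative decrease
>     private static final double MIN_RATE = 0.05;   // never stop entirely
>
>     public synchronized void onAccepted() {
>         requestsPerSecond += INCREMENT;
>     }
>
>     public synchronized void onRejectedOverloadOrTimeout() {
>         requestsPerSecond = Math.max(MIN_RATE, requestsPerSecond * BACKOFF);
>     }
>
>     // Delay before sending the next request, in milliseconds.
>     public synchronized long delayMillis() {
>         return (long) (1000.0 / requestsPerSecond);
>     }
> }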
>
> Load limiting without load balancing: If there are slow nodes near the
> sender, and we send these an equal proportion of our incoming requests
> (according to their location), then most of those requests will be
> rejected, and this results in an overall slowdown.
>
> Load balancing: Routing side. The purpose is to determine which nodes
> to send requests to, in order that a) our requests are reasonably fast,
> and b) we don't send too many requests to really slow nodes, causing an
> overall slowdown via load limiting. Load balancing can legitimately be
> "selfish" i.e. it can send a request even if all available nodes are
> slow, because load LIMITING will reduce the overall send rate if the
> overall network is slow; that's not load balancing's problem. Not being
> "selfish" in this sense is a threat to anonymity (all nodes backed off
> except one), usability (all nodes backed off), and performance, and is
> not justified. TCP/IP runs over more than ethernet, and I don't think
> ethernet collision detection is a viable model for backoff.
>
> Conceptually this isn't a difficult model. And I don't think it could
> possibly be any worse than current performance; right now we are plainly
> overcompensating for load. The reason for this is backoff, which is a
> means for both limiting and balancing load, when it should be solely a
> balancing mechanism; we have a better limiting mechanism already.
>>
>> We must place a moratorium on all but the most trivial "improvements"
>> to load balancing unless they have been simulated.
>
> That could be another month. Michael has rightly insisted on doing a
> fairly low level simulation, which has resulted in us rearchitecting a
> lot of the lower layers of Freenet; this is a good thing, but it takes
> time. Token passing is a substantial change, which we should not deploy
> without simulation. It should solve our current problems, but if it
> takes another month for it to be ready, given the current situation of
> massive overcompensation for load resulting in universally low bandwidth
> usage, lots of backoff, and really slow inserts, it seems reasonable to
> me to try the above. Actually I don't think that load limiting with no
> backoff would do too badly, but it would be rather sensitive to modem
> nodes etc.
>>
>> Ian.
>>
>> On 9 Aug 2006, at 07:30, Matthew Toseland wrote:
>>
>>> What is backoff for? It's not to prevent overloading nodes; nodes
>>> reject requests pre-emptively. What it is for is this:
>>>
>>> If we don't back off slow nodes, then these slow nodes will have to
>>> reject (or occasionally time out) the requests that we do forward to
>>> them. Those failures will be propagated back to the request senders,
>>> who will have to reduce their send rate. In other words, if we don't
>>> avoid sending requests to the slowest nodes, the requests we do send
>>> to them will be rejected, and those rejections will cause a universal
>>> slowdown in request rates.
>>>
>>> Thus, I don't think that the current mechanism is sensible. Nodes can
>>> be expected to take care of themselves, now that we have pre-emptive
>>> rejection. I don't think the ethernet analogy works 100%; we don't
>>> really have a shared medium.
>>>
>>> What we are looking at here is simply a way to ensure that requests
>>> are not forwarded to nodes which will inevitably reject them and thus
>>> cause an overall slowdown at the requester end. This is completely
>>> different to the case in 0.5, where requesters never slowed down.
>>>
>>> So what am I proposing?
>>>
>>> For each node, we track the proportion of requests which are accepted.
>>> We calculate a median. For any node above the median, all requests are
>>> forwarded to it: if most of our nodes are slow, that's not our problem;
>>> it's our peers who want to forward requests through us who have to
>>> deal with it (by not sending requests to us). For nodes below the
>>> median, we artificially narrow their specializations:
>>>
>>> While routing:
>>>
>>> public double distance(PeerNode node, double target) {
>>>     double distance;
>>>     if(node.acceptedFrac >= medianAcceptedFrac) {
>>>         // Accepting at least the median fraction: route normally.
>>>         distance = rawDistance(node.location, target);
>>>     } else {
>>>         // Below the median: inflate the distance in proportion to how
>>>         // far below the median this node's accepted fraction is.
>>>         distance = rawDistance(node.location, target) *
>>>                 medianAcceptedFrac / node.acceptedFrac;
>>>     }
>>>     return distance;
>>> }
>>>
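>>> To make medianAcceptedFrac concrete, here is a minimal sketch (the
>>> names are illustrative, not the real PeerNode API): each peer's
>>> acceptedFrac would be the fraction of the requests we forwarded to it
>>> that it accepted, and the median is the middle of the sorted values.
>>>
>>> final class LoadStats {
>>>     // Sketch only. For an even number of peers, take the mean of the
>>>     // two middle values.
>>>     static double medianAcceptedFrac(double[] acceptedFracs) {
>>>         double[] fracs = acceptedFracs.clone();
>>>         java.util.Arrays.sort(fracs);
>>>         int n = fracs.length;
>>>         return (n % 2 == 1) ? fracs[n / 2]
>>>                             : (fracs[n / 2 - 1] + fracs[n / 2]) / 2.0;
>>>     }
>>> }
>>>
>>> A node exactly at the median gets no penalty; a node accepting half
>>> the median fraction sees its distances doubled, and so on.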
>>>
>>> Now, to some degree, this is alchemy, in the sense that it hasn't been
>>> simulated. However, it will be some time before the new rate limiting
>>> mechanism is available, and the argument appears solid to me: if all
>>> our peers are slow, that DOES NOT mean we shouldn't send requests to
>>> them. Because we have load propagation, it is entirely legitimate to
>>> send requests ANYWAY, and let upstream nodes deal with the slowness -
>>> either the entire network is slow, in which case we reduce all the
>>> send rates, or this particular node is slow, in which case we reduce
>>> the number of requests sent to it! Note also that slavishly following
>>> backoffs reduces our anonymity quite noticeably.
>>>
>>> Comments?
>>> -- 
>>> Matthew J Toseland - toad at amphibian.dyndns.org
>>> Freenet Project Official Codemonkey - http://freenetproject.org/
>>> ICTHUS - Nothing is impossible. Our Boss says so.
>>
>> Ian Clarke: Co-Founder & Chief Scientist Revver, Inc.
>> phone: 323.871.2828 | personal blog - http://locut.us/blog
>>
>
>
>
> -- 
> Matthew J Toseland - toad at amphibian.dyndns.org
> Freenet Project Official Codemonkey - http://freenetproject.org/
> ICTHUS - Nothing is impossible. Our Boss says so.

Ian Clarke: Co-Founder & Chief Scientist Revver, Inc.
phone: 323.871.2828 | personal blog - http://locut.us/blog
