On Nov 30 2007, Matthew Toseland wrote:
> Increasing MAX_PING_TIME would have no effect, for example, because
> most nodes mostly reject on bandwidth liability.

MAX_PING_TIME was just an example - my point is that if we know most
nodes aren't using the available bandwidth, we should tweak the
rejection thresholds until most nodes hit their bandwidth limits. That
doesn't require any new algorithms, just tuning the constants of the
existing ones.

> But the point I am making is *we don't even limit effectively on
> bandwidth liability*: busy-looping until a request gets through
> shouldRejectRequest() improves performance significantly, therefore
> backoff and AIMD is not supplying enough requests to the front end of
> the current load limiting system.

To play devil's advocate for a minute: maybe busy-looping only improves
performance because we're hammering our peers with so many requests
that probabilistic rejection is effectively circumvented (sooner or
later the coin will come up heads). This isn't necessarily a good
strategy. I'm not opposed to disabling AIMD and replacing backoff with
explicit "start/stop" signals, I'm just not convinced it will fix
anything either.

> Yes. Well really it's a form of token passing, but I'm trying to make
> it simple and obviously correct.

It's not really token passing - a peer that receives the "start" signal
can send unlimited requests until it receives the "stop" signal, so you
still need pre-emptive rejection to cope with whatever arrives in the
meantime. With token passing the peer knows how many requests it can
send, so there's no need for pre-emptive rejection.
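To make the distinction concrete, here's a minimal sketch of what I
mean by token passing - the class and method names are made up for
illustration, not taken from the actual codebase:

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Illustrative token-based sender: it never sends more requests
    // than the peer has explicitly granted capacity for.
    class TokenSender {
        private int tokens; // requests the peer has agreed to accept
        private final Queue<Request> queue = new ArrayDeque<>();

        // Called when the peer grants us more capacity.
        synchronized void onTokensGranted(int n) {
            tokens += n;
            flush();
        }

        // Called by the client layer; excess requests wait locally.
        synchronized void submit(Request r) {
            queue.add(r);
            flush();
        }

        private void flush() {
            while (tokens > 0 && !queue.isEmpty()) {
                send(queue.remove());
                tokens--;
            }
        }

        private void send(Request r) {
            // Hand the request to the transport; omitted here.
        }
    }

    class Request { /* payload omitted */ }

Because flush() never sends more than the peer has granted, the
receiver never has to reject anything - which is exactly what start/stop
signals can't guarantee.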
That's not to say that I think token passing is better than your
proposal - we never settled the question of how many tokens to hand out
or how to allocate them, for example. A simple solution is definitely
preferable. However, there's a reason most protocols don't use simple
start/stop flow control: it's hard to get good performance, because the
peer's response is delayed by one round trip and you can't make smooth
adjustments (it's all or nothing).

To be honest I think we're just trying to compensate for a broken
transport layer. Look at the way HTTP handles flow control: it doesn't.
Flow control is left to the transport layer. Requests can be pipelined;
if you're busy processing the last request, don't read another one from
the socket. To handle timeouts, add a timestamp to each request and
skip the request if the timestamp indicates that the previous hop will
have timed out and moved on.
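Roughly like this - a sketch only, with made-up names, an assumed
per-hop timeout, and glossing over the fact that the timestamp would
have to be something both hops can interpret (an age field, or roughly
synchronised clocks):

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Illustrative receiver for pipelined requests: stale requests
    // are dropped instead of being answered too late to matter.
    class PipelinedReceiver {
        // Assumed per-hop timeout; the real value would come from
        // whatever the previous hop actually uses.
        private static final long TIMEOUT_MS = 30_000;
        private final Queue<TimestampedRequest> inbox = new ArrayDeque<>();

        // Requests read off the socket are parked here until we have
        // the capacity to process them.
        void onRequestReceived(TimestampedRequest r) {
            inbox.add(r);
        }

        // Returns the next request still worth handling, skipping any
        // whose sender will already have timed out and moved on.
        TimestampedRequest nextLiveRequest() {
            TimestampedRequest r;
            while ((r = inbox.poll()) != null) {
                if (System.currentTimeMillis() - r.sentAt < TIMEOUT_MS) {
                    return r;
                }
                // Stale: the previous hop has given up on this one.
            }
            return null;
        }
    }

    class TimestampedRequest {
        final long sentAt; // when the previous hop sent the request
        TimestampedRequest(long sentAt) {
            this.sentAt = sentAt;
        }
    }

That way a slow node sheds exactly the load it can't handle, without
any explicit flow control messages at all.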
> We are not talking about the same queue. Local requests from the
> various client-layer queues have to go through exactly the same
> process.

Sorry, I realised that after sending the message. :-)

> I mean that requests queued may not be successfully forwarded because
> they are too far away from any of our peers' locations, yet since
> they don't go away, our peers cannot send us any more requests which
> are closer to the target. I believe what I said about this is
> sufficient.

I must have missed something - does the twice-the-median limit only
apply to misrouted requests? If it applies to all requests, then either
we can send the head of the queue to *someone* or we can't send
anything to anyone. Either way there's no way for a "bad" request to
block a "good" request.

Cheers,
Michael