I have an answer to why, but not a solution to how to deal with it.

THE PROBLEM:
I am fairly sure that the reason for this is as follows. Realtime and bulk requests are treated separately for load management. We can accept up to half of our request capacity from a single peer, on the basis that a burst is acceptable if there is no other traffic. Realtime requests tend to be bursty, so at first glance this makes sense.

Now the problem: WE SHARE BANDWIDTH FAIRLY BETWEEN OUR PEERS (in the packet sender loop), REGARDLESS OF THE PRIORITY OF THE QUEUED MESSAGES! Thus we can accept a load of requests - an fproxy burst, which is perfectly normal - thinking we can allocate half our bandwidth to one peer. (The output bandwidth liability roughly matches the 5 seconds per block criterion, although it is a tight fit if there is bulk traffic happening as well.) Then we share our bandwidth equally between peers, and assuming the other peers have some traffic, we have a lot less bandwidth available than we thought, so the abnormally large number of transfers to that single node fail. This is of course only a problem while it is a burst - after the first few hops it is pretty reliable. Which fits with the stats: if you do a lot of fproxy requests, you get a lot of failures, and the stats show this as a low success rate; but if you just run a node, the success rate even for realtime block transfers is pretty good.

Unfortunately we need to have a lot of transfers in flight because of our relatively low success ratio - especially for bulk transfers. Although it's not that bad now ... Plus we gain security and bandwidth efficiency from always having a transfer in flight... Unfortunately, apart from increasing the inter-block timeout to 10 seconds, it is not immediately clear what the solution is...

THE SOLUTION:

EASYISH STEPS:

First, I have increased the block timeout on realtime to 10 seconds. This may help somewhat in practice, as blocks often take only just over 5 seconds, and the 60 second liability calculation is cutting it rather close for 5 seconds per block. However, it would be better if we could solve the underlying problem... Reducing the message size within a data block is a possible compromise solution.

TOO MANY TRANSFERS?:

Hmmm, is the real issue simply accepting too many transfers? The number of realtime transfers we can accept is:

  (bandwidth in KB/s) * 60 / 32

which is approximately (bandwidth in KB/s) * 2. The number of realtime transfers a single node can have is therefore approximately (bandwidth in KB/s). This is the bandwidth figure after taking into account that some bandwidth will be used for things other than transfers - but it doesn't count bulk transfers; both realtime and bulk make their own calculations based on the total.

If a peer is using half our bandwidth and has (bandwidth in KB/s) transfers, and is doing no bulk transfers, we should expect a typical packet interval of 2 seconds. This could build up over hops, but bear in mind that it's much less bursty the further away from the originator we go. If the peer has bulk transfers using half its bandwidth, we should expect a typical packet interval of 4 seconds. If we are sharing fairly between all peers, we should expect a typical packet interval of:

  (number of transfers = bandwidth) / (per-peer bandwidth = bandwidth / number of peers)

which is in seconds. So for 40 peers, 40 seconds. :|

For bulk transfers, the block timeout is 30 seconds, but we accept twice as many transfers. However, we are much less likely to get a big burst to a single node with bulk transfers.
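To make the arithmetic above concrete, here is a minimal back-of-the-envelope sketch in Java. The class and constant names are mine, and the figure of roughly 1 KB of payload per packet per transfer is an assumption I'm making so the interval comes out in seconds; this is an illustration of the calculation, not the actual load-limiting code.

// Back-of-the-envelope sketch of the packet-interval arithmetic above (illustration only).
public class TransferIntervalSketch {

    static final double BLOCK_SIZE_KB = 32.0;      // CHK data block
    static final double LIABILITY_SECONDS = 60.0;  // output bandwidth liability window
    static final double PACKET_SIZE_KB = 1.0;      // assumed payload per packet per transfer

    // Realtime transfers we will accept in total: bandwidth * 60 / 32 ~= bandwidth * 2.
    static double totalRealtimeTransfers(double bandwidthKBps) {
        return bandwidthKBps * LIABILITY_SECONDS / BLOCK_SIZE_KB;
    }

    // One peer may use half of that, i.e. roughly (bandwidth in KB/s) transfers.
    static double perPeerRealtimeTransfers(double bandwidthKBps) {
        return totalRealtimeTransfers(bandwidthKBps) / 2.0;
    }

    // Expected seconds between packets for one transfer, given the bandwidth
    // the peer actually receives from us.
    static double packetInterval(double transfers, double peerBandwidthKBps) {
        return transfers * PACKET_SIZE_KB / peerBandwidthKBps;
    }

    public static void main(String[] args) {
        double bandwidthKBps = 40.0; // example output limit
        int peerCount = 40;
        double transfers = perPeerRealtimeTransfers(bandwidthKBps);

        // Burst peer gets half our bandwidth: roughly 2 seconds between packets.
        System.out.println(packetInterval(transfers, bandwidthKBps / 2.0));
        // Fair sharing between 40 peers: roughly 40 seconds - way past the block timeout.
        System.out.println(packetInterval(transfers, bandwidthKBps / peerCount));
    }
}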
We could halve the total transfers limit to take into account the fact that there are both bulk and realtime transfers. We could then allow one peer only 1/4 of the total rather than 1/2, halving it again. This brings us into plausible worst-case timeout territory - 10 seconds for 40 peers on realtime, 20 seconds for 40 peers on bulk. However, the first step would be disruptive, and might result in significantly reduced average throughput. Arguably the problem is at the peer level, not at the total transfers level. Also, we should separate it from the total number of peers.

So the new proposal is that we allow one node to use bandwidth equivalent to 8 nodes' guaranteed bandwidth, i.e. 4 nodes' full fair share, when calculating how many transfers (requests) it can have at once. This gives:

  (number of transfers = bandwidth * 4 / number of peers) / (peer bandwidth = bandwidth / number of peers)

which is 2.5 seconds for realtime. There is little point in using a different formula for bulk, because bursts are rare on bulk. Obviously we would have to ensure that this is lower than half the total if we have fewer than 8 peers (see the sketch at the end of this mail).

Is this consistent with what's been observed? Yes. The nodes that got timeouts on testnet had timeouts around 5 seconds - just over the limit. They also had very few peers, maybe 10 or so. This is not as high as it should be according to the formula above, but it's in roughly the right region. Okay, I have implemented this. I'm pretty sure it will work ...

OTHER IDEAS:

Making PacketSender not fair between peers is HARD!

I don't want to eliminate bulk vs realtime. Especially with new load management, it makes a lot of sense to have the separation. For downloads we want a lot of throughput; for fproxy we want low latency. We generally can't have both, so it makes sense to be able to choose one or the other. Especially as most of our traffic is bulk, and there are lots of interesting things we can do with bulk in terms of performance and security.

We don't want a peer with a retransmission problem to occupy all our bandwidth. We don't want to only send to the peer with realtime messages queued, and lose the other peers. Really all we need to do is give the peer 1/4 of our bandwidth: half is for bulk, half is for realtime, and half of the latter is for this peer. Of course if we have 40 peers, this is still equivalent to 10 of our other peers ...

This might or might not relate to some long term ideas about bloom filter sharing and preemptive datastore transfer, which would need an "idle bandwidth" mechanism. Another complication is that if we choose a peer because it has high priority messages (such as requests), we will currently send a full sized packet, which might include low priority data; this is simple and efficient in terms of payload, but it may cause problems for another peer sending realtime data...

See bug: https://bugs.freenetproject.org/view.php?id=4731
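For reference, the sketch mentioned above: a minimal illustration of the proposed per-peer limit. The names and structure are mine, and the reading that the 4-fair-shares allowance is expressed as bandwidth and capped at half the total (which is what makes the "fewer than 8 peers" case special) is my interpretation of the proposal, not the actual implementation.

// Minimal sketch of the proposed per-peer limit, under the assumptions stated above.
public class PerPeerTransferCapSketch {

    static final double BLOCK_SIZE_KB = 32.0;      // CHK data block
    static final double LIABILITY_SECONDS = 60.0;  // output bandwidth liability window

    // Bandwidth one peer may be assumed to use: 8 x guaranteed = 4 x full fair share,
    // capped at half the total (the cap only bites when we have fewer than 8 peers).
    static double perPeerAllowedBandwidth(double bandwidthKBps, int peerCount) {
        double fourFairShares = bandwidthKBps * 4.0 / peerCount;
        return Math.min(fourFairShares, bandwidthKBps / 2.0);
    }

    // Number of in-flight realtime transfers that bandwidth corresponds to,
    // using the same 60s / 32KB liability arithmetic as before.
    static double perPeerTransferLimit(double bandwidthKBps, int peerCount) {
        return perPeerAllowedBandwidth(bandwidthKBps, peerCount) * LIABILITY_SECONDS / BLOCK_SIZE_KB;
    }

    public static void main(String[] args) {
        System.out.println(perPeerTransferLimit(40.0, 40)); // 40 peers: 4/40 of bandwidth -> 7.5 transfers
        System.out.println(perPeerTransferLimit(40.0, 5));  // 5 peers: capped at half -> 37.5 transfers
    }
}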