On Wednesday 08 August 2007, Matthew Toseland wrote:

Continued at end to minimise confusion! Unfortunately this thread is rather rambling; it includes lots of discussion on token passing as well as the original premise.

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.25 - 20:18:36GMT -----

I made some measurements on how a freenet node behaves if the bandwidth limit is set low: 10KBps and down to 6KBps (specifically the input bandwidth limit; the output bandwidth limit was set to at least 15KBps, but as expected the actually used output bandwidth is comparable to, just slightly above, the actually used input bandwidth). The node itself was running Frost but no uploads/downloads, so the absolute majority of network traffic was forwarded CHK/SSK requests/inserts.

The results are interesting enough: CHK traffic drops as low as 5% (in packets) of CHK+SSK, while at least 92% of SSK requests were not satisfied due to assorted failures (plus quite a few more certainly resulted in a NotFound response because the key is missing from the whole network, but I don't have that number). This makes a low-traffic node highly inefficient and disproportionately slow; it also slows down its peers with all the extra reject traffic. Worse, input bandwidth sometimes goes over the set limit, suggesting that on a hardware 33600/56000 bps modem, and even on ISDN, things will just get worse due to increased delays.
Another thing to note: a low bandwidth node (LBN) almost exclusively rejects requests with the "input bandwidth liability" reason, and extremely rarely for other reasons.

Speculating a bit, the same picture will likely be observed for peers of a fast node (1Mbps or more) when many peers have a typical home connection of 256Kbps or less.

Not sure if simulations ever showed anything like this, but contributing mostly SSK service to the network (when the absolute majority of SSK requests fail!) is rather useless: an optimally working network is supposed to transfer at least one CHK block for each SSK key, and typically much, much more (a single 10MB file consists of 481 CHK blocks!), and even if you found the SSK but not the CHK it points to, you still failed to find the information you requested.

OK, to make the long story short[er], at the end of this message you will find a small patch that noticeably improves the LBN situation. The idea is to reserve some bandwidth for CHK transfers (and SSK inserts, as those are too rare to penalize, and more valuable than requests). The line directly before the inserted one implicitly penalizes CHK transfers (much smaller SSK requests tend to re-reserve bandwidth the moment it is released after a CHK transfer finishes, while much larger CHK requests do not have such a good chance), so bandwidth should be reserved for at least 2 CHKs (and tests show that's enough to make a difference).

Another thing I tried was increasing the 90 seconds period up to 120. That had some (no numbers here; just "noticeable but small") positive effect, making traffic smoother and staying closer to the set limit without jumping up and down too much. Where did the 90 seconds number come from anyway, and how dangerous could 120 be?

Some pros observed and/or thought out during tests of the patch:
- I observe an increase of output payload by approx. 15% (of total traffic), making the LBN more useful for its peers.
- the change is negligibly small for faster nodes, so it should not break anything globally.
- entire-network SSK flood traffic will be toned down a little bit (at temporarily overloaded nodes only), additionally simplifying life for LBNs: after all, requesting the same SSK every 15 seconds for 35 hours, 8100 times in total (factual numbers from one of the tests before this patch was applied; there are over 60 other SSKs that were requested more than 1000 times during the same period) is just way too much, SSKs are not inserted into the network THAT fast. [Is it worth remembering recently seen SSK requests, and not forwarding them if the same request was already forwarded within the last 10 minutes and resulted in DNF/RNF? A table of recently requested SSKs that are closest to the node location should not be too big?]

And contras:
- in exceptional conditions (specifically, with fewer than 2 incoming CHK requests per 90 seconds; in fact I observe 2-7 CHK requests per second, that's 180-630 per 90 seconds), regardless of node bandwidth, up to 800 Bps might end up unused. For a high bandwidth node that's just way too small to notice; for an LBN that's still acceptable (10% of 56Kbps) and will decrease roundtrip delays a bit, which is always a good thing for such slow links.

Other notes:
- the distribution of location closeness vs. number of SSK requests is very nice: only SSK requests with a location very close to the node location get repeated frequently; the farther away the SSK location is, the fewer requests the node sees, and those SSKs seen only once or twice per 1-2 day period are distributed evenly across the location space. This suggests that routing is working fine.
- As far as I understand, if the input bandwidth limit/liability is exceeded (but a packet was already received anyway), a CHK/SSK request gets instantly rejected (thus throwing out received bytes while input bandwidth has no spare volume!); only otherwise does the node check whether the requested key exists in the store. Heh? This feels like a serious bug hurting overall network performance: better to query the store and hopefully send back the result (or still reject if the key is not found locally) rather than wait for a retry request to waste more input bandwidth.
At least for SSKs, a reject and a reply are comparable in output bandwidth usage, so it's worth a little delay in the response. Or am I missing something?

===
diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
index 3b091b4..fb9f8b9 100644
--- a/freenet/src/freenet/node/NodeStats.java
+++ b/freenet/src/freenet/node/NodeStats.java
@@ -414,6 +414,7 @@ public class NodeStats implements Persistable {
 		successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
 		successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
 	bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
+	if (isSSK && !isInsert) bandwidthLiabilityInput += successfulChkFetchBytesReceivedAverage.currentValue() + successfulChkInsertBytesReceivedAverage.currentValue(); // slightly penalize SSK requests by reserving bandwidth for 2 additional CHK transfers (or SSK inserts if any)
 	double bandwidthAvailableInput =
 		node.getInputBandwidthLimit() * 90; // 90 seconds at full power
 	if(bandwidthLiabilityInput > bandwidthAvailableInput) {
===
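For readers who don't want to decode the diff, here is a minimal standalone sketch of the check the patch modifies. The class, field and method names are simplified stand-ins for the real NodeStats internals, and the averages are made-up values; it only illustrates the reservation idea:

===
// Sketch of the input-bandwidth liability check with the proposed SSK
// penalty. Names and numbers are hypothetical, not the actual Freenet code.
class LiabilitySketch {
	double chkFetchAvgBytesIn = 33 * 1024;  // assumed running averages of
	double chkInsertAvgBytesIn = 33 * 1024; // bytes received per transfer

	/** @return a reject reason, or null to accept the request/insert. */
	String checkInputLiability(double liabilityBytes, boolean isSSK,
			boolean isInsert, int inputLimitBytesPerSec) {
		if (isSSK && !isInsert) {
			// The patch: an SSK request must also leave headroom for two
			// more CHK-sized transfers, so cheap SSKs cannot monopolise
			// the link by re-reserving bandwidth the moment it frees up.
			liabilityBytes += chkFetchAvgBytesIn + chkInsertAvgBytesIn;
		}
		// 90 seconds at full power; toad explains the constant below:
		// the 120s timeout minus ~30s allowed for the search itself.
		double available = inputLimitBytesPerSec * 90.0;
		if (liabilityBytes > available)
			return "Input bandwidth liability";
		return null; // accept
	}
}
===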
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.26 - 16:56:59GMT -----

Most SSK requests fail. They DNF. The reason for this is that most SSK requests are polling for data that has not yet been inserted.

Bandwidth liability is usually the main reason for rejection. If we reach most of the other reasons, there is a problem - usually a cyclical problem. The main reason for it is to ensure that we don't accept so many requests that some of them needlessly time out even though they succeeded. The timeout is 120 seconds, so we need the actual transfer to take less than this; on a request, 30 seconds seems a reasonable upper bound for the search time. We don't throw out many bytes when we reject a request/insert, because the bulk of it hasn't been sent yet - except with SSKs, where typically a little under half of the total bytes will have been moved.

Ideally we wouldn't send requests until we have a good idea that they will be accepted, but token passing load balancing is a long way off, not likely to happen for 0.7.0.

We cannot control input bandwidth usage precisely.

Any more info on SSK flooding? Is it simply Frost?

We can add a failure table, we had one before; however a failure table which results in actually blocking keys can be extremely dangerous. What I had envisaged was "per node failure tables", i.e. reroute requests which have recently failed to a different node, since we know it isn't where it's supposed to be.

On what do you base the assertion about key closeness? It would be nice to have a histogram or circle on the stats pages showing recent keys on the keyspace - can you write a patch?

As far as your patch goes, surely rejecting more SSK requests would be counterproductive as it wastes bandwidth? Shouldn't a slow node accept those requests it's likely to be able to handle?

I can see an argument that we shouldn't prefer SSKs, and that on slow nodes we do prefer SSKs... I'm not sure the above is the right way to deal with it though. The effect of the patch would be to never accept any SSKs unless we have plenty of spare bandwidth, correct?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.26 - 18:41:32GMT -----

> Ideally we wouldn't send requests until we have a good idea that they will
> be accepted, but token passing load balancing is a long way off, not likely
> to happen for 0.7.0.

Well, even the current algorithm implementation has certain room for improvement.
Here are the typical numbers I observe:

===
unclaimedFIFO Message Counts
* FNPRejectOverload: 89 (45.2%)
* FNPInsertTransfersCompleted: 59 (29.9%)
* FNPDataNotFound: 15 (7.6%)
* packetTransmit: 12 (6.1%)
* FNPRejectLoop: 7 (3.6%)
* FNPAccepted: 6 (3.0%)
* FNPSwapRejected: 4 (2.0%)
* FNPDataInsertRejected: 4 (2.0%)
* FNPRouteNotFound: 1 (0.5%)
* Unclaimed Messages Considered: 197
===

FNPRejectOverload always stays at the top, sometimes with hundreds of messages (for the last hour before unclaimed messages expire, that's a lot), which indicates there is some bug (or bugs) in how the bandwidth limit is obeyed.

> Any more info on SSK flooding? Is it simply Frost?

Not local Frost for sure; it generates just several simultaneous SSK requests (by default max 8: 6 for boards plus 2 for filesharing, AFAIR; practically 2-4 simultaneous requests most of the time). The other 100 SSK requests (without the proposed patch) are forwarded ones.

> We can add a failure table, we had one before, however a failure table which
> results in actually blocking keys can be extremely dangerous;

Is it, with a timeout of a few minutes at most (i.e. at least a few times less than the SSK propagation time visible with Frost messages)? Is it more dangerous than the current wastage of bandwidth on requests for the same SSK key several times per minute? Have any simulations been done on that in the past?

BTW, doesn't the observed very low store hit rate result from prioritising the likely-to-fail SSKs?

BTW2, the failure table could also act as a targetted content propagation mechanism: if a node sees an SSK insert for a temporarily blacklisted (non-existing) SSK, then forwarding the insert (more likely a copy of the insert, for security reasons and for routing's sake) to the original requestor should speed up propagation of new SSKs toward the nodes that already anticipate/await them.
> What I had envisaged was "per node failure tables" i.e. reroute requests
> which have recently failed to a different node since we know it isn't where
> it's supposed to be.

At a glance, a very nice idea. But LBNs typically answer with a reject, not a DNF... even with the current code. Probably such rerouting would further increase SSK traffic toward LBNs, with a sharply increased volume of SSK rejects as a result. Hmm, some testing/simulation seems really needed here.

> On what do you base the assertion about key closeness? It would be nice to
> have a histogram or circle on the stats pages showing recent keys on the
> keyspace - can you write a patch?

Mmmm... in fact I just added custom logging, then a wild combination of grep/sed/sort/uniq to analyze the logs. But let me think, maybe visualizing a couple of the stats files I operate with will be rather trivial...

But I would rather stay away from stats page graphics at this time, as the stats files I operate with (filtered+sorted+uniqued) are rather large, 20-50Mb each - too much memory for the toy. Unless your 'recent' means just 10-15 minutes at most?

> As far as your patch goes, surely rejecting more SSK requests would be
> counterproductive as it wastes bandwidth?

Tests show the opposite: without the patch, payload output on the stats page never exceeded 38%; with the patch it reaches 53% or a little more within several minutes of node restart. So, with the patch, SSK/CHK forwarding behaviour 'feels' logical:

without the patch:
- just several CHKs, and over 100 SSKs, is very typical.

with the patch:
- most of the time (say, 75%) the number of currently forwarded CHK requests+inserts approximately equals the number of SSK requests+inserts (i.e. 10-25 each, depending on the set bandwidth limit);
- sometimes (say, 10%) CHK requests start to prevail, but current SSK requests+inserts seem never to go below the amount CHKs get at max without the patch (i.e.
6-10). This is very typical when the number of CHK inserts gets several times higher than CHK requests (a close fast peer inserts something really large?);
- at other times (say, 15%) the CHK requests+inserts flow does not saturate bandwidth, and the number of SSK requests quickly climbs to 50 or even over 100+, as it typically does without the patch.

That's for an LBN. Raising the input bandwidth allotment, the number of SSKs quickly grows, resembling the situation without the patch.

So that's why I suggest reserving bandwidth for 2 CHK transfers; 3 would kill SSKs, and 1 still lets SSKs seriously prevail over CHKs (but nonetheless gives a much better ratio, so it is a legal value to try if the value of 2 alarms you too much). Just, in the case of reserving bandwidth for 1 extra CHK the proposed patch is not really needed: simply comment out the line starting with "bandwidthLiabilityInput +=" and decrease the 90 seconds constant to 80 (10 seconds is roughly how long a 33.6Kbaud modem takes to transmit a single CHK - using anything noticeably slower than 28800/33600 baud for freenet will never work well anyway).

> Shouldn't a slow node accept those requests it's likely to be able to handle?

Considering the very high chance of SSK request failures (at least 92%), I would say the answer is no. But with a sane SSK failure rate (say 75% or below), SSK requests would likely not waste the limited, thus precious, LBN bandwidth so fruitlessly.

The problem, I believe, is the too-small size of UDP packets when SSK requests prevail: the PPP(oE)/TCP/FNP overhead becomes too large, while LBNs, unlike faster-link nodes, almost never coalesce packets, obviously.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.27 - 17:19:24GMT -----

The current algorithm is working, on most nodes, far better than it has in *ages*. I'm at 62% of a 700MB ISO, I started inserting it yesterday morning, and only a few of my peers are backed off - frequently none are backed off; right now it's 11 connected, 6 backed off, which is more backed off than I've seen for quite a while.
Re failure tables: Yes, it is extremely dangerous. It can result in self-reinforcing key censorship, either as an attack or just occurring naturally. This happened on 0.5. And the hit ratio is only for CHKs, iirc.

Even LBNs don't often send local RejectedOverloads on SSKs *once they have accepted them*. They may relay downstream ROs, but that is not fatal. And if they reject some requests, so what; it's a slow node, it's bound to reject some requests with the current load balancing system.

10-15 minutes would be interesting. We already show a circle and histogram of nearby nodes from swapping and of our peers; you'd just have to add another one. It would be good to have visual proof that routing is working, on the level of adhering to node specialisations. I didn't expect it to be working given the load: I'm surprised that it does, it's an interesting result.

Packet size has nothing to do with it; ethernet has a 1472 byte maximum. Dial-up has 576 bytes max, but we ignore it and use fragmented packets (this sucks, obviously, as it greatly increases the chance of losing a packet and having to retransmit it).

Please explain why the patch doesn't result in never accepting a single SSK?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.27 - 19:31:14GMT -----

> Packet size has nothing to do with it, ethernet has a 1472 byte maximum.
> Dial-up has 576 bytes max, but we ignore it, and use fragmented packets
> (this sucks, obviously, as it greatly increases the chance of losing a
> packet and having to retransmit it).
I am talking about the typical/average packet size, not the MTU. LBNs, unlike faster nodes, rarely have a chance to coalesce reject responses (over the 100ms maximum), and thus send disproportionately more tiny packets, resulting in much higher protocol overhead. So having LBNs mostly cater to SSKs rather than CHKs results in the lowest imaginable usefulness of LBNs for the network as a whole.

BTW, in my experience the typical/default dialup/PPP MTU is 1500 minus link level headers, like ethernet. 576 is a reasonable adjustment for interactive traffic like ssh, but I don't remember it being used as a default since the time the super fast 28800 baud modems became common. :) 1400+ is the typical size of GPRS PPP packets too, and the same holds true for other popular wireless media like Bluetooth or WiFi; so I have no concerns regarding IP fragmentation.

> Please explain why the patch doesn't result in never accepting a single SSK?

I can not. :) Can you explain why the current code, which penalizes CHKs, still gives 5% for them, even though CHKs are 25 times larger and similarly less frequent, so they have a really hard time arriving at the exact moment when bandwidth liability is not maxed out?

Seriously, I believe it comes down to 2 facts:

- SSK requests are much more frequent, so any temporary drop in the level of CHK requests enables the node to quickly get a bunch of new SSKs accepted for processing;
- the large CHK requests (at times when they prevail over SSKs) tend to hit other limits too, like the "output bandwidth liability" and "Insufficient input/output bandwidth" throttles. Then the small SSK requests quickly pick up all the remaining bits of bandwidth.

But currently I do not have relevant statistics to prove that.

Anyway, please commit the following patch - it should equal out bandwidth rights for CHKs and SSKs at least half way toward a fair/expected distribution (and the change will make any difference for high-/over-loaded nodes only). Once most of my peers (and their peers) update, I will study the new node traffic forwarding efficiency and behavior at different bandwidth limits and with different penalization levels again - and then will be in a better position to prove that the original proposal of reserving bandwidth for 2 CHKs is optimal (or maybe withdraw it).
===
diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
index 3b091b4..98c82c3 100644
--- a/freenet/src/freenet/node/NodeStats.java
+++ b/freenet/src/freenet/node/NodeStats.java
@@ -399,9 +399,8 @@ public class NodeStats implements Persistable {
 		successfulSskFetchBytesSentAverage.currentValue() * node.getNumSSKRequests() +
 		successfulChkInsertBytesSentAverage.currentValue() * node.getNumCHKInserts() +
 		successfulSskInsertBytesSentAverage.currentValue() * node.getNumSSKInserts();
-	bandwidthLiabilityOutput += getSuccessfulBytes(isSSK, isInsert, false).currentValue();
 	double bandwidthAvailableOutput =
-		node.getOutputBandwidthLimit() * 90; // 90 seconds at full power; we have to leave some time for the search as well
+		node.getOutputBandwidthLimit() * 80; // 80 seconds at full power; we have to leave some time for the search as well
 	bandwidthAvailableOutput *= NodeStats.FRACTION_OF_BANDWIDTH_USED_BY_REQUESTS;
 	if(bandwidthLiabilityOutput > bandwidthAvailableOutput) {
 		preemptiveRejectReasons.inc("Output bandwidth liability");
@@ -413,9 +412,8 @@ public class NodeStats implements Persistable {
 		successfulSskFetchBytesReceivedAverage.currentValue() * node.getNumSSKRequests() +
 		successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
 		successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
-	bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
 	double bandwidthAvailableInput =
-		node.getInputBandwidthLimit() * 90; // 90 seconds at full power
+		node.getInputBandwidthLimit() * 80; // 80 seconds at full power
 	if(bandwidthLiabilityInput > bandwidthAvailableInput) {
 		preemptiveRejectReasons.inc("Input bandwidth liability");
 		return "Input bandwidth liability";
===
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.28 - 17:05:53GMT -----

Why does assuming 80 seconds instead of 90 help? I would have expected it to make the situation worse.

Isn't what you want to increment the value you are multiplying the CHK byte counters by if the request is an SSK? In any case I'm not convinced - we accept 32x as many SSKs as CHKs precisely because they use 32x less bandwidth. As far as I can see, incrementing the CHK counts but only on a CHK would just result in us never accepting an SSK...

But by all means continue to investigate.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.04.30 - 19:36:36GMT -----

> we accept 32x as many SSKs as CHKs precisely because they use 32x less
> bandwidth.

Sorry, but I don't understand the rationale behind this. It seems to be based on the assumption that equal resources should be allocated to SSKs and CHKs, regardless of whether there's equal demand for resources. If we're only getting, say, 16 times as many SSK requests as CHK requests, would we reject CHK requests to keep things "fair"?

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.02 - 16:13:52GMT -----

Why should CHKs be prioritised over SSKs?

What do you think of the patch I committed anyway?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.02 - 17:03:52GMT -----

> Why should CHKs be prioritised over SSKs?

Because the SSK success ratio is extremely low (remember the example I gave in the initial message of the thread: 8100 rejected requests for the same SSK key within 35 hours, plus an uncounted number of requests for the same key that resulted in DNF - and that key is nowhere near alone); linear programming would certainly suggest that SSK requests, as they currently are, should be totally disabled.
I do not propose anything so drastic; but as SSKs currently use bandwidth very inefficiently, I propose to tone them down heuristically to approximately the CHK level while the node is short on bandwidth (but let SSKs go as much as the sending node needs if spare bandwidth is available).

[not that I meant to speak instead of mrogers]

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.03 - 13:54:38GMT -----

> > Why should CHKs be prioritised over SSKs?
>
> Because SSK success ratio is extremely low

That doesn't make sense, for two reasons:

1) rejecting SSKs will make the SSK success ratio worse, not better
2) "SSK not found" is useful information - for example, that's how you discover the current version of a USK - but "SSK rejected for not being a CHK" is not useful to anyone

Let me use an analogy: you're designing a Cisco router. Some of the packets it gets asked to forward will be small SSH packets, others will be large HTTP packets. Do you say "half our resources should go to SSH because we don't want to prioritise HTTP, so we'll only forward one HTTP packet for every 32 SSH packets"? If you answered yes, you just lost your job at Cisco. The router's job is to deliver packets to the best of its ability, not to decide what kind of packets the end hosts "should" be sending.

----- Anonymous ----- 2007.05.03 - 18:37:00GMT -----

Let's build an extreme case: there is a packet type which is 1 MB in size and one which is 1 byte in size. Both get constantly sent to a router at a higher rate than the router's bandwidth quota allows it to forward. Not a single 1 MB packet will get served if the following rules apply:
1. the router doesn't care about the size of a packet and doesn't penalise any kind of packet; all are considered equal
2. if a packet arrives and the remaining bandwidth quota doesn't allow forwarding it, it gets instantly rejected
3. no queueing of incoming packets is done (with part-allocation of available bandwidth if the packet size exceeds the free quota)

Afaik this is a possible situation for freenet; correct me if I am wrong.
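To see why rules 1-3 starve the large packets, here is a toy, self-contained simulation of the extreme case above; the arrival model and all numbers are invented purely for illustration:

===
import java.util.Random;

// Toy model of the extreme case above: a size-blind router that instantly
// rejects any packet larger than the remaining quota, with no queueing.
public class StarvationSketch {
	public static void main(String[] args) {
		final int SMALL = 1, LARGE = 1_000_000;
		// Capacity accrues slightly slower than the 1-byte packets arrive,
		// so demand always exceeds the quota (the premise of the example).
		final double REFILL_PER_ARRIVAL = 0.9;
		double quota = 0;
		long smallServed = 0, largeServed = 0, largeSeen = 0;
		Random rnd = new Random(42);
		for (long i = 0; i < 10_000_000L; i++) {
			quota += REFILL_PER_ARRIVAL;
			int size = (rnd.nextInt(100) == 0) ? LARGE : SMALL; // 1% are large
			if (size == LARGE) largeSeen++;
			if (size <= quota) {        // rule 2: fits right now, or is rejected
				quota -= size;
				if (size == LARGE) largeServed++; else smallServed++;
			}
		}
		System.out.printf("small served: %d, large served: %d of %d%n",
				smallServed, largeServed, largeSeen);
	}
}
===

Because a 1-byte packet is accepted whenever even one byte of quota is free, the quota never accumulates anywhere near 1 MB, and largeServed stays at zero no matter how long the simulation runs.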
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.04 - 11:15:09GMT -----

You're right, that pretty much describes the problem. I'm suggesting that we should fix it by modifying step 2:

2. if a packet arrives and the remaining bandwidth quota doesn't allow to forward *an average packet*, it gets instantly rejected

The average is calculated over all packets, large and small. In other words we ask:

a) how much bandwidth does a packet use, on average?
b) is there enough bandwidth for another average packet?

Let's say 75% of packets are 1 byte in size and 25% are 1 MB in size, so the average is 262,144.75 bytes. We accept packets (large or small) if we have at least 262,144.75 bytes of bandwidth left in the quota, and reject them otherwise. If there's a long stream of 1 byte packets followed by a 1 MB packet we might go over quota temporarily, but we'll match the quota in the long term.
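A minimal sketch of this modified rule, assuming a simple running mean of observed packet sizes (the class and its names are hypothetical):

===
// Sketch of the "average packet" admission rule: accept any packet, large
// or small, iff the remaining quota could fit an average-sized one.
class AverageSizeAdmission {
	private double avgSize = 0;   // running mean of observed packet sizes
	private long seen = 0;
	private double quota = 0;     // remaining bandwidth quota, in bytes

	void refill(double bytes) { quota += bytes; }

	boolean offer(int packetSize) {
		seen++;
		avgSize += (packetSize - avgSize) / seen; // incremental mean
		if (quota < avgSize)
			return false;        // reject: can't fit an average packet
		quota -= packetSize;     // may briefly push the quota negative...
		return true;             // ...but it balances out in the long term
	}
}
===

With the 75%/25% mix above, avgSize converges to 262,144.75 bytes, so a 1 MB packet is accepted whenever at least that much quota remains, even though accepting it temporarily overdraws the quota.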
----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.10 - 21:01:58GMT -----

Oh no, getting stuff instantly rejected is most often not good (an exception would be certain kinds of realtime traffic, but that is not applicable to freenet at all). I have some experience with independent/competing ISPs and their broken traffic shaping routers that were always dropping packets not fitting the current shaping limits; TCP performance took a major hit there, and several TCP connections running under the same shaping always took seriously unfair bandwidth shares (unless you use quite long intervals for stats, like 10+ minutes). Changing the shaping to queue an over-quota packet (even a single-packet queue!) until the calculated average bandwidth allows sending it (thus, in the end, increasing the roundtrip slightly) was always sufficient for a TCP flow to work at 100% of the shaped level, and for simultaneous TCP streams to share the available bandwidth very equally even over sub-second stats intervals; no other working solution was found (aside from raising the shaping limit above the maximum speed of the TCP peers).

I am not sure that can be directly applied to the current freenet networking code; honestly, the mechanism of first quickly accepting packets and then slowly picking them up using some kind of filters looks unnecessarily complicated and suboptimal for performance, to say the least. I have another bright example why: the mechanism quite resembles traditional O/S network packet handling (with received packets extracted from the NIC at highest priority - during the hardware interrupt - and then the CPU/server business logic failing to process all the received packets, leading to internal queue overflow), and after years and decades it is generally agreed that such an approach does not work well for server applications. Instead, linux has for several years had a mechanism named NAPI (optional for some NIC drivers - check your kernel config, but default and mandatory for most server-grade and/or 1Gb NIC drivers): the hardware interrupt just sets a flag/semaphore indicating that the NIC has received something, and instantly quits, leaving that NIC's interrupt line disabled (the actual algorithm is a little more complex, allowing the hardware interrupt to extract a very limited number of packets if the host is very idle). Then a lowest-priority kernel thread (the "software interrupt"), woken by the flag/semaphore, starts reading packets from the NIC into the O/S queues (from which user-level read()s are satisfied), extracting only a limited number of packets at a time (then yielding the CPU to other runnable processes), and re-enabling the NIC interrupts only once it has managed to empty the hardware queue - with TCP flow control, and with modern ethernet hardware flow control, this works exceptionally well. Thus the server business logic (i.e. useful work), running at a priority much higher than the software interrupt thread, is never starved of CPU by hardware interrupts that first pull in packets only for the CPU to be wasted dropping them from an overflowing system queue - resulting in smooth behaviour and the best sustained performance.

Or in short - on overload, delaying the reading/processing of input packets is better than dropping or rejecting them instantly.

Toad - if you know a simple way to delay freenet reads from the UDP socket in order to enforce the configured input bandwidth limit, please do so. (And with that UDP read delay, I would be very interested to test a freenet node with no other input bandwidth limiters aside from input bandwidth liability; chances are the UDP socket read delay will be sufficient for quality shaping, with the valuable help of the sending node tracking the roundtrip - an already well implemented feature.)

If the delay can not be done easily with the current codebase, I will consider doing a major rewrite of that part of the traffic accepting code. Not of highest priority though, due to the anticipated large amount of work - but those high fruits look big and tasty.
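What such read-side throttling might look like, as a rough sketch: pace reads from the UDP socket against a byte budget, letting the kernel socket buffer act as the short queue. This is an assumption-laden illustration, not Freenet's actual packet handler:

===
import java.net.DatagramPacket;
import java.net.DatagramSocket;

// Sketch: enforce an input bandwidth limit by *delaying* socket reads
// instead of rejecting already-received packets. Unread packets sit in the
// kernel buffer; senders see a higher RTT and are expected to slow down.
class ThrottledUdpReader implements Runnable {
	private final DatagramSocket socket;
	private final int limitBytesPerSec;
	private double budget = 0;              // bytes we may read right now
	private long lastRefill = System.nanoTime();

	ThrottledUdpReader(DatagramSocket socket, int limitBytesPerSec) {
		this.socket = socket;
		this.limitBytesPerSec = limitBytesPerSec;
	}

	public void run() {
		byte[] buf = new byte[2048];
		DatagramPacket packet = new DatagramPacket(buf, buf.length);
		try {
			while (true) {
				refill();
				while (budget < buf.length) { // short on budget: wait, don't drop
					Thread.sleep(5);
					refill();
				}
				socket.receive(packet);       // blocks until a packet arrives
				budget -= packet.getLength();
				handle(packet);               // hand off to the node's dispatcher
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

	private void refill() {
		long now = System.nanoTime();
		budget = Math.min(budget + limitBytesPerSec * (now - lastRefill) / 1e9,
				2.0 * limitBytesPerSec);      // cap bursts at ~2 seconds' worth
		lastRefill = now;
	}

	private void handle(DatagramPacket p) { /* decrypt, dispatch, etc. */ }
}
===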
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.11 - 14:56:37GMT -----

> I am not sure that can be directly applied to the current freenet
> networking code;

We're working on an idea called token-passing that's supposed to address this: you can only send a search (request/insert) to a peer if you have a flow control token from that peer. If you don't have a token you either keep the search in a queue until you receive a token, or send it to the next-best peer if the queue is full.

> the mechanism quite resembles the traditional O/S network packets handling
> (with received packets extracted from NIC at highest priority - during
> hardware interrupt, and then having CPU/server business logic failing to
> process all received packets leading to internal queues overflow)

Interesting point - in the new congestion control layer, maybe the UDP reader shouldn't advance the receiver window until the internal queues have dropped below a certain size... but it might be tricky to implement because the internal queues all belong to different threads...

> If the delay can not be done easily with the current codebase, I will
> consider doing major rewrite of the traffic accepting code part.

This is due to be rewritten soon anyway, so now's probably a good time to make suggestions.
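A minimal sketch of the token-passing rule as just described (send if we hold a token, otherwise queue, otherwise overflow to the next-best peer); all names are hypothetical:

===
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Sketch of token-passing flow control: a search may only be sent to a
// peer if we hold one of that peer's tokens; otherwise it waits in a
// bounded queue, or overflows to the next-best peer.
class TokenPassingSketch {
	static final int MAX_QUEUE = 32;

	static class Peer {
		int tokens;                               // tokens this peer gave us
		final Queue<Object> pending = new ArrayDeque<>();
	}

	/** Try peers in routing order (best first). */
	void route(Object search, List<Peer> peersBestFirst) {
		for (Peer peer : peersBestFirst) {
			if (peer.tokens > 0) {
				peer.tokens--;
				send(peer, search);               // have a token: send now
				return;
			}
			if (peer.pending.size() < MAX_QUEUE) {
				peer.pending.add(search);         // wait for a token
				return;
			}
			// queue full: fall through to the next-best peer
		}
		rejectLocally(search);                    // nowhere left to send it
	}

	/** Called when a peer hands us n new tokens. */
	void onTokensReceived(Peer peer, int n) {
		peer.tokens += n;
		while (peer.tokens > 0 && !peer.pending.isEmpty()) {
			peer.tokens--;
			send(peer, peer.pending.remove());
		}
	}

	void send(Peer peer, Object search) { /* transmit over the link */ }
	void rejectLocally(Object search) { /* RouteNotFound / backtrack */ }
}
===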
----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.14 - 16:46:24GMT -----

While token passing would indeed smooth the traffic out, it feels excessive:

- it adds extra traffic;
- it creates additional traffic patterns that rather simplify attacks (like those aiming at reliably proving that a particular request originates from the attacked node) against a node whose connections are all monitored (by the ISP) and some of which are fast but compromised (compromised peers);
- it requires pulling a multidimensional set of heuristics on whom to send a new token out of thin air, and those heuristics will tend to disagree for different connection types.

The method of delaying network reads (that's important - and AFAIK the only major thing missing to get shaping rolling smoothly already) should work similarly well (maybe even better): just treat the metric 'the current peer roundtrip time is lower than the [peer] average roundtrip time' as the equivalent of 'the peer gave us a few tokens', and enjoy bandwidth/crypto(CPU)-free virtual token passing which obeys both hardware/ISP-imposed traffic shaping limits and software-configured limits - whichever is stricter.

So at this point I discourage implementing explicit token passing, in favour of the lower, equally tasty fruit.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.17 - 21:40:27GMT -----

> - it adds extra traffic

Um, right. "Here are n tokens" takes about 6 bytes: two for the message type, two for the message size, and two for the number of tokens (we're never going to hand out more than 65535 tokens in one go). It uses less traffic than "Can I send you a request?" "Yes" "Here's the request", and it avoids a round-trip. It also uses less traffic than "Can I send you a request?" "No", because if you don't have a token, you don't need to ask!

> - it creates additional traffic patterns, that quite simplify attacks (like
> those aiming at reliably proving that a particular request originates from
> attacked node) against a node which all connections are monitored (by ISP),
> and some of them are fast but compromised (compromised peers).

Please explain how handing my peer some tokens reveals anything about traffic patterns that wasn't already visible to traffic analysis. If they can see the requests and results going back and forth, who cares if they can also see the tokens?
> - it creates additional traffic patterns that considerably simplify attacks (such as those aiming to reliably prove that a particular request originates from the attacked node) against a node whose connections are all monitored (by the ISP) and some of whose peers are fast but compromised.

Please explain how handing my peer some tokens reveals anything about traffic patterns that wasn't already visible to traffic analysis. If they can see the requests and results going back and forth, who cares if they can also see the tokens?

> - it requires pulling a multidimensional set of heuristics on whom to send a new token out of thin air, and those heuristics will tend to disagree for different connection types.

No magical heuristics are needed - we hand out tokens as long as we're not overloaded (measured by total queueing delay, including the bandwidth limiter). That alone should be enough to outperform the current system, because we'll avoid wasting traffic on rejected searches. Then we can start thinking about clever token allocation policies to enforce fairness when the network's busy, without imposing unnecessary limits when the network's idle, etc etc. But token passing doesn't depend on any such policy - it's just a lower-bandwidth alternative to pre-emptive rejection.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.19 - 17:38:09GMT -----

We hand out tokens when we're not overloaded? What if later on we get overloaded? How exactly do we determine the total appropriate number of tokens in the system?

Token passing would be really nice, it would solve the flooding problem and remove a lot of alchemy...

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.20 - 11:19:36GMT -----

We determine the appropriate number of tokens by experimentation: as we approach overload we produce tokens more and more slowly, until at the overload point we don't produce tokens at all. For example, delay between producing tokens = MAX_DELAY/(MAX_DELAY - currentDelay) seconds, or something similar.

We stop producing tokens when all the buckets are full. To limit the number of tokens in circulation, we adjust the bucket sizes. Each peer's bucket starts at size 0, and is reset to 0 whenever the peer reconnects. Whenever we add a token to an empty bucket, we increase its size by 1. Thus if a peer doesn't use the tokens we give it (i.e. doesn't empty its bucket), its bucket stays small. If it uses the tokens we give it (i.e. empties its bucket), we increase the size of its bucket as long as we're not overloaded (the bucket size is only increased when a token is handed out, which doesn't happen when we're overloaded).
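Taken literally, the policy above fits in a few dozen lines. A minimal sketch, assuming overload is summarised by a single total queueing-delay figure; all class and method names are hypothetical:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the token scheme described above: a peer's bucket starts at
// size 0, grows by 1 each time we hand a token to an empty bucket, and no
// tokens are produced at all once currentDelay reaches MAX_DELAY.
public final class TokenProducer {
    static final double MAX_DELAY_MS = 1000.0; // assumed overload threshold

    static final class Bucket {
        int size = 0;   // capacity, grown slowly
        int tokens = 0; // tokens currently held by the peer
    }

    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    /** Seconds to wait before producing the next token; infinite at overload. */
    static double productionDelaySeconds(double currentDelayMs) {
        if (currentDelayMs >= MAX_DELAY_MS) return Double.POSITIVE_INFINITY;
        return MAX_DELAY_MS / (MAX_DELAY_MS - currentDelayMs);
    }

    /** Try to hand one token to a peer; returns true if a token was granted. */
    boolean grantToken(String peer, double currentDelayMs) {
        if (currentDelayMs >= MAX_DELAY_MS) return false; // overloaded: produce nothing
        Bucket b = buckets.computeIfAbsent(peer, p -> new Bucket());
        synchronized (b) {
            if (b.tokens == 0) b.size++;          // empty bucket: active peer, grow it
            if (b.tokens >= b.size) return false; // bucket full: don't over-issue
            b.tokens++;
            return true;
        }
    }

    /** Called when the peer spends a token on a search it sends us. */
    void consumeToken(String peer) {
        Bucket b = buckets.get(peer);
        if (b != null) synchronized (b) { if (b.tokens > 0) b.tokens--; }
    }

    /** Buckets are reset to size 0 on reconnect, as described above. */
    void peerReconnected(String peer) {
        buckets.put(peer, new Bucket());
    }
}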
----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 - 11:45:36GMT -----

Even this brief description proposes 3 heuristics already:

- "more slowly or something similar"
- "adjust bucket sizes"
- "stays small"

No doubt a complete practical implementation will involve quite a few more, and I see no way to ensure they will all work smoothly across the wide variety of connection types.

This also stays totally silent on the burst problems, as well as on shrinking buckets during burst/overload periods. A rather extreme case: if you gave a node 5 tokens, and it decided to use them 3 minutes later, when the node got overloaded, what did you gain with the tokens? That is a true Pandora's box.

In the end it is timing that matters, while tokens represent an amount of data (and if you think of the formula time*bandwidth=data, then bandwidth, generally speaking, is not constant, due to highly irregular traffic from different peers and other apps sharing the same internet connection). Purely controlling the latter will always be just a rough approximation of controlling the former, alas.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.25 - 17:20:00GMT -----

> I see no way to ensure they will all work smoothly across the wide variety of connection types

That may be true, but can you ensure that with the current mechanism?

> if you gave a node 5 tokens, and it decided to use them 3 minutes later, when the node got overloaded, what did you gain with the tokens?

It shouldn't ever reach that point, because when you give out 4 tokens and get a burst that nearly overloads the node, you'll stop increasing the size of the buckets.

> Purely controlling the latter will always be just a rough approximation of controlling the former, alas.

True, it's only a rough approximation and traffic is bursty and we can't predict the future, but all those things also apply to the current system.

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 - 20:35:30GMT -----

Scenario one:

- I gave 4 tokens to peer A, and 4 tokens to peer B;
- peer A used up its tokens in one burst, already causing a local bandwidth overage;
- now if peer B decides to use up its tokens, my bandwidth is flooded badly.

Scenario two:

- I gave 2 tokens (generally: a very small amount) to peer A, and 2 tokens to peer B at the same time;
- peer A used up both of them, but peer B has nothing to request for now;
- as a result, the available bandwidth is only half-loaded until peer A is explicitly given another few tokens; but giving away small amounts of tokens adds traffic overhead (often in the form of small packets, which have the highest per-byte overhead), and the rate at which small packs of tokens can be fed is seriously limited by roundtrip delays.

Thus nothing is really gained with tokens, fighting that situation sounds fruitless, and the only reasonable approach is to control the roundtrip delay instead of fighting it, by delaying network reads as needed to shape incoming traffic. This gives the sending node not just a boolean can/cannot-send-immediately flag, but an actual number from which to estimate the optimal outgoing inter-packet delay toward a particular peer.
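A minimal sketch of that estimate, treating 'current RTT below the peer's average RTT' as permission to send at the base rate and backing off proportionally as the RTT inflates; the smoothing factor and the scaling rule are assumptions for illustration, not a measured policy:

// Stretch the gap between outgoing packets to a peer as its measured RTT
// rises above its long-run average. Hypothetical heuristic sketch.
public final class InterPacketDelayEstimator {
    private double avgRttMs = -1; // exponentially weighted average RTT
    private static final double ALPHA = 0.125; // smoothing, as in TCP's SRTT

    void onRttSample(double rttMs) {
        avgRttMs = (avgRttMs < 0) ? rttMs : (1 - ALPHA) * avgRttMs + ALPHA * rttMs;
    }

    /**
     * Suggested delay before the next packet to this peer. Below-average
     * RTT acts like "the peer gave us a few tokens": send at the base
     * rate. Above-average RTT means the peer is delaying its reads, so
     * back off proportionally.
     */
    double nextSendDelayMs(double currentRttMs, double baseIntervalMs) {
        if (avgRttMs <= 0 || currentRttMs <= avgRttMs) return baseIntervalMs;
        return baseIntervalMs * (currentRttMs / avgRttMs);
    }
}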
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.26 - 19:31:25GMT -----

You're making the jump from 0 to 4 tokens as if there are no intermediate stages. If 4 searches are enough to overload the node, then after you give 2 tokens each to A and B, you'll get a burst and stop increasing the size of the buckets. You should never reach a situation where there are more tokens in circulation than you can handle simultaneously.

You're also confusing the number of tokens with the size of the buckets (probably my fault for not explaining it properly). If you give 2 tokens each to A and B, and A uses its tokens but B doesn't, you're free to give 2 more tokens to A; there's no reason for A to starve just because B is holding tokens. The only effect of B holding tokens is to limit the size of the bursts you can get from A to a level that will be safe even if you simultaneously get a burst from B. To avoid making these limits too strict, we start with small buckets and only increase the size of a bucket when it's empty, so idle peers are limited to smaller bursts than active peers, but the total burst size is still limited to what we can handle.

Obviously we should also have TCP-style congestion control between peers, but that's not sufficient for network-wide load limiting in my opinion (e.g. look at the original Gnutella, which used TCP between peers but still had massive load limiting problems).

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - 08:54:29GMT -----

Why do you suppose each peer submits a stable flow of traffic? I might be downloading a long, time-consuming file, gaining a bunch of tokens from you; and then, as the download finishes, I still possess a lot of tokens.

As you start issuing tokens in smaller chunks, you experience higher overhead from small token packets which fail to coalesce within the thin time frame.

TCP-style congestion control (ECN) is not between peers; it exists primarily because ECN is directly supported by all (most) TCP/IP routers - without ECN, or with ECN not supported by the routers along the path, TCP flow control is enforced by packet drops only (when roundtrip measurements happen to fail at predicting the real momentary network load).

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.27 - 15:06:51GMT -----

> Why do you suppose each peer submits a stable flow of traffic?

I don't, that's why I have frequently mentioned bursts.

> I might be downloading a long, time-consuming file, gaining a bunch of tokens from you; and then, as the download finishes, I still possess a lot of tokens.

Yes, if I have previously been able to cope with large bursts of traffic it's possible I will give you a large number of tokens. If not, I won't. So what's the problem?

> As you start issuing tokens in smaller chunks, you experience higher overhead from small token packets which fail to coalesce within the thin time frame.

The problem of coalescing small messages on low-bandwidth links is not specific to token-passing. It also applies to the current mechanism, with its "can I send you a search?"
"yes/no" messages that would be > > > eliminated by token-passing. > > > > > > Why you keep talking about ECN I have no idea... are you suggesting > > > that dropping/marking Freenet searches would have a similar effect to > > > dropping/marking TCP packets in ECN? I don't think the two cases are > > > comparable: in TCP you have a stream of packets between the same > > > endpoints, so it makes sense to ask the sender to slow down if one of > > > the routers is overloaded. In Freenet each search is routed > > > independently to a different location, so it makes no sense to ask the > > > sender to slow down if one of the nodes is overloaded - future searches > > > handled by that node will come from different senders anyway. What you > > > need to do is ask the overloaded node's neighbours to put less load on > > > it. If that means they can accept fewer searches, then they should ask > > > their neighbours to put less load on them, and so on. > > > > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - > > > 18:38:46GMT ----- > > > > > > > Yes, if I have previously been able to cope with large bursts of > > > > traffic > > > > > > it's possible I will give you a large number of tokens. If not, I > > > won't. So what's the problem? > > > > > > Once several dormant peers start sending new traffic at approximately > > > the same time, the tokens passing mechanism have no even teoretical > > > chance to avoid bandwidth overages/packets drops. More subtle > > > mechanisms needed in order for sending nodes to know/decide not only > > > about that fact that more traffic can be sent, but also how fast that > > > should be done if there is more than one packet to send. Once such > > > mechanisms are correctly implemented (and in noticeable part they > > > already are), I see no advantages the explicit tokens passing could > > > add. > > > > > > > so it makes no sense to ask the sender to slow down if one of the > > > > nodes is > > > > > > overloaded > > > > > > Oh, why it could be so? Requests for keys with close locations are > > > likely to get routed via the same overloaded node, and with peer > > > locations pattern I keep observing the 'closeness' might easily be as > > > wide as 0.4 and even more (the range of locations that prefer the same > > > peer). The requests sent recently (in term of hops passed, not time) > > > show even distribution of the keys along the 0..1 space, so pushing 40% > > > of traffic at a single peer and disregarding his overload cries can not > > > be seen as good. > > > > > > The ECN distinguishing feature is that congestion/impending overload > > > shared state gets maintained at all devices along the route (including > > > endpoints) that at least potentially do traffic shaping (due to > > > explicit settings or just overload). Lacking it, both TCP and FNP have > > > to resort downto carefully tracking roundtrips, scurpuluously > > > calculating all delays at routers and final hosts in order to react > > > upon even minor degradations quickly (but smoothly). FNP gets more > > > complicated, as packet processing efforts (including all the > > > encryption/decryption) are not as negligible as with TCP, but not > > > indoable - the packets/protocol already carry the temporal information > > > implicitly, it just has to be handled a little bit more careful. 
OTOH, adding explicit auxiliary traffic (be it token passing or whatever) creates a certain dilemma: when trying to maximise variable bandwidth usage without overages, the additional overhead, however little (and for many links not so little, due to the massive FNP+UDP+IP enveloping), might turn out to be the bit that still causes the overages (in fact, in an extremely unlucky flow of events that 'bit' can reach 20% of traffic). Isn't it better to keep using a simpler protocol, maybe aiming it at 98% load where overages are known to cause unreasonably frequent packet drops, or are simply believed to be an absolute evil?

On a slightly different topic, how about wireless connections like radio Ethernet: high bandwidth and higher-than-average link delays combined with a constant non-zero packet drop rate (0.1-2%, but sometimes up to 5% in a large city during the daytime, or in an industrial area). Similar things happen with satellite links, during bad weather or in polar areas where almost no dish size is sufficient. Explicitly passing tokens would be dangerous because of very high bursts: there might be 500KB of data en route on a unidirectional SOHO-grade satellite link, 1MB on a bidirectional one. But if keeping several hundred tokens outstanding is not feasible and a little over a dozen KB is the limit, such links will be usable at 2-5% of the available bandwidth at most.

(Do you know how TCP behaves over such links? During periods of lower drop rates it works more or less acceptably, especially if you aren't in a hurry. At other times people simply take a tea break or find another task for the time being. The solution here could be SCTP-like protocols, but they are very immature and unpopular, and unlikely to suit freenet's needs.)
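The 500KB figure is essentially a bandwidth-delay product. A back-of-the-envelope check under assumed numbers (8 Mbit/s link, ~0.5 s roundtrip, ~1 KB packets - none of these figures are from the thread):

// Bandwidth-delay product for the satellite example above, with assumed
// inputs. The BDP tells us how much data - and hence how many
// token-covered packets - must be in flight to keep the link busy.
public final class SatelliteBdp {
    public static void main(String[] args) {
        double linkBitsPerSec = 8_000_000; // assumed 8 Mbit/s
        double rttSec = 0.5;               // assumed geostationary roundtrip
        double packetBytes = 1024;         // assumed typical packet size

        double bdpBytes = linkBitsPerSec / 8 * rttSec;
        double packetsInFlight = bdpBytes / packetBytes;

        System.out.printf("BDP: %.0f KB in flight%n", bdpBytes / 1024);    // ~488 KB
        System.out.printf("Packets outstanding: %.0f%n", packetsInFlight); // ~488

        // Capping outstanding tokens at ~a dozen KB leaves the link mostly
        // idle - consistent with the 2-5% utilisation claim above.
        double cappedBytes = 12 * 1024;
        System.out.printf("Utilisation with 12 KB cap: %.1f%%%n",
                100 * cappedBytes / bdpBytes); // ~2.5%
    }
}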
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.28 - 22:10:09GMT -----

> Once several dormant peers start sending new traffic at approximately the same time, the token-passing mechanism has not even a theoretical chance of avoiding bandwidth overages/packet drops.

We're going in circles here. As I've already said, you never have more tokens outstanding than you can handle if they come back *simultaneously*. So I will only allow you to accumulate a large number of tokens if I can handle all those tokens, and all other outstanding tokens, being used at once. We don't have any magic way of determining how many tokens we can handle at once, so we *gradually* increase the limits until we get close to overload.

> More subtle mechanisms are needed so that sending nodes can know/decide not only that more traffic may be sent, but also how fast it should be sent when there is more than one packet to send.

Yes, we need a TCP-like mechanism to avoid congestion between peers, but that's not enough to avoid overloading the peers themselves (rather than the individual links).

> pushing 40% of the traffic at a single peer while disregarding its overload cries cannot be seen as good

When did I say we should ignore overload? I said the overloaded node's neighbours should respond, rather than the source of the traffic, because the overloaded node's neighbours have more control over the amount of load it receives than a source elsewhere on the network.

> The distinguishing feature of ECN is that the congestion/impending-overload shared state gets maintained at all devices along the route

But my point was that in FNP there's no concept of a 'route' as there is in TCP. A series of TCP packets will travel along (more or less) the same route between the same endpoints. A series of FNP requests will travel along diverging routes to completely different endpoints. So asking the source to slow down when it hits an overloaded node is shutting the stable door after the horse has bolted: the source has already moved on to some other route that might be more or less loaded than the current route.

> Isn't it better to keep using a simpler protocol

Sorry, I don't see how the current backoff/throttling/preemptive rejection mechanisms are simpler than token-passing. Equally complicated, maybe, but not less. :-)

> The solution here could be SCTP-like protocols, but they are very immature and unpopular, and unlikely to suit freenet's needs

Yup, when I was working on the new congestion control design last year I considered copying DCCP or TCP Vegas to be more loss-tolerant (and it's still a possibility, if you want to design and document it). But in the end I decided the rest of the world was going to carry on using TCP, no matter how unsuitable, and the world's networks would be made suitable for TCP by hook or by crook (or by ugly MAC-layer retransmission hacks), so we could just copy TCP and let the layer 2 engineers solve the problem. :-)

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.29 - 01:26:17GMT -----

Okay, so just like the current code, we avoid timeouts by sacrificing bandwidth - sometimes sacrificing A LOT of bandwidth. Is that bad?

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.29 - 19:47:54GMT -----

We'd be sacrificing a lot of bandwidth at startup, but eventually we should reach full speed... I guess it partly depends on how long we expect nodes to stay up.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.29 - 21:39:39GMT -----

Not if the nodes don't take up their tokens. And not if the requests fail, like the current algorithm.

BTW I'd appreciate input on the load impact of ultra-lightweight passive requests.
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.01 - 21:15:53GMT -----

Sorry, what do you mean by nodes not taking up their tokens? We hand out tokens to our peers, they don't need to ask for them. If a peer doesn't use its tokens we don't increase the size of its bucket, so inactive peers will only lock up a few tokens.

I don't see what difference it makes whether the requests succeed or fail - the token allocation algorithm isn't affected by that. (And it shouldn't be, because a failure could indicate a million different things, not all of which can be solved by reducing our own throughput.)

About passive requests: it's a cool idea, but it's not going to be trivial to design or implement (there's a lot of literature on p2p publish/subscribe that you should probably look at first). Obviously the choice is yours because you're the one doing the work, but it seems to me that this is a pretty major addition for a vaguely specified benefit. Can we quantify the bandwidth savings, even roughly? For example, if we assume each client polls every 45 minutes and the key is updated once every 3 hours, we're only reducing 4 requests per client to one request per client, and the requests are pretty small compared to the response. There don't need to be very many broken paths in the subscription web before the overhead of unnecessary responses outweighs the overhead of polling.

Checking that your peers still want the response before sending it doesn't solve the problem of broken paths, because your peers don't necessarily know whether they still want it until they've checked with their peers, and their peers, and so on all the way back to the client. You could send a notification all the way back to the client, which then re-requests the key, but by that time you've nearly equalled the 4 requests per client of the current system, with request-notification-request-response instead of request-request-request-request-response. Notifications might also make it possible to discover the identity of the client through round-trip time measurement.
----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.06.04 - 16:24:57GMT -----

With a client making just one SSK request every 45 minutes, and 500 clients doing the same (e.g. frost on a common board, or thaw on a popular index), it currently results in some node (the one whose location is closest to the key location) seeing the requests at a rate of one every 5.4 seconds. That's how big the benefit can be for a single node handling a single key. As frost requests several keys per board, and there are many boards, that's approximately how much can be saved for each node. Add frost filesharing, thaw, freemulet, ARK, and other SSK-based traffic - as freenet grows, the flood of unsuccessful SSK requests is going to get worse if nothing is done.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.07 - 10:31:14GMT -----

The number of clients is irrelevant to the comparison, because it's the same with or without failure tables. With failure tables, the network will have to handle one request per client to set up the failure tables, plus one notification per client when the data is ready, plus one reply to the notification per client (which might take the form of a request) indicating that the client still wants the data. Without failure tables, the network has to handle one request per client every 45 minutes. So the question is, how much bandwidth are we saving by switching from one request every 45 minutes to one request, one notification and one notification reply? As far as I can see, if boards are updated every few hours on average, the answer is "not much". I'm not saying it's a bad idea, I'm just questioning whether the additional complexity is worth the (so far unquantified) bandwidth savings.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.07 - 14:48:09GMT -----

One request per client-node per key per 45 minutes. Without failure tables, each request goes the full distance. With failure tables, for a popular key, most of them won't go more than a few hops - and in fact many of them will not start in the first place, because the originator node still has a failure table entry and is subscribed. This is a *massive* saving on the total number of request-hops and bandwidth etc used by those requests. When the data is eventually found, if every node is subscribed, there will be an offer sent along every link, but only one actual data transfer per node. Since we assumed every node is interested in the key, requests would have occurred on each of these links anyway. So there should be a significant saving overall. Not to mention side benefits such as greatly reduced latency etc.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.07 - 15:45:07GMT -----

In a static network I agree that it's a big saving. But if there's significant churn then there's going to be significant overhead involved in maintaining a subscription tree (or subscription web or whatever) for each key. All I'm asking is, can we quantify the savings? Can we say "if the average node lifetime is x and the polling interval is y and the update interval is z, we'll save bandwidth if at least n clients want the key"? Maybe the question is premature because you're still designing the subscription system - I'm not asking you to give me a number, I'm just saying we should think about how we would go about calculating that number.
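The arithmetic in the exchange above, checked directly (the inputs are from the thread, the rest is plain division):

// Sanity check of the polling figures quoted above.
public final class PollingMath {
    public static void main(String[] args) {
        // 500 clients, each polling one SSK every 45 minutes: the node
        // closest to the key sees one request every 5.4 seconds.
        double pollIntervalSec = 45 * 60;
        int clients = 500;
        System.out.printf("Inter-arrival at the hot node: %.1f s%n",
                pollIntervalSec / clients); // 2700 / 500 = 5.4

        // Earlier figure: with a 45-minute poll interval and a key updated
        // every 3 hours, polling costs 4 requests per client per update,
        // versus roughly 1 with subscriptions.
        double updateIntervalMin = 3 * 60;
        System.out.printf("Polls per update: %.0f%n", updateIntervalMin / 45); // 4
    }
}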
----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.07 - 16:38:07GMT -----

Good question. First subquery: what form does that overhead take?

Because it's all unreliable, we let a request through every 10 minutes or so. But if churn happens, we let more requests through. That's basically the overhead - some requests which would have happened anyway. HOWEVER, it comes with something extra: for 10 minutes, the old branch is still subscribed along with the new branch (after that, it drops out, although we might offer it for that first hop anyway up to some period). So what that means is that when the data is eventually found, it goes to the old branch too - that's bandwidth overhead. If there is only one client and he requests the data no more often than every 10 minutes, the overhead would be zero: the situation would be identical to what it is at present. As far as I can see, it improves from there - modulo the "extra branches" problem. One way to reduce the extra branches problem is to reconfirm the subscription when it triggers, i.e. propagate the notification faster than the data - but that would probably lead to a lot of extra complexity, and not be worth it in general.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.07 - 22:50:25GMT -----

Thanks for that. Could you clarify a couple of points?

1. When you talk about the old branch and the new branch, do these branches go upstream from the current node (i.e. towards the requester) or downstream (i.e. towards the key)?

2. How can the old branch still be subscribed along with the new branch? Haven't we lost contact with the old branch?

3. What happens if we lose all our subscribers (upstream neighbours that have requested the key)? Do we unsubscribe from our downstream neighbour, or just ignore the notification when it arrives?

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.07 - 23:11:02GMT -----

1. Towards the key. We have a number of incoming requests, which we terminate and remember. The first request caused us to route towards the key, and it DNFed. But later we get another request, and at that instant there is a better node we could route the request towards. So we do. That creates the second branch. We now have multiple incoming subscribers and two outgoing subscriptions.

2. We are still subscribed to it. And if we have bidirectional subscriptions, we would offer the key to that branch if we got it from the other branch.

3. We never unsubscribe, we never tell our downstream nodes we have lost a subscription, etc. An important design goal is simplicity. If we get an offer, and we're not interested in the key because we don't have any subscribers and we don't want it for ourselves, we don't respond to the offer.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.07 - 23:54:48GMT -----

OK thanks, one more question. :-)

4. What happens if we lose our downstream neighbour(s) due to churn? Upstream nodes will still expect to get the key from us when it's published - do we automatically subscribe through the best remaining downstream node? What if all remaining nodes are already subscribed through us? It seems like we could end up with an island of nodes that are all subscribed through each other, but are disconnected from the location where the key will be published. Any request that hits the island will turn into a subscription, increasing the size of the island.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.08 - 12:52:04GMT -----

We could very well end up as an island. This mechanism is inherently unreliable.
One protection we have is that if we get a request, and our subscribed-through node is no longer available, or there is a better node we could route through, we will route it despite the failure table. Another protection is that after 10 minutes we will route the request anyway. A third protection, if we agree on it (Anonymous@ is strongly opposed), is bidirectional subscription: if we get the data, we offer it to all nodes which either have subscribed to us, or which we have subscribed to. It increases the average success rate, reduces the average latency, and reduces the resource usage for more popular files. It doesn't guarantee anything. Making guarantees makes the protocol immensely complex. I'm not saying it's impossible, but right now it's hard, and we need something easy.
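A minimal sketch of the three protections described above, expressed as a single routing decision plus an offer rule; the structure and all names are hypothetical, not the actual failure-table code:

import java.util.Set;

// The routing decision: route despite the failure table if the downstream
// node is gone or a better node is now available (protection 1), or if the
// entry is older than 10 minutes (protection 2). Protection 3, the
// contested bidirectional subscription, is the offer rule at the bottom.
public final class FailureTableRouting {
    static final long REQUEST_ANYWAY_AFTER_MS = 10 * 60 * 1000; // "after 10 minutes"

    interface Peer { boolean isConnected(); }

    static final class FailureEntry {
        final long createdAtMs;
        final Peer subscribedThrough; // downstream node we routed to before
        FailureEntry(long createdAtMs, Peer subscribedThrough) {
            this.createdAtMs = createdAtMs;
            this.subscribedThrough = subscribedThrough;
        }
    }

    /** Should this incoming request be routed despite the failure table? */
    static boolean routeDespiteFailureTable(FailureEntry entry, Peer bestPeerNow, long nowMs) {
        if (entry == null) return true;                          // no entry: route normally
        if (!entry.subscribedThrough.isConnected()) return true; // lost the downstream node
        if (bestPeerNow != entry.subscribedThrough) return true; // a better node exists now
        return nowMs - entry.createdAtMs >= REQUEST_ANYWAY_AFTER_MS; // entry too old
    }

    /** Bidirectional subscription: offer found data both to nodes subscribed
     *  to us and to nodes we have subscribed to. */
    static Set<Peer> offerTargets(Set<Peer> subscribers, Set<Peer> subscriptions) {
        Set<Peer> targets = new java.util.HashSet<>(subscribers);
        targets.addAll(subscriptions);
        return targets;
    }
}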