On Wednesday 08 August 2007 15:59 (and in follow-ups through 16:25), Matthew Toseland wrote:

Unfortunately this thread is rather rambling; it includes lots of discussion on token passing as well as the original premise. Continued at end to minimise confusion!

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.25 - 20:18:36GMT -----

I made some measurements on how a Freenet node behaves if the bandwidth limit is set low: 10KBps and down to 6KBps (specifically the input bandwidth limit; the output bandwidth limit was set to at least 15KBps, but as expected the actually used output bandwidth is comparable to, just slightly above, the actually used input bandwidth). The node itself was running Frost but no uploads/downloads, so the vast majority of network traffic was forwarded CHK/SSK requests/inserts.

The results are interesting enough: CHK traffic becomes as low as 5% (in packets) of CHK+SSK, while at least 92% of SSK requests were not satisfied due to assorted failures (plus quite a few more certainly resulted in a NotFound response because the key is missing from the whole network, but I don't have that number). This makes a low-traffic node work highly inefficiently and disproportionately slowly; it also slows down its peers with all the extra reject traffic. Worse, input bandwidth sometimes goes over the set limit, suggesting that on a hardware 33600/56000 bps modem, and even on ISDN, things will just get worse due to increased delays.

Another thing to note: a low bandwidth node (LBN) almost exclusively rejects requests with the "input bandwidth liability" reason, and extremely rarely for other reasons.

Speculating a bit, the same picture will likely be observed for peers of a fast node (1Mbps or more) when many of its peers have a typical home connection of 256Kbps or less.

Not sure if simulations ever showed anything like this, but contributing mostly SSK service to the network (and the vast majority of SSK requests fail!) is rather useless: an optimally working network is supposed to transfer at least one CHK block for each SSK key, and typically much, much more (a single 10MB file consists of 481 CHK blocks!), and even if you found the SSK but not the CHK the SSK points to, you still failed to find the information you requested.
OK, to make the long story short[er]: at the end of this message you will find a small patch that noticeably improves the LBN situation. The idea is to reserve some bandwidth for CHK transfers (and SSK inserts, as those are too rare to penalize, and more valuable than requests). The line directly before the inserted one implicitly penalizes CHK transfers (the much smaller SSK requests tend to re-reserve bandwidth the moment it is released after a CHK transfer finishes, while the much larger CHK requests do not have such a good chance), so bandwidth should be reserved for at least 2 CHKs (and tests show that's enough to make a difference).

Another thing I tried was increasing the 90 second period up to 120. That had some positive effect (no numbers here; just "noticeable but small") on making traffic smoother and staying closer to the set limit, without jumping up and down too much. Where did the 90 second number come from anyway, and how dangerous could 120 be?

Some pros observed and/or thought out during tests of the patch:
- I observe an increase of output payload by approx. 15% (of total traffic), making the LBN more useful for its peers.
- the change is negligibly small for faster nodes, so it should not break anything globally.
- network-wide SSK flood traffic will be toned down a little (at temporarily overloaded nodes only), additionally simplifying life for LBNs: after all, requesting the same SSK every 15 seconds for 35 hours, 8100 times in total (factual numbers from one of the tests before this patch was applied; there are over 60 other SSKs that were requested more than 1000 times during the same period), is just way too much; SSKs are not inserted into the network THAT fast. [Is it worth remembering recently seen SSK requests, and not forwarding one if the same request was already forwarded within the last 10 minutes and resulted in DNF/RNF? A table of recently requested SSKs closest to the node location should not be too big? A rough sketch follows after the cons below.]

And cons:
- in exceptional conditions (specifically, with fewer than 2 incoming CHK requests per 90 seconds; in fact I observe 2-7 CHK requests per second, i.e. 180-630 per 90 seconds) regardless of node bandwidth, up to 800 Bps might end up unused. For a high bandwidth node that's far too small to notice; for an LBN that's still acceptable (10% of 56Kbps) and will decrease roundtrip delays a bit, which is always a good thing for such slow links.
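To make the bracketed idea concrete, here is a rough sketch of such a table of recently failed SSK requests. The class, the key type, and the size cap are invented for illustration; the 10-minute window is the value suggested above. This is not existing Freenet code.

===
import java.util.LinkedHashMap;
import java.util.Map;

// Remember SSK requests that recently failed (DNF/RNF) so an identical request
// is not forwarded again within a short window.
class RecentlyFailedSSKs {
    private static final long WINDOW_MS = 10 * 60 * 1000L; // 10 minutes, as suggested above
    private static final int MAX_ENTRIES = 1024;           // keep the table small

    private final Map<String, Long> failedAt =
        new LinkedHashMap<String, Long>(16, 0.75f, false) {
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > MAX_ENTRIES; // evict the oldest entries first
            }
        };

    synchronized void recordFailure(String sskKey) {
        failedAt.put(sskKey, System.currentTimeMillis());
    }

    /** True if we already forwarded this SSK request recently and it failed. */
    synchronized boolean shouldSuppress(String sskKey) {
        Long t = failedAt.get(sskKey);
        return t != null && System.currentTimeMillis() - t < WINDOW_MS;
    }
}
===

As the later discussion notes, actually blocking keys this way carries censorship risks, so any such suppression would have to be short-lived and purely local.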
Other notes:
- the distribution of location closeness versus number of SSK requests is very nice: only SSK requests with a location very close to the node location get repeated frequently; the farther away the SSK location is, the fewer requests the node sees, and those SSKs seen only once or twice per 1-2 day period are distributed evenly over the location space. This suggests that routing is working fine.
- As far as I understand, if the input bandwidth limit/liability is exceeded (but a packet has already been received anyway), the CHK/SSK request gets instantly rejected (thus throwing away received bytes while input bandwidth has no spare volume!); only otherwise does the node check whether the requested key exists in the store. Heh? This feels like a serious bug hurting overall network performance: better to query the store and hopefully send back the result (or still reject if the key is not found locally) rather than wait for a retry request to waste more input bandwidth. At least for SSKs, a reject and a reply are comparable in output bandwidth usage, so it is worth a little delay in the response. Or am I missing something?

===
diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
index 3b091b4..fb9f8b9 100644
--- a/freenet/src/freenet/node/NodeStats.java
+++ b/freenet/src/freenet/node/NodeStats.java
@@ -414,6 +414,7 @@ public class NodeStats implements Persistable {
 			successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
 			successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
 		bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
+		if (isSSK && !isInsert) bandwidthLiabilityInput += successfulChkFetchBytesReceivedAverage.currentValue() + successfulChkInsertBytesReceivedAverage.currentValue(); // slightly penalize SSK requests by reserving bandwidth for 2 additional CHK transfers (or SSK inserts if any)
 		double bandwidthAvailableInput =
 			node.getInputBandwidthLimit() * 90; // 90 seconds at full power
 		if(bandwidthLiabilityInput > bandwidthAvailableInput) {
===

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.26 - 16:56:59GMT -----

Most SSK requests fail. They DNF. The reason for this is that most SSK requests are polling for data that has not yet been inserted.

Bandwidth liability is usually the main reason for rejection. If we reach most of the other reasons, there is a problem - usually a cyclical problem. The main reason for it is to ensure that we don't accept so many requests that some of them needlessly time out even though they succeeded.
The timeout is 120 seconds, so we need the actual transfer to take less than this; on a request, 30 seconds seems a reasonable upper bound for the search time. We don't throw out many bytes when we reject a request/insert, because the bulk of it hasn't been sent yet, except with SSKs, where typically a little under half of the total bytes will have been moved. Ideally we wouldn't send requests until we have a good idea that they will be accepted, but token passing load balancing is a long way off, not likely to happen for 0.7.0.

We cannot control input bandwidth usage precisely.

Any more info on SSK flooding? Is it simply Frost?

We can add a failure table, we had one before; however a failure table which results in actually blocking keys can be extremely dangerous. What I had envisaged was "per node failure tables", i.e. reroute requests which have recently failed to a different node, since we know it isn't where it's supposed to be.

On what do you base the assertion about key closeness? It would be nice to have a histogram or circle on the stats pages showing recent keys on the keyspace - can you write a patch?

As far as your patch goes, surely rejecting more SSK requests would be counterproductive as it wastes bandwidth? Shouldn't a slow node accept those requests it's likely to be able to handle?

I can see an argument that we shouldn't prefer SSKs, and that on slow nodes we do prefer SSKs... I'm not sure the above is the right way to deal with it though. The effect of the patch would be to never accept any SSKs unless we have plenty of spare bandwidth, correct?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.26 - 18:41:32GMT -----

> Ideally we wouldn't send requests until we have a good idea that they will be accepted, but token passing load balancing is a long way off, not likely to happen for 0.7.0.

Well, even the current algorithm implementation has certain room for improvement.
Here are the typical numbers I observe:

===
unclaimedFIFO Message Counts
* FNPRejectOverload: 89 (45.2%)
* FNPInsertTransfersCompleted: 59 (29.9%)
* FNPDataNotFound: 15 (7.6%)
* packetTransmit: 12 (6.1%)
* FNPRejectLoop: 7 (3.6%)
* FNPAccepted: 6 (3.0%)
* FNPSwapRejected: 4 (2.0%)
* FNPDataInsertRejected: 4 (2.0%)
* FNPRouteNotFound: 1 (0.5%)
* Unclaimed Messages Considered: 197
===

FNPRejectOverload always stays at the top, sometimes with hundreds of messages (for the last hour before unclaimed messages expire, that's a lot), which indicates that there is some bug (or bugs) in obeying the bandwidth limits.

> Any more info on SSK flooding? Is it simply Frost?

Not local Frost, for sure; it generates just a few simultaneous SSK requests (by default max 8: 6 for boards plus 2 for filesharing, AFAIR; in practice 2-4 simultaneous requests most of the time). The other 100 SSK requests (without the proposed patch) are forwarded ones.

> We can add a failure table, we had one before, however a failure table which results in actually blocking keys can be extremely dangerous;

Is it, with a timeout of at most a few minutes (i.e. at least a few times less than the SSK propagation time visible with Frost messages)? Is it more dangerous than the current waste of bandwidth on requests for the same SSK key several times per minute? Have any simulations been done on that in the past?

BTW, doesn't the observed very low store hit rate result from prioritising the likely-to-fail SSKs?

BTW2, the failure table could also act as a targeted content propagation mechanism: if a node sees an SSK insert for a temporarily blacklisted (non-existing) SSK, then forwarding the insert (more likely a copy of the insert, for security reasons and for routing's sake) to the original requestor should speed up propagation of new SSKs toward the nodes that already anticipate/await them.

> what I had envisaged was "per node failure tables" i.e. reroute requests which have recently failed to a different node since we know it isn't where it's supposed to be.

At a glance, a very nice idea. But LBNs typically answer with a reject, not a DNF... even with the current code. Probably such rerouting would further increase SSK traffic toward LBNs, and produce a sharply increased volume of SSK rejects as a result. Hmm, some testing/simulation seems really needed here.

> On what do you base the assertion about key closeness? It would be nice to have a histogram or circle on the stats pages showing recent keys on the keyspace - can you write a patch?
Mmmm... in fact I just added custom logging, then a wild combination of grep/sed/sort/uniq to analyze the logs. But let me think, maybe visualizing a couple of the stats files I operate with would be rather trivial...

But I would rather stay away from stats page graphics at this time, as the stats files I operate with (filtered+sorted+uniqued) are rather large, 20-50MB each - too much memory for the toy. Unless your 'recent' means just 10-15 minutes at most?

> As far as your patch goes, surely rejecting more SSK requests would be counterproductive as it wastes bandwidth?

Tests show the opposite: without the patch the payload output on the stats page never exceeded 38%; with the patch it becomes 53% or a little more within several minutes of node restart. So, with the patch, the SSK/CHK forwarding behaviour 'feels' logical:

without the patch:
- just a few CHKs, and over 100 SSKs, is very typical.

with the patch:
- most of the time (say, 75%) the number of currently forwarded CHK requests+inserts approximately equals the number of SSK requests+inserts (i.e. 10-25 each, depending on the set bandwidth limit);
- sometimes (say, 10%) CHK requests start to prevail, but the current SSK requests+inserts never seem to go below the amount that CHKs get at most without the patch (i.e. 6-10). This is very typical when the number of CHK inserts gets several times higher than CHK requests (a close fast peer inserting something really large?);
- at other times (say, 15%) the CHK requests+inserts flow does not saturate the bandwidth, and the number of SSK requests quickly climbs to 50 or even over 100+ as it typically does without the patch.

That's for an LBN. Raising the input bandwidth allotment, the number of SSKs quickly grows, resembling the situation without the patch.

So that's why I suggest reserving bandwidth for 2 CHK transfers; 3 would kill SSKs, and 1 still lets SSKs seriously prevail over CHKs (but nonetheless gives a rather better ratio, so it is a legitimate value to try if the value of 2 alarms you too much). Note that in the case of reserving bandwidth for 1 extra CHK the proposed patch is not really needed: simply comment out the line starting with "bandwidthLiabilityInput +=" and decrease the 90 second constant to 80 (10 seconds is roughly how long a 33.6Kbaud modem takes to transmit a single CHK - using anything noticeably slower than 28800/33600 baud for Freenet will never work well anyway).

> Shouldn't a slow node accept those requests it's likely to be able to handle?

Considering the very high chance of SSK request failures (at least 92%), I would say the answer is no.
But with a sane SSK failure rate (say 75% or below) SSK requests would likely not waste the limited, and thus precious, LBN bandwidth so fruitlessly.

The problem, I believe, is the too-small size of UDP packets when SSK requests prevail: the PPP(oE)/TCP/FNP overhead becomes too large, while LBNs, unlike faster-link nodes, almost never coalesce packets, obviously.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.27 - 17:19:24GMT -----

The current algorithm is working, on most nodes, far better than it has in *ages*. I'm at 62% of a 700MB ISO, I started inserting it yesterday morning, and only a few of my peers are backed off - frequently none are backed off; right now it's 11 connected, 6 backed off, which is more backed off than I've seen for quite a while.

Re failure tables: Yes, it is extremely dangerous. It can result in self-reinforcing key censorship, either as an attack or just occurring naturally. This happened on 0.5. And the hit ratio is only for CHKs, IIRC.

Even LBNs don't often send local RejectedOverloads on SSKs *once they have accepted them*. They may relay downstream ROs, but that is not fatal. And if they reject some requests, so what; it's a slow node, it's bound to reject some requests with the current load balancing system.

10-15 minutes would be interesting. We already show a circle and histogram of nearby nodes from swapping and of our peers; you'd just have to add another one. It would be good to have visual proof that routing is working on the level of adhering to node specialisations. I didn't expect it to be working given the load: I'm surprised that it does, it's an interesting result.

Packet size has nothing to do with it; ethernet has a 1472 byte maximum. Dial-up has a 576 byte max, but we ignore it and use fragmented packets (this sucks, obviously, as it greatly increases the chance of losing a packet and having to retransmit it).

Please explain why the patch doesn't result in never accepting a single SSK?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.04.27 - 19:31:14GMT -----

> Packet size has nothing to do with it, ethernet has a 1472 byte maximum. Dial-up has 576 bytes max, but we ignore it, and use fragmented packets (this sucks, obviously, as it greatly increases the chance of losing a packet and having to retransmit it).

I am talking about typical/average packet size, not MTU. LBNs, unlike faster nodes, rarely have a chance to coalesce reject responses (within the max 100ms), and thus send disproportionately more tiny packets, resulting in much higher protocol overhead.
Thus having LBNs mostly serve SSKs rather than CHKs results in the lowest imaginable usefulness of LBNs for the network as a whole.

BTW, in my experience the typical/default dialup/PPP MTU is 1500 minus link level headers, like ethernet. 576 is a reasonable adjustment for interactive traffic like ssh, but I can't recall it being used as a default since the time the super fast 28800 baud modems became common.

:) 1400+ is the typical size of GPRS PPP packets too, and the same holds true for other popular wireless media like Bluetooth or WiFi; so I have no concerns regarding IP fragmentation.

> Please explain why the patch doesn't result in never accepting a single SSK?

I can not. :) Can you explain why the current code that penalizes CHKs still gives 5% for them, even though CHKs are 25 times larger and correspondingly less frequent, so they have a really hard time arriving at the exact moment when bandwidth liability is not maxed out?

Seriously, I believe it comes down to 2 facts:

- SSK requests are much more frequent, so any temporary drop in the level of CHK requests enables the node to quickly get a bunch of new SSKs accepted for processing;
- the large CHK requests (at times when they prevail over SSKs) tend to hit other limits too, like the "output bandwidth liability" and "Insufficient input/output bandwidth" throttles. Then the small SSK requests quickly pick up all the remaining bits of bandwidth.

But currently I do not have relevant statistics to prove that.

Anyway, please commit the following patch - it should equal out bandwidth rights for CHKs and SSKs at least half way toward a fair/expected distribution (and the change will make a difference for high-/over-loaded nodes only). Once most of my peers (and their peers) update, I will study the new node's traffic forwarding efficiency and behavior at different bandwidth limits and with different penalization levels again - and then I will be in a better position to prove that the original proposal of reserving bandwidth for 2 CHKs is optimal (or maybe to withdraw it).
===
diff --git a/freenet/src/freenet/node/NodeStats.java b/freenet/src/freenet/node/NodeStats.java
index 3b091b4..98c82c3 100644
--- a/freenet/src/freenet/node/NodeStats.java
+++ b/freenet/src/freenet/node/NodeStats.java
@@ -399,9 +399,8 @@ public class NodeStats implements Persistable {
 			successfulSskFetchBytesSentAverage.currentValue() * node.getNumSSKRequests() +
 			successfulChkInsertBytesSentAverage.currentValue() * node.getNumCHKInserts() +
 			successfulSskInsertBytesSentAverage.currentValue() * node.getNumSSKInserts();
-		bandwidthLiabilityOutput += getSuccessfulBytes(isSSK, isInsert, false).currentValue();
 		double bandwidthAvailableOutput =
-			node.getOutputBandwidthLimit() * 90; // 90 seconds at full power; we have to leave some time for the search as well
+			node.getOutputBandwidthLimit() * 80; // 80 seconds at full power; we have to leave some time for the search as well
 		bandwidthAvailableOutput *= NodeStats.FRACTION_OF_BANDWIDTH_USED_BY_REQUESTS;
 		if(bandwidthLiabilityOutput > bandwidthAvailableOutput) {
 			preemptiveRejectReasons.inc("Output bandwidth liability");
@@ -413,9+412,8 @@ public class NodeStats implements Persistable {
 			successfulSskFetchBytesReceivedAverage.currentValue() * node.getNumSSKRequests() +
 			successfulChkInsertBytesReceivedAverage.currentValue() * node.getNumCHKInserts() +
 			successfulSskInsertBytesReceivedAverage.currentValue() * node.getNumSSKInserts();
-		bandwidthLiabilityInput += getSuccessfulBytes(isSSK, isInsert, true).currentValue();
 		double bandwidthAvailableInput =
-			node.getInputBandwidthLimit() * 90; // 90 seconds at full power
+			node.getInputBandwidthLimit() * 80; // 80 seconds at full power
 		if(bandwidthLiabilityInput > bandwidthAvailableInput) {
 			preemptiveRejectReasons.inc("Input bandwidth liability");
 			return "Input bandwidth liability";
===

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.04.28 - 17:05:53GMT -----

Why does assuming 80 seconds instead of 90 help? I would have expected it to make the situation worse.

Isn't what you want to increment the value you are multiplying the CHK byte counters by, if the request is an SSK? In any case I'm not convinced - we accept 32x as many SSKs as CHKs precisely because they use 32x less bandwidth. As far as I can see, incrementing the CHK counts but only on a CHK would just result in us never accepting an SSK...

But by all means continue to investigate.
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.04.30 - 19:36:36GMT -----

> we accept 32x as many SSKs as CHKs precisely because they use 32x less bandwidth.

Sorry, but I don't understand the rationale behind this. It seems to be based on the assumption that equal resources should be allocated to SSKs and CHKs, regardless of whether there's equal demand for resources. If we're only getting, say, 16 times as many SSK requests as CHK requests, would we reject CHK requests to keep things "fair"?

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.02 - 16:13:52GMT -----

Why should CHKs be prioritised over SSKs?

What do you think of the patch I committed anyway?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.02 - 17:03:52GMT -----

> Why should CHKs be prioritised over SSKs?

Because the SSK success ratio is extremely low (remember the example I gave in the initial message of the thread: 8100 rejected requests for the same SSK key within 35 hours, plus an uncounted number of requests for the same key that resulted in DNF - and that key is nowhere near alone); linear programming would certainly suggest that SSK requests, as they currently are, should be totally disabled. I do not propose anything that drastic; but as SSKs currently use bandwidth very inefficiently, I propose to tone them down heuristically to approximately the CHK level while the node is short on bandwidth (but let SSKs go as much as the sending node needs if spare bandwidth is available).

[not that I meant to speak instead of mrogers]

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.03 - 13:54:38GMT -----

>> Why should CHKs be prioritised over SSKs?
>
> Because SSK success ratio is extremely low

That doesn't make sense, for two reasons:

1) rejecting SSKs will make the SSK success ratio worse, not better
2) "SSK not found" is useful information - for example, that's how you discover the current version of a USK - but "SSK rejected for not being a CHK" is not useful to anyone

Let me use an analogy: you're designing a Cisco router. Some of the packets it gets asked to forward will be small SSH packets, others will be large HTTP packets. Do you say "half our resources should go to SSH because we don't want to prioritise HTTP, so we'll only forward one HTTP packet for every 32 SSH packets"? If you answered yes, you just lost your job at Cisco. The router's job is to deliver packets to the best of its ability, not to decide what kind of packets the end hosts "should" be sending.
----- Anonymous ----- 2007.05.03 - 18:37:00GMT -----

Let's build an extreme case: there is a packet type which is 1 MB in size and one which is 1 byte in size. Both are constantly sent to a router at a higher rate than the router's bandwidth quota allows it to forward. If the following rules apply, not a single 1 MB packet will get served:
1. the router doesn't care about the size of a packet or penalise any kind of packet; all are considered equal
2. if a packet arrives and the remaining bandwidth quota doesn't allow it to be forwarded, it gets instantly rejected
3. no queueing of incoming packets is done (with part-allocation of the available bandwidth if the packet size exceeds the free quota)

AFAIK this is a possible situation for Freenet; correct me if I am wrong.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.04 - 11:15:09GMT -----

You're right, that pretty much describes the problem. I'm suggesting that we should fix it by modifying step 2:

2. if a packet arrives and the remaining bandwidth quota doesn't allow us to forward *an average packet*, it gets instantly rejected

The average is calculated over all packets, large and small. In other words we ask:

a) how much bandwidth does a packet use, on average?
b) is there enough bandwidth for another average packet?

Let's say 75% of packets are 1 byte in size and 25% are 1 MB in size, so the average is 262,144.75 bytes. We accept packets (large or small) if we have at least 262,144.75 bytes of bandwidth left in the quota, and reject them otherwise. If there's a long stream of 1 byte packets followed by a 1 MB packet we might go over quota temporarily, but we'll match the quota in the long term.

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.10 - 21:01:58GMT -----

Oh no, getting stuff instantly rejected is most often not good (an exception would be certain kinds of realtime traffic, but that is not applicable to Freenet at all). I have some experience with independent/competing ISPs and their broken traffic shaping routers that always dropped packets not fitting the current shaping limits; TCP performance took a major hit there, and several TCP connections running under the same shaping always took seriously unfair bandwidth shares (unless you use quite long intervals for the stats, like 10+ minutes). Changing the shaping processing to queue an over-quota packet (even a single-packet queue!) until the calculated average bandwidth allows the packet to be sent (thus slightly increasing the roundtrip in the end) was always sufficient for a TCP flow to work at 100% of the shaped level, and for simultaneous TCP streams to share the available bandwidth very equally even over sub-second stats intervals; no other working solution was found (aside from raising the shaping limit above the maximum speed of the TCP peers).
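In rough code, the shaping behaviour that worked was essentially the following (a sketch only, with invented names and a plain OutputStream standing in for the link; the real routers' logic is of course unknown to me):

===
import java.io.OutputStream;

// Hold an over-quota packet briefly (a single-packet queue) instead of dropping it,
// so the flow pays a little extra roundtrip rather than a retransmission.
class HoldInsteadOfDropShaper {
    private final double limitBytesPerSec;
    private double allowance;               // token bucket, in bytes
    private long last = System.nanoTime();

    HoldInsteadOfDropShaper(double limitBytesPerSec) {
        this.limitBytesPerSec = limitBytesPerSec;
        this.allowance = limitBytesPerSec;   // allow roughly one second of initial burst
    }

    /** Blocks until the packet fits the average rate, then sends it. */
    void send(byte[] packet, OutputStream link) throws Exception {
        while (true) {
            long now = System.nanoTime();
            allowance = Math.min(limitBytesPerSec,
                    allowance + (now - last) * limitBytesPerSec / 1e9);
            last = now;
            if (allowance >= packet.length) break;  // enough budget: send now
            Thread.sleep(5);                         // hold the packet instead of dropping it
        }
        allowance -= packet.length;
        link.write(packet);
    }
}
===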
I am not sure that can be directly applied to the current Freenet networking code; honestly, the mechanism of first quickly accepting packets and then slowly picking them up using some kind of filters looks unnecessarily complicated and suboptimal for performance, to say the least. I have another clear example of why: the mechanism quite resembles the traditional OS handling of network packets (received packets are extracted from the NIC at the highest priority - during the hardware interrupt - and then the CPU/server business logic fails to process all the received packets, leading to internal queue overflow), and after years and decades it is generally agreed that such an approach does not work well for server applications. Instead, Linux has for several years had a mechanism named NAPI (optional for some NIC drivers - check the kernel config - but default and mandatory for most server-grade and/or 1Gb NIC drivers): the hardware interrupt just sets a flag/semaphore indicating that the NIC has received something and instantly quits, leaving that particular NIC interrupt line disabled (the actual algorithm is a little more complex, allowing the hardware interrupt to extract a very limited number of packets if the host is very idle). Then a lowest-priority kernel thread (the "software interrupt"), woken up by the flag/semaphore, starts reading packets from the NIC into the OS queues (from which user-level read()s are satisfied), extracting only a limited number of packets at a time (then yielding the CPU to other runnable processes), and re-enabling the NIC interrupts only once it has managed to empty the hardware queue - relying on TCP flow control, and on modern ethernet hardware flow control, which work exceptionally well. Thus the server business logic (i.e. the useful work), running at a priority much higher than the software interrupt thread, is never starved of CPU by hardware interrupts that first pull in packets only for the CPU to be wasted dropping them from an overflowing system queue - resulting in smooth behaviour and the best sustained performance.

Or in short - on overload, delaying the reading/processing of input packets is better than dropping or rejecting them instantly.

Toad - if you know a simple way to delay Freenet's reads from the UDP socket in order to enforce the configured input bandwidth limit, please do so. (And with that UDP read delay in place, I would be very interested to test a Freenet node with no other input bandwidth limiters apart from input bandwidth liability - chances are that the UDP socket read delay will be sufficient for quality shaping, with the valuable help of the sending node tracking the roundtrip - an already well implemented feature.)

If the delay can not be done easily with the current codebase, I will consider doing a major rewrite of the traffic-accepting part of the code. Not of the highest priority though, due to the anticipated large amount of work - but those high fruits look big and tasty.
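Something along these lines is what I have in mind for the UDP side (a rough sketch only; the class and method names are invented and this is not the node's actual socket handler):

===
import java.net.DatagramPacket;
import java.net.DatagramSocket;

// Pace reads from the UDP socket so that the configured input bandwidth limit is
// enforced by delay rather than by instant rejection; the kernel buffer (and the
// sender's roundtrip tracking) absorbs the backlog in the meantime.
class PacedUdpReader {
    private final DatagramSocket socket;
    private final double limitBytesPerSec;  // configured input bandwidth limit
    private double budget = 0;              // token bucket, in bytes
    private long lastRefill = System.nanoTime();

    PacedUdpReader(DatagramSocket socket, double limitBytesPerSec) {
        this.socket = socket;
        this.limitBytesPerSec = limitBytesPerSec;
    }

    DatagramPacket readPaced(int maxPacketSize) throws Exception {
        // Wait until the budget allows one maximum-size packet.
        while (budget < maxPacketSize) {
            long now = System.nanoTime();
            budget = Math.min(2.0 * maxPacketSize,
                    budget + (now - lastRefill) * limitBytesPerSec / 1e9);
            lastRefill = now;
            if (budget < maxPacketSize) Thread.sleep(5);
        }
        byte[] buf = new byte[maxPacketSize];
        DatagramPacket p = new DatagramPacket(buf, buf.length);
        socket.receive(p);           // blocking read
        budget -= p.getLength();     // charge the bytes actually received
        return p;
    }
}
===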
----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.11 - 14:56:37GMT -----

> I am not sure that can be directly applied to the current freenet networking code;

We're working on an idea called token-passing that's supposed to address this: you can only send a search (request/insert) to a peer if you have a flow control token from that peer. If you don't have a token, you either keep the search in a queue until you receive a token, or send it to the next-best peer if the queue is full.

> the mechanism quite resembles the traditional O/S network packets handling (with received packets extracted from NIC at highest priority - during hardware interrupt, and then having CPU/server business logic failing to process all received packets leading to internal queues overflow)

Interesting point - in the new congestion control layer, maybe the UDP reader shouldn't advance the receiver window until the internal queues have dropped below a certain size... but it might be tricky to implement because the internal queues all belong to different threads...

> If the delay can not be done easily with the current codebase, I will consider doing a major rewrite of the traffic accepting code part.

This is due to be rewritten soon anyway, so now's probably a good time to make suggestions.

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.14 - 16:46:24GMT -----

While token passing would indeed smooth the traffic out, it feels excessive:

- it adds extra traffic;
- it creates additional traffic patterns that rather simplify attacks (like those aiming at reliably proving that a particular request originates from the attacked node) against a node whose connections are all monitored (by the ISP) and some of whose peers are fast but compromised;
- it requires pulling a multidimensional set of heuristics, on whom to send a new token to, out of thin air, and those heuristics will tend to disagree for different connection types.

The method of delaying network reads (that's the important part - and AFAIK the only major missing piece needed to get shaping rolling smoothly already) should work similarly well (maybe even better): just treat the metric 'the current peer roundtrip time is lower than the [peer] average roundtrip time' as the equivalent of 'the peer gave us a few tokens', and enjoy bandwidth/crypto(CPU)-free virtual token passing which obeys both the hardware/ISP traffic-shaping-imposed limits and the software-configured limits - whichever is stricter.

So I currently discourage implementing explicit token passing, in favor of this lower, equally tasty fruit.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.17 - 21:40:27GMT -----

> - it adds extra traffic

Um, right. "Here are n tokens" takes about 6 bytes: two for the message type, two for the message size, and two for the number of tokens (we're never going to hand out more than 65535 tokens in one go). It uses less traffic than "Can I send you a request?" "Yes" "Here's the request", and it avoids a round-trip. It also uses less traffic than "Can I send you a request?" "No", because if you don't have a token, you don't need to ask!

> - it creates additional traffic patterns that rather simplify attacks (like those aiming at reliably proving that a particular request originates from the attacked node) against a node whose connections are all monitored (by the ISP) and some of whose peers are fast but compromised.

Please explain how handing my peer some tokens reveals anything about traffic patterns that wasn't already visible to traffic analysis. If they can see the requests and results going back and forth, who cares if they can also see the tokens?

> - it requires pulling a multidimensional set of heuristics, on whom to send a new token to, out of thin air, and those heuristics will tend to disagree for different connection types.

No magical heuristics are needed - we hand out tokens as long as we're not overloaded (measured by total queueing delay, including the bandwidth limiter). That alone should be enough to outperform the current system, because we'll avoid wasting traffic on rejected searches. Then we can start thinking about clever token allocation policies to enforce fairness when the network's busy, without imposing unnecessary limits when the network's idle, etc etc. But token passing doesn't depend on any such policy - it's just a lower-bandwidth alternative to pre-emptive rejection.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.05.19 - 17:38:09GMT -----

We hand out tokens when we're not overloaded? What if later on we get overloaded? How exactly do we determine the total appropriate number of tokens in the system?

Token passing would be really nice; it would solve the flooding problem and remove a lot of alchemy...

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.20 - 11:19:36GMT -----

We determine the appropriate number of tokens by experimentation: as we approach overload we produce tokens more and more slowly, until at the overload point we don't produce tokens at all. For example, delay between producing tokens = MAX_DELAY/(MAX_DELAY - currentDelay) seconds, or something similar.

We stop producing tokens when all the buckets are full. To limit the number of tokens in circulation, we adjust the bucket sizes. Each peer's bucket starts at size 0, and is reset to 0 whenever the peer reconnects. Whenever we add a token to an empty bucket, we increase its size by 1. Thus if a peer doesn't use the tokens we give it (i.e. doesn't empty its bucket), its bucket stays small. If it uses the tokens we give it (i.e. empties its bucket), we increase the size of its bucket as long as we're not overloaded (the bucket size is only increased when a token is handed out, which doesn't happen when we're overloaded).
----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 - 11:45:36GMT -----

Even this brief description proposes 3 heuristics already:
- "more slowly ... or something similar"
- "adjust bucket sizes"
- "stays small"

No doubt a complete practical implementation will involve quite a few more, and I see no way to ensure they will all work smoothly for the whole wide variety of connection types. This also stays completely silent on the burst problems, as well as on decreasing buckets during burst/overload periods. A rather extreme case: if you gave a node 5 tokens, and it decided to use them 3 minutes later when the node got overloaded, what did you gain from the tokens? That is a true Pandora's box.

In the end it is timing that matters, while tokens represent an amount of data (and if you think of the formula time*bandwidth=data, then bandwidth, generally speaking, is not constant, due to highly irregular traffic from different peers and from other apps sharing the same internet connection). Purely controlling the latter will always be just a rough approximation of controlling the former, alas.
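For reference, here is a minimal sketch of the token-passing mechanics described by mrogers above (the same mechanics the preceding message critiques): the sender only routes a search to a peer it holds a token from, and the receiver grows each peer's bucket only when the peer actually drains it. All names (Peer, Search, MAX_QUEUE, maxDelayMs) are invented for illustration; this is a sketch, not a proposed implementation.

===
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

class TokenPassingSketch {

    // ---- receiving side: producing tokens ----
    static class PeerBucket {
        int size = 0;    // starts at 0, reset to 0 when the peer reconnects
        int tokens = 0;  // tokens this peer currently holds from us

        /** Called periodically; currentDelayMs is our total queueing delay. */
        void maybeProduceToken(double currentDelayMs, double maxDelayMs) {
            if (currentDelayMs >= maxDelayMs) return;   // overloaded: produce nothing
            if (size > 0 && tokens >= size) return;     // bucket full: stop producing
            if (tokens == 0) size++;                    // adding to an empty bucket grows it by 1
            tokens++;                                   // send a "here is a token" message here
            // next token is delayed by roughly maxDelayMs / (maxDelayMs - currentDelayMs) seconds
        }
    }

    // ---- sending side: spending tokens ----
    static final int MAX_QUEUE = 50;                    // arbitrary queue bound for the sketch
    private final Deque<Search> queue = new ArrayDeque<Search>();

    void route(Search s, List<Peer> peersBestFirst) {
        Peer best = peersBestFirst.get(0);
        if (best.takeToken()) { best.send(s); return; }          // we hold a token from the best peer
        if (queue.size() < MAX_QUEUE) { queue.add(s); return; }  // wait for a token
        for (int i = 1; i < peersBestFirst.size(); i++) {        // queue full: try next-best peers
            Peer p = peersBestFirst.get(i);
            if (p.takeToken()) { p.send(s); return; }
        }
        // No tokens anywhere and the queue is full; the thread above leaves this case open.
    }

    /** Called when a peer sends us "here are n tokens". */
    void onTokensReceived(Peer from, int n) {
        from.addTokens(n);
        while (!queue.isEmpty() && from.takeToken()) from.send(queue.poll());
    }

    interface Peer { boolean takeToken(); void addTokens(int n); void send(Search s); }
    static class Search { /* opaque search payload for this sketch */ }
}
===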