On Wed, 11 Dec 2013, Dave Taht wrote:
> On Wed, Dec 11, 2013 at 11:21 AM, Bob Briscoe <[email protected]> wrote:
> > Jim,
> >
> > At 16:55 11/12/2013, Jim Gettys wrote:
> > On Tue, Dec 10, 2013 at 10:04 PM, Bob Briscoe <[email protected]> wrote:
> > Jim,
> >
> > I'm just checking we're not talking past each other. I'll repeat two
> > quotes from each of us, then comment.
> >
> > On Thu, Dec 5, 2013 at 1:13 PM, Bob Briscoe <[email protected]> wrote:
> >
> > 3{New}. It SHOULD be possible to make different instances of an AQM
> > algorithm apply to different subsets of packets that share the same
> > queue. It SHOULD be possible to classify packets into these subsets at
> > least by ECN codepoint [RFC3168] and Diffserv codepoint [RFC2474] (or
> > the equivalent of these fields at lower layers),
> >
> > At 19:50 05/12/2013, Jim Gettys wrote:
> >
> > "Certainly, it may be the same instance of an AQM algorithm, rather
> > than different instances, for example."
> >
> > That's true of course, but the case with one AQM handling all packets
> > within a queue is the norm. I want to check you're happy with the
> > converse:
> > 1) A set-up more like WRED, which was based on Dave Clark's RIO (RED
> > with in and out of contract). So we can have WPIE, WCoDel etc., where
> > the differentiation between aggregates is provided by different AQM
> > instances in the same queue, not by different queues with different
> > scheduling priorities.
> > 2) Extending this so that AQM differentiation can be between
> > ECN-capable and Not-ECN-capable aggregates, not just between Diffserv
> > classes (an example being CoDel with a lower 'interval' for
> > ECN-capable packets).
> >
> > I presented the evaluations of this last idea in tsvwg on the final
> > Friday of the Vancouver IETF - I don't think you were there.
> > <http://www.ietf.org/proceedings/88/slides/slides-88-tsvwg-20.pdf>
> >
> > Yes, unfortunately I had to leave before the Friday session.
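[Editorial sketch of Bob's points (1) and (2): the differentiation lives in the AQM instance, not in separate queues with separate scheduling priorities. This is a hypothetical illustration, not code from any real qdisc; the class/function names and the "lower interval for ECN-capable packets" parameterization are assumptions taken from the idea he cites.]

```python
# Hypothetical sketch: one shared queue, two CoDel instances selected
# per-packet by ECN codepoint (RFC 3168). Differentiation comes from the
# AQM instance applied, not from separate queues or schedulers.

ECT0, ECT1, NOT_ECT = 0b10, 0b01, 0b00  # ECN field codepoints (RFC 3168)

class CodelInstance:
    """Stand-in for one AQM instance; only the parameter that differs
    between aggregates is shown."""
    def __init__(self, interval_ms):
        self.interval_ms = interval_ms  # lower interval => earlier signalling

def select_aqm(ecn_field, ecn_aqm, drop_aqm):
    """Classify a packet into an AQM instance by its ECN codepoint."""
    if ecn_field in (ECT0, ECT1):
        return ecn_aqm   # ECN-capable aggregate: can be marked sooner
    return drop_aqm      # Not-ECN-capable aggregate: drop-based AQM

# Example parameterization (the 20ms value is an assumption for
# illustration; 100ms is the commonly cited CoDel default):
ecn_codel  = CodelInstance(interval_ms=20)
drop_codel = CodelInstance(interval_ms=100)
```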
> > This is my primary motivation for this wordsmithing - I'm trying to
> > allow us to move towards zero signalling delay in CoDel, PIE and RED
> > (currently defaults of 200ms, 100ms and 512 packets respectively,
> > which are not good for dynamics).
> >
> > Certainly signalling delays are very important: this is why I'm
> > favorably inclined to "head mark/drop", as it signals TCP as quickly
> > as possible, keeping the response of the TCP feedback loop as tight
> > as possible (and part of why I like CoDel so much for the highly
> > variable bandwidth problem we face at the edge of the net).
> >
> > It's *really* important that when the bandwidth drops suddenly,
> > everyone gets told to slow down quickly (exactly how quickly probably
> > depends on the propagation change characteristics of the medium), or
> > packets can pile up in a big way.
> >
> > How quickly the mark/drop algorithm can figure out that signalling is
> > appropriate is the *other* piece of getting good dynamics. Here I
> > don't doubt in the slightest that something better than CoDel may be
> > discovered.
> >
> > It takes a CoDel instance (within an fq structure) 200ms from its
> > queue first passing 'threshold' before it will ever drop the first
> > packet (unless the queue hits taildrop before that). So if the RTT is
> > 20ms, that's 220ms signalling delay. In fq_codel this creates
> > considerable self-delay for short flows or r-t apps, which kill their
> > own latency before they get any loss signal to tell them to slow
> > down. Even for elastic flows, with congestion signals delayed by so
> > much, they risk hitting themselves with a huge train of overshoot
> > loss. This would be the same for fq_pie, except the number is
> > 100ms + RTT.
>
> People have so consistently expressed things this way that I began to
> doubt the reality myself. It seems like a large number of folk on this
> list don't get it either, so I am going to try to explain in a new way.
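[Editorial note: Bob's 220ms figure is just interval + RTT. A sanity check of the arithmetic, under the assumption (his, as stated above) that the first drop cannot come before one full AQM interval has elapsed:]

```python
def worst_case_signalling_delay_ms(aqm_interval_ms, rtt_ms):
    """Time from the queue first exceeding the AQM's threshold until
    the sender reacts: one full AQM interval before the first drop,
    plus one RTT for the loss signal to reach the sender."""
    return aqm_interval_ms + rtt_ms

print(worst_case_signalling_delay_ms(200, 20))  # CoDel figure above: 220
print(worst_case_signalling_delay_ms(100, 20))  # fq_pie figure:      120
```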
> Tackling codel first:
>
> The first phase of codel is effectively a "training" period for a link
> going from unloaded to loaded for the first time ever - the very first
> drop with the default interval will happen in 200ms, yes. IF it stays
> loaded and over the target delay after the first drop/mark, it will
> then tune to ever smaller intervals to approximate an ideal drop rate,
> until the latency on the link drops below the target. At that point the
> algorithm saves that rate, and stops doing anything until the next time
> the target delay is exceeded.
>
> Some keep asserting that that is all there is to codel, saying things
> like "there is a linear increase in drop probability" via the invsqrt
> mechanism,
>
> *which is true during the training phase*.
>
> After that approximation of the ideal drop/mark rate is obtained, the
> algorithm goes quiescent until the next time the target delay is
> consistently exceeded, at which point it schedules the next drop at a
> little more than the stored previous drop rate. It then continually
> seeks around that point, up and down.
>
> If the delay drops below the target in this phase, the algorithm stops
> again and decreases the drop rate again, as it's too high. If the delay
> stays above target after the drop for the current value of the
> interval, the drop rate increases.
>
> This is an interesting solution to Kleinrock's formulation of "power":
> where he once said an average of one packet should be in the queue,
> codel aims to never have less than one packet in the queue.
>
> The switch into and out of drop mode as the delay crosses the target is
> entirely dependent on the characteristics over time of the flows on the
> system, completely nonlinear, and is where codel spends 99.999% of its
> time on a loaded link.
>
> As Debussy said: "Music is the *space* between the notes."
>
> I wish I had a name for this second "seeking" phase that makes as much
> sense as "congestion avoidance".
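[Editorial sketch: the "ever smaller intervals" in Dave's training phase come from CoDel's control law, which spaces successive drops interval/sqrt(count) apart while the delay stays above target. A minimal sketch of just that spacing rule, not a full qdisc; the 100ms value is the commonly cited default 'interval', used here only for illustration:]

```python
import math

def control_law(now_ms, interval_ms, count):
    """CoDel's drop-spacing rule: the next drop is scheduled
    interval/sqrt(count) after 'now'. As count grows during the
    'training' phase, drops come closer together, i.e. the drop
    rate rises until the delay falls back below target."""
    return now_ms + interval_ms / math.sqrt(count)

# Spacing of the first few drops with a 100ms interval:
now = 0.0
for count in range(1, 5):
    nxt = control_law(now, 100.0, count)
    print(f"count={count}: next drop {nxt - now:.1f}ms later")
    now = nxt
```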
> So asserting that you'll always have a 200ms interval on a codel'd
> link is just blatantly incorrect. On first boot, yes. On a busy
> network, never again. [1]
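[Editorial sketch: Dave's "never again" rests on CoDel remembering its trained drop rate - on re-entering drop state soon after leaving it, the saved count is restored (minus a small decrement in some versions of the code) instead of restarting from 1. A rough sketch of the shape of that logic; the 8*interval window and the decrement of 2 are the values mentioned later in this thread, and the exact constants and names here should be treated as assumptions, not as any particular version of the Linux code:]

```python
def reentry_count(saved_count, last_drop_ms, now_ms, interval_ms=100.0,
                  window_intervals=8, decrement=2):
    """On re-entering drop state: if drop state was left recently
    (within the hysteresis window), resume near the trained drop rate
    by restoring the saved count minus a small decrement; otherwise
    the training is considered stale and count restarts from 1."""
    if now_ms - last_drop_ms < window_intervals * interval_ms:
        return max(saved_count - decrement, 1)  # recall the trained rate
    return 1                                    # forget: retrain from scratch

print(reentry_count(40, last_drop_ms=0, now_ms=300))   # recent re-entry: 38
print(reentry_count(40, last_drop_ms=0, now_ms=2000))  # stale: 1
```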
No no. The queue empties after CoDel overshoots the marking probability,
and then CoDel stops and starts from scratch. And yes, CoDel will always
overshoot for sure, because it _controls_ until the queue is in its
control, i.e., below the threshold. How big the overshoot is, of course,
depends.

What this means in terms of TCP: the network, or your fq queue (in the
case of fq_codel, if you so want), won't be all that busy according to
CoDel once CoDel has kindly "coddled" the TCPs - that's the whole point
of CoDel, I'm told :-). This is because the queue counts as "under
control" only if TCP backs off to below the 5ms + RTT level of
utilization. ...Now remember the effect of beta here. If the network
remains more than "5ms busy", CoDel thinks the queue is not under
control and keeps shooting again and again until eventually the network
is no longer "busy". In the worst case, beta * (5ms + RTT) is quite a
small utilization, and it takes time for TCP to recover the network's
busyness.

> In the event of a link going completely idle, and staying idle, there
> is hysteresis built into the code so it will retain that drop rate for
> a few hundred milliseconds (it's 8*interval in some versions of the
> code, 4 in others), before resetting count to 1 and the resulting
> estimation window to interval.

True, however, it only means that you'll overshoot more, or that the
time was too short to retain the trained count in memory (in which case
CoDel forgets it, as you admit). Or do you think that the magic number
applied to count on the recall (was it "-2"?) works for all traffic?

-- 
i.

_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm
