Re: [aqm] [tsvwg] Immediate ECN: Autotuning AQM for RTT

Bob Briscoe Tue, 12 Nov 2013 14:21:09 -0800

Greg, inline...

At 18:57 11/11/2013, Greg White wrote:

On 11/11/13, 9:02 AM, "Bob Briscoe" <[email protected]> wrote:

>Greg,
>
>At 06:54 09/11/2013, Greg White wrote:
>>This is very interesting work.  There are a lot of unanswered questions
>>about ecn / no-ecn coexistence and differential treatment in an AQM, and
>>this could provide some answers.
>>
>>To those who groaned that ECN was not included in DOCSIS 3.1, read these
>>slides (and Naeem Khademi's).
>
>Indeed. However, I think it would be safe to recommend that ECN
>support should at least be implemented and separately configurable,
>then it can be turned on or off by operators later.
>
>I'm aware that this doubles the amount of configuration, but we've
>had some success already (with RED) in relating all the ECN
>parameters to the drop parameters by a formula, so hopefully the
>vendor could configure the ECN parameters automatically based on the
>drop parameters.

I don't have a lot of confidence that I can recommend something to be
burned into silicon at this point (and doing it in software is a
non-starter).  The "turned on or off by operators later" would be a given,
but if implemented based on what we know now, I don't have a lot of
confidence that the switch would be ever set to the "turned on" state.  I
can't recommend that vendors add gates for a feature that my intuition
tells me would never be used.  If separately configurable means something
that would increase the probability of the feature being used, then that's
great, but what specifically are we talking about?  It could be a lot of
gates.

Also, you've suggested *not* differentiating between "classic ECN" and
"immediate ECN" in the ECT flags, but this is just a suggestion at this
point.  Again, not something I feel confident about taking for granted.
We'll continue to track this discussion though.


Understood.

Given the cable industry has decided to move on this early schedule, we can now do the necessary research so that competing access technologies can properly exploit ECN ;)



more...

>>Bob, CoDel uses "interval" both as a hold-off for the first packet drop
>>and as the numerator in the invsqrt drop scheduling.  Setting interval =
>>0
>>would result in ECN being signaled on *every* ECN capable packet when the
>>sojourn time is above threshold.    This jibes with some of your charts
>>for RED, but others show a ramp up in mark probability rather than a step
>>function. Could you clarify?
>
>We've only looked at WRED in detail, because it was much more
>interesting (to us) to reconfigure existing implementations than have
>to wait for new code to be implemented and tested.
>
>The suggestions for PIE and CoDel are just conceptual at this stage -
>we've done no implementation of this idea with either. (I said this
>verbally when presenting the slides, but I should have put it in
>writing too). Please read my suggestions for PIE and CoDel in this light.
>
>I'm not surprised that CoDel derives other parameters from 'interval'
>that should have been declared and set separately. Andrew McGregor
>also pointed out to me that CoDel sets threshold = 0.2*interval, so
>threshold would have to be declared separately as well. This starts
>to reveal just how many magic numbers there really are in the CoDel
>algorithm.


In the ns2 CoDel code that we've used (from Kathie), there are two
independent parameters, threshold and interval, and the defaults are 5ms
for threshold and 100ms for interval, so I don't know where the "threshold
= 0.2*interval" is coming from (maybe it is different in the current linux
or openWRT distros).


Sorry, altho it turns out I was wrong anyway (see below), I meant to say:
        0.05 * interval
        (= interval / 20)
I was working from memory and I knew there was a 2 in there somewhere.

Andrew McGregor said at the mic in tsvwg last Friday that threshold was defined relative to interval, and I remembered it from when I first looked at the CoDel code. However, I've just checked and even Andrew's ns3 port of CoDel sets threshold to 5ms, independent of the setting of interval. So consider this a memory throw-back on everyone's part and ignore it.

To be more clear about my earlier statement, CoDel actually uses
"interval" to control a single aspect of the drop policy, however, the
code (and descriptions of it) imply that there are two different
functions, and my comments above were written in that vein. However,
here's how I would describe it more succinctly:  the "next drop time" is
always set according to interval/sqrt(count).  In the specific case of
sojourn time crossing above threshold from below, count happens to equal
1, so the first drop is set at one interval in the future.  This is what
is sometimes described as the "hold off" period.
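
To make the description above concrete, here is a minimal sketch of the interval/sqrt(count) scheduling. It is not taken from any CoDel implementation; the function name next_drop_time and the sample count values are assumptions chosen for illustration, and only the 100ms default interval comes from this thread.

```python
import math

def next_drop_time(now_ms, interval_ms, count):
    """Schedule the next drop interval/sqrt(count) after 'now' (hypothetical helper)."""
    return now_ms + interval_ms / math.sqrt(count)

INTERVAL_MS = 100.0  # default interval quoted in this thread

# Starting from now = 0, the spacing to the next drop shrinks as count grows.
# count = 1 corresponds to the sojourn time having just crossed the threshold.
gaps_ms = [next_drop_time(0.0, INTERVAL_MS, c) for c in (1, 4, 16, 100)]
print(gaps_ms)  # [100.0, 50.0, 25.0, 10.0]
```

With count = 1 the first drop lands one full interval in the future, which is the "hold off" behaviour described above.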

OK. From an admittedly hasty scan through Andrew McGregor's port of CoDel to ns3 (not the Linux code), it seems to introduce this hold-off time in the ControlLaw() function, but it only calls ControlLaw() from the ShouldDrop() function if the queue has remained above threshold for duration interval.

That implies that CoDel holds off signalling anything for 2 * interval, ie 200ms for the default setting of interval = 100ms.
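
The arithmetic behind that 2 * interval figure can be sketched as below. This only models the reading given above, from a hasty scan of the ns3 port; it is an assumption that may not match what the ns3 or Linux code actually does, and the function name first_signal_delay_ms is hypothetical.

```python
def first_signal_delay_ms(interval_ms, count=1):
    """Delay before the first signal, under the reading described above."""
    # Assumption: the queue must stay above threshold for one full interval
    # before ControlLaw() is called at all.
    wait_above_threshold_ms = interval_ms
    # Assumption: ControlLaw() then schedules the signal interval/sqrt(count)
    # later, with count = 1 on first entry.
    control_law_delay_ms = interval_ms / count ** 0.5
    return wait_above_threshold_ms + control_law_delay_ms

print(first_signal_delay_ms(100.0))  # 200.0
```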

That said, there are some "magic numbers" in the algorithm, specifically
the part that regulates very low drop probabilities (e.g. 8*interval,
count *= 0.9844), but that's about it.


And the following magic numbers, of course:
* threshold [5ms]
* the power used in the control law [0.5 in the sqrt function]

I'm still thinking about how to achieve differential treatment of marking
vs dropping in CoDel in a logical way.

That would only be possible if CoDel's control law for drop was based on some understandable logic in the first place. I've started another thread on that, based on analysis I've just done of CoDel's control law.

A naive answer would be to create a second variable for the interval used in ControlLaw(), let's call it I. Then solely set interval = 0.

But that would still leave a signalling delay of I. As you say, CoDel's control law for increasing the dropping frequency is also based on the assumption that I = RTT = 100ms.

By my analysis (in the other email), the rationale for CoDel's current control law seems incorrect, which is probably why we're finding it's tough to base any new thinking on it.

>
>
>>Setting max_burst = 0 in PIE would not result in the step function
>>behavior.
>
>It's not meant to result in a step-function.
>
>In the WRED example, it solely avoids the delay of queue averaging,
>so that once the /instantaneous/ queue exceeds min_thresh it marks
>with increasing probability (not intended to be a step).
>
>Similarly with PIE, the formula:
>         p = p + alpha*(est_del-target_del) + beta*(est_del-est_del_old);
>would still gradually increase the probability of drop (not a step
>function), but it would start to do so as soon as the queue exceeded
>target_del, rather than waiting for max_burst.
>
>Is that what you meant?
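
As an illustration of why the PIE formula quoted above ramps rather than steps, here is a minimal sketch of the update applied to a steadily growing queue delay. The alpha, beta and target values and the sample delays are assumptions picked for illustration, not PIE's recommended settings, and the clamping to [0, 1] is an addition of this sketch.

```python
def pie_update(p, est_del, est_del_old, target_del, alpha, beta):
    """One step of the update quoted above, clamped to a valid probability."""
    p = p + alpha * (est_del - target_del) + beta * (est_del - est_del_old)
    return min(1.0, max(0.0, p))

# Illustrative values only (delays in seconds); not recommended settings.
ALPHA, BETA = 0.125, 1.25
TARGET_DEL = 0.020

p, est_del_old = 0.0, 0.020
history = []
for est_del in (0.025, 0.030, 0.035, 0.040):
    p = pie_update(p, est_del, est_del_old, TARGET_DEL, ALPHA, BETA)
    history.append(p)
    est_del_old = est_del

print(history)  # probability creeps upward each step instead of jumping to 1
```

With max_burst = 0 this ramp would simply begin as soon as est_del exceeds target_del, rather than after the burst allowance has expired.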


Yes, that is what I meant.

However, it seemed to me that what you were proposing was to move the
intelligence to the transport, and keep the queue simple. In that regard,
would a very simple threshold:  always mark when instantaneous queue
exceeds the threshold, never mark when it doesn't, be an acceptable way to
do things?

This would be easily implementable, and could be straightforwardly
combined (I think) with any AQM that is controlling drops (including
CoDel).
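
A minimal sketch of that idea follows: ECN-capable packets get a pure step threshold on the instantaneous queue, while everything else is left to whatever AQM controls drops. The function and parameter names are hypothetical, and drop_decision is a stand-in for the separate drop AQM (CoDel or otherwise), which is not modelled here.

```python
def handle_packet(ecn_capable, queue_now, mark_threshold, drop_decision):
    """Return 'mark', 'drop' or 'forward' for one packet (illustrative only)."""
    if ecn_capable:
        # Step function: always mark above the threshold, never below it.
        return "mark" if queue_now > mark_threshold else "forward"
    # Non-ECN packets: defer entirely to the drop AQM, passed in as a callable.
    return "drop" if drop_decision() else "forward"

# Example: threshold of 5 (units are whatever the queue is measured in).
print(handle_packet(True, 6, 5, lambda: False))   # mark
print(handle_packet(True, 4, 5, lambda: True))    # forward
print(handle_packet(False, 6, 5, lambda: True))   # drop
```

The sketch says nothing about whether the two treatments coexist fairly, which is the open question in this thread.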

A simple step threshold would be fine if all traffic used ECN. DCTCP proves that such a really dumb AQM works well if complemented by a smarter TCP (which proved to me that our problem is with TCP).

But I don't think it's so easy to combine such a simple AQM for ECN traffic, with a complex AQM like CoDel for drop.


The logic I started from was:
* If starvation is going to happen, it will happen over time

* So we should be able to design two AQMs (for ECN and drop) that will not cause each other to starve, by only considering where they eventually converge to - solely in the presence of stable long-running flows. We can elide dynamics, like smoothing, that disappear over time.

* CoDel looks for the min queuing delay over interval, so we can elide that part of its behaviour. But the control law continues to increase drop with time, irrespective of how the queue is growing (as long as it is greater than threshold). Then, when drop is high enough, assuming responsive flows, the CoDel queue will fall below threshold and switch back to non-dropping mode. Then the cycle will repeat. It's not easy to elide away behaviour that stabilises by continually switching between two discrete modes, rather than stabilising in one mode.

This is what the short-term memory within CoDel does - if it returns to dropping mode soon enough after leaving it, it starts dropping from where it left off. So it's actually continually switching in and out of dropping mode, which makes it much harder to think about analytically. And much harder to design a complementary ECN behaviour for it.



________

I prefer the principle: "Design for Verifiability", which is why I always have an allergic reaction to CoDel. See John Doyle's paper below, which uses TCP and AQM design as a case study for how to use and abuse this principle:

Doyle, J.C., Carlson, J., Low, S.H., Paganini, F., Vinnicombe, G., Willinger, W., Hickey, J., Parrilo, P. & Vandenberghe, L., "Robustness and the Internet: Theoretical Foundations," Caltech Draft Paper (March 2002)

<http://netlab.caltech.edu/pub/papers/RIPartII.pdf>

The IRTF network complexity research group is trying to make this stuff understandable by us mere IETFers.

Bob

-Greg


>
>
>Bob
>
>
>>-Greg
>>
>>
>>On 11/7/13, 1:03 PM, "Bob Briscoe" <[email protected]> wrote:
>>
>> >Folks,
>> >
>> >"Immediate ECN" slides:
>> ><http://bobbriscoe.net/presents/1311ietf/1311tsvarea-iecn.pptx>
>> ><http://bobbriscoe.net/presents/1311ietf/1311tsvarea-iecn.pdf>
>> >
>> >PS. This talk fell off the end of the TSVAREA agenda. It's mostly
>> >relevant to AQM, but I didn't originally bring it to AQM, because it
>> >affects 3 wgs: tsvwg, aqm & tcpm.
>> >
>> >In the AQM wg, there was dismay about CableLabs not including
>> >anything about ECN in DOCSIS3.1. This talk is about AQM dynamics; and
>> >how ECN can take out the 100ms of delay that CoDel and PIE introduce
>> >- it's essentially about auto-tuning for RTT.
>> >
>> >It gives an interim recommendation for hardware designers that there
>> >should be a second instance of the AQM algo for ECN packets so that
>> >it can be configured with different parameters (think of WRED instead
>>of
>> >RED).
>> >
>> >Specifically, for ECN packets:
>> >interval = 0 (for CoDel)
>> >max_burst = 0 (for PIE)
>> >
>> >
>> >Bob
>> >
>> >PS. We have a paper under submission, which we can supply on request.
>> >We plan to document this in the IETF too.
>> >
>> >
>> >
>> >
>> >________________________________________________________________
>> >Bob Briscoe,                                                  BT
>
>________________________________________________________________
>Bob Briscoe,                                                  BT
>

_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm


________________________________________________________________

Bob Briscoe, BT
