John, inline...
At 12:16 26/10/2013, John Leslie wrote:
Bob Briscoe <[email protected]> wrote:
>
> Exec summary
> * Early tests show promise that we may have found a way to make the
> ultra-low queuing delay of data centre TCP incrementally deployable
> on the public Internet
> * For rtcweb, we need to address
> a) cc for r-t media [rmcat w-g in progress]
> b) Making TCP nicer
> c) minimise ability of TCP to bloat queues [AQM w-g now in progress]
> This addresses b) & c)
>
> The problem
> * All AQMs delay dropping for about one (hard-coded) worst-case RTT,
> in case a burst dissipates (allegedly a 'good queue' according to Van
> Jacobson)
This assertion is going to need a lot of support.
Bob is a man after my own heart suggesting that an ECN notification
may be sent earlier than a packet drop would be indicated. I don't know
if we can get there; but IMHO that is essential to getting ECN deployed
and used.
I don't think I agree with Bob that what's hard-coded is necessarily
a "worst-case" RTT -- and I'm quite sure I'm not willing to make any
pronouncement about "all AQMs".
I suggest the talk might be more useful if Bob outlined the AQMs
currently in widespread use and detailed _how_ they delay dropping
for an estimated RTT.
You're right. The 'about' in my sentence was meant to indicate some
leeway. The specifics depend on each AQM...
The only AQM I know of that doesn't smooth over some nominal RTT is
DCTCP itself.
* CoDel was designed for 'interval' to be a worst-case (largest) RTT,
which it recommends to be set to 100ms. Once the queue has exceeded
threshold, CoDel delays for time 'interval' before starting to signal
congestion.
- I already said how sluggish CoDel would be for a flow with a much
shorter RTT than CoDel's interval (others have made this point, e.g.
for data centres:
https://lists.bufferbloat.net/pipermail/codel/2012-August/000448.html).
- And in the other direction, we already know that utilisation
suffers fairly badly for flows with RTT significantly larger than 100ms.
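To make the sluggishness concrete, here is a minimal Python sketch of
CoDel's gating behaviour (the class name and simplified state machine
are mine; the real CoDel algorithm has more state, but the
interval-long wait before the first signal is the point):

```python
# Minimal sketch of CoDel's signal gating (names and structure are
# illustrative; real CoDel has a fuller state machine).
TARGET = 0.005     # 5 ms sojourn-time target
INTERVAL = 0.100   # 100 ms 'interval', intended as a worst-case RTT

class CodelGate:
    def __init__(self):
        self.first_signal_time = None   # when signalling may begin

    def should_signal(self, sojourn_time, now):
        if sojourn_time < TARGET:
            # Queue is below target: reset and stay silent.
            self.first_signal_time = None
            return False
        if self.first_signal_time is None:
            # Queue just exceeded target: wait a full INTERVAL in
            # case this is only a transient burst.
            self.first_signal_time = now + INTERVAL
            return False
        return now >= self.first_signal_time
```

For a 10ms-RTT flow, that 100ms wait is ten of its own RTTs before
the first signal arrives.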
* PIE suppresses all drops for time max_burst (set to 100ms by
default) from when the drop probability it calculates (but doesn't
necessarily use) first rises above zero. This is very similar to
CoDel, and similar comments are applicable.
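A similar sketch for PIE's burst suppression (again with invented
names and simplified logic; real PIE also ramps its drop probability
via a PI controller):

```python
# Sketch of PIE's max_burst allowance (illustrative only).
MAX_BURST = 0.100   # 100 ms default burst allowance

class PieBurst:
    def __init__(self):
        self.burst_allowance = MAX_BURST

    def drops_permitted(self, drop_prob, dt):
        if drop_prob <= 0:
            # No congestion calculated: refill the burst allowance.
            self.burst_allowance = MAX_BURST
            return False
        if self.burst_allowance > 0:
            # Congestion calculated, but still inside the burst
            # window: suppress all drops and count the window down.
            self.burst_allowance -= dt
            return False
        return True   # window expired: drops (or marks) allowed
```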
* RED requires the constant for its exponentially weighted moving
average (w_q) to be set taking into account how many packets are
likely to arrive at the link in a 'typical' RTT. Reverse-engineering
the values recommended by Sally Floyd in the RED paper and on her
famous RED parameters Web page
<http://www.icir.org/floyd/REDparameters.txt> suggests she assumed a
'typical' RTT of about 130ms.
[BTW, I know of people who don't calculate w_q, but just use the
value of "0.002" that Sally recommended for her 45Mb/s link in the
original RED paper simulations (and repeated at
<http://www.icir.org/floyd/REDparameters.txt>). This was calculated
assuming about 500 packets arrive at a link (from all flows) in a
typical RTT. Links have got a lot faster since 1993. Nonetheless, she
was considering 45Mb/s for an aggregated link in those days, and it
happens to be about right for a single user today.]
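For anyone who wants to check the ~130ms figure, the arithmetic is
below, assuming 1500-byte packets and the rule of thumb that an EWMA
with gain w_q averages over roughly -1/ln(1 - w_q) samples:

```python
# The arithmetic behind the ~130 ms 'typical' RTT implied by
# w_q = 0.002 on a 45 Mb/s link (assumes 1500-byte packets).
import math

w_q = 0.002                                # Sally Floyd's value
packets_per_rtt = -1 / math.log(1 - w_q)   # ~500 packets
link_rate = 45e6                           # 45 Mb/s, as in the RED paper
pkt_bits = 1500 * 8
implied_rtt = packets_per_rtt * pkt_bits / link_rate   # ~0.133 s
```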
> * For a flow with 1/10 or 1/100 of this RTT (e.g. from a CDN or your
> home media server), any congestion signal is delayed tens or hundreds
> of its own RTTs by these AQMs.
Clearly, RTTs differing by a factor of ten are quite common at most
nodes traversed in a typical path; and it seems _very_ suboptimal to
have the responsibility for guessing the RTT at the node which must
drop packets.
For packets that do not support ECN, the dropping node has to guess
the RTT so as not to drop packets unnecessarily: drop is an
impairment as well as a congestion signal, and a transport cannot
'undrop' packets.
Our point though is that a network node doesn't have to mimic this
behaviour for ECN packets, because ECN is not an impairment. So a
transport can un-ECN-mark packets (by smoothing out bursts itself).
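As an illustration of what 'hiding bursts in the host' might look
like, here is a DCTCP-style sketch (the class name and gain value are
assumptions, not a spec): the transport keeps a per-RTT EWMA of the
fraction of ECN-marked packets and scales its response to that,
rather than reacting fully to each mark.

```python
# Illustrative DCTCP-style smoothing in the transport.
class EcnSmoother:
    def __init__(self, g=1/16):
        self.g = g          # EWMA gain (DCTCP suggests 1/16)
        self.alpha = 0.0    # smoothed fraction of marked packets

    def on_rtt_end(self, marked, total):
        # Fold this RTT's marked fraction into the running average.
        frac = marked / total if total else 0.0
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        return self.alpha

    def cwnd_after_marks(self, cwnd):
        # Proportionate reduction instead of a full halving, so a
        # burst of immediate marks doesn't over-punish the flow.
        return cwnd * (1 - self.alpha / 2)
```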
> * A TCP flow in slow-start doesn't need the burst smoothed anyway
> - delaying the signal just makes slow-start overshoot more
> - a TCP in slow-start knows that it won't allow the burst to
> dissipate anyway
A critical point! (It seems obvious to me, but is it obvious to
everyone?)
> The solution: make ECN also mean "Immediate Congestion Notification"?
> * For ECN-capable packets, shift the job of hiding bursts from network to
> host:
> - the network signals ECN with no smoothing delay
> - then the transport can hide bursts of ECN signals from itself
But can we get there from here?
The node doing the ECN notification _can't_ know how the transport
will react; and the transport receiving an ECN notification can't know
whether the forwarding node has "smoothed" the signal. (It is truly a
shame we haven't left any bits for signals like this!)
Well, we do have ECT(1) still only assigned experimentally and never
used, which we could decide to use for this immediate ECN. However,
first I want to see whether people think it might be feasible to just
redefine the meaning of CE.
Rationale: So few buffers have ECN support turned on anyway that we
should be able to redefine ECN so that many more will want to turn it on.
For those AQMs that already support ECN, we believe this
retrospective change will make them only a little worse than they are
already (and the operator can update them by simple reconfiguration
anyway, and is more likely to do so, given these are clearly
early-adopter networks).
> - the transport knows
> o whether it's TCP or RTP etc,
> o whether it's in congestion avoidance or slow-start,
> o and it knows its RTT,
> o so it can know whether to respond immediately or to smooth the
> signals,
> o and if so, over what time
Yes, but it can't know what smoothing may already have been applied.
Yes. If this is a problem, we will have to consider using ECT(1) not CE.
But it's pretty academic when so few buffers support ECN.
The tiny proportion that do support ECN will already smooth by a
'typical RTT' of about 100ms.
If a 20ms RTT flow adds smoothing over its own RTT to this, it will
smooth over 120ms in total.
The main problem there is not the extra 20ms, it's the original
100ms, which we won't lose unless we make this change somehow.
> - then short RTT flows can smooth the signals with only the delay
> of their /own/ RTT
> o so they can fill troughs and absorb peaks that longer RTT
flows cannot
> - a TCP only needs to smooth the signals if in congestion avoidance
> o in slow start, it can respond immediately, thus reducing overshoot
This would, IMHO, improve "slow start".
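The asymmetry could be sketched like this (purely illustrative; the
function and the halving on slow-start exit are assumptions, not a
tested design):

```python
# Illustrative asymmetry: immediate response in slow start,
# smoothed response in congestion avoidance.
def react_to_ce(state, cwnd, alpha):
    if state == "slow_start":
        # No smoothing needed: a single immediate CE mark ends
        # exponential growth at once, reducing the overshoot.
        return "congestion_avoidance", cwnd / 2
    # In congestion avoidance, respond to the smoothed mark
    # fraction alpha instead of to each individual mark.
    return "congestion_avoidance", cwnd * (1 - alpha / 2)
```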
> Incremental Deployment:
> * Immediate congestion notification doesn't need new AQM implementation
> - it can use the widely implemented WRED algorithm with an
> unexpected configuration
Bob is beginning to lose me here. Does he mean that a forwarding node
would apply WRED for both drop and ECN, but with different parameters?
> * The network classifies packets for this AQM treatment based on
> their ECN-capability
> - Without ECN, it smoothes the queue before signalling drops
Bob has lost me now -- apparently he doesn't mean different
parameters... and I don't recognize this "smoothing" step in WRED.
I do mean that a forwarding node would apply WRED for both drop and
ECN, but with different parameters.
Each WRED policy-map includes a setting for this smoothing parameter,
which Cisco calls the exponential-weighting-constant. Many people
don't notice it's there and they just leave it at the default. For
instance, Cisco set it to
2^(-9) ~ 0.002 by default for each of the WRED policy-maps (see
http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/fswfq26.html#wp1039982).
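For reference, the averaging WRED applies is just this EWMA, where
Cisco's exponential-weighting-constant n gives a gain of 2^-n (so the
default n=9 gives ~0.002):

```python
# The smoothing WRED applies to the queue length; n is Cisco's
# exponential-weighting-constant (default 9, i.e. w_q ~ 0.002).
def wred_avg(prev_avg, qlen, n=9):
    w_q = 2.0 ** -n
    return (1 - w_q) * prev_avg + w_q * qlen
```

On my reading of the proposal (an assumption, not a tested config),
'immediate' ECN would amount to setting n = 0 in the ECN-marking
policy, so the average collapses to the instantaneous queue length.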
> - With ECN, it signals immediately, without any smoothing delay
> - (as today, the operator can still use WRED with the Diffserv field too)
(Do we need to confuse this discussion by adding diffserv?)
A non-Diffserv network still doesn't need to worry about Diffserv.
I put this in parentheses because, if WRED is used today, it is
usually used with Diffserv, and I didn't want anyone to worry that
they wouldn't be able to continue to do this (e.g. BT use WRED with
Diffserv in enterprise networks, as do many other carriers).
> * For TCP apps, the stack will use 'DCTCP' (we've tweaked it), if the
> ends negotiate ECN with the accurate feedback capability.
Have we settled on "accurate feedback" already? I thought that was
still under discussion. (I don't follow exactly what it adds...)
See response from Richard Scheffenegger. Essentially the TCPM WG has
accepted the requirements doc, but not decided between the mechanisms on offer.
> * It should 'just work' if an RTP app or a Reno TCP uses ECN.
I don't see any way for a Reno transport using ECN to avoid being
starved if ECN arrives earlier (without notice).
We haven't tested legacy Reno with ECN yet (we figured legacy Reno
without ECN is a lot more prevalent, so focused on this first).
Nonetheless, Reno-ECN is unlikely to starve, because starvation is
about long-running behaviour, and once a flow has run for more than a
couple of 100ms RTTs, the immediate ECN signals should be no
different from a smoothed ECN. I suspect Reno-ECN might be worse in
its short-term dynamics. But remember Reno-ECN is likely to be a tiny
corner-case.
> The request:
> * Much more evaluation to do, but first we want to know:
> - if the idea works, would the IETF have an appetite for tweaking
> the definition of ECN so it is merely equivalent to drop in the long
> term, but the dynamics need not be equivalent.
There's a good question there; but I don't think we're ready for it.
At this stage, even we haven't got many answers. So I'm not asking
the IETF to answer the question right now. I'm merely saying, /if/
our idea works, is there at least an /appetite/ in the IETF for
reconsidering the definition of ECN?
We wanted to make the IETF aware of this research early, because it
might want to at least hold off on any actions that would otherwise
close off this option.
And if we find that any change is completely out of the question, we
have to try a different tack (e.g. ECT(1)).
I'd really like to discuss the dynamics of responding more quickly
but perhaps less drastically for almost any real-time flow.
But proving "equivalence in the long term" seems too hard.
This should be the easy part: the longer conditions remain stable,
the closer a smoothed signal comes to an unsmoothed signal, all other
factors being equal.
Equivalence during dynamics is the hard part, and I'm suggesting we
don't sweat too much about that, as long as the performance
evaluations are not too far apart.
> Much better than the ECN that didn't get deployed
> * This is Explicit and Immediate Congestion Notification (EICN?)
> - same wire protocol, much greater benefits
> * The advantage of the original ECN (avoiding congestive loss) was
> too small to be worth the deployment hassle
Actually, I don't agree that was the problem -- instead I believe
the code has been deployed but administratively suppressed because
the operators don't trust the transports. There _is_ a significant
improvement from one-RTT reaction instead of several (to detect a
drop), but the whole process is just too complicated, while the
opportunity for abuse remains obvious.
I agree. That's the 'deployment hassle' side of my sentence - the
extra trust-enhancing mechanisms that seemed necessary were too much
pain for the small gain.
> * Predictable ultra-low latency without loss too (similar to
> DCTCP-ECN) would be worth deploying
I'm optimistic that latency will become an easier argument.
> * But we all thought DCTCP could only be deployed in isolation (e.g.
> data centres)
> - we all thought DCTCP traffic would starve alongside today's TCP traffic
> - because in a DCTCP queue, the ECN threshold is lower than you
> would trigger drop
> - and we thought ECN & drop had to be equivalent.
(I'm not sure we'll succeed at breaking that "equivalence"...)
> * We believe we've found a way to ensure DCTCP-ECN traffic doesn't starve
> - we still make DCTCP-ECN equivalent to drop in the long-run, but
> not in its dynamics
(I'm still not sure it's worth arguing the "long-run".)
I mean competing long-running ECN & non-ECN flows stabilise at
predictable rates, rather than one ratcheting itself down to nothing
over time (starvation).
That's the primary concern of congestion control 'fairness', before
anyone starts worrying about what the relative rates are. Given apps
get different relative rates with different RTTs, with different size
objects or by opening multiple flows, we don't need to sweat so much
about precisely equal flow rates; but we must sweat about stable convergence.
Results so far show that the proposed idea is at least very robust
against starvation.
Bob
________________________________________________________________
Bob Briscoe, BT
_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm