Re: [aqm] [tsvwg] Immediate and Explicit Congestion Notification (was: Guidelines for Adding Congestion Notification to Protocols that Encapsulate IP)

John Leslie Sat, 26 Oct 2013 05:17:54 -0700

   (If this continues, we've got to fix the title and cross-posting...)

Bob Briscoe <[email protected]> wrote:
> 
> Exec summary
> * Early tests show promise that we may have found a way to make the 
> ultra-low queuing delay of data centre TCP incrementally deployable 
> on the public Internet
> * For rtcweb, we need to address
>   a) cc for r-t media [rmcat w-g in progress]
>   b) Making TCP nicer
>   c) minimise ability of TCP to bloat queues [AQM w-g now in progress]
>   This addresses b) & c)
> 
> The problem
> * All AQMs delay dropping for about one (hard-coded) worst-case RTT, 
> in case a burst dissipates (allegedly a 'good queue' according to Van 
> Jacobson)


   This assertion is going to need a lot of support.

   Bob is a man after my own heart suggesting that an ECN notification
may be sent earlier than a packet drop would be indicated. I don't know
if we can get there; but IMHO that is essential to getting ECN deployed
and used.

   I don't think I agree with Bob that what's hard-coded is necessarily
a "worst-case" RTT -- and I'm quite sure I'm not willing to make any
pronouncement about "all AQMs".

   I suggest the talk might be more useful if Bob outlined the AQMs
currently in widespread use and detailed _how_ they delay dropping
for an estimated RTT.

> * For a flow with 1/10 or 1/100 of this RTT (e.g. from a CDN or your 
> home media server), any congestion signal is delayed tens or hundreds 
> of its own RTTs by these AQMs.

   Clearly, RTTs differing by a factor of ten are quite common at most
nodes traversed in a typical path; and it seems _very_ suboptimal to
have the responsibility for guessing the RTT at the node which must
drop packets.
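
   To put rough numbers on the mismatch, here is a minimal sketch. The
100 ms smoothing interval and the three flow RTTs are assumed values
chosen for illustration; they are not parameters of any particular AQM.

```python
# Illustrative only: the 100 ms smoothing interval is an assumed value,
# not a parameter taken from any specific AQM.

def signal_delay_in_rtts(aqm_interval_ms: float, flow_rtt_ms: float) -> float:
    """Return how many of the flow's own RTTs elapse before the AQM signals."""
    return aqm_interval_ms / flow_rtt_ms

aqm_interval_ms = 100.0
for flow_rtt_ms in (100.0, 10.0, 1.0):
    delay = signal_delay_in_rtts(aqm_interval_ms, flow_rtt_ms)
    print(f"flow RTT {flow_rtt_ms:5.1f} ms -> signal delayed ~{delay:.0f} of its own RTTs")
```

   A flow whose RTT matches the assumed interval waits about one RTT for
the signal; the 10 ms and 1 ms flows wait about ten and a hundred.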

> * A TCP flow in slow-start doesn't need the burst smoothed anyway
>   - delaying the signal just makes slow-start overshoot more
>   - a TCP in slow-start knows that it won't allow the burst to 
> dissipate anyway

   A critical point! (It seems obvious to me, but is it obvious to
everyone?)

> The solution: make ECN also mean "Immediate Congestion Notification"?
> * For ECN-capable packets, shift the job of hiding bursts from network to 
> host:
>   - the network signals ECN with no smoothing delay
>   - then the transport can hide bursts of ECN signals from itself

   But can we get there from here?

   The node doing the ECN notification _can't_ know how the transport
will react; and the transport receiving an ECN notification can't know
whether the forwarding node has "smoothed" the signal. (It is truly a
shame we haven't left any bits for signals like this!)

>   - the transport knows
>     o whether it's TCP or RTP etc,
>     o whether it's in congestion avoidance or slow-start,
>     o and it knows its RTT,
>     o so it can know whether to respond immediately or to smooth the 
>     signals,
>     o and if so, over what time

   Yes, but it can't know what smoothing may already have been applied.

>   - then short RTT flows can smooth the signals with only the delay 
> of their /own/ RTT
>     o so they can fill troughs and absorb peaks that longer RTT flows cannot
>   - a TCP only needs to smooth the signals if in congestion avoidance
>     o in slow start, it can respond immediately, thus reducing overshoot

   This would, IMHO, improve "slow start".

> Incremental Deployment:
> * Immediate congestion notification doesn't need new AQM implementation
>   - it can use the widely implemented WRED algorithm with an 
> unexpected configuration

   Bob is beginning to lose me here. Does he mean that a forwarding node
would apply WRED for both drop and ECN, but with different parameters?

> * The network classifies packets for this AQM treatment based on 
> their ECN-capability
>   - Without ECN, it smoothes the queue before signalling drops

   Bob has lost me now -- apparently he doesn't mean different
parameters... and I don't recognize this "smoothing" step in WRED.
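
   If the "smoothing" meant here is the exponentially weighted moving
average of queue length that RED-style algorithms compare against their
thresholds, then the contrast might look like the sketch below. That
reading is an assumption, as are the weight of 0.1, the threshold of 20
packets, and the synthetic queue trace. A weight of 1.0 makes the
"average" equal to the instantaneous queue.

```python
def ewma_queue(samples, weight):
    """RED-style average: avg <- (1 - weight) * avg + weight * sample."""
    avg = 0.0
    history = []
    for q in samples:
        avg = (1.0 - weight) * avg + weight * q
        history.append(avg)
    return history


def first_crossing(averages, threshold):
    """Index of the first average at or above threshold, or None."""
    for i, avg in enumerate(averages):
        if avg >= threshold:
            return i
    return None


# Synthetic trace (assumed): empty queue, then a step up to 40 packets.
queue = [0] * 5 + [40] * 50
smoothed = ewma_queue(queue, weight=0.1)    # averaged, as for drop
immediate = ewma_queue(queue, weight=1.0)   # no averaging, as for ECN
print(first_crossing(smoothed, 20), first_crossing(immediate, 20))
```

   With these assumed values the unsmoothed queue crosses the threshold
on the first sample of the step, while the averaged queue crosses it
several samples later.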

>   - With ECN, it signals immediately, without any smoothing delay
>   - (as today, the operator can still use WRED with the Diffserv field too)

   (Do we need to confuse this discussion by adding diffserv?)

> * For TCP apps, the stack will use 'DCTCP' (we've tweaked it), if the 
> ends negotiate ECN with the accurate feedback capability.

   Have we settled on "accurate feedback" already? I thought that was
still under discussion. (I don't follow exactly what it adds...)

> * It should 'just work' if an RTP app or a Reno TCP uses ECN.

   I don't see any way for a Reno transport using ECN to avoid being
starved if ECN arrives earlier (without notice).

> The request:
> * Much more evaluation to do, but first we want to know:
>   - if the idea works, would the IETF have an appetite for tweaking 
> the definition of ECN so it is merely equivalent to drop in the long 
> term, but the dynamics need not be equivalent.

   There's a good question there; but I don't think we're ready for it.

   I'd really like to discuss the dynamics of responding more quickly
but perhaps less drastically for almost any real-time flow.

   But proving "equivalence in the long term" seems too hard.

> Much better than the ECN that didn't get deployed
> * This is Explicit and Immediate Congestion Notification (EICN?)
>   - same wire protocol, much greater benefits
> * The advantage of the original ECN (avoiding congestive loss) was 
> too small to be worth the deployment hassle

   Actually, I don't agree that was the problem -- instead I believe
the code has been deployed but administratively suppressed because
the operators don't trust the transports. There _is_ a significant
improvement from one-RTT reaction instead of several (to detect a
drop), but the whole process is just too complicated, while the
opportunity for abuse remains obvious.

> * Predictable ultra-low latency without loss too (similar to 
> DCTCP-ECN) would be worth deploying

   I'm optimistic that latency will become an easier argument.

> * But we all thought DCTCP could only be deployed in isolation (e.g. 
> data centres)
>   - we all thought DCTCP traffic would starve alongside today's TCP traffic
>   - because in a DCTCP queue, the ECN threshold is lower than you 
> would trigger drop
>   - and we thought ECN & drop had to be equivalent.

   (I'm not sure we'll succeed at breaking that "equivalence"...)

> * We believe we've found a way to ensure DCTCP-ECN traffic doesn't starve
>   - we still make DCTCP-ECN equivalent to drop in the long-run, but 
> not in its dynamics

   (I'm still not sure it's worth arguing the "long-run".)

--
John Leslie <[email protected]>
_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm
