I took a look over this again

http://datatracker.ietf.org/doc/draft-kuhn-aqm-eval-guidelines/?include_text=1

so I'd be ready for the meeting.

Rather than commenting directly on the document, I had merely
forwarded what I'd written at the time of the CableLabs study, in the
hope that the next version of the aqm evaluation guidelines would be
better. For reference, those comments remain at:

http://www.ietf.org/mail-archive/web/aqm/current/msg00506.html

and I'll now delve more deeply into the document:

A) The name of the group and the scope of work

Re:

"The IETF AQM working group was recently formed to standardize AQM
   schemes that are robust, easily implemented, and successfully
   deployed in today's networks. "

The actual name of the working group is "Active Queue Management and
Packet Scheduling".

As I recall, the vote was almost entirely in favor of that name and
scope of work. This is a problem throughout the document. To give you
one example among many:

" The AQM working group was recently formed within the TSV area to
         ^^ and packet scheduling
   address the problems with large unmanaged buffers in the Internet.
   Specifically, the AQM WG is tasked with standardizing AQM schemes
                                                         ^^ and packet schedulers
   that not only address concerns with such buffers, but also are
   robust under wide variety of operating conditions."

B) What is the nature of a flow?

Measuring TCP elephant flows is easy. However, actual usage of the
internet along the edge consists of many, many short flows over any
given interval.

A flow has a beginning, a middle, and an end.

Later on, this document uses 50 full rate TCP flows as examples of
what should be tested, and this bears very little resemblance to
reality. Certainly there are cases where the overall sizes and numbers
of flows are enormous, particularly in flows going from supercomputer
center to supercomputer center, or in encapsulated traffic tunneled
from bnodes over a transit provider... and perhaps that should be an
evaluation scenario...

but the majority of flows are mice that have a distinct beginning,
middle, and end.

as one example, see TCP mice and elephants:

http://www.cs.unc.edu/Research/dirt/proj/marron/MiceElephants/

Web traffic in particular is dominated by flows that almost never get
out of slow start, plus heavy usage of single-packet "ant"-like flows
such as DNS queries.
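To put numbers on "almost never get out of slow start", here is a
rough sketch of the cumulative bytes a sender has put on the wire
after each slow-start round trip. The IW10 initial window (RFC 6928),
1448-byte MSS, and no-loss assumption are my illustrative choices,
not figures from the draft:

```python
# Illustrative arithmetic: cumulative bytes sent after N slow-start
# round trips, assuming IW10 (RFC 6928), a 1448-byte MSS, no loss.
MSS = 1448   # bytes per segment (assumed)
IW = 10      # initial window, in segments (RFC 6928)

def bytes_sent_after_rtts(rtts):
    """Cumulative bytes a sender has transmitted after `rtts`
    round trips of pure slow start (cwnd doubles each RTT)."""
    total, cwnd = 0, IW
    for _ in range(rtts):
        total += cwnd * MSS
        cwnd *= 2
    return total

# After 2 RTTs: 43,440 bytes; after 3 RTTs: 101,360 bytes.
# A typical small web object therefore completes while the flow is
# still in slow start, never reaching congestion avoidance at all.
```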

New protocols like QUIC and WebRTC, and traffic like multicast TV,
also have distinct properties and should be evaluated and developed
further with AQM and packet scheduling techniques in mind.

I also strongly support the idea of pursuing Fred's "Lemmings" as a
very interesting test case for a variety of aqm and packet scheduling
techniques in the data center.

C) On characterizing load...

Nowhere in this document is there a good definition of the
characteristics of "load", or of how it is generated.

I like trace-driven results from actual packet captures, and
background loads generated by typical user activity like file
transfers, etc.
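As a sketch of what generating a realistic mice/elephants mix might
look like, flow sizes could be drawn from a heavy-tailed distribution.
The Pareto shape and minimum size below are hypothetical parameters
chosen for illustration, not measured values:

```python
import random

def sample_flow_sizes(n, alpha=1.2, xmin=6_000, seed=1):
    """Draw n flow sizes in bytes from a Pareto distribution, a
    common stand-in for the heavy-tailed mice/elephants mix.
    alpha and xmin here are illustrative, not measured, values."""
    rng = random.Random(seed)
    # paretovariate() returns samples >= 1; scale by the minimum size.
    return [xmin * rng.paretovariate(alpha) for _ in range(n)]

sizes = sample_flow_sizes(10_000)
mice = sum(1 for s in sizes if s < 100_000)
# Most flows are mice, but the few elephants carry most of the bytes.
```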

D) Re: Section 2:

I have tried to characterize an overall goal of this wg here as aiming
for 100% utilization with 0 queueing delay.

You can make tradeoffs - sacrificing, say, 10% utilization for less
delay, as in last IETF's ARED presentation. Or, in the fq_codel case,
aiming for 100% utilization with near-zero queueing delay for most
flows, only incurring delay for queue-building flows. Or aiming for a
maximum delay of 20ms or less for all flows, as per RED, CoDel, or PIE.

The Remy work (iccrg preso on Wednesday) points to the fact that
humans may not be up to the job of designing such algorithms in the
future.

E) 3.1 - "The links are supposed to be symmetric"

As the vast majority of end-user links are asymmetric, in ratios
ranging from 6:1 to 15:1, I would change this to "links are supposed
to be asymmetric".

3.2


"  The size of the buffers MUST be carefully, set considering the
   bandwith-delay product."

Well, no. I'd actually appreciate a baseline set from the buffer
sizes observed today in the field. We have plenty of data, and it is
easy to emulate those deployed buffer sizes to show how they behave
against many forms of traffic.
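For reference, the bandwidth-delay product arithmetic the draft
invokes is simple enough to sketch, using the draft's own 10Mbps /
100ms scenario from the next section as input:

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product: bits in flight, expressed in bytes."""
    return rate_bps * rtt_s / 8

def bdp_packets(rate_bps, rtt_s, pkt_bytes=1500):
    """BDP expressed in full-size packets."""
    return bdp_bytes(rate_bps, rtt_s) / pkt_bytes

# The draft's 10Mbps / 100ms scenario:
#   bdp_bytes(10e6, 0.1)   -> 125,000 bytes
#   bdp_packets(10e6, 0.1) -> ~83 full-size packets
```

A BDP-sized buffer is the classic rule of thumb; my point above is
that the buffers actually deployed in the field frequently differ from
that rule, and those deployed sizes are what a baseline should be
measured against.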

"3.2.1.1.  Topology Description

   The topology is presented in Figure 1.  For this scenario, the
   capacities of the links MUST be set to 10Mbps and the RTTs to 100ms."

This has no resemblance to anything seen in the real world,
anywhere. It is only seen in NS2 models.

Suggested alternate language:

MUST be set to a variety of realistic upload/download speeds, and the
RTTs modeled on what is observed in the field.

3.2.1.2

"Single TCP New Reno flow between sender A and receiver B, that
   transfers a large file for a period of 50s."

I don't know of anything that uses New Reno. MacOS uses something
similar that has stretch acks and a few other bugs. Linux has
defaulted to CUBIC for 8(?) years. It's not clear to what extent DCTCP
or Compound are deployed. Windows XP is being retired, and that was
the last of the TCPs without window scaling.

Why New Reno?

Should there be a section for a scavenging TCP like LEDBAT?

3.2.1.3.  Aggressive Transport Sender

What form of CUBIC? What is in NS2 doesn't (so far as I know) have
support for proportional rate reduction, or ECN, among other modern
features.

3.2.1.4.  Unresponsive Transport Sender

"  This scenario consists of UDP flow(s) with an aggregate rate of
   12Mbps between sender A and receiver B, that transfers a large file
   for a period of 50s. Graphs described in Section 2.3 MUST be
   generated."

Against some other, saner set of flow(s)? The way this reads, it's
running alone, so I guess an ideal result here is 10Mbit/sec of
goodput with ~17% packet loss and minimal delay?
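For what it's worth, the steady-state arithmetic for that scenario
(ignoring buffering) works out to about 17% of the offered load
dropped:

```python
def drop_rate(offered_bps, capacity_bps):
    """Fraction of offered traffic dropped at a saturated
    bottleneck, ignoring buffering: (offered - capacity) / offered."""
    return max(0.0, (offered_bps - capacity_bps) / offered_bps)

loss = drop_rate(12e6, 10e6)  # 12Mbps of UDP into a 10Mbps link
# loss == 1/6, i.e. ~16.7% of offered packets dropped
```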

Or, if against other flows, which ones? And what is the desired result?

What sort of traffic is this intended to resemble? A single channel
of VoIP is 64kbit/sec. The only thing I can think of that might be
unicast in this way, at somewhere near this speed, is multicast TV.

I'm all in favor of testing one VoIP stream, or one videoconference,
against other loads.

3.2.3.  Inter-RTT and intra-protocol fairness

This is not just a comment on this section, but on the whole document.

"TCP dynamics are a driving force for AQM design.  It is therefore
   important to evaluate against a set of RTT (e.g., from 5 ms to 200
   ms)."

The document then goes on to specify a set of tests at 5ms and 200ms
only. Yes, TCP dynamics are IMPORTANT, and very sensitive to RTT, but
there is a wider range of RTTs in the real world than this, and
neither of these choices is representative.

The average RTT as observed by one report I can't find this morning is
37ms, down from 38ms last year.

Going coast to coast in the US is not much over 70ms. Going from California
to New Zealand is 240ms. Across England, don't know? 30ms?

There is an enormous amount of market pressure to reduce these
physical RTTs to the cloud still further - examples including things
like netflix co-locating everywhere they can, and top services
co-locating and using CDNs to push data closer to the user.

Here in London, I am 3.6 - 4.8ms away from www.google.com; in
California, 2ms.

Lastly, in the data center physical RTTs are well below 1ms.

I would argue in favor of selecting a minimum, maximum, and median
from each technology to test against, and of using physical RTTs in
the ranges observed in the real world. A quick chart off the top of my
head:

|Tech|Down|Up|edge RTT|cloud RTT|
|DSL|2Mbit|384kbit|>10ms|30-80ms|
|DSL|6Mbit|768kbit|||
|DSL|12Mbit|1.2Mbit||| # I don't know the commonly deployed dsl sizes
|Cable|8Mbit|1Mbit|||
|Cable|20Mbit|4Mbit|||
|Cable|100Mbit|20Mbit|||
|Ethernet|10Mbit|10Mbit|||
|Ethernet|100Mbit|100Mbit|||
|Ethernet|1000Mbit|1000Mbit|||
|GPON Fiber|24Mbit|24Mbit|||
|GPON Fiber|100Mbit|100Mbit|||
|GPON Fiber|1000Mbit|1000Mbit|||
|Wifi|1Mbit|1Mbit|20-2000ms|| # wifi is a pita
|Wifi|450Mbit|450Mbit|1-2000ms||
|LTE|1Mbit|300kbit|||
|LTE|80Mbit||||
|Data Center|1GigE|1GigE|>1ms|<1ms|
|Data Center|10GigE|10GigE|>1ms|<1ms|

More from 3.2.3:

"   These flows MUST have the same congestion control algorithm. "

I confess it would be interesting to observe this with different
congestion control algorithms.

"   The output that MUST be measured is the ratio between the average
   goodput values of the two flows (Section 2.2.3) and the packet drop
   rate for each flow (Section 2.2.2)."

The results I get from measuring the goodput of dozens of BitTorrent
flows are quite interesting, where the interaction of IWX and slow
start from competing traffic intercedes. What I think I observe is
torrent getting out of the way of multiple flows in slow start much
better than it does for flows in congestion avoidance.

So two long duration flows at varying RTTs is good. Measuring many
long duration flows with LEDBAT against other flows would also
be useful.

3.2.4.1

10Mbit/100ms RTT again. Um... no.

Incidentally I usually use a period of 60 or 100s, given that things
like speedboost and minstrel tend to do interesting things in the first 30s of a
connection. I'm interested in those interesting things, but it's hard
to discern them when almost your entire test is during them.

Suggest doubling the test time to 100s.

3.2.4.5: 50 full rate flows for 150 seconds against fluctuating bandwidths

I have no idea what sort of traffic or environmental conditions this
is supposed to resemble. Wifi doesn't switch between bandwidths this
way, even abstractly, if that is the intent.

Users don't generally start 50 flows at exactly the same time either.
15 maybe...

I agree it is interesting to observe bandwidth variations, but
restricting this varying-bandwidth test to 50 flows only, versus any
of the other loads, seems limiting.

As a counterpoint...

How about 5000+ flows transferring 6-32k each over 150 seconds,
against these fluctuating bandwidths, measuring effective start and
completion times, modeling DNS traffic as well, all while competing
against one or more long-duration flows?

And starting these new flows across the range of the test while under load?
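A sketch of what staggered flow arrivals might look like (uniform
arrivals are just my illustrative choice here; a Poisson arrival
process would be equally defensible):

```python
import random

def staggered_starts(n_flows, test_len_s, seed=42):
    """Start times (seconds) for short flows spread uniformly
    across the test interval, rather than all at t=0."""
    rng = random.Random(seed)
    return sorted(rng.uniform(0.0, test_len_s) for _ in range(n_flows))

starts = staggered_starts(5000, 150.0)
# New flows keep arriving while earlier ones are mid-transfer,
# which is closer to what an AQM actually sees at a busy edge link.
```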

3.3.1 Wifi

This has no bearing on wifi that I can see.

That said, I happen to like the test outlined here - for all
technologies.

   o  Five repeating TCP transfers: repeating 5MB file transmission;

except I'd prefer to see something like 50k rather than 5MB, and to
measure both uploads and downloads.

   o  One continuous TCP transfer: continuous file transmission;

Again, it's not clear in what direction this is going. Uploads only?
Downloads only? Fully bidirectional?

   o  Four HTTP web traffic (repeated download of 700kB);

It's not clear from the document whether this represents a real web
load; loading a web page of this size would consist of 20-120
different flows.

3.3.2 Sat is just wrong

I figure this is a bad cut/paste from another section of the document.
If you are going to emulate satellite traffic your RTT should be
greater than 800ms.

...

Somewhere around here I ran out of steam.


-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
