Section, paragraph (page)
=======  =========  ====

1, 3 (p3)

   ... mechanisms [RFC5681], while necessary and powerful, are not
   sufficient to provide good service in all circumstances.  Basically,
   there is a limit to how much control can be accomplished from the
   edges of the network.  Some mechanisms are needed in the network
   devices to complement the endpoint congestion avoidance mechanisms.

This reads a bit like middleboxes aren't using anything, and that they
need different mechanisms.  I'd suggest they need to be active as well,
using the same control mechanisms as the endpoints, or at least their
components of those control mechanisms.

How about "Network devices need to actively use the same mechanisms to
provide hop-by-hop congestion avoidance as are used by the endpoints,
especially the newer mechanisms."  That reflects the fact that the
experimentation is going on at or quite near the edge, and the middle
has been a bit left out.

I'd also suggest the second sentence would be stronger if it doesn't
begin with "basically".


1, 4 (p3)
   ... To a rough approximation, queue management algorithms
   manage the length of packet queues by marking or dropping packets
   when necessary or appropriate, while scheduling algorithms determine ...

Based on the description in 1.1.2, para 2 (the one about queues I
mentioned in the prefatory remarks), I suggest we say here that queue
management algorithms "/minimize/ the length of packet queues". 

This is a bit of a truism in the queuing world, and a different mind-set
than buffer management. One wouldn't say "minimize buffering", as that
would draw an entirely backwards picture in the reader's mind (;-)), but
we do wish to minimize the waiting-in-queue, for both throughput and
latency reasons, not just for one or the other.


1, 6 (p3)
   The second issue, discussed in Section 3 of this memo, is the
   potential for future congestive collapse of the Internet due to
   ...
   available.  It is imperative that this work be energetically pursued,
   to ensure the future stability of the Internet.

Unless it is impossible for congestive collapse to happen now, I'd
suggest putting this into the present tense: say "potential for
congestive collapse" and "retain the stability of the Internet".
As it stands, it sounds like we're arguing that this can only happen at
some future date, when something worsens, and so the reader could
reasonably conclude that they need not act.

1, 8 (p4) 
   The discussion in this memo applies to "best-effort" traffic, which
   is to say, traffic generated by applications that accept the
   occasional loss, duplication, or reordering of traffic in flight.  It
   also applies to other traffic, such as real-time traffic that can
   adapt its sending rate to reduce loss and/or delay.  It is most
   effective, when the adaption occurs on time scales of a single RTT or
   a small number of RTTs, for elastic traffic [RFC1633].

I find this just plain hard to read.  I'm going to assume you mean to
identify the traffic regimes we consider applicable, and to indicate
their desirability.  If so, may I suggest:

   The discussion in this memo applies to most observed kinds of
   traffic, but not to <specific technical term goes here> ones, where
   no delay, loss or reordering can be permitted.  It works for
   so-called "real-time" traffic that can adapt its sending rate to
   reduce loss and/or delay, and is optimal for the common case, "best
   effort" traffic, traffic that can accept the occasional loss,
   duplication, or reordering of traffic in flight.  It is most
   effective for <a specific technical term goes here too?> traffic
   where such events occur on time scales of a single RTT or a small
   number of RTTs.


I initially thought you were referring forward to the responsive,
unresponsive and TCP-friendly flows you mention in section 3, but they
clearly don't fit: can you flesh this paragraph out with the kinds of
traffic you mean, or indicate if you meant something else entirely?

2, 1 (p4)

   The traditional technique for managing the queue length in a network
   device is to set a maximum length (in terms of packets) for each
   queue, accept packets for the queue until the maximum length is
   reached, then reject (drop) subsequent incoming packets until the
   queue decreases because a packet from the queue has been transmitted.

I'd break the paragraph here and let the sentence naming the technique
be a separate paragraph.

I'd follow it with a statement of why we have to deal with queues, such as

    These queues build up wherever there is a connection between an
    incoming and an outgoing stream of packets, and a buffer is used to
    hold packets where they are being routed, where the input and
    output are running at different speeds, or where a group of packets
    arrive together at the input and need to be held until output
    processing can drain them away.

This then distinguishes between the three uses of the buffer, and
motivates the following argument.

2.1, 1 (p4)

       In some situations tail drop allows a single connection or a few
       flows to monopolize queue space, preventing other connections
       from getting room in the queue.  This "lock-out" phenomenon is
       often the result of synchronization or other timing effects.


I think it might be better to say "inadvertent synchronization", as that
avoids suggesting that some separate and intentional synchronization was
causing the problem.

2.1, 2 (p4)

       The tail drop discipline allows queues to maintain a full (or,
       almost full) buffer for long periods of time, since tail drop
       signals congestion (via a packet drop) only when the queue has
       become full.  

I changed "status" to "buffer" in the above. Following this sentence, may I
suggest we skip immediately to

       The point of buffering in the network is to absorb data bursts
       and to transmit them during the (hopefully) ensuing bursts of
       silence.  This is essential to permit the transmission of bursty
       data.  Normally small queues are preferred in network devices,
       with sufficient queue capacity to absorb the bursts. 

then

       Even though TCP constrains the congestion window of
       a flow, packets often arrive at network devices in bursts
       [Leland94].  If the queue is full or almost full, an arriving
       burst will cause multiple packets to be dropped.  This can result
       in a global synchronization of flows throttling back, followed by
       a sustained period of lowered link utilization, reducing overall
       throughput.

       The counter-intuitive result is that maintaining normally-small
       buffer residencies (queues) can result in higher throughput as 
       well as lower end-to-end delay.  In summary, buffer/queue limits 
       should not reflect the steady state queues we want to be maintained 
       in the network; instead, they should reflect the size of bursts that 
       a network device needs to absorb. 

I inserted "buffer residencies (queues)" in place of "queues" and
"buffer/queue limits" in place of "queue limits", then we could follow
up with

       The naive assumption might be that there is a simple tradeoff
       between delay and throughput, and that the recommendation that
       queues be maintained in a "non-full" state essentially translates
       to a recommendation that low end-to-end delay is more important
       than high throughput. 

I'd then conclude with

        As we see above, this is by no means the case. Small queues and 
        normally-empty buffers maximize throughput as well.


2.2, 5 (p5)

   We know in general how to solve the full-queues problem for
   "responsive" flows, i.e., those flows that throttle back in response
   to congestion notification.  In the current Internet, dropped packets
   provide a critical mechanism indicating congestion notification to
   hosts.  The solution to the full-queues problem is for network
   devices to drop packets before a queue becomes full, so that hosts
   can respond to congestion before buffers overflow.  We call such a
   proactive approach AQM.  By dropping packets before buffers overflow,
   AQM allows network devices to control when and how many packets to
   drop.

As we're going to use ECN fairly soon as part of AQM, we need to
foreshadow that, and this is also a good place to define AQM.  I suggest
changing  "The solution to the full-queues problem ... We call such a
proactive approach AQM. " to

   The solution to the full-queues problem is for network devices to
   drop packets or otherwise notify the sender before a queue becomes
   full, so that hosts can respond to congestion before buffers
   overflow.  We define such a proactive approach to be /Active Queue
   Management/ (AQM).  By dropping or marking packets before buffers
   overflow, AQM allows network devices to maximize performance.

End of first day: comments, suggestions and brickbats cordially accepted!

--dave
[If anyone would prefer this as amendments to an editable format, just say]
_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm
