On Nov 7, 2013, at 8:59 AM, "Akhtar, Shahid (Shahid)" 
<[email protected]> wrote:

> Hi All,
> 
> Had some comments on Fred's document. I have added the comments as track 
> changes in a word document to easily see them. I used the 02 version. 
> 
> Thanks. 


Permit me to put your comments in email, along with my own views. I am also 
adding my co-author and the other working group chair on the CC line; if he is 
like me, he receives far too much email, and mail that is explicitly addressed 
to him or copies him bubbles higher in the column.

>> 4.  Conclusions and Recommendations
>>   [snip] 
>>    3.  The algorithms that the IETF recommends SHOULD NOT require
>>        operational (especially manual) configuration or tuning.
> 
> Some tuning may be required or implicitly assumed for virtually all AQMs – 
> please see my comment later.

That's an opinion. One of the objectives of Van and Kathy's work, and 
separately of Rong Pan et al's work, is to design an algorithm that may have 
different initial conditions drawn from a table given the interface it finds 
itself on, but requires no manual tuning. The great failure of RED, recommended 
in RFC 2309, is not that it doesn't work when properly configured; it's that 
real humans don't have the time to properly tune it differently for each of the 
thousands of link endpoints in their networks. There is no point in changing 
away from RED if that is also true of the replacement.

>>    7.  Research, engineering, and measurement efforts are needed
>>        regarding the design of mechanisms to deal with flows that are
>>        unresponsive to congestion notification or are responsive, but
>>        are more aggressive than present TCP.
>  
>       Do we want to make a suggestion on how to configure buffer sizes with 
> each type of AQM here (e.g. 2xBDP), or simply state that research should 
> be conducted on the best buffer sizes to use with AQM?
 
I'm not sure that buffer sizes are specific to AQM algorithms; I'd entertain 
evidence otherwise. Buffer *thresholds* ("at what point do we start 
dropping/marking traffic?") may differ between algorithms. Buffer size ("how 
many bytes/packets do we allow into the queue in the worst case?") is a matter 
of the characteristics of burst behavior in a given network and the 
applications it supports. If I have, say, a Map/Reduce application that 
simultaneously asks thousands of systems a question, the queues in the 
intervening switches will need to be able to briefly absorb thousands of 
response packets. The key word here is "briefly". When Van or Kathy talk about 
"good queue" and "bad queue", they are saying that burst behavior may call for 
deep queues, but we really want the steady state to achieve 100% utilization 
with a statistically empty queue if we can possibly achieve that.
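
As a back-of-the-envelope sketch of that distinction, the buffer needed to 
absorb such a burst, and how long it takes to drain, can be estimated as below. 
The numbers are illustrative assumptions rather than measurements (2000 
responders, one 1500-byte response each, a 10 Gbit/s egress link), and the 
function names are hypothetical.

```python
def burst_bytes(responders: int, packet_bytes: int) -> int:
    """Bytes that arrive at once if every responder sends one packet."""
    return responders * packet_bytes


def drain_time_ms(queued_bytes: int, link_bps: float) -> float:
    """Time to serialize the queued bytes onto the egress link, in ms."""
    return queued_bytes * 8 / link_bps * 1000.0


if __name__ == "__main__":
    # Assumed values, for illustration only.
    queued = burst_bytes(responders=2000, packet_bytes=1500)
    print(queued, "bytes queued")
    print(round(drain_time_ms(queued, 10e9), 3), "ms to drain at 10 Gbit/s")
```

Under those assumptions the queue must hold about 3 MB but empties in roughly 
2.4 ms, which is the sense of "briefly": the size is set by the burst, while 
the point at which an AQM starts marking or dropping is a separate question.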

>> 4.3.  AQM algorithms deployed SHOULD NOT require operational tuning
>>  
>>    A number of algorithms have been proposed.  Many require some form of
>>    tuning or initial condition.  This can make them difficult to use
>>    operationally.  Hence, self-tuning algorithms are to be preferred.
>>    The algorithms that the IETF recommends SHOULD NOT require
>>    operational (especially manual) configuration or tuning.
>  
> It may be better to state that tuning should be minimized. For the second 
> sentence: “The algorithms that the IETF recommends should minimize tuning or 
> configuration changes for specific traffic or network conditions.”
>  
> I would argue that all AQMs will likely require or assume some type of 
> configuration/tuning.
>  
> For example, if we take Codel:
>  
> ·       For small thin links, such as 1-10Mbps DSL, the 5ms target would 
> increase packet loss significantly, and at 2Mbps a single MTU time may even 
> exceed the 5ms target.
>  
> ·       If the average RTT of all flows going through a link is more than 
> 500ms, e.g. for satellite, then the 100ms interval would prematurely drop 
> packets before the sources have had a chance to reduce their sending rate. Or 
> if the average RTT is very low – e.g. 10ms – such as for flows between 
> data-center elements, then the 100ms interval may be slow to signal congestion 
> back to the sources and significant packet loss may have occurred before such 
> signaling.


What you describe is what I referred to as "initial conditions derived from the 
links the algorithm finds itself on". We may be in violent agreement there. If 
so, the wording I might suggest would be "SHOULD NOT require operational 
(especially manual) configuration or tuning apart from automated determination 
of initial conditions" or some such thing.
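
To make the arithmetic in the CoDel example concrete, and to illustrate what 
"automated determination of initial conditions" might look like, here is a 
minimal sketch. The 5 ms target comes from the comment above; the 1500-byte MTU 
is an assumed Ethernet-style value, the rule of raising the target to at least 
one MTU time is an assumption made for illustration rather than something taken 
from a CoDel specification, and the function names are hypothetical.

```python
DEFAULT_TARGET_MS = 5.0   # CoDel target quoted in the comment above
ASSUMED_MTU_BYTES = 1500  # assumption: Ethernet-style MTU


def mtu_time_ms(link_bps: float, mtu_bytes: int = ASSUMED_MTU_BYTES) -> float:
    """Time to serialize one MTU-sized packet on the link, in ms."""
    return mtu_bytes * 8 / link_bps * 1000.0


def initial_target_ms(link_bps: float) -> float:
    """Hypothetical rule: never set the target below one MTU time."""
    return max(DEFAULT_TARGET_MS, mtu_time_ms(link_bps))


if __name__ == "__main__":
    for mbps in (1, 2, 10, 100):
        bps = mbps * 1e6
        print(mbps, "Mbit/s:", round(mtu_time_ms(bps), 2), "ms per MTU,",
              "initial target", round(initial_target_ms(bps), 2), "ms")
```

At 2 Mbit/s one such packet takes 6 ms to serialize, which exceeds the 5 ms 
target as the comment says; at 10 Mbit/s it takes 1.2 ms and the default stands.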

>> 4.7.  The need for further research
>>  
>>    The second recommendation of [RFC2309] called for further research in
>>    the interaction between network queues and host applications, and the
>>    means of signaling between them.  This research has occurred, and we
>>    as a community have learned a lot.  However, we are not done.
>>  
>>    We have learned that the problems of congestion, latency and buffer-
>>    sizing have not gone away, and are becoming more important to many
>>    users.  A number of self-tuning AQM algorithms have been found that
>>    offer significant advantages for deployed networks.  There is also
>>    renewed interest in deploying AQM and the potential of ECN.
>>  
>>    An obvious example of further research in 2013 is the need to
>>    consider the use of Map/Reduce applications in data centers; do we
>>    need to extend our taxonomy of TCP/SCTP sessions to include not only
>>    "mice" and "elephants", but "lemmings"?  "Lemmings" are flash crowds
>>    of "mice" that the network inadvertently tries to signal to as if
>>    they were elephant flows, resulting in head of line blocking in data
>>    center applications.
>  
> Such research should also focus on improving end user QoE from AQMs rather 
> than network related metrics only. Often a significant change in a network 
> metric may only make a minimal change in end-user QoE and thus the value of 
> such change may be minimal.


There is also a question of which user is under discussion. If you take a look 
at http://www.ietf.org/proceedings/88/slides/slides-88-aqm-0.pptx and 
specifically the third slide, you will see (and tomorrow afternoon I will 
discuss) a capture I took of pings from my home network to another site 
overnight. In the evening, we watched a movie (home network), and in the 
morning I had a video conference (office network). I'll tell you right now that 
both worked fine, and managing the delay to zero or allowing it to be one 
hundred ms would not have materially affected the QoE of either. However, my 
ping is a competing application, and it saw a sustained increase in delay, 
variation in delay, and the possibility of loss as a result of the queuing. The 
value of AQM is in part to the application being throttled, but in large part 
to competing applications, and the QoE of both must be considered.

>>    Examples of other required research include:
>>  
>>    o  Research into new AQM and scheduling algorithms.
>>  
>>    o  Research into the use of and deployment of ECN alongside AQM.
>>  
>>    o  Tools for enabling AQM (and ECN) deployment and measuring the
>>       performance.
>>  
>>    o  Methods for mitigating the impact of non-conformant and malicious
>>       flows.
>  
> Methods or configurations that leverage deployed AQMs such as RED/WRED to 
> reduce delays and lockout for typical traffic which require minimal effort or 
> tuning from the operator.
 
Not a complete sentence, but I think I understand what you're getting at. You 
would like to have research determine how to easily configure existing systems 
using the tools at hand. I'm all for it in the near term.


