Re: [aqm] IETF88 Fri 08Nov13 - 12:30 Regency B

Akhtar, Shahid (Shahid) Fri, 08 Nov 2013 05:57:39 -0800

Fred,

I have taken the liberty to tie your comments to the e-mail list of comments I 
had sent and put a "FB" next to your comment. I have added "SA" next to my 
comments. This should be easier for people to read.

-Shahid.

____________________________________________________

Section 4.3 - Manual tuning

The document states and implies that AQMs should not require manual tuning. In 
section 4.3 it states "The algorithms that the IETF recommends SHOULD NOT 
require operational (especially manual) configuration or tuning". I recommend 
changing this sentence to: "The algorithms that the IETF recommends should 
minimize tuning or configuration changes for specific traffic or network 
conditions".

I would further argue that all AQMs will likely require or implicitly assume 
some type of configuration/tuning.

For example, if we take Codel:
*       For small thin links, such as 1-10Mbps DSL, the 5ms target would 
increase packet loss significantly and at 2Mbps, a single MTU time may even 
exceed the 5ms target.
*       If the average RTT of all flows going through a link is more than 
500ms, e.g. for satellite, then the 100ms interval would prematurely drop 
packets before the sources have had a chance to reduce their sending rate. Or 
if the average RTT is very low - e.g. 10ms - such as for flows between 
data-center elements, then the 100ms interval may be slow to signal congestion back 
to the sources and significant packet loss may have occurred before such 
signaling.
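
The arithmetic behind both bullets can be checked with a minimal sketch. It 
assumes a 1500-byte MTU; the 5ms target and 100ms interval are the values 
discussed above, and the function names are illustrative only.

```python
MTU_BYTES = 1500          # assumption: Ethernet-sized MTU
CODEL_TARGET_S = 0.005    # 5ms target, as discussed above
CODEL_INTERVAL_S = 0.100  # 100ms interval, as discussed above


def serialization_time_s(link_bps, packet_bytes=MTU_BYTES):
    """Time to put one packet on the wire at the given link rate."""
    return packet_bytes * 8 / link_bps


def mtu_exceeds_target(link_bps):
    """True if one MTU-sized packet alone takes longer than the target."""
    return serialization_time_s(link_bps) > CODEL_TARGET_S


def interval_to_rtt_ratio(rtt_s):
    """Number of RTTs per interval; below 1 the interval is shorter than an RTT."""
    return CODEL_INTERVAL_S / rtt_s


for mbps in (1, 2, 10):
    t_ms = serialization_time_s(mbps * 1_000_000) * 1000
    print(f"{mbps} Mbps: one MTU takes {t_ms:.1f} ms on the wire")

for rtt_ms in (10, 500):
    ratio = interval_to_rtt_ratio(rtt_ms / 1000)
    print(f"RTT {rtt_ms} ms: the interval spans {ratio:.1f} RTTs")
```

Under these assumptions a single MTU at 2Mbps occupies the link for 6ms, 
which is already above the 5ms target, and a 500ms RTT makes the 100ms 
interval only a fifth of one round trip.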

One of the objectives of newer AQMs being defined here should be to 
minimize tuning, but we should recognize that tuning or some configuration 
likely cannot be eliminated altogether.

FB: That's an opinion. One of the objectives of Van and Kathy's work, and 
separately of Rong Pan et al's work, is to design an algorithm that may have 
different initial conditions drawn from a table given the interface it finds 
itself on, but requires no manual tuning. The great failure of RED, recommended 
in RFC 2309, is not that it doesn't work when properly configured; it's that 
real humans don't have the time to properly tune it differently for each of the 
thousands of link endpoints in their networks. There is no point in changing 
away from RED if that is also true of the replacement.

SA: If, as you argue, "initial conditions" determine some of the parameters of 
newer AQMs (like Codel and PIE), then those same initial conditions would also 
determine some of the key parameters for RED/WRED.

Can you explain further, with a realistic example, why "real humans don't have 
the time to properly tune it differently for each of the thousands of link 
endpoints in their networks"?

FB: What you describe is what I referred to as "initial conditions derived from 
the links the algorithm finds itself on". We may be in violent agreement there. 
If so, the wording I might suggest would be "SHOULD NOT require operational 
(especially manual) configuration or tuning apart from automated determination 
of initial conditions" or some such thing.

SA: Looks like we may be agreeing on something here - which is great. It would 
be valuable to describe the meaning of "initial conditions" somewhere.
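
One possible reading of "automated determination of initial conditions" is 
sketched below: the starting parameters are derived from the link rate the 
interface reports, with no per-link human input. The rule used here (keep the 
defaults, but never let the target drop below two MTU transmission times) and 
every name in the code are assumptions for illustration only; nothing in this 
thread or in Codel or PIE specifies them.

```python
from dataclasses import dataclass

MTU_BYTES = 1500            # assumption: Ethernet-sized MTU
DEFAULT_TARGET_S = 0.005    # 5ms, the target discussed in this thread
DEFAULT_INTERVAL_S = 0.100  # 100ms, the interval discussed in this thread


@dataclass(frozen=True)
class InitialConditions:
    target_s: float
    interval_s: float


def initial_conditions(link_bps: float) -> InitialConditions:
    """Derive starting parameters from the interface rate alone.

    Hypothetical rule: use the defaults, but raise the target to at least
    two MTU serialization times so that a slow link is not penalized for
    delay it cannot avoid.
    """
    mtu_time_s = MTU_BYTES * 8 / link_bps
    target_s = max(DEFAULT_TARGET_S, 2 * mtu_time_s)
    return InitialConditions(target_s=target_s, interval_s=DEFAULT_INTERVAL_S)


fast = initial_conditions(100_000_000)  # 100 Mbps: defaults are kept
slow = initial_conditions(2_000_000)    # 2 Mbps: target is raised
print(fast, slow)
```

The point of the sketch is only that such a function takes the interface as 
input rather than an operator's judgment; what belongs in the rule is exactly 
the question of what "initial conditions" should mean.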

Section 4.7 - Further Research

AQM impact on end-user QoE
Research should also focus on improving end user QoE from AQMs rather than 
network related metrics only. Often a significant change in a network metric 
may only make a minimal change in end-user QoE and thus the value of such 
change may be minimal.

FB: There is also a question of what user is under discussion. If you take a 
look at http://www.ietf.org/proceedings/88/slides/slides-88-aqm-0.pptx and 
specifically the third slide, you will see (and tomorrow afternoon I will 
discuss) a capture I took of pings from my home network to another site 
overnight. In the evening, we watched a movie (home network), and in the 
morning I had a video conference (office network). I'll tell you right now that 
both worked fine, and managing the delay to zero or allowing it to be one 
hundred ms would not have materially affected the QOE of either. However, my 
ping is a competing application, and it saw sustained increase in delay, 
variation in delay, and the possibility of loss as a result of the queuing. The 
value of AQM is in part to the application being throttled, but in large part 
to competing applications, and the QOE of both must be considered.

SA: My point in bringing up QoE is to tie the impact of a new investment by an 
operator (such as AQM algorithms) to revenue. By improving the QoE of 
revenue-making services, an operator may determine that it makes business sense 
to invest. I do not understand how "ping" is linked to a revenue-making 
service. The QoE of Web browsing, which is a key service to end-users, is 
closely tied to latency, as the PLT (page load time) of a web page is directly 
linked to RTT. However, the QoE of different services such as web browsing, HAS 
video, progressive download video and others should be weighed according to 
their importance to the end user.

Suggestion on best buffer sizes
Research should make suggestions on how to configure buffer sizes with each 
type of AQM (e.g. 2xBDP etc) - explaining why/how such buffer sizes improve 
end-user QoE and network health.
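
For reference, the "2xBDP" figure mentioned above is simple arithmetic, shown 
here as a minimal sketch. The 10Mbps rate and 100ms RTT are assumed example 
values, not measurements, and the function names are illustrative.

```python
def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return link_bps * rtt_s / 8


def buffer_bytes(link_bps: float, rtt_s: float, multiple: float = 2.0) -> float:
    """Buffer size expressed as a multiple of the BDP (2xBDP by default)."""
    return multiple * bdp_bytes(link_bps, rtt_s)


def drain_time_s(queued_bytes: float, link_bps: float) -> float:
    """Queuing delay seen by a packet arriving behind queued_bytes."""
    return queued_bytes * 8 / link_bps


link_bps = 10_000_000  # assumed 10Mbps access link
rtt_s = 0.100          # assumed 100ms RTT

size = buffer_bytes(link_bps, rtt_s)
print(f"BDP:    {bdp_bytes(link_bps, rtt_s):.0f} bytes")
print(f"2xBDP:  {size:.0f} bytes")
print(f"Delay when full: {drain_time_s(size, link_bps) * 1000:.0f} ms")
```

With these assumed values the BDP is 125000 bytes, and a full 2xBDP buffer 
adds 200ms of queuing delay, i.e. two RTTs, which is why the choice interacts 
with end-user QoE.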

FB: I'm not sure that buffer sizes are specific to AQM algorithms; I'd 
entertain evidence otherwise. Buffer *thresholds* ("at what point do we start 
dropping/marking traffic?") may differ between algorithms. Buffer size ("how 
many bytes/packets do we allow into the queue in the worst case?") is a matter 
of the characteristics of burst behavior in a given network and the 
applications it supports. If I have, say, a Map/Reduce application that 
simultaneously asks thousands of systems a question, the queues in the 
intervening switches will need to be able to briefly absorb thousands of 
response packets. The key word here is "briefly". When Van or Kathy talk about 
"good queue" and "bad queue", they are saying that burst behavior may call for 
deep queues, but we really want the steady state to achieve 100% utilization 
with a statistically empty queue if we can possibly achieve that.

SA: In access situations (with AQM) buffer sizes can determine the size of a 
burst that the buffer can withstand and our research shows that it can directly 
impact end user QoE.

Best way to leverage deployed AQMs
Research should be done on methods or configurations that leverage deployed 
AQMs such as RED/WRED to reduce delays and lockout for typical traffic, and 
that require minimal effort or tuning from the operator.

FB: Not a complete sentence, but I think I understand what you're getting at. 
You would like to have research determine how to easily configure existing 
systems using the tools at hand. I'm all for it in the near term.

SA: Glad we agree on this. I assume some update will be made around this topic.

-----Original Message-----
From: Fred Baker (fred) [mailto:[email protected]]
Sent: Thursday, November 07, 2013 12:45 PM
To: Akhtar, Shahid (Shahid)
Cc: Richard Scheffenegger; [email protected]; Naeem Khademi ([email protected]); 
Gorry Fairhurst; Wesley Eddy
Subject: Re: IETF88 Fri 08Nov13 - 12:30 Regency B

On Nov 7, 2013, at 8:59 AM, "Akhtar, Shahid (Shahid)" 
<[email protected]> wrote:

> Hi All,
>
> Had some comments on Fred's document. I have added the comments as track 
> changes in a word document to easily see them. I used the 02 version.
>
> Thanks.

Permit me to put your comments in email, along with my own views. Also adding 
my co-author and the other working group chair on the CC line; if he is like 
me, he receives far too much email, and mail that is explicitly to or copies 
him bubbles higher in the column.

>> 4.  Conclusions and Recommendations
>>   [snip]
>>    3.  The algorithms that the IETF recommends SHOULD NOT require
>>        operational (especially manual) configuration or tuning.
>
> Some tuning may be required or implicitly assumed for virtually all AQMs - 
> please see my comment later.

FB: That's an opinion. One of the objectives of Van and Kathy's work, and 
separately of Rong Pan et al's work, is to design an algorithm that may have 
different initial conditions drawn from a table given the interface it finds 
itself on, but requires no manual tuning. The great failure of RED, recommended 
in RFC 2309, is not that it doesn't work when properly configured; it's that 
real humans don't have the time to properly tune it differently for each of the 
thousands of link endpoints in their networks. There is no point in changing 
away from RED if that is also true of the replacement.

SA: If, as you argue, "initial conditions" determine some of the parameters of 
newer AQMs (like Codel and PIE), then those same initial conditions would also 
determine the key parameters for RED/WRED.

Can you explain further, with a realistic example, why "real humans don't have 
the time to properly tune it differently for each of the thousands of link 
endpoints in their networks"?

>>    7.  Research, engineering, and measurement efforts are needed
>>        regarding the design of mechanisms to deal with flows that are
>>        unresponsive to congestion notification or are responsive, but
>>        are more aggressive than present TCP.
>
>       Do we want to make a suggestion on how to configure buffer sizes with 
> each type of AQM here (e.g. 2xBDP etc) or simply state that research should 
> be conducted on the best buffer sizes to use with AQM.

FB: I'm not sure that buffer sizes are specific to AQM algorithms; I'd 
entertain evidence otherwise. Buffer *thresholds* ("at what point do we start 
dropping/marking traffic?") may differ between algorithms. Buffer size ("how 
many bytes/packets do we allow into the queue in the worst case?") is a matter 
of the characteristics of burst behavior in a given network and the 
applications it supports. If I have, say, a Map/Reduce application that 
simultaneously asks thousands of systems a question, the queues in the 
intervening switches will need to be able to briefly absorb thousands of 
response packets. The key word here is "briefly". When Van or Kathy talk about 
"good queue" and "bad queue", they are saying that burst behavior may call for 
deep queues, but we really want the steady state to achieve 100% utilization 
with a statistically empty queue if we can possibly achieve that.

SA: In access situations (with AQM) buffer sizes can determine the size of a 
burst that the buffer can withstand and our research shows that it can directly 
impact end user QoE.

>> 4.3.  AQM algorithms deployed SHOULD NOT require operational tuning
>>
>>    A number of algorithms have been proposed.  Many require some form of
>>    tuning or initial condition.  This can make them difficult to use
>>    operationally.  Hence, self-tuning algorithms are to be preferred.
>>    The algorithms that the IETF recommends SHOULD NOT require
>>    operational (especially manual) configuration or tuning.
>
> May be better to state that tuning should be minimized. For the second 
> sentence "The algorithms that the IETF recommends should minimize tuning or 
> configurations changes for specific traffic or network conditions"
>
> I would argue that all AQMs will likely require or assume some type of 
> configuration/tuning.
>
> For example, if we take Codel:
>
> *       For small thin links, such as 1-10Mbs DSL, the 5ms target would 
> increase packet loss significantly and at 2Mbps, a single MTU time may even 
> exceed the 5ms target.
>
> *       If the average RTT of all flows going through a link is more than 
> 500ms, e.g. for satellite, then the 100ms interval would prematurely drop 
> packets before the sources have had a chance to reduce their sending rate. Or 
> if the average RTT is very low - e.g. 10ms - such as for flows between 
> data-center elements, then 100ms interval may be slow to signal congestion 
> back to the sources and significant packet loss may have occurred before such 
> signaling.

FB: What you describe is what I referred to as "initial conditions derived from 
the links the algorithm finds itself on". We may be in violent agreement there. 
If so, the wording I might suggest would be "SHOULD NOT require operational 
(especially manual) configuration or tuning apart from automated determination 
of initial conditions" or some such thing.

SA: Looks like we may be agreeing on something here - which is great. It would 
be valuable to describe the meaning of "initial conditions" somewhere.

>> 4.7.  The need for further research
>>
>>    The second recommendation of [RFC2309] called for further research in
>>    the interaction between network queues and host applications, and the
>>    means of signaling between them.  This research has occurred, and we
>>    as a community have learned a lot.  However, we are not done.
>>
>>    We have learned that the problems of congestion, latency and buffer-
>>    sizing have not gone away, and are becoming more important to many
>>    users.  A number of self-tuning AQM algorithms have be found that
>>    offer significant advantages for deployed networks.  There is also
>>    renewed interest in deploying AQM and the potential of ECN.
>>
>>    An obvious example of further research in 2013 is the need to
>>    consider the use of Map/Reduce applications in data centers; do we
>>    need to extend our taxonomy of TCP/SCTP sessions to include not only
>>    "mice" and "elephants", but "lemmings"?  "Lemmings" are flash crowds
>>    of "mice" that the network inadvertently tries to signal to as if
>>    they were elephant flows, resulting in head of line blocking in data
>>    center applications.
>
> Such research should also focus on improving end user QoE from AQMs rather 
> than network related metrics only. Often a significant change in a network 
> metric may only make a minimal change in end-user QoE and thus the value of 
> such change may be minimal.

FB: There is also a question of what user is under discussion. If you take a 
look at http://www.ietf.org/proceedings/88/slides/slides-88-aqm-0.pptx and 
specifically the third slide, you will see (and tomorrow afternoon I will 
discuss) a capture I took of pings from my home network to another site 
overnight. In the evening, we watched a movie (home network), and in the 
morning I had a video conference (office network). I'll tell you right now that 
both worked fine, and managing the delay to zero or allowing it to be one 
hundred ms would not have materially affected the QOE of either. However, my 
ping is a competing application, and it saw sustained increase in delay, 
variation in delay, and the possibility of loss as a result of the queuing. The 
value of AQM is in part to the application being throttled, but in large part 
to competing applications, and the QOE of both must be considered.

SA: My point in bringing up QoE is to tie the impact of a new investment by an 
operator (such as AQM algorithms) to revenue. By improving the QoE of 
revenue-making services, an operator may determine that it makes business sense 
to invest. I do not understand how "ping" is linked to a revenue-making 
service. The QoE of Web browsing, which is a key service to end-users, is 
closely tied to latency, as the PLT (page load time) of a web page is directly 
linked to RTT. However, the QoE of different services such as web browsing, HAS 
video, progressive download video and others should be weighed according to 
their importance to the end user.

>>    Examples of other required research include:
>>
>>    o  Research into new AQM and scheduling algorithms.
>>
>>    o  Research into the use of and deployment of ECN alongside AQM.
>>
>>    o  Tools for enabling AQM (and ECN) deployment and measuring the
>>       performance.
>>
>>    o  Methods for mitigating the impact of non-conformant and malicious
>>       flows.
>
> Methods or configurations that leverage deployed AQMs such as RED/WRED to 
> reduce delays and lockout for typical traffic which require minimal effort or 
> tuning from the operator.

FB: Not a complete sentence, but I think I understand what you're getting at. 
You would like to have research determine how to easily configure existing 
systems using the tools at hand. I'm all for it in the near term.

SA: Glad we agree on this. I assume some update will be made around this topic.

_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm
