I thought every bullet point here was marvelous:

http://www.ietf.org/mail-archive/web/aqm/current/msg01118.html

and would like to see it captured in a formal document somewhere if it
is not captured in the ecn advocacy document.

I have only three quibbles, one kind of major.

re: #5 Slow-starts{Note 3} cause spikes in delay.

- AQM without ECN cannot remove this delay, and typically AQM is
designed to allow such bursts of delay in the hope they will disappear
of their own accord.
- Flow queuing can remove the effect of these delay bursts on other
flows, "but only if it gives all flows a separate queue from the
start."

I don't think this last is proven. Further, I am pretty sure that a
fully dedicated queue per flow is kind of dangerous.

I would have said:

- AQM *with* ECN cannot remove this delay, and typically AQM is
designed to allow such bursts of delay in the hope they will disappear
of their own accord. AQM without ECN or with ECN overload protection
can make a dent in this delay but not instantaneously. FQ can remove
the effect of these delay bursts on other flows.

re: #6 Slow-starts{Note 3} can cause runs of losses, which in turn cause delays.
  - AQM without ECN cannot remove these delays.
  - Flow queuing cannot remove these losses, if self-induced.
  - Delay-based SS like HSS can mitigate these losses, with increased
risk of longer completion time.
  - ECN can remove these losses, and the consequent delays.

Throughout much of our debates we have had a problem distinguishing
delay within a flow from delay induced in other flows, and perhaps we
should come up with a clean word to separate these two forms of
induced delay. AQM with ECN in this case would reduce the amount of
induced delay on that flow, but cause delay for other flows; the
overall rate would only be reduced by half, while perhaps 5 of the
IW10 packets (as an example) could have been dropped, clearing the
immediate congestion for another flow.
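To put rough numbers on that, here is a back-of-the-envelope sketch
(my illustrative figures, not anyone's measurement): assuming
1500-byte packets and a 10Mbit/s bottleneck, it compares how long an
IW10 burst sits in front of other traffic if all ten packets are
ECN-marked and kept, versus if five of them are dropped.

/* Back-of-the-envelope: queue occupancy induced by an IW10 burst.
 * Assumptions (mine, for illustration): 1500-byte packets, 10 Mbit/s
 * bottleneck.  Marking keeps all 10 packets queued; dropping 5 of
 * them halves the backlog other flows must wait behind.
 */
#include <stdio.h>

int main(void)
{
    const double pkt_bytes = 1500.0;
    const double link_bps  = 10e6;                 /* 10 Mbit/s bottleneck */
    const double pkt_ms    = pkt_bytes * 8 / link_bps * 1000.0;
    const int    iw        = 10;                   /* IW10 burst */

    printf("serialization time per packet: %.2f ms\n", pkt_ms);
    printf("ECN-marked (all %d kept):      %.1f ms of induced queue\n",
           iw, iw * pkt_ms);
    printf("5 of %d dropped:               %.1f ms of induced queue\n",
           iw, (iw - 5) * pkt_ms);
    return 0;
}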

With the current overload protection in the different ECN-enabled AQM
algorithms, different things happen, as I have noted elsewhere. PIE
very quickly starts dropping even ECN-marked packets when slammed
with stuff in slow start, which is perhaps as it should be.
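For reference, the overload protection I am describing is roughly
this shape. This is a minimal sketch, not the actual pie qdisc code;
the 10% threshold is an illustrative constant and the probability
controller itself is elided.

/* Minimal sketch of ECN overload protection in a PIE-like AQM.
 * Assumptions: drop_prob comes from the usual PIE controller (not
 * shown); ecn_threshold of 10% is illustrative, not necessarily what
 * any given implementation uses.
 */
#include <stdbool.h>
#include <stdlib.h>

struct pie_state {
    double drop_prob;       /* output of the PIE probability controller */
    double ecn_threshold;   /* above this, stop marking and drop instead */
};

/* Returns true if the packet should be dropped. */
bool pie_handle_packet(struct pie_state *p, bool ect, bool *mark)
{
    *mark = false;
    if ((double)rand() / RAND_MAX >= p->drop_prob)
        return false;                        /* below probability: pass */
    if (ect && p->drop_prob <= p->ecn_threshold) {
        *mark = true;                        /* light congestion: CE-mark */
        return false;
    }
    return true;     /* overloaded (or not ECT): drop, even in slow start */
}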

Re: "Whether flow queuing is applicable depends on the scale. The work
I'm doing with Koen is to reduce the cost of the queuing mechanisms on
our BNGs (broadband network gateways). We're trying to reduce the cost
of per-customer queuing at scale, so per-flow queuing is simply out of
the question. Whereas ECN requires no more processing than drop."

Three subpoints.

1) It turns out that the amount of packet inspection needed to pry
apart a packet and mark it can be quite a lot at dequeue time, with
additional memory accesses for configuration variables as well.
Hashing and timestamping the headers at enqueue time is in some ways
lighter weight, particularly if offloaded to the rx hardware.
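To illustrate what I mean by the enqueue-time work being light, here
is a rough sketch (hypothetical struct and field names, not fq_codel's
actual code) of hashing and timestamping on enqueue, so that dequeue
only has to do a subtraction rather than re-parse headers.

/* Sketch: do the per-packet flow hashing and timestamping at enqueue,
 * so dequeue-time work is a single subtraction (sojourn time) instead
 * of re-parsing headers.  Names are hypothetical, not the actual
 * fq_codel structures; if the NIC already supplies an RSS hash, the
 * hash step can be skipped entirely.
 */
#include <stdint.h>
#include <time.h>

struct pkt {
    uint32_t src, dst;      /* 5-tuple fields, simplified */
    uint16_t sport, dport;
    uint8_t  proto;
    uint64_t ts_ns;         /* filled in at enqueue */
    uint32_t flow_hash;     /* filled in at enqueue (or taken from NIC) */
};

uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* Enqueue: one hash, one timestamp, then pick the flow's queue. */
uint32_t enqueue_prep(struct pkt *p, uint32_t nqueues)
{
    uint32_t h = p->src ^ p->dst ^ ((uint32_t)p->sport << 16 | p->dport)
                 ^ p->proto;
    h *= 0x9e3779b9u;                /* cheap mix; real code uses jhash */
    p->flow_hash = h;
    p->ts_ns = now_ns();
    return h % nqueues;              /* which per-flow queue to append to */
}

/* Dequeue: sojourn time is just now minus the enqueue timestamp. */
uint64_t sojourn_ns(const struct pkt *p)
{
    return now_ns() - p->ts_ns;
}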

There are other things besides queue algorithms that are pretty
heavyweight in the code path, notably FIB lookups (recently massively
improved in Linux 4.0), which can benefit from being parallelized, as
they are with the current 10GigE hardware in most Intel systems, with
16 CPUs handling the load of, typically, 64 rx and tx queues.

So it is a total systems (Amdahl's law) sort of problem as to where
the trade-offs are. I would be interested to know the CPU, network
hardware, and memory design of your BNGs. By this point I am
painfully aware of how hard it is to do software rate limiting in
Linux in particular. Doing it in hardware turned out to be
straightforward (SENIC).

In the design of cake it was basically my hope to find a simple means
to apply it to many, many customer-specific queues, but that requires
a customer lookup service filter not yet designed, some attention to
how rx queues are handled in the stack on a per-CPU basis, and
perhaps some custom hardware. I look forward to trying it at 10GigE
soon.

2) Per-flow queuing is a mere matter of memory organization after that
point, and we already know how to scale that to millions of flows on a
day-to-day basis on Intel hardware.[1]
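By "a mere matter of memory organization" I mean something like the
following sketch. The sizes and the table layout are illustrative
assumptions on my part, not a claim about any shipping qdisc: once
each packet carries a flow hash, the hard part is mostly how you lay
out the queue heads in memory.

/* Sketch: per-flow queuing as memory organization.  Given a flow hash
 * per packet (from the previous sketch, or from RX hardware), the flow
 * table is just an array of queue heads indexed by that hash.  A table
 * of a million such heads is a few tens of megabytes, which commodity
 * Intel hardware handles comfortably.
 */
#include <stdint.h>
#include <stdlib.h>

struct pkt;                          /* defined elsewhere */

struct flow_queue {
    struct pkt *head, *tail;         /* FIFO of this flow's packets */
    uint32_t    backlog_bytes;
    uint32_t    deficit;             /* for DRR-style scheduling */
};

struct flow_table {
    struct flow_queue *flows;
    uint32_t nflows;                 /* power of two */
};

struct flow_table *flow_table_alloc(uint32_t nflows)
{
    struct flow_table *t = malloc(sizeof(*t));
    if (!t)
        return NULL;
    t->flows = calloc(nflows, sizeof(t->flows[0]));
    if (!t->flows) {
        free(t);
        return NULL;
    }
    t->nflows = nflows;
    return t;
}

/* O(1) lookup: the flow hash picks the queue; collisions simply share. */
struct flow_queue *flow_lookup(struct flow_table *t, uint32_t hash)
{
    return &t->flows[hash & (t->nflows - 1)];
}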

As sort of a side note that doesn't really fit anywhere:

3) FQ, for lack of a better word, can act as a "step-down
transformer". Imagine, if you will, a TSO burst emitted at 10GigE
hitting a saturated 10GigE link with 1000 flows with FQ enabled. Each
packet from this flow will be slowed down and delivered at effectively
10Mbits/sec. At one level this is desirable, giving the ultimate
endpoint more time to maneuver. Breaking up the burst applies a form
of pacing, even if the bottleneck link is only momentarily saturated
with other flows.

At another level it isn't desirable, particularly in the case of
packet aggregation on the endpoint.

This burst break-up is, of course, something that already basically
happens on switched ports, by design.
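For concreteness, here are the numbers behind the "step-down
transformer" claim, as a small sketch. It assumes 1500-byte packets,
a fully saturated link, and perfectly fair round-robin service; real
scheduling is only approximately this clean.

/* Back-of-the-envelope for the "step-down transformer" effect:
 * with 1000 backlogged flows sharing a 10GigE bottleneck under
 * round-robin FQ, each flow is served roughly 1/1000 of the time,
 * so a TSO burst arriving at line rate is drained at an effective
 * per-flow rate of ~10 Mbit/s, one MTU-sized packet every ~1.2 ms.
 * Assumptions: 1500-byte packets, saturated link, ideal fairness.
 */
#include <stdio.h>

int main(void)
{
    const double link_bps  = 10e9;   /* 10GigE bottleneck */
    const double nflows    = 1000;   /* concurrently backlogged flows */
    const double pkt_bytes = 1500;

    double per_flow_bps   = link_bps / nflows;
    double pkt_spacing_ms = pkt_bytes * 8 / per_flow_bps * 1000.0;

    printf("effective per-flow rate: %.0f Mbit/s\n", per_flow_bps / 1e6);
    printf("spacing between a burst's packets: %.2f ms\n", pkt_spacing_ms);
    return 0;
}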

[1] Please note I just said per-flow queuing, not any particular form
of FQ algorithm, and had sort of segued into really high-end hardware.
At lower rates (say, below 1GigE) I am pretty happy with the
characteristics and defaults of fq_codel. I have not the foggiest idea
whether SFQ, DRR, fq_codel, QFQ, or any other FQ algorithm has
observably different behavior at rates higher than GigE. Some
preliminary indications show that DRR is better than per-packet
fairness, due to making GRO offloads work better on hosts receiving
packets. Work on cake continues... and it's the software rate limiter
that has cost me the most hair.

-- 
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67
