I thought every bullet point here was marvelous: http://www.ietf.org/mail-archive/web/aqm/current/msg01118.html
and would like to see it captured in a formal document somewhere, if it is not already captured in the ECN advocacy document. I have only three quibbles, one of them fairly major.

re: #5 Slow-starts{Note 3} cause spikes in delay.

- AQM without ECN cannot remove this delay, and typically AQM is designed to allow such bursts of delay in the hope they will disappear of their own accord.
- Flow queuing can remove the effect of these delay bursts on other flows, "but only if it gives all flows a separate queue from the start."

I don't think this last point is proven. Further, I am pretty sure that a fully dedicated queue per flow is kind of dangerous. I would have said:

- AQM *with* ECN cannot remove this delay, and typically AQM is designed to allow such bursts of delay in the hope they will disappear of their own accord. AQM without ECN, or with ECN overload protection, can make a dent in this delay, but not instantaneously.
- FQ can remove the effect of these delay bursts on other flows.

re: #6 Slow-starts{Note 3} can cause runs of losses, which in turn cause delays.

- AQM without ECN cannot remove these delays.
- Flow queuing cannot remove these losses, if self-induced.
- Delay-based SS like HSS can mitigate these losses, with increased risk of longer completion time.
- ECN can remove these losses, and the consequent delays.

Throughout much of our debates we have had a problem distinguishing delay within a flow from delay induced in other flows, and perhaps we should come up with a clean word for these two forms of induced delay. AQM with ECN in this case would reduce the amount of induced delay on that flow, but cause delay for other flows, and the overall rate would only be reduced by half, while perhaps 5 of the IW10 packets (as an example) could have been dropped (clearing the immediate congestion for another flow).

With the current overload protection in the different ECN-enabled AQM algorithms, different things happen, as I have noted elsewhere.
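To make the overload-protection behavior concrete, here is a toy sketch (my own illustration, not any deployed AQM's actual code; the threshold value and function names are made up) of the mark-vs-drop decision: an AQM that marks ECN-capable packets under mild congestion, but falls back to dropping even ECN-capable traffic once its drop probability crosses an overload threshold, in the spirit of what PIE does.

```python
# Toy illustration of ECN overload protection in an AQM.
# All names and the 10% threshold are hypothetical.
import random

OVERLOAD_THRESHOLD = 0.1  # hypothetical cut-off for ECN overload protection

def aqm_decision(ect_capable: bool, drop_prob: float,
                 rng: random.Random) -> str:
    """Return 'forward', 'mark', or 'drop' for one arriving packet."""
    if rng.random() >= drop_prob:
        return "forward"              # no congestion action this time
    if ect_capable and drop_prob < OVERLOAD_THRESHOLD:
        return "mark"                 # signal congestion without loss
    return "drop"                     # overloaded: drop even ECN traffic

if __name__ == "__main__":
    rng = random.Random(42)
    # Mild congestion: ECN flows see marks, never losses.
    mild = [aqm_decision(True, 0.05, rng) for _ in range(1000)]
    # Overload (e.g. a storm of slow-start bursts): marks turn into drops.
    heavy = [aqm_decision(True, 0.5, rng) for _ in range(1000)]
    print(mild.count("drop"), heavy.count("mark"))  # prints "0 0"
```

The point of the sketch is only the branch structure: below the threshold an ECN-capable flow pays no loss for its congestion signal; above it, ECN stops shielding the flow from drops.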
PIE very quickly starts dropping even ECN-marked packets when slammed with stuff in slow start, which is perhaps as it should be.

Re: "Whether flow queuing is applicable depends on the scale. The work I'm doing with Koen is to reduce the cost of the queuing mechanisms on our BNGs (broadband network gateways). We're trying to reduce the cost of per-customer queuing at scale, so per-flow queuing is simply out of the question. Whereas ECN requires no more processing than drop."

Three subpoints.

1) It turns out that the amount of packet inspection needed to pry apart a packet and mark it can be quite a lot at dequeue time, with additional memory accesses for configuration variables as well. Hashing and timestamping the headers at enqueue time is in some ways lighter weight, particularly if offloaded to the rx hardware.

There are other things besides queue algorithms that are pretty heavyweight in the code path, notably FIB lookups (recently massively improved in Linux 4.0), which can benefit from being parallelized, as they are with the current 10GigE hardware in most Intel systems, with 16 CPUs handling the load of, typically, 64 rx and tx queues. So it is a total-systems (Amdahl's law) sort of problem as to where the trade-offs are. I would be interested to know the CPU, network hardware, and memory design of your BNGs.

By this point I am painfully aware of how hard it is to do software rate limiting in Linux in particular. Doing it in hardware turned out to be straightforward (senic).

In the design of cake it was basically my hope to find a simple means to apply it to many, many customer-specific queues, but that requires a customer lookup service filter not yet designed, some attention to how rx queues are handled in the stack on a per-CPU basis, and perhaps some custom hardware. I look forward to trying it at 10GigE soon.
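A minimal sketch of the enqueue-time work described in subpoint 1 (my own illustration; the queue count matches fq_codel's default of 1024 flows, but the hash is a stand-in for the kernel's real 5-tuple hash): hash once and timestamp at enqueue, so that dequeue needs only a timestamp subtraction to get an fq_codel-style sojourn time, rather than deep packet inspection.

```python
# Toy sketch: flow hashing + timestamping at enqueue, sojourn time at dequeue.
# Names and the hash function are illustrative stand-ins.
import time
from collections import deque

NUM_QUEUES = 1024  # power of two, as in fq_codel's default flow count

queues = [deque() for _ in range(NUM_QUEUES)]

def flow_hash(src, dst, sport, dport, proto):
    # Stand-in for the kernel's Jenkins hash of the 5-tuple.
    return hash((src, dst, sport, dport, proto)) & (NUM_QUEUES - 1)

def enqueue(pkt, src, dst, sport, dport, proto):
    q = flow_hash(src, dst, sport, dport, proto)
    queues[q].append((time.monotonic(), pkt))   # timestamp at enqueue
    return q

def dequeue(q):
    ts, pkt = queues[q].popleft()
    sojourn = time.monotonic() - ts             # the only per-packet math
    return pkt, sojourn
```

In fq_codel the sojourn time feeds the CoDel control law at dequeue; here it is simply returned, to show how little work is left once the hash and timestamp were taken up front (or offloaded to the rx hardware).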
2) Per-flow queuing is a mere matter of memory organization after that point, and we already know how to scale that to millions of flows on a day-to-day basis on Intel hardware.[1]

As sort of a side note that doesn't really fit anywhere:

3) FQ, for lack of a better word, can act as a "step down transformer". Imagine, if you will, a TSO burst emitted at 10GigE hitting a saturated 10GigE link carrying 1000 flows with FQ enabled. Each packet from this flow will be slowed down and delivered at effectively 10Mbits/sec. At one level this is desirable, giving the ultimate endpoint more time to maneuver; breaking up the burst applies a form of pacing, even if the bottleneck link is only momentarily saturated with another flow (or flows). At another level it isn't desirable, particularly in the case of packet aggregation on the endpoint. This burst break-up is, of course, something that already basically happens on switched ports, by design.

[1] Please note I just said per-flow queuing, not any particular form of FQ algorithm, and had sort of segued into really high-end hardware. At lower rates (say, below 1GigE) I am pretty happy with the characteristics and defaults of fq_codel. I have not the foggiest idea whether sfq, drr, fq_codel, qfq, or any other FQ algorithm have observably different behavior at rates above GigE. Some preliminary indications show that DRR is better than per-packet fairness, due to making GRO offloads work better on hosts receiving packets.

Work on cake continues... and it's the software rate limiter that has cost me the most hair.

--
Dave Täht
Open Networking needs **Open Source Hardware**
https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67
_______________________________________________
aqm mailing list
aqm@ietf.org
https://www.ietf.org/mailman/listinfo/aqm