On Fri, Dec 13, 2013 at 12:15 AM, Dave Taht <[email protected]> wrote: > For starters, the codel signaling delay from the onset of continuous > over 5ms delay on packets defaults to target 100ms, not 200ms.
Interval 100ms, that is. It has been a long week. > I don't > know who started saying 200ms but even I started believing it with the > few brain cells I've had to spare of late. 5x a CDN RTT in a world of > 30-60k images sounds about right. > > Secondly, codel drops/marks from head, not from tail, so the signal > gets back to the sender in 1/2 the real physical RTT after that, > rather than from the tail of a queue that may be out of control at that > point. Much faster than PIE. > > There has been so much misinformation spread of late on these threads. > I'm hoping we're beginning to make a dent in it? I look forward to > making all this clear in the upcoming RFCs. I think I should stop now, > revisit the rest of this thread and see what else can be cleared up > before even beginning to tackle fq_codel after I get caught up on > sleep. > > As for your other comments... > > I have always said deploy RED and for that matter DRR, SFQ or SQF > where you can. I distinctly remember polling the crowd at the first > UKNOF I went to and being sad to discover only about 4% of the room > had deployed it (4 people). > > I DO hold that RED is too hard to configure for ordinary mortals, and > that it doesn't work at all on variable-bandwidth links like cable or > wireless, which happen to be the dominant form of end-user link > nowadays. > > As for the hysteresis "problem", in practice it doesn't seem to be > much of a problem. Things get well under control before a web page > completes. Same goes for my tests against DASH traffic. I have plenty > of plots and traces of this. Many are on the results webpage for > bufferbloat.net. > > As for a good default for interval, a good number IS dependent on your > RTT, and without coupling the ingress and output queues, it's > difficult to determine or even auto-tune that. Perhaps with connection > tracking or some other form of coupling, one day. The ACC code from > the Gargoyle router project is worth looking at. 
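For readers following along, the target/interval mechanics described above can be sketched in a few lines. This is only an illustrative simplification of CoDel's control law (names and structure are my assumptions, not the Linux sch_codel.c implementation, and the count-decay details are omitted):

```python
# Simplified sketch of CoDel's control law, run on the packet at the
# HEAD of the queue at dequeue time. Illustrative only.
TARGET = 0.005     # 5 ms: acceptable standing-queue (sojourn) delay
INTERVAL = 0.100   # 100 ms: on the order of a worst-case expected RTT

class CodelSketch:
    def __init__(self):
        self.first_above_time = 0.0  # deadline set when delay first exceeds TARGET
        self.drop_next = 0.0         # when the next drop is scheduled
        self.count = 0               # drops in the current dropping episode
        self.dropping = False

    def should_drop(self, sojourn_time, now):
        """Decide whether to drop/mark the head-of-queue packet."""
        if sojourn_time < TARGET:
            # Queue is draining fine; leave the dropping state.
            # (Real CoDel also carries some 'count' state across episodes.)
            self.first_above_time = 0.0
            self.dropping = False
            return False
        if self.first_above_time == 0.0:
            # Delay just crossed TARGET: tolerate it for one INTERVAL.
            self.first_above_time = now + INTERVAL
            return False
        if not self.dropping and now >= self.first_above_time:
            # Delay stayed above TARGET for a full INTERVAL: start dropping.
            self.dropping = True
            self.count = 1
            self.drop_next = now + INTERVAL / (self.count ** 0.5)
            return True
        if self.dropping and now >= self.drop_next:
            # Still too much delay: drop again on a shrinking schedule.
            self.count += 1
            self.drop_next = now + INTERVAL / (self.count ** 0.5)
            return True
        return False
```

Because the decision is taken on the head-of-queue packet, a drop or mark is seen by the sender roughly half an RTT sooner than a tail drop would be, which is the point being made above.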
> > I am satisfied that fq_codel can be deployed on fixed-rate lines > without any tuning on bandwidths ranging from 4mbit to 1gbit, today, > as it stands. I have done hundreds of thousands of tests to prove > that. Optimizations are helpful for the 3-band system that is mostly > deployed today, such as smaller quantums on slow asymmetric > links, and a smaller packet limit on low-memory routers. > > A larger target is working well on sub-4mbit links. I think that could > auto-tune better. > > A lower target and interval seem right for data center use, but I have > yet to get anyone to run my suite of published tests. > > A rate limiter is required to compensate for ISPs' lousy > DSLAM/CMTS/GPON head ends and CPE, at least until this code makes it > onto those devices. Long lead times predominate on this sort of > hardware - we have three years to get DOCSIS 3.1 right, as one > example. > > These are second-order problems that will be fixed over time. WiFi and > wireless remain problematic, but dents in those problems seem imminent > by next year, and many of the problems aren't AQM or packet scheduling > ones. > > SO. Damn straight, I'm one of the people pushing for deployment, > notably on boxes that are easy to upgrade and fix as we learn more > about what we should be doing. I'm definitely reluctant to hard-code > stuff into big iron or hard-to-replace firmware as yet. But as Matt > Mathis said at IETF - what we have is such an improvement over what is > in place today that it is time to deploy. After almost 3 years of > effort I'm happy to have a few million boxes in place to learn more > from. Aren't you? > > We just have a couple billion boxes left to fix. Plenty of time to > tweak things as we go along. If you want RED, or ARED, in Linux, it's > been fixed now for 2 years to perform to the spec. Go for it. If > you could create something to automate RED configuration as I have for > the ceroshaper tool in cerowrt, let me know. 
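To illustrate the rate-limiter point above: on Linux the usual workaround is to shape egress to just under the ISP's rate so the queue forms where fq_codel can manage it, rather than in the modem or CMTS. A minimal sketch (the device name and rates are placeholders to adjust for your link; this is not the ceroshaper script itself):

```shell
# Illustrative only: shape egress of eth0 to ~95% of a nominal 20mbit
# uplink, then attach fq_codel under the shaper so it owns the queue
# instead of the bloated buffer in the CPE or head end.
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 19mbit
tc qdisc add dev eth0 parent 1:10 fq_codel
```

Shaping a few percent below the line rate wastes a little bandwidth, but it moves the bottleneck (and therefore the queue) onto the box running the AQM.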
> > Any time someone has debloating code worth working on... I'm willing > to help. I've been helping on pie, and as you know I've been looking > over your DCTCP experiment carefully, finding and fixing bugs, and > moving the code forward to where it can be compared against a modern > kernel and a modern TCP and modern AQM and packet scheduling systems. > > > > > On Thu, Dec 12, 2013 at 4:05 PM, Bob Briscoe <[email protected]> wrote: >> Dave, >> >> >> At 22:11 12/12/2013, Dave Taht wrote: >>> >>> but quickly... >>> >>> Bob, I object to your characterization of users links being busy 1-3% >>> of the time. That's an average. >> >> >> I said it was an average. You're repeating and agreeing with what I said, >> but saying you object to me saying it? >> >> >>> When they are busy, they are very busy >>> for short periods, typically 2-16 seconds in the case of web traffic, >>> then idle for minutes. DASH traffic is busy for 2+ seconds every 10 on >>> a 20mbit link, and so on, for 1.5 hours or so. Etc. >> >> >> Yes, again, you're agreeing with me. >> >> The mean for a Web session is towards the low end of the 2-16 seconds range >> even now. And as we get the other latency-saving advances out there (e.g. >> removing TCP & TLS handshakes, proper pipelining, and a faster replacement >> for slow-start without overshoot), then there is potential for Web sessions >> to drop to 1-2 seconds or less, because they're usually a long way from >> being bandwidth limited. >> >> You said fq_codel decays out its memory in 800ms. So it will typically have >> lost all its memory when each Web transfer starts and when each DASH >> transfer re-starts. >> >> So I think you're agreeing that this 200ms signalling delay will be >> predominant? >> >> >>> In both cases >>> congestion exists, and in both cases AQM reaction time measured in >>> 200ms or so is still vastly superior to what happens today, and packet >>> scheduling masks it to a large extent. 
>> >> >> You're saying it's OK to propose a solution that delays signalling >> congestion for about 10 typical CDN RTTs,... because it's better than >> nothing? >> >> That's rich. I have to say this... You're one of the group of protagonists >> who has persuaded the world to embark on a programme of /implementation/ >> updates that will take years, and rubbished deploying the AQM that was >> already implemented (RED), even tho it was already much better than nothing >> too. >> >> Yes, auto-config for line rate is a nice bell to add to the bicycle. If we >> are going to embark on new implementations, auto-config for RTT is no less >> important. >> >> Lack of auto-config for either requires the fixed config to be at the slow >> end of the range. And both have a similar range of variability (the RTT >> range is actually wider). So lack of auto-config in either case leads to a >> similar order of unnecessary delay. Particularly given traffic is >> predominantly in short sparse bursts. >> >> >> >> Bob >> >> >> >> >>> Jim, yes, I was trying to establish the groundwork for ensuring >>> everyone really understood codel-by-itself before talking about >>> fq_codel. I'm still not sure that's been established. Anyone here care >>> to calculate the number of drops on two flows going through codel and >>> fq_codel, starting with iw10, over a 10mbit 2ms RTT link? And when, >>> approximately, the queue becomes ideal? And how often the fq_codel >>> "fast queue" gets used in this case? Gold stars for everyone that gets >>> it right. >>> >>> Jim, I'd like you to use larger download speeds than a mbit for your >>> examples. somewhere between 8 and 20mbit seems appropriate. (IMHO iw10 >>> should not be used on the modern internet on sub 10Mbit links.) The >>> dynamics change significantly as you get more bandwidth than iw10 >>> abuses. 
>>> >>> Anyway… >>> >>> After I got as far as describing fq_codel accurately in this thread, >>> then I'd hoped to be able to tackle the immediate ECN issue, the value >>> of randomness in pie, and the effectiveness and need for ECN on the >>> edge as it is currently defined. >>> >>> and I figured that then I might have written enough to get closer to an >>> rfc. >>> >>> and when I started kibitzing on this thread I thought I was talking to >>> the DCTCP case which I've spent a few months studying up on, and >>> looking over alternative ideas there, like >>> >>> http://conferences.sigcomm.org/co-next/2013/program/p49.pdf >>> http://conferences.sigcomm.org/co-next/2013/program/p151.pdf >>> >>> Scalable, Optimal Flow Routing in Datacenters via Local Link Balancing >>> >>> http://www.irt-systemx.fr/wp-content/uploads/2013/12/AINTEC.ppt >>> >>> Sigh. I'll >>> >>> On Thu, Dec 12, 2013 at 11:35 AM, Jim Gettys <[email protected]> wrote: >>> > >>> > >>> > >>> > On Wed, Dec 11, 2013 at 2:21 PM, Bob Briscoe <[email protected]> wrote: >>> >> >>> >> Jim, >>> >> >>> >> >>> >> At 16:55 11/12/2013, Jim Gettys wrote: >>> >> >>> >> >>> >> >>> >> On Tue, Dec 10, 2013 at 10:04 PM, Bob Briscoe <[email protected]> >>> >> wrote: >>> >> Jim, >>> >> >>> >> I'm just checking we're not talking past each other. I'll repeat two >>> >> quotes from each of us, then comment. >>> >> >>> >> On Thu, Dec 5, 2013 at 1:13 PM, Bob Briscoe <[email protected]> wrote: >>> >> >>> >> 3{New}. It SHOULD be possible to make different instances of an AQM >>> >> algorithm apply to different subsets of packets that share the same >>> >> queue. 
>>> >> It SHOULD be possible to classify packets into these subsets at least >>> >> by ECN >>> >> codepoint [RFC3168] and Diffserv codepoint [RFC2474] (or the equivalent >>> >> of >>> >> these fields at lower layers), >>> >> >>> >> >>> >> At 19:50 05/12/2013, Jim Gettys wrote: >>> >> >>> >> "Certainly, it may be the same instance of an >>> >> AQM algorithm, rather than different instances, for example." >>> >> >>> >> >>> >> That's true of course, but the case with one AQM handling all packets >>> >> within a queue is the norm. I want to check you're happy with the >>> >> converse: >>> >> 1) A set-up more like WRED, which was based on Dave Clark's RIO (RED >>> >> with >>> >> in and out of contract). So we can have WPIE, WCoDel etc. where the >>> >> differentiation between aggregates is provided by different AQM >>> >> instances in >>> >> the same queue, not by different queues with different scheduling >>> >> priorities. >>> >> 2) Extending this so that AQM differentiation can be between >>> >> ECN-capable >>> >> and Not-ECN-capable aggregates, not just between Diffserv classes (an >>> >> example being CoDel with a lower 'interval' for ECN-capable packets). >>> >> >>> >> I presented the evaluations of this last idea in tsvwg on the final >>> >> Friday >>> >> of the Vancouver IETF - I don't think you were there. < >>> >> http://www.ietf.org/proceedings/88/slides/slides-88-tsvwg-20.pdf > >>> >> >>> >> >>> >> Yes, unfortunately I had to leave before the Friday session. >>> >> This is my primary motivation for this wordsmithing - I'm trying to allow >>> >> us >>> >> to move towards zero signalling delays in CoDel, PIE and RED (currently >>> >> defaults of 200ms, 100ms and 512 packets respectively, which are not >>> >> good for >>> >> dynamics). 
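Bob's "different AQM instances for subsets of packets sharing the same queue" can be sketched as: classify on the ECN (or Diffserv) bits, keep a single queue for ordering, but let each subset's own AQM instance make the mark/drop decision. A toy illustration only; the class names, field names, and delay thresholds are assumptions, and the stand-in AQM is deliberately trivial:

```python
# Toy sketch: one shared queue, but separate AQM state per packet subset
# (here ECN-capable vs. not), so e.g. ECN traffic can get a tighter limit.

class ToyAqm:
    """Stand-in for any AQM (RED/CoDel/PIE): signal when delay > limit."""
    def __init__(self, delay_limit):
        self.delay_limit = delay_limit

    def should_signal(self, queue_delay):
        return queue_delay > self.delay_limit

class SharedQueue:
    def __init__(self):
        # Independent AQM instances over the SAME queue: a tighter limit
        # for ECN-capable packets (a mark is cheap, so signal early) and
        # a looser one for non-ECN packets (a drop is costly).
        self.aqm = {"ect": ToyAqm(delay_limit=0.005),
                    "not_ect": ToyAqm(delay_limit=0.020)}

    def classify(self, packet):
        # Classification on the ECN codepoint; could equally use DSCP.
        return "ect" if packet.get("ecn_capable") else "not_ect"

    def on_dequeue(self, packet, queue_delay):
        inst = self.aqm[self.classify(packet)]
        if inst.should_signal(queue_delay):
            return "mark" if packet.get("ecn_capable") else "drop"
        return "forward"
```

The point is that the differentiation lives in the AQM state, not in separate queues with separate scheduling priorities, so packet ordering within the queue is untouched.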
>>> >> >>> >> >>> >> Certainly signalling delays are very important: this is why I'm >>> >> favorably >>> >> inclined to "head mark/drop", as it signals TCP as quickly as possible, >>> >> keeping the response of the TCP feedback loop as tight as possible (and >>> >> part >>> >> of why I like CoDel so much for the highly variable bandwidth problem >>> >> we >>> >> face at the edge of the net). >>> >> >>> >> It's *really* important that when the bandwidth drops suddenly, >>> >> everyone gets told to slow down quickly (exactly how quickly probably >>> >> depends on the propagation change characteristics of the medium), or >>> >> packets >>> >> can pile up in a big way. >>> >> >>> >> How quickly the mark/drop algorithm can figure out that signalling is >>> >> appropriate is the *other* piece of getting good dynamics. Here I >>> >> don't >>> >> doubt in the slightest that something better than CoDel may be >>> >> discovered. >>> >> It takes a CoDel instance (within an fq structure) 200ms from its queue >>> >> first passing 'threshold' before it will ever drop the first packet >>> >> (unless >>> >> the queue hits taildrop before that). So if the RTT is 20ms, that's >>> >> 220ms >>> >> signalling delay. >>> > >>> > >>> > No, again, see Dave's mail, and you are missing the flow scheduling >>> > aspect >>> > of this and thinking in terms of a single queue and the usual mark/drop >>> > cases; this is the exact cognitive problem I'm talking about. >>> > >>> > The flow scheduling aspect of fq_codel is *more* important than which >>> > mark/drop algorithm decides when to signal the TCPs to regulate their servo >>> > systems. I really wish this algorithm had been called "fs_codel", >>> > rather >>> > than "fq_codel", as it is so easy to confuse "fair" with "flow", and >>> > "queue" >>> > with "scheduling". >>> > >>> > Regulating TCP is absolutely essential to keep TCP "sane" and >>> > responsive, >>> > but it isn't the most important part of what is going on. 
>>> > >>> > Here's the case of a new TCP flow on a previously idle link: After the >>> > TCP >>> > open handshake, you will have no more than 4 or (if IW10 is active) 10 >>> > packets >>> > for that flow in the queue (actually, 3 packets, as presumably at least >>> > one >>> > packet is in process of transmission; @ 1Mbps, that first full-size >>> > packet >>> > takes 13ms). >>> > >>> > If another flow starts, it will get preferentially scheduled ahead of the >>> > existing >>> > flow(s) until it has also built a queue, at which time it competes >>> > "fairly" >>> > with the other flows in the packet scheduling. >>> > This means, in the simple low bandwidth case, that as soon as you start >>> > the >>> > second flow, it gets the next possible opportunity to be transmitted in >>> > preference to any flow that has built a queue. >>> > >>> > Here's the kicker: >>> > >>> > We've just avoided 3 packets' worth of "head of line blocking" latency for >>> > the >>> > second flow to "do its thing". @1Mbps, that is 40ms (3*13ms) saved just >>> > in >>> > the TCP open, and more in that the first packet from the second flow >>> > then >>> > gets scheduled immediately, getting that TCP moving. >>> > >>> > Similarly for your VOIP packet; it saves those 40ms. And your bulk flow >>> > gets >>> > its first packet through ASAP as well; for web traffic, that usually >>> > contains the size information or other metadata required for the web >>> > browser >>> > to unblock. And that is for a *single* competing TCP connection. >>> > >>> > Now, let's look at today's web: there are many embedded objects in a >>> > page. >>> > Once the base page is downloaded, the web browser (presuming its DNS >>> > lookups >>> > are already cached), opens a pile of connections all at once. Whether I >>> > like it or not (which I don't, as I tried to prevent it with pipelining >>> > in >>> > HTTP/1.1 in the 1990s), you can easily have 5-30 TCP connections start >>> > almost simultaneously. 
See the connections/page plot on my web page >>> > explaining all this stuff in more detail at: >>> > >>> > http://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/ >>> > >>> > Without any TCP flow control ever happening, I can easily get 40 (or >>> > even >>> > several hundred packets!) arriving at near line rate (the initial window >>> > * >>> > number of embedded objects). Most of these TCP connections *won't ever >>> > even >>> > get out of slow start*; the objects are typically pretty small. I've >>> > observed transient latency from HOL blocking of > 100ms on 50Mbps cable >>> > service, on some web pages. >>> > >>> > Nothing we will/can do at the TCP level can help this situation!!! >>> > Nothing, >>> > other than in the long term getting the incentives right so that fewer >>> > TCP >>> > connections might be preferable to gaming TCP, as the web already has. >>> > >>> > That CoDel may take a while to figure out that it should start marking >>> > in >>> > the idle case is really pretty irrelevant; the packets have already >>> > arrived >>> > and unless we do better packet scheduling, you are fully stuck. Dave's >>> > also trying to explain that a number of people's understanding of how >>> > CoDel >>> > works has been wrong. >>> > >>> > But as I say, *the details of the mark/drop algorithm don't matter*. >>> > Having 3, or 100, packets queued in a FIFO queue will already mean you >>> > are >>> > screwed for anything low latency at the bandwidths we all care about. >>> > >>> > Once a flow drains to zero, it again gets treated as a new flow. >>> > And what we choose to define a "flow" to be is arbitrary, though the >>> > code >>> > today does the usual 5-tuple. >>> > >>> >> In fq_codel this creates considerable self-delay for short flows or r-t >>> >> apps, which kill their own latency before they get any loss signal to >>> >> tell >>> >> them to slow down. >>> > >>> > >>> > Not likely in the common case. 
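The new-flow preference described above can be sketched with a two-list round-robin, loosely in the spirit of fq_codel's new_flows/old_flows lists (a simplified model only, not the kernel code; real fq_codel serves byte quantums rather than single packets and hashes the 5-tuple):

```python
from collections import deque

# Simplified sketch of fq_codel-style flow scheduling: a flow that just
# (re)appeared gets served before flows that have already built a queue.

class FlowScheduler:
    def __init__(self):
        self.flows = {}           # flow id -> deque of packets
        self.new_flows = deque()  # flows that just (re)appeared
        self.old_flows = deque()  # flows past their first service

    def enqueue(self, flow_id, packet):
        if flow_id not in self.flows or not self.flows[flow_id]:
            self.flows[flow_id] = self.flows.get(flow_id, deque())
            if flow_id not in self.new_flows and flow_id not in self.old_flows:
                # A flow that drained to zero is treated as new again.
                self.new_flows.append(flow_id)
        self.flows[flow_id].append(packet)

    def dequeue(self):
        # new_flows is always served before old_flows: this is the
        # HOL-blocking avoidance the text describes.
        for lst in (self.new_flows, self.old_flows):
            while lst:
                fid = lst[0]
                if self.flows[fid]:
                    pkt = self.flows[fid].popleft()
                    if lst is self.new_flows:
                        # After its first service it competes fairly.
                        self.new_flows.popleft()
                        self.old_flows.append(fid)
                    else:
                        lst.rotate(-1)  # round-robin among old flows
                    return pkt
                lst.popleft()  # drained flow: remove from the schedule
        return None
```

With this structure, a second flow's first packet jumps ahead of a bulk flow's backlog, which is exactly the 40ms-saved example above.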
See Dave's comments and also note that >>> > the >>> > initial burst of web page load runs the queue up high immediately. >>> > >>> > And as I keep saying, and will say again, the flow queuing decisions >>> > avoiding HOL blocking, explained above, are much more important in dealing with >>> > the >>> > latency problems. >>> > >>> > So go invent better algorithms than CoDel for drop/mark, that can be >>> > applied >>> > to the flow queuing parts of the algorithm, and I'm happy. Running code >>> > please. >>> > >>> > >>> > >>> >> >>> >> Even for elastic flows, with congestion signals delayed by so much, >>> >> they >>> >> risk hitting themselves with a huge train of overshoot loss. This would >>> >> be >>> >> the same for fq_pie, except the number is 100ms + RTT. >>> > >>> > >>> > When fq_pie exists to test, I'll be happy to see it. >>> > >>> > I keep saying, and I'll say again: while some mark/drop algorithm needs >>> > to >>> > exist, flow scheduling is more important to getting good latency.... >>> > >>> > >>> >> >>> >> >>> >> Yes, the e2e transport could measure delay growth, but it doesn't know >>> >> whether the delay is coming from a queue that is isolated from others >>> >> or >>> >> not. So it doesn't want to slow down too quickly in response to delay >>> >> growth >>> >> in case it gets screwed by other traffic. I.e. using delay growth as a >>> >> signal >>> >> entails considerable signalling delay due to all the uncertainty. >>> >> >>> >> The proposal you missed in tsvwg was to define ECN as an immediate >>> >> signal >>> >> from the network, 'interval'=0 in CoDel terms, so the host always gets >>> >> congestion signals as fast as possible, and if it needs bursts of >>> >> signals >>> >> smoothed out, it can do that itself. >>> > >>> > >>> > Yes, I get that. I got that a year or more ago. The idea has potential >>> > merit. ECN as it is currently defined is not so useful. 
>>> > >>> > And I wish it would get a different name so we could more easily talk >>> > about >>> > the two different things now being called ECN. >>> > >>> > I can see having ECN marks on a burst of packets may be helpful in >>> > having >>> > the receiver judge things in a highly variable wireless scenario; it may >>> > have additional information about the medium and know that that >>> > transient >>> > has gone away, and that it may not be a wise idea to slow the connection >>> > at >>> > all. >>> > >>> > >>> >> >>> >> The suggested wording ensures all AQM implementations will allow >>> >> operators, vendors and users to configure such a mechanism. But I've >>> >> generalised it from ECN to Diffserv too (because the implementation >>> >> would be >>> >> no different). >>> > >>> > >>> > As noted before, Diffserv is still interesting, even though the packet >>> > scheduling in fq_codel (or similar algorithms) makes it much less >>> > necessary. >>> > There are two aspects to this: >>> > >>> > 1) higher-priority contention for the medium. If I have a real VOIP >>> > packet, >>> > there are ways I can ask for higher-priority access to the medium and >>> > reduce >>> > the total number of transmit opportunities my traffic requires (and that >>> > is >>> > often the scarcest resource on WiFi or DOCSIS). >>> > 2) any hint I can get helps (at the edge) so I can distinguish those >>> > packets >>> > from the way the web has been gaming the network. >>> > >>> > Even so, for Diffserv to be safely trusted and honored even in your >>> > home, >>> > the end user (who is the network operator in this case) will have to be >>> > able >>> > to know that a device or application is using it and control whether or >>> > not >>> > it's honored. Unless it is under the network operator's control (in this >>> > case, you the home user), Diffserv can/will get gamed to uselessness by >>> > application and device manufacturers. 
Ergo, Dave and I hack on home >>> > routers again: this level of control is not currently present in these >>> > devices, and must be for Diffserv to be useful. >>> > >>> >> >>> >> >>> >> >>> >> My basic issue is one of terminology: people have talked about "best >>> >> effort" queues. In reality, this is a "class" of service, rather than >>> >> a >>> >> single queue, and when you get into the mental model of BE being a >>> >> single >>> >> queue (rather than a set of queues), it can lead one astray quickly and >>> >> easily. >>> >> >>> >> >>> >> Yeah, I know this. I suspected we were talking past each other. >>> >> >>> >> I need you to allow the other case into your mind for this >>> >> conversation. >>> >> The wording is specifically about the case where "different subsets of >>> >> packets ... share the same queue". >>> > >>> > >>> > And the word "queue" in most people's minds implies ordering, and FIFO >>> > behavior. This is the terminological issue I'm harping on. It's also >>> > why I >>> > think "bufferbloat" is a better term for our situation than "queue >>> > bloat", >>> > which you liked and have harped at me about. Buffers don't have such an >>> > implication of ordering. >>> > So if you talked about a buffer here, rather than a queue, I'd be a lot >>> > happier. At least in my mind, queues are ordered. >>> > >>> >> >>> >> We can talk about an fq structure for this another time, but it's a >>> >> really >>> >> complicated way of doing it. >>> >> >>> >> Given that simple looks like it could work, why get complicated already? >>> > >>> > >>> > Because the flow scheduling is such a win. You can't solve the whole >>> > problem just with mark/drop algorithms and FIFO queues and get reliably >>> > to >>> > decent latencies. >>> > >>> > Now, whether Fred's document can/should go into anything like that level of >>> > detail is far from clear (arguably not). 
>>> > >>> > I just don't want to further the mythology that we can get to decent >>> > latencies at >>> > the edge of the network while continuing with FIFO queues and an AQM >>> > algorithm. >>> > It's clear that many haven't yet internalized that it's flow >>> > scheduling >>> > combined with a self-tuning adaptive mark/drop algorithm that we must >>> > have, >>> > and that the flow scheduling makes the biggest difference. >>> > >>> >> >>> >> >>> >> It's really easy to fall into the idea of a single software queue >>> >> mapping >>> >> to some single hardware-supported queue, and that's a cognitive >>> >> mistake, as >>> >> aggregating MACs are showing us; transmit ops are often the scarcest >>> >> resource... >>> >> >>> >> >>> >> It's only a cognitive mistake if one is not aware of all the options. >>> >> I'm >>> >> fully aware of all the options. >>> >> >>> >> To be specific, a queue into a wireless medium should be configured so >>> >> it >>> >> holds some 'good queue' in reserve for transmit ops, but the queues on >>> >> top >>> >> of this that TCP self-inflicts even briefly are not 'good queues' even >>> >> if >>> >> they are isolated from other flows by fq - VJ was wrong to generalise >>> >> the >>> >> phrase 'good queue' to all bursts of queue - it is only necessary to >>> >> hold >>> >> back from signalling and allow a burst of queue if the only possible >>> >> signal >>> >> is a drop. With ECN, you don't have this dilemma. This is the key to >>> >> rapid >>> >> dynamics. >>> > >>> > >>> > But you also then have to solve the HOL blocking problem, and do so >>> > urgently. Ergo flow scheduling.... To get to where we need to go, you >>> > have >>> > to worry about the order in which each packet is scheduled. 
>>> > >>> >> >>> >> >>> >> Diffserv marking has the potential to give a "hint" to distinguish how >>> >> particular flows should be handled (scheduled) in a service class, and >>> >> as my >>> >> previous example shows, that hint may be very useful in channel access >>> >> decisions (e.g. voip on 802.11). >>> > >>> > >>> > ECN doesn't help a bit with the head of line blocking problem; it >>> > actually >>> > will make it worse with FIFO scheduling. ECN means that you can't even >>> > get >>> > the packets out of the way. >>> > >>> > Which says you'd better be doing more clever scheduling than a FIFO. >>> > >>> > If you want ECN to be usable in the way you want it to be at low >>> > bandwidths, >>> > you better become a fan of flow scheduling... >>> >> >>> >> >>> >> But fq_codel teaches the lesson that packet scheduling combined with >>> >> keeping TCP sane is a key improvement over handling either problem >>> >> apart... >>> >> In particular, the first packets of new flows/reappearing flows are >>> >> vastly >>> >> more "important" than other packets in terms of the latency cost to >>> >> users of >>> >> that service. Each flow has in essence its own queue in this service >>> >> class, >>> >> and we're using information from that to help schedule the packets in >>> >> ways >>> >> that minimize latency to the user. >>> >> >>> >> >>> >> I know all this. Please can we keep to the conversation about how to >>> >> avoid >>> >> the 200ms signalling delay that fq_codel inflicts on each flow (and the >>> >> similar signalling delays that other AQMs inflict). >>> > >>> > >>> > As I said, it's not the big problem we have today at the edge of the >>> > network, and Dave's mail explains your model of CoDel isn't so correct. >>> > >>> > It's only very long lived flows where the signalling even matters, and >>> > that's not what we get at the edge of today's network; instead, we get a >>> > mix >>> > of a few larger (e.g. 
video player flows) with the DOS attack that is >>> > web >>> > traffic, with some isochronous VOIP and teleconferencing traffic. >>> >> >>> >> >>> >> >>> >> >>> >> So in this case, a single algorithm is acting over a bunch of flows in >>> >> a >>> >> single class of service, and both scheduling packets among the flows, >>> >> and >>> >> signalling TCP flows appropriately when they should "slow down". >>> >> >>> >> >>> >> Yup, I know this. >>> >> >>> >> >>> >> >>> >> So I think you and I are on close to the same page (but have been >>> >> burned >>> >> badly in the past by terminology issues getting in the way). On >>> >> HTTP/1.1 we >>> >> wasted probably > 2 years talking past each other because we didn't >>> >> have >>> >> clear and concise terminology that we all understood the same way. >>> >> >>> >> >>> >> As I thought, we are talking past each other. >>> > >>> > >>> > Yes, in part because I think few have internalized what the web has done >>> > to >>> > the edge of the network. It isn't what I had hoped it would be when I was >>> > HTTP/1.1 editor. Why this outcome occurred is too long a discussion for >>> > this thread. >>> > >>> > >>> >> We need to be able to have a conversation that is not always "Hmm, >>> >> that >>> >> sounds like it might be interesting. Can I tell you about fq_codel >>> >> now?" >>> > >>> > >>> > Running code of other algorithms is very welcome. Fq_codel is running code. >>> > Pie is running code. >>> > >>> > Maybe fq_pie will be running code we can test someday. >>> > >>> > Even then, I want a low target latency so individual TCPs are kept >>> > responsive, and I need an algorithm that can keep up quickly with the >>> > dynamics of wireless, so most simplistic tests will not be useful (we >>> > don't >>> > have such good evaluation tests today). >>> > >>> > When it is, and if it is better than fq_codel, I'll be happy. But the >>> > mark/drop part of the algorithm *isn't* the most important part of the >>> > algorithm. 
The packet scheduling decisions are.... >>> > - Jim >>> > >>> >> >>> >> >>> >> >>> >> >>> >> Bob >>> >> >>> >> >>> >> >>> >> And I don't claim I have the right terminology for all this stuff, >>> >> either >>> >> (even in this mail). >>> >> >>> >> Which is why I was loathe to suggest exact text... >>> >> - Jim >>> >> >>> >> >>> >> >>> >> At 19:50 05/12/2013, Jim Gettys wrote: >>> >> >>> >> >>> >> >>> >> On Thu, Dec 5, 2013 at 1:13 PM, Bob Briscoe <[email protected]> wrote: >>> >> Fred, Gorry, all, >>> >> I promised to suggest text for draft-ietf-aqm-recommendation about >>> >> allowing the AQM's behaviour to be independent for ECN and non-ECN >>> >> packets. >>> >> In the process, I realised we can't talk about independent AQMs for ECN >>> >> without also including Diffserv. >>> >> This gets messy, because I believe a good AQM for BE traffic with and >>> >> without ECN, should remove much if not all the need for Diffserv. But >>> >> we >>> >> can't ignore Diffserv. >>> >> >>> >> >>> >> I agree in principle with what Bob is trying to say here (and is very >>> >> much >>> >> what I've been saying in my blog entry of last summer). >>> >> >>> >> Once you have things under control, the need for Diffserv diminishes >>> >> dramatically (but does not go away). >>> >> >>> >> But as Bob notes, there is still a good use for Diffserv: suitably >>> >> marked >>> >> traffic may want to contend for access to the channel differently: your >>> >> marked VOIP packets may want to change the priority with which you >>> >> request >>> >> channel access, so that you get more timely access to the medium. This >>> >> conserves transmit opportunities, which is often the scarcest resource >>> >> in >>> >> many systems (e.g. 802.11, DOCSIS, etc.). This can be the difference >>> >> between >>> >> your VOIP working well, and not working well, on a busy 802.11 network >>> >> as >>> >> well as using the channel as efficiently as possible. 
>>> >> >>> >> Similarly, if you have packets you know are background, it's helpful to >>> >> know that to ensure that they never contend for access to the medium >>> >> but >>> >> will always defer to other traffic, and just scavenge available space >>> >> in >>> >> other transmit opportunities where possible. >>> >> >>> >> I'm a bit loath, though, to tie the behavior to queues; in >>> >> particular, best-effort traffic may want to be sent in the same >>> >> aggregate as >>> >> higher (or lower) priority traffic, if there is remaining space in the >>> >> aggregate. >>> >> >>> >> In short, the mental model we've had that there is a one-to-one model >>> >> of >>> >> hardware and software queues (not to mention flows in a given software >>> >> queue) is often incorrect (or at least seriously sub-optimal) in >>> >> today's >>> >> systems (even if the hardware queues "work" properly, which it appears >>> >> they >>> >> do not in 802.11). >>> >> >>> >> So I'm not sure Bob's new section 3 here is how best to state this >>> >> (or >>> >> to deal with the terminology problem). Certainly, it may be the same >>> >> instance of an AQM algorithm, rather than different instances, for >>> >> example. >>> >> And "It SHOULD be possible" is more a pious wish than anything else. But I >>> >> agree in spirit with what Bob's trying to say. >>> >> - Jim >>> >> >>> >> >>> >> >>> >> _________________________________________________________________________________________ >>> >> {In Section 4: add another bullet between recommendations 2 & 3:} >>> >> 3{New}. It SHOULD be possible to make different instances of an AQM >>> >> algorithm apply to different subsets of packets that share the same >>> >> queue. >>> >> It SHOULD be possible to classify packets into these subsets at least >>> >> by ECN >>> >> codepoint [RFC3168] and Diffserv codepoint [RFC2474] (or the equivalent >>> >> of >>> >> these fields at lower layers). 
>>> >> {Then a new section to expand on this before the current Section 4.3.} >>> >> 4.3{New}. Independent AQM Instances for ECN and Diffserv >>> >> The recommendation to provide a separate instance of the AQM for ECN >>> >> packets goes beyond the assumptions of RFC 3168, which assumed that >>> >> only one >>> >> instance of an AQM will handle both ECN-capable and non-ECN-capable >>> >> packets. >>> >> >>> >> Bob >>> >> >>> >> ________________________________________________________________ Bob >>> >> Briscoe, BT >>> >> _______________________________________________ aqm mailing list >>> >> [email protected] https://www.ietf.org/mailman/listinfo/aqm >>> >> >>> > I can imagine many other possible algorithms than CoDel for the >>> > mark/drop >>> > algorithm; we happen to like CoDel atm for two reasons: 1) it is >>> > self-adapting to the line rate, and 2) head mark/drop signals the TCPs as >>> > soon >>> > as a decision is made rather than possibly being applied much later >>> > (such as >>> > random or tail drop). We welcome and encourage further essays in the >>> > art. 
-- Dave Täht Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html _______________________________________________ aqm mailing list [email protected] https://www.ietf.org/mailman/listinfo/aqm
