Re: TCP congestion control and large router buffers
Some more historical pointers: if you want to look at the early history of the latency discussion, look at Stuart Cheshire's famous rant "It's the Latency, Stupid" (http://rescomp.stanford.edu/~cheshire/rants/Latency.html). Then look at Matt Mathis's 1997 TCP equation (and the 1998 Padhye-Firoiu version of it): throughput is proportional to the inverse square root of the packet loss rate and to the inverse RTT -- so as the RTT starts growing due to increasing buffers, the packet loss must grow to keep equilibrium!

By the late 1990s we understood pretty well that you have to drop packets in order to limit queueing. E.g., RFC 3819 contains an explicit warning against keeping packets for too long (section 13).

But, as you notice, on faster networks the bufferbloat effect can be limited by intelligent window-size management, and the then-dominant Windows XP was not intelligent, just limited in its widely used default configuration. So the first ones to fully see the effect were those with many TCP connections, i.e. BitTorrent users.

The modern window-size tuning schemes in Windows 7 and Linux break a lot of things -- you are just describing the tip of the iceberg here. The IETF working group LEDBAT (motivated by the BitTorrent observations) has been working on a scheme to run large transfers without triggering humongous buffer growth.

Gruesse, Carsten
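[Editor's note: Mathis's relationship is easy to play with numerically. A minimal Python sketch; the 1460-byte MSS and the RTT/loss figures are illustrative assumptions, and C ~ 1.22 comes from the periodic-loss model, so treat this as a rough estimate only.]

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
    # Mathis et al. (1997) approximation: rate ~ C * MSS / (RTT * sqrt(p)).
    # C ~ 1.22 (sqrt(3/2)) for the periodic-loss model; a rough model only.
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

# A 1460-byte-MSS flow with 1% loss: inflating the RTT from 50 ms to
# 500 ms (queueing delay from oversized buffers) cuts the rate 10x.
print(mathis_throughput_bps(1460, 0.050, 0.01))  # ~2.85 Mbit/s
print(mathis_throughput_bps(1460, 0.500, 0.01))  # ~285 kbit/s
```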
Re: TCP congestion control and large router buffers
On 12/21/2010 04:24 PM, Fred Baker wrote:

On Dec 20, 2010, at 11:18 PM, Mikael Abrahamsson wrote:

On Mon, 20 Dec 2010, Jim Gettys wrote: Common knowledge among whom? I'm hardly a naive Internet user.

Anyone actually looking into the matter. The Cisco fair-queue command was introduced in IOS 11.0 according to http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249 to somewhat handle the problem. I have no idea when this was in time, but I guess the early '90s?

1995. I know the guy that wrote the code. Meet me in a bar and we can share war stories. The technology actually helps with problems like the ones RFC 6057 addresses pretty effectively.

[If you think that] is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.

I'm happy you were there, I was under the impression that routers had large buffers back then as well?

Not really. Yup, several of us were there. The common routers on the NSFNET and related networks were fuzzballs, which had 8 (count them, 8) 576-byte buffers, Cisco AGS/AGS+, and Proteon routers. The Cisco routers of the day generally had 40 buffers on each interface by default, and might have had configuration changes; I can't comment on the Proteon routers. For a 56 KBPS line, given 1504 bytes per message (1500 bytes of IP+data, and four bytes of HDLC overhead), that's theoretically 8.5 seconds. But given that messages were in fact usually 576 bytes of IP data (cf. fuzzball and unix behavior for off-LAN communications) and interspersed with TCP control messages (ACKs, SYNs, FINs, RSTs), real queue depths were more like two seconds at a bottleneck router. The question would be the impact of a sequence of routers all acting as bottlenecks.

IMHO, AQM (RED or whatever) is your friend. The question is what to set min-threshold to. Kathy Nichols (Van's wife) did a lot of simulations.
I don't know that the paper was ever published, but as I recall she wound up recommending something like this:

line rate (MBPS)    RED min-threshold (ms of queue depth)
      2                  32
     10                  16
    155                   8
    622                   4
  2,500                   2
 10,000                   1

I don't know if you are referring to the "RED in a different light" paper: that was never published, though an early draft escaped and can be found on the net.

"RED in a different light" identifies two bugs in the RED algorithm, and proposes a better algorithm that only depends on the link output bandwidth. That draft still has a bug. There is an (almost completed) version of the paper that never got published; Van has retrieved it from backup, and I'm trying to pry it out of Van's hands to get it converted to something we can read today (it's in FrameMaker).

In the meanwhile, turn on (W)RED! For routers run by most people on this list, it's always way better than nothing, even if Van doesn't think classic RED will solve the home router bufferbloat problem (where we have 2 orders of magnitude variation of wireless bandwidth along with a highly variable workload). That's not true in the internet core.

But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40 ms and 100% drop probability at 200 ms (and a linear increase between).

So, min-threshold=40 ms and max-threshold=200 ms. That's good on low-speed links; it will actually control queue depths to an average of O(min-threshold) at whatever value you set it to. The problem with 40 ms is that it interacts poorly with some applications, notably voice and video. It also doesn't match well to published studies like http://www.pittsburgh.intel-research.net/~kpapagia/papers/p2pdelay-analysis.pdf. In that study, a min-threshold of 40 ms would have cut in only on six few-second events in the course of a five-hour sample.
If 40 ms is on the order of magnitude of a typical RTT, it suggests that you could still have multiple retransmissions from the same session in the same queue.

A good photo of buffer bloat is at ftp://ftpeng.cisco.com/fred/RTT/Pages/4.html and ftp://ftpeng.cisco.com/fred/RTT/Pages/5.html. The first is a trace I took overnight in a hotel I stayed in. Never mind the name of the hotel, it's not important. The second is the delay distribution, which is highly unusual - you expect to see delay distributions more like ftp://ftpeng.cisco.com/fred/RTT/Pages/8.html (which actually shows two distributions - the blue one is fairly normal, and the green one is a link that spends much of the day chock-a-block).

Thanks, Fred! Can I use these in the general bufferbloat talk I'm working on, with attribution? It's a far better example/presentation in graphic form than I currently have for the internet core case (where I don't even have anything other than the memory of probing the hotel's ISP's network).
Re: TCP congestion control and large router buffers
On Dec 22, 2010, at 8:48 AM, Jim Gettys wrote: I don't know if you are referring to the "RED in a different light" paper: that was never published, though an early draft escaped and can be found on the net.

Precisely.

"RED in a different light" identifies two bugs in the RED algorithm, and proposes a better algorithm that only depends on the link output bandwidth. That draft still has a bug. There is an (almost completed) version of the paper that never got published; Van has retrieved it from backup, and I'm trying to pry it out of Van's hands to get it converted to something we can read today (it's in FrameMaker).

In the meanwhile, turn on (W)RED! For routers run by most people on this list, it's always way better than nothing, even if Van doesn't think classic RED will solve the home router bufferbloat problem (where we have 2 orders of magnitude variation of wireless bandwidth along with a highly variable workload). That's not true in the internet core.

But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40 ms and 100% drop probability at 200 ms (and a linear increase between).

So, min-threshold=40 ms and max-threshold=200 ms. That's good on low-speed links; it will actually control queue depths to an average of O(min-threshold) at whatever value you set it to. The problem with 40 ms is that it interacts poorly with some applications, notably voice and video. It also doesn't match well to published studies like http://www.pittsburgh.intel-research.net/~kpapagia/papers/p2pdelay-analysis.pdf. In that study, a min-threshold of 40 ms would have cut in only on six few-second events in the course of a five-hour sample.

If 40 ms is on the order of magnitude of a typical RTT, it suggests that you could still have multiple retransmissions from the same session in the same queue.
A good photo of buffer bloat is at ftp://ftpeng.cisco.com/fred/RTT/Pages/4.html and ftp://ftpeng.cisco.com/fred/RTT/Pages/5.html. The first is a trace I took overnight in a hotel I stayed in. Never mind the name of the hotel, it's not important. The second is the delay distribution, which is highly unusual - you expect to see delay distributions more like ftp://ftpeng.cisco.com/fred/RTT/Pages/8.html (which actually shows two distributions - the blue one is fairly normal, and the green one is a link that spends much of the day chock-a-block).

Thanks, Fred! Can I use these in the general bufferbloat talk I'm working on, with attribution? It's a far better example/presentation in graphic form than I currently have for the internet core case (where I don't even have anything other than the memory of probing the hotel's ISP's network).

Yes. Do me a favor and remove the name of the hotel. They don't need the bad press.

My conjecture re 5.html is that the link *never* drops, and at times has as many as nine retransmissions of the same packet in it. The spikes in the graph are about a TCP RTO timeout apart. That's a truly worst case. For N-1 of the N retransmissions, it's a waste of storage space and a waste of bandwidth.

AQM is your friend. Your buffer should be able to temporarily buffer as much as an RTT of traffic, which is to say that it should be large enough to ensure that if you get a big burst followed by a silent period you can use the entire capacity of the link to ride it out. Your min-threshold should be at a value that makes your median queue depth relatively shallow. The numbers above are a reasonable guide, but as in all things, YMMV.

Yup. AQM is our friend. And we need it in many places we hadn't realised we did (like our OS's). - Jim
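[Editor's note: Fred's sizing rule above (a buffer should absorb roughly one RTT of traffic) is just the bandwidth-delay product. A minimal Python sketch; the line rates and the 100 ms RTT are illustrative assumptions, not figures from the thread.]

```python
def bdp_bytes(link_bps, rtt_s):
    # Bandwidth-delay product: roughly the burst a buffer should be able
    # to absorb so that a burst-then-silence pattern can still fill the
    # link while the queue drains.
    return link_bps * rtt_s / 8

# One 100 ms RTT worth of traffic at a few example rates:
for rate in (1e6, 100e6, 1e9):
    print(int(rate), int(bdp_bytes(rate, 0.100)))
```

Note this is an upper bound on *useful* buffering; the thread's point is that AQM should keep the median occupancy far below it.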
RE: TCP congestion control and large router buffers
I don't know if you are referring to the "RED in a different light" paper: that was never published, though an early draft escaped and can be found on the net. "RED in a different light" identifies two bugs in the RED algorithm, and proposes a better algorithm that only depends on the link output bandwidth. That draft still has a bug.

I also noticed another paper, published later, that references "RED in a different light": http://www.icir.org/floyd/adaptivered/ "Adaptive RED: An Algorithm for Increasing the Robustness of RED's Active Queue Management" (postscript, PDF). Sally Floyd, Ramakrishna Gummadi, and Scott Shenker. August 1, 2001.

And this one: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.1556&rep=rep1&type=pdf "Active Queue Management using Adaptive RED". Rahul Verma, Aravind Iyer and Abhay Karandikar. July 15, 2002.

But it doesn't look like Adaptive RED went anywhere.
Re: TCP congestion control and large router buffers
On 21 Dec 2010, at 07:18, Mikael Abrahamsson swm...@swm.pp.se wrote:

On Mon, 20 Dec 2010, Jim Gettys wrote: Common knowledge among whom? I'm hardly a naive Internet user.

Anyone actually looking into the matter. The Cisco fair-queue command was introduced in IOS 11.0 according to http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249 to somewhat handle the problem. I have no idea when this was in time, but I guess the early '90s?

200 ms is good; but it is often up to multiple *seconds*. The resulting latencies on broadband gear are often horrific: see the netalyzr plots that I posted in my blog.

I know of the problem, it's no news to me. You don't have to convince me. I've been using Cisco routers as a CPE because of this for a long time.

Interestingly, I've just tried to enable WRED on a Cisco 877 (advsecurity 15.1) and the random-detect commands are missing. Cisco's feature navigator says it's supported though. Weird. Also, there doesn't appear to be a way to enable fair-queue on the wireless interface. Is fair-queue seen as a bad strategy for wireless and its varying throughput/goodput rates? And finally, it doesn't support inbound shaping, so I can't experiment with trying to build the queues on it rather than on the DSLAM. I'm a little nonplussed, to be honest.

However, I did notice the output queue on the dialler interface defaults to 1000 packets. (Perhaps that's a hangover from when it had to queue packets whilst dialling? I've come too late to networking to know.) Reducing that number to 10 (~60ms @ 1500 bytes @ 8Mbps) has noticeably improved the latency and fairness of the connection under load.

Sam
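[Editor's note: for reference, the change Sam describes is a one-liner on IOS. This is a sketch only; the interface name is hypothetical and exact syntax and defaults vary by platform and IOS release.]

```
! Sketch: shrink the default 1000-packet output hold queue on the
! dialler interface to bound worst-case queueing delay.
interface Dialer0
 hold-queue 10 out
```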
Re: TCP congestion control and large router buffers
On Dec 20, 2010, at 11:18 PM, Mikael Abrahamsson wrote:

On Mon, 20 Dec 2010, Jim Gettys wrote: Common knowledge among whom? I'm hardly a naive Internet user.

Anyone actually looking into the matter. The Cisco fair-queue command was introduced in IOS 11.0 according to http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249 to somewhat handle the problem. I have no idea when this was in time, but I guess the early '90s?

1995. I know the guy that wrote the code. Meet me in a bar and we can share war stories. The technology actually helps with problems like the ones RFC 6057 addresses pretty effectively.

[If you think that] is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.

I'm happy you were there, I was under the impression that routers had large buffers back then as well?

Not really. Yup, several of us were there. The common routers on the NSFNET and related networks were fuzzballs, which had 8 (count them, 8) 576-byte buffers, Cisco AGS/AGS+, and Proteon routers. The Cisco routers of the day generally had 40 buffers on each interface by default, and might have had configuration changes; I can't comment on the Proteon routers. For a 56 KBPS line, given 1504 bytes per message (1500 bytes of IP+data, and four bytes of HDLC overhead), that's theoretically 8.5 seconds. But given that messages were in fact usually 576 bytes of IP data (cf. fuzzball and unix behavior for off-LAN communications) and interspersed with TCP control messages (ACKs, SYNs, FINs, RSTs), real queue depths were more like two seconds at a bottleneck router. The question would be the impact of a sequence of routers all acting as bottlenecks.

IMHO, AQM (RED or whatever) is your friend. The question is what to set min-threshold to. Kathy Nichols (Van's wife) did a lot of simulations.
I don't know that the paper was ever published, but as I recall she wound up recommending something like this:

line rate (MBPS)    RED min-threshold (ms of queue depth)
      2                  32
     10                  16
    155                   8
    622                   4
  2,500                   2
 10,000                   1

But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40 ms and 100% drop probability at 200 ms (and a linear increase between).

So, min-threshold=40 ms and max-threshold=200 ms. That's good on low-speed links; it will actually control queue depths to an average of O(min-threshold) at whatever value you set it to. The problem with 40 ms is that it interacts poorly with some applications, notably voice and video. It also doesn't match well to published studies like http://www.pittsburgh.intel-research.net/~kpapagia/papers/p2pdelay-analysis.pdf. In that study, a min-threshold of 40 ms would have cut in only on six few-second events in the course of a five-hour sample.

If 40 ms is on the order of magnitude of a typical RTT, it suggests that you could still have multiple retransmissions from the same session in the same queue.

A good photo of buffer bloat is at ftp://ftpeng.cisco.com/fred/RTT/Pages/4.html and ftp://ftpeng.cisco.com/fred/RTT/Pages/5.html. The first is a trace I took overnight in a hotel I stayed in. Never mind the name of the hotel, it's not important. The second is the delay distribution, which is highly unusual - you expect to see delay distributions more like ftp://ftpeng.cisco.com/fred/RTT/Pages/8.html (which actually shows two distributions - the blue one is fairly normal, and the green one is a link that spends much of the day chock-a-block).

My conjecture re 5.html is that the link *never* drops, and at times has as many as nine retransmissions of the same packet in it. The spikes in the graph are about a TCP RTO timeout apart. That's a truly worst case.
For N-1 of the N retransmissions, it's a waste of storage space and a waste of bandwidth. AQM is your friend. Your buffer should be able to temporarily buffer as much as an RTT of traffic, which is to say that it should be large enough to ensure that if you get a big burst followed by a silent period you should be able to use the entire capacity of the link to ride it out. Your min-threshold should be at a value that makes your median queue depth relatively shallow. The numbers above are a reasonable guide, but as in all things, YMMV.
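[Editor's note: the queue-delay arithmetic quoted in this message (40 buffers of 1504 bytes at 56 kbit/s, roughly 8.5 seconds) can be checked in a couple of lines. A Python sketch; the packet counts and sizes are the ones Fred recalls, not measured values.]

```python
def fifo_delay_s(n_packets, bytes_per_packet, link_bps):
    # Worst-case drain time of a full FIFO at the given line rate.
    return n_packets * bytes_per_packet * 8 / link_bps

# The 1980s figure: 40 buffers of 1504 bytes on a 56 kbit/s line.
print(fifo_delay_s(40, 1504, 56_000))  # ~8.59 s
# With the then-typical 576-byte datagrams (and ignoring the even
# smaller interspersed ACKs/SYNs/FINs) the same FIFO drains far faster,
# consistent with the "more like two seconds" observation.
print(fifo_delay_s(40, 576, 56_000))   # ~3.29 s
```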
Re: TCP congestion control and large router buffers
On 12/19/2010 02:16 PM, Joel Jaeggli wrote:

On 12/9/10 7:20 AM, Mikael Abrahamsson wrote:

On Thu, 9 Dec 2010, Vasil Kolev wrote: I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would work miracles for end-user perceived performance and should help in some way with the net neutrality dispute.

It's really hard to replace all the home users' hardware. Trying to fix the problem by fixing all of that is much more painful (and expensive) than fixing the network to not have the buffers.

I'd say this is common knowledge and has been for a long time.

Common knowledge among whom? I'm hardly a naive Internet user. And the statement is wrong: the large router buffers have effectively destroyed TCP's congestion avoidance altogether.

In the world of CPEs, lowest price and simplicity are what count, so nobody cares about buffer depth and AQM; that's why you get ADSL CPEs with 200+ ms of upstream FIFO buffer (no AQM) in most devices.

200 ms is good; but it is often up to multiple *seconds*. The resulting latencies on broadband gear are often horrific: see the netalyzr plots that I posted in my blog. See: http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/

Dave Clark first discovered bufferbloat on his DSLAM: he used the 6-second latency he saw to DDoS his son's excessive WoW playing. All broadband technologies are affected, as are, it turns out, all operating systems and likely all home routers as well (see other posts I've made recently). DSL, cable and FIOS all have problems.

How many retail ISP service calls have been due to this terrible performance? I know I was harassing Comcast with multiple service calls over a year ago over what I now think was bufferbloat. And periodically for a number of years before that (roughly since DOCSIS 2 deployed, I would guess).
"The Internet is slow today, Daddy" was usually Daddy saturating the home link, and bufferbloat the cause. Every time they would complain, I'd stop what I was doing, and the problem would vanish. A really nice will-o'-the-wisp...

you're going to see more of it, at a minimum cpe are going to have to be able to drain a gig-e into a port that may be only 100Mb/s. The QOS options available in a ~$100 cpe router are adequate for the basic purpose.

But the port may only be 1 Mb/s; 802.11g is 20 Mbps tops, and drops to 1 Mbps in extremis. So the real dynamic range is at least a factor of 1000 to 1.

d-link dir-825 or 665 are examples of such devices

Yes, and E3000's and others. Some are half measures, and have a single knob for shaping both uplink and downlink bandwidth. The QOS features in home routers can help, but do not solve all problems. In part, because as broadband bandwidth increases, the bottleneck link often shifts to the links between the home router and edge devices, and there are similar (or even worse) bufferbloat problems in both the home routers and operating systems.

Personally I have MQC configured on my interface which has assured bandwidth for small packets and ssh packets, and I also run fair-queue to make tcp sessions get a fair share. I don't know any non-cisco devices that do this.

the consumer cpe that care seem to be mostly oriented along keeping gaming and voip from being interfered with by p2p and file transfers.

An unappreciated issue is that these buffers have destroyed congestion avoidance for TCP (and all other congestion-avoiding protocols). Secondly, any modern operating system (anything other than Windows XP) implements window scaling, and will, within about 10 seconds, *fill* the buffers with a single TCP connection, and they stay full until traffic drops enough to allow them to empty (which may take seconds). Since congestion avoidance has been defeated, you get nasty behaviour out of TCP.
Congestion avoidance depends on *timely* notification of congestion to the end points: these buffers have destroyed the *timely* requirement, a fundamental presumption of internet protocol design.

If you think that simultaneously:

1) destroying congestion avoidance,
2) destroying slow-start, as many major web sites are doing by increasing their initial window,
3) browsers now using many TCP connections simultaneously,
4) TCP traffic shifting to window scaling, enabling even a single TCP connection to fill these buffers, and
5) increasing numbers of large uploads/downloads (not just bittorrent: HD movie delivery to disk, backup, crash dump uploads, etc.)

is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.
Re: TCP congestion control and large router buffers
On Mon, 20 Dec 2010, Jim Gettys wrote: Common knowledge among whom? I'm hardly a naive Internet user.

Anyone actually looking into the matter. The Cisco fair-queue command was introduced in IOS 11.0 according to http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249 to somewhat handle the problem. I have no idea when this was in time, but I guess the early '90s?

And the statement is wrong: the large router buffers have effectively destroyed TCP's congestion avoidance altogether.

Routers have had large buffers since way before residential broadband even came around; the basic premise of TCP is that routers have buffers, and quite a lot of them.

200 ms is good; but it is often up to multiple *seconds*. The resulting latencies on broadband gear are often horrific: see the netalyzr plots that I posted in my blog.

I know of the problem, it's no news to me. You don't have to convince me. I've been using Cisco routers as a CPE because of this for a long time.

Dave Clark first discovered bufferbloat on his DSLAM: he used the 6-second latency he saw to DDoS his son's excessive WoW playing.

When I procured a DSLAM around 2003 or so, it had 40 ms of buffering at 24-meg ADSL2+ speed; when the speed went down, the buffer length in bytes stayed constant, so the buffering time also went up. It didn't do any AQM either, but at least it did .1p prioritization and had 4 buffers, so there was a little possibility of doing things upstream of it.

All broadband technologies are affected, as are, it turns out, all operating systems and likely all home routers as well (see other posts I've made recently). DSL, cable and FIOS all have problems.

Yes.

How many of retail ISPs' service calls have been due to this terrible performance?

A lot, I'm sure.
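[Editor's note: Mikael's DSLAM observation (a fixed-size byte buffer whose drain time grows as the line rate drops) works out as below. The 4 Mbit/s and 1 Mbit/s retrain rates are illustrative assumptions, not figures from the thread.]

```python
def buffer_ms(buffer_bytes, link_bps):
    # Time to drain a byte-sized buffer at a given line rate.
    return buffer_bytes * 8 / link_bps * 1000

# 40 ms of buffering at 24 Mbit/s ADSL2+ implies a fixed buffer of:
buf = 24e6 * 0.040 / 8          # 120,000 bytes
# When the line trains down, the same bytes take longer to drain:
print(buffer_ms(buf, 24e6))     # ~40 ms
print(buffer_ms(buf, 4e6))      # ~240 ms
print(buffer_ms(buf, 1e6))      # ~960 ms
```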
Secondly, any modern operating system (anything other than Windows XP) implements window scaling, and will, within about 10 seconds, *fill* the buffers with a single TCP connection, and they stay full until traffic drops enough to allow them to empty (which may take seconds). Since congestion avoidance has been defeated, you get nasty behaviour out of TCP.

That is exactly what TCP was designed to do: use as much bandwidth as it can. Congestion is detected by two means: latency goes up and/or there is packet loss. TCP was designed with router buffers in mind. Anyhow, one thing that might help would be ECN in conjunction with WRED, but already there you're way over most CPE manufacturers' heads.

[If you think that] is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.

I'm happy you were there, I was under the impression that routers had large buffers back then as well?

The best you can do is what Ooma has done: bandwidth shaping along with being closest to the broadband connection (or fancy home routers with classification and bandwidth shaping). That won't help the downstream direction, where a single other user (or you yourself) can routinely inject large packet bursts by browsing web sites like YouTube or Google Images (unless some miracle occurs and the broadband head ends start classifying traffic in the downstream direction over those links).

There is definitely a lot of improvement to be had. For FTTH, if you use an L2 switch with a few ms of buffering as the ISP handoff device, you don't get this problem.
There are even TCP algorithms to handle the case where you have small buffers and just tail drop.

But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40 ms and 100% drop probability at 200 ms (and a linear increase between).

-- Mikael Abrahamsson, email: swm...@swm.pp.se
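[Editor's note: the marking curve Mikael proposes (0% at 40 ms, 100% at 200 ms, linear in between) is simple to state precisely. A Python sketch; note that real WRED operates on an EWMA of the queue occupancy and usually caps the linear region at a configurable maximum probability, both of which are omitted here.]

```python
def wred_drop_probability(avg_queue_ms, min_th_ms=40.0, max_th_ms=200.0):
    # 0% drop/mark probability below min-threshold, 100% above
    # max-threshold, linear in between.
    if avg_queue_ms <= min_th_ms:
        return 0.0
    if avg_queue_ms >= max_th_ms:
        return 1.0
    return (avg_queue_ms - min_th_ms) / (max_th_ms - min_th_ms)

for q in (20, 40, 120, 200, 500):
    print(q, wred_drop_probability(q))
```

With ECN enabled, the same probability would mark packets instead of dropping them, giving the sender the congestion signal without wasting the transmission.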
Re: TCP congestion control and large router buffers
On 12/9/10 7:20 AM, Mikael Abrahamsson wrote:

On Thu, 9 Dec 2010, Vasil Kolev wrote: I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would work miracles for end-user perceived performance and should help in some way with the net neutrality dispute.

I'd say this is common knowledge and has been for a long time. In the world of CPEs, lowest price and simplicity are what count, so nobody cares about buffer depth and AQM; that's why you get ADSL CPEs with 200+ ms of upstream FIFO buffer (no AQM) in most devices.

you're going to see more of it, at a minimum cpe are going to have to be able to drain a gig-e into a port that may be only 100Mb/s. The QOS options available in a ~$100 cpe router are adequate for the basic purpose. d-link dir-825 or 665 are examples of such devices

Personally I have MQC configured on my interface which has assured bandwidth for small packets and ssh packets, and I also run fair-queue to make tcp sessions get a fair share. I don't know any non-cisco devices that do this.

the consumer cpe that care seem to be mostly oriented along keeping gaming and voip from being interfered with by p2p and file transfers.
Re: TCP congestion control and large router buffers
On Tue, 14 Dec 2010, Sam Stickland wrote: But there's no need for AQM, just smaller buffers would make a huge difference.

Well, yes, buffering packets for more than, let's say, 30-50 ms on a 1-meg link doesn't make much sense. But doing some basic AQM would make things even better (some packets would see 0 buffering instead of 30 ms).

Surely buffers that can store seconds' worth of data are simply too big?

A FIFO with seconds' worth of data is just silly, yes.

-- Mikael Abrahamsson, email: swm...@swm.pp.se
RE: TCP congestion control and large router buffers
On Tue, 14 Dec 2010, Sam Stickland wrote: But there's no need for AQM, just smaller buffers would make a huge difference.

Well, yes, buffering packets for more than, let's say, 30-50 ms on a 1-meg link doesn't make much sense. But doing some basic AQM would make things even better (some packets would see 0 buffering instead of 30 ms).

Surely buffers that can store seconds' worth of data are simply too big?

A FIFO with seconds' worth of data is just silly, yes.

-- Mikael Abrahamsson

Well, Jim Gettys was reporting seeing tens of seconds of buffering (comments in the original LWN link to his first posting), which is just ludicrous. No TCP stack is going to respond properly to congestion with that sort of delay. Some form of AQM is probably a good thing, as would be the wider use of ECN. Finding out that a buffer filled and a packet (or many packets) was dropped five seconds after the fact isn't going to help anyone, and you just end up whipsawing the window size (Lawrence Welk effect http://www.oeta.onenet.net/welk/PM/images/Lawrence.jpg ?). I would favor seeing more use of ECN, so that a sender can be notified to back off when a buffer is approaching capacity, but there is apparently still a lot of hardware out there that has problems with it. You need enough buffering to satisfy packets in flight for a connection on the other side of the planet, but man, what he has been reporting is just insane and it would be no wonder performance can be crap.
RE: TCP congestion control and large router buffers
On Tue, 14 Dec 2010, George Bonser wrote: that sort of delay. Some form of AQM is probably a good thing as would be the wider use of ECN. Finding out that a buffer filled and a packet (or many packets) was dropped five seconds after the fact, isn't going

ECN pretty much needs WRED, and then people need to implement that first. The only routing platform I know to support it is the 7200 and the other types of CPU routers from Cisco running fairly recent IOS (it seems to have been introduced in 12.2T). http://www.cisco.com/en/US/docs/ios/12_2t/12_2t8/feature/guide/ftwrdecn.html

You need enough buffering to satisfy packets in flight for a connection on the other side of the planet but man, what he has been reporting is just insane and it would be no wonder performance can be crap.

Yeah, 30-60 ms of buffering is what I have favoured so far. With L2 switches you don't get anywhere near that, but on the other hand a few ms of buffering plus tail drop has much less impact on interactive applications compared to seconds of buffering.

-- Mikael Abrahamsson, email: swm...@swm.pp.se
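[Editor's note: Mikael's 30-60 ms target translates directly into a FIFO length in packets. A Python sketch with an assumed 1500-byte MTU and illustrative line rates; real queues carry mixed packet sizes, so this is a worst-case (all-full-size) bound.]

```python
def queue_limit_packets(link_bps, target_delay_s, mtu_bytes=1500):
    # How many full-size packets fit within the target worst-case
    # drain time at the given line rate (at least one packet).
    return max(1, int(link_bps * target_delay_s / (8 * mtu_bytes)))

# A 50 ms target at a few access-link rates:
for rate in (1e6, 8e6, 100e6):
    print(int(rate), queue_limit_packets(rate, 0.050))
```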
Re: TCP congestion control and large router buffers
On Mon, 13 Dec 2010, Sam Stickland wrote: Ironically though, wouldn't smaller buffers cost less, thus making the CPEs cheaper still? I believe the argument made in the blog post is that cheaper RAM has been causing the CPE manufacturers to mistakenly provision too much buffer space, which in turn apparently means that TCP can't stabilize at a rate less than the available bandwidth (i.e. it's the old 1980's congestion collapse problem all over again). Of course, you'll only see this if a single TCP stream is actually capable of saturating the link. Sam

1 megabyte of buffer (regular RAM) isn't really expensive. I would guess they're running standard OSes and haven't tuned the buffers at all. Implementing WRED or fair-queue (even if it just means turning it on) requires validation, which the CPE manufacturers want to minimize. Also, it's our fault as a business: how many ISPs have included AQM in their RFPs for CPEs and would actually pay USD 5 more per device for this feature? I'm not very surprised at the lack of this though; it's hard to explain to the end customer with some kind of marketing, both for the ISP and for the CPE vendor if they're selling to end customers. It's one of those "in the black box" things that should just work, but there is little upside in having it because it's hard to charge for.

-- Mikael Abrahamsson, email: swm...@swm.pp.se
Re: TCP congestion control and large router buffers
On Thu, 9 Dec 2010, Vasil Kolev wrote: I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would work miracles for end-user perceived performance and should help in some way with the net neutrality dispute.

I'd say this is common knowledge and has been for a long time. In the world of CPEs, lowest price and simplicity are what count, so nobody cares about buffer depth and AQM; that's why you get ADSL CPEs with 200+ ms of upstream FIFO buffer (no AQM) in most devices.

Personally I have MQC configured on my interface which has assured bandwidth for small packets and ssh packets, and I also run fair-queue to make tcp sessions get a fair share. I don't know any non-cisco devices that do this.

-- Mikael Abrahamsson, email: swm...@swm.pp.se