Re: e1000 full-duplex TCP performance well below wire speed
Hi Bruce, On Jan 30, 2008 5:25 PM, Bruce Allen [EMAIL PROTECTED] wrote: In our application (cluster computing) we use a very tightly coupled high-speed, low-latency network. There is no 'wide area traffic'. So it's hard for me to understand why any networking components or software layers should take more than milliseconds to ramp up or back off in speed. Perhaps we should be asking for a TCP congestion avoidance algorithm designed for a data center environment, where there are very few hops and typical packet delivery times are tens or hundreds of microseconds. It's very different from delivering data thousands of km across a WAN. If your network latency is low, any of these protocols should give you more than 900 Mbps. I would guess the RTT between your two machines is less than 4 ms, and I remember that the throughputs of all the high-speed protocols (including TCP-Reno) were more than 900 Mbps with a 4 ms RTT. So, my question is: which kernel version did you use with your Broadcom NIC when you got more than 900 Mbps? I have two machines connected by a gigabit switch, so I can see what happens in my environment. Could you post the parameters you used for your netperf testing? If you set any other parameters for your tests, please post them here as well, so that I can check whether the same happens for me. Regards, Sangtae -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SACK scoreboard
On Jan 9, 2008 9:56 AM, John Heffner [EMAIL PROTECTED] wrote: I also wonder how much of a problem this is (for now, with window sizes of order 1 packets). My understanding is that the biggest problems arise from O(N^2) time for recovery, because every ack was expensive. Have current tests shown the final ack to be a major source of problems? Yes, several people have reported this. I may have missed some of this. Does anyone have a link to some recent data? I did some testing on this a month ago. A small set of recent results with Linux 2.6.23.9 is at http://netsrv.csc.ncsu.edu/net-2.6.23.9/sack_efficiency One of the serious cases, with a large number of packet losses (the initial loss is around 8000 packets), is at http://netsrv.csc.ncsu.edu/net-2.6.23.9/sack_efficiency/600--TCP-TCP-NONE--400-3-1.0--1000-120-0-0-1-1-5-500--1.0-0.5-133000-73-300-0.93-150--3/ Also, there is a comparison among three Linux kernels (2.6.13, 2.6.18-rc4, 2.6.20.3) at http://netsrv.csc.ncsu.edu/wiki/index.php/Efficiency_of_SACK_processing Sangtae
Re: possible bug in tcp_probe
Hi Gavin, This is fixed in the current version of tcp_probe by Stephen. Please see below. commit 662ad4f8efd3ba2ed710d36003f968b500e6f123 Author: Stephen Hemminger [EMAIL PROTECTED] Date: Wed Jul 11 19:43:52 2007 -0700 [TCP]: tcp probe wraparound handling and other changes Switch from formatting messages in probe routine and copying with kfifo, to using a small circular queue of information and formatting on read. This avoids wraparound issues with kfifo, and saves one copy. Also make sure to state correct license, rather than copying off some other driver I started with. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] Signed-off-by: David S. Miller [EMAIL PROTECTED] You can copy the current version of tcp_probe into your kernel version and it should work. Before this was fixed, in my case, I changed kfifo not to put data in if it doesn't have enough space. I will send you a patch if you want this. Regards, Sangtae On Nov 13, 2007 6:26 AM, Gavin McCullagh [EMAIL PROTECTED] wrote: Hi, I'm using Linux v2.6.22.6 and tcp_probe with a couple of small modifications[1]. Even with moderately large numbers of flows (16 on the one machine), and increasingly as I monitor more flows than that, I get strange overflow problems such as this one: 74.259589763 192.168.2.1 36988 192.168.3.5 5001 0x679c23dc 0x679bc3b4 18 13 9114624 78 76 1 0 64 74.260590660 192.168.2.1 44261 192.168.3.5 5006 0x573bb3ed 0x573b700d 13 9 5254144 155 127 1 0 64 74.261607478 192.168.2.1 44261 192.168.3.5 5006 0x588.066586741 192.168.2.1 33739 192.168.3.5 5009 0xe26d1767 0xe26cf577 2 3 13090816 443 15818 1 0 64 88.066690797 192.168.2.1 33739 192.168.3.5 5009 0xe26d1767 0xe26cfb1f 3 3 13092864 2365 15818 1 0 64 88.067625714 192.168.2.1 59385 192.168.3.5 5012 0x411c1090 0x411bd258 12 9 14578688 2807 15812 1 0 64 As you can see, the third line has been truncated, as well as the next roughly 14 seconds of data, after which data continues writing as usual.
I don't think my small changes are causing this, but perhaps I'm wrong. Does anyone know what might be causing the above? Many thanks for any ideas, Gavin [1] I have slightly modified tcp_probe to print out information for a range of ports (instead of one port or all) and to print info from the congestion avoidance inet_csk_ca struct. This adds a couple of extra fields to the end. If either of these is of interest as a patch I'll happily submit them.
[PATCH] [TCP] tcp_probe: a trivial fix for mismatched number of printl arguments.
Just a fix to correct the number of printl arguments. Now srtt is logged correctly. Signed-off-by: Sangtae Ha [EMAIL PROTECTED]
---
 net/ipv4/tcp_probe.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_probe.c b/net/ipv4/tcp_probe.c
index 3938d5d..1b72c55 100644
--- a/net/ipv4/tcp_probe.c
+++ b/net/ipv4/tcp_probe.c
@@ -95,7 +95,7 @@ static int jtcp_rcv_established(struct sock *sk, struct sk_buff *skb,
 	/* Only update if port matches */
 	if ((port == 0 || ntohs(inet->dport) == port || ntohs(inet->sport) == port) &&
 	    (full || tp->snd_cwnd != tcpw.lastcwnd)) {
-		printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d %#x %#x %u %u %u\n",
+		printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d %#x %#x %u %u %u %u\n",
 		       NIPQUAD(inet->saddr), ntohs(inet->sport),
 		       NIPQUAD(inet->daddr), ntohs(inet->dport),
 		       skb->len, tp->snd_nxt, tp->snd_una,
--
1.5.0.6
Re: 2.6.20.7 TCP cubic (and bic) initial slow start way too slow?
Hi Bill, This is the small patch that has been applied to 2.6.22. Also, there is limited slow start, an experimental RFC (RFC 3742), to mitigate this large increase during slow start. But your kernel might not have this; please check whether the sysctl variable tcp_max_ssthresh exists. Thanks, Sangtae On 5/12/07, Bill Fink [EMAIL PROTECTED] wrote: On Thu, 10 May 2007, Injong Rhee wrote: Oops. I thought Bill was using 2.6.20 instead of 2.6.22, which should contain our latest update. I am using 2.6.20.7. Regarding slow start behavior, the latest version should not change it though. I think it would be ok to change the slow start of bic and cubic to the default slow start. But what we observed is that when the BDP is large, increasing cwnd by two times is really overkill. Consider increasing from 1024 to 2048 packets... maybe the target is somewhere between them. We have potentially a large number of packets flushed into the network. That was the original motivation to change slow start from the default to a gentler version. But I see the point that Bill is raising. We are working on improving this behavior in our lab. We will get back to this topic in a couple of weeks, after we finish our testing and produce a patch. Is it feasible to replace the version of cubic in 2.6.20.7 with the new 2.1 version of cubic without changing the rest of the kernel, or are there kernel changes/dependencies that would prevent that? I've tried building and running a 2.6.21-git13 kernel, but am having some difficulties. I will be away the rest of the weekend so won't be able to get back to this until Monday. -Bill P.S. When getting into the 10 Gbps range, I'm not sure there's any way to avoid the types of large increases during slow start that you mention, if you want to achieve those kinds of data rates.
- Original Message - From: Stephen Hemminger [EMAIL PROTECTED] To: David Miller [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; netdev@vger.kernel.org Sent: Thursday, May 10, 2007 4:45 PM Subject: Re: 2.6.20.7 TCP cubic (and bic) initial slow start way too slow? On Thu, 10 May 2007 13:35:22 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] Date: Thu, 10 May 2007 14:39:25 -0400 (EDT) Bill, Could you test with the latest version of CUBIC? This is not the latest version of it that you tested. Rhee-sangsang-nim, it might be a lot easier for people if you provide a patch against the current tree for users to test, instead of constantly pointing them to your web site. - The 2.6.22 version should have the latest version that I know of. There was a small patch from 2.6.21 that went in. tcp_cubic-2.6.20.3.patch Description: Binary data
Re: 2.6.20.7 TCP cubic (and bic) initial slow start way too slow?
= 7257.1575 Mbps 890.4663 MB / 1.00 sec = 7470.0567 Mbps 911.5039 MB / 1.00 sec = 7646.3560 Mbps 4829.9375 MB / 10.05 sec = 4033.0191 Mbps 76 %TX 32 %RX [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 1093 segments retransmited 1093 fast retransmits And then with the default bic behavior: [EMAIL PROTECTED] ~]# echo 100 > /sys/module/tcp_bic/parameters/initial_ssthresh [EMAIL PROTECTED] ~]# cat /sys/module/tcp_bic/parameters/initial_ssthresh 100 [EMAIL PROTECTED] ~]# nuttcp -T10 -i1 -w100m 192.168.89.15 9.9548 MB / 1.00 sec = 83.1028 Mbps 47.5439 MB / 1.00 sec = 398.8351 Mbps 107.6147 MB / 1.00 sec = 902.7506 Mbps 183.9038 MB / 1.00 sec = 1542.7124 Mbps 313.4875 MB / 1.00 sec = 2629.7689 Mbps 531.0012 MB / 1.00 sec = 4454.3032 Mbps 841.7866 MB / 1.00 sec = 7061.5098 Mbps 837.5867 MB / 1.00 sec = 7026.4041 Mbps 834.8889 MB / 1.00 sec = 7003.3667 Mbps 4539.6250 MB / 10.00 sec = 3806.5410 Mbps 50 %TX 34 %RX [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 1093 segments retransmited 1093 fast retransmits bic actually does much better than cubic for this scenario, and only loses out to the standard Reno aggressive slow start behavior by a small amount. Of course in the case of no congestion, it loses out by a much more significant margin. This reinforces my belief that it's best to marry the standard Reno aggressive initial slow start behavior with the better performance of bic or cubic during the subsequent steady state portion of the TCP session. I can of course achieve that objective by setting initial_ssthresh to 0, but perhaps that should be made the default behavior. -Bill On Wed, 9 May 2007, I wrote: Hi Sangtae, On Tue, 8 May 2007, SANGTAE HA wrote: Hi Bill, At this time, BIC and CUBIC use a less aggressive slow start than other protocols, because we observed that slow start is somewhat aggressive and introduced a lot of packet losses.
This may be changed to standard slow start in a later version of BIC and CUBIC, but, at this time, we are still using a modified slow start. (slow start is somewhat of a misnomer.) However, I'd argue in favor of using the standard slow start for BIC and CUBIC as the default. Is the rationale for using a less aggressive slow start to be gentler to certain receivers, which possibly can't handle a rapidly increasing initial burst of packets (and the resultant necessary allocation of system resources)? Or is it related to encountering actual network congestion during the initial slow start period, and how well that is responded to? So, as you observed, this modified slow start behavior may be slow for 10G testing. You can alleviate this for your 10G testing by changing BIC and CUBIC to use a standard slow start by loading these modules with initial_ssthresh=0. I saw the initial_ssthresh parameter, but didn't know what it did or even what its units were. I saw the default value was 100 and tried increasing it, but I didn't think to try setting it to 0. [EMAIL PROTECTED] ~]# grep -r initial_ssthresh /usr/src/kernels/linux-2.6.20.7/Documentation/ [EMAIL PROTECTED] ~]# It would be good to have some documentation for these bic and cubic parameters similar to the documentation in ip-sysctl.txt for the /proc/sys/net/ipv[46]/* variables (I know, I know, I should just use the source). Is it expected that the cubic slow start is that much less aggressive than the bic slow start (from 10 secs to max rate for bic in my test to 25 secs to max rate for cubic)? This could be considered a performance regression since the default TCP was changed from bic to cubic. In any event, I'm now happy as setting initial_ssthresh to 0 works well for me.
[EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 0 segments retransmited [EMAIL PROTECTED] ~]# cat /proc/sys/net/ipv4/tcp_congestion_control cubic [EMAIL PROTECTED] ~]# cat /sys/module/tcp_cubic/parameters/initial_ssthresh 0 [EMAIL PROTECTED] ~]# nuttcp -T10 -i1 -w60m 192.168.89.15 69.9829 MB / 1.00 sec = 584.2065 Mbps 843.1467 MB / 1.00 sec = 7072.9052 Mbps 844.3655 MB / 1.00 sec = 7082.6544 Mbps 842.2671 MB / 1.00 sec = 7065.7169 Mbps 839.9204 MB / 1.00 sec = 7045.8335 Mbps 840.1780 MB / 1.00 sec = 7048.3114 Mbps 834.1475 MB / 1.00 sec = 6997.4270 Mbps 835.5972 MB / 1.00 sec = 7009.3148 Mbps 835.8152 MB / 1.00 sec = 7011.7537 Mbps 830.9333 MB / 1.00 sec = 6969.9281 Mbps 7617.1875 MB / 10.01 sec = 6386.2622 Mbps 90 %TX 46 %RX [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 0 segments retransmited -Thanks a lot! -Bill Regards, Sangtae On 5/6/07, Bill Fink [EMAIL PROTECTED] wrote: The initial TCP slow start on 2.6.20.7 cubic (and to a lesser extent bic) seems to be way too slow. With an ~80 ms RTT
Re: 2.6.20.7 TCP cubic (and bic) initial slow start way too slow?
Hi Bill, At this time, BIC and CUBIC use a less aggressive slow start than other protocols, because we observed that slow start is somewhat aggressive and introduced a lot of packet losses. This may be changed to standard slow start in a later version of BIC and CUBIC, but, at this time, we are still using a modified slow start. So, as you observed, this modified slow start behavior may be slow for 10G testing. You can alleviate this for your 10G testing by changing BIC and CUBIC to use a standard slow start by loading these modules with initial_ssthresh=0. Regards, Sangtae On 5/6/07, Bill Fink [EMAIL PROTECTED] wrote: The initial TCP slow start on 2.6.20.7 cubic (and to a lesser extent bic) seems to be way too slow. With an ~80 ms RTT, this is what cubic delivers (thirty second test with one second interval reporting and specifying a socket buffer size of 60 MB): [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 0 segments retransmited [EMAIL PROTECTED] ~]# cat /proc/sys/net/ipv4/tcp_congestion_control cubic [EMAIL PROTECTED] ~]# nuttcp -T30 -i1 -w60m 192.168.89.15 6.8188 MB / 1.00 sec = 57.0365 Mbps 16.2097 MB / 1.00 sec = 135.9824 Mbps 25.4553 MB / 1.00 sec = 213.5420 Mbps 35.5127 MB / 1.00 sec = 297.9119 Mbps 43.0066 MB / 1.00 sec = 360.7770 Mbps 50.3210 MB / 1.00 sec = 422.1370 Mbps 59.0796 MB / 1.00 sec = 495.6124 Mbps 69.1284 MB / 1.00 sec = 579.9098 Mbps 76.6479 MB / 1.00 sec = 642.9130 Mbps 90.6189 MB / 1.00 sec = 760.2835 Mbps 109.4348 MB / 1.00 sec = 918.0361 Mbps 128.3105 MB / 1.00 sec = 1076.3813 Mbps 150.4932 MB / 1.00 sec = 1262.4686 Mbps 175.9229 MB / 1.00 sec = 1475.7965 Mbps 205.9412 MB / 1.00 sec = 1727.6150 Mbps 240.8130 MB / 1.00 sec = 2020.1504 Mbps 282.1790 MB / 1.00 sec = 2367.1644 Mbps 318.1841 MB / 1.00 sec = 2669.1349 Mbps 372.6814 MB / 1.00 sec = 3126.1687 Mbps 440.8411 MB / 1.00 sec = 3698.5200 Mbps 524.8633 MB / 1.00 sec = 4403.0220 Mbps 614.3542 MB / 1.00 sec = 5153.7367 Mbps 718.9917 MB / 1.00 sec = 6031.5386 Mbps 829.0474 MB / 1.00 sec = 
6954.6438 Mbps 867.3289 MB / 1.00 sec = 7275.9510 Mbps 865.7759 MB / 1.00 sec = 7262.9813 Mbps 864.4795 MB / 1.00 sec = 7251.7071 Mbps 864.5425 MB / 1.00 sec = 7252.8519 Mbps 867.3372 MB / 1.00 sec = 7246.9232 Mbps 10773.6875 MB / 30.00 sec = 3012.3936 Mbps 38 %TX 25 %RX [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 0 segments retransmited It takes 25 seconds for cubic TCP to reach its maximal rate. Note that there were no TCP retransmissions (no congestion experienced). Now with bic (only 20 second test this time): [EMAIL PROTECTED] ~]# echo bic > /proc/sys/net/ipv4/tcp_congestion_control [EMAIL PROTECTED] ~]# cat /proc/sys/net/ipv4/tcp_congestion_control bic [EMAIL PROTECTED] ~]# nuttcp -T20 -i1 -w60m 192.168.89.15 9.9548 MB / 1.00 sec = 83.1497 Mbps 47.2021 MB / 1.00 sec = 395.9762 Mbps 92.4304 MB / 1.00 sec = 775.3889 Mbps 134.3774 MB / 1.00 sec = 1127.2758 Mbps 194.3286 MB / 1.00 sec = 1630.1987 Mbps 280.0598 MB / 1.00 sec = 2349.3613 Mbps 404.3201 MB / 1.00 sec = 3391.8250 Mbps 559.1594 MB / 1.00 sec = 4690.6677 Mbps 792.7100 MB / 1.00 sec = 6650.0257 Mbps 857.2241 MB / 1.00 sec = 7190.6942 Mbps 852.6912 MB / 1.00 sec = 7153.3283 Mbps 852.6968 MB / 1.00 sec = 7153.2538 Mbps 851.3162 MB / 1.00 sec = 7141.7575 Mbps 851.4927 MB / 1.00 sec = 7143.0240 Mbps 850.8782 MB / 1.00 sec = 7137.8762 Mbps 852.7119 MB / 1.00 sec = 7153.2949 Mbps 852.3879 MB / 1.00 sec = 7150.2982 Mbps 850.2163 MB / 1.00 sec = 7132.5165 Mbps 849.8340 MB / 1.00 sec = 7129.0026 Mbps 11882.7500 MB / 20.00 sec = 4984.0068 Mbps 67 %TX 41 %RX [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 0 segments retransmited bic does better but still takes 10 seconds to achieve its maximal rate. 
Surprisingly venerable reno does the best (only a 10 second test): [EMAIL PROTECTED] ~]# echo reno > /proc/sys/net/ipv4/tcp_congestion_control [EMAIL PROTECTED] ~]# cat /proc/sys/net/ipv4/tcp_congestion_control reno [EMAIL PROTECTED] ~]# nuttcp -T10 -i1 -w60m 192.168.89.15 69.9829 MB / 1.01 sec = 583.5822 Mbps 844.3870 MB / 1.00 sec = 7083.2808 Mbps 862.7568 MB / 1.00 sec = 7237.7342 Mbps 859.5725 MB / 1.00 sec = 7210.8981 Mbps 860.1365 MB / 1.00 sec = 7215.4487 Mbps 865.3940 MB / 1.00 sec = 7259.8434 Mbps 863.9678 MB / 1.00 sec = 7247.4942 Mbps 864.7493 MB / 1.00 sec = 7254.4634 Mbps 864.6660 MB / 1.00 sec = 7253.5183 Mbps 7816.9375 MB / 10.00 sec = 6554.4883 Mbps 90 %TX 53 %RX [EMAIL PROTECTED] ~]# netstat -s | grep -i retrans 0 segments retransmited reno achieves its maximal rate in about 2 seconds. This is what I would expect from the exponential increase during TCP's initial slow start. To achieve 10 Gbps on an 80 ms RTT with 9000 byte jumbo frame packets would require: [EMAIL
Re: [PATCH] [TCP] Highspeed: Limited slow-start is nowadays in tcp_slow_start
Hi David, I ran a couple of tests to see the limited slow start for HSTCP. For this testing, I set the max_ssthresh value to 100. With slow start, it takes around 4 sec to hit a cwnd of 21862 (more than 6000 packet drops in one RTT). With limited slow start, it takes 108 sec to hit a cwnd of 8420. Taking delayed acks into account, it looks fine to me. See the plots below. With Slow Start: http://netsrv.csc.ncsu.edu/net-2.6.22/slow_start/600--HSTCP-HSTCP-NONE--400-3-1.0--1000-120-0-0-1-1-5-500--1.0-0.2-133000-73-300-0.93-150--1/ With Limited Slow Start: http://netsrv.csc.ncsu.edu/net-2.6.22/limited_slow_start/600--HSTCP-HSTCP-NONE--400-3-1.0--1000-120-0-0-1-1-5-500--1.0-0.2-133000-73-300-0.93-150--1/ Thanks, Sangtae On 5/3/07, David Miller [EMAIL PROTECTED] wrote: From: Ilpo_Järvinen [EMAIL PROTECTED] Date: Thu, 3 May 2007 15:34:25 +0300 (EEST) Reuse limited slow-start (RFC3742) included into tcp_cong instead of having another implementation in High Speed TCP. Compile tested only. Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED] Thanks for noticing this code duplication. I've applied this, but it would be great if someone would do some sanity tests of highspeed to make sure it's ok.
Net-2.6.22 TCP testing
Hi all, See the TCP testing results of net-2.6.22.git tree at http://netsrv.csc.ncsu.edu/wiki/index.php/Intra_protocol_fairness_testing_with_net-2.6.22.git In addition to Stephen's recent 1Mbit DSL result, the results include the cases with four different bandwidths (10M/100M/200M/400M) and background traffic. Thanks, Sangtae
Re: [patch 3/3] tcp: remove experimental variants from default list
Hi Baruch, I would like to add some comments on your argument. On 2/13/07, Baruch Even [EMAIL PROTECTED] wrote: * David Miller [EMAIL PROTECTED] [070213 00:53]: From: Baruch Even [EMAIL PROTECTED] Date: Tue, 13 Feb 2007 00:12:41 +0200 The problem is that you actually put a mostly untested algorithm as the default for everyone to use. The BIC example is important, it was the default algorithm for a long while and had implementation bugs that no one cared for. And if our TCP Reno implementation had some bugs, what should we change the default to? This is just idiotic logic. These kinds of comments are just wanking, and lead to nowhere, so please kill the noise. If we have bugs in a particular algorithm, we should just fix them. I hope you've finished attempting to insult me. But I hope it won't prevent you from getting back to the topic. The above quote of me was a prelude to show the repeat behaviour where bic was added without testing, modified by Stephen, and made default with no serious testing of what was put in the kernel. What kind of serious testing do you want? I've been testing all high-speed protocols, including BIC and CUBIC, for two and a half years now. Even Stephen didn't test the CUBIC algorithm by himself; he might have seen the results from our experimental studies. I don't care which algorithm is the default in the kernel; however, it is not appropriate to go back to Reno. As Windows has decided to go with Compound TCP, why would we want to go back to an algorithm from the '80s? It seems this happens again now with cubic. And you failed to respond to this issue. The behaviour of cubic wasn't properly verified, as the algorithm in the linux kernel is not the one that was actually proposed, and you intend to make it the default without sufficient testing; that seems to me to be quite unreasonable. According to the claims of Doug Leith, the cubic algorithm that is in the kernel is different from what was proposed and tested. That's an important issue which is deflected by personal attacks.
Did you read that paper? http://wil.cs.caltech.edu/pfldnet2007/paper/CUBIC_analysis.pdf Then, please read the rebuttal to that paper: http://www.csc.ncsu.edu/faculty/rhee/Rebuttal-LSM-new.pdf Also, the implementation can be different. The cubic code inside the current kernel introduces a faster calculation of the cubic root. Even though we had some bugs in the CUBIC implementation, they are fixed now. My main gripe is that there is a rush to make an untested algorithm the default for all Linux installations. And saying that I should test it is not an escape route; if it's untested it shouldn't be made the default algorithm. What are the criteria for 'untested'? Who judges that this algorithm is fully tested and is ready to use? Could you tell me of any other group that did more testing than ours? Dummynet Testbed Results: http://netsrv.csc.ncsu.edu/highspeed/ http://netsrv.csc.ncsu.edu/convex-ordering/ http://www.csc.ncsu.edu/faculty/rhee/export/comnet-v3-rhee.pdf Real testing between Korea and Japan (Seoul-Daejon-Busan-Japan): http://netsrv.csc.ncsu.edu/highspeed/exp/ We still do testing with the latest kernel version on production networks (4 ms, 6 ms, 9 ms, 45 ms, and 200 ms). I will post the results when they are ready. My skimming of the PFLDNet 2007 proceedings showed only the works by Injong and Doug on Cubic, and Injong tested some version on Linux 2.6.13(!), which might not be the version in the current tree. Doug shows some weaknesses of the Cubic algorithm as implemented in Linux. As I mentioned, please read the paper and the rebuttal carefully. Also, in PFLDnet 2007, Prof. R. Srikant proposed a new algorithm that uses the BIC and CUBIC curves based on delay estimation, even though he didn't know about BIC and CUBIC before. I feel the CUBIC algorithm itself is not a bad idea, as other newly proposed algorithms follow the BIC and CUBIC approaches. I admit all proposed algorithms have their advantages over others. Do you still think that making Cubic the default is a good idea?
Then what do you want to make the default? Do you want to go back to BIC? Or to Reno? Baruch Sangtae