Andi Kleen wrote:
Hello,
While looking for something else in tcp_output.c I noticed that
MTU probing seems to be done only in tcp_write_xmit (when
packets come directly from process context), but not via the
timer-driven retransmit path (tcp_retransmit_skb). Is that intentional?
It
Ritesh Kumar wrote:
On 1/16/08, Bill Fink [EMAIL PROTECTED] wrote:
On Tue, 15 Jan 2008, Ritesh Kumar wrote:
Hi,
I am using linux 2.6.20 and am trying to limit the receiver window
size for a TCP connection. However, it seems that auto tuning is not
turning itself off even after I use the
David Miller wrote:
From: John Heffner [EMAIL PROTECTED]
Date: Tue, 08 Jan 2008 23:27:08 -0500
I also wonder how much of a problem this is (for now, with window sizes
of order 1 packets. My understanding is that the biggest problems
arise from O(N^2) time for recovery because every ack
SANGTAE HA wrote:
On Jan 9, 2008 9:56 AM, John Heffner [EMAIL PROTECTED] wrote:
I also wonder how much of a problem this is (for now, with window sizes
of order 1 packets. My understanding is that the biggest problems
arise from O(N^2) time for recovery because every ack was expensive
David Miller wrote:
Ilpo, just trying to keep an old conversation from dying off.
Did you happen to read a recent blog posting of mine?
http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2007/12/31#tcp_overhead
I've been thinking more and more and I think we might be able
to get away with
Andi Kleen wrote:
David Miller [EMAIL PROTECTED] writes:
The big problem is that recovery from even a single packet loss in a
window makes us run kfree_skb() for all the packets in a full
window's worth of data when recovery completes.
Why exactly is it a problem to free them all at once?
David Miller wrote:
From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Thu, 20 Dec 2007 13:40:51 +0200 (EET)
[PATCH] [TCP]: Fix TSO deferring
I'd say that most of what tcp_tso_should_defer had in between
there was dead code because of this.
Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
David Miller wrote:
Ilpo, I was pondering the kind of debugging one does to find
congestion control issues and even SACK bugs and it's currently too
painful because there is no standard way to track state changes.
I assume you're using something like carefully crafted printk's,
kprobes, or even
Ilpo Järvinen wrote:
...I'm still to figure out why tcp_cwnd_down uses snd_ssthresh/2
as lower bound even though the ssthresh was already halved,
so snd_ssthresh should suffice.
I remember this coming up at least once before, so it's probably worth a
comment in the code. Rate-halving
Ilpo Järvinen wrote:
On Tue, 4 Dec 2007, John Heffner wrote:
Ilpo Järvinen wrote:
...I'm still to figure out why tcp_cwnd_down uses snd_ssthresh/2
as lower bound even though the ssthresh was already halved, so snd_ssthresh
should suffice.
I remember this coming up at least once before, so
there with SYN still queued.
Use of write_seq check guarantees that there's a valid skb in
send_head so I removed the extra check.
Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
Acked-by: John Heffner [EMAIL PROTECTED]
---
net/ipv4/tcp_output.c |7 +--
1 files changed, 1 insertions(+), 6
[EMAIL PROTECTED]
Acked-by: John Heffner [EMAIL PROTECTED]
---
net/ipv4/tcp_output.c | 17 -
1 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 30d6737..ff22ce8 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4
Stephen Hemminger wrote:
Looks like a memory over commit with small machines??
Begin forwarded message:
Date: Fri, 19 Oct 2007 01:35:33 -0700 (PDT)
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bug 9189] New: Oops in kernel 2.6.21-rc4 through 2.6.23, page
allocation failure
[snip]
Ben Greear wrote:
I just tried turning off my explicit SO_SNDBUF/SO_RCVBUF settings in my
app, and the connection ran very poorly through a link with even a small
bit of latency (~2-4ms I believe).
I often run at full gigabit or faster with latencies of 100+ ms. Can
you give a bit more
gigabit machines:
$ ./tcpsend -t10 dew
Sent 1240415312 bytes in 10.033101 seconds
Throughput: 123632294 B/s
-John
/*
* discard.c
* A simple discard server.
*
* Copyright 2003 John Heffner.
*/
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include
Larry McVoy wrote:
On Tue, Oct 02, 2007 at 06:52:54PM +0800, Herbert Xu wrote:
One of my clients also has gigabit so I played around with just that
one and it (itanium running hpux w/ broadcom gigabit) can push the load
as well. One weird thing is that it is dependent on the direction the
data
Larry McVoy wrote:
More data, we've conclusively eliminated the card / cpu from the mix.
We've got 2 ia64 boxes with e1000 interfaces. One box is running
linux 2.6.12 and the other is running hpux 11.
I made sure the linux one was running at gigabit and reran the tests
from the linux/ia64 =
Yes it has this problem. I've observed it in practice on a busy firewall.
-John
Chris Friesen wrote:
Hi all,
We're considering some hardware that uses the sk98lin network hardware,
and we'll be using jumbo frames. Looking at the driver, when using a
9KB MTU it seems like it would end
Stephen Hemminger wrote:
On Fri, 28 Sep 2007 00:08:33 +0200
Eric Dumazet [EMAIL PROTECTED] wrote:
Hi all
I am sure some of you are going to tell me that prequeue is not
all black :)
Thank you
[RFC] Make TCP prequeue configurable
The TCP prequeue thing is based on old facts, and has
Any reason you're overloading tcpi_unacked and tcpi_sacked? It seems
that setting idiag_rqueue and idiag_wqueue are sufficient.
-John
Rick Jones wrote:
Return some useful information such as the maximum listen backlog and the
current listen backlog in the tcp_info structure and have that
Rick Jones wrote:
John Heffner wrote:
Any reason you're overloading tcpi_unacked and tcpi_sacked? It seems
that setting idiag_rqueue and idiag_wqueue are sufficient.
Different fields for different structures. The tcp_info struct doesn't
have the idiag_mumble, so to get the two values
I don't know why the owner field is a (struct sock_iocb *). I'm assuming
it's historical. Can someone check this out? Did I miss some alternate
usage?
These patches are against net-2.6.24.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL
Changes asserts in sunrpc to use sock_owned_by_user() macro instead of
referencing sock_lock.owner directly.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
net/sunrpc/svcsock.c |2 +-
net/sunrpc/xprtsock.c |2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net
-by: John Heffner [EMAIL PROTECTED]
---
include/net/sock.h |7 +++
net/core/sock.c|6 +++---
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 802c670..5ed9fa4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -76,10
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
misc/ss.c |4
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/misc/ss.c b/misc/ss.c
index 5d14f13..d617f6d 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -953,6 +953,10 @@ void *parse_hostcond(char *addr)
memset(a, 0
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
Makefile |3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/Makefile b/Makefile
index af0d5e4..7e4605c 100644
--- a/Makefile
+++ b/Makefile
@@ -29,7 +29,8 @@ LDLIBS += -L../lib -lnetlink -lutil
SUBDIRS=lib ip tc misc
Rick Jones wrote:
Like I said, the consumers of this are a trifle, well,
anxious :)
Just curious, did you or this customer try with F-RTO enabled? Or is
this case you're dealing with truly hopeless?
-John
David Miller wrote:
From: Rick Jones [EMAIL PROTECTED]
Date: Wed, 29 Aug 2007 15:29:03 -0700
David Miller wrote:
None of the research folks want to commit to saying a lower value is
OK, even though it's quite clear that on a local 10 gigabit link a
minimum value of even 200 is absolutely and
John Heffner wrote:
What exactly causes such a huge delay? What is the TCP measured RTO
in these circumstances where spurious RTOs happen and a 3 second
minimum RTO makes things better?
I haven't done a lot of work on wireless myself, but my understanding is
that one of the biggest problems
Stephen Hemminger wrote:
On Wed, 29 Aug 2007 15:28:12 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:
And reading NCR some more, we already have something similar in the
form of Alexey's reordering detection, in fact it handles exactly the
case NCR supposedly deals with. We do not trigger
David Miller wrote:
From: Rick Jones [EMAIL PROTECTED]
Date: Wed, 29 Aug 2007 16:06:27 -0700
I believe the biggest component comes from link-layer retransmissions.
There can also be some short outages thanks to signal blocking,
tunnels, people with big hats and whatnot that the link-layer
OBATA Noboru wrote:
Is it correct that you think my problem can be addressed by either
of the following?
(1) Make the application timeouts longer. (Steve has shown that
making the application timeouts twice the failover detection
timeout would be a solution.)
Right. Is there something
Bill Fink wrote:
Here's the before/after delta of the receiver's netstat -s
statistics for the TSO enabled case:
Ip:
3659898 total packets received
3659898 incoming packets delivered
80050 requests sent out
Tcp:
2 passive connection openings
3659897 segments received
TJ wrote:
Right now Juniper are claiming the issue that brought this to the
surface (the bug linked to in my original post) is a problem with the
implementation of TCP_DEFER_ACCEPT.
My position so far is that the Juniper DX OS is not following the HTTP
standard because it doesn't send a request
TJ wrote:
client SYN -> server LISTENING
client <- SYN ACK server SYN_RECEIVED (time-out 3s)
server: inet_rsk(req)->acked = 1
client ACK -> server (discarded)
client <- SYN ACK (DUP) server (time-out 6s)
client ACK (DUP) -> server (discarded)
client <- SYN ACK (DUP) server (time-out 12s)
That sounds right to me.
-John
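The time-outs in the trace above (3s, 6s, 12s) double on each SYN ACK retransmission. A minimal sketch of that doubling follows; the function name is hypothetical, and the absence of an upper cap is an assumption made only to keep the example short, not a statement about what the kernel does.

```c
#include <assert.h>

/*
 * Timeout in seconds before the n-th SYN ACK retransmission (n = 0, 1, 2, ...).
 * Assumption: plain doubling from initial_secs, with no upper cap.
 */
static unsigned int synack_timeout_secs(unsigned int initial_secs, unsigned int n)
{
    assert(n < 16);  /* keep the shift well inside the range of unsigned int */
    return initial_secs << n;
}
```

With initial_secs = 3, the first three values reproduce the 3s, 6s and 12s time-outs shown in the trace.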
Ilpo Järvinen wrote:
On Mon, 6 Aug 2007, Ilpo Järvinen wrote:
...Goto logic could be cleaner (somebody has any suggestion for better
way to structure it?)
...I could probably move the setting of snd_cwnd earlier to avoid
this problem if this seems a valid
I believe the current calculation is correct. The RFC specifies a
window of no more than 4380 bytes unless 2*MSS > 4380. If you change
the code in this way, then MSS=1461 will give you an initial window of
3*MSS == 4383, violating the spec. Reading the pseudocode in RFC
3390 is a bit
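The bound discussed above can be checked with a few lines of arithmetic. This is only a sketch of the RFC 3390 upper bound, min(4*MSS, max(2*MSS, 4380 bytes)), not the kernel's code; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the initial window in bytes, per RFC 3390:
 *   IW = min(4*MSS, max(2*MSS, 4380))
 */
static uint32_t rfc3390_iw_bytes(uint32_t mss)
{
    assert(mss > 0);
    uint32_t two = 2 * mss;
    uint32_t four = 4 * mss;
    uint32_t upper = two > 4380 ? two : 4380;
    return four < upper ? four : upper;
}

/* Largest whole number of MSS-sized segments that fits within that bound. */
static uint32_t rfc3390_iw_segments(uint32_t mss)
{
    return rfc3390_iw_bytes(mss) / mss;
}
```

For MSS=1460 the bound is 4380 bytes, which is exactly three segments. For MSS=1461 the bound is still 4380 bytes, so only two whole segments fit, since three would be 4383 bytes.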
Rick Jones wrote:
As an aside, tcp_max_ssthresh sounds like the maximum value ssthresh
can take on. Is that correct, or is this more of a "once ssthresh is
above this, behave in this new way"? If that is the case, while the
I don't like it either, but you'll have to talk to Sally Floyd
Benjamin LaHaise wrote:
According to your patch, several packets with fin bit might be sent,
including one with data. If another host does not receive fin
retransmit, then that logic is broken, and it can not be fixed by
duplicating fins, I would even say, that remote box should drop second
Benjamin LaHaise wrote:
On Tue, May 01, 2007 at 09:41:28PM +0400, Evgeniy Polyakov wrote:
Hmm, 2.2 machine in your test seems to behave incorrectly:
I am aware of that. However, I think that the loss of certain packets and
reordering can result in the same behaviour. What's more, is that
This backs out the transport layer MTU checks that don't work. As a
consequence, I had to back out the PMTUDISC_PROBE patch as well. These
patches should fix the problem with ipv6 that the transport layer change
tried to address, and re-implement PMTUDISC_PROBE. I think this
approach is
This reverts commit 87e927a0583bd4a8ba9e97cd75b58d8aa1c76e37.
This idea does not work, as pointed out by Patrick McHardy.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
net/ipv4/ip_output.c |4 +---
net/ipv4/raw.c|8 +++-
net/ipv6/ip6_output.c | 11 +--
net/ipv6
Adds a check in ip6_fragment() mirroring ip_fragment() for packets
that we can't fragment, and sends an ICMP Packet Too Big message
in response.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
net/ipv6/ip6_output.c | 13 +
1 files changed, 13 insertions(+), 0 deletions(-)
diff
This reverts commit d21d2a90b879c0cf159df5944847e6d9833816eb.
Must be backed out because commit 87e927a0583bd4a8ba9e97cd75b58d8aa1c76e37
does not work.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
include/linux/in.h |1 -
include/linux/in6.h |1 -
include/linux/skbuff.h
, like
traceroute/tracepath.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
include/linux/in.h |1 +
include/linux/in6.h |1 +
net/ipv4/ip_output.c | 20 +++-
net/ipv4/ip_sockglue.c |2 +-
net/ipv6/ip6_output.c| 15 ---
net/ipv6
Sorry, forgot the -n flag on git-format-patch. Patches resent with
correct sequence numbers.
Thanks,
-John
Robert Iakobashvili wrote:
Hi John,
On 4/15/07, John Heffner [EMAIL PROTECTED] wrote:
Robert Iakobashvili wrote:
Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
2.6.20.6 do not.
Looking into the tcp /proc entries of 2.6.18.3 versus 2.6.19.5
tcp_rmem and tcp_wmem are the same
Robert Iakobashvili wrote:
Kernels 2.6.19 and 2.6.20 series are effectively broken right now.
Don't you wish to patch them?
I don't know if this qualifies as an unconditional bug. The commit
above was actually a bugfix so that the limits were not higher than
total memory on some systems,
Stephen Hemminger wrote:
A guess: maybe something related to a PAWS wraparound problem.
Does turning off sysctl net.ipv4.tcp_timestamps fix it?
That was my first thought too (aside from netfilter), but a failed PAWS
check should not result in a reset..
-John
---
2.6.18.3   12288  16384  24576
2.6.19.5    3072   4096   6144
Isn't this done deliberately by the patch below:
commit 9e950efa20dc8037c27509666cba6999da9368e8
Author: John Heffner [EMAIL PROTECTED]
Date: Mon Nov 6 23:10:51 2006 -0800
[TCP]: Don't use highmem in tcp hash
Patrick McHardy wrote:
John Heffner wrote:
Check the pmtu check at the transport layer (for UDP, ICMP and raw), and
send a local error if socket is PMTUDISC_DO and packet is too big. This is
actually a pure bugfix for ipv6. For ipv4, it allows us to do pmtu checks
in the same way as for ipv6
---
doc/tracepath.sgml | 13 +
1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/doc/tracepath.sgml b/doc/tracepath.sgml
index 71eaa8d..c0f308b 100644
--- a/doc/tracepath.sgml
+++ b/doc/tracepath.sgml
@@ -15,6 +15,7 @@ traces path to a network host discovering MTU
---
doc/tracepath.sgml |9 +
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/doc/tracepath.sgml b/doc/tracepath.sgml
index c0f308b..1bc83b9 100644
--- a/doc/tracepath.sgml
+++ b/doc/tracepath.sgml
@@ -15,6 +15,7 @@ traces path to a network host discovering MTU along
This fixes a bug that would miss a hop after an ICMP packet too big message,
since it would continue to increase the TTL without probing again.
---
tracepath.c |6 ++
tracepath6.c |6 ++
2 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/tracepath.c b/tracepath.c
index
We should only print the asymm messages in tracepath/6 when you receive a
TTL expired message, because this is the only time when we'd expect the
same number of hops back as our TTL was set to for a symmetric path.
---
tracepath.c | 25 -
tracepath6.c | 25
Document the new IP_PMTUDISC_PROBE value for IP_MTU_DISCOVER. (Going into
2.6.22).
Thanks,
-John
diff -rU3 man-pages-2.43-a/man7/ip.7 man-pages-2.43-b/man7/ip.7
--- man-pages-2.43-a/man7/ip.7 2006-09-26 09:54:29.0 -0400
+++ man-pages-2.43-b/man7/ip.7 2007-03-27 15:46:18.0
Mark Huth wrote:
David Miller wrote:
From: [EMAIL PROTECTED] (David Griego)
Date: Tue, 27 Mar 2007 14:47:54 -0700
Adds an IOCTL for aborting established TCP connections, and is
designed to be an HA performance improvement for cleaning up, failure
notification, and application
John Heffner wrote:
I also believe this is a useful thing to have. I'm not 100% sure this
ioctl is the way to go, but it seems reasonable. This directly
corresponds to writing deleteTcb to the tcpConnectionState variable in
the TCP MIB (RFC 4022). I don't think it constitutes a protocol
These are a few changes to fix/clean up some of the MTU discovery
processing with non-stream sockets, and add a probing mode. See also
matching patches to tracepath to take advantage of this.
-John
Check the pmtu check at the transport layer (for UDP, ICMP and raw), and
send a local error if socket is PMTUDISC_DO and packet is too big. This is
actually a pure bugfix for ipv6. For ipv4, it allows us to do pmtu checks
in the same way as for ipv6.
Signed-off-by: John Heffner [EMAIL PROTECTED
Do fragmentation check in ip_forward, similar to ipv6 forwarding. Also add
a debug printk in the DF check in ip_fragment since we should now never
reach it.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
net/ipv4/ip_forward.c |8
net/ipv4/ip_output.c |2 ++
2 files changed
, like
traceroute/tracepath.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
include/linux/in.h |1 +
include/linux/in6.h |1 +
include/linux/skbuff.h |3 ++-
include/net/ip.h |2 +-
net/core/skbuff.c|2 ++
net/ipv4/ip_output.c | 14
These add some changes that make tracepath a little more useful for
diagnosing MTU issues. The length flag helps distinguish between MTU
black holes and other types of black holes by allowing you to vary the
probe packet lengths. Using PMTUDISC_PROBE gives you the same results
on each run
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
tracepath.c | 10 --
tracepath6.c | 10 --
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/tracepath.c b/tracepath.c
index 1f901ba..a562d88 100644
--- a/tracepath.c
+++ b/tracepath.c
@@ -24,6 +24,10
David Miller wrote:
From: John Heffner [EMAIL PROTECTED]
Date: Wed, 14 Mar 2007 17:25:22 -0400
The current tcp_mem initialization gives values that are really too
small for systems with ~256-768 MB of memory, and also for systems with
larger page sizes (ia64). This patch gives an alternate
just below full_space/2 for a bit then rises
above full_space/2.
I've also attached a corrected version of my earlier patch that I think
solves the problem you noted.
Thanks,
-John
Do full receiver-side SWS avoidance when rcvbuf < mss.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit
David Miller wrote:
From: John Heffner [EMAIL PROTECTED]
Date: Fri, 02 Mar 2007 16:16:39 -0500
Please don't apply the patch I sent. I've been thinking about this a
bit harder, and it may not fix this particular problem. (Hard to say
without knowing exactly what it is.) As the comment above
. :)
I think this attached patch does the correct SWS avoidance.
Thanks,
-John
Do receiver-side SWS avoidance for rcvbuf < MSS.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit 38d33181c93a28cf7fb2f9f3377305a04636c054
tree 503f8a9de6e78694bae9fc2eb1c9dd5d26a0b5ed
parent
David Miller wrote:
From: Alex Sidorenko [EMAIL PROTECTED]
Date: Fri, 2 Mar 2007 15:21:58 -0500
they told us that they use small rcvbuf to throttle bandwidth for this
application. I explained it would be better to use TC for this purpose. They
agreed and will probably redesign their
Document sysctl tcp_moderate_rcvbuf.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit 4c5fd9d3a9ea8b939aed1afda2ac0fc54e3df592
tree c25c2fd01e076fbb7356a8c37d06d2e22c60f263
parent aef8811abbc9249a2bd59bd2331bbe523df05d17
author John Heffner [EMAIL PROTECTED] Mon, 26 Feb 2007 19:44:58
Document sysctl tcp_no_metrics_save.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit 17cb799000caef3b2fed28cc5d0601bb2311efa8
tree c27ccf561065b145bc48d0b8dbbaa3c608015e03
parent 4c5fd9d3a9ea8b939aed1afda2ac0fc54e3df592
author John Heffner [EMAIL PROTECTED] Mon, 26 Feb 2007 19:51:50
Documentation for sysctls tcp_mtu_probing and tcp_base_mss.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit 6da0563572e0a6d0abda9d950f30902844c37862
tree 6f21ae02c11a1340412a926e8e2f568f5ed3b5a8
parent 17cb799000caef3b2fed28cc5d0601bb2311efa8
author John Heffner [EMAIL PROTECTED] Mon
My patch is meant as a replacement for YeAH patch 2/2, not meant to back
it out. You do still need the second hunk below. Sorry 'bout that.
If you're going to apply YeAH patch 2/2 first, you will also need to
remove the declaration of tcp_limited_slow_start() in include/net/tcp.h.
Thanks,
Fix arithmetic order bug in limited slow start. The subtraction needs to be
done before snd_cwnd is incremented.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit 244e7411d99443df7b7ae849ba6ebbec4c2342bc
tree e6d5985a22448f59f8bef393542e1d5497ee5684
parent
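The ordering issue named in this commit message can be illustrated with a small user-space model. This is a sketch only, not the kernel code from the patch (which is truncated here): the struct, the function name and the packet-counting scheme are assumptions, and the increase rule follows the RFC 3742 idea of growing by roughly max_ssthresh/2 packets per round trip once cwnd exceeds max_ssthresh.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; counts are in packets. */
struct cwnd_state {
    uint32_t snd_cwnd;      /* congestion window */
    uint32_t snd_cwnd_cnt;  /* credit accumulated toward the next increase */
};

/* Process one ACK during slow start, with an RFC 3742-style limit.
 * max_ssthresh == 0 disables the limit (plain slow start). */
static void limited_slow_start_ack(struct cwnd_state *s, uint32_t max_ssthresh)
{
    assert(s->snd_cwnd > 0);

    uint32_t credit = s->snd_cwnd;       /* plain slow start: +1 packet per ACK */
    if (max_ssthresh > 0 && s->snd_cwnd > max_ssthresh)
        credit = max_ssthresh / 2;       /* limited: +max_ssthresh/2 per RTT */

    s->snd_cwnd_cnt += credit;
    while (s->snd_cwnd_cnt >= s->snd_cwnd) {
        /* Subtract the old window before incrementing it; doing it in the
         * other order would charge one packet too many per increase. */
        s->snd_cwnd_cnt -= s->snd_cwnd;
        s->snd_cwnd++;
    }
}
```

In this model, with snd_cwnd = 200 and max_ssthresh = 100, each ACK adds 50 to the counter, so the window grows by one packet every four ACKs.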
Ilpo Järvinen wrote:
BTW, while looking at this patch, I noticed that snd_cwnd_clamp is only u16
while snd_cwnd is u32, which seems rather strange since snd_cwnd is being
limited by the clamp value here and there?!?! And tcp_highspeed.c is
clearly assuming even more than this (but the problem is
control
* This is special case used for fallback as well.
Add RFC3742 Limited Slow-Start, controlled by variable sysctl_tcp_max_ssthresh.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit 97033fa201705e6cfc68ce66f34ede3277c3d645
tree 5df4607728abce93aa05b31015a90f2ce369abff
parent
Angelo P. Castellani wrote:
John Heffner wrote:
Note the patch is compile-tested only! I can do some real testing if
you'd like to apply this Dave.
The date you read on the patch is due to the fact I've split this
patchset into 2 diff files. This isn't compile-tested only, I've used
This isn't really a reply to anyone in particular, but I wanted to touch
on a few points.
Reno. As Windows decided to go with Compound TCP, why would we want to
go back to an '80s algorithm?
It's worth noting that Microsoft is not using Compound TCP by default,
except in Beta versions so they can get
to FIN packets that contain data.
---
commit af319609eee705e0791a1a58c33b216e8d0254bf
tree 5a1afcc506e09f5adfd74efb7e0cbbc82ec4d5b0
parent c0d4d573feed199b16094c072e7cb07afb01c598
author John Heffner [EMAIL PROTECTED] Mon, 05 Feb 2007 16:25:46 -0500
committer John Heffner [EMAIL PROTECTED] Mon, 05
David Miller wrote:
From: John Heffner [EMAIL PROTECTED]
Date: Mon, 05 Feb 2007 16:58:18 -0500
This is especially important with TSO enabled. Currently, it will send
a burst of up to 64k at the end of a connection, even when cwnd is much
smaller than 64k. This patch still lets out empty FIN
-off-by: John Heffner [EMAIL PROTECTED]
---
commit 89de0d8cb75958b0315c076b31a597143e30f7a4
tree 7e9c321e62729c6ef76e3886fe9edf2ac78a680c
parent c0d4d573feed199b16094c072e7cb07afb01c598
author John Heffner [EMAIL PROTECTED] Mon, 05 Feb 2007 18:42:31 -0500
committer John Heffner [EMAIL PROTECTED] Mon
Rick Jones wrote:
John Heffner wrote:
David Miller wrote:
However, I can't think of any reason why the cwnd test should not
apply.
Care to elaborate here? You can view the FIN special case as an off
by one error in the CWND test, it's not going to melt the internet.
:-)
True, it's
David Miller wrote:
However, I wonder if we want to set this differently than the way this
patch does it. Depending on how far off the memory size is from a power
of two (exactly equal to a power of two is the worst case), and if total
memory 128M, it can be substantially less than 3/4.
systems).
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit d4ef8c8245c0a033622ce9ba9e25d379475254f6
tree 5377b8af0bac3b92161188e7369a84e472b5acb2
parent ea55b7c31b47edf90132baea9a088da3bbe2bb5c
author John Heffner [EMAIL PROTECTED] Tue, 14 Nov 2006 14:53:27 -0500
committer John Heffner [EMAIL
David Miller wrote:
If we don't ACK every two segments, stacks which grow the congestion
window based upon packet counting will not grow the congestion window
properly when they are sending smaller than MSS sized segments.
The only stack I know of that does this currently is linux, and in
David Miller wrote:
From: John Heffner [EMAIL PROTECTED]
Date: Tue, 07 Nov 2006 16:50:33 -0500
The only stack I know of that does this currently is linux, and in doing
so does not conform to the spec. ;) Sending to a BSD receiver will
result in the same behavior, so the right place to fix
This patch removes consideration of high memory when determining TCP hash
table sizes. Taking into account high memory results in tcp_mem values that
are too large.
Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit ea55b7c31b47edf90132baea9a088da3bbe2bb5c
tree
I think unfair is a difficult word. Unfair to what? It's true that
Scalable TCP is unfair to itself in that flows with unequal shares do
not converge, but it's not clear what its interactions are with other
congestion control algorithms. It's not clear to me that it's
significantly more
My reservation in doing this would be that as an administrator, I may
want to choose exactly what congestion control is available at any
given time. The different congestion control algorithms are not
necessarily fair to each other.
If the modules are autoloaded, I could still enforce this
Hagen Paul Pfeifer wrote:
* John Heffner | 2006-10-26 13:29:26 [-0400]:
My reservation in doing this would be that as an administrator, I may
want to choose exactly what congestion control is available at any
given time. The different congestion control algorithms are not
necessarily fair
David Miller wrote:
From: John Heffner [EMAIL PROTECTED]
Date: Tue, 17 Oct 2006 00:18:33 -0400
Stephen Hemminger wrote:
On Mon, 16 Oct 2006 20:53:20 -0400 (EDT)
John Heffner [EMAIL PROTECTED] wrote:
This patch limits the amount of time you will defer sending a TSO segment
to less than two
-- Forwarded message --
Date: Mon, 16 Oct 2006 15:55:53 -0400 (EDT)
From: John Heffner [EMAIL PROTECTED]
To: David Miller [EMAIL PROTECTED]
Cc: netdev netdev@vger.kernel.org
Subject: [PATCH] Bound TSO defer time
This patch limits the amount of time you will defer sending a TSO segment
Here is a corrected patch.
Signed-off-by: John Heffner [EMAIL PROTECTED]
-static u32 tcp_usrtt(const struct sk_buff *skb)
+static u32 tcp_usrtt(struct timeval *tv)
{
- struct timeval tv, now;
+ struct timeval now;
do_gettimeofday(&now);
- skb_get_timestamp(skb, &tv
-off-by: John Heffner [EMAIL PROTECTED]
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b5521a9..d0f6bd6 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2228,13 +2228,12 @@ static int tcp_tso_acked(struct sock *sk
return acked;
}
-static u32 tcp_usrtt(const
Okay, this patch is junk (never trust compile-tested code). Will send
something better soon.
-John
John Heffner wrote:
About commit 2d2abbab63f6726a147ae61ada39bf2c9ee0db9a:
It looks like this patch bypassed the enforcement of Karn's algorithm in
tcp_ack_no_tstamp() for the purposes
David Miller wrote:
John, have a look at this code in tcp_write_timeout():
mss = min(sysctl_tcp_base_mss,
tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low)/2);
mss = max(mss, 68 - tp->tcp_header_len);
That first line looks like it should be