Re: TCP Reassembly Issues

2011-12-06 Thread Johannes Totz

On 26/11/2011 05:23, Lawrence Stewart wrote:

On 11/25/11 13:01, Lawrence Stewart wrote:

On 11/24/11 18:02, Kris Bauer wrote:

Hello,

I am currently experiencing an issue with FreeBSD 9.0-RC2 r227852 where the
net.inet.tcp.reass.cursegments value is constantly increasing (and not
decreasing when there is nominal traffic with the box). It is causing TCP
slowdowns as described in kern/155407:

Exhausted net.inet.tcp.reass.maxsegments block recovering tcp session (for
this socket and any other socket waiting for retransmitted packets). After
exhausted net.inet.tcp.reass.maxsegments allocation new entry in tcp_reass
failed (for this socket and any other socket waiting for retransmitted
packets).

I have increased the reass.maxsegments value to 16384 to temporarily avoid
the problem, but the cursegments number keeps rising and it seems it will
occur again.

Is this an issue that anyone else has seen? I can provide more information
if need be.

Thanks Kris, Raul and Stefan for the reports, I'll look into this.

I think I've got it - a stupid 1 line logic bug. My apologies for
missing it when I reviewed the patch which introduced the bug (patch was
committed to head as r226113, MFCed to stable/9 as r226228).

Due to some miscommunication, the initial patch was committed to and
MFCed from head much later than it should have been in the 9.0 release
cycle and instead of being included in the BETAs, didn't make it in
until 9.0-RC1 I believe i.e. only RC1 and RC2 should be experiencing the
issue.

Could those who have reported the bug and are able to recompile their
kernel to test a patch please try the following and report back to the
list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch

The patch is against head r227986 but will apply and work correctly for
9.0 as well.

Just a me-too. Patch applied cleanly and is working fine.

Hehe... and I was blaming the Linux box at the other end of the
connection :)

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: TCP Reassembly Issues

2011-11-26 Thread Jeremy Chadwick

On Fri, Nov 25, 2011 at 11:56:47PM -0800, Jeremy Chadwick wrote:
On Sat, Nov 26, 2011 at 12:49:24AM -0600, Kris Bauer wrote:
On Fri, Nov 25, 2011 at 11:23 PM, Lawrence Stewart lstew...@freebsd.org wrote:

On 11/25/11 13:01, Lawrence Stewart wrote:

On 11/24/11 18:02, Kris Bauer wrote:

Hello,

I have increased the reass.maxsegments value to 16384 to temporarily avoid
the problem, but the cursegments number keeps rising and it seems it will
occur again.

Is this an issue that anyone else has seen? I can provide more information
if need be.

Thanks Kris, Raul and Stefan for the reports, I'll look into this.

I think I've got it - a stupid 1 line logic bug. My apologies for missing
it when I reviewed the patch which introduced the bug (patch was committed
to head as r226113, MFCed to stable/9 as r226228).

Due to some miscommunication, the initial patch was committed to and MFCed
from head much later than it should have been in the 9.0 release cycle and
instead of being included in the BETAs, didn't make it in until 9.0-RC1 I
believe i.e. only RC1 and RC2 should be experiencing the issue.

Could those who have reported the bug and are able to recompile their
kernel to test a patch please try the following and report back to the
list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch

The patch is against head r227986 but will apply and work correctly for
9.0 as well.

Cheers,
Lawrence

I have patched, recompiled, and rebooted. net.inet.tcp.reass.cursegments
is no longer incrementing, and connectivity is holding steady. If anything
changes over the next couple of hours, I'll be sure to report it; but all
preliminary signs of the problem are gone.

Thanks for all the help!

Let's not be hasty in concluding everything is fixed. Why I'm a bit on
edge about this: I took the time to find the CVS commits that induced
this issue in the first place, and it seems there is some history.

The commit that caused this problem to begin with was supposedly a fix
for a different problem:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.375

A week later, that commit went from HEAD/MAIN into RELENG_9:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.374.2.2

Be sure to read the description of the problem that was being fixed in
the first place. I've also CC'd the original problem reporter, Steven
Hartland, because we're going to need him to try the above patch from
Lawrence to make sure there aren't other problems. Meaning: for all we
know, the above fix might work great for Kris but cause problems for
Steve.

This entire situation leads me to believe very few people are doing
quality testing of RELENG_9, yet we're already into 9.0-RC2. Please
don't tell me "that's exactly why you should be running RELENG_9!"; that
is completely backwards and I refuse to get into a flame war about it,
because it's this simple: 90%+ of those running FreeBSD on servers need
something that's stable; we can't risk wonkiness (especially of this
degree!) on systems taking production traffic. Did no one actually test
the change *thoroughly*? Imagine if this had lain dormant until 9.0-RELEASE.

Lawrence: please don't take my comments personally or to mean "you broke
it and caused this mess!" It's meant to read more along the lines of
"you committed a fix for something that broke other bits badly, but
nobody noticed this, including the original reporter of a different
problem? How/why?" You get the idea.

Re-sending, because the "Tested by" commit line had someone who replaced
the "@" character with "-at-", so my mail client assumed the email
address was on my local machine. Sorry about that, folks.


Re: TCP Reassembly Issues [SOLVED?]

2011-11-26 Thread Raul


On 26/11/2011 6:23, Lawrence Stewart wrote:

...

kernel to test a patch please try the following and report back to the
list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch


The patch is against head r227986 but will apply and work correctly for
9.0 as well.


Cleanly applied against RELENG_9_0.

As my case was not exactly the same as Kris's or Stefan's, I'd wait for
their feedback, but as far as I'm concerned, *it works perfectly!*


[]
%sysctl kern.version | head -n1
kern.version: FreeBSD 9.0-RC2 #1: Sat Nov 26 10:24:38 CET 2011

%uptime
12:06PM  up  1:30, 3 users, load averages: 0,07 0,08 0,10

%vmstat -z | head -n1 ; vmstat -z | grep reass
ITEM        SIZE   LIMIT   USED   FREE      REQ  FAIL  SLEEP
tcpreass:     40,   1680,    58,  1370,  276624,    0,     0

%sysctl net.inet.tcp.reass
net.inet.tcp.reass.overflows: 5
net.inet.tcp.reass.cursegments: 17
net.inet.tcp.reass.maxsegments: 1680

%netstat -s -p tcp | grep mem
5 discarded due to memory problems
[]

I'll leave the box stressing the tcp stack a couple of days, just in case.

Thanks a lot.

Regards,
Raúl.

Re: TCP Reassembly Issues [SOLVED?]

2011-11-26 Thread Kris Bauer

After 5 hours and a few gigs of traffic, things have been fine:
# sysctl net.inet.tcp.reass
net.inet.tcp.reass.overflows: 155
net.inet.tcp.reass.cursegments: 0
net.inet.tcp.reass.maxsegments: 4116

Kris

Re: TCP Reassembly Issues

2011-11-26 Thread Stefan Bethke

 I think I've got it - a stupid 1 line logic bug. My apologies for missing it 
 when I reviewed the patch which introduced the bug (patch was committed to 
 head as r226113, MFCed to stable/9 as r226228).
 
 Due to some miscommunication, the initial patch was committed to and MFCed 
 from head much later than it should have been in the 9.0 release cycle and 
 instead of being included in the BETAs, didn't make it in until 9.0-RC1 I 
 believe i.e. only RC1 and RC2 should be experiencing the issue.
 
 Could those who have reported the bug and are able to recompile their kernel 
 to test a patch please try the following and report back to the list:
 
 http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
 
 The patch is against head r227986 but will apply and work correctly for 9.0 
 as well.

I'm a happy camper!


Thanks,
Stefan

-- 
Stefan Bethke s...@lassitu.de   Fon +49 151 14070811




Re: TCP Reassembly Issues

2011-11-26 Thread George Mitchell


On 11/26/11 00:23, Lawrence Stewart wrote:

[...]
Could those who have reported the bug and are able to recompile their
kernel to test a patch please try the following and report back to the
list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
[...]

Works for me!  I'm now getting a sustained throughput of 7.4MB/s,
compared to 4.3MB/s on 8.2-STABLE and 3.2MB/s on 7.4-RELEASE, all on
the same hardware (HP notebook with re 100Mb/s interface, reading from
an 8.2-STABLE server with an alc 1000Mb/s interface, via two gigabit
switches).

But I'm still bemused that there should have been any TCP reassembly
going on.  Doesn't that imply that there was packet fragmentation?  My
network is uniformly 1500 byte MTU. -- George

Re: TCP Reassembly Issues

2011-11-26 Thread George Mitchell


On 11/26/11 02:56, Jeremy Chadwick wrote:

[...]
This entire situation leads me to believe very few people are doing
quality testing of RELENG_9, yet we're already into 9.0-RC2.  Please
don't tell me "that's exactly why you should be running RELENG_9!"; that
is completely backwards and I refuse to get into a flame war about it,
because it's this simple: 90%+ of those running FreeBSD on servers need
something that's stable; we can't risk wonkiness (especially of this
degree!) on systems taking production traffic.  Did no one actually test
the change *thoroughly*?  Imagine if this had lain dormant until 9.0-RELEASE.
[...]


PPPP  L     U   U  SSSS    OOO  N   N EEEEE
P   P L     U   U S       O   O NN  N E
PPPP  L     U   U  SSS    O   O N N N EEE
P     L     U   U     S   O   O N  NN E
P     LLLLL  UUU  SSSS     OOO  N   N EEEEE

I didn't get a warm, fuzzy feeling about FreeBSD 7 until 7.1, and
FreeBSD 8 was worse -- no warm, fuzzy feeling until 8.2.  And I am still
not sold on SCHED_ULE:  Start as many compute-bound programs as there
are CPUs, and prepare for poor (to put it kindly) interactive response.
That's not everybody's usage pattern, but it seems plausible enough to
me to preclude SCHED_ULE as the default scheduler until it is fixed.

On the good side, I'm pleased with the new 9.0 boot menu, and I'm very
happy that the ahci driver automatically creates symbolic links to the
old device names for its disks.  I like that I don't have to tab to the
Okay button in configuration dialogs any more (though I was surprised
the first time it happened).

But I hope this gets fixed for my flash card reader/writer:

ugen0.5: <vendor 0x05e3> at usbus0
umass0: <vendor 0x05e3 USB TO IDE, class 0/0, rev 2.00/0.32, addr 5> on usbus0
umass0:  SCSI over Bulk-Only; quirks = 0x4101
umass0:2:0:-1: Attached to scbus2
(probe0:umass-sim0:0:0:0): AutoSense failed
(da0:umass-sim0:0:0:0): got CAM status 0x4
(da0:umass-sim0:0:0:0): fatal error, failed to attach to device
(da0:umass-sim0:0:0:0): lost device - 0 outstanding
(da0:umass-sim0:0:0:0): removing device entry

-- George Mitchell

Re: TCP Reassembly Issues

2011-11-26 Thread Lawrence Stewart

Hi Jeremy,

On 11/26/11 18:56, Jeremy Chadwick wrote:

On Sat, Nov 26, 2011 at 12:49:24AM -0600, Kris Bauer wrote:

On Fri, Nov 25, 2011 at 11:23 PM, Lawrence Stewart lstew...@freebsd.org wrote:

On 11/25/11 13:01, Lawrence Stewart wrote:

[snip]

Thanks Kris, Raul and Stefan for the reports, I'll look into this.

I think I've got it - a stupid 1 line logic bug. My apologies for missing
it when I reviewed the patch which introduced the bug (patch was committed
to head as r226113, MFCed to stable/9 as r226228).

Due to some miscommunication, the initial patch was committed to and MFCed
from head much later than it should have been in the 9.0 release cycle and
instead of being included in the BETAs, didn't make it in until 9.0-RC1 I
believe i.e. only RC1 and RC2 should be experiencing the issue.

Could those who have reported the bug and are able to recompile their
kernel to test a patch please try the following and report back to the list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch

The patch is against head r227986 but will apply and work correctly for
9.0 as well.

Cheers,
Lawrence

Thanks for all the help!

The commit that caused this problem to begin with was supposedly a fix
for a different problem:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.375

The original patch you reference (equivalent to svn r226113 as noted in
my previous email) was indeed for a separate problem. Unfortunately the
fix introduced a new problem.

A week later, that commit went from HEAD/MAIN into RELENG_9:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.374.2.2

Even though my patch is a multi-line diff, it only effectively changes
one thing - that the te == NULL condition must be true for both the
case where th->th_seq != tp->rcv_nxt (current segment does not plug
the hole) and where they are equal (current segment does plug the hole;
a new case introduced in r226113). I can say with confidence based on
the change in the logic that my patch is not a regression as far as
Steven's original bug report is concerned.
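
For readers without the diff to hand, the control flow described above can be modelled as a small decision function. This is only a Python sketch of the description in this message, not the kernel code: the function names, the argument names and the "fallback" label are all made up for illustration, and what the kernel actually does in the fallback case is not stated in this thread.

```python
def pick_queue_entry(zone_alloc_ok, plugs_hole):
    """Patched logic as described: te == NULL gates both cases.

    zone_alloc_ok: True if the zone allocation succeeded (te != NULL).
    plugs_hole:    True if th->th_seq == tp->rcv_nxt.
    """
    if not zone_alloc_ok:
        if not plugs_hole:
            return "drop"        # no entry available, segment is not the missing one
        return "fallback"        # hypothetical label for the r226113 special case
    return "zone"                # allocation succeeded, so use it in every case


def pick_queue_entry_unguarded(zone_alloc_ok, plugs_hole):
    """Same decision with te == NULL checked only for the 'drop' case."""
    if not zone_alloc_ok and not plugs_hole:
        return "drop"
    if plugs_hole:
        return "fallback"        # taken even when the zone allocation succeeded
    return "zone"


if __name__ == "__main__":
    # Print both decisions side by side for all four input combinations.
    for ok in (True, False):
        for plugs in (True, False):
            print(ok, plugs,
                  pick_queue_entry(ok, plugs),
                  pick_queue_entry_unguarded(ok, plugs))
```

The two functions differ only when the allocation succeeds and the segment plugs the hole, which is the single case the patch is said to change.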

The latter half is fair criticism, more comments below. The fact we're
having this discussion now prior to 9.0 being released somewhat negates
the assertion in the former part of your paragraph.

Lawrence: please don't take my comments personally or to mean "you broke
it and caused this mess!" It's meant to read more along the lines of

All good, not taken personally.

"you committed a fix for something that broke other bits badly, but
nobody noticed this, including the original reporter of a different
problem? How/why?" You get the idea.

Your concerns are valid.

To clarify, I did not propose or commit the patch which introduced this
bug (r226113). Generally speaking, it is a committer's responsibility to
ensure that a patch which they commit has been sufficiently tested prior
to commit.

Normally the committer will solicit testing from the original problem
reporter and do some testing themselves. I believe Steven tested Andre's
patch and reported to the mailing list that it resolved his immediate
problem. I was not privy to any other testing conducted by Andre, so
can't comment further on that.

As to how this could have been missed: TCP is impressively robust,
capable of working even when it has both arms tied behind its back and
is missing a leg. It may not work well, but will limp along all the
same. People tend to notice and report scenarios where something is
definitively broken far more

Re: TCP Reassembly Issues

2011-11-26 Thread Lawrence Stewart

Hi George,

On 11/27/11 03:16, George Mitchell wrote:

On 11/26/11 00:23, Lawrence Stewart wrote:

[...]
Could those who have reported the bug and are able to recompile their
kernel to test a patch please try the following and report back to the
list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch

[...]

Works for me! I'm now getting a sustained throughput of 7.4MB/s,
compared to 4.3MB/s on 8.2-STABLE and 3.2MB/s on 7.4-RELEASE, all on
the same hardware (HP notebook with re 100Mb/s interface, reading from
an 8.2-STABLE server with an alc 1000Mb/s interface, via two gigabit
switches).

Good stuff.

But I'm still bemused that there should have been any TCP reassembly
going on. Doesn't that imply that there was packet fragmentation? My
network is uniformly 1500 byte MTU. -- George

TCP reassembly refers to queuing packets received out of order until the
missing segment is received i.e. not IP layer fragmentation related, but
packet loss or packet reordering related.
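
To make the distinction concrete, here is a toy receiver-side reassembly queue in Python. It is a simplified illustration, not how the FreeBSD stack is written: the class name, the `max_segments` limit and the `overflows` counter are assumptions loosely named after the sysctls in this thread, and partial overlaps between segments are ignored.

```python
class ReassemblyQueue:
    """Toy queue holding out-of-order segments, keyed by sequence number."""

    def __init__(self, rcv_nxt=0, max_segments=4):
        self.rcv_nxt = rcv_nxt            # next in-order byte expected
        self.max_segments = max_segments  # hypothetical cap on queued segments
        self.pending = {}                 # seq -> payload, waiting for a gap to fill
        self.overflows = 0                # segments dropped because the queue was full

    def receive(self, seq, payload):
        """Accept one segment and return the bytes that become deliverable."""
        if seq != self.rcv_nxt:
            if seq < self.rcv_nxt or seq in self.pending:
                return b""                # old or duplicate data, nothing to do
            if len(self.pending) >= self.max_segments:
                self.overflows += 1       # queue full, the segment is dropped
                return b""
            self.pending[seq] = payload   # hold it until the hole before it is plugged
            return b""
        # In-order segment: deliver it, then drain queued segments that now fit.
        delivered = payload
        self.rcv_nxt += len(payload)
        while self.rcv_nxt in self.pending:
            chunk = self.pending.pop(self.rcv_nxt)
            delivered += chunk
            self.rcv_nxt += len(chunk)
        return delivered


if __name__ == "__main__":
    q = ReassemblyQueue()
    print(q.receive(5, b"world"), len(q.pending))   # segment arrives early and is queued
    print(q.receive(0, b"hello"), len(q.pending))   # missing segment arrives, queue drains
```

In this model `len(q.pending)` plays the role of a per-connection segment count: it goes up while a hole exists and returns to zero once the missing segment arrives, with no IP fragmentation involved at any point.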

I guess something in your setup is dropping the odd packet which is why
your NFS performance isn't closer to the 10+MB/s (I'm not sure how much
overhead NFS adds, but ~12MB/s is max application-layer throughput of
100Mbps Ethernet so achievable NFS throughput should be a bit less than
that) it could be if everything was peachy.
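
The ~12MB/s figure can be checked with a little arithmetic. The Python sketch below assumes full-size frames on a 1500 byte MTU, IPv4 and TCP headers of 20 bytes each, and standard Ethernet framing overhead; NFS and RPC overhead are not modelled, and the constant and function names are chosen for this example only.

```python
LINK_BPS = 100_000_000           # 100 Mbps Ethernet
MTU = 1500                       # bytes, as on George's network
ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + inter-frame gap
IP_HDR = 20                      # IPv4 without options (assumption)
TCP_HDR = 20                     # TCP without options (assumption)


def max_tcp_payload_rate(link_bps=LINK_BPS, mtu=MTU, tcp_options=0):
    """Upper bound on TCP payload bytes per second using full-size frames."""
    payload = mtu - IP_HDR - TCP_HDR - tcp_options
    wire_bytes = mtu + ETH_OVERHEAD
    return link_bps / 8 * payload / wire_bytes


if __name__ == "__main__":
    print(f"raw link rate:         {LINK_BPS / 8 / 1e6:.2f} MB/s")
    print(f"TCP payload, no opts:  {max_tcp_payload_rate() / 1e6:.2f} MB/s")
    print(f"TCP payload, 12B opts: {max_tcp_payload_rate(tcp_options=12) / 1e6:.2f} MB/s")
```

Under these assumptions the bound comes out a little under 12 MB/s, so a sustained 7.4MB/s leaves visible headroom even before NFS overhead is taken into account.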

siftr(4) and some tcpdumping on both client/server could probably help
you figure out where you're dropping packets if you want to improve your
current performance even further.

Cheers,
Lawrence

Re: TCP Reassembly Issues

2011-11-26 Thread kerbzo

Hi,

This patch works for me as well.
The reass counter no longer increases (tcpreass: 40, 1680, 21, 399,
572562, 0, 0), and a severe network performance issue with netatalk and
afpd, used as a Time Capsule server for Mac OS X, now seems to have
disappeared.
Thank you very much,

best regards,


Re: TCP Reassembly Issues

2011-11-26 Thread Lawrence Stewart

On 11/26/11 16:23, Lawrence Stewart wrote:

On 11/25/11 13:01, Lawrence Stewart wrote:

On 11/24/11 18:02, Kris Bauer wrote:

Hello,

I have increased the reass.maxsegments value to 16384 to temporarily avoid
the problem, but the cursegments number keeps rising and it seems it will
occur again.

Is this an issue that anyone else has seen? I can provide more information
if need be.

Thanks Kris, Raul and Stefan for the reports, I'll look into this.

I think I've got it - a stupid 1 line logic bug. My apologies for
missing it when I reviewed the patch which introduced the bug (patch was
committed to head as r226113, MFCed to stable/9 as r226228).

Could those who have reported the bug and are able to recompile their
kernel to test a patch please try the following and report back to the
list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch

The patch is against head r227986 but will apply and work correctly for
9.0 as well.

Thanks to all for the reports and testing. I committed the patch to head
(http://svn.freebsd.org/changeset/base/228016) and it will be MFCed to 9
soon pending feedback from the release engineering team.

Cheers,
Lawrence

Re: TCP Reassembly Issues

2011-11-25 Thread Stefan Bethke

On 25.11.2011 at 00:35, Adrian Chadd wrote:

 Have you tried disabling the tcp offload features of your NIC?


I'm using my em0 as a VLAN trunk, and I'm under the impression that that
disables all the hardware assists in the controller. Also, the LAN vlan is
bridged via OpenVPN and tap, making the whole bunch promiscuous, which I
believe also forces off the acceleration.

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether 00:1c:c0:7d:8c:50
        inet6 fe80::21c:c0ff:fe7d:8c50%em0 prefixlen 64 scopeid 0x1
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:00:00:00:00:01
        inet6 2001:470:1f0b:1064::1 prefixlen 64
        inet 44.128.65.1 netmask 0xffffffc0 broadcast 44.128.65.63
        inet6 fe80::21c:c0ff:fe7d:8c50%bridge0 prefixlen 64 scopeid 0xd
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vlan1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 15 priority 128 path cost 55
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 14 priority 128 path cost 200
vlan1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=3<RXCSUM,TXCSUM>
        ether 00:1c:c0:7d:8c:50
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        vlan: 1 parent interface: em0

em0@pci0:0:25:0:        class=0x020000 card=0x50038086 chip=0x10cd8086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82567LF-2 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 09[e0] = vendor (length 6) Intel cap 2 version 0


-- 
Stefan Bethke s...@lassitu.de   Fon +49 151 14070811




Re: TCP Reassembly Issues

2011-11-25 Thread Raul


On 24/11/2011 23:06, Stefan Bethke wrote:

[]
 I regularly copy large files off my Tivo trans-atlantic (125ms RTT),
 and TCP connections currently stall after about 500 megs, never
 recovering.  I suspect this is connected, as it started immediately
 after upgrading the machine to 9-stable.

I've not seen connections fail to recover or stall completely (mpd
tcpmssfix related?).


What I see is a normal start, a normal bandwidth increase, peak
performance using all available bandwidth, and after that bandwidth drops
to an 'unreasonable' level and stays there for most of the transfer.


Numbers always depend on too many factors, but to illustrate how
dramatic it is:


[]
%ping -c100 XX.au
PING XX.au (136.186.XX.XX): 56 data bytes
...
100 packets transmitted, 100 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 352.036/354.258/374.731/3.593 ms
[]

Downloading an ISO image from that host by FTP (wget), the transfer peaks
at about 1.4MBytes/sec before falling to 2,9?KBytes/sec, where most of the
transfer happens.


Please note, these numbers come from a PPPoE link (DSL) established by
mpd55 with *'tcpmsswilink'*:


[]
%cat /usr/local/etc/mpd5/mpd.conf | grep fix
set iface enable tcpmssfix
[]

I hope that sheds some light.

Regards,
Raúl.

Re: TCP Reassembly Issues

2011-11-25 Thread George Mitchell


On 11/24/11 21:00, Jeremy Chadwick wrote:

[...]
If none of this solves the problem, then I consider this a priority 0
blocker (read: all hands on deck) issue with the IP stack in FreeBSD
9.x and will need immediate attention.

I would strongly recommend a developer or clueful end-user begin
tracking down who committed all of these bits and CC them into the
thread.  I would start by looking who implemented the
net.inet.tcp.reass.cursegments sysctl, because that isn't in RELENG_8 at
all.



I've tried out the 9.0 release candidates, and what I notice is that for
a few minutes after the system starts, I get wonderful NFS read
throughput (7+ MB/s over a 100 megabit interface) -- more than twice as
fast as 7.n or 8.n on the same hardware -- quickly degrading to abysmal
(less than 0.5 MB/s).  Is this possibly related to the problem under
discussion?  -- George Mitchell

P.S. A lot of other 9.0 features look very nice indeed!

Re: TCP Reassembly Issues

2011-11-25 Thread Rick Macklem

George Mitchell wrote:
 On 11/24/11 21:00, Jeremy Chadwick wrote:
 [...]
  If none of this solves the problem, then I consider this a priority 0
  blocker (read: all hands on deck) issue with the IP stack in FreeBSD
  9.x and will need immediate attention.
 
  I would strongly recommend a developer or clueful end-user begin
  tracking down who committed all of these bits and CC them into the
  thread. I would start by looking who implemented the
  net.inet.tcp.reass.cursegments sysctl, because that isn't in RELENG_8
  at all.
 
 
 I've tried out the 9.0 release candidates, and what I notice is that for
 a few minutes after the system starts, I get wonderful NFS read
 throughput (7+ MB/s over a 100 megabit interface) -- more than twice as
 fast as 7.n or 8.n on the same hardware -- quickly degrading to abysmal
 (less than 0.5 MB/s). Is this possibly related to the problem under
 discussion? -- George Mitchell
 
Well, when I've seen NFS perf. degrade like this, it has usually been
related to RPC transport (and TCP is the default for 9.0).

Just from reading some of the thread, it sounds like this problem will
cause the FAIL count (the last #) that vmstat -z shows for tcpreass to
increase and/or net.inet.tcp.reass.cursegments to climb to
net.inet.tcp.reass.maxsegments.

I'd suggest that, after the NFS perf has degraded, you run:
# vmstat -z | fgrep tcpreass
- and see how big the last # is
# sysctl -a | fgrep reass
- and see how cursegments compares with maxsegments
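
The two checks above can also be scripted. The following Python sketch parses text in the format posted earlier in this thread and applies the same two comparisons; the helper names and the final heuristic are assumptions made for this example, and the sample strings are copied from Raul's post-patch output. FAIL is looked up by column name because it is not the last column in that output.

```python
DEFAULT_COLUMNS = ["SIZE", "LIMIT", "USED", "FREE", "REQ", "FAIL", "SLEEP"]


def parse_vmstat_zone(text, zone="tcpreass"):
    """Return {column: int} for one zone in `vmstat -z` output, or None."""
    columns = DEFAULT_COLUMNS
    for line in text.splitlines():
        fields = line.replace(":", " ").replace(",", " ").split()
        if not fields:
            continue
        if fields[0] == "ITEM":
            columns = fields[1:]          # trust the header line when it is present
        elif fields[0] == zone:
            return dict(zip(columns, map(int, fields[1:])))
    return None


def parse_sysctl(text, prefix="net.inet.tcp.reass."):
    """Return {short_name: int} for sysctl lines that start with prefix."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if sep and name.startswith(prefix):
            values[name[len(prefix):]] = int(value)
    return values


def looks_like_reass_leak(zone, reass):
    """Heuristic only: zone failures, or cursegments has reached maxsegments."""
    return zone.get("FAIL", 0) > 0 or reass["cursegments"] >= reass["maxsegments"]


VMSTAT_SAMPLE = """ITEM   SIZE  LIMIT USED FREE  REQ FAIL SLEEP
tcpreass:40,   1680,  58,1370,  276624,   0,   0"""

SYSCTL_SAMPLE = """net.inet.tcp.reass.overflows: 5
net.inet.tcp.reass.cursegments: 17
net.inet.tcp.reass.maxsegments: 1680"""

if __name__ == "__main__":
    zone = parse_vmstat_zone(VMSTAT_SAMPLE)
    reass = parse_sysctl(SYSCTL_SAMPLE)
    print(zone, reass, looks_like_reass_leak(zone, reass))
```

On a live system the two input strings would come from running the commands listed above, for instance through `subprocess.run`, rather than from the embedded samples.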

If these don't indicate that it is the TCP Reassembly Issue, then...

There are many other possibilities w.r.t. the NFS perf. degradation.
Most often I've seen it when the net interface hardware/device driver
starts dropping packets (like happens on this laptop with an el-cheapo
re net interface in it).

You can capture a packet trace after the performance has degraded with
tcpdump and look to see if TCP segments are being lost/retransmitted.
(Although wireshark knows NFS and is nice for this, because it shows
 relative sequence numbers, the TCP dump will show you the TCP level
 retries, etc.)

Good luck with it, rick

 P.S. A lot of other 9.0 features look very nice indeed!

Re: TCP Reassembly Issues

2011-11-25 Thread Jeremy Chadwick

On Fri, Nov 25, 2011 at 01:05:06PM -0500, George Mitchell wrote:
 On 11/24/11 21:00, Jeremy Chadwick wrote:
 [...]
 If none of this solves the problem, then I consider this a priority 0
 blocker (read: all hands on deck) issue with the IP stack in FreeBSD
 9.x and will need immediate attention.
 
 I would strongly recommend a developer or clueful end-user begin
 tracking down who committed all of these bits and CC them into the
 thread.  I would start by looking who implemented the
 net.inet.tcp.reass.cursegments sysctl, because that isn't in RELENG_8 at
 all.
 
 
 I've tried out the 9.0 release candidates, and what I notice is that for
 a few minutes after the system starts, I get wonderful NFS read
 throughput (7+ MB/s over a 100 megabit interface) -- more than twice as
 fast as 7.n or 8.n on the same hardware -- quickly degrading to abysmal
 (less than 0.5 MB/s).  Is this possibly related to the problem under
 discussion?  -- George Mitchell
 
 P.S. A lot of other 9.0 features look very nice indeed!

You could try forcing UDP NFS (assuming this is possible; I would assume
on the server side nfsd -u is needed and on the client side use of the
mntudp option would be needed in /etc/fstab; see mount_nfs(8)). The
descriptions that others have given indicate the problem being discussed
affects purely TCP.

Regarding NFS performance in general -- and this is in no way shape or
form a slam against Rick -- it would be good to get some actual Linux
vs. FreeBSD numbers when it comes to NFS performance, including what
protocols are used (TCP vs. UDP) and NFS versions are used (3 vs. 4).
I have a gut feeling NFS on Linux is significantly faster, and it would
be really helpful to find out how/why.

-- 
| Jeremy Chadwick                    jdc at parodius.com |
| Parodius Networking           http://www.parodius.com/ |
| UNIX Systems Administrator       Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |


Re: TCP Reassembly Issues

2011-11-25 Thread Lawrence Stewart