Re: TCP Reassembly Issues
On 26/11/2011 05:23, Lawrence Stewart wrote:
> On 11/25/11 13:01, Lawrence Stewart wrote:
>> On 11/24/11 18:02, Kris Bauer wrote:
>>> Hello,
>>>
>>> I am currently experiencing an issue with FreeBSD 9.0-RC2 r227852 where the net.inet.tcp.reass.cursegments value is constantly increasing (and not decreasing when there is only nominal traffic to the box). It is causing TCP slowdowns as described in kern/155407:
>>>
>>> Exhausted net.inet.tcp.reass.maxsegments block recovering tcp session (for this socket and any other socket waiting for retransmitted packets). After exhausted net.inet.tcp.reass.maxsegments allocation new entry in tcp_reass failed (for this socket and any other socket waiting for retransmitted packets).
>>>
>>> I have increased the reass.maxsegments value to 16384 to temporarily avoid the problem, but the cursegments number keeps rising and it seems it will occur again.
>>>
>>> Is this an issue that anyone else has seen? I can provide more information if need be.
>>
>> Thanks Kris, Raul and Stefan for the reports, I'll look into this.
>
> I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).
>
> Due to some miscommunication, the initial patch was committed to and MFCed from head much later than it should have been in the 9.0 release cycle and, instead of being included in the BETAs, didn't make it in until 9.0-RC1 I believe, i.e. only RC1 and RC2 should be experiencing the issue.
>
> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>
> The patch is against head r227986 but will apply and work correctly for 9.0 as well.

Just a me-too. Patch applied cleanly and is working fine.

Hehe... and I was blaming the Linux box at the other end of the connection :)

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: TCP Reassembly Issues
On Fri, Nov 25, 2011 at 11:56:47PM -0800, Jeremy Chadwick wrote:
> On Sat, Nov 26, 2011 at 12:49:24AM -0600, Kris Bauer wrote:
>> On Fri, Nov 25, 2011 at 11:23 PM, Lawrence Stewart lstew...@freebsd.org wrote:
>>> [...]
>>> I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).
>>> [...]
>>> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>>>
>>> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>>>
>>> The patch is against head r227986 but will apply and work correctly for 9.0 as well.
>>>
>>> Cheers,
>>> Lawrence
>>
>> I have patched, recompiled, and rebooted. net.inet.tcp.reass.cursegments is no longer incrementing, and connectivity is holding steady. If anything changes over the next couple of hours, I'll be sure to report it; but all preliminary signs of the problem are gone.
>>
>> Thanks for all the help!
>
> Let's not be hasty in concluding everything is fixed.
>
> Why I'm a bit on edge about this: I took the time to find the CVS commits that induced this issue in the first place, and it seems there is some history. The commit that caused this problem to begin with was supposedly a fix for a different problem:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.375
>
> A week later, that commit went from HEAD/MAIN into RELENG_9:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.374.2.2
>
> Be sure to read the description of the problem that was being fixed in the first place.
>
> I've also CC'd the original problem reporter, Steven Hartland, because we're going to need him to try the above patch from Lawrence to make sure there aren't other problems. Meaning: for all we know, the above fix might work great for Kris but cause problems for Steve.
>
> This entire situation leads me to believe very few people are doing quality testing of RELENG_9, yet we're already into 9.0-RC2. Please don't tell me "that's exactly why you should be running RELENG_9!"; that is completely backwards and I refuse to get into a flame war about it, because it's this simple: 90%+ of those running FreeBSD on servers need something that's stable; we can't risk wonkiness (especially of this degree!) on systems taking production traffic. Did no one actually test the change *thoroughly*? Imagine had this lain dormant until 9.0-RELEASE.
>
> Lawrence: please don't take my comments personally or to mean you broke it and caused this mess! It's meant to read more along the lines of "you committed a fix for something that broke other bits badly, but nobody noticed this, including the original reporter of a different problem? How/why?" You get the idea.

Re-sending, because the "Tested by" commit line had someone who replaced the @ character with -at-, so my mail client assumed the email address was on my local machine. Sorry about that, folks.

--
| Jeremy Chadwick                   jdc at parodius.com |
| Parodius Networking         http://www.parodius.com/ |
| UNIX Systems Administrator     Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
Re: TCP Reassembly Issues [SOLVED?]
On 26/11/2011 6:23, Lawrence Stewart wrote:
> ...
> kernel to test a patch please try the following and report back to the list:
>
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>
> The patch is against head r227986 but will apply and work correctly for 9.0 as well.

Applied cleanly against RELENG_9_0. As my case was not exactly the same as Kris's or Stefan's, I'd wait for their feedback, but as far as I'm concerned, *it works perfectly!*

[]
%sysctl kern.version | head -n1
kern.version: FreeBSD 9.0-RC2 #1: Sat Nov 26 10:24:38 CET 2011
%uptime
12:06PM up 1:30, 3 users, load averages: 0,07 0,08 0,10
%vmstat -z | head -n1 ; vmstat -z | grep reass
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
tcpreass:40, 1680, 58,1370, 276624, 0, 0
%sysctl net.inet.tcp.reass
net.inet.tcp.reass.overflows: 5
net.inet.tcp.reass.cursegments: 17
net.inet.tcp.reass.maxsegments: 1680
%netstat -s -p tcp | grep mem
5 discarded due to memory problems
[]

I'll leave the box stressing the TCP stack for a couple of days, just in case.

Thanks a lot.

Regards,
Raúl.
Re: TCP Reassembly Issues [SOLVED?]
On Sat, Nov 26, 2011 at 5:30 AM, Raul r...@b2n.org wrote:
> Applied cleanly against RELENG_9_0. As my case was not exactly the same as Kris's or Stefan's, I'd wait for their feedback, but as far as I'm concerned, *it works perfectly!*
>
> [...]
> %sysctl net.inet.tcp.reass
> net.inet.tcp.reass.overflows: 5
> net.inet.tcp.reass.cursegments: 17
> net.inet.tcp.reass.maxsegments: 1680
> [...]
>
> I'll leave the box stressing the TCP stack for a couple of days, just in case.

After 5 hours and a few gigs of traffic, things have been fine:

# sysctl net.inet.tcp.reass
net.inet.tcp.reass.overflows: 155
net.inet.tcp.reass.cursegments: 0
net.inet.tcp.reass.maxsegments: 4116

Kris
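The check both reporters are doing by eye (is cursegments still climbing, or does it fall back when traffic quiets down?) can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, the two parsed outputs are the ones posted above, and the "climbing" series at the end is invented sample data, not a measurement from this thread.

```python
import re

def parse_reass_sysctl(text):
    """Extract the net.inet.tcp.reass.* counters from `sysctl net.inet.tcp.reass` output."""
    values = {}
    for match in re.finditer(r"net\.inet\.tcp\.reass\.(\w+):\s*(\d+)", text):
        values[match.group(1)] = int(match.group(2))
    return values

def looks_like_leak(samples):
    """Heuristic only: cursegments never drops between samples and ends up higher."""
    counts = [s["cursegments"] for s in samples]
    never_drops = all(later >= earlier for earlier, later in zip(counts, counts[1:]))
    return len(counts) >= 2 and never_drops and counts[-1] > counts[0]

# Outputs as posted in the thread (after applying the patch).
raul = parse_reass_sysctl(
    "net.inet.tcp.reass.overflows: 5\n"
    "net.inet.tcp.reass.cursegments: 17\n"
    "net.inet.tcp.reass.maxsegments: 1680\n"
)
kris = parse_reass_sysctl(
    "net.inet.tcp.reass.overflows: 155\n"
    "net.inet.tcp.reass.cursegments: 0\n"
    "net.inet.tcp.reass.maxsegments: 4116\n"
)

# Invented series, for illustration only: a counter that only ever grows.
climbing = [{"cursegments": n} for n in (100, 250, 900)]
# A counter that drains again once traffic stops.
draining = [{"cursegments": n} for n in (17, 3, 0)]

print(raul, kris)
print(looks_like_leak(climbing), looks_like_leak(draining))
```

On a live box the samples would come from running the sysctl command at intervals; a counter that returns to a low value under light traffic, as in Kris's output, is what a healthy system is expected to show.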
Re: TCP Reassembly Issues
> I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).
> [...]
> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>
> The patch is against head r227986 but will apply and work correctly for 9.0 as well.

I'm a happy camper!

Thanks,
Stefan

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
Re: TCP Reassembly Issues
On 11/26/11 00:23, Lawrence Stewart wrote:
> [...]
> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
> [...]

Works for me! I'm now getting a sustained throughput of 7.4MB/s, compared to 4.3MB/s on 8.2-STABLE and 3.2MB/s on 7.4-RELEASE, all on the same hardware (HP notebook with re 100Mb/s interface, reading from an 8.2-STABLE server with an alc 1000Mb/s interface, via two gigabit switches).

But I'm still bemused that there should have been any TCP reassembly going on. Doesn't that imply that there was packet fragmentation? My network is uniformly 1500 byte MTU.

-- George
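To put the throughput figures above in context, here is a rough back-of-the-envelope calculation of the TCP payload ceiling on a 100Mb/s link with a 1500 byte MTU. It is a sketch under stated assumptions: standard Ethernet framing overhead (14 byte header, 4 byte FCS, 8 byte preamble, 12 byte inter-frame gap), 20 byte IPv4 and 20 byte TCP headers, MB taken as 10^6 bytes, and no allowance for NFS/RPC overhead or ACK traffic. The function name is made up for the example.

```python
LINK_BPS = 100_000_000          # 100Mb/s interface
MTU = 1500                      # bytes, as stated in the message above
ETH_OVERHEAD = 14 + 4 + 8 + 12  # header, FCS, preamble/SFD, inter-frame gap
IP_TCP_HEADERS = 20 + 20        # IPv4 + TCP without options (assumption)

def max_tcp_goodput_mb_s(link_bps=LINK_BPS, mtu=MTU, tcp_options=0):
    """Upper bound on TCP payload throughput in MB/s (1 MB = 10**6 bytes)."""
    payload = mtu - IP_TCP_HEADERS - tcp_options  # bytes of data per frame
    on_wire = mtu + ETH_OVERHEAD                  # bytes of link time per frame
    return link_bps / 8 * payload / on_wire / 1e6

ceiling = max_tcp_goodput_mb_s()
with_timestamps = max_tcp_goodput_mb_s(tcp_options=12)  # TCP timestamp option

# Figures reported above, compared with the ceiling.
for label, rate in [("9.0 with patch", 7.4), ("8.2-STABLE", 4.3), ("7.4-RELEASE", 3.2)]:
    print(f"{label}: {rate} MB/s is {rate / ceiling:.0%} of the ceiling")
```

Under these assumptions the ceiling comes out a little under 12 MB/s, so even the best figure reported here leaves noticeable headroom.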
Re: TCP Reassembly Issues
On 11/26/11 02:56, Jeremy Chadwick wrote:
> [...]
> This entire situation leads me to believe very few people are doing quality testing of RELENG_9, yet we're already into 9.0-RC2. Please don't tell me "that's exactly why you should be running RELENG_9!"; that is completely backwards and I refuse to get into a flame war about it, because it's this simple: 90%+ of those running FreeBSD on servers need something that's stable; we can't risk wonkiness (especially of this degree!) on systems taking production traffic. Did no one actually test the change *thoroughly*? Imagine had this lain dormant until 9.0-RELEASE.
> [...]

PLUS ONE

I didn't get a warm, fuzzy feeling about FreeBSD 7 until 7.1, and FreeBSD 8 was worse -- no warm, fuzzy feeling until 8.2. And I am still not sold on SCHED_ULE: Start as many compute-bound programs as there are CPUs, and prepare for poor (to put it kindly) interactive response. That's not everybody's usage pattern, but it seems plausible enough to me to preclude SCHED_ULE as the default scheduler until it is fixed.

On the good side, I'm pleased with the new 9.0 boot menu, and I'm very happy that the ahci driver automatically creates symbolic links to the old device names for its disks. I like that I don't have to tab to the Okay button in configuration dialogs any more (though I was surprised the first time it happened).

But I hope this gets fixed for my flash card reader/writer:

ugen0.5: <vendor 0x05e3> at usbus0
umass0: <vendor 0x05e3 USB TO IDE, class 0/0, rev 2.00/0.32, addr 5> on usbus0
umass0: SCSI over Bulk-Only; quirks = 0x4101
umass0:2:0:-1: Attached to scbus2
(probe0:umass-sim0:0:0:0): AutoSense failed
(da0:umass-sim0:0:0:0): got CAM status 0x4
(da0:umass-sim0:0:0:0): fatal error, failed to attach to device
(da0:umass-sim0:0:0:0): lost device - 0 outstanding
(da0:umass-sim0:0:0:0): removing device entry

-- George Mitchell
Re: TCP Reassembly Issues
Hi Jeremy,

On 11/26/11 18:56, Jeremy Chadwick wrote:
> On Sat, Nov 26, 2011 at 12:49:24AM -0600, Kris Bauer wrote:
>> On Fri, Nov 25, 2011 at 11:23 PM, Lawrence Stewart lstew...@freebsd.org wrote:
>>> On 11/25/11 13:01, Lawrence Stewart wrote:
>>>> [snip]
>>>>
>>>> Thanks Kris, Raul and Stefan for the reports, I'll look into this.
>>>
>>> I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).
>>> [...]
>>> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>>>
>>> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>>>
>>> The patch is against head r227986 but will apply and work correctly for 9.0 as well.
>>>
>>> Cheers,
>>> Lawrence
>>
>> I have patched, recompiled, and rebooted. net.inet.tcp.reass.cursegments is no longer incrementing, and connectivity is holding steady. If anything changes over the next couple of hours, I'll be sure to report it; but all preliminary signs of the problem are gone.
>>
>> Thanks for all the help!
>
> Let's not be hasty in concluding everything is fixed.
>
> Why I'm a bit on edge about this: I took the time to find the CVS commits that induced this issue in the first place, and it seems there is some history. The commit that caused this problem to begin with was supposedly a fix for a different problem:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.375

The original patch you reference (equivalent to svn r226113 as noted in my previous email) was indeed for a separate problem. Unfortunately the fix introduced a new problem.

> A week later, that commit went from HEAD/MAIN into RELENG_9:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.374.2.2
>
> Be sure to read the description of the problem that was being fixed in the first place.
>
> I've also CC'd the original problem reporter, Steven Hartland, because we're going to need him to try the above patch from Lawrence to make sure there aren't other problems. Meaning: for all we know, the above fix might work great for Kris but cause problems for Steve.

Even though my patch is a multi-line diff, it effectively changes only one thing: the te == NULL condition must be true both for the case where th->th_seq != tp->rcv_nxt (current segment does not plug the hole) and for the case where they are equal (current segment does plug the hole; a new case introduced in r226113). Based on that change in the logic, I can say with confidence that my patch is not a regression as far as Steven's original bug report is concerned.

> This entire situation leads me to believe very few people are doing quality testing of RELENG_9, yet we're already into 9.0-RC2. Please don't tell me "that's exactly why you should be running RELENG_9!"; that is completely backwards and I refuse to get into a flame war about it, because it's this simple: 90%+ of those running FreeBSD on servers need something that's stable; we can't risk wonkiness (especially of this degree!) on systems taking production traffic. Did no one actually test the change *thoroughly*? Imagine had this lain dormant until 9.0-RELEASE.

The latter half is fair criticism, more comments below. The fact we're having this discussion now, prior to 9.0 being released, somewhat negates the assertion in the former part of your paragraph.

> Lawrence: please don't take my comments personally or to mean you broke it and caused this mess! It's meant to read more along the lines of

All good, not taken personally.

> "you committed a fix for something that broke other bits badly, but nobody noticed this, including the original reporter of a different problem? How/why?" You get the idea.

Your concerns are valid. To clarify, I did not propose or commit the patch which introduced this bug (r226113). Generally speaking, it is a committer's responsibility to ensure that a patch which they commit has been sufficiently tested prior to commit. Normally the committer will solicit testing from the original problem reporter and do some testing themselves. I believe Steven tested Andre's patch and reported to the mailing list that it resolved his immediate problem. I was not privy to any other testing conducted by Andre, so can't comment further on that.

As to how this could have been missed: TCP is impressively robust, capable of working even when it has both arms tied behind its back and is missing a leg. It may not work well, but will limp along all the same. People tend to notice and report scenarios where something is definitively broken far more
Re: TCP Reassembly Issues
Hi George,

On 11/27/11 03:16, George Mitchell wrote:
> On 11/26/11 00:23, Lawrence Stewart wrote:
>> [...]
>> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>>
>> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>> [...]
>
> Works for me! I'm now getting a sustained throughput of 7.4MB/s, compared to 4.3MB/s on 8.2-STABLE and 3.2MB/s on 7.4-RELEASE, all on the same hardware (HP notebook with re 100Mb/s interface, reading from an 8.2-STABLE server with an alc 1000Mb/s interface, via two gigabit switches).

Good stuff.

> But I'm still bemused that there should have been any TCP reassembly going on. Doesn't that imply that there was packet fragmentation? My network is uniformly 1500 byte MTU.
>
> -- George

TCP reassembly refers to queuing segments received out of order until the missing segment is received, i.e. it is related to packet loss or packet reordering, not to IP-layer fragmentation.

I guess something in your setup is dropping the odd packet, which is why your NFS performance isn't closer to the 10+MB/s it could be if everything was peachy. (I'm not sure how much overhead NFS adds, but ~12MB/s is the max application-layer throughput of 100Mbps Ethernet, so achievable NFS throughput should be a bit less than that.)

siftr(4) and some tcpdumping on both client and server could probably help you figure out where you're dropping packets if you want to improve your current performance even further.

Cheers,
Lawrence
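The distinction drawn above (reassembly means holding out-of-order segments until the hole is filled) can be illustrated with a toy receiver in Python. This is a deliberately simplified model, not how tcp_reass.c is written: the class and attribute names are invented, sequence numbers start at 0 and never wrap, partially overlapping segments are not handled, and max_segments merely stands in for a maxsegments-style limit.

```python
class ReassemblyQueue:
    """Toy receiver that holds out-of-order segments until the hole before them is filled."""

    def __init__(self, rcv_nxt=0, max_segments=4):
        self.rcv_nxt = rcv_nxt            # next in-order byte the receiver expects
        self.max_segments = max_segments  # stand-in for a maxsegments-style limit
        self.queue = {}                   # seq -> payload, segments waiting on a hole
        self.delivered = b""              # data handed to the application, in order

    def receive(self, seq, payload):
        """Process one segment; return False if it was dropped."""
        if seq < self.rcv_nxt:
            return False                  # old duplicate (overlap handling omitted)
        if seq != self.rcv_nxt:
            # Out of order: a hole precedes this segment, so it has to wait.
            if seq not in self.queue and len(self.queue) >= self.max_segments:
                return False              # queue limit reached, segment dropped
            self.queue[seq] = payload
            return True
        # In order: deliver it, then drain anything it has unblocked.
        self.delivered += payload
        self.rcv_nxt += len(payload)
        while self.rcv_nxt in self.queue:
            data = self.queue.pop(self.rcv_nxt)
            self.delivered += data
            self.rcv_nxt += len(data)
        return True


q = ReassemblyQueue()
q.receive(5, b"world")      # arrives first, but bytes 0-4 are still missing
held = len(q.queue)         # one segment is parked in the queue
q.receive(0, b"hello")      # plugs the hole, queue drains
print(held, q.delivered, len(q.queue))
```

No IP fragmentation is involved anywhere in this model: a single lost or reordered full-size segment is enough to park later segments in the queue.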
Re: TCP Reassembly Issues
Hi,

this patch works for me, too. The reass counter no longer increases

( tcpreass:40, 1680, 21, 399, 572562, 0, 0 )

and a severe network performance issue with netatalk and afpd, used as a Time Capsule server for Mac OS X, now seems to have disappeared.

Thank you very much,
best regards,

On Sat, Nov 26, 2011 at 6:23 AM, Lawrence Stewart lstew...@freebsd.org wrote:
> [...]
> I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).
> [...]
> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>
> The patch is against head r227986 but will apply and work correctly for 9.0 as well.
>
> Cheers,
> Lawrence
Re: TCP Reassembly Issues
On 11/26/11 16:23, Lawrence Stewart wrote:
> [...]
> I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).
> [...]
> Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:
>
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
>
> The patch is against head r227986 but will apply and work correctly for 9.0 as well.

Thanks to all for the reports and testing. I committed the patch to head (http://svn.freebsd.org/changeset/base/228016) and it will be MFCed to 9 soon, pending feedback from the release engineering team.

Cheers,
Lawrence
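For readers who want a feel for why a single misplaced condition can leak a whole zone, here is a toy Python model of the logic change as it is described in this thread (the te == NULL check has to gate the hole-plugging case as well). It is not the code from tcp_reass.c. The Zone class, the BACKUP entry, and both function names are assumptions invented for illustration, and the idea that the hole-plugging path uses a spare entry that does not come from the zone is inferred from the patch name (plugzoneleak) and the description given in the thread.

```python
class Zone:
    """Toy fixed-size allocator standing in for the reassembly zone."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def alloc(self):
        if self.used >= self.limit:
            return None          # zone exhausted
        self.used += 1
        return object()

    def free(self, entry):
        self.used -= 1


BACKUP = object()  # hypothetical spare entry that does not come from the zone


def pick_entry_buggy(zone, plugs_hole):
    """Model of the faulty logic: the te-is-None check only gates the drop case."""
    te = zone.alloc()
    if te is None and not plugs_hole:
        return None              # cannot queue the segment, drop it
    if plugs_hole:
        te = BACKUP              # overwrites a successful allocation, which is lost
    return te


def pick_entry_fixed(zone, plugs_hole):
    """Model of the fix: both cases are only considered when te is None."""
    te = zone.alloc()
    if te is None:
        if not plugs_hole:
            return None          # cannot queue the segment, drop it
        te = BACKUP              # spare entry used only when the zone is exhausted
    return te


def run(pick, segments=100, limit=16):
    """Feed hole-plugging segments through `pick` and report entries still allocated."""
    zone = Zone(limit)
    for _ in range(segments):
        te = pick(zone, plugs_hole=True)
        if te is not None and te is not BACKUP:
            zone.free(te)        # entry released once the segment is delivered
    return zone.used


print(run(pick_entry_buggy), run(pick_entry_fixed))
```

In this model the faulty variant ends with every zone entry still marked as allocated even though nothing is queued, which matches the symptom reported at the start of the thread: a cursegments value that climbs and never comes back down.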
Re: TCP Reassembly Issues
On 25.11.2011 at 00:35, Adrian Chadd wrote:
> Have you tried disabling the tcp offload features of your NIC?

I'm using my em0 as a VLAN trunk, and I'm under the impression that that disables all the hardware assists in the controller. Also, the LAN vlan is bridged via OpenVPN and tap, making the whole bunch promiscuous, which I believe also forces off the acceleration.

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
    ether 00:1c:c0:7d:8c:50
    inet6 fe80::21c:c0ff:fe7d:8c50%em0 prefixlen 64 scopeid 0x1
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 02:00:00:00:00:01
    inet6 2001:470:1f0b:1064::1 prefixlen 64
    inet 44.128.65.1 netmask 0xffc0 broadcast 44.128.65.63
    inet6 fe80::21c:c0ff:fe7d:8c50%bridge0 prefixlen 64 scopeid 0xd
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
    maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
    root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
    member: vlan1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
            ifmaxaddr 0 port 15 priority 128 path cost 55
    member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
            ifmaxaddr 0 port 14 priority 128 path cost 200
vlan1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=3<RXCSUM,TXCSUM>
    ether 00:1c:c0:7d:8c:50
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    vlan: 1 parent interface: em0

em0@pci0:0:25:0: class=0x02 card=0x50038086 chip=0x10cd8086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82567LF-2 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 09[e0] = vendor (length 6) Intel cap 2 version 0

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
Re: TCP Reassembly Issues
On 24/11/2011 23:06, Stefan Bethke wrote:
> []
> I regularly copy large files off my Tivo trans-atlantic (125ms RTT), and TCP connections currently stall after about 500 megs, never recovering. I suspect this is connected, as it started immediately after upgrading the machine to 9-stable.

I've not seen connections fail to recover or stall completely (mpd tcpmssfix related?). What I see is a normal start, a normal bandwidth increase, peak performance using all available bandwidth, and after that the bandwidth drops to an 'unreasonable' level and stays there for most of the transfer.

The numbers always depend on too many factors, but to illustrate how dramatic it is:

[]
%ping -c100 XX.au
PING XX.au (136.186.XX.XX): 56 data bytes
...
100 packets transmitted, 100 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 352.036/354.258/374.731/3.593 ms
[]

When downloading an ISO image from that host by FTP (wget), the transfer peaks at about 1.4MBytes/sec before falling to 2,9?KBytes/sec, where most of the transfer happens.

Please note, these numbers come from a PPPoE link (DSL) established by mpd55 with *'tcpmsswilink'*:

[]
%cat /usr/local/etc/mpd5/mpd.conf | grep fix
set iface enable tcpmssfix
[]

I hope that sheds some light.

Regards,
Raúl.
Re: TCP Reassembly Issues
On 11/24/11 21:00, Jeremy Chadwick wrote:
> [...]
> If none of this solves the problem, then I consider this a priority 0 blocker (read: all hands on deck) issue with the IP stack in FreeBSD 9.x and will need immediate attention. I would strongly recommend a developer or clueful end-user begin tracking down who committed all of these bits and CC them into the thread. I would start by looking who implemented the net.inet.tcp.reass.cursegments sysctl, because that isn't in RELENG_8 at all.

I've tried out the 9.0 release candidates, and what I notice is that for a few minutes after the system starts, I get wonderful NFS read throughput (7+ MB/s over a 100 megabit interface) -- more than twice as fast as 7.n or 8.n on the same hardware -- quickly degrading to abysmal (less than 0.5 MB/s). Is this possibly related to the problem under discussion?

-- George Mitchell

P.S. A lot of other 9.0 features look very nice indeed!
Re: TCP Reassembly Issues
George Mitchell wrote:
> On 11/24/11 21:00, Jeremy Chadwick wrote:
>> [...]
>> If none of this solves the problem, then I consider this a priority 0 blocker (read: all hands on deck) issue with the IP stack in FreeBSD 9.x and will need immediate attention.
>> [...]
>
> I've tried out the 9.0 release candidates, and what I notice is that for a few minutes after the system starts, I get wonderful NFS read throughput (7+ MB/s over a 100 megabit interface) -- more than twice as fast as 7.n or 8.n on the same hardware -- quickly degrading to abysmal (less than 0.5 MB/s). Is this possibly related to the problem under discussion?
>
> -- George Mitchell

Well, when I've seen NFS perf. degrade like this, it has usually been related to the RPC transport (and TCP is the default for 9.0).

Just from reading some of the thread, it sounds like this problem will result in the FAIL count (the last #) in vmstat -z for tcpreass increasing and/or net.inet.tcp.reass.cursegments increasing to net.inet.tcp.reass.maxsegments.

I'd suggest that, after the NFS perf. has degraded, you:

# vmstat -z | fgrep tcpreass
- and see how big the last # is
# sysctl -a | fgrep reass
- and see how cursegments compares with maxsegments

If these don't indicate that it is the TCP Reassembly Issue, then...

There are many other possibilities w.r.t. the NFS perf. degradation. Most often I've seen it when the net interface hardware/device driver starts dropping packets (as happens on this laptop with an el-cheapo re net interface in it). You can capture a packet trace with tcpdump after the performance has degraded and look to see if TCP segments are being lost/retransmitted. (Although wireshark knows NFS and is nice for this, because it shows relative sequence numbers, the TCP dump will show you the TCP level retries, etc.)

Good luck with it, rick

> P.S. A lot of other 9.0 features look very nice indeed!
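The two checks suggested above can also be scripted. The following Python sketch parses one row of vmstat -z output and flags the conditions described (a non-zero FAIL count, or USED at the LIMIT). The function names are invented for the example, the sample rows are the tcpreass lines posted elsewhere in this thread, and the column order is taken from the header shown there (ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP). With that header, FAIL is the second-to-last column rather than the last.

```python
COLUMNS = ("size", "limit", "used", "free", "req", "fail", "sleep")

def parse_vmstat_zone(line):
    """Parse one `vmstat -z` row of the form 'name: size, limit, used, free, req, fail, sleep'."""
    name, _, rest = line.partition(":")
    fields = [int(field) for field in rest.split(",")]
    return name.strip(), dict(zip(COLUMNS, fields))

def zone_under_pressure(stats):
    """True if allocations have failed or the zone is at its limit."""
    return stats["fail"] > 0 or stats["used"] >= stats["limit"]

# Rows as posted in the thread after the patch was applied.
name, stats = parse_vmstat_zone("tcpreass:40, 1680, 58,1370, 276624, 0, 0")
_, other = parse_vmstat_zone("tcpreass:40, 1680, 21, 399, 572562, 0, 0")

print(name, stats["used"], stats["limit"], stats["fail"])
print(zone_under_pressure(stats), zone_under_pressure(other))
```

On a live system the row would come from `vmstat -z | fgrep tcpreass`; a zone that shows failures or sits at its limit while the box is nearly idle points at the reassembly problem rather than at NFS or the network hardware.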
Re: TCP Reassembly Issues
On Fri, Nov 25, 2011 at 01:05:06PM -0500, George Mitchell wrote:

On 11/24/11 21:00, Jeremy Chadwick wrote: [...]

I've tried out the 9.0 release candidates, and what I notice is that for a few minutes after the system starts, I get wonderful NFS read throughput (7+ MB/s over a 100 megabit interface) -- more than twice as fast as 7.n or 8.n on the same hardware -- quickly degrading to abysmal (less than 0.5 MB/s). Is this possibly related to the problem under discussion? -- George Mitchell P.S. A lot of other 9.0 features look very nice indeed!

You could try forcing UDP NFS (assuming this is possible; I would assume on the server side nfsd -u is needed and on the client side use of the mntudp option would be needed in /etc/fstab; see mount_nfs(8)), since the descriptions that others have given indicate the problem being discussed affects purely TCP.

Regarding NFS performance in general -- and this is in no way, shape or form a slam against Rick -- it would be good to get some actual Linux vs. FreeBSD numbers when it comes to NFS performance, including which protocols are used (TCP vs. UDP) and which NFS versions are used (3 vs. 4). I have a gut feeling NFS on Linux is significantly faster, and it would be really helpful to find out how/why.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
Re: TCP Reassembly Issues
On 11/25/11 13:01, Lawrence Stewart wrote:

On 11/24/11 18:02, Kris Bauer wrote: Hello, I am currently experiencing an issue with FreeBSD 9.0-RC2 r227852 where the net.inet.tcp.reass.cursegments value is constantly increasing (and not decreasing when there is nominal traffic with the box). It is causing TCP slowdowns as described in kern/155407: Exhausted net.inet.tcp.reass.maxsegments block recovering tcp session (for this socket and any other socket waiting for retransmitted packets). After exhausted net.inet.tcp.reass.maxsegments allocation new entry in tcp_reass failed (for this socket and any other socket waiting for retransmitted packets). I have increased the reass.maxsegments value to 16384 to temporarily avoid the problem, but the cursegments number keeps rising and it seems it will occur again. Is this an issue that anyone else has seen? I can provide more information if need be.

Thanks Kris, Raul and Stefan for the reports, I'll look into this.

I think I've got it - a stupid 1 line logic bug. My apologies for missing it when I reviewed the patch which introduced the bug (patch was committed to head as r226113, MFCed to stable/9 as r226228).

Due to some miscommunication, the initial patch was committed to and MFCed from head much later than it should have been in the 9.0 release cycle and, instead of being included in the BETAs, didn't make it in until 9.0-RC1, I believe, i.e. only RC1 and RC2 should be experiencing the issue.

Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list:

http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch

The patch is against head r227986 but will apply and work correctly for 9.0 as well.

Cheers, Lawrence
Re: TCP Reassembly Issues
On Fri, Nov 25, 2011 at 11:23 PM, Lawrence Stewart lstew...@freebsd.org wrote:

[...] Could those who have reported the bug and are able to recompile their kernel to test a patch please try the following and report back to the list: http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch The patch is against head r227986 but will apply and work correctly for 9.0 as well. Cheers, Lawrence

I have patched, recompiled, and rebooted. net.inet.tcp.reass.cursegments is no longer incrementing, and connectivity is holding steady. If anything changes over the next couple of hours, I'll be sure to report it; but all preliminary signs of the problem are gone. Thanks for all the help!

Kris
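One way to make "no longer incrementing" checkable is to sample the counter at intervals (for instance with `sysctl -n net.inet.tcp.reass.cursegments`) and test whether the series only ever climbs. The sketch below is a hedged illustration: the function name and the sample values are hypothetical, not measurements from this thread.

```python
from typing import Sequence


def never_decreases(samples: Sequence[int]) -> bool:
    """True if no sample is lower than the one before it and the series grew.

    A healthy reassembly counter rises and falls with traffic; a leaking one
    only ratchets upward.
    """
    if len(samples) < 2:
        return False
    rising = all(b >= a for a, b in zip(samples, samples[1:]))
    return rising and samples[-1] > samples[0]


if __name__ == "__main__":
    before_patch = [100, 340, 900, 2182]   # hypothetical readings
    after_patch = [12, 40, 7, 0, 15]       # hypothetical readings
    print(never_decreases(before_patch), never_decreases(after_patch))
```

A short run of rising samples proves little on a busy box, so the longer the observation window, the more the result means.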
Re: TCP Reassembly Issues
On Sat, Nov 26, 2011 at 12:49:24AM -0600, Kris Bauer wrote:

[...] I have patched, recompiled, and rebooted. net.inet.tcp.reass.cursegments is no longer incrementing, and connectivity is holding steady. If anything changes over the next couple of hours, I'll be sure to report it; but all preliminary signs of the problem are gone. Thanks for all the help!

Let's not be hasty in concluding everything is fixed. Why I'm a bit on edge about this:

I took the time to find the CVS commits that induced this issue in the first place, and it seems there is some history. The commit that caused this problem to begin with was supposedly a fix for a different problem:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.375

A week later, that commit went from HEAD/MAIN into RELENG_9:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_reass.c#rev1.374.2.2

Be sure to read the description of the problem that was being fixed in the first place. I've also CC'd the original problem reporter, Steven Hartland, because we're going to need him to try the above patch from Lawrence to make sure there aren't other problems. Meaning: for all we know, the above fix might work great for Kris but cause problems for Steve.

This entire situation leads me to believe very few people are doing quality testing of RELENG_9, yet we're already into 9.0-RC2. Please don't tell me "that's exactly why you should be running RELENG_9!"; that is completely backwards and I refuse to get into a flame war about it, because it's this simple: 90%+ of those running FreeBSD on servers need something that's stable; we can't risk wonkiness (especially of this degree!) on systems taking production traffic. Did no one actually test the change *thoroughly*? Imagine if this had lain dormant until 9.0-RELEASE.

Lawrence: please don't take my comments personally or to mean "you broke it and caused this mess!" It's meant to read more along the lines of "you committed a fix for something that broke other bits badly, but nobody noticed this, including the original reporter of a different problem? How/why?" You get the idea.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
Re: TCP Reassembly Issues
Hi, I'm experiencing a similar issue but I don't know if mine could be considered normal behaviour: even though net.inet.tcp.reass.cursegments is at 1680 and does not increase, the output of vmstat -z shows a high tcpreass FAIL value that I don't remember seeing in previous (8-STABLE) builds:

ITEM          SIZE   LIMIT   USED   FREE      REQ     FAIL  SLEEP
[...]
socket:        680,  25602,   740,   754, 4345747,       0,     0
unpcb:         240,  25600,    85,   459,  148595,       0,     0
ipq:            56,    819,     0,   378,   16126,       0,     0
udp_inpcb:     392,  25600,    24,   376,  222036,       0,     0
udpcb:          16,  25704,    24,   480,  222036,       0,     0
tcp_inpcb:     392,  25600,   719,  2391, 3958901,       0,     0
tcpcb:         976,  25600,   625,   707, 3958901,       0,     0
tcptw:          72,   5150,    93,  2357, 1486035,       0,     0
syncache:      152,  15375,     2,   398, 1587985,       0,     0
hostcache:     136,  15372,  1490,  4922,  119374,       0,     0
tcpreass:       40,   1680,  1680,     0,  108934, 4800302,     0
sackhole:       32,      0,     0,   404,  134750,       0,     0
[...]

System is FreeBSD 9.0-PRERELEASE #8 r227705.

Bye,

On Thu, Nov 24, 2011 at 8:02 AM, Kris Bauer kristoph.ba...@gmail.com wrote: [...]
Re: TCP Reassembly Issues
On 24.11.2011. 8:02, Kris Bauer wrote:

[...] Is this an issue that anyone else has seen? I can provide more information if need be.

Is your configuration different than the default in some way? Do you use a firewall? Multithreaded netisr? One of the new TCP congestion control modules?
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 10:33 AM, Ivan Voras ivo...@freebsd.org wrote:

[...] Is your configuration different than the default in some way? Do you use a firewall? Multithreaded netisr? One of the new TCP congestion control modules?

I don't believe that my configuration is anything out of the usual. Just some standard LFN tuning.

sysctl.conf:
net.inet.tcp.blackhole=2
net.inet.udp.blackhole=1
kern.geom.eli.threads=2
kern.ipc.maxsockbuf=16777216
net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.sendbuf_inc=262144
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.sendspace=1048576
net.inet.tcp.recvspace=1048576
net.inet.tcp.hostcache.expire=1
net.inet.tcp.delayed_ack=0

/boot/loader.conf:
vm.kmem_size_max=5120M
vm.kmem_size=5120M
geom_mirror_load=YES
vfs.zfs.arc_max=4096M
vfs.zfs.prefetch_disable=1
vfs.zfs.txg.timeout=15
vfs.zfs.write_limit_override=268435456
kern.ipc.nmbclusters=65536
cc_htcp_load=YES
net.inet.tcp.reass.maxsegments=16384

With the exception of the CC H-TCP and reass.maxsegments tunables, this is exactly what I was using with 8.2 with no issues. I have also seen the issue crop up (although I hadn't yet identified the source) while booting the box entirely with defaults (including using NewReno). The box is a 2 x Xeon E5405 Supermicro X7DCX with 8 GB of RAM.

Thanks, Kris
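For readers wondering what "LFN tuning" buys, the usual reasoning is the bandwidth-delay product: one TCP connection cannot move more than one window of data per round trip. The sketch below works through that arithmetic with the 16777216-byte buffer ceiling from the settings above and the 125 ms trans-atlantic RTT mentioned elsewhere in this thread; the function names are illustrative, and the real ceiling also depends on factors not modelled here (loss, congestion control, window scaling).

```python
def max_window_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound, in bits per second, for one connection with this window."""
    return window_bytes * 8 / rtt_seconds


def bdp_bytes(link_bits_per_second: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return link_bits_per_second * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.125  # seconds
    print(max_window_throughput(16777216, rtt))  # ceiling with a 16 MiB buffer
    print(bdp_bytes(100e6, rtt))                 # window needed for 100 Mbit/s
```

Large windows over long paths also mean more out-of-order data can be queued for reassembly after a loss, which fits the reports in this thread that trans-atlantic transfers hit the problem fastest.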
Re: TCP Reassembly Issues
On 24/11/2011 17:07, kerbzo wrote:

I'm experiencing a similar issue but I don't know if mine could be considered normal behaviour: even if net.inet.tcp.reass.cursegments is set to 1680 and does not increase, the output of vmstat -z shows a high tcpreass fail value that I don't remember in previous (8-STABLE) builds:

I see both: 'net.inet.tcp.reass.cursegments' reaches the default 'net.inet.tcp.reass.maxsegments' after 38 minutes of uptime, apparently never going down again regardless of the amount of traffic, and vmstat -z also shows tcpreass failures. I also see sudden packet 'bursts' discarded due to memory problems, maybe related:

% date; netstat -s -p tcp | grep mem
Thursday, 24 November 2011, 19:39:23 CET
5115 discarded due to memory problems
% date; netstat -s -p tcp | grep mem
Thursday, 24 November 2011, 19:39:30 CET
5268 discarded due to memory problems

My settings:

% cat /etc/sysctl.conf | grep -v ^\#
debug.cpufreq.lowest=1000
% cat /boot/loader.conf | grep -v ^\#
vfs.zfs.prefetch_disable=0
aio_load=YES
cc_cubic_load=YES
% sysctl net.isr
net.isr.numthreads: 1
net.isr.maxprot: 16
net.isr.defaultqlimit: 256
net.isr.maxqlimit: 10240
net.isr.bindthreads: 0
net.isr.maxthreads: 1
net.isr.direct: 0
net.isr.direct_force: 0
net.isr.dispatch: direct

cc_cubic, although loaded, is not used in this 'pristine' reboot. As for firewalling: pf using altq. Pretty recent compile:

% sysctl -a | grep RC2
kern.osrelease: 9.0-RC2
kern.version: FreeBSD 9.0-RC2 #0: Thu Nov 24 00:39:07 CET 2011

Regards, Raúl.
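The two netstat readings above are enough to estimate how fast segments are being discarded. A minimal sketch of that arithmetic, using the timestamps and counters quoted in the message (the function name is made up for the example):

```python
from datetime import datetime


def discard_rate(t0: datetime, count0: int, t1: datetime, count1: int) -> float:
    """Average discards per second between two counter readings."""
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("second reading must be later than the first")
    return (count1 - count0) / elapsed


if __name__ == "__main__":
    first = datetime(2011, 11, 24, 19, 39, 23)
    second = datetime(2011, 11, 24, 19, 39, 30)
    # 153 discards in 7 seconds, i.e. roughly 22 per second
    print(round(discard_rate(first, 5115, second, 5268), 1))
```

Two samples seven seconds apart only show one burst; repeating the measurement over a longer period would show whether the rate is sustained.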
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 1:20 PM, Raul r...@turing.b2n.org wrote:

[...] I see both, 'net.inet.tcp.reass.cursegments' reaching default 'net.inet.tcp.reass.maxsegments' after 38 minutes of uptime [...] and vmstat -z also show tcpreass failures. I also see sudden packet 'bursts' discarded by memory problems maybe related: [...]

I am seeing the same sorts of things in netstat and vmstat:

# netstat -s -p tcp | grep mem
742935 discarded due to memory problems
# vmstat -z | grep tcpreass
tcpreass: 40, 16464, 16340, 124, 131485, 955443, 0

I also see this filling up of reass.cursegments occur within an hour of reboot.

Thanks, Kris
Re: TCP Reassembly Issues
On 24.11.2011 at 21:30, Kris Bauer wrote:

On Thu, Nov 24, 2011 at 1:20 PM, Raul r...@turing.b2n.org wrote:

I am seeing the same sorts of things in netstat vmstat: # netstat -s -p tcp |grep mem 742935 discarded due to memory problems # vmstat -z |grep tcpreass tcpreass: 40, 16464, 16340, 124, 131485,955443, 0

Same here:

root@diesel:~# netstat -s -p tcp |grep mem
529211 discarded due to memory problems
root@diesel:~# vmstat -z |grep tcpreass
tcpreass: 40, 1680, 1679, 1, 118846, 831450, 0
root@diesel:~# uname -a
FreeBSD diesel.lassitu.de 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #20: Fri Nov 18 21:57:59 CET 2011 r...@diesel.lassitu.de:/usr/obj/usr/src/sys/DIESEL amd64
root@diesel:~# uptime
11:01PM up 5 days, 23:15, 1 user, load averages: 0.14, 0.04, 0.01
root@diesel:~# svn info /usr/src
Path: /usr/src
Working Copy Root Path: /usr/src
URL: http://mirror.hanse.de/svn/freebsd/base/stable/9
Repository Root: http://mirror.hanse.de/svn/freebsd/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 227665
Node Kind: directory
Schedule: normal
Last Changed Author: fabient
Last Changed Rev: 227664
Last Changed Date: 2011-11-18 15:41:48 +0100 (Fri, 18 Nov 2011)

I regularly copy large files off my Tivo trans-atlantic (125ms RTT), and TCP connections currently stall after about 500 megs, never recovering. I suspect this is connected, as it started immediately after upgrading the machine to 9-stable. As far as I can tell, the problem does not exist with 8-stable.

Stefan

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 4:06 PM, Stefan Bethke s...@lassitu.de wrote:

[...] I regularly copy large files off my Tivo trans-atlantic (125ms RTT), and TCP connections currently stall after about 500 megs, never recovering. I suspect this is connected, as it started immediately after upgrading the machine to 9-stable. As far as I can tell, the problem does not exist with 8-stable.

100-150ms RTT trans-atlantic transfers are what I saw (largely) driving up the reass.cursegments value.
Re: TCP Reassembly Issues
Have you tried disabling the TCP offload features of your NIC? Adrian
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 5:35 PM, Adrian Chadd adr...@freebsd.org wrote: Have you tried disabling the tcp offload features of your NIC? Adrian

To test this, I added net.inet.tcp.tso=0 to sysctl.conf and restarted the box; it didn't help. net.inet.tcp.reass.cursegments immediately started climbing and the segments were exhausted within an hour.

Kris.
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 07:13:39PM -0600, Kris Bauer wrote:

On Thu, Nov 24, 2011 at 5:35 PM, Adrian Chadd adr...@freebsd.org wrote: Have you tried disabling the tcp offload features of your NIC? Adrian

To test this, I added net.inet.tcp.tso=0 to sysctl.conf and restarted the box; it didn't work. net.inet.tcp.reass.cursegments immediately started climbing up and were exhausted within an hour.

I think Adrian was referring to RXCSUM and TXCSUM on your NIC; TSO is another offloading feature. See ifconfig(8) for how to disable those.

Be aware that disabling them in real-time (e.g. ifconfig xxx -rxcsum -txcsum) may cause problems; there are some NIC drivers on FreeBSD which do not like you doing this once the NIC has established link (meaning reloading the driver (for lack of better term) results in wonky behaviour). So you may instead want to add those hyphen-options to your ifconfig_XXX lines in /etc/rc.conf and reboot the box.

If none of this solves the problem, then I consider this a priority 0 blocker (read: all hands on deck) issue with the IP stack in FreeBSD 9.x and will need immediate attention. I would strongly recommend a developer or clueful end-user begin tracking down who committed all of these bits and CC them into the thread. I would start by looking at who implemented the net.inet.tcp.reass.cursegments sysctl, because that isn't in RELENG_8 at all.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
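The rc.conf change suggested above amounts to appending the hyphen-options to an existing ifconfig_XXX line. As a rough sketch only, the helper below does that on the text of an rc.conf; the interface name `em0`, the sample file contents and the function name are hypothetical, and it assumes the value is a single double-quoted string on one line.

```python
import re
from typing import Sequence


def add_ifconfig_flags(rc_conf: str, iface: str, flags: Sequence[str]) -> str:
    """Append any missing flags to the ifconfig_<iface>="..." line."""
    pattern = re.compile(rf'^(ifconfig_{re.escape(iface)}=")([^"]*)(")\s*$')
    out = []
    for line in rc_conf.splitlines():
        match = pattern.match(line)
        if match:
            current = match.group(2).split()
            missing = [flag for flag in flags if flag not in current]
            line = match.group(1) + " ".join(current + missing) + match.group(3)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'hostname="box"\nifconfig_em0="DHCP"\n'  # hypothetical rc.conf
    print(add_ifconfig_flags(sample, "em0", ["-rxcsum", "-txcsum", "-tso"]), end="")
```

Running it twice leaves the line unchanged the second time, which makes it safe to reapply; the options only take effect once the interface is reconfigured, which per the advice above is best done with a reboot.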
Re: TCP Reassembly Issues
On 11/24/11 18:02, Kris Bauer wrote:

Hello, I am currently experiencing an issue with FreeBSD 9.0-RC2 r227852 where the net.inet.tcp.reass.cursegments value is constantly increasing (and not decreasing when there is nominal traffic with the box). It is causing TCP slowdowns as described in kern/155407: Exhausted net.inet.tcp.reass.maxsegments block recovering tcp session (for this socket and any other socket waiting for retransmitted packets). After exhausted net.inet.tcp.reass.maxsegments allocation new entry in tcp_reass failed (for this socket and any other socket waiting for retransmitted packets). I have increased the reass.maxsegments value to 16384 to temporarily avoid the problem, but the cursegments number keeps rising and it seems it will occur again. Is this an issue that anyone else has seen? I can provide more information if need be.

Thanks Kris, Raul and Stefan for the reports, I'll look into this.

Cheers, Lawrence
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 8:00 PM, Jeremy Chadwick free...@jdc.parodius.com wrote:

[...] I think Adrian was referring to RXCSUM and TXCSUM on your NIC; TSO is another offloading feature. [...] So you may instead want to add those hyphen-options to your ifconfig_XXX lines in /etc/rc.conf and reboot the box. [...]

I have added -rxcsum -txcsum -tso to rc.conf and rebooted the box. This has not solved the problem. After half an hour of usage, I'm already up to reass.cursegments=2182 and it keeps climbing.

Kris
Re: TCP Reassembly Issues
On 11/25/11 14:19, Kris Bauer wrote:

[...] I have added -rxcsum -txcsum -tso to rc.conf and rebooted the box. This has not solved the problem. After a half-hour usage, I'm already up to reass.cursegments=2182 and it keeps climbing.

This is pretty much guaranteed to be an accounting problem in the TCP reassembly code (netinet/tcp_reass.c), not a driver-related issue. I would not expect any amount of tweaking, tuning or driver option twiddling to change the outcome (but if you do find something which alleviates it, do let us know).

Kris, are you in a position to test kernel patches on the machine which is experiencing this problem?

Cheers, Lawrence
Re: TCP Reassembly Issues
On Thu, Nov 24, 2011 at 10:20 PM, Lawrence Stewart lstew...@freebsd.org wrote:

Kris, are you in a position to test kernel patches on the machine which is experiencing this problem? Cheers, Lawrence

I'd be happy to test kernel patches with this machine.

Thanks, Kris
Re: TCP Reassembly Issues
On 25/11/2011 0:35, Adrian Chadd wrote:

Have you tried disabling the tcp offload features of your NIC?

In my case, there is no tcp on the ethernet interface. It is pppoe (mpd / netgraph) so no fancy hardware acceleration there.

[...]
%ifconfig ng0 | head -n1
ng0: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1492
[...]

Regards, Raúl.
TCP Reassembly Issues
Hello,

I am currently experiencing an issue with FreeBSD 9.0-RC2 r227852 where the net.inet.tcp.reass.cursegments value is constantly increasing (and not decreasing when there is nominal traffic with the box). It is causing tcp slowdowns as described with kern/155407:

Exhausted net.inet.tcp.reass.maxsegments block recovering tcp session (for this socket and any other socket waiting for retransmited packets).

After exhausted net.inet.tcp.reass.maxsegments allocation new entry in tcp_reass failed (for this socket and any other socket waiting for retransmited packets).

I have increased the reass.maxsegments value to 16384 to temporarily avoid the problem, but the cursegments number keeps rising and it seems it will occur again.

Is this an issue that anyone else has seen? I can provide more information if need be.

Thanks, Kris
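For anyone who wants to check whether their own box shows the same symptom, here is a minimal sketch in Python, assuming Python is installed on the machine. It samples the sysctl named in the report with `sysctl -n` and flags a counter that only ever grows. The function names (`read_counter`, `watch`, `looks_like_leak`), the sampling interval and the "never drops" heuristic are my own assumptions for illustration and are not taken from the thread or from any FreeBSD tool.

```python
import subprocess
import time

# sysctl named in the report above; it only exists on kernels that export it.
SYSCTL_NAME = "net.inet.tcp.reass.cursegments"


def read_counter(name=SYSCTL_NAME):
    """Read one integer sysctl value by running `sysctl -n <name>`."""
    result = subprocess.run(
        ["sysctl", "-n", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def watch(interval_s=60, count=10, reader=read_counter, sleep=time.sleep):
    """Collect `count` samples, waiting `interval_s` seconds between reads."""
    samples = []
    for i in range(count):
        samples.append(reader())
        if i < count - 1:
            sleep(interval_s)
    return samples


def looks_like_leak(samples, min_samples=5):
    """Heuristic (an assumption, not a diagnosis): the counter never drops
    between consecutive samples and ends higher than it started."""
    if len(samples) < min_samples:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]
```

Calling `looks_like_leak(watch())` on the affected machine takes ten samples a minute apart with the defaults. A `True` result only says the counter did not fall during the sampling window, so a box under steadily rising load could also trigger it; sampling across a quiet period gives a more meaningful answer.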