Re: Route messages
At 10:34 PM 6/27/2008, [EMAIL PROTECTED] wrote:

On Sun, 15 Jun 2008 11:16:17 +0100, in sentex.lists.freebsd.net you wrote:

Paul wrote: Get these with GRE tunnel on FreeBSD 7.0-STABLE (FreeBSD 7.0-STABLE #5: Sun May 11 19:00:57 EDT 2008 :/usr/obj/usr/src/sys/ROUTER amd64) but do not get them with 7.0-RELEASE. Any ideas what changed? :) Wish there was some sort of changelog.. # of messages per second seems consistent with packets per second on the GRE interface.. No impact on routing, but definitely an impact on cpu usage for all processes monitoring the route messages.

RTM_MISS is actually fairly common when you don't have a default route.

Hi, I am seeing this issue as well on a pair of recently deployed boxes, one running MPD and one acting as an area router in front of it. The MPD box has a default route and only has 400 routes or so. A steady stream of those messages, upwards of 500 per second:

got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
 default
got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
 default

Is there a way to try and track down what is generating those messages? It's eating up a fair bit of cpu with quagga (the zebra process specifically).

I narrowed down where the change to RELENG_7 happened. It looks like a commit around April 22nd caused the behaviour to change. When a box acting as a router has a packet transit it, an RTM_MISS is generated for *each packet*... Given a setup of

H1 - R1 - H2

where H1 is 10.10.1.2/24, H2 is 10.20.1.2/24, and R1 has 2 interfaces, 10.10.1.1/24 and 10.20.1.1/24, pinging H2 from H1 makes R1 generate an RTM_MISS for each packet! For routing daemons such as zebra, this eats up a *lot* of CPU. Turning on ip_fast_forwarding stops this behaviour on R1.
However, if the interface routing the packet is a netgraph interface (e.g. mpd), fast_forwarding doesn't seem to have an effect and the RTM_MISS messages are generated again for each packet. The ping below is a valid icmp echo request and reply, e.g.

0[releng7]# ping -c 2 -S 10.20.1.2 10.10.1.2
PING 10.10.1.2 (10.10.1.2) from 10.20.1.2: 56 data bytes
64 bytes from 10.10.1.2: icmp_seq=0 ttl=63 time=0.302 ms
64 bytes from 10.10.1.2: icmp_seq=1 ttl=63 time=0.337 ms
--- 10.10.1.2 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.302/0.320/0.337/0.018 ms
0[releng7]#

and generates 4 messages on the router:

[r7-router]# route -n monitor
got message of size 96 on Tue Jul 1 00:42:35 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
 default
got message of size 96 on Tue Jul 1 00:42:35 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
 default
got message of size 96 on Tue Jul 1 00:42:36 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
 default
got message of size 96 on Tue Jul 1 00:42:36 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 0, flags:<DONE>
locks: inits:
sockaddrs: <DST>
 default

I am thinking http://lists.freebsd.org/pipermail/cvs-src/2008-April/090303.html is the commit? If I revert to the previous version, the issue goes away.
kernel is just

0[r7-router]% diff router GENERIC
24,27c24
< ident router
< makeoptions MODULES_OVERRIDE=ipfw acpi
---
> ident GENERIC
37,38c34,35
< #options INET6  # IPv6 communications protocols
< #options SCTP   # Stream Control Transmission Protocol
---
> options INET6   # IPv6 communications protocols
> options SCTP    # Stream Control Transmission Protocol
47c44
< #options NFSLOCKD  # Network Lock Manager
---
> options NFSLOCKD   # Network Lock Manager
61c58
< #options STACK     # stack(9) support
---
> options STACK      # stack(9) support
303c300
< #device uslcom     # SI Labs CP2101/CP2102 serial adapters
---
> device uslcom      # SI Labs CP2101/CP2102 serial adapters

---Mike

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: if_bridge turns off checksum offload of members?
Andrew Thompson wrote:

On Mon, Jun 30, 2008 at 07:16:29PM +0900, Pyun YongHyeon wrote:

On Mon, Jun 30, 2008 at 12:11:40PM +0300, Stefan Lambrev wrote: Greetings, I just noticed that when I add an em network card to a bridge, the checksum offload is turned off. I even put in my rc.conf:

ifconfig_em0="rxcsum up"
ifconfig_em1="rxcsum up"

but after reboot both em0 and em1 have this feature disabled. Is this expected behavior? Should I care about csum in bridge mode? I noticed that enabling checksum offload manually improves things a little btw.

AFAIK this is intended; bridge(4) turns off Tx side checksum offload by default. I think disabling Tx checksum offload is required as not all members of a bridge may be able to do checksum offload. The same is true for TSO, but it seems that bridge(4) doesn't disable it. If all members of a bridge have the same hardware capability, I think bridge(4) may not need to disable Tx side hardware assistance. I guess bridge(4) could scan the capabilities of every member and decide what hardware assistance can be activated, instead of blindly turning off Tx side hardware assistance.

This patch should do that, are you able to test it Stefan?

=== if_bridge (all)
cc -O2 -fno-strict-aliasing -pipe -march=nocona -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/sys/CORE/opt_global.h -I. -I@ -I@/contrib/altq -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -fno-omit-frame-pointer -I/usr/obj/usr/src/sys/CORE -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -c /usr/src/sys/modules/if_bridge/../../net/if_bridge.c
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c: In function 'bridge_capabilities':
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:787: error: 'IFCAP_TOE' undeclared (first use in this function)
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:787: error: (Each undeclared identifier is reported only once
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:787: error: for each function it appears in.)
*** Error code 1
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error

I'm building without -j5 to see if the error message will change :) I'm using 7-STABLE from Jun 27.

cheers, Andrew

-- Best Wishes, Stefan Lambrev ICQ# 24134177
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Hi,

Ingo Flaschberger wrote:

Dear Rudy, I used polling in FreeBSD 5.x and it helped a bunch. I set up a new router with 7.0 and MSI was recommended to me. (I noticed no difference when moving from polling to MSI; however, on 5.4 polling seemed to help a lot.) What are people using in 7.0? polling or MSI?

if you have an inet-router with gige-uplinks, it is possible that there will be (d)dos attacks. only polling helps you then to keep the router manageable (but dropping packets).

Let me disagree :) I'm experimenting with bridge and an Intel 82571EB Gigabit Ethernet Controller. On a quad core system I have no problems with the stability of the bridge without polling. taskq em0 takes 100% CPU, but I have another three cpus/cores that are free and the router is very very stable, no lag on other interfaces, and the average load is not very high either.

Kind regards, Ingo Flaschberger

-- Best Wishes, Stefan Lambrev ICQ# 24134177
Re: Route messages
On Tue, 1 Jul 2008, Bjoern A. Zeeb wrote:

Hi, On Tue, 1 Jul 2008, Andre Oppermann wrote: Hi, Mike Tancsa wrote: I am thinking http://lists.freebsd.org/pipermail/cvs-src/2008-April/090303.html is the commit? If I revert to the previous version, the issue goes away.

Ha, I finally know why I ended up on Cc: of a thread I had no idea about. Someone could have told me instead of blindly adding me ;-)

Yes, this change doesn't look right. It should only do the route lookup in ip_input.c when there was an EMSGSIZE error returned by ip_output(). The rtalloc_ign() call causes the message to be sent because it always sets report to one. The default message is RTM_MISS. I'll try to prep an updated patch which doesn't have these issues later today.

Yeah, my bad. Sorry. If you do that, do not do an extra route lookup if possible; correct the rtalloc call. Thanks.

So I had a very quick look at the code in between doing something else. I think the only change needed is this, if I am not mistaken, but my head is nowhere close enough to this code. Andre, could you review this?

Index: sys/netinet/ip_input.c
===
RCS file: /shared/mirror/FreeBSD/r/ncvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.332.2.2
diff -u -p -r1.332.2.2 ip_input.c
--- sys/netinet/ip_input.c  22 Apr 2008 12:02:55 -  1.332.2.2
+++ sys/netinet/ip_input.c  1 Jul 2008 09:23:08 -
@@ -1363,7 +1363,6 @@ ip_forward(struct mbuf *m, int srcrt)
 	 * the ICMP_UNREACH_NEEDFRAG Next-Hop MTU field described in RFC1191.
 	 */
 	bzero(&ro, sizeof(ro));
-	rtalloc_ign(&ro, RTF_CLONING);
 	error = ip_output(m, NULL, &ro, IP_FORWARDING, NULL, NULL);

-- Bjoern A. Zeeb  Stop bit received. Insert coin for new game.
Re: if_bridge turns off checksum offload of members?
Hi, maybe these are stupid questions, but:

1) There are zero matches for IFCAP_TOE in the kernel sources .. there is no support for TOE in 7.0, but maybe this is work in progress for 8-current?

2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - should TOE be replaced with RXCSUM or just removed?

3) Why is RX never checked? In my case this doesn't matter because em turns off both TX and RX if only one is disabled, but probably there is hardware that can separate them, e.g. RX disabled while TX enabled?

4) I'm not sure why a bridge should not work with two interfaces, one of which supports TX offload and the other does not? At least if I turn on checksum offload on only one of the interfaces, the bridge is still working ...

Andrew Thompson wrote: - cut - This patch should do that, are you able to test it Stefan? cheers, Andrew

P.S. I saw very good results with netisr2 on a kernel from p4 a few months ago .. are there any patches flying around so I can test them with 7-STABLE? :)

-- Best Wishes, Stefan Lambrev ICQ# 24134177
Re: if_bridge turns off checksum offload of members?
Hi, sorry to reply to myself.

Stefan Lambrev wrote: Hi, maybe these are stupid questions, but: 1) There are zero matches for IFCAP_TOE in the kernel sources .. there is no support for TOE in 7.0, but maybe this is work in progress for 8-current? 2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - should TOE be replaced with RXCSUM or just removed?

Your patch plus this small change (replacing TOE with RXCSUM) seems to work fine for me - the kernel compiles without a problem and checksum offload is enabled after reboot.

3) Why is RX never checked? In my case this doesn't matter because em turns off both TX and RX if only one is disabled, but probably there is hardware that can separate them, e.g. RX disabled while TX enabled? 4) I'm not sure why a bridge should not work with two interfaces, one of which supports TX offload and the other does not? At least if I turn on checksum offload on only one of the interfaces, the bridge is still working ...

Andrew Thompson wrote: - cut - This patch should do that, are you able to test it Stefan? cheers, Andrew

P.S. I saw very good results with netisr2 on a kernel from p4 a few months ago .. are there any patches flying around so I can test them with 7-STABLE? :)

-- Best Wishes, Stefan Lambrev ICQ# 24134177
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Dear Paul,

I have been unable to even come close to livelocking the machine with the em driver interrupt moderation. So that, to me, throws polling out the window. I tried 8000hz with polling modified to allow 1 burst and it makes no difference.

Higher hz values give you better latency but less overall speed. 2000hz should be enough.

Kind regards, Ingo Flaschberger
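For reference, the polling setup being compared here would look roughly like the sketch below. This is a hedged example, not a quoted config from the thread: option names follow polling(4) for FreeBSD 7.x, and the HZ value is the 2000 Hz suggested above.

```
# Kernel config additions for device polling (sketch):
options DEVICE_POLLING
options HZ=2000          # per the suggestion above; higher HZ = lower
                         # latency, less overall throughput

# On FreeBSD 7.x polling is then enabled per interface at runtime:
#   ifconfig em0 polling
```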
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Dear Paul,

Dual Opteron 2212, recompiled kernel with 7-STABLE and removed a lot of junk in the config; added "options NO_ADAPTIVE_MUTEXES", not sure if that makes any difference or not, will test without. Used ULE scheduler, used preemption, CPUTYPE=opteron in /etc/make.conf. 7.0-STABLE FreeBSD 7.0-STABLE #4: Tue Jul 1 01:22:18 CDT 2008 amd64. Max input rate .. 587kpps? Take into consideration that these packets are being forwarded out the em1 interface, which causes a great impact on cpu usage. If I set up a firewall rule to block the packets, it can do over 1mpps on em0 input.

would be great if you can also test with 32bit. what value do you have at net.inet.ip.intr_queue_maxlen?

kind regards, Ingo Flaschberger
RELENG_7 ath WPA stuck when bgscan is active on interface
Hello, I'm running the above configuration, RELENG_7 kernel and WPA, on an Asus eeePC 900 laptop (for which one must patch the HAL with http://snapshots.madwifi.org/special/madwifi-ng-r2756+ar5007.tar.gz). All is fine, mostly, but when 'bgscan' is activated on the interface ath0, it gets stuck reproducibly after some time without any traffic through the interface; setting 'ifconfig ath0 -bgscan' makes the problem go away. Could it be related to the bug I'm facing on another laptop with bgscan/WPA/iwi0? See: http://www.freebsd.org/cgi/query-pr.cgi?pr=122331

thx
matthias

-- Matthias Apitz Manager Technical Support - OCLC GmbH Gruenwalder Weg 28g - 82041 Oberhaching - Germany t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211 e [EMAIL PROTECTED] - w http://www.oclc.org/ http://www.UnixArea.de/ b http://gurucubano.blogspot.com/ «...una sola vez, que es cuanto basta si se trata de verdades definitivas.» «...only once, which is enough if it has to do with definite truths.» José Saramago, Historia del Cerco de Lisboa
Re: altq on vlan
Max Laier wrote: Would you mind adding some words to that effect to your patch?

I think I'll hide it from public access instead. Looks like some people prefer to patch the kernel instead of learning how to make a queue on the parent interface.

-- Dixi. Sem.
Re: if_bridge turns off checksum offload of members?
On Tue, Jul 01, 2008 at 12:51:42PM +0300, Stefan Lambrev wrote: Hi, maybe these are stupid questions, but: 1) There are zero matches for IFCAP_TOE in the kernel sources .. there is no support for TOE in 7.0, but maybe this is work in progress for 8-current?

Yes, it's in current only. Just remove IFCAP_TOE.

2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - should TOE be replaced with RXCSUM or just removed? 3) Why is RX never checked? In my case this doesn't matter because em turns off both TX and RX if only one is disabled, but probably there is hardware that can separate them, e.g. RX disabled while TX enabled?

Rx does not matter; whatever isn't offloaded in hardware is just computed locally, such as checking the cksum. It's Tx that messes up the bridge: if an outgoing packet is generated locally on an interface that has Tx offloading, it may actually be sent out a different bridge member that does not have that capability. This would cause it to be sent with an invalid checksum, for instance. The bridge used to just disable Tx offloading, but this patch you are testing makes sure each feature is supported by all members.

4) I'm not sure why a bridge should not work with two interfaces, one of which supports TX offload and the other does not? At least if I turn on checksum offload on only one of the interfaces, the bridge is still working ...

Andrew Thompson wrote: - cut - This patch should do that, are you able to test it Stefan? cheers, Andrew

P.S. I saw very good results with netisr2 on a kernel from p4 a few months ago .. are there any patches flying around so I can test them with 7-STABLE? :)

-- Best Wishes, Stefan Lambrev ICQ# 24134177
Re: if_bridge turns off checksum offload of members?
Greetings Andrew,

The patch compiles and works as expected. I noticed something strange btw - swi1: net was consuming 100% WCPU (shown in top -S), but I'm not sure this has something to do with your patch, as I can't reproduce it right now ..

Andrew Thompson wrote: On Tue, Jul 01, 2008 at 12:51:42PM +0300, Stefan Lambrev wrote: Hi, maybe these are stupid questions, but: 1) There are zero matches for IFCAP_TOE in the kernel sources .. there is no support for TOE in 7.0, but maybe this is work in progress for 8-current? Yes, it's in current only. Just remove IFCAP_TOE. 2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - should TOE be replaced with RXCSUM or just removed? 3) Why is RX never checked? In my case this doesn't matter because em turns off both TX and RX if only one is disabled, but probably there is hardware that can separate them, e.g. RX disabled while TX enabled? Rx does not matter; whatever isn't offloaded in hardware is just computed locally, such as checking the cksum. It's Tx that messes up the bridge: if an outgoing packet is generated locally on an interface that has Tx offloading, it may actually be sent out a different bridge member that does not have that capability. This would cause it to be sent with an invalid checksum, for instance. The bridge used to just disable Tx offloading, but this patch you are testing makes sure each feature is supported by all members. 4) I'm not sure why a bridge should not work with two interfaces, one of which supports TX offload and the other does not? At least if I turn on checksum offload on only one of the interfaces, the bridge is still working ... Andrew Thompson wrote: - cut - This patch should do that, are you able to test it Stefan? cheers, Andrew P.S. I saw very good results with netisr2 on a kernel from p4 a few months ago .. are there any patches flying around so I can test them with 7-STABLE? :)

-- Best Wishes, Stefan Lambrev ICQ# 24134177
Re: FreeBSD NAT-T patch integration
Larry Baird wrote: And how do I know that it works? Well, when it doesn't work, I do know it, quite quickly most of the time!

I have to chime in here. I did most of the initial porting of the NAT-T patches from KAME IPsec to FAST_IPSEC. I did look at every line of code during this process. I found no security problems during the port. Like Yvan, my company uses the NAT-T patches commercially. Like he says, if they had problems, we would hear about it. If the patches don't get committed, I highly suspect Yvan or myself would try to keep the patches up to date. So far I have done FAST_IPSEC patches for FreeBSD 4, 5, and 6. Yvan did 7 and 8 by himself. Keeping up gets to be a pain after a while. I do plan to look at the FreeBSD 7 patches soon, but it sure would be nice to see them committed.

This whole issue seems ridiculous. I've been trying to get the NAT-T patches committed for a while but, since I'm not set up to do any IPSEC testing, have deferred to others. If we need to break a logjam I'll pitch in.

Sam
Re: if_bridge turns off checksum offload of members?
Andrew Thompson wrote: On Tue, Jul 01, 2008 at 12:51:42PM +0300, Stefan Lambrev wrote: Hi, maybe these are stupid questions, but: 1) There are zero matches for IFCAP_TOE in the kernel sources .. there is no support for TOE in 7.0, but maybe this is work in progress for 8-current? Yes, it's in current only. Just remove IFCAP_TOE. 2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - should TOE be replaced with RXCSUM or just removed? 3) Why is RX never checked? In my case this doesn't matter because em turns off both TX and RX if only one is disabled, but probably there is hardware that can separate them, e.g. RX disabled while TX enabled? Rx does not matter; whatever isn't offloaded in hardware is just computed locally, such as checking the cksum. It's Tx that messes up the bridge: if an outgoing packet is generated locally on an interface that has Tx offloading, it may actually be sent out a different bridge member that does not have that capability. This would cause it to be sent with an invalid checksum, for instance. The bridge used to just disable Tx offloading, but this patch you are testing makes sure each feature is supported by all members. 4) I'm not sure why a bridge should not work with two interfaces, one of which supports TX offload and the other does not? At least if I turn on checksum offload on only one of the interfaces, the bridge is still working ... Andrew Thompson wrote: - cut - This patch should do that, are you able to test it Stefan? cheers, Andrew P.S. I saw very good results with netisr2 on a kernel from p4 a few months ago .. are there any patches flying around so I can test them with 7-STABLE? :)

This issue has come up before. Handling checksum offload in the bridge for devices that are not capable is not a big deal and is important for performance. TSO likewise should be done, but we're missing a generic TSO support routine to do that (I believe NetBSD has one and Linux has a GSO mechanism).

Sam
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Thanks.. I was hoping I wasn't seeing things: I do not like inconsistencies.. :/

Stefan Lambrev wrote: Greetings Paul,

--OK I'm stumped now.. Rebuilt with preemption and ULE and preemption again and it's not doing what it did before..

I saw this in my configuration too :) Just leave your test running for a longer time and you will see this strange inconsistency in action. In my configuration I almost always have better throughput after reboot, which drops later (5-10 min under flood) by 50-60kpps, and after another 10-15 min the number of correctly passed packets increases again. Looks like auto tuning of which I'm not aware :)

How could that be? Now about 500kpps.. That kind of inconsistency almost invalidates all my testing.. why would it be so much different after trying a bunch of kernel options and rebooting a bunch of times, when going back to the original config doesn't get you what it did in the beginning.. I'll have to dig into this further.. never seen anything like it :) Hopefully the ip_input fix will help free up a few cpu cycles.
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
I am going to.. I have an Opteron 270 dual set up on 32 bit and the 2212 is set up on 64 bit :) Today should bring some 32 bit results as well as etherchannel results.

Ingo Flaschberger wrote: Dear Paul, Dual Opteron 2212, recompiled kernel with 7-STABLE and removed a lot of junk in the config; added "options NO_ADAPTIVE_MUTEXES", not sure if that makes any difference or not, will test without. Used ULE scheduler, used preemption, CPUTYPE=opteron in /etc/make.conf. 7.0-STABLE FreeBSD 7.0-STABLE #4: Tue Jul 1 01:22:18 CDT 2008 amd64. Max input rate .. 587kpps? Take into consideration that these packets are being forwarded out the em1 interface, which causes a great impact on cpu usage. If I set up a firewall rule to block the packets, it can do over 1mpps on em0 input.

would be great if you can also test with 32bit. what value do you have at net.inet.ip.intr_queue_maxlen?

kind regards, Ingo Flaschberger
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
I can't reproduce the 580kpps maximum that I saw when I first compiled, for some reason I don't understand; the max I get even with ULE and preemption is now about 530 and it dips to 480 a lot.. The first time I tried it, it was at 580 and dipped to 520... what the?.. (kernel config attached at end)

* noticed that SOMETIMES the em0 taskq jumps around cpus and doesn't use 100% of any one cpu
* noticed that the netstat packets per second rate varies explicitly with the CPU usage of em0 taskq

(top output with ULE/PREEMPTION compiled in):

  PID USERNAME PRI NICE  SIZE  RES STATE  C   TIME   WCPU COMMAND
   10 root     171 ki31    0K  16K RUN    3  64:12 94.09% idle: cpu3
   36 root     -68    -    0K  16K CPU1   1   5:43 89.75% em0 taskq
   13 root     171 ki31    0K  16K CPU0   0  63:21 87.30% idle: cpu0
   12 root     171 ki31    0K  16K RUN    1  62:44 66.75% idle: cpu1
   11 root     171 ki31    0K  16K CPU2   2  62:17 56.49% idle: cpu2
   39 root     -68    -    0K  16K -      0   0:54 10.64% em3 taskq

this is about a 480-500kpps rate. now I wait a minute and

  PID USERNAME PRI NICE  SIZE  RES STATE  C   TIME    WCPU COMMAND
   10 root     171 ki31    0K  16K CPU3   3  64:56 100.00% idle: cpu3
   36 root     -68    -    0K  16K CPU2   2   6:21  94.14% em0 taskq
   13 root     171 ki31    0K  16K RUN    0  63:55  80.18% idle: cpu0
   11 root     171 ki31    0K  16K RUN    2  62:48  67.38% idle: cpu2
   12 root     171 ki31    0K  16K CPU1   1  63:04  58.40% idle: cpu1
   39 root     -68    -    0K  16K -      1   1:00  10.21% em3 taskq

530kpps rate... drops to 85%.. 480kpps rate. goes back up to 95%, 530kpps. it keeps flopping like this... none of the CPUs are at 100% use and none of the cpus add up; like, the cpu time of em0 taskq is 94%, so one of the cpus should be 6% idle, but it's not. This is with ULE/PREEMPTION.. I see different behavior without preemption and with 4BSD.. and I also see different behavior depending on the time of day lol :) Figure that one out. I'll post back without preemption and with 4BSD in a min, then I'll move on to the 32 bit platform tests.
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
ULE without PREEMPTION is now yielding better results.

            input          (em0)           output
   packets  errs      bytes    packets  errs  bytes colls
    571595 40639   34564108          1     0    226     0
    577892 48865   34941908          1     0    178     0
    545240 84744   32966404          1     0    178     0
    587661 44691   35534512          1     0    178     0
    587839 38073   35544904          1     0    178     0
    587787 43556   35540360          1     0    178     0
    540786 39492   32712746          1     0    178     0
    572071 55797   34595650          1     0    178     0

*OUCH, IPFW HURTS.. loading ipfw and adding one ipfw rule, allow ip from any to any, drops 100Kpps off :/ what's up with THAT? unloaded the ipfw module and got 100kpps back again; that's not right with ONE rule.. :/

em0 taskq is still jumping cpus.. is there any way to lock it to one cpu, or is this just a function of ULE?

running a "tar czpvf all.tgz *" and seeing if pps changes.. negligible.. guess the scheduler is doing its job at least.. Hmm, even when it's getting 50-60k errors per second on the interface I can still SCP a file through that interface, although it's not fast.. 3-4MB/s.. You know, I wouldn't care if it added 5ms latency to the packets when it was doing 1mpps, as long as it didn't drop any.. Why can't it do that? Queue them up and do them in big chunks so none are dropped... hmm?

The 32 bit system is compiling now.. it won't do 400kpps with a GENERIC kernel, whereas 64 bit did 450k with GENERIC, although that could be the difference between the Opteron 270 and Opteron 2212..

Paul
Re: Poor network performance for clients in 100MB to Gigabit environment
What options do you have enabled on the linux server?

sysctl -a | grep net.ipv4.tcp

and on the bsd

sysctl -a net.inet.tcp

It sounds like a problem with BSD not handling the dropped data or ack packets, so what happens is it pushes a burst of data out at 100mbit, the switch drops the packets, and then BSD waits too long to recover and doesn't scale the transmission back. TCP is supposed to scale down the transmission speed until packets are not dropped, to a point, even without ECN. Options such as 'reno' and 'sack' etc. are congestion control mechanisms that use congestion windows.

David Kwan wrote: I have a couple of questions regarding the TCP stack: I have a situation with clients on a 100MB network connecting to servers on a Gigabit network where the client read speeds are very slow from the FreeBSD server and fast from the Linux server; write speeds from the clients to both servers are fast. (Clients on the gigabit network work fine with blazing read and write speeds.) The network traces show congestion packets for both servers when doing reads from the clients (dup acks and retransmissions), but the Linux server seems to handle the congestion better. ECN is not enabled on the network and I don't see any congestion windowing or client window changes. The 100MB/1G switch is dropping packets. I double-checked the network configuration and also swapped switchports for the servers to make sure the switch configurations are the same, and Linux always does better than FreeBSD. Assuming that the network configuration is a constant for all clients and servers (speed, duplex, etc...), the only variable is the servers themselves (Linux and FreeBSD). I have tried a couple of FreeBSD machines with 6.1 and 7.0 and they exhibit the same problem, with no luck matching the speed and network utilization of Linux (2 years old).
The read speed test I'm referring to is transferring a 100MB file (CIFS, NFS, and FTP); the Linux server does it consistently in around 10 sec (line speed) with a constant network utilization chart, while the FreeBSD servers are orders of magnitude slower with an erratic network utilization chart. I've attempted to tweak some network sysctl options on FreeBSD, and the only ones that helped were disabling TSO and inflight, which leads me to think that the inter-packet gap was slightly increased, partially relieving congestion on the switch; not a long-term solution. My questions are: 1. Have you heard of this problem before with 100MB clients and Gigabit servers? 2. Are you aware of any Linux fix/patch in the TCP stack for handling congestion better than FreeBSD? I'm looking to address this issue in FreeBSD, but wondering if the Linux stack did something special that could help with the FreeBSD performance. David K.
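The "10 seconds is line speed" claim checks out with simple arithmetic (a quick sanity check, not from the original post):

```shell
# 100 MB transferred in roughly 10 seconds
mb=100
secs=10
mbytes_per_sec=$((mb / secs))           # megabytes per second
mbits_per_sec=$((mbytes_per_sec * 8))   # megabits per second
echo "${mbytes_per_sec} MB/s = ${mbits_per_sec} Mbit/s"
# 80 Mbit/s of payload is about what a 100 Mbit link sustains
# once TCP/IP and Ethernet framing overhead are subtracted.
```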
Re: Poor network performance for clients in 100MB to Gigabit environment
Take it from someone who has spent a couple of weeks beating his head against a wall over this... system tuning is essential. If your driver is going to the kernel looking for a resource and having to wait, it's going to hurt... Look into kern.ipc and, as Paul said, net.inet. An off-the-shelf config is more than likely going to be inadequate. Good luck, Jack

On Tue, Jul 1, 2008 at 12:50 PM, David Kwan [EMAIL PROTECTED] wrote: I have a couple of questions regarding the TCP stack: I have a situation with clients on a 100MB network connecting to servers on a Gigabit network where the client read speeds are very slow from the FreeBSD server and fast from the Linux server; write speeds from the clients to both servers are fast. (Clients on the Gigabit network work fine, with blazing read and write speeds.) The network traces show congestion for both servers when doing reads from the clients (dup ACKs and retransmissions), but the Linux server seems to handle the congestion better. ECN is not enabled on the network and I don't see any congestion windowing or client window changes. The 100MB/1G switch is dropping packets. I double-checked the network configuration and also swapped switch ports between the servers to make sure the switch configurations are the same, and Linux always does better than FreeBSD. Assuming that the network configuration is a constant for all clients and servers (speed, duplex, etc.), the only variable is the servers themselves (Linux and FreeBSD). I have tried a couple of FreeBSD machines with 6.1 and 7.0 and they exhibit the same problem, with no luck matching the speed and network utilization of Linux (2 years old). The read speed test I'm referring to is transferring a 100MB file (CIFS, NFS, and FTP); the Linux server does it consistently in around 10 sec (line speed) with a constant network utilization chart, while the FreeBSD servers are orders of magnitude slower with an erratic network utilization chart. I've attempted to tweak some network sysctl options on FreeBSD, and the only ones that helped were disabling TSO and inflight, which leads me to think that the inter-packet gap was slightly increased, partially relieving congestion on the switch; not a long-term solution. My questions are: 1. Have you heard of this problem before with 100MB clients and Gigabit servers? 2. Are you aware of any Linux fix/patch in the TCP stack for handling congestion better than FreeBSD? I'm looking to address this issue in FreeBSD, but wondering if the Linux stack did something special that could help with the FreeBSD performance. David K.
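A starting point for the kern.ipc / net.inet tuning Jack mentions might look like the following /etc/sysctl.conf fragment. The knob names are real FreeBSD 7.x sysctls, but the values are illustrative guesses rather than recommendations from this thread, and some (like nmbclusters) may need to be set as loader tunables instead:

```shell
# /etc/sysctl.conf -- example values only; tune against your workload
kern.ipc.maxsockbuf=8388608        # allow larger socket buffers
kern.ipc.nmbclusters=65536         # more mbuf clusters for traffic bursts
net.inet.tcp.sendspace=131072      # default TCP send buffer
net.inet.tcp.recvspace=131072      # default TCP receive buffer
net.inet.tcp.inflight.enable=0     # the thread found disabling inflight helped
```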
Maximum ARP Entries
Does anyone know if there is a maximum number of ARP entries/adjacencies that FreeBSD can handle before recycling? I want to route several thousand IPs directly to some interfaces, so the box will have 3-4k ARP entries.. I'm curious because on Linux I have to set the net.ipv4.neigh sysctl thresholds a lot higher or it bombs with 'too many neighbors'... I don't see a setting like this in the FreeBSD sysctls. Thanks! Paul
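For comparison, the Linux knobs Paul is referring to are the neighbour-table garbage-collection thresholds; raising them for a few thousand ARP entries looks like this (the values are illustrative, not from the thread):

```shell
# Linux neighbour (ARP) table limits -- garbage collection starts
# above gc_thresh1, becomes aggressive above gc_thresh2, and the
# table hard-fails ("neighbour table overflow") past gc_thresh3.
sysctl -w net.ipv4.neigh.default.gc_thresh1=2048
sysctl -w net.ipv4.neigh.default.gc_thresh2=4096
sysctl -w net.ipv4.neigh.default.gc_thresh3=8192
```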
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Ok, now THIS is absolutely a whole bunch of ridiculousness.. I set up etherchannel, and I'm evenly distributing packets over em0, em1, and em2 to lagg0, and I get WORSE performance than with a single interface.. Can anyone explain this one? This is horrible. I've got the em0-em2 taskqs using 80% CPU EACH and they are only doing 100kpps EACH. Look:

            input (em0)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    105050 11066    6303000         0     0      0      0
    104952 13969    6297120         0     0      0      0
    104331 12121    6259860         0     0      0      0

            input (em1)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    103734 70658    6223998         0     0      0      0
    103483 75703    6209046         0     0      0      0
    103848 76195    6230886         0     0      0      0

            input (em2)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    103299 62957    6197940         1     0    226      0
    106388 73071    6383280         1     0    178      0
    104503 70573    6270180         4     0    712      0

last pid: 1378;  load averages: 2.31, 1.28, 0.57  up 0+00:06:27  17:42:32
68 processes: 8 running, 42 sleeping, 18 waiting
CPU: 0.0% user, 0.0% nice, 58.9% system, 0.0% interrupt, 41.1% idle
Mem: 7980K Active, 5932K Inact, 47M Wired, 16K Cache, 8512K Buf, 1920M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  PRI NICE  SIZE    RES STATE  C   TIME   WCPU COMMAND
   11 root      171 ki31    0K    16K RUN    2   5:18 80.47% idle: cpu2
   38 root      -68    -    0K    16K CPU3   3   2:30 80.18% em2 taskq
   37 root      -68    -    0K    16K CPU1   1   2:28 76.90% em1 taskq
   36 root      -68    -    0K    16K CPU2   2   2:28 72.56% em0 taskq
   13 root      171 ki31    0K    16K RUN    0   3:32 29.20% idle: cpu0
   12 root      171 ki31    0K    16K RUN    1   3:29 27.88% idle: cpu1
   10 root      171 ki31    0K    16K RUN    3   3:21 25.63% idle: cpu3
   39 root      -68    -    0K    16K -      3   0:32 17.68% em3 taskq

See, that's total wrongness.. something is very wrong here. Does anyone have any ideas? I really need to get this working. I figured that if I evenly distributed the packets over 3 interfaces it would simulate having 3 rx queues, because each interface gets a separate process, but the result is WAY more CPU usage and a little over half the pps throughput of a single port. If anyone is interested in tackling some of these issues please e-mail me. It would be greatly appreciated.
Paul

Julian Elischer wrote:
Paul wrote: ULE without PREEMPTION is now yielding better results.

            input (em0)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    571595 40639   34564108         1     0    226      0
    577892 48865   34941908         1     0    178      0
    545240 84744   32966404         1     0    178      0
    587661 44691   35534512         1     0    178      0
    587839 38073   35544904         1     0    178      0
    587787 43556   35540360         1     0    178      0
    540786 39492   32712746         1     0    178      0
    572071 55797   34595650         1     0    178      0

*OUCH, IPFW HURTS.* Loading ipfw and adding one ipfw rule (allow ip from any to any) drops 100Kpps off :/ What's up with THAT? Unloading the ipfw module brings the 100Kpps back again; that's not right with ONE rule.. :/

ipfw needs to gain a lock on the firewall before running, and is quite complex.. I can believe it.. in FreeBSD 4.8 I was able to use ipfw and filter 1Gb between two interfaces (bridged), but I think it has slowed down since then due to the SMP locking.

em0 taskq is still jumping CPUs.. is there any way to lock it to one CPU, or is this just a function of ULE? Running a "tar czpvf all.tgz *" and seeing if pps changes.. negligible.. guess the scheduler is doing its job at least. Hmm. Even when it's getting 50-60k errors per second on the interface I can still scp a file through that interface, although it's not fast.. 3-4MB/s.

You know, I wouldn't care if it added 5ms of latency to the packets when it was doing 1Mpps, as long as it didn't drop any.. Why can't it do that? Queue them up and process them in big chunks so none are dropped, hmm?

The 32-bit system is compiling now.. it won't do 400kpps with a GENERIC kernel, whereas the 64-bit box did 450k with GENERIC, although that could be the difference between an Opteron 270 and an Opteron 2212..

Paul
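For context, the lagg setup being tested above looks roughly like this on FreeBSD 7 (interface names are from the post; the load-balance protocol choice and IP address are my assumptions):

```shell
# Aggregate three em(4) ports into one lagg interface
ifconfig lagg0 create
ifconfig lagg0 laggproto loadbalance \
    laggport em0 laggport em1 laggport em2
ifconfig lagg0 10.1.1.1/24 up
```

With laggproto loadbalance, outgoing traffic is hashed across the member ports, which is what distributes the packets over em0-em2.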
Re: kern/124753: net80211 discards power-save queue packets early
Sepherosa Ziehau wrote: On Thu, Jun 19, 2008 at 6:30 PM, [EMAIL PROTECTED] wrote: Synopsis: net80211 discards power-save queue packets early Responsible-Changed-From-To: freebsd-i386-freebsd-net Responsible-Changed-By: remko Responsible-Changed-When: Thu Jun 19 10:29:47 UTC 2008 Responsible-Changed-Why: reassign to networking team. http://www.freebsd.org/cgi/query-pr.cgi?pr=124753

In How-To-Repeat, you said: "Then associate a recent Windows Mobile 6.1 device to the FreeBSD box running hostapd ..." In Description, you said: "The WM6.1 device recv ps-poll's for packets every 20 seconds ..." AFAIK, the STA sends ps-poll to the AP; the AP does not send ps-poll to the STA. Why did your Windows STA receive ps-poll from the FreeBSD AP? Did you capture it using an 802.11 tap? And which FreeBSD driver were you using? Your problem looks like one of the following: either the FreeBSD AP did not properly configure the TIM in beacons, which could easily be found out by using an 802.11 tap (though I highly suspect that, if you were using ath(4), the TIM would be misconfigured); or your Windows STA didn't process the TIM according to the 802.11 standard.

The PR states the listen interval sent by the station is 3 (beacons) and the beacon interval is 100 TU. This means the AP is required to buffer unicast frames for only 300 TU, which is ~300 ms. But according to the report the Windows device is polling every 20 seconds, so there's no guarantee any packets will be present (even with the net80211 code arbitrarily using 4x the listen interval specified by the sta). I find it really hard to believe a device would poll every 20 secs, so something seems wrong in what's reported/observed. Given that defeating the aging logic just pushed the problem elsewhere, it sounds like there's something else wrong which (as you note) probably requires a packet capture to understand. I'm pretty sure the TIM is handled correctly in RELENG_7, but a packet capture would help us verify that.
Sam
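Sam's 300 TU ≈ 300 ms figure follows from the 802.11 time-unit definition (1 TU = 1024 µs); a quick check:

```shell
# Listen interval of 3 beacons x 100 TU beacon interval
tu_us=1024                               # one 802.11 time unit in microseconds
buffer_tu=$((3 * 100))                   # TUs the AP must buffer frames for
buffer_ms=$((buffer_tu * tu_us / 1000))  # ~307 ms, i.e. roughly 300 ms
echo "${buffer_tu} TU = ${buffer_ms} ms"
```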
Re: Maximum ARP Entries
On Tue, Jul 01, 2008 at 04:37:01PM -0400, Paul wrote: Does anyone know if there is a maximum number of ARP entries/adjacencies that FBSD can handle before recycling? In FreeBSD, ARP still uses the routing table as its storage, so the limits on routing table memory apply; the routing table currently has no limit. Cheers, -- Ruslan Ermilov [EMAIL PROTECTED] FreeBSD committer
Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Apparently lagg hasn't had the Giant lock removed :/ Can we do something about this quickly? With adaptive Giant I get more performance on lagg, but the CPU usage is smashed, 100%. I get about 50k more pps per interface (so 150kpps more total, which STILL is less than a single gigabit port). Check it out:

68 processes: 9 running, 41 sleeping, 18 waiting
CPU: 0.0% user, 0.0% nice, 89.5% system, 0.0% interrupt, 10.5% idle
Mem: 8016K Active, 6192K Inact, 47M Wired, 108K Cache, 9056K Buf, 1919M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  PRI NICE  SIZE    RES STATE  C   TIME    WCPU COMMAND
   38 root      -68    -    0K    16K CPU1   1   3:29 100.00% em2 taskq
   37 root      -68    -    0K    16K CPU0   0   3:31  98.78% em1 taskq
   36 root      -68    -    0K    16K CPU3   3   2:53  82.42% em0 taskq
   11 root      171 ki31    0K    16K RUN    2  22:48  79.00% idle: cpu2
   10 root      171 ki31    0K    16K RUN    3  20:51  22.90% idle: cpu3
   39 root      -68    -    0K    16K RUN    2   0:32  16.60% em3 taskq
   12 root      171 ki31    0K    16K RUN    1  20:16   2.05% idle: cpu1
   13 root      171 ki31    0K    16K RUN    0  20:25   1.90% idle: cpu0

            input (em0)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    122588     0    7355280         0     0      0      0
    123057     0    7383420         0     0      0      0

            input (em1)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    174917 11899   10495032         2     0    178      0
    173967 11697   10438038         2     0    356      0
    174630 10603   10477806         2     0    268      0

            input (em2)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    175843  3928   10550580         0     0      0      0
    175952  5750   10557120         0     0      0      0

Still less performance than a single gig-e.. that Giant lock really sucks, and why on earth would lagg require it.. It seems so simple to fix :/ Anyone up for it? :) I wish I was a programmer sometimes, but network engineering will have to do. :D

Julian Elischer wrote:
Paul wrote: Is PF better than ipfw? iptables has almost no impact on routing performance unless I add a swath of rules to it, and then it bombs. I need maybe 10 rules max and I don't want a 20% performance drop for that.. :P

Well, lots of people have wanted to fix it, and I've investigated quite a lot, but it takes someone with 2 weeks of free time and all the right clues. It's not inherent in ipfw, but it needs some TLC from someone who cares :-).

Ouch! :) Is this going to be fixed any time soon? We have some money that can be used for development costs to fix things like this, because we use Linux and FreeBSD machines as firewalls for a lot of customers, and with the increasing bandwidth and pps the customers are demanding more, and I can't give them better performance with a brand-new dual Xeon or Opteron machine vs the old P4 machines I have them running on now :/ The only difference between the new machine and the old machine is that the new one can take in more pps and drop it, but it can't route a whole lot more. Routing/firewalling must still not be lock-free, ugh.. :P Thanks

Julian Elischer wrote:
Paul wrote: ULE without PREEMPTION is now yielding better results.

            input (em0)                    output
   packets  errs      bytes   packets  errs  bytes  colls
    571595 40639   34564108         1     0    226      0
    577892 48865   34941908         1     0    178      0
    545240 84744   32966404         1     0    178      0
    587661 44691   35534512         1     0    178      0
    587839 38073   35544904         1     0    178      0
    587787 43556   35540360         1     0    178      0
    540786 39492   32712746         1     0    178      0
    572071 55797   34595650         1     0    178      0

*OUCH, IPFW HURTS.* Loading ipfw and adding one ipfw rule (allow ip from any to any) drops 100Kpps off :/ What's up with THAT? Unloading the ipfw module brings the 100Kpps back again; that's not right with ONE rule.. :/

ipfw needs to gain a lock on the firewall before running, and is quite complex.. I can believe it.. in FreeBSD 4.8 I was able to use ipfw and filter 1Gb between two interfaces (bridged), but I think it has slowed down since then due to the SMP locking.

em0 taskq is still jumping CPUs.. is there any way to lock it to one CPU, or is this just a function of ULE? Running a "tar czpvf all.tgz *" and seeing if pps changes.. negligible.. guess the scheduler is doing its job at least. Hmm. Even when it's getting 50-60k errors per second on the interface I can still scp a file through that interface, although it's not fast.. 3-4MB/s.

You know, I wouldn't care if it added 5ms of latency to the packets when it was doing 1Mpps, as long as it didn't drop any.. Why can't it do that?
RE: Poor network performance for clients in 100MB to Gigabit environment
I've attempted many standard and non-standard permutations of the TCP tuning parameters via sysctl without much success. It feels like FreeBSD is not handling the congestion very well and that this is beyond tuning sysctls. It's just the clients on the 100MB networks that have slow/erratic reads; clients on the Gigabit network are fine and scream, so the original TCP parameters are just fine for them. For the record, these are the sysctl options for Linux and FreeBSD.

Linux:

net.ipv4.conf.eth0.force_igmp_version = 0
net.ipv4.conf.eth0.disable_policy = 0
net.ipv4.conf.eth0.disable_xfrm = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.eth0.arp_announce = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.tag = 0
net.ipv4.conf.eth0.log_martians = 0
net.ipv4.conf.eth0.bootp_relay = 0
net.ipv4.conf.eth0.medium_id = 0
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.eth0.send_redirects = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth0.shared_media = 1
net.ipv4.conf.eth0.secure_redirects = 1
net.ipv4.conf.eth0.accept_redirects = 1
net.ipv4.conf.eth0.mc_forwarding = 0
net.ipv4.conf.eth0.forwarding = 0
net.ipv4.conf.lo.force_igmp_version = 0
net.ipv4.conf.lo.disable_policy = 1
net.ipv4.conf.lo.disable_xfrm = 1
net.ipv4.conf.lo.arp_ignore = 0
net.ipv4.conf.lo.arp_announce = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.tag = 0
net.ipv4.conf.lo.log_martians = 0
net.ipv4.conf.lo.bootp_relay = 0
net.ipv4.conf.lo.medium_id = 0
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.lo.accept_source_route = 1
net.ipv4.conf.lo.send_redirects = 1
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.lo.shared_media = 1
net.ipv4.conf.lo.secure_redirects = 1
net.ipv4.conf.lo.accept_redirects = 1
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.conf.default.force_igmp_version = 0
net.ipv4.conf.default.disable_policy = 0
net.ipv4.conf.default.disable_xfrm = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.tag = 0
net.ipv4.conf.default.log_martians = 0
net.ipv4.conf.default.bootp_relay = 0
net.ipv4.conf.default.medium_id = 0
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.send_redirects = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.shared_media = 1
net.ipv4.conf.default.secure_redirects = 1
net.ipv4.conf.default.accept_redirects = 1
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.all.force_igmp_version = 0
net.ipv4.conf.all.disable_policy = 0
net.ipv4.conf.all.disable_xfrm = 0
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.all.arp_announce = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.tag = 0
net.ipv4.conf.all.log_martians = 0
net.ipv4.conf.all.bootp_relay = 0
net.ipv4.conf.all.medium_id = 0
net.ipv4.conf.all.proxy_arp = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.send_redirects = 1
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.all.shared_media = 1
net.ipv4.conf.all.secure_redirects = 1
net.ipv4.conf.all.accept_redirects = 1
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.all.forwarding = 0
net.ipv4.neigh.eth0.locktime = 99
net.ipv4.neigh.eth0.proxy_delay = 79
net.ipv4.neigh.eth0.anycast_delay = 99
net.ipv4.neigh.eth0.proxy_qlen = 64
net.ipv4.neigh.eth0.unres_qlen = 3
net.ipv4.neigh.eth0.gc_stale_time = 60
net.ipv4.neigh.eth0.delay_first_probe_time = 5
net.ipv4.neigh.eth0.base_reachable_time = 30
net.ipv4.neigh.eth0.retrans_time = 99
net.ipv4.neigh.eth0.app_solicit = 0
net.ipv4.neigh.eth0.ucast_solicit = 3
net.ipv4.neigh.eth0.mcast_solicit = 3
net.ipv4.neigh.lo.locktime = 99
net.ipv4.neigh.lo.proxy_delay = 79
net.ipv4.neigh.lo.anycast_delay = 99
net.ipv4.neigh.lo.proxy_qlen = 64
net.ipv4.neigh.lo.unres_qlen = 3
net.ipv4.neigh.lo.gc_stale_time = 60
net.ipv4.neigh.lo.delay_first_probe_time = 5
net.ipv4.neigh.lo.base_reachable_time = 30
net.ipv4.neigh.lo.retrans_time = 99
net.ipv4.neigh.lo.app_solicit = 0
net.ipv4.neigh.lo.ucast_solicit = 3
net.ipv4.neigh.lo.mcast_solicit = 3
net.ipv4.neigh.default.gc_thresh3 = 1024
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_interval = 30
net.ipv4.neigh.default.locktime = 99
net.ipv4.neigh.default.proxy_delay = 79
net.ipv4.neigh.default.anycast_delay = 99
net.ipv4.neigh.default.proxy_qlen = 64
net.ipv4.neigh.default.unres_qlen = 3
net.ipv4.neigh.default.gc_stale_time = 60
net.ipv4.neigh.default.delay_first_probe_time = 5
net.ipv4.neigh.default.base_reachable_time = 30
net.ipv4.neigh.default.retrans_time = 99
net.ipv4.neigh.default.app_solicit = 0
net.ipv4.neigh.default.ucast_solicit = 3
net.ipv4.neigh.default.mcast_solicit = 3
net.ipv4.tcp_slow_start_after_idle = 1
net.ipv4.tcp_workaround_signed_windows = 1
net.ipv4.tcp_bic_beta = 819
net.ipv4.tcp_tso_win_divisor = 8
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_bic_low_window = 14
Re: Route messages
At 05:24 AM 7/1/2008, Bjoern A. Zeeb wrote: So I had a very quick look at the code between doing something else. I think the only change needed is this, if I am not mistaken, but my head is nowhere near close enough to this code.

Hi, the patch seems to work, in that there is no longer an RTM_MISS message generated per packet forwarded on my test box. Is it the final / correct version? ---Mike
Re: Poor network performance for clients in 100MB to Gigabit environment
Are the NFS mounts UDP or TCP on Linux and FreeBSD? I believe FreeBSD still defaults to UDP, which can act differently, especially for NFS.

On Tue, Jul 01, 2008 at 05:30:35PM -0700, David Kwan wrote: I've attempted many standard and non-standard permutations of the TCP tuning parameters via sysctl without much success. It feels like FreeBSD is not handling the congestion very well and that this is beyond tuning sysctls. It's just the clients on the 100MB networks that have slow/erratic reads; clients on the Gigabit network are fine and scream, so the original TCP parameters are just fine for them. David K.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Sent: Tuesday, July 01, 2008 1:21 PM To: David Kwan Cc: freebsd-net@freebsd.org Subject: Re: Poor network performance for clients in 100MB to Gigabit environment

What options do you have enabled on the Linux server? On Linux: "sysctl -a | grep net.ipv4.tcp", and on the BSD: "sysctl -a net.inet.tcp". It sounds like a problem with FreeBSD not handling the dropped data or ACK packets: it pushes a burst of data out, the 100Mbit switch drops the packets, and then FreeBSD waits too long to recover and doesn't scale the transmission back. TCP is supposed to scale the transmission speed down to the point where packets are no longer dropped, even without ECN. Options such as 'reno' and 'sack' etc. name congestion control algorithms that use congestion windows.

David Kwan wrote: I have a couple of questions regarding the TCP stack: I have a situation with clients on a 100MB network connecting to servers on a Gigabit network where the client read speeds are very slow from the FreeBSD server and fast from the Linux server; write speeds from the clients to both servers are fast. (Clients on the Gigabit network work fine, with blazing read and write speeds.) The network traces show congestion for both servers when doing reads from the clients (dup ACKs and retransmissions), but the Linux server seems to handle the congestion better. ECN is not enabled on the network and I don't see any congestion windowing or client window changes. The 100MB/1G switch is dropping packets. I double-checked the network configuration and also swapped switch ports between the servers to make sure the switch configurations are the same, and Linux always does better than FreeBSD. Assuming that the network configuration is a constant for all clients and servers (speed, duplex, etc.), the only variable is the servers themselves (Linux and FreeBSD). I have tried a couple of FreeBSD machines with 6.1 and 7.0 and they exhibit the same problem, with no luck matching the speed and network utilization of Linux (2 years old). The read speed test I'm referring to is transferring a 100MB file (CIFS, NFS, and FTP); the Linux server does it consistently in around 10 sec (line speed) with a constant network utilization chart, while the FreeBSD servers are orders of magnitude slower with an erratic network utilization chart. I've attempted to tweak some network sysctl options on FreeBSD, and the only ones that helped were disabling TSO and inflight, which leads me to think that the inter-packet gap was slightly increased, partially relieving congestion on the switch; not a long-term solution. My questions are: 1. Have you heard of this problem before with 100MB clients and Gigabit servers? 2. Are you aware of any Linux fix/patch in the TCP stack for handling congestion better than FreeBSD? I'm looking to address this issue in FreeBSD, but wondering if the Linux stack did something special that could help with the FreeBSD performance. David K.
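The NFS transport can be forced per-mount on both systems; a sketch of switching to TCP (the server and path names are placeholders):

```shell
# FreeBSD client: mount over TCP instead of the UDP default
mount_nfs -T server:/export /mnt          # -T selects TCP
# or, equivalently:
mount -t nfs -o tcp server:/export /mnt

# Linux client equivalent, for comparison
mount -t nfs -o proto=tcp server:/export /mnt
```

TCP NFS rides on TCP's own congestion control, which tends to behave better than UDP NFS across a speed-mismatched (1G-to-100M) switch.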