Re: Route messages

2008-07-01 Thread Mike Tancsa

At 10:34 PM 6/27/2008, [EMAIL PROTECTED] wrote:

On Sun, 15 Jun 2008 11:16:17 +0100, in sentex.lists.freebsd.net you
wrote:

Paul wrote:
 Get these with GRE tunnel on
 FreeBSD 7.0-STABLE FreeBSD 7.0-STABLE #5: Sun May 11 19:00:57 EDT
 2008 :/usr/obj/usr/src/sys/ROUTER  amd64
 But do not get them with 7.0-RELEASE

 Any ideas what changed? :)  Wish there was some sort of changelog..
 # of messages per second seems consistent with packets per second on
 GRE interface..
 No impact in routing, but definitely impact in cpu usage for all
 processes monitoring the route messages.

RTM_MISS is actually fairly common when you don't have a default route.


Hi,
I am seeing this issue as well on a pair of  recently deployed
boxes, one  running MPD and one acting as an area router in front of
it. The MPD box has a default route and only has 400 routes or so.

A steady stream of those messages, upwards of 500 per second.

got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno
0, flags:DONE
locks:  inits:
sockaddrs: DST
 default

got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno
0, flags:DONE
locks:  inits:
sockaddrs: DST
 default

Is there a way to try and track down what is generating those messages?
It's eating up a fair bit of CPU with quagga (the zebra process
specifically).


I narrowed down where the change to RELENG_7 happened.  It looks like 
a commit around April 22nd caused the behaviour to change.


When a box acting as a router has a packet transit it, an RTM_MISS is 
generated for *each packet*...



Given a setup of

H1 --- R1 --- H2

where
H1 is 10.10.1.2/24
H2 is 10.20.1.2/24
and
R1 has 2 interfaces, 10.10.1.1/24 and 10.20.1.1/24

Pinging H2 from H1 makes R1 generate an RTM_MISS for each packet!  For 
routing daemons such as zebra, this eats up a *lot* of CPU.  Turning 
on fast forwarding (net.inet.ip.fastforwarding) stops this behaviour on 
R1.  However, if the interface routing the packet is a netgraph 
interface (e.g. mpd), fast forwarding doesn't seem to have an effect and 
the RTM_MISS messages are generated again for each packet.



The ping packet below is a valid icmp echo request and reply.

e.g
0[releng7]# ping -c 2 -S 10.20.1.2 10.10.1.2
PING 10.10.1.2 (10.10.1.2) from 10.20.1.2: 56 data bytes
64 bytes from 10.10.1.2: icmp_seq=0 ttl=63 time=0.302 ms
64 bytes from 10.10.1.2: icmp_seq=1 ttl=63 time=0.337 ms

--- 10.10.1.2 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.302/0.320/0.337/0.018 ms
0[releng7]#

generates 4 messages on the router

[r7-router]# route -n monitor

got message of size 96 on Tue Jul  1 00:42:35 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 
0, flags:DONE

locks:  inits:
sockaddrs: DST
 default

got message of size 96 on Tue Jul  1 00:42:35 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 
0, flags:DONE

locks:  inits:
sockaddrs: DST
 default

got message of size 96 on Tue Jul  1 00:42:36 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 
0, flags:DONE

locks:  inits:
sockaddrs: DST
 default

got message of size 96 on Tue Jul  1 00:42:36 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno 
0, flags:DONE

locks:  inits:
sockaddrs: DST
 default



I am thinking

http://lists.freebsd.org/pipermail/cvs-src/2008-April/090303.html
is the commit? If I revert to the previous version, the issue goes away.


The kernel config is just:

0[r7-router]% diff router GENERIC
24,27c24
< ident		router
< 
< makeoptions	MODULES_OVERRIDE="ipfw acpi"
< 
---
> ident		GENERIC
37,38c34,35
< #options 	INET6			# IPv6 communications protocols
< #options 	SCTP			# Stream Control Transmission Protocol
---
> options 	INET6			# IPv6 communications protocols
> options 	SCTP			# Stream Control Transmission Protocol
47c44
< #options 	NFSLOCKD		# Network Lock Manager
---
> options 	NFSLOCKD		# Network Lock Manager
61c58
< #options 	STACK			# stack(9) support
---
> options 	STACK			# stack(9) support
303c300
< #device		uslcom			# SI Labs CP2101/CP2102 serial adapters
---
> device		uslcom			# SI Labs CP2101/CP2102 serial adapters



---Mike 


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: if_bridge turns off checksum offload of members?

2008-07-01 Thread Stefan Lambrev



Andrew Thompson wrote:

On Mon, Jun 30, 2008 at 07:16:29PM +0900, Pyun YongHyeon wrote:
  

On Mon, Jun 30, 2008 at 12:11:40PM +0300, Stefan Lambrev wrote:
  Greetings,
  
  I just noticed that when I add an em network card to a bridge the checksum 
  offload is turned off.

  I even put in my rc.conf:
  ifconfig_em0="rxcsum up"
  ifconfig_em1="rxcsum up"
  but after reboot both em0 and em1 have this feature disabled.
  
  Is this expected behavior? Should I care about csum in bridge mode?

  I noticed that enabling checksum offload manually improves things a little, btw.
  


AFAIK this is intended; bridge(4) turns off Tx-side checksum
offload by default. I think disabling Tx checksum offload is
required because not all members of a bridge may be able to do checksum
offload. The same is true for TSO, but it seems that bridge(4)
doesn't disable it.
If all members of the bridge have the same hardware capability, I think
bridge(4) may not need to disable Tx-side hardware assistance. I
guess bridge(4) could scan the capabilities of every member interface
and decide what hardware assistance can be activated instead of
blindly turning off Tx-side hardware assistance.



This patch should do that, are you able to test it Stefan?
  

=== if_bridge (all)
cc -O2 -fno-strict-aliasing -pipe -march=nocona  -D_KERNEL -DKLD_MODULE 
-std=c99 -nostdinc   -DHAVE_KERNEL_OPTION_HEADERS -include 
/usr/obj/usr/src/sys/CORE/opt_global.h -I. -I@ -I@/contrib/altq 
-finline-limit=8000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-common -g -fno-omit-frame-pointer 
-I/usr/obj/usr/src/sys/CORE -mcmodel=kernel -mno-red-zone  -mfpmath=387 
-mno-sse -mno-sse2 -mno-mmx -mno-3dnow  -msoft-float 
-fno-asynchronous-unwind-tables -ffreestanding -Wall -Wredundant-decls 
-Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
-Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign 
-fformat-extensions -c /usr/src/sys/modules/if_bridge/../../net/if_bridge.c
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c: In function 
'bridge_capabilities':
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:787: error: 
'IFCAP_TOE' undeclared (first use in this function)
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:787: error: (Each 
undeclared identifier is reported only once
/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:787: error: for 
each function it appears in.)

*** Error code 1
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error

I'm building without -j5 to see if the error message will change :)

I'm using 7-STABLE from Jun 27


cheers,
Andrew
  



___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


--

Best Wishes,
Stefan Lambrev
ICQ# 24134177

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Stefan Lambrev

Hi,

Ingo Flaschberger wrote:

Dear Rudy,

I used polling in FreeBSD 5.x and it helped a bunch.  I set up a new 
router with 7.0 and MSI was recommended to me.  I noticed no 
difference when moving from polling to MSI; however, on 5.4 polling 
seemed to help a lot.  What are people using in 7.0?

polling or MSI?


If you have an inet router with GigE uplinks, it is possible that there 
will be (D)DoS attacks.
Only polling then helps you keep the router manageable (while 
dropping packets).

Let me disagree :)
I'm experimenting with a bridge and the Intel 82571EB Gigabit Ethernet Controller.
On a quad-core system I have no problems with the stability of the bridge 
without polling.
The em0 taskq takes 100% CPU, but I have another three CPUs/cores that are 
free, the router is very, very stable, there is no lag on other interfaces,

and the average load is not very high either.


Kind regards,
Ingo Flaschberger

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


--

Best Wishes,
Stefan Lambrev
ICQ# 24134177

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Route messages

2008-07-01 Thread Bjoern A. Zeeb

On Tue, 1 Jul 2008, Bjoern A. Zeeb wrote:

Hi,


On Tue, 1 Jul 2008, Andre Oppermann wrote:

Hi,


Mike Tancsa wrote:

I am thinking

http://lists.freebsd.org/pipermail/cvs-src/2008-April/090303.html
is the commit ? If I revert to the prev version, the issue goes away.


Ha, I finally know why I ended up on Cc: of a thread I had no idea
about. Someone could have told me instead of blindly adding me;-)



Yes, this change doesn't look right.  It should only do the route
lookup in ip_input.c when there was an EMSGSIZE error returned by
ip_output().  The rtalloc_ign() call causes the message to be sent
because it always sets report to one.  The default message is RTM_MISS.

I'll try to prep an updated patch which doesn't have these issues later
today.


Yeah my bad. Sorry.

If you do that, do not do an extra route lookup if possible, correct
the rtalloc call. Thanks.


So I had a very quick look at the code in between doing something else.
I think the only change needed is this, if I am not mistaken, but my
head is far away, nowhere close enough to this code.

Andre, could you review this?

Index: sys/netinet/ip_input.c
===
RCS file: /shared/mirror/FreeBSD/r/ncvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.332.2.2
diff -u -p -r1.332.2.2 ip_input.c
--- sys/netinet/ip_input.c  22 Apr 2008 12:02:55 - 1.332.2.2
+++ sys/netinet/ip_input.c  1 Jul 2008 09:23:08 -
@@ -1363,7 +1363,6 @@ ip_forward(struct mbuf *m, int srcrt)
 * the ICMP_UNREACH_NEEDFRAG Next-Hop MTU field described in RFC1191.
 */
 	bzero(&ro, sizeof(ro));
-	rtalloc_ign(&ro, RTF_CLONING);
 
 	error = ip_output(m, NULL, &ro, IP_FORWARDING, NULL, NULL);
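
For context, the rtalloc_ign() call removed above is what generates the
per-packet RTM_MISS.  ip_forward() calls it on a just-zeroed struct route,
so the destination in it is still empty, which is presumably why the
lookup fails for every forwarded packet (and why the monitor output
earlier in the thread shows "default" as the DST).  rtalloc_ign() in turn
hard-wires the report argument of rtalloc1(), and a reported miss becomes
a routing-socket message.  Roughly, from memory of the RELENG_7 sources
(a heavily elided sketch, not a patch):

void
rtalloc_ign(struct route *ro, u_long ignore)
{
	...
	/* report is hard-wired to 1 here */
	ro->ro_rt = rtalloc1(&ro->ro_dst, 1, ignore);
	...
}

struct rtentry *
rtalloc1(struct sockaddr *dst, int report, u_long ignflags)
{
	...
	/* a failed lookup with report != 0 broadcasts an RTM_MISS
	 * message to all routing sockets */
	if (report)
		rt_missmsg(RTM_MISS, &info, 0, err);
	...
}

Since ip_forward() runs once per forwarded packet, that is one message
per transit packet, which is exactly what the route monitor shows.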


--
Bjoern A. Zeeb  Stop bit received. Insert coin for new game.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: if_bridge turns off checksum offload of members?

2008-07-01 Thread Stefan Lambrev

Hi,

Maybe these are stupid questions, but:

1) There are zero matches of IFCAP_TOE in the kernel sources .. there is no 
support for TOE in 7.0, but maybe this is work in progress for 8-current?
2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - 
should TOE be replaced with RXCSUM or just removed?
3) Why is RX never checked? In my case this doesn't matter because em 
turns off both TX and RX if only one is disabled, but probably there is 
hardware that can separate them, e.g. RX disabled while TX enabled?
4) I'm not sure why a bridge should not work with two interfaces, one of 
which supports TX and the other does not? At least if I turn on checksum 
offload only on one of the interfaces the bridge is still working ...

Andrew Thompson wrote:

- cut -



This patch should do that, are you able to test it Stefan?


cheers,
Andrew
  
P.S. I saw very good results with netisr2 on a kernel from p4 a few 
months ago .. are there any patches flying around so I can test them with 
7-STABLE? :)


--

Best Wishes,
Stefan Lambrev
ICQ# 24134177

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: if_bridge turns off checksum offload of members?

2008-07-01 Thread Stefan Lambrev

Hi,

Sorry to reply to myself.

Stefan Lambrev wrote:

Hi,

Maybe these are stupid questions, but:

1) There are zero matches of IFCAP_TOE in the kernel sources .. there is 
no support for TOE in 7.0, but maybe this is work in progress for 
8-current?
2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - 
should TOE be replaced with RXCSUM or just removed?
Your patch plus this small change (replacing TOE with RXCSUM) seems to 
work fine for me - the kernel compiles without a problem and checksum 
offload is enabled after reboot.
3) Why is RX never checked? In my case this doesn't matter because em 
turns off both TX and RX if only one is disabled, but probably there is 
hardware that can separate them, e.g. RX disabled while TX enabled?
4) I'm not sure why a bridge should not work with two interfaces, one of 
which supports TX and the other does not? At least if I turn on 
checksum offload only on one of the interfaces the bridge is still 
working ...

Andrew Thompson wrote:

- cut -



This patch should do that, are you able to test it Stefan?


cheers,
Andrew
  
P.S. I saw very good results with netisr2 on a kernel from p4 before 
few months .. are there any patches flying around so I can test them 
with 7-STABLE? :)




--

Best Wishes,
Stefan Lambrev
ICQ# 24134177

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Ingo Flaschberger

Dear Paul,

I have been unable to even come close to livelocking the machine with the em 
driver interrupt moderation.
So that to me throws polling out the window.  I tried 8000hz with polling 
modified to allow 1 burst and it makes no difference


Higher hz values give you better latency but less overall speed.
2000hz should be enough.

Kind regards,
Ingo Flaschberger

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Ingo Flaschberger

Dear Paul,



Dual Opteron 2212, Recompiled kernel with 7-STABLE and removed a lot of junk 
in the config, added
options NO_ADAPTIVE_MUTEXES   not sure if that makes any difference 
or not, will test without.

Used ULE scheduler, used preemption, CPUTYPE=opteron in /etc/make.conf
7.0-STABLE FreeBSD 7.0-STABLE #4: Tue Jul  1 01:22:18 CDT 2008 amd64
Max input rate .. 587kpps?   Take into consideration that these packets are 
being forwarded out em1 interface which
causes a great impact on cpu usage.  If I set up a firewall rule to block the 
packets it can do over 1mpps on em0 input.


would be great if you can also test with 32bit.

what value do you have at net.inet.ip.intr_queue_maxlen?

kind regards,
Ingo Flaschberger

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


RELENG_7 ath WPA stuck when bgscan is active on interface

2008-07-01 Thread Matthias Apitz

Hello,

I'm running the above configuration, RELENG_7 kernel and WPA, on an Asus
laptop eeePC 900 (for which one must patch the HAL with
http://snapshots.madwifi.org/special/madwifi-ng-r2756+ar5007.tar.gz ).
All is fine, mostly, but when 'bgscan' is activated on the interface
ath0 it gets stuck, reproducibly, after some time without any traffic through
the interface; setting 'ifconfig ath0 -bgscan' makes the problem go
away;

could it be related to the bug I'm facing on another laptop with
bgscan/WPA/iwi0, see:

http://www.freebsd.org/cgi/query-pr.cgi?pr=122331

thx

matthias
-- 
Matthias Apitz
Manager Technical Support - OCLC GmbH
Gruenwalder Weg 28g - 82041 Oberhaching - Germany
t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211
e [EMAIL PROTECTED] - w http://www.oclc.org/ http://www.UnixArea.de/
b http://gurucubano.blogspot.com/
«...una sola vez, que es cuanto basta si se trata de verdades definitivas.»
«...only once, which is enough if it has todo with definite truth.»
José Saramago, Historia del Cerca de Lisboa
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: altq on vlan

2008-07-01 Thread Sergey Matveychuk

Max Laier wrote:


Would you mind adding some words to that effect to your patch?



I think I'll hide it from public access instead. Looks like some people 
prefer to patch the kernel instead of learning how to make a queue on the 
parent interface.


--
Dixi.
Sem.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: if_bridge turns off checksum offload of members?

2008-07-01 Thread Andrew Thompson
On Tue, Jul 01, 2008 at 12:51:42PM +0300, Stefan Lambrev wrote:
 Hi,
 
 May be a stupid questions, but:
 
 1) There are zero matches of IFCAP_TOE in kernel sources .. there is not 
 support for TOE in 7.0, but may be this is work in progress for 8-current?

Yes, it's in current only. Just remove IFCAP_TOE.
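
For concreteness, with IFCAP_TOE dropped the RELENG_7 define would read
something like the line below (a sketch of the local edit only; Stefan's
alternative, as he notes further down, is to substitute IFCAP_RXCSUM for
IFCAP_TOE instead):

#define	BRIDGE_IFCAPS_MASK	(IFCAP_TSO|IFCAP_TXCSUM)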

 2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - TOE 
 should be repleaced with RXCSUM or just removed?
 3) Why RX is never checked? In my case this doesn't matter because em turn 
 off both TX and RX if only one is disabled, but probably there is a 
 hardware,
 that can separate them e.g. RX disabled while TX enabled?

Rx does not matter; whatever isn't offloaded in hardware is just computed
locally, such as checking the cksum. It's Tx that messes up the bridge: if
an outgoing packet is generated locally on an interface that has Tx
offloading, it may actually be sent out a different bridge member that
does not have that capability. This would cause it to be sent with an
invalid checksum, for instance.

The bridge used to just disable Tx offloading but this patch you are
testing makes sure each feature is supported by all members.
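
As a rough illustration of what such a pass looks like (a sketch only, not
the actual patch under test; the list and field names follow the RELENG_7
if_bridge.c structures, but the helper itself is hypothetical):

/*
 * Walk the member list and compute the offload capabilities that every
 * member supports; only those may stay enabled bridge-wide.
 */
static int
bridge_common_capabilities(struct bridge_softc *sc)
{
	struct bridge_iflist *bif;
	int common = BRIDGE_IFCAPS_MASK;

	LIST_FOREACH(bif, &sc->sc_iflist, bif_next)
		common &= bif->bif_ifp->if_capabilities;

	return (common);	/* e.g. IFCAP_TXCSUM only if every member has it */
}

The real work is then applying the result to each member (via the usual
SIOCSIFCAP ioctl path) so that something like Tx checksum offload is
either enabled everywhere or disabled everywhere.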

 4) I'm not sure why bridge should not work with two interfaces one of which 
 support TX and the other does not? At least if I turn on checksum offload
 only on one of the interfaces the bridge is still working ...
 
 Andrew Thompson wrote:
 
 - cut -
 
 
 This patch should do that, are you able to test it Stefan?
 
 
 cheers,
 Andrew
   
 P.S. I saw very good results with netisr2 on a kernel from p4 before few 
 months .. are there any patches flying around so I can test them with 
 7-STABLE? :)
 
 -- 
 
 Best Wishes,
 Stefan Lambrev
 ICQ# 24134177
 
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: if_bridge turns off checksum offload of members?

2008-07-01 Thread Stefan Lambrev

Greetings Andrew,

The patch compiles and works as expected.
I noticed something strange btw - swi1: net was consuming 100% WCPU 
(shown in top -S)
but I'm not sure this has something to do with your patch, as I can't 
reproduce it right now ..


Andrew Thompson wrote:

On Tue, Jul 01, 2008 at 12:51:42PM +0300, Stefan Lambrev wrote:
  

Hi,

May be a stupid questions, but:

1) There are zero matches of IFCAP_TOE in kernel sources .. there is not 
support for TOE in 7.0, but may be this is work in progress for 8-current?



Yes, its in current only. Just remove IFCAP_TOE.

  
2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - TOE 
should be repleaced with RXCSUM or just removed?
3) Why RX is never checked? In my case this doesn't matter because em turn 
off both TX and RX if only one is disabled, but probably there is a 
hardware,

that can separate them e.g. RX disabled while TX enabled?



Rx does not matter, whatever isnt offloaded in hardware is just computed
locally such as checking the cksum. Its Tx that messes up the bridge, if
a outgoing packet is generated locally on an interface that has Tx
offloading, it may actaully be sent out a different bridge member that
does not have that capability. This would cause it to be sent with an
invalid checksum for instance.

The bridge used to just disable Tx offloading but this patch you are
testing makes sure each feature is supported by all members.

  
4) I'm not sure why bridge should not work with two interfaces one of which 
support TX and the other does not? At least if I turn on checksum offload

only on one of the interfaces the bridge is still working ...

Andrew Thompson wrote:

- cut -


This patch should do that, are you able to test it Stefan?


cheers,
Andrew
  
  
P.S. I saw very good results with netisr2 on a kernel from p4 before few 
months .. are there any patches flying around so I can test them with 
7-STABLE? :)


--

Best Wishes,
Stefan Lambrev
ICQ# 24134177



___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]
  


--

Best Wishes,
Stefan Lambrev
ICQ# 24134177

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: FreeBSD NAT-T patch integration

2008-07-01 Thread Sam Leffler

Larry Baird wrote:

And how do I know that it works ?
Well, when it doesn't work, I do know it, quite quickly most of the
time !


I have to chime in here.  I did most of the initial porting of the
NAT-T patches from Kame IPSec to FAST_IPSEC.  I did look at every
line of code during this process.  I found no security problems during
the port.  Like Yvan, my company uses the NAT-T patches commercially.
Like he says, if it had problems, we would hear about it.  If the patches
don't get committed, I highly suspect Yvan or myself would try to keep the
patches up to date.  So far I have done FAST_IPSEC patches for FreeBSD 4, 5 and 6.
Yvan did 7 and 8 by himself.  Keeping up gets to be a pain after a while.
I do plan to look at the FreeBSD 7 patches soon, but it sure would be nice

to see them committed.

  
This whole issue seems ridiculous.  I've been trying to get the NAT-T 
patches committed for a while, but since I'm not set up to do any IPSEC 
testing I have deferred to others.  If we need to break a logjam I'll 
pitch in.


   Sam

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: if_bridge turns off checksum offload of members?

2008-07-01 Thread Sam Leffler

Andrew Thompson wrote:

On Tue, Jul 01, 2008 at 12:51:42PM +0300, Stefan Lambrev wrote:
  

Hi,

May be a stupid questions, but:

1) There are zero matches of IFCAP_TOE in kernel sources .. there is not 
support for TOE in 7.0, but may be this is work in progress for 8-current?



Yes, its in current only. Just remove IFCAP_TOE.

  
2) In #define BRIDGE_IFCAPS_MASK (IFCAP_TOE|IFCAP_TSO|IFCAP_TXCSUM) - TOE 
should be repleaced with RXCSUM or just removed?
3) Why RX is never checked? In my case this doesn't matter because em turn 
off both TX and RX if only one is disabled, but probably there is a 
hardware,

that can separate them e.g. RX disabled while TX enabled?



Rx does not matter, whatever isnt offloaded in hardware is just computed
locally such as checking the cksum. Its Tx that messes up the bridge, if
a outgoing packet is generated locally on an interface that has Tx
offloading, it may actaully be sent out a different bridge member that
does not have that capability. This would cause it to be sent with an
invalid checksum for instance.

The bridge used to just disable Tx offloading but this patch you are
testing makes sure each feature is supported by all members.

  
4) I'm not sure why bridge should not work with two interfaces one of which 
support TX and the other does not? At least if I turn on checksum offload

only on one of the interfaces the bridge is still working ...

Andrew Thompson wrote:

- cut -


This patch should do that, are you able to test it Stefan?


cheers,
Andrew
  
  
P.S. I saw very good results with netisr2 on a kernel from p4 before few 
months .. are there any patches flying around so I can test them with 
7-STABLE? :)





This issue has come up before.  Handling checksum offload in the bridge 
for devices that are not capable is not a big deal and is important for 
performance.  TSO likewise should be done, but we're missing a generic 
TSO support routine to do that (I believe NetBSD has one and Linux has 
a GSO mechanism).
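
For the Tx checksum part the in-tree helper already exists; a hedged
sketch of the kind of fallback described above (IPv4 only, 'dst_ifp'
standing in for the outgoing member's ifnet):

/*
 * Before handing a locally generated packet to a member that cannot do
 * Tx checksum offload, finish the TCP/UDP checksum in software.
 */
if ((dst_ifp->if_capenable & IFCAP_TXCSUM) == 0 &&
    (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) != 0) {
	in_delayed_cksum(m);
	m->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA;
}

TSO is the harder case because there is no equivalent generic
segment-in-software routine in the tree, which is the missing piece
mentioned above.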


   Sam

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Paul

Thanks.. I was hoping I wasn't seeing things :
I do not like inconsistencies.. :/

Stefan Lambrev wrote:



Greetings Paul,



--OK I'm stumped now.. Rebuilt with preemption and ULE and 
preemption again and it's not doing what it did before..
I saw this in my configuration too :) Just leave your test running for a 
longer time and you will see this strange inconsistency in action.
In my configuration I almost always have better throughput right after 
reboot, which drops later (5-10 min under flood) by 50-60 kpps, and 
after another 10-15 min the number of correctly passed packets increases 
again. Looks like auto-tuning of which I'm not aware :)



How could that be? Now about 500kpps..

That kind of inconsistency almost invalidates all my testing.. why 
would it be so much different after trying a bunch of kernel options 
and rebooting a bunch of times and then going back to the original 
config doesn't get you what it did in the beginning..


I'll have to dig into this further.. never seen anything like it :)

Hopefully the ip_input fix will help free up a few cpu cycles.


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]




___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Paul
I am going to.. I have an opteron 270 dual set up on 32 bit and the 2212 
is set up on 64 bit :)

Today should bring some 32 bit results as well as etherchannel results.


Ingo Flaschberger wrote:

Dear Paul,



Dual Opteron 2212, Recompiled kernel with 7-STABLE and removed a lot 
of junk in the config, added
options NO_ADAPTIVE_MUTEXES   not sure if that makes any 
difference or not, will test without.

Used ULE scheduler, used preemption, CPUTYPE=opteron in /etc/make.conf
7.0-STABLE FreeBSD 7.0-STABLE #4: Tue Jul  1 01:22:18 CDT 2008 amd64
Max input rate .. 587kpps?   Take into consideration that these 
packets are being forwarded out em1 interface which
causes a great impact on cpu usage.  If I set up a firewall rule to 
block the packets it can do over 1mpps on em0 input.


would be great if you can also test with 32bit.

what value do you have at net.inet.ip.intr_queue_maxlen?

kind regards,
Ingo Flaschberger




___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Paul
I can't reproduce the 580kpps maximum that I saw when I first compiled, 
for some reason I don't understand; the max I get even with ULE and 
preemption
is now about 530 and it dips to 480 a lot.. The first time I tried it, it 
was at 580 and dipped to 520... what the?.. (kernel config attached at end)
* noticed that SOMETIMES the em0 taskq jumps around cpus and doesn't use 
100% of any one cpu
* noticed that the netstat packets per second rate varies explicitly 
with the CPU usage of em0 taskq

(top output with ULE/PREEMPTION compiled in):
PID USERNAME PRI NICE   SIZERES STATE  C   TIME   WCPU COMMAND
  10 root 171 ki31 0K16K RUN3  64:12 94.09% idle: cpu3
  36 root -68- 0K16K CPU1   1   5:43 89.75% em0 taskq
  13 root 171 ki31 0K16K CPU0   0  63:21 87.30% idle: cpu0
  12 root 171 ki31 0K16K RUN1  62:44 66.75% idle: cpu1
  11 root 171 ki31 0K16K CPU2   2  62:17 56.49% idle: cpu2
  39 root -68- 0K16K -  0   0:54 10.64% em3 taskq

this is about 480-500kpps rate.
now I wait a minute and

PID USERNAME PRI NICE   SIZERES STATE  C   TIME   WCPU COMMAND
  10 root 171 ki31 0K16K CPU3   3  64:56 100.00% idle: cpu3
  36 root -68- 0K16K CPU2   2   6:21 94.14% em0 taskq
  13 root 171 ki31 0K16K RUN0  63:55 80.18% idle: cpu0
  11 root 171 ki31 0K16K RUN2  62:48 67.38% idle: cpu2
  12 root 171 ki31 0K16K CPU1   1  63:04 58.40% idle: cpu1
  39 root -68- 0K16K -  1   1:00 10.21% em3 taskq


530kpps rate...


drops to 85%.. 480kpps rate
goes back up to 95% 530kpps

it keeps flopping like this...

none of the CPUs are at 100% use and the numbers don't add up; e.g. the cpu 
time of em0 taskq is 94%, so one of the cpus should be 6% idle, but it's not.
This is with ULE/PREEMPTION.. I see different behavior without 
preemption and with 4bsd..

and I also see different behavior depending on the time of day lol :)
Figure that one out

I'll post back without preemption and with 4bsd in a min,
then I'll move on to the 32 bit platform tests


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Paul

ULE without PREEMPTION is now yielding better results.
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    571595 40639   34564108          1     0        226     0
    577892 48865   34941908          1     0        178     0
    545240 84744   32966404          1     0        178     0
    587661 44691   35534512          1     0        178     0
    587839 38073   35544904          1     0        178     0
    587787 43556   35540360          1     0        178     0
    540786 39492   32712746          1     0        178     0
    572071 55797   34595650          1     0        178     0
 
*OUCH, IPFW HURTS..
loading ipfw and adding one ipfw rule, allow ip from any to any, drops 
100Kpps off :/ what's up with THAT?
unloading the ipfw module gets the 100kpps back again; that's not right with 
ONE rule.. :/


em0 taskq is still jumping cpus.. is there any way to lock it to one cpu, 
or is this just a function of ULE?


running a tar czpvf all.tgz * and seeing if pps changes..
negligible.. guess the scheduler is doing its job at least..

Hmm. even when it's getting 50-60k errors per second on the interface I 
can still SCP a file through that interface although it's not fast.. 
3-4MB/s..


You know, I wouldn't care if it added 5ms latency to the packets when it 
was doing 1mpps, as long as it didn't drop any.. Why can't it do that? 
Queue them up and do them in big chunks so none are dropped... hmm?


The 32 bit system is compiling now.. it won't do more than 400kpps with a 
GENERIC kernel, whereas 64 bit did 450k with GENERIC, although that could be

the difference between opteron 270 and opteron 2212..

Paul

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Poor network performance for clients in 100MB to Gigabit environment

2008-07-01 Thread Paul

What options do you have enabled on the linux server?
sysctl -a | grep net.ipv4.tcp
and on the bsd
sysctl -a net.inet.tcp

It sounds like a problem with BSD not handling the dropped data or ack 
packets: what happens is it pushes a burst of
data out faster than 100mbit, the switch drops the packets, and then BSD waits 
too long to recover and doesn't scale the transmission
back.  TCP is supposed to scale down the transmission speed until 
packets are not dropped, to a point, even without ECN.


Options such as 'reno' and 'sack' etc. are congestion control algorithms 
that use congestion windows.



David Kwan wrote:

I have a couple of questions regarding the TCP Stack:

 


I have a situation with clients on a 100MB network connecting to servers
on a Gigabit network where the client read speeds are very slow from the
FreeBSD server and fast from the Linux server; write speeds from the
clients to both servers are fast.  (Clients on the gigabit network work
fine with blazing read and write speeds.)  The network traces show
congestion packets for both servers when doing reads from the clients
(dup acks and retransmissions), but the Linux server seems to handle the
congestion better. ECN is not enabled on the network and I don't see any
congestion windowing or client window changing.  The 100MB/1G switch
is dropping packets.  I double checked the network configuration and
also swapped switchports for the servers to use the others to make sure
the switch configuration is the same, and the Linux always does better
than FreeBSD.  Assuming that the network configuration is a constant for
all clients and servers (speed, duplex, etc...), the only variable
is the servers themselves (Linux and FreeBSD).  I have tried a couple of
FreeBSD machines with 6.1 and 7.0 and they exhibit the same problem,
with no luck matching the speed and network utilization of Linux (2
years old).  The read speed test I'm referring to is transferring
a 100MB file (cifs, nfs, and ftp), and the Linux server does it
consistently in around 10 sec (line speed) with a constant network
utilization chart, while the FreeBSD servers are magnitudes slower with
an erratic network utilization chart.  I've attempted to tweak some network
sysctl options on the FreeBSD, and the only ones that helped were
disabling TSO and inflight, which leads me to think that the
inter-packet gap was slightly increased to partially relieve congestion
on the switch; not a long term solution.

 

My questions are: 


  1. Have you heard of this problem before with 100MB clients to Gigabit
servers?

  2. Are you aware of any Linux fix/patch in the TCP stack to better
handling congestion than FreeBSD?  I'm looking to address this issue in
the FreeBSD, but wondering if the Linux stack did something special that
can help with the FreeBSD performance.

 


David K.

 


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]

  


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Poor network performance for clients in 100MB to Gigabit environment

2008-07-01 Thread Jack Vogel
Take it from someone who has spent a couple weeks beating his
head against a wall over this... system tuning is essential.

If your driver is going to the kernel looking for a resource and
having to wait, it's gonna hurt...

Look into kern.ipc, and as Paul said net.inet.

Off the shelf config is more than likely going to be inadequate.

Good luck,

Jack


On Tue, Jul 1, 2008 at 12:50 PM, David Kwan [EMAIL PROTECTED] wrote:
 I have a couple of questions regarding the TCP Stack:



 I have a situation with clients on a 100MB network connecting to servers
 on a Gigabit network where the client read speeds are very slow from the
 FreeBSD server and fast from the Linux server; Write speeds from the
 clients to both servers are fast.  (Clients on the gigabit network work
 fine with blazing read and write speeds).  The network traces shows
 congestion packets for both servers when doing reads from the clients
 (dup acks and retransmissions), but the Linux server seem to handle the
 congestion better. ECN is not enabled on the network and I don't see any
 congestion windowing or clients window changing.  The 100MB/1G switch

 is dropping packets.   I double checked the network configuration and
 also swapped swithports for the servers to use the others to make sure
 the switch configuration are the same, and the Linux always does better
 than FreeBSD.  Assuming that the network configuration is a constant for
 all clients and servers (speed, duplex, and etc...), the only variable
 is the servers themselves (Linux and FreeBSD).  I have tried a couple of
 FreeBSD machines with 6.1 and 7.0 and they exhibit the same problem,
 with no luck matching the speed and network utilization of Linux (2
 years old).  The read speed test I'm referring is doing transferring of
 a 100MB file (cifs, nfs, and ftp), and the Linux server does it
 consistently in around 10 sec (line speed) with a constant network
 utilization chart, while the FreeBSD servers are magnitudes slower with
 erratic network utilization chart.  I've attempted to tweak some network
 sysctl  options on the FreeBSD, and the only ones that helped were
 disabling TSO and inflight; which leads me to think that the
 inter-packet gap was slightly increased to partially relieve congestion
 on the switch; not a long term solution.



 My questions are:

  1. Have you heard of this problem before with 100MB clients to Gigabit
 servers?

  2. Are you aware of any Linux fix/patch in the TCP stack to better
 handling congestion than FreeBSD?  I'm looking to address this issue in
 the FreeBSD, but wondering if the Linux stack did something special that
 can help with the FreeBSD performance.



 David K.



 ___
 freebsd-net@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-net
 To unsubscribe, send any mail to [EMAIL PROTECTED]

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Maximum ARP Entries

2008-07-01 Thread Paul
Does anyone know if there is a maximum number of ARP entries / 
adjacencies that FBSD can handle before recycling?
I want to route several thousand IPs directly to some interfaces, so it 
will have 3-4k ARP entries..  I'm curious because in Linux I have to set
the net.ipv4.neigh threshold sysctls a lot higher or it bombs with 'too 
many neighbors'...  I don't see a setting like this in the BSD sysctls.


Thanks!

Paul

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Paul

Ok, now THIS is absolutely a whole bunch of ridiculousness..
I set up etherchannel, and I'm evenly distributing packets over em0, em1 
and em2 to lagg0,
and I get WORSE performance than with a single interface..  Can anyone 
explain this one? This is horrible.
I got the em0-em2 taskq's using 80% cpu EACH and they are only doing 100kpps 
EACH


looks:

            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    105050 11066    6303000          0     0          0     0
    104952 13969    6297120          0     0          0     0
    104331 12121    6259860          0     0          0     0

            input          (em1)           output
   packets  errs      bytes    packets  errs      bytes colls
    103734 70658    6223998          0     0          0     0
    103483 75703    6209046          0     0          0     0
    103848 76195    6230886          0     0          0     0

            input          (em2)           output
   packets  errs      bytes    packets  errs      bytes colls
    103299 62957    6197940          1     0        226     0
    106388 73071    6383280          1     0        178     0
    104503 70573    6270180          4     0        712     0

last pid:  1378;  load averages:  2.31,  1.28,  0.57    up 0+00:06:27  17:42:32

68 processes:  8 running, 42 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice, 58.9% system,  0.0% interrupt, 41.1% idle
Mem: 7980K Active, 5932K Inact, 47M Wired, 16K Cache, 8512K Buf, 1920M Free
Swap: 8192M Total, 8192M Free

 PID USERNAME PRI NICE   SIZERES STATE  C   TIME   WCPU COMMAND
  11 root 171 ki31 0K16K RUN2   5:18 80.47% idle: cpu2
  38 root -68- 0K16K CPU3   3   2:30 80.18% em2 taskq
  37 root -68- 0K16K CPU1   1   2:28 76.90% em1 taskq
  36 root -68- 0K16K CPU2   2   2:28 72.56% em0 taskq
  13 root 171 ki31 0K16K RUN0   3:32 29.20% idle: cpu0
  12 root 171 ki31 0K16K RUN1   3:29 27.88% idle: cpu1
  10 root 171 ki31 0K16K RUN3   3:21 25.63% idle: cpu3
  39 root -68- 0K16K -  3   0:32 17.68% em3 taskq


See, that's total wrongness.. something is very wrong here.  Does anyone 
have any ideas? I really need to get this working.
I figured that if I evenly distributed the packets over 3 interfaces it 
would simulate having 3 rx queues, because there is a separate process for each 
interface,
and the result is WAY more CPU usage and a little over half the pps 
throughput of a single port ..


If anyone is interested in tackling some of these issues, please e-mail me.
It would be greatly appreciated.



Paul



Julian Elischer wrote:

Paul wrote:

ULE without PREEMPTION is now yeilding better results.
input  (em0)   output
  packets  errs  bytespackets  errs  bytes colls
   571595 40639   34564108  1 0226 0
   577892 48865   34941908  1 0178 0
   545240 84744   32966404  1 0178 0
   587661 44691   35534512  1 0178 0
   587839 38073   35544904  1 0178 0
   587787 43556   35540360  1 0178 0
   540786 39492   32712746  1 0178 0
   572071 55797   34595650  1 0178 0
 
*OUCH, IPFW HURTS..
loading ipfw, and adding one ipfw rule allow ip from any to any drops 
100Kpps off :/ what's up with THAT?
unloaded ipfw module and back 100kpps more again, that's not right 
with ONE rule.. :/


ipfw needs to gain a lock on the firewall before running,
and is quite complex..  I can believe it..

in FreeBSD 4.8 I was able to use ipfw and filter 1Gb between two 
interfaces (bridged) but I think it has slowed down since then due to 
the SMP locking.





em0 taskq is still jumping cpus.. is there any way to lock it to one 
cpu or is this just a function of ULE


running a tar czpvf all.tgz *  and seeing if pps changes..
negligible.. guess scheduler is doing it's job at least..

Hmm. even when it's getting 50-60k errors per second on the interface 
I can still SCP a file through that interface although it's not 
fast.. 3-4MB/s..


You know, I wouldn't care if it added 5ms latency to the packets when 
it was doing 1mpps as long as it didn't drop any.. Why can't it do 
that? Queue them up and do them in bi chunks so none are 
droppedhmm?


32 bit system is compiling now..  won't do  400kpps with GENERIC 
kernel, as with 64 bit did 450k with GENERIC, although that could be

the difference between opteron 270 and opteron 2212..

Paul

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, 

Re: kern/124753: net80211 discards power-save queue packets early

2008-07-01 Thread Sam Leffler

Sepherosa Ziehau wrote:

On Thu, Jun 19, 2008 at 6:30 PM,  [EMAIL PROTECTED] wrote:
  

Synopsis: net80211 discards power-save queue packets early

Responsible-Changed-From-To: freebsd-i386-freebsd-net
Responsible-Changed-By: remko
Responsible-Changed-When: Thu Jun 19 10:29:47 UTC 2008
Responsible-Changed-Why:
reassign to networking team.

http://www.freebsd.org/cgi/query-pr.cgi?pr=124753



In How-To-Repeat, you said:
Then associate a recent Windows Mobile 6.1 device to the FreeBSD box
running hostapd ...

In Description, you said:
The WM6.1 device recv ps-poll's for packets every 20 seconds ...

AFAIK, STA sends ps-poll to AP; AP does not send ps-poll to STA.  Why
did your windows STA receive ps-poll from freebsd AP?  Did you capture
it by using 802.11 tap?

And which freebsd driver were you using?

Your problem looks like:
- Either freebsd AP did not properly configure TIM in beacons, which
could be easily found out by using 802.11 tap.  But I highly suspect
if you were using ath(4), TIM would be misconfigured.
- Or your windows STA didn't process TIM according to 802.11 standard.

  
The PR states the listen interval sent by the station is 3 (beacons) and 
the beacon interval is 100TU.  This means the AP is required to buffer 
unicast frames for only 300TU which is ~300 ms.  But according to the 
report the Windows device is polling every 20 seconds so there's no 
guarantee any packets will be present (even with the net80211 code 
arbitrarily using 4x the listen interval specified by the sta).  I find it 
really hard to believe a device would poll every 20 secs, so something 
seems wrong in what's reported/observed.
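
(For reference, the arithmetic, illustrative only: 3 beacons x 100 TU = 
300 TU, and 1 TU = 1024 us, so roughly 307 ms of required buffering; even 
with the 4x grace factor that is only about 1.2 seconds, nowhere near a 
20 second poll interval.)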


Given that defeating the aging logic just pushed the problem elsewhere 
it sounds like there's something else wrong which (as you note) probably 
requires a packet capture to understand.  I'm pretty sure TIM is handled 
correctly in RELENG_7 but a packet capture would help us verify that.


   Sam

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Maximum ARP Entries

2008-07-01 Thread Ruslan Ermilov
On Tue, Jul 01, 2008 at 04:37:01PM -0400, Paul wrote:
 Does anyone know if there is a maximum number of ARP entries/ 
 adjacencies that FBSD can handle before recycling? 
 
In FreeBSD, ARP still uses the routing table as its storage, and
as such the limits on routing table memory apply, and the
latter currently has no limit.


Cheers,
-- 
Ruslan Ermilov
[EMAIL PROTECTED]
FreeBSD committer
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

2008-07-01 Thread Paul
Apparently lagg hasn't been freed from Giant :/   Can we do something about 
this quickly?
With adaptive Giant I get more performance on lagg but the cpu usage is 
smashed at 100%.
I get about 50k more pps per interface (so 150kpps total, which STILL is 
less than a single gigabit port)

Check it out

68 processes:  9 running, 41 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice, 89.5% system,  0.0% interrupt, 10.5% idle
Mem: 8016K Active, 6192K Inact, 47M Wired, 108K Cache, 9056K Buf, 1919M Free
Swap: 8192M Total, 8192M Free

 PID USERNAME PRI NICE   SIZERES STATE  C   TIME   WCPU COMMAND
  38 root -68- 0K16K CPU1   1   3:29 100.00% em2 taskq
  37 root -68- 0K16K CPU0   0   3:31 98.78% em1 taskq
  36 root -68- 0K16K CPU3   3   2:53 82.42% em0 taskq
  11 root 171 ki31 0K16K RUN2  22:48 79.00% idle: cpu2
  10 root 171 ki31 0K16K RUN3  20:51 22.90% idle: cpu3
  39 root -68- 0K16K RUN2   0:32 16.60% em3 taskq
  12 root 171 ki31 0K16K RUN1  20:16  2.05% idle: cpu1
  13 root 171 ki31 0K16K RUN0  20:25  1.90% idle: cpu0

            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    122588     0    7355280          0     0          0     0
    123057     0    7383420          0     0          0     0

            input          (em1)           output
   packets  errs      bytes    packets  errs      bytes colls
    174917 11899   10495032          2     0        178     0
    173967 11697   10438038          2     0        356     0
    174630 10603   10477806          2     0        268     0

            input          (em2)           output
   packets  errs      bytes    packets  errs      bytes colls
    175843  3928   10550580          0     0          0     0
    175952  5750   10557120          0     0          0     0


Still less performance than a single gig-e.. that Giant lock really sucks, 
and why on earth would lagg require it.. It seems so simple to fix :/
Anyone up for it? :) I wish I were a programmer sometimes, but network 
engineering will have to do. :D




Julian Elischer wrote:

Paul wrote:
Is PF better than ipfw?  iptables has almost no impact on routing 
performance unless I add a swath of rules to it, and then it bombs.
I need maybe 10 rules max and I don't want a 20% performance drop for 
that.. :P


well lots of people have wanted to fix it, and I've investigated
quite a lot, but it takes someone with 2 weeks of free time and
all the right clue. It's not inherent in ipfw but it needs some
TLC from someone who cares :-).



Ouch! :)  Is this going to be fixed any time soon?  We have some 
money that can be used for development costs to fix things like this, 
because
we use linux and freebsd machines as firewalls for a lot of customers, 
and with the increasing bandwidth and pps the customers are demanding 
more, and I
can't give them better performance with a brand new dual xeon or 
opteron machine vs the old p4 machines I have them running on now :/
The only difference
in the new machine vs the old machine is that the new one can take in 
more pps and drop it, but it can't route a whole lot more.
Routing/firewalling must still not be lock free, ugh.. :P


Thanks



Julian Elischer wrote:

Paul wrote:

ULE without PREEMPTION is now yeilding better results.
input  (em0)   output
  packets  errs  bytespackets  errs  bytes colls
   571595 40639   34564108  1 0226 0
   577892 48865   34941908  1 0178 0
   545240 84744   32966404  1 0178 0
   587661 44691   35534512  1 0178 0
   587839 38073   35544904  1 0178 0
   587787 43556   35540360  1 0178 0
   540786 39492   32712746  1 0178 0
   572071 55797   34595650  1 0178 0
 
*OUCH, IPFW HURTS..
loading ipfw, and adding one ipfw rule allow ip from any to any 
drops 100Kpps off :/ what's up with THAT?
unloaded ipfw module and back 100kpps more again, that's not right 
with ONE rule.. :/


ipfw need sto gain a lock on hte firewall before running,
and is quite complex..  I can believe it..

in FreeBSD 4.8 I was able to use ipfw and filter 1Gb between two 
interfaces (bridged) but I think it has slowed down since then due 
to the SMP locking.





em0 taskq is still jumping cpus.. is there any way to lock it to 
one cpu or is this just a function of ULE


running a tar czpvf all.tgz *  and seeing if pps changes..
negligible.. guess scheduler is doing it's job at least..

Hmm. even when it's getting 50-60k errors per second on the 
interface I can still SCP a file through that interface although 
it's not fast.. 3-4MB/s..


You know, I wouldn't care if it added 5ms latency to the packets 
when it was doing 1mpps as long as it didn't drop any.. Why can't 
it do that? 

RE: Poor network performance for clients in 100MB toGigabit environment

2008-07-01 Thread David Kwan
I've attempted many standard and non-standard permutations of the tcp
tuning parameters without much success via sysctl.  It feels like
FreeBSD is not handling the congestion very well and this is beyond tuning
sysctl.  It's just that clients on the 100MB networks have slow/erratic reads;
clients on the Gigabit network are fine and scream, so the original tcp
parameters are just fine for them.

For the record, these are the sysctl options for the Linux and FreeBSD. 

Linux:
net.ipv4.conf.eth0.force_igmp_version = 0
net.ipv4.conf.eth0.disable_policy = 0
net.ipv4.conf.eth0.disable_xfrm = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.eth0.arp_announce = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.tag = 0
net.ipv4.conf.eth0.log_martians = 0
net.ipv4.conf.eth0.bootp_relay = 0
net.ipv4.conf.eth0.medium_id = 0
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.eth0.send_redirects = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth0.shared_media = 1
net.ipv4.conf.eth0.secure_redirects = 1
net.ipv4.conf.eth0.accept_redirects = 1
net.ipv4.conf.eth0.mc_forwarding = 0
net.ipv4.conf.eth0.forwarding = 0
net.ipv4.conf.lo.force_igmp_version = 0
net.ipv4.conf.lo.disable_policy = 1
net.ipv4.conf.lo.disable_xfrm = 1
net.ipv4.conf.lo.arp_ignore = 0
net.ipv4.conf.lo.arp_announce = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.tag = 0
net.ipv4.conf.lo.log_martians = 0
net.ipv4.conf.lo.bootp_relay = 0
net.ipv4.conf.lo.medium_id = 0
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.lo.accept_source_route = 1
net.ipv4.conf.lo.send_redirects = 1
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.lo.shared_media = 1
net.ipv4.conf.lo.secure_redirects = 1
net.ipv4.conf.lo.accept_redirects = 1
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.conf.default.force_igmp_version = 0
net.ipv4.conf.default.disable_policy = 0
net.ipv4.conf.default.disable_xfrm = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.tag = 0
net.ipv4.conf.default.log_martians = 0
net.ipv4.conf.default.bootp_relay = 0
net.ipv4.conf.default.medium_id = 0
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.send_redirects = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.shared_media = 1
net.ipv4.conf.default.secure_redirects = 1
net.ipv4.conf.default.accept_redirects = 1
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.all.force_igmp_version = 0
net.ipv4.conf.all.disable_policy = 0
net.ipv4.conf.all.disable_xfrm = 0
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.all.arp_announce = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.tag = 0
net.ipv4.conf.all.log_martians = 0
net.ipv4.conf.all.bootp_relay = 0
net.ipv4.conf.all.medium_id = 0
net.ipv4.conf.all.proxy_arp = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.send_redirects = 1
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.all.shared_media = 1
net.ipv4.conf.all.secure_redirects = 1
net.ipv4.conf.all.accept_redirects = 1
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.all.forwarding = 0
net.ipv4.neigh.eth0.locktime = 99
net.ipv4.neigh.eth0.proxy_delay = 79
net.ipv4.neigh.eth0.anycast_delay = 99
net.ipv4.neigh.eth0.proxy_qlen = 64
net.ipv4.neigh.eth0.unres_qlen = 3
net.ipv4.neigh.eth0.gc_stale_time = 60
net.ipv4.neigh.eth0.delay_first_probe_time = 5
net.ipv4.neigh.eth0.base_reachable_time = 30
net.ipv4.neigh.eth0.retrans_time = 99
net.ipv4.neigh.eth0.app_solicit = 0
net.ipv4.neigh.eth0.ucast_solicit = 3
net.ipv4.neigh.eth0.mcast_solicit = 3
net.ipv4.neigh.lo.locktime = 99
net.ipv4.neigh.lo.proxy_delay = 79
net.ipv4.neigh.lo.anycast_delay = 99
net.ipv4.neigh.lo.proxy_qlen = 64
net.ipv4.neigh.lo.unres_qlen = 3
net.ipv4.neigh.lo.gc_stale_time = 60
net.ipv4.neigh.lo.delay_first_probe_time = 5
net.ipv4.neigh.lo.base_reachable_time = 30
net.ipv4.neigh.lo.retrans_time = 99
net.ipv4.neigh.lo.app_solicit = 0
net.ipv4.neigh.lo.ucast_solicit = 3
net.ipv4.neigh.lo.mcast_solicit = 3
net.ipv4.neigh.default.gc_thresh3 = 1024
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_interval = 30
net.ipv4.neigh.default.locktime = 99
net.ipv4.neigh.default.proxy_delay = 79
net.ipv4.neigh.default.anycast_delay = 99
net.ipv4.neigh.default.proxy_qlen = 64
net.ipv4.neigh.default.unres_qlen = 3
net.ipv4.neigh.default.gc_stale_time = 60
net.ipv4.neigh.default.delay_first_probe_time = 5
net.ipv4.neigh.default.base_reachable_time = 30
net.ipv4.neigh.default.retrans_time = 99
net.ipv4.neigh.default.app_solicit = 0
net.ipv4.neigh.default.ucast_solicit = 3
net.ipv4.neigh.default.mcast_solicit = 3
net.ipv4.tcp_slow_start_after_idle = 1
net.ipv4.tcp_workaround_signed_windows = 1
net.ipv4.tcp_bic_beta = 819
net.ipv4.tcp_tso_win_divisor = 8
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_bic_low_window = 14

Re: Route messages

2008-07-01 Thread Mike Tancsa

At 05:24 AM 7/1/2008, Bjoern A. Zeeb wrote:


So I had a very quick look at the code between doing something else.
I think the only change needed is this if I am not mistaken but my
head is far away nowhere close enough in this code.


Hi,
The patch seems to work, in that there is no longer an RTM_MISS 
message generated per packet forwarded on my test box.  Is it the 
final / correct version?


---Mike 


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Poor network performance for clients in 100MB toGigabit environment

2008-07-01 Thread Adam McDougall
Are the NFS mounts UDP or TCP on Linux and FreeBSD?  I believe FreeBSD
still defaults to UDP which can act differently especially for NFS.

On Tue, Jul 01, 2008 at 05:30:35PM -0700, David Kwan wrote:

  I've attempt many standard and non-standard permutations of the tcp
  tuning parameters without much successful via sysctl.  It feels like
  FreeBSD is not handling the congestion very well and is beyond tuning
  sysctl.  It's just clients on the 100MB networks has slow/erratic reads;
  Clients on the Gigabit network are fine and screams, so the original tcp
  parameters are just fine for them.
  
  
  David K.
  
  
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Paul
  Sent: Tuesday, July 01, 2008 1:21 PM
  To: David Kwan
  Cc: freebsd-net@freebsd.org
  Subject: Re: Poor network performance for clients in 100MB toGigabit
  environment
  
  What options do you have enabled on the linux server?
  sysctl -a | grep net.ipv4.tcp
  and on the bsd
  sysctl -a net.inet.tcp
  
  It sounds like a problem with BSD not handing the dropped data or ack 
  packets so what happens is it pushes a burst of
  data out  100mbit and the switch drops the packets and then BSD waits 
  too long to recover and doesn't scale the transmission
  back.  TCP is supposed to scale down the transmission speed until 
  packets are not dropped to a point even without ECN.
  
  Options such as 'reno' and 'sack' etc. are congestion control algorithms
  
  that use congestion windows.
  
  
  David Kwan wrote:
   I have a couple of questions regarding the TCP Stack:
  

  
   I have a situation with clients on a 100MB network connecting to
  servers
   on a Gigabit network where the client read speeds are very slow from
  the
   FreeBSD server and fast from the Linux server; Write speeds from the
   clients to both servers are fast.  (Clients on the gigabit network
  work
   fine with blazing read and write speeds).  The network traces shows
   congestion packets for both servers when doing reads from the clients
   (dup acks and retransmissions), but the Linux server seem to handle
  the
   congestion better. ECN is not enabled on the network and I don't see
  any
   congestion windowing or clients window changing.  The 100MB/1G switch
  
   is dropping packets.   I double checked the network configuration and
   also swapped swithports for the servers to use the others to make sure
   the switch configuration are the same, and the Linux always does
  better
   than FreeBSD.  Assuming that the network configuration is a constant
  for
   all clients and servers (speed, duplex, and etc...), the only variable
   is the servers themselves (Linux and FreeBSD).  I have tried a couple
  of
   FreeBSD machines with 6.1 and 7.0 and they exhibit the same problem,
   with no luck matching the speed and network utilization of Linux (2
   years old).  The read speed test I'm referring is doing transferring
  of
   a 100MB file (cifs, nfs, and ftp), and the Linux server does it
   consistently in around 10 sec (line speed) with a constant network
   utilization chart, while the FreeBSD servers are magnitudes slower
  with
   erratic network utilization chart.  I've attempted to tweak some
  network
   sysctl  options on the FreeBSD, and the only ones that helped were
   disabling TSO and inflight; which leads me to think that the
   inter-packet gap was slightly increased to partially relieve
  congestion
   on the switch; not a long term solution.
  

  
   My questions are: 
  
 1. Have you heard of this problem before with 100MB clients to
  Gigabit
   servers?
  
 2. Are you aware of any Linux fix/patch in the TCP stack to better
   handling congestion than FreeBSD?  I'm looking to address this issue
  in
   the FreeBSD, but wondering if the Linux stack did something special
  that
   can help with the FreeBSD performance.
  

  
   David K.
  

  
   ___
   freebsd-net@freebsd.org mailing list
   http://lists.freebsd.org/mailman/listinfo/freebsd-net
   To unsubscribe, send any mail to [EMAIL PROTECTED]
  
 
  
  ___
  freebsd-net@freebsd.org mailing list
  http://lists.freebsd.org/mailman/listinfo/freebsd-net
  To unsubscribe, send any mail to [EMAIL PROTECTED]
  ___
  freebsd-net@freebsd.org mailing list
  http://lists.freebsd.org/mailman/listinfo/freebsd-net
  To unsubscribe, send any mail to [EMAIL PROTECTED]
  
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]