Re: Packet forwarding performance

2009-11-02 Thread Bartosz Kuźma
On Mon, Nov 2, 2009 at 20:47, Adriaan  wrote:
> Changing send and recvspace on a router has no effect, except
> unnecessarily taking away
> memory space.
>
> When my ADSL line was upgraded to 896 up /7296 down the only thing to
> speed up ftp download speed on
> my workstation was to adjust net.inet.tcp.recvspace to 65536.
>
> On my old Pentium II router, I did not have to change anything, those
> settings are still the default:
>   net.inet.tcp.recvspace=16384
>   net.inet.tcp.sendspace=16384
>
> [snip]
>
> Adriaan
>
>

There is a lot of free memory on this machine ;-). As I said previously,
changing (recv|send)space has no effect.

--
Kind regards, Bartosz Kuzma.



Re: Packet forwarding performance

2009-11-02 Thread Bartosz Kuźma
On Mon, Nov 2, 2009 at 21:35, Stuart Henderson  wrote:
> On 2009-11-02, Bartosz Kuźma wrote:
>> I have a pair of routers configured with CARP, pfsync, a trunk interface
>> and a really simple pf ruleset. The main purpose of this system is
>> load-balancing WWW traffic between two web servers. In the production
>> environment rdr will be replaced by relayd(8). All of them run on
>> Dell R200 (see the attached dmesg below). There is one external interface
>> and one pfsync interface (both bge(4)), and the internal interface is
>> trunk(4) configured with two em(4) in round-robin mode. My pf.conf:
>
> most likely, you'll see some drops in sysctl net.inet.ip.ifq.drops.
> if that's the case, increasing net.inet.ip.ifq.maxlen should help.
>
>
I did several experiments with net.inet.ip.ifq.maxlen but there were no
packet drops (at least when I checked). I'll retest it tomorrow and
pay more attention to net.inet.ip.ifq.drops.

-- 
Kind regards, Bartosz Kuzma.



Re: Packet forwarding performance

2009-11-02 Thread Stuart Henderson
On 2009-11-02, Adriaan  wrote:
> When my ADSL line was upgraded to 896 up /7296 down the only thing to
> speed up ftp download speed on
> my workstation was to adjust net.inet.tcp.recvspace to 65536.

there is less risk of hitting router bugs and misconfigured stateful
firewalls if your buffer is one byte shorter, 65535 (which avoids
increasing wscale).
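
A minimal sketch of why that one byte matters, assuming the stack requests
the smallest window-scale shift whose scaled 16-bit window covers the
receive buffer (the usual BSD-style rule). The function name wscale_for is
made up for this illustration.

```python
TCP_MAXWIN = 65535        # largest value of the 16-bit TCP window field
TCP_MAX_WINSHIFT = 14     # largest window scale shift the TCP option allows


def wscale_for(buffer_bytes):
    """Smallest shift s such that (TCP_MAXWIN << s) covers buffer_bytes."""
    shift = 0
    while shift < TCP_MAX_WINSHIFT and (TCP_MAXWIN << shift) < buffer_bytes:
        shift += 1
    return shift


# Default buffer, Stuart's suggestion, and Adriaan's setting.
for size in (16384, 65535, 65536):
    print(size, wscale_for(size))
```

Under that assumption 65535 still fits in the unscaled window field, while
65536 is the first size that needs a non-zero shift.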



Re: Packet forwarding performance

2009-11-02 Thread Stuart Henderson
On 2009-11-02, Bartosz Kuźma wrote:
> I have a pair of routers configured with CARP, pfsync, a trunk interface
> and a really simple pf ruleset. The main purpose of this system is
> load-balancing WWW traffic between two web servers. In the production
> environment rdr will be replaced by relayd(8). All of them run on
> Dell R200 (see the attached dmesg below). There is one external interface
> and one pfsync interface (both bge(4)), and the internal interface is
> trunk(4) configured with two em(4) in round-robin mode. My pf.conf:

most likely, you'll see some drops in sysctl net.inet.ip.ifq.drops.
if that's the case, increasing net.inet.ip.ifq.maxlen should help.
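
One way to run that check is to save the sysctl output before and after a
load test and compare the drop counter. The sketch below only parses
"name=value" text; the helper names and the sample numbers are made up for
illustration and are not measurements from this system.

```python
def parse_sysctl(text):
    """Turn 'name=value' lines, as printed by sysctl, into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value.strip())
    return values


def drops_increased(before, after, key="net.inet.ip.ifq.drops"):
    """True if the drop counter grew between the two snapshots."""
    return after.get(key, 0) > before.get(key, 0)


# Illustrative snapshots only (hypothetical values).
before = parse_sysctl("net.inet.ip.ifq.maxlen=256\nnet.inet.ip.ifq.drops=0\n")
after = parse_sysctl("net.inet.ip.ifq.maxlen=256\nnet.inet.ip.ifq.drops=1234\n")
print(drops_increased(before, after))
```

If the counter grows during the test, the input queue is overflowing and a
larger net.inet.ip.ifq.maxlen is worth trying.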



Re: Packet forwarding performance

2009-11-02 Thread Adriaan
On Mon, Nov 2, 2009 at 4:45 PM, Bartosz Kuźma wrote:
[snip]
> I did system tuning according to
> https://calomel.org/network_performance.html (changed send and
> recvspace to 256144 and made several more minor improvements) but without
> effect.
>
> How can I improve packet forwarding speed? Or have I just reached the upper
> limit of packet forwarding for this machine?

Changing send and recvspace on a router has no effect, except
unnecessarily taking away
memory space.

When my ADSL line was upgraded to 896 up /7296 down the only thing to
speed up ftp download speed on
my workstation was to adjust net.inet.tcp.recvspace to 65536.
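
A rough sketch of why the receive buffer matters on the end host: a single
TCP stream can deliver at most one window per round trip. The 50 ms round
trip time below is an assumption for illustration, not a figure from my
line, and the function name is made up.

```python
def max_tcp_kbit_per_s(window_bytes, rtt_seconds):
    """Upper bound on one TCP stream: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1000


ASSUMED_RTT = 0.05  # seconds; hypothetical value, not measured

for window in (16384, 65536):
    print(window, round(max_tcp_kbit_per_s(window, ASSUMED_RTT)))
```

With that assumed round trip time the default 16384 bytes caps a download
well below 7296 kbit/s, while 65536 bytes leaves room to fill the line.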

On my old Pentium II router, I did not have to change anything, those
settings are still the default:
  net.inet.tcp.recvspace=16384
  net.inet.tcp.sendspace=16384

[snip]

Adriaan



Packet forwarding performance

2009-11-02 Thread Bartosz Kuźma
Hi all!

I have a pair of routers configured with CARP, pfsync, a trunk interface
and a really simple pf ruleset. The main purpose of this system is
load-balancing WWW traffic between two web servers. In the production
environment rdr will be replaced by relayd(8). All of them run on
Dell R200 (see the attached dmesg below). There is one external interface
and one pfsync interface (both bge(4)), and the internal interface is
trunk(4) configured with two em(4) in round-robin mode. My pf.conf:

ext_if=bge0
ext_carp_if=carp0
pfsync_if=bge1
int_if=trunk0
trunk_part_1=em0
trunk_part_2=em1

set limit states 262144

set skip on lo0
set skip on $pfsync_if
set skip on $trunk_part_1
set skip on $trunk_part_2

rdr on $ext_if inet proto tcp \
from any to ($ext_carp_if) port http ->  port http
round-robin sticky-address

block in on $ext_if

pass in quick on $ext_if inet proto tcp \
from any to  port http \
keep state

The highest packet rate this system is able to forward is
about 30k pps. That is rather low: I expected at least 80k pps, or
even twice that because of the trunk interface on the internal side. I
tested it with siege running on several machines connected to a switch
on the external side. During the test top shows very high interrupt CPU
utilization, up to 90%.
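
To put those numbers in perspective, here is a rough estimate of the CPU
cycles spent per forwarded packet. It assumes a single 2.4 GHz core handles
all interrupts (the dmesg below is from the uniprocessor GENERIC kernel) and
that the 90% interrupt load is spent entirely on forwarding; both are
simplifications, and the function name is made up.

```python
CPU_HZ = 2.4e9          # Xeon X3220 clock, from the dmesg
INTERRUPT_SHARE = 0.90  # interrupt CPU utilization reported by top


def cycles_per_packet(pps, cpu_hz=CPU_HZ, share=INTERRUPT_SHARE):
    """CPU cycles available per packet at a given packet rate."""
    return cpu_hz * share / pps


print(round(cycles_per_packet(30_000)))  # observed rate
print(round(cycles_per_packet(80_000)))  # rate I expected
```

By this estimate each packet currently costs on the order of 72000 cycles,
and reaching 80k pps would need that cost to fall to about 27000.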

I tried both the bsd and bsd.mp kernels (the second one has slightly lower
performance, but 3-5% is acceptable for me). I also replaced the
integrated bge(4) with em(4) (Intel PRO/1000 PT dual port) but without
noticeable improvement. I tried to run all interfaces on the same IRQ
but without effect. I did system tuning according to
https://calomel.org/network_performance.html (changed send and
recvspace to 256144 and made several more minor improvements) but without
effect.

How can I improve packet forwarding speed? Or have I just reached the upper
limit of packet forwarding for this machine?

If anyone needs more information about this system (sysctls,
configurations, etc.) please ask and I'll post it.

dmesg:

OpenBSD 4.6 (GENERIC) #58: Thu Jul  9 21:24:42 MDT 2009
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Xeon(R) CPU X3220 @ 2.40GHz ("GenuineIntel" 686-class) 2.41 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR
real mem  = 2145689600 (2046MB)
avail mem = 2065989632 (1970MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 05/15/09, BIOS32 rev. 0 @
0xfac90, SMBIOS rev. 2.5 @ 0x7ff9c000 (46 entries)
bios0: vendor Dell Inc. version "1.4.3" date 05/15/2009
bios0: Dell Inc. PowerEdge R200
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP APIC SPCR HPET MCFG WD__ SLIC ERST HEST BERT
EINJ SSDT SSDT SSDT SSDT SSDT
acpi0: wakeup devices PCI0(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 266MHz
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 4
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEX1)
acpiprt2 at acpi0: bus 2 (SBE0)
acpiprt3 at acpi0: bus 3 (SBE4)
acpiprt4 at acpi0: bus 4 (SBE5)
acpiprt5 at acpi0: bus 5 (COMP)
acpicpu0 at acpi0: PSS
bios0: ROM list: 0xc/0x9000 0xec000/0x4000!
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2401 MHz: speeds: 2400, 2133, 1867, 1600 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 3200/3210 Host" rev 0x01
ppb0 at pci0 dev 1 function 0 "Intel 3200/3210 PCIE" rev 0x01: apic 4
int 16 (irq 15)
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 0 "Intel 82801I PCIE" rev 0x02: apic 4
int 16 (irq 15)
pci2 at ppb1 bus 2
em0 at pci2 dev 0 function 0 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 16 (irq 15), address 00:15:17:c8:90:36
em1 at pci2 dev 0 function 1 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 17 (irq 15), address 00:15:17:c8:90:37
ppb2 at pci0 dev 28 function 4 "Intel 82801I PCIE" rev 0x02
pci3 at ppb2 bus 3
bge0 at pci3 dev 0 function 0 "Broadcom BCM5721" rev 0x21, BCM5750 C1
(0x4201): apic 4 int 16 (irq 15), address 00:25:64:3c:10:17
brgphy0 at bge0 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
ppb3 at pci0 dev 28 function 5 "Intel 82801I PCIE" rev 0x02
pci4 at ppb3 bus 4
bge1 at pci4 dev 0 function 0 "Broadcom BCM5721" rev 0x21, BCM5750 C1
(0x4201): apic 4 int 17 (irq 15), address 00:25:64:3c:10:18
brgphy1 at bge1 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
uhci0 at pci0 dev 29 function 0 "Intel 82801I USB" rev 0x02: apic 4
int 21 (irq 11)
uhci1 at pci0 dev 29 function 1 "Intel 82801I USB" rev 0x02: apic 4
int 20 (irq 11)
uhci2 at pci0 dev 29 function 2 "Intel 82801I USB" rev 0x02: apic 4
int 21 (irq 11)
ehci0 at pci0 dev 29 function 7 "Intel 82801I USB" rev 0x02: apic 4
int 21 (irq 11)
usb0 at e