OpenBSD 5.2 amd64: high interrupts/low throughput problem: Intel 10GbE cards, dual Xeon X5570 (Dell R710)

2013-06-12 Thread John Jasen
(We've seen this problem through a few OpenBSD releases. 5.2 is much
improved over 4.8 in this regard. Using BIOS to disable multiple cores
on each physical CPU also yielded greater throughput. Hyperthreading and
virtualization BIOS extensions are off.)

We have two Dell PowerEdge R710 systems in an active/passive CARP
failover configuration, acting as firewall systems, with another one
currently being used for testing.

Various tests show the system can handle close to 10GbE speeds in and
out, but I've not been able to push much beyond that, hitting a wall at
about 11-12Gb/s.

As I try to push the test system, interrupts climb above 12-15k/second
(seen via systat), consuming more and more of the first CPU until after
about 50% utilisation (according to top), it just hits a wall and
refuses to spit out any more bandwidth.

A coworker was able to drive it up to close to 60k interrupts/second,
but was not able to get much more through it.
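
As a rough cross-check of the figures above, the sketch below estimates how many packets each interrupt would have to cover at a given throughput. The frame sizes (1500 and 64 bytes) are assumptions, not measurements from this system, and Ethernet framing overhead is ignored.

```python
# Rough packets-per-interrupt estimate.
# Frame sizes are assumed; preamble and inter-frame gap are ignored.

def packets_per_second(bits_per_second, frame_bytes):
    """Frames per second needed to carry the given bit rate."""
    return bits_per_second / (frame_bytes * 8)


def packets_per_interrupt(bits_per_second, frame_bytes, interrupts_per_second):
    """Average number of frames handled per interrupt."""
    return packets_per_second(bits_per_second, frame_bytes) / interrupts_per_second


rate = 11e9  # about 11 Gb/s aggregate, as reported above
for frame in (1500, 64):            # assumed frame sizes in bytes
    for irqs in (15_000, 60_000):   # interrupt rates reported above
        ppi = packets_per_interrupt(rate, frame, irqs)
        print(f"{frame:>4}-byte frames, {irqs:>6} intr/s: {ppi:8.1f} packets/interrupt")
```

If the real traffic mix is close to full-size frames, an interrupt rate in this range suggests the card is already batching many packets per interrupt, so the interrupt count by itself may not be the limiting factor.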

Comparison tests, booting a Debian Squeeze live cd, and booting
FreeBSD 9.x, indicate that out of the box, they can push 15-20Gb/s --
which, while lower than what I would expect, is an improvement.

Are there tuning options that we've not seen yet? Googling and reading the
ix(4) manpage have not exposed a clear go-faster option, and I'm
concerned about the load and interrupts concentrating so heavily on one CPU.

ifconfig (from test box, minus carp and vlan interfaces) and dmesg enclosed.


OpenBSD 5.2 (GENERIC.MP) #368: Wed Aug  1 10:04:49 MDT 2012
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 12870651904 (12274MB)
avail mem = 12505665536 (11926MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xbf49c000 (84 entries)
bios0: vendor Dell Inc. version 6.3.0 date 07/24/2012
bios0: Dell Inc. PowerEdge R710
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC SPCR HPET DM__ MCFG WDAT SLIC ERST HEST BERT EINJ TCPA
acpi0: wakeup devices PCI0(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 16 (boot processor)
cpu0: Intel(R) Xeon(R) CPU X5570 @ 2.93GHz, 2926.44 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: apic clock running at 132MHz
cpu1 at mainbus0: apid 0 (application processor)
cpu1: Intel(R) Xeon(R) CPU X5570 @ 2.93GHz, 2926.00 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF
cpu1: 256KB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 0 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0: apid 1 pa 0xfec8, version 20, 24 pins
ioapic1: misconfigured as apic 0, remapped to apid 1
acpihpet0 at acpi0: 14318179 Hz
acpimcfg0 at acpi0 addr 0xe000, bus 0-255
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEX1)
acpiprt2 at acpi0: bus 2 (PEX3)
acpiprt3 at acpi0: bus 3 (PEX4)
acpiprt4 at acpi0: bus 4 (PEX5)
acpiprt5 at acpi0: bus 5 (PEX6)
acpiprt6 at acpi0: bus 6 (PEX7)
acpiprt7 at acpi0: bus 7 (PEX9)
acpiprt8 at acpi0: bus -1 (PEXA)
acpiprt9 at acpi0: bus -1 (SBEX)
acpiprt10 at acpi0: bus 8 (COMP)
acpicpu0 at acpi0
acpicpu1 at acpi0
ipmi at mainbus0 not configured
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel 5520 Host rev 0x13
ppb0 at pci0 dev 1 function 0 Intel X58 PCIE rev 0x13
pci1 at ppb0 bus 1
bnx0 at pci1 dev 0 function 0 Broadcom BCM5709 rev 0x20: apic 1 int 4
bnx1 at pci1 dev 0 function 1 Broadcom BCM5709 rev 0x20: apic 1 int 16
ppb1 at pci0 dev 3 function 0 Intel X58 PCIE rev 0x13
pci2 at ppb1 bus 2
bnx2 at pci2 dev 0 function 0 Broadcom BCM5709 rev 0x20: apic 1 int 0
bnx3 at pci2 dev 0 function 1 Broadcom BCM5709 rev 0x20: apic 1 int 10
ppb2 at pci0 dev 4 function 0 Intel X58 PCIE rev 0x13
pci3 at ppb2 bus 3
mpi0 at pci3 dev 0 function 0 Symbios Logic SAS1068E rev 0x08: msi
scsibus0 at mpi0: 112 targets
sd0 at scsibus0 targ 0 lun 0: Dell, VIRTUAL DISK, 1028 SCSI3 0/direct fixed naa.600508e0a54b752495c4c504
sd0: 237952MB, 512 bytes/sector, 487325696 sectors
ses0 at scsibus0 targ 8 lun 0: DP, BACKPLANE, 1.07 SCSI3 13/enclosure services fixed t10.DP_BACKPLANE00
ppb3 at pci0 dev 5 function 0 Intel X58 PCIE rev 0x13: msi
pci4 at ppb3 bus 4
ppb4 at pci0 dev 6 function 0 Intel X58 PCIE rev 0x13: msi
pci5 at ppb4 bus 5
ix0 at pci5 dev 0 function 0 Intel 10GbE SR Dual (82598AF) rev 0x01: msi, address 00:1b:21:3f:f0:e3
ix1 at pci5 dev 0 function 1 Intel 10GbE SR Dual (82598AF) rev 0x01: msi, address 00:1b:21:3f:f0:e2
ppb5 at pci0 dev 7 function 0 Intel X58 PCIE rev 0x13: msi
pci6 at ppb5 bus 6
ix2 at pci6 dev 0 function 0 Intel 10GbE SR Dual (82598AF) rev 0x01: msi, address 00:1b:21:41:b3:43
ix3 at pci6 dev 0 function 1 

Re: OpenBSD 5.2 amd64: high interrupts/low throughput problem: Intel 10GbE cards, dual Xeon X5570 (Dell R710)

2013-06-12 Thread Christiano F. Haesbaert
On 13 June 2013 00:15, John Jasen jja...@realityfailure.org wrote:
 (We've seen this problem through a few OpenBSD releases. 5.2 is much
 improved over 4.8 in this regard. Using BIOS to disable multiple cores
 on each physical CPU also yielded greater throughput. Hyperthreading and
 virtualization BIOS extensions are off.)

 We have two Dell poweredge 710 systems in an active/passive carp
 failover configuration, acting as firewall systems, with another one
 currently being used for testing.

 Various tests show the system can handle close to 10GbE speeds in and
 out, but I've not been able to push much beyond that, hitting a wall at
 about 11-12Gb/s.


That is more or less what you will get on OpenBSD.

 As I try to push the test system, interrupts climb above 12-15k/second
 (seen via systat), consuming more and more of the first CPU until after
 about 50% utilisation (according to top), it just hits a wall and
 refuses to spit out any more bandwidth.

 A coworker was able to drive it up to close to 60k interrupts/second,
 but was not able to get much more through it.

 Comparison tests, booting a Debian Squeeze live cd, and booting
 FreeBSD 9.x, indicate that out of the box, they can push 15-20Gb/s --
 which, while lower than what I would expect, is an improvement.

Your test here is probably flawed: you're likely stressing only one
queue on the card, which gives more or less the numbers you're seeing.
You need to send multiple TCP/UDP streams; then you can get an idea of
how much Linux/FreeBSD can do.
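
A minimal sketch of what "multiple streams" could look like on a load generator: several UDP flows that differ in source port, so a multi-queue card that hashes on the address/port tuple has a chance to spread them across queues. The function names, base source port, payload size and destination address (a documentation-range IP) are all hypothetical. A Python sender will not come close to 10GbE line rate, so this only illustrates the flow layout, not a benchmark tool.

```python
import socket
import threading


def make_flows(n_streams, dst_ip, dst_port, base_src_port=40000):
    """Return one (src_port, dst_ip, dst_port) tuple per stream.

    Each flow gets its own source port so the flows are distinct.
    base_src_port is an arbitrary assumption.
    """
    return [(base_src_port + i, dst_ip, dst_port) for i in range(n_streams)]


def send_flow(flow, payload=b"x" * 1400, count=1000):
    """Send `count` UDP datagrams for a single flow."""
    src_port, dst_ip, dst_port = flow
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", src_port))  # fix the source port for this flow
        for _ in range(count):
            sock.sendto(payload, (dst_ip, dst_port))


def run(flows, **kwargs):
    """Run all flows concurrently, one thread per flow."""
    threads = [
        threading.Thread(target=send_flow, args=(flow,), kwargs=kwargs)
        for flow in flows
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Usage would be along the lines of `run(make_flows(8, "192.0.2.10", 5001))`, with the address and port replaced by a real receiver behind the firewall under test.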

Note: we usually count forwarding rate, so when you say 15 Gbit/s, most
people would say 7.5 Gbit/s =)
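
The conversion between the two ways of counting is just a factor of two, sketched below under the assumption that every packet received by the firewall is forwarded exactly once (no traffic terminating on the box itself). The function name is hypothetical.

```python
def forwarding_rate(in_plus_out_gbps):
    """Convert an 'in + out' total to a forwarding rate.

    Assumes each forwarded packet is counted twice in the total:
    once on the receiving interface and once on the sending one.
    """
    return in_plus_out_gbps / 2


for total in (11, 12, 15, 20):  # in+out figures quoted in this thread
    print(f"{total} Gbit/s in+out is {forwarding_rate(total)} Gbit/s forwarded")
```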


 Are there tuning options that we've not seen yet? Googling, reading the
 ix(4) manpage have not exposed a clear go faster option, and I'm
 concerned about the load and interrupts concentrating on one CPU so heavily.

The OpenBSD kernel is single-threaded, so this is the expected behaviour.




Re: OpenBSD 5.2 amd64: high interrupts/low throughput problem: Intel 10GbE cards, dual Xeon X5570 (Dell R710)

2013-06-12 Thread John Jasen
On 06/12/2013 06:22 PM, Christiano F. Haesbaert wrote:
 On 13 June 2013 00:15, John Jasen jja...@realityfailure.org wrote:

 As I try to push the test system, interrupts climb above 12-15k/second
 (seen via systat), consuming more and more of the first CPU until after
 about 50% utilisation (according to top), it just hits a wall and
 refuses to spit out any more bandwidth.

 A coworker was able to drive it up to close to 60k interrupts/second,
 but was not able to get much more through it.

 Comparison tests, booting a Debian Squeeze live cd, and booting
 FreeBSD 9.x, indicate that out of the box, they can push 15-20Gb/s --
 which, while lower than what I would expect, is an improvement.
 
 Here your test is probably botched, you're probably just stressing one
 queue from the card, which gives you more or less what you're seeing,
 you need to send multiple tcp/udp streams, then you can have an idea
 how much linux/freebsd can do.
 
 obs: we usually count forwarding rate, so when you say 15gbit/s, most
 people say 7.5gbit/s =)

Thanks for the feedback. I was worried that we were hitting the upper
bounds of the OpenBSD kernel. My quick litmus tests with other OSes were
meant to rule the hardware in or out.

Our production network is divided into a few high speed internal zones,
and we have two external connections -- one high speed, the other 1GbE,
but due to be upgraded to 10GbE.

The test environment was an approximation of that, involving servers and
load generators, exercising several to all of the configured firewall
interfaces simultaneously, i.e. client1 sends to server1 via ix0 and ix2,
client2 to server2 via ix3 and ix5, etc.

-- 
-- John Jasen (jja...@realityfailure.org)
-- No one will sorrow for me when I die, because those who would
-- are dead already. -- Lan Mandragoran, The Wheel of Time, New Spring