Re: network bandwith with em(4)

2011-12-07 Thread Patrick Lamaiziere
On Tue, 22 Feb 2011 18:09:32 +0100,
Patrick Lamaiziere patf...@davenulle.org wrote:

 (4.8/amd64)
 I'm using two ethernet cards Intel 1000/PRO quad ports (gigabit) on a
 firewall (one fiber and one copper).
 
 The problem is that we don't get more than ~320 Mbit/s of bandwidth
 between the internal networks and the internet (gigabit).
 
 As far as I can see, under load there are a number of Ierr on the interface
 connected to the internet (between 1% and 5%).
 
 --
 dmesg (on 4.8):
 em0 at pci5 dev 0 function 0 Intel PRO/1000 QP (82571EB) rev
 0x06: apic 1 int 13 (irq 14), address 00:15:17:ed:98:9d
 
 em4 at pci9 dev 0 function 0 Intel PRO/1000 QP (82575GB) rev 0x02:
 apic 1 int 23 (irq 11), address 00:1b:21:38:e0:80

Hello,

This issue (IERR on em) looks to be fixed in 5.0. With 4.8 and 4.9
there were IERR errors with traffic > 150 Mbit/s. With 5.0 there are
only a few IERRs from time to time, even under high load (> 400 Mbit/s,
40K packets/s in, 30K packets/s out).

I guess the fixes in em(4) help. Maybe the use of MSI interrupts does
too, because I see a significant improvement in CPU interrupt load
(from around 60% under load to 50% with 5.0).
(The measurements are averaged over 5 minutes.)

That's cool!

There is still some PF congestion from time to time, but I have to
investigate. It happens even when the box is idle, but maybe there are
some bursts of traffic. The box has 6 interfaces and I don't believe it
can handle 6 Gbit/s at once.

To finish this too-long thread: since February we (a university) have
been very happy with the reliability of our two PF firewalls, which
just work.

Thanks a lot, regards.



Re: network bandwith with em(4)

2011-04-06 Thread Stuart Henderson
On 2011-02-28, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:

 OK. Anyway, NIC buffers limit the number of buffered packets. But the problem
 remains: why can't a (for example) dual Xeon E5520@2.27GHz with Intel PRO/1000
 (82576) route 150kpps without Ierr :-)
 http://www.oxymium.net/tmp/core3-dmesg

So looking at this dmesg, you have ppb and em sharing interrupts;
it wouldn't be a total surprise if the 'Performance degradation
after upgrade' thread was relevant:

http://comments.gmane.org/gmane.os.openbsd.misc/184121



Re: network bandwith with em(4)

2011-03-24 Thread Martin Pelikan
2011/3/23 Kapetanakis Giannis bil...@edu.physics.uoc.gr:
 I'm testing a 2-port 82571EB myself on a new fw.
 How are you doing the pps test?

I'm actually reporting the values found on the first systat page. I
have a suspicion these counters act weird on cloning interfaces (I saw
IPKTS being twice OPKTS on a router without much locally
originated/consumed traffic, with fifty carps and vlans on one side
and bgp on the other), but in all of these tests the values were
more or less the same - around 200k each.
The bandwidth was distributed as 113 MB/s inbound and 70 MB/s outbound
(depending on the direction, of course), and I watched it in systat ifs.

2011/3/23 Theo de Raadt dera...@cvs.openbsd.org:
 -current kernels contain an option called POOL_DEBUG which has a pretty
 high impact on network traffic. Unfortunately POOL_DEBUG is useful..

Thank you! I've only played with DEBUG once, but after failing to
explain some of the behaviour I consider myself not educated enough to
play with kernel options...
Unfortunately I probably won't be able to repeat the tests for some
time now, as the machine is already in production.

--
Martin Pelikan



Re: network bandwith with em(4)

2011-03-23 Thread Martin Pelikan
Hi,
we just bought a new firewall, so I did some tests. It has 2
integrated i82574L's and we use 2port i82571EB. I tested routing
through this box with a simple match out on em1 nat-to (em1) rule,
using 4.8-stable, tcpbench on all five end computers and here's what I
got:
- maximum throughput 183 MB/s according to systat ifs - total. almost
exactly 200kpps in each direction.
- the difference between amd64-SP and amd64-MP is insignificant (a few
percent of CPU load, SP better)
- the difference between amd64-SP and i386-SP is noticeable (the
throughput stays the same, the load decreases a bit more, i386 better)
- I couldn't boot the i386-MP -stable kernel; the system kept rebooting
after fs checks...
- the difference between 82574L and 82571EB is quite big (574L at 183
MB/s and i386-SP had cpu load about 70-80% (intr), whereas 571EB
performed the same with about 45-55% interrupt cpu load!)
- tuning ITR or the number of Tx/Rx descriptors used per card is
useless (at least here; a different kind of traffic might behave
differently) - even if you gain a few megabits, you are still
risking latency problems (probably system usability?)
- at the end of the day I tried 4.9 -current amd64 from 18th March and
it actually performed worse - around 175 MB/s max and 70% CPU with
571EBs.
- it's a brilliant motherboard, compared to our other 6 Intels

Is there anything I should have tested or mentioned that I didn't? Still,
hope this helps someone...
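
For readers who want to reproduce this kind of test, here is a minimal
sketch of the setup described above. The em1 interface and the nat-to
rule come from this post; the receiver address 10.0.0.10 and the bare
pass rule are placeholders, not the actual configuration:

# /etc/pf.conf fragment on the firewall under test
match out on em1 nat-to (em1)
pass

# reload the ruleset
pfctl -f /etc/pf.conf

# on a receiver behind the firewall
tcpbench -s

# on each sending end host (placeholder address)
tcpbench 10.0.0.10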

dmesg below:

OpenBSD 4.8-stable (GENERIC.MP) #0: Tue Mar 22 17:42:14 CET 2011
peli...@koza.steadynet.cz:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2137653248 (2038MB)
avail mem = 2066927616 (1971MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0x9f000 (68 entries)
bios0: vendor American Megatrends Inc. version 1.1 date 05/27/2010
bios0: Supermicro X8SIL
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC MCFG OEMB HPET GSCI SSDT
acpi0: wakeup devices P0P1(S4) P0P3(S4) P0P4(S4) P0P5(S4) P0P6(S4)
BR1E(S4) PS2K(S4) PS2M(S4) USB0(S4) USB1(S4) USB2(S4) USB3(S4)
USB4(S4) USB5(S4) USB6(S4) GBE_(S4) BR20(S4
) BR21(S4) BR22(S4) BR23(S4) BR24(S4) BR25(S4) BR26(S4) BR27(S4)
EUSB(S4) USBE(S4) SLPB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz, 3067.11 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PD
CM,SSE4.1,SSE4.2,POPCNT,NXE,LONG
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: apic clock running at 133MHz
cpu1 at mainbus0: apid 4 (application processor)
cpu1: Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz, 3066.67 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PD
CM,SSE4.1,SSE4.2,POPCNT,NXE,LONG
cpu1: 256KB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 5 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 1, remapped to apid 5
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (P0P1)
acpiprt2 at acpi0: bus -1 (P0P3)
acpiprt3 at acpi0: bus -1 (P0P5)
acpiprt4 at acpi0: bus -1 (P0P6)
acpiprt5 at acpi0: bus 4 (BR1E)
acpiprt6 at acpi0: bus 1 (BR20)
acpiprt7 at acpi0: bus 2 (BR24)
acpiprt8 at acpi0: bus 3 (BR25)
acpicpu0 at acpi0: C3, C2, C1, PSS
acpicpu1 at acpi0: C3, C2, C1, PSS
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 3066 MHz: speeds: 3067, 2933, 2800, 2667,
2533, 2400, 2267, 2133, 2000, 1867, 1733, 1600, 1467, 1333, 1200 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 vendor Intel, unknown product 0x0048 rev 0x18
ppb0 at pci0 dev 28 function 0 Intel 3400 PCIE rev 0x05: apic 5 int
17 (irq 10)
pci1 at ppb0 bus 1
em0 at pci1 dev 0 function 0 Intel PRO/1000 PT (82571EB) rev 0x06:
apic 5 int 16 (irq 11), address 00:1b:21:82:67:0a
em1 at pci1 dev 0 function 1 Intel PRO/1000 PT (82571EB) rev 0x06:
apic 5 int 17 (irq 10), address 00:1b:21:82:67:0b
ppb1 at pci0 dev 28 function 4 Intel 3400 PCIE rev 0x05: apic 5 int
17 (irq 10)
pci2 at ppb1 bus 2
em2 at pci2 dev 0 function 0 Intel PRO/1000 MT (82574L) rev 0x00:
apic 5 int 16 (irq 11), address 00:25:90:0e:77:7a
ppb2 at pci0 dev 28 function 5 Intel 3400 PCIE rev 0x05: apic 5 int
16 (irq 11)
pci3 at ppb2 bus 3
em3 at pci3 dev 0 function 0 Intel PRO/1000 MT (82574L) rev 0x00:
apic 5 int 17 (irq 10), address 00:25:90:0e:77:7b
ehci0 at pci0 dev 29 function 0 Intel 3400 USB rev 0x05: apic 5 int
23 (irq 15)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb3 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0xa5
pci4 at ppb3 bus 4
vga1 at pci4 dev 3 function 0 Matrox MGA G200eW rev 0x0a
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added 

Re: network bandwith with em(4)

2011-03-23 Thread Theo de Raadt
 - at the end of the day I tried 4.9 -current amd64 from 18th March and
 it actually performed worse - around 175 MB/s max and 70% CPU with
 571EBs.

-current kernels contain an option called POOL_DEBUG which has a pretty
high impact on network traffic.  Unfortunately POOL_DEBUG is useful..



Re: network bandwith with em(4)

2011-03-23 Thread Kapetanakis Giannis
On 23/03/11 16:59, Martin Pelikan wrote:
 Hi,
 we just bought a new firewall, so I did some tests. It has 2
 integrated i82574L's and we use 2port i82571EB. I tested routing
 through this box with a simple match out on em1 nat-to (em1) rule,
 using 4.8-stable, tcpbench on all five end computers and here's what I
 got:
 - maximum throughput 183 MB/s according to systat ifs - total. almost
 exactly 200kpps in each direction.


I'm testing a 2-port 82571EB myself on a new fw.
How are you doing the pps test?

Giannis




Re: network bandwith with em(4)

2011-03-13 Thread Peter Hunčár
Hello

I have a couple of old ProLiants with bxp/em interfaces running 4.8-stable.
If you provide me more info on what to test exactly and what output to send,
I'd gladly help.

BR

Peter
On 13 Mar 2011 03:56, Ryan McBride mcbr...@openbsd.org wrote:
 On Sat, Mar 12, 2011 at 06:29:42PM -0800, Chris Cappuccio wrote:
  Are you suggesting that because you have a quad-port gig nic, your box
  should be able to do 6 *million* packets per second? By that logic my
  5-port Soekris net4801 should be able to handle 740kpps. (for
reference,
  the net4801 does about 3kpps with 4.9)

 are you sure? that seems low, the 4501 used to do 4kpps with openbsd 3.3
!

 Quite sure, though I certainly welcome someone else doing independent
 testing to prove me wrong. (FWIW: I tested 3.3 last month and got a
 maximum of 2400pps before packet loss exceeded 1%)

 The numbers above are for IP forwarding (not bridging), no PF, TCP syn
 packets with random ports, ISN, and source address, but fixed
 destination address. Measurements are on either side of the device
 using SNMP on the switch, and they match very closely what I'm seeing
 from the endpoints on either side of the firewall. The results are also
 stable across the more than 30,000 individual tests I've run to date
 against a variety of hardware and versions (automated, of course!)

 Note that If you measure on the box itself (i.e. the IPKTS/OPKTS) you
 will get lies when the system is livelocking. If you push harder you can
 get more packets through the soekris but it's meaningless as most of the
 packets are being dropped and the box is completely livelocked.



Re: network bandwith with em(4)

2011-03-12 Thread RLW

On 2011-03-12 01:26, Stuart Henderson wrote:

On 2011-03-11, RLW seran...@o2.pl wrote:

Because lately some people wrote to the group about network bandwidth
problems with em(4) i have run some test myself.


Most of the recent posts about this have been about packet
forwarding perfornance; sourcing/sinking packets on the box
itself is also interesting of course, but it's a totally
separate measurement.


bandwidth test by IPERF and NETPERF:
iperf -c 10.0.0.X -t 60 -i 5
netperf -H 10.0.0.X -p 9192 -n1 -l 10


Not sure about netperf but from what I remember iperf
isn't a great performer on OpenBSD.




I will happily run some network tests. Could someone suggest the best
programs, methods, etc. for the job?

I have bnx(4) dual-port pcie x4, em(4) integrated, and pcie x1 cards.

best regards,
RLW



Re: network bandwith with em(4)

2011-03-12 Thread Chris Cappuccio
Ryan McBride [mcbr...@openbsd.org] wrote:
 
 Are you suggesting that because you have a quad-port gig nic, your box
 should be able to do 6 *million* packets per second? By that logic my
 5-port Soekris net4801 should be able to handle 740kpps. (for reference,
 the net4801 does about 3kpps with 4.9)
 

are you sure? that seems low, the 4501 used to do 4kpps with openbsd 3.3 !



Re: network bandwith with em(4)

2011-03-12 Thread Ryan McBride
On Sat, Mar 12, 2011 at 06:29:42PM -0800, Chris Cappuccio wrote:
  Are you suggesting that because you have a quad-port gig nic, your box
  should be able to do 6 *million* packets per second? By that logic my
  5-port Soekris net4801 should be able to handle 740kpps. (for reference,
  the net4801 does about 3kpps with 4.9)
 
 are you sure? that seems low, the 4501 used to do 4kpps with openbsd 3.3 !

Quite sure, though I certainly welcome someone else doing independent
testing to prove me wrong. (FWIW: I tested 3.3 last month and got a
maximum of 2400pps before packet loss exceeded 1%)

The numbers above are for IP forwarding (not bridging), no PF, TCP syn
packets with random ports, ISN, and source address, but fixed
destination address.  Measurements are on either side of the device
using SNMP on the switch, and they match very closely what I'm seeing
from the endpoints on either side of the firewall. The results are also
stable across the more than 30,000 individual tests I've run to date
against a variety of hardware and versions (automated, of course!)

Note that if you measure on the box itself (i.e. the IPKTS/OPKTS) you
will get lies when the system is livelocking. If you push harder you can
get more packets through the soekris but it's meaningless as most of the
packets are being dropped and the box is completely livelocked.



Re: network bandwith with em(4)

2011-03-11 Thread Tom Murphy
I fixed my issue. I demoted the OpenBSD 4.4 machine so the 4.8 one took
over as CARP master, downed pfsync0 on both machines, and now the 4.8
box is happily passing tons of packets. It was pfsync0 that was messing
up 4.8; even with defer: off it was struggling.

Going to test it for about a week, then upgrade the remaining 4.4 box
to 4.8. Thank goodness it wasn't a hardware issue.

Tom
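
For anyone needing to do the same, a rough sketch of the steps
described above; carp0 and pfsync0 are placeholder interface names,
and using advskew is just one way to demote a node (it assumes
net.inet.carp.preempt=1 on the peer):

# on the 4.4 box: make it lose the CARP election so the 4.8 box takes over
ifconfig carp0 advskew 254

# on both boxes: stop pfsync state-table synchronisation
ifconfig pfsync0 down

# verify it is really down
ifconfig pfsync0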



Re: network bandwith with em(4)

2011-03-11 Thread RLW

On 2011-03-05 21:24, Manuel Guesdon wrote:

On Sat, 5 Mar 2011 22:09:51 +0900
Ryan McBride mcbr...@openbsd.org wrote:


| On Fri, Feb 25, 2011 at 08:40:10PM +0100, Manuel Guesdon wrote:
|  systat -s 2 vmstat:
|
| 3.2%Int   0.1%Sys   0.0%Usr   0.0%Nic  96.8%Idle
|  |||||||||||
|
| The numbers presented here are calculated against the sum of your CPUs.
| Since you are running bsd.mp with hyperthreading turned on, your machine
| has 16 CPUs; each CPU accounts for about 6% of the total available so
| the 3.2%Int value in your systat vmstat means that you have one cpu
| (the only one that is actually working in the kernel) about 50% in
| interrupt context.
|
| The exact behaviour varies from hardware to hardware, but it's not
| surprising that you start losing packets at this level of load.


OK. Understood. Thank you. I'll try SP kernel with mulithread disabled as soon
as I can and make some tests.

Manuel

--
__
Manuel Guesdon - OXYMIUM





Hello,

Because lately some people wrote to the group about network bandwidth
problems with em(4), I have run some tests myself.


On the same hardware I have run tests on Debian and OpenBSD.
It seems there might be something in OpenBSD that limits bandwidth on
gigabit NICs.


Detailed info below.

I can run some more tests if someone pushes me in the right direction ;)

--

TEST BOX:

Mainbord: Intel D955XBK
CPU: Pentium 4 3GHz, HT disabled
LAN 1 (integrated): Gigabit (10/100/1000 Mbit/s) LAN subsystem using
the Intel 82573E/82573V/82574V Gigabit Ethernet Controller
LAN 2 (pcie x4): HP NC380T PCI-E x4 Dual Port Multifunction Gigabit 
Server NIC

HDD1: OpenBSD 4.8 i386, pf disabled
HDD2: Debian  6.0 i386

bandwidth test by IPERF and NETPERF:
iperf -c 10.0.0.X -t 60 -i 5
netperf -H 10.0.0.X -p 9192 -n1 -l 10
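
(For completeness: on the receiving side these assume the usual server
processes, roughly as follows; port 9192 matches the netperf client
command above.)

iperf -s
netserver -p 9192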

--

DMESG OpenBSD 4.8 CD INSTALL:

cpu0: Intel(R) Pentium(R) 4 CPU 3.00GHz (GenuineIntel 686-class) 3.01 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,
PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,
DS-CPL,EST,CNXT-ID,CX16,xTPR
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 200MHz
cpu at mainbus0: not configured
acpicpu0 at acpi0: FVS, 3000, 2800 MHz

bnx0 at pci3 dev 4 function 0 Broadcom BCM5706 rev 0x02: apic 2 int 16 
(irq 11)

brgphy0 at bnx0 phy 1: BCM5706 10/100/1000baseT/SX PHY, rev. 2

em0 at pci6 dev 0 function 0 Intel PRO/1000MT (82573E) rev 0x03: apic 
2 int 17 (irq 10)


--

DMESG Debian 6.0:

Linux version 2.6.32-5-686 (Debian 2.6.32-30)
CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 03

e1000e :04:00.0: irq 28 for MSI/MSI-X
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.2 (Aug 21, 2009)
eth0: Broadcom NetXtreme II BCM5706 1000Base-T (A2) PCI-X 64-bit 100MHz 
found at mem 2400, IRQ 16
eth1: Broadcom NetXtreme II BCM5706 1000Base-T (A2) PCI-X 64-bit 100MHz 
found at mem 2200, IRQ 17


lspci:
Ethernet controller: Broadcom Corporation NetXtreme II BCM5706 Gigabit 
Ethernet (rev 02)
Ethernet controller: Broadcom Corporation NetXtreme II BCM5706 Gigabit 
Ethernet (rev 02)
Ethernet controller: Intel Corporation 82573V Gigabit Ethernet 
Controller (Copper) (rev 03)


--

test 1a @ bnx0 (pcie x4): iperf from OpenBSD 4.8 -> Debian 6.0

[  3] 45.0-50.0 sec   236 MBytes   395 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3] 50.0-55.0 sec   236 MBytes   396 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3] 55.0-60.0 sec   236 MBytes   396 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-60.0 sec  2.76 GBytes   396 Mbits/sec

load averages:  0.42,  0.27,  0.18   rlw.local.kig 
12:07:09

22 processes:  1 running, 20 idle, 1 on processor
CPU states:  0.0% user,  0.0% nice, 33.1% system, 22.2% interrupt, 44.7% 
idle

Memory: Real: 9060K/46M act/tot  Free: 445M  Swap: 0K/764M used/tot

--

test 1b @ bnx0 (pcie x4): netperf from OpenBSD 4.8 -> OpenBSD 4.8

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 16384  16384  16384    10.25    453.33

load averages:  0.51,  0.19,  0.07   rlw.local.kig 
14:55:05

24 processes:  23 idle, 1 on processor
CPU states:  0.2% user,  0.0% nice, 45.1% system, 33.7% interrupt, 21.0% 
idle

Memory: Real: 8672K/37M act/tot  Free: 454M  Swap: 0K/764M used/tot

  PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIME   CPU COMMAND
18327 root   20  396K  812K sleep netio 0:04 15.48% netperf

--

test 2a @ em0 (integrated): iperf from OpenBSD 4.8 -> Debian 6.0

[ ID] Interval   Transfer 

Re: network bandwith with em(4)

2011-03-11 Thread Alexey Suslikov
RLW wrote:
 On 2011-03-05 21:24, Manuel Guesdon wrote:
  On Sat, 5 Mar 2011 22:09:51 +0900
  Ryan McBridemcbr...@openbsd.org  wrote:
 
  | On Fri, Feb 25, 2011 at 08:40:10PM +0100, Manuel Guesdon wrote:
  |  systat -s 2 vmstat:
  |
  | 3.2%Int   0.1%Sys   0.0%Usr   0.0%Nic  96.8%Idle
  |  |||||||||||
  |
  | The numbers presented here are calculated against the sum of your
CPUs.
  | Since you are running bsd.mp with hyperthreading turned on, your
machine
  | has 16 CPUs; each CPU accounts for about 6% of the total available
so
  | the 3.2%Int value in your systat vmstat means that you have one cpu
  | (the only one that is actually working in the kernel) about 50% in
  | interrupt context.
  |
  | The exact behaviour varies from hardware to hardware, but it's not
  | surprising that you start losing packets at this level of load.
 
  OK. Understood. Thank you. I'll try SP kernel with mulithread disabled as
soon
  as I can and make some tests.
 
  Manuel
 
  --
  __
  Manuel Guesdon - OXYMIUM
 
 


 Hello,

 Because lately some people wrote to the group about network bandwidth
 problems with em(4) i have run some test myself.

 On the same hardware i have run tests on Debian and OpenBSD.
 It seems, there might be something in OpenBSD that slows bandwidth on
 gbit NICs.

 Below detailed info.

snip

How about MTU? Did you have jumbo frames enabled on Debian?

Alexey



Re: network bandwith with em(4)

2011-03-11 Thread Stuart Henderson
On 2011-03-11, RLW seran...@o2.pl wrote:
 Because lately some people wrote to the group about network bandwidth 
 problems with em(4) i have run some test myself.

Most of the recent posts about this have been about packet
forwarding performance; sourcing/sinking packets on the box
itself is also interesting of course, but it's a totally
separate measurement.

 bandwidth test by IPERF and NETPERF:
 iperf -c 10.0.0.X -t 60 -i 5
 netperf -H 10.0.0.X -p 9192 -n1 -l 10

Not sure about netperf but from what I remember iperf
isn't a great performer on OpenBSD.
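
tcpbench(1), which is in the OpenBSD base system and is what other
posters in this thread used, sidesteps that concern; a minimal sketch
(10.0.0.X as in the commands quoted above):

# on the receiving host
tcpbench -s

# on the sending host
tcpbench 10.0.0.X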



Re: network bandwith with em(4)

2011-03-10 Thread Tom Murphy
Hi,

  I had a pair of Dell PowerEdge R200s that have both em(4) and bge(4)s
in them; however, it's the em(4) doing the heavy lifting. Roughly 30-40
megabit/s sustained and anywhere between 3000-4000 packets/s.

  On OpenBSD 4.4, it happily forwards packets along. I upgraded one of
the firewalls to 4.8 and switched CARP over to it (yes, I know the
redundancy is broken now anyway) and it couldn't seem to handle the
traffic. Any inbound connections would stall and I have no idea why.

  There were no net.inet.ip.ifq.drops, but I noticed 10 livelocks when
running systat mbufs (on em0). Could MCLGETI be hindering performance?
Is there anything I can try?

   Tom

OpenBSD 4.8 (GENERIC) #136: Mon Aug 16 09:06:23 MDT 2010
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz (GenuineIntel 686-class) 
2.21 GHz
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,EST
,TM2,SSSE3,CX16,xTPR,PDCM
real mem  = 1071947776 (1022MB)
avail mem = 1044451328 (996MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 10/24/07, BIOS32 rev. 0 @ 0xfadd0, SMBIOS 
rev. 2.5 @ 0x3ff9c000 (46 entries)
bios0: vendor Dell Inc. version 1.0.0 date 10/24/2007
bios0: Dell Inc. PowerEdge R200
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC SPCR HPET MCFG WD__ SLIC ERST HEST BERT EINJ SSDT 
SSDT SSDT
acpi0: wakeup devices PCI0(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 200MHz
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 2
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEX1)
acpiprt2 at acpi0: bus 2 (SBE0)
acpiprt3 at acpi0: bus 3 (SBE4)
acpiprt4 at acpi0: bus 4 (SBE5)
acpiprt5 at acpi0: bus 5 (COMP)
acpicpu0 at acpi0: PSS
bios0: ROM list: 0xc/0x9000 0xec000/0x4000!
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2201 MHz: speeds: 2200, 2000, 1800, 1600, 1400, 1200 
MHz
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 Intel 3200/3210 Host rev 0x01
ppb0 at pci0 dev 1 function 0 Intel 3200/3210 PCIE rev 0x01: apic 2 int 16 
(irq 15)
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 0 Intel 82801I PCIE rev 0x02: apic 2 int 16 (irq 
15)
pci2 at ppb1 bus 2
em0 at pci2 dev 0 function 0 Intel PRO/1000 PT (82571EB) rev 0x06: apic 2 int 
16 (irq 15), address 00:15:17:6c:c7:a2
em1 at pci2 dev 0 function 1 Intel PRO/1000 PT (82571EB) rev 0x06: apic 2 int 
17 (irq 14), address 00:15:17:6c:c7:a3
ppb2 at pci0 dev 28 function 4 Intel 82801I PCIE rev 0x02
pci3 at ppb2 bus 3
bge0 at pci3 dev 0 function 0 Broadcom BCM5721 rev 0x21, BCM5750 C1 (0x4201): 
apic 2 int 16 (irq 15), address 00:19:b9:fa:59:20
brgphy0 at bge0 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
ppb3 at pci0 dev 28 function 5 Intel 82801I PCIE rev 0x02
pci4 at ppb3 bus 4
bge1 at pci4 dev 0 function 0 Broadcom BCM5721 rev 0x21, BCM5750 C1 (0x4201): 
apic 2 int 17 (irq 14), address 00:19:b9:fa:59:21
brgphy1 at bge1 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
uhci0 at pci0 dev 29 function 0 Intel 82801I USB rev 0x02: apic 2 int 21 (irq 
11)
uhci1 at pci0 dev 29 function 1 Intel 82801I USB rev 0x02: apic 2 int 20 (irq 
10)
uhci2 at pci0 dev 29 function 2 Intel 82801I USB rev 0x02: apic 2 int 21 (irq 
11)
ehci0 at pci0 dev 29 function 7 Intel 82801I USB rev 0x02: apic 2 int 21 (irq 
11)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb4 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0x92
pci5 at ppb4 bus 5
vga1 at pci5 dev 5 function 0 ATI ES1000 rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
radeondrm0 at vga1: apic 2 int 19 (irq 5)
drm0 at radeondrm0
ichpcib0 at pci0 dev 31 function 0 Intel 82801IR LPC rev 0x02: PM disabled
pciide0 at pci0 dev 31 function 2 Intel 82801I SATA rev 0x02: DMA, channel 0 
configured to native-PCI, channel 1 configured to native-PCI
pciide0: using apic 2 int 23 (irq 6) for native-PCI interrupt
wd0 at pciide0 channel 0 drive 0: WDC WD1601ABYS-18C0A0
wd0: 16-sector PIO, LBA48, 152587MB, 31250 sectors
atapiscsi0 at pciide0 channel 0 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: HL-DT-ST, CDRW/DVD GCCT10N, A102 ATAPI 5/cdrom 
removable
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 6
cd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 Intel UHCI root hub rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 Intel UHCI root hub rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 Intel UHCI root hub rev 1.00/1.00 addr 1
isa0 at ichpcib0

Re: network bandwith with em(4)

2011-03-10 Thread Ryan McBride
On Thu, Mar 10, 2011 at 12:18:32PM +, Tom Murphy wrote:
   I had a pair of Dell PowerEdge R200s that have both em(4) and bge(4)s 
 in them, however, it's the em(4) doing the heavy lifting. Roughly 30-40 
 megabits/s sustained and doing anywhere between 3000-4000 packets/s.
 
   On OpenBSD 4.4, it happily forwards packets along. I upgraded one of 
 the firewalls to 4.8 and switched CARP over to it (yes, I know the 
 redundancy is broken anyway now.) and it couldn't seem to handle the 
 traffic. Any inbound connections would stall and I have no idea why.

I assume that you don't have the 'defer' option set on your pfsync
interface (it would be broken until you upgrade both firewalls)


   There were no net.inet.ip.ifq.drops, but I noticed 10 livelocks when
 running systat mbufs (on em0). 


I think in 4.8 systat mbufs is showing the total number of livelocks
ever, and 10 is a tiny number. On a system nearing its limit you could
expect the livelocks counter to get hit a few times a second, but if
it's getting hit 50 times per second you're way over capacity.

Note you can also look at 'sysctl kern.netlivelocks' which is a little 
less ambiguous, and shows the total number of livelocks since boot.
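
As a quick sketch, both counters can be watched like this (nothing
here is specific to any particular setup):

# per-interface view with the LIVELOCKS column and LWM/CWM/HWM watermarks
systat mbufs

# total livelocks since boot; sample it twice to get a rough rate
sysctl kern.netlivelocks
sleep 10 && sysctl kern.netlivelocks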


 Could MCLGETI be hindering performance?  

I'm doing a lot of testing in this area these days on a broad range of 
hardware, and I have yet to find a case where MCLGETI does not improve 
a system's ability to handle load. If anything MCLGETI needs to be more 
aggressive, and we're looking at ways to do that.

-Ryan



Re: network bandwith with em(4)

2011-03-10 Thread Tom Murphy
Ryan McBride wrote:
 On Thu, Mar 10, 2011 at 12:18:32PM +, Tom Murphy wrote:
I had a pair of Dell PowerEdge R200s that have both em(4) and bge(4)s 
  in them, however, it's the em(4) doing the heavy lifting. Roughly 30-40 
  megabits/s sustained and doing anywhere between 3000-4000 packets/s.
 
   On OpenBSD 4.4, it happily forwards packets along. I upgraded one of 
 the firewalls to 4.8 and switched CARP over to it (yes, I know the 
 redundancy is broken anyway now.) and it couldn't seem to handle the 
 traffic. Any inbound connections would stall and I have no idea why.

I assume that you don't have the 'defer' option set on your pfsync
interface (it would be broken until you upgrade both firewalls)

  Correct. The defer option is off by default and when I looked at
pfsync0 on the 4.8 box it said:

  pfsync: syncdev: bge1 maxupd: 128 defer: off

   There were no net.inet.ip.ifq.drops, but I noticed 10 livelocks when
 running systat mbufs (on em0). 


I think in 4.8 systat mbufs is showing the total number of livelocks
ever, and 10 is a tiny number. On a system nearing it's limit you could
expect the livelocks counter to get hit a few times a second, but if
it's getting hit 50 times per second you're way over capacity.

 Yeah I only had 10 after about 3-4 hours and the number did not increase.

Note you can also look at 'sysctl kern.netlivelocks' which is a little 
less ambiguous, and shows the total number of livelocks since boot.
 
 Thanks! I will bear that in mind.

 Could MCLGETI be hindering performance?  

I'm doing a lot of testing in this area these days on a broad range of 
hardware, and I have yet to find a case where MCLGETI does not improve 
a system's ability to handle load. If anything MCLGETI needs to be more 
aggressive, and we're looking at ways to do that.

  I notice the machines are mostly idle.. between 90-95%. They also use
very little memory (top reports 15-18M of memory used). The 4.8 box
only has 1 gig of RAM, whereas the 4.4 box has 2 gig. It doesn't seem
to make much of a difference in this case. Whichever firewall is active
can handle upwards of about 62000 states during peak times.

  Would it be worth just shutting down pfsync(4) on both machines to
test performance? I wouldn't want pfsync getting in the way since
pfsync is broken anyway. It would be one more variable to remove from
the equation.

  Tom



kernel leaks (was: Re: network bandwith with em(4))

2011-03-10 Thread Leen Besselink
On 03/10/2011 03:45 PM, Tom Murphy wrote:
 Ryan McBride wrote:
 On Thu, Mar 10, 2011 at 12:18:32PM +, Tom Murphy wrote:
I had a pair of Dell PowerEdge R200s that have both em(4) and bge(4)s 
  in them, however, it's the em(4) doing the heavy lifting. Roughly 30-40 
  megabits/s sustained and doing anywhere between 3000-4000 packets/s.

   On OpenBSD 4.4, it happily forwards packets along. I upgraded one of 
 the firewalls to 4.8 and switched CARP over to it (yes, I know the 
 redundancy is broken anyway now.) and it couldn't seem to handle the 
 traffic. Any inbound connections would stall and I have no idea why.

Hi folks,

Sorry for hijacking this thread.

I also have a Dell machine with em(4)'s.

Since I upgraded a machine from 4.3 or 4.4 to 4.7, the kernel has been
leaking memory; I've been looking at it ever since. This was just before
4.8 came out, so it didn't get 4.8.

I disabled everything I could find to figure out if I did something
wrong, ranging from OpenVPN with a bridging setup to the new setup I
made with relayd. Anything I could think of in userspace.

I also reverted to the stock kernel instead of one with errata patches
applied. I've set the interfaces to use full duplex instead of
automatic, and disabled IPv6 (which wasn't used before the upgrade).

Nothing seems to have worked so far.

It isn't a big machine and it doesn't need to handle a lot of traffic
but at the current rate it is leaking memory all day long and I have to
reboot the machine every 1 or 2 weeks or it will stop working.

Which obviously is very sad.

When it gets to about 8000+ mbufs the machine starts to exhibit really
weird behaviour but does not lock up. It can set up client TCP
connections, but TCP or Unix server sockets cannot receive any new connections.

I keep a log of the output of netstat -m.

There is part of the output of the log and the dmesg at the end of this
email.
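
(A minimal sketch of one way such logging can be done; the log path
and the 5-minute interval are arbitrary choices:)

#!/bin/sh
# append a timestamped netstat -m snapshot every 5 minutes
while :; do
	date >> /var/log/mbuf.log
	netstat -m >> /var/log/mbuf.log
	sleep 300
done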

The one part I haven't tried disabling is the dynamic routing; it does
get frequent route updates.

I have another machine which runs exactly the same binaries but the
hardware is a bit different.

I looked at a lot of changes in CVS and didn't see anything special in
the related drivers that would warrant an upgrade to 4.8. Doing an
upgrade would take quite a bit of time I don't have right now, and I
also didn't want to make the problem worse. ;-)

If you have any tips on how I could further investigate or fix the
problem, I would really appreciate it.

If you need any extra information let me know.

I keep wondering what changed between 4.3/4.4 and 4.7/4.8 with respect
to Dell hardware and em(4).

At this point I'm thinking: wasn't there a big update in how ACPI works
on OpenBSD, or something like that, which might affect how interrupts
and drivers work?

Anyway have a nice day,
Leen.

___

OpenBSD 4.7 (GENERIC) #558: Wed Mar 17 20:46:15 MDT 2010
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Pentium(R) 4 CPU 3.06GHz (GenuineIntel 686-class) 3.07 GHz
cpu0:
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
real mem  = 1073184768 (1023MB)
avail mem = 1031110656 (983MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 10/08/03, BIOS32 rev. 0 @ 0xffe90,
SMBIOS rev. 2.3 @ 0xfae10 (77 entries)
bios0: vendor Dell Computer Corporation version A04 date 10/08/2003
bios0: Dell Computer Corporation PowerEdge 650
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP APIC SPCR
acpi0: wakeup devices PCI0(S5) PCI1(S5) PCI2(S5)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 133MHz
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 11, 16 pins
ioapic0: misconfigured as apic 0, remapped to apid 2
ioapic1 at mainbus0: apid 3 pa 0xfec01000, version 11, 16 pins
ioapic1: misconfigured as apic 0, remapped to apid 3
ioapic2 at mainbus0: apid 4 pa 0xfec02000, version 11, 16 pins
ioapic2: misconfigured as apic 0, remapped to apid 4
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PCI1)
acpiprt2 at acpi0: bus 2 (PCI2)
acpicpu0 at acpi0
bios0: ROM list: 0xc/0x8000 0xc8000/0x4800 0xcc800/0x1800
0xce000/0x1800 0xec000/0x4000!
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 ServerWorks GCNB-LE Host rev 0x32
pchb1 at pci0 dev 0 function 1 ServerWorks GCNB-LE Host rev 0x00
pci1 at pchb1 bus 1
em0 at pci1 dev 3 function 0 Intel PRO/1000MT (82546EB) rev 0x01: apic
3 int 3 (irq 5), address 00:04:23:9f:24:56
em1 at pci1 dev 3 function 1 Intel PRO/1000MT (82546EB) rev 0x01: apic
3 int 4 (irq 3), address 00:04:23:9f:24:57
em2 at pci0 dev 3 function 0 Intel PRO/1000MT (82546EB) rev 0x01: apic
3 int 1 (irq 15), address 00:04:23:5f:1c:b2
em3 at pci0 dev 3 function 1 Intel PRO/1000MT (82546EB) rev 0x01: apic
3 int 2 (irq 11), 

Re: kernel leaks (was: Re: network bandwith with em(4))

2011-03-10 Thread Bret Lambert
On Fri, Mar 11, 2011 at 12:22 AM, Leen Besselink
open...@consolejunkie.net wrote:
 Hi folks,

 Sorry for hijacking this thread.

 I also have a Dell machine with em(4)'s.

 When I upgraded a machine from 4.3 or 4.4 to 4.7 the kernel is leaking
 memory I've been looking at it ever since. This was just before 4.8 came
 out so it didn't get 4.8.


There have been a number of mbuf leak fixes between 4.8 and 4.9.

Reinstall with 4.9/current and repeat your tests.



Re: network bandwith with em(4)

2011-03-05 Thread Ryan McBride
On Fri, Feb 25, 2011 at 08:40:10PM +0100, Manuel Guesdon wrote:
 systat -s 2 vmstat:

3.2%Int   0.1%Sys   0.0%Usr   0.0%Nic  96.8%Idle   
 |||||||||||   

The numbers presented here are calculated against the sum of your CPUs.
Since you are running bsd.mp with hyperthreading turned on, your machine
has 16 CPUs; each CPU accounts for about 6% of the total available so
the 3.2%Int value in your systat vmstat means that you have one cpu
(the only one that is actually working in the kernel) about 50% in
interrupt context.  

The exact behaviour varies from hardware to hardware, but it's not
surprising that you start losing packets at this level of load.



Re: network bandwith with em(4)

2011-03-05 Thread Manuel Guesdon
On Sat, 5 Mar 2011 22:09:51 +0900
Ryan McBride mcbr...@openbsd.org wrote:

| On Fri, Feb 25, 2011 at 08:40:10PM +0100, Manuel Guesdon wrote:
|  systat -s 2 vmstat:
| 
| 3.2%Int   0.1%Sys   0.0%Usr   0.0%Nic  96.8%Idle   
|  |||||||||||   
| 
| The numbers presented here are calculated against the sum of your CPUs.
| Since you are running bsd.mp with hyperthreading turned on, your machine
| has 16 CPUs; each CPU accounts for about 6% of the total available so
| the 3.2%Int value in your systat vmstat means that you have one cpu
| (the only one that is actually working in the kernel) about 50% in
| interrupt context.  
| 
| The exact behaviour varies from hardware to hardware, but it's not
| surprising that you start losing packets at this level of load.

OK. Understood. Thank you. I'll try an SP kernel with multithreading disabled
as soon as I can and make some tests.

Manuel 

--
__
Manuel Guesdon - OXYMIUM



Re: network bandwith with em(4)

2011-03-04 Thread Ryan McBride
On Thu, Mar 03, 2011 at 03:52:54PM +0100, Manuel Guesdon wrote:
 Of course and s/OpenBSD/FreeBSD/ may help too but none of these proposals
 seems very constructive.

If you think that you'd be better served by FreeBSD, please go ahead and
use that instead.

 | I think we already mentioned it that you will always see Ierr. The
 | question is if the box is able to forward more then 150kpps.
 
 Yes that's one a the questions. We can divide it into 3 questions:
 1) is the limitation comes from hardware ?
 2) is the limitation comes from OpenBSD ?
 3) is the limitation comes from the way OpenBSD exploit hardware.
 
 1) Except if someone explain by a+b why the hardware can't forward this
 rate, I'm keep thinking it can do it (otherwise I don't see reason to sell
 quad 1Gbps nic).

Are you suggesting that because you have a quad-port gig nic, your box
should be able to do 6 *million* packets per second? By that logic my
5-port Soekris net4801 should be able to handle 740kpps. (for reference,
the net4801 does about 3kpps with 4.9)
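
For reference, the arithmetic behind that figure (standard Ethernet
numbers, nothing specific to this thread): a gigabit link saturated
with minimum-size 64-byte frames carries about

    1,000,000,000 bit/s / ((64 + 20) bytes * 8 bit/byte) ~= 1.488 Mpps

per port and direction, so a quad-port card comes out at roughly
4 * 1.488 ~= 6 Mpps.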


 I'm ok to hear that I've purchased crappy motherboard card
 or nic (but I'd like to understand why they are crappy).

It has nothing to do with hardware crappiness, it has to do with your
expectations. Your box should certainly be able to fill a few of your
gig ports with 1500byte packets, but there is no way it'll handle a full
4 gigabits / second of TCP syn packets.


 I've spent days and days making tests, searches, reading kernel source
 code and so on because I think it's interesting for the community to
 find where the problem come from and how to solve it (if possible). If
 finally the answer is that OpenBSD (or may be any other OS) can't
 forward more than 150kpps without losing 1 to 20 pps with this
 hardware, I'll live with it. 

Are you actually complaining about 1 to 20 errors per second? That's
0.01% packet loss, welcome to ethernet. You will not see this change by
switching to different hardware or OS.

It /is/ possible that something is wrong with your box and you could be
getting a slightly higher throughput. But don't expect that we'll make
it handle 2 million PPS any time soon.


 But as we've already seen that increasing int/s improve performances
 (for good or bad reason), I keep thinking there's something to improve
 or fix but I may be wrong.

There are MANY more performance considerations than just pps: latency,
interactive/userland performance under load, how the system responds
once it is overloaded, etc. We're not going to sacrifice all these just
to get a higher pps number.

However, don't bother just telling us there's something to improve.
We've been working on this for years, we've already made huge improvements,
and we're always looking for more.  Perhaps the biggest limitation on
modern hardware is that we can't split the packet handling across
multiple CPUs, but your input provides exactly ZERO help with changing
that.



Re: network bandwith with em(4)

2011-03-04 Thread Manuel Guesdon
On Fri, 4 Mar 2011 22:53:30 +0900
Ryan McBride mcbr...@openbsd.org wrote:

| On Thu, Mar 03, 2011 at 03:52:54PM +0100, Manuel Guesdon wrote:
|  | I think we already mentioned it that you will always see Ierr. The
|  | question is if the box is able to forward more then 150kpps.
|  
|  Yes that's one a the questions. We can divide it into 3 questions:
|  1) is the limitation comes from hardware ?
|  2) is the limitation comes from OpenBSD ?
|  3) is the limitation comes from the way OpenBSD exploit hardware.
|  
|  1) Except if someone explain by a+b why the hardware can't forward this
|  rate, I'm keep thinking it can do it (otherwise I don't see reason to sell
|  quad 1Gbps nic).
| 
| Are you suggesting that because you have a quad-port gig nic, your box
| should be able to do 6 *million* packets per second? By that logic my
| 5-port Soekris net4801 should be able to handle 740kpps. (for reference,
| the net4801 does about 3kpps with 4.9)

No, I don't suggest that. I simply think it strange that hardware with
these kinds of specifications (bus width and speed, and Gbps NICs)
can't handle something like 160kpps when the 'only' job of the server
(i.e. no userland application) is to forward packets and that server
seems to be 90% idle.


|  I'm ok to hear that I've purchased crappy motherboard card
|  or nic (but I'd like to understand why they are crappy).
| 
| It has nothing to do with hardware crappiness, it has to do with your
| expectations. Your box should certainly be able to fill a few of your
| gig ports with 1500byte packets, but there is no way it'll handle a full
| 4 gigabits / second of TCP syn packets.

I don't expect that numbers.


|  I've spent days and days making tests, searches, reading kernel source
|  code and so on because I think it's interesting for the community to
|  find where the problem come from and how to solve it (if possible). If
|  finally the answer is that OpenBSD (or may be any other OS) can't
|  forward more than 150kpps without losing 1 to 20 pps with this
|  hardware, I'll live with it. 
| 
| Are you actually complaining about 1 to 20 errors per second? That's
| 0.01% packet loss, welcome to ethernet. You will not see this change by
| switching to different hardware or OS.

I'm not complaining, I'm just trying to see if it's 'normal' to have this
loss when the server seems not very loaded, or if it hides a problem.



| It /is/ possible that something is wrong with your box and you could be
| getting a slightly higher throughput. But don't expect that we'll make
| it handle 2 million PPS any time soon.

Once again, I don't expect forwarding 2Mpps nor 4Gbps.


| However, don't bother just telling us there's something to improve.
| We've working on this for years, we've already made huge improvements,
| and we're always looking for more.  Perhaps the biggest limitation on
| modern hardware is that we can't split the packet handling across
| multiple CPUs, but your input provides exactly ZERO help with changing
| that.

Please see my previous messages: I never said 'I see Ierrs, please fix
it.'
Claudio suggested a possible mbuf leak problem and I asked how I can
try to confirm (or rule out) that.
You also pointed out a high livelocks value, so I understood it as:
there may be something wrong somewhere.

I've provided the requested information to help us try to see whether
there's a problem or not.
I'm not a hardware expert, not a driver expert, and not even an OpenBSD
expert; I just try to understand and maybe help improve things. My
apologies if my previous messages didn't reflect that.


Manuel 

--
__
Manuel Guesdon - OXYMIUM



Re: network bandwith with em(4)

2011-03-04 Thread Theo de Raadt
 | On Thu, Mar 03, 2011 at 03:52:54PM +0100, Manuel Guesdon wrote:
 |  | I think we already mentioned it that you will always see Ierr. The
 |  | question is if the box is able to forward more then 150kpps.
 |  
 |  Yes that's one a the questions. We can divide it into 3 questions:
 |  1) is the limitation comes from hardware ?
 |  2) is the limitation comes from OpenBSD ?
 |  3) is the limitation comes from the way OpenBSD exploit hardware.
 |  
 |  1) Except if someone explain by a+b why the hardware can't forward this
 |  rate, I'm keep thinking it can do it (otherwise I don't see reason to 
 sell
 |  quad 1Gbps nic).
 | 
 | Are you suggesting that because you have a quad-port gig nic, your box
 | should be able to do 6 *million* packets per second? By that logic my
 | 5-port Soekris net4801 should be able to handle 740kpps. (for reference,
 | the net4801 does about 3kpps with 4.9)
 
 No, I don't suggest that, I simply think it strange to have these
 kind of hardware specification (bus length and speed and bgps nic)
 [...]

It is strange that the vendors of these hardware products lie with
statistics.

You are astoundingly naive.  We simply don't need the grief of
entertaining users like you.



Re: network bandwith with em(4)

2011-03-03 Thread Manuel Guesdon
On Thu, 3 Mar 2011 00:51:46 + (UTC)
Stuart Henderson s...@spacehopper.org wrote:

| On 2011-02-28, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
|  http://www.oxymium.net/tmp/core3-dmesg
| 
| ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2 spacing 1
| 
| ipmi is disabled in GENERIC. have you tried without it?

Not on this server (I can't reboot it often) but on another one with the same
hardware: it doesn't seem to make a difference (it still has Ierr).


Manuel 

--
__
Manuel Guesdon - OXYMIUM



Re: network bandwith with em(4)

2011-03-03 Thread Claudio Jeker
On Thu, Mar 03, 2011 at 09:11:13AM +0100, Manuel Guesdon wrote:
 On Thu, 3 Mar 2011 00:51:46 + (UTC)
 Stuart Henderson s...@spacehopper.org wrote:
 
 | On 2011-02-28, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
 |  http://www.oxymium.net/tmp/core3-dmesg
 | 
 | ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2 spacing 1
 | 
 | ipmi is disabled in GENERIC. have you tried without it?
 
 Not on this server (I can't reboot it often) but on another one with same
 hardware: it doesn't seems to make difference (it still have Ierr). 
 

This diff will help. </sarcasm>
I think we already mentioned that you will always see Ierr. The
question is whether the box is able to forward more than 150kpps.

-- 
:wq Claudio

Index: if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.249
diff -u -p -r1.249 if_em.c
--- if_em.c 13 Feb 2011 19:45:54 -  1.249
+++ if_em.c 3 Mar 2011 10:01:39 -
@@ -3194,14 +3194,7 @@ em_update_stats_counters(struct em_softc
 	ifp->if_collisions = sc->stats.colc;
 
 	/* Rx Errors */
-	ifp->if_ierrors =
-	    sc->dropped_pkts +
-	    sc->stats.rxerrc +
-	    sc->stats.crcerrs +
-	    sc->stats.algnerrc +
-	    sc->stats.ruc + sc->stats.roc +
-	    sc->stats.mpc + sc->stats.cexterr +
-	    sc->rx_overruns;
+	ifp->if_ierrors = 0;
 
 	/* Tx Errors */
 	ifp->if_oerrors = sc->stats.ecol + sc->stats.latecol +



Re: network bandwith with em(4)

2011-03-03 Thread Manuel Guesdon
On Thu, 3 Mar 2011 11:12:09 +0100
Claudio Jeker cje...@diehard.n-r-g.com wrote:

| On Thu, Mar 03, 2011 at 09:11:13AM +0100, Manuel Guesdon wrote:
|  On Thu, 3 Mar 2011 00:51:46 + (UTC)
|  Stuart Henderson s...@spacehopper.org wrote:
|
|  | On 2011-02-28, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
|  |  http://www.oxymium.net/tmp/core3-dmesg
|  | 
|  | ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2 spacing 1
|  | 
|  | ipmi is disabled in GENERIC. have you tried without it?  
|  
|  Not on this server (I can't reboot it often) but on another one with same
|  hardware: it doesn't seems to make difference (it still have Ierr). 
|
| 
| This diff will help./sarcasm

Of course, and s/OpenBSD/FreeBSD/ may help too, but none of these proposals
seem very constructive.


| I think we already mentioned it that you will always see Ierr. The
| question is if the box is able to forward more then 150kpps.

Yes, that's one of the questions. We can divide it into 3 questions:
1) does the limitation come from the hardware?
2) does the limitation come from OpenBSD?
3) does the limitation come from the way OpenBSD exploits the hardware?

1) Unless someone explains in detail why the hardware can't forward this
rate, I keep thinking it can do it (otherwise I don't see a reason to sell
quad 1Gbps NICs). I'm OK with hearing that I've purchased a crappy motherboard
or NIC (but I'd like to understand why they are crappy).

The last 2 questions are still open in my mind.

I've spent days and days making tests, doing searches, reading kernel source
code and so on, because I think it's interesting for the community to find
where the problem comes from and how to solve it (if possible). If the answer
is finally that OpenBSD (or maybe any other OS) can't forward more than
150kpps without losing 1 to 20 pps with this hardware, I'll live with it. But
as we've already seen that increasing int/s improves performance (for good or
bad reasons), I keep thinking there's something to improve or fix, but I may
be wrong.

Anyway, thank you for your work and help.

Manuel 

--
__
Manuel Guesdon - OXYMIUM



Re: network bandwith with em(4)

2011-03-03 Thread James A. Peltier
- Original Message -
| On Thu, Mar 03, 2011 at 09:11:13AM +0100, Manuel Guesdon wrote:
|  On Thu, 3 Mar 2011 00:51:46 + (UTC)
|  Stuart Henderson s...@spacehopper.org wrote:
| 
|  | On 2011-02-28, Manuel Guesdon ml+openbsd.m...@oxymium.net
|  | wrote:
|  |  http://www.oxymium.net/tmp/core3-dmesg
|  |
|  | ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2
|  | spacing 1
|  |
|  | ipmi is disabled in GENERIC. have you tried without it?
| 
|  Not on this server (I can't reboot it often) but on another one with
|  same
|  hardware: it doesn't seems to make difference (it still have Ierr).
| 
| 
| This diff will help. </sarcasm>
| I think we already mentioned it that you will always see Ierr. The
| question is if the box is able to forward more then 150kpps.
| 
| --
| :wq Claudio
| 
| Index: if_em.c
| ===
| RCS file: /cvs/src/sys/dev/pci/if_em.c,v
| retrieving revision 1.249
| diff -u -p -r1.249 if_em.c
| --- if_em.c 13 Feb 2011 19:45:54 - 1.249
| +++ if_em.c 3 Mar 2011 10:01:39 -
| @@ -3194,14 +3194,7 @@ em_update_stats_counters(struct em_softc
| ifp->if_collisions = sc->stats.colc;
| 
| /* Rx Errors */
| - ifp->if_ierrors =
| -     sc->dropped_pkts +
| -     sc->stats.rxerrc +
| -     sc->stats.crcerrs +
| -     sc->stats.algnerrc +
| -     sc->stats.ruc + sc->stats.roc +
| -     sc->stats.mpc + sc->stats.cexterr +
| -     sc->rx_overruns;
| + ifp->if_ierrors = 0;
| 
| /* Tx Errors */
| ifp->if_oerrors = sc->stats.ecol + sc->stats.latecol +


Hey Claudio,

Thanks!  This diff helped and now my errors have gone to zero!  LOL!  That was 
funny.

-- 
James A. Peltier
IT Services - Research Computing Group
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax : 778-782-3045
E-Mail  : jpelt...@sfu.ca
Website : http://www.sfu.ca/itservices
  http://blogs.sfu.ca/people/jpeltier



Re: network bandwith with em(4)

2011-03-03 Thread RLW

On 2011-03-02 13:52, Ryan McBride wrote:

On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:

OK. Anyway NIC buffers restrict buffered packets number. But the problem
remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
(82576) can't route 150kpps without Ierr :-)
http://www.oxymium.net/tmp/core3-dmesg


I've done some more comprehensive testing and talked to some other
developers, and it seems that 150kpps is in the range of what is
expected for such hardware with an unoptimized install.

One thing that seems to have a big performance impact is
net.inet.ip.ifq.maxlen. If and only if your network cards are all
supported by MCLGETI (ie, they show LWM/CWM/HWM values in 'systat
mbufs', you can try increasing ifq.maxlen until you don't see
net.inet.ip.ifq.drops incrementing anymore under constant load.

On my test box here - Intel(R) Xeon(R) CPU 5140 @ 2.33GHz with em(4), pf
disabled - increasing net.inet.ip.ifq.maxlen to 8192 gets more than
double the performance compared with the default of 256.

We're looking at making the ifq.maxlen tune itself so you don't have to
twiddle this knob anymore, not sure if and when that will happen though.




I also have problems with bandwidth on em(4).
On a default clean 4.8 install I get 430 Mbit/s (with pf and altq enabled
it's only 275 Mbit/s).


systat shows:

  31.7%Int  62.1%Sys   0.0%Usr   0.0%Nic   6.2%Idle
|||||||||||
===

Interrupts
8025 total
 100 clock
7921 em0
   4 ichiic0

http://erydium.pl/upload/vmstat.gif
http://erydium.pl/upload/systat.gif
http://erydium.pl/upload/kern_profiling.txt


my hardware:
box: Lenovo ThinkCentre A51P
nic: Intel PRO/1000 PT Desktop Adapter (PCIe, model:
EXPI9300PTBLK)

DMESG:
OpenBSD 4.8 (KERN_PROF.PROF) #0: Thu Dec 30 13:25:40 CET 2010

r...@router-test.local.kig:/usr/src/sys/arch/i386/compile/KERN_PROF.PROF
cpu0: Intel(R) Celeron(R) CPU 2.80GHz (GenuineIntel 686-class) 2.80 GHz
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,TM2,CNXT-ID,xTPR

real mem  = 526938112 (502MB)
avail mem = 508166144 (484MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 05/10/07, BIOS32 rev. 0 @ 0xfd6dc, 
SMBIOS rev. 2.34 @ 0xefc60 (52 entries)

bios0: vendor IBM version 2BKT52AUS date 05/10/2007
bios0: IBM 8422W4P
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP TCPA APIC BOOT MCFG
acpi0: wakeup devices EXP0(S5) EXP1(S5) EXP2(S5) EXP3(S5) USB1(S3) 
USB2(S3) USB3(S3) USB4(S3) USBE(S3) SLOT(S5) KBC_(S3) PSM_(S3) COMA(S5) 
COMB(S5)

acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 133MHz
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 2 (EXP0)
acpiprt3 at acpi0: bus -1 (EXP1)
acpiprt4 at acpi0: bus -1 (EXP2)
acpiprt5 at acpi0: bus -1 (EXP3)
acpiprt6 at acpi0: bus 10 (SLOT)
acpicpu0 at acpi0
acpitz0 at acpi0: critical temperature 105 degC
acpibtn0 at acpi0: PWRB
bios0: ROM list: 0xc/0xae00! 0xcb000/0x1000 0xcc000/0x2000 
0xce000/0x800 0xce800/0x800 0xe/0x1!

pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82915G Host rev 0x04
vga1 at pci0 dev 2 function 0 Intel 82915G Video rev 0x04
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xc000, size 0x1000
inteldrm0 at vga1: apic 1 int 16 (irq 5)
drm0 at inteldrm0
ppb0 at pci0 dev 28 function 0 Intel 82801FB PCIE rev 0x03: apic 1 int 
17 (irq 5)

pci1 at ppb0 bus 2
em0 at pci1 dev 0 function 0 Intel PRO/1000 PT (82572EI) rev 0x06: 
apic 1 int 16 (irq 5), address 00:1b:21:05:1f:39
uhci0 at pci0 dev 29 function 0 Intel 82801FB USB rev 0x03: apic 1 int 
23 (irq 11)
uhci1 at pci0 dev 29 function 1 Intel 82801FB USB rev 0x03: apic 1 int 
19 (irq 9)
uhci2 at pci0 dev 29 function 2 Intel 82801FB USB rev 0x03: apic 1 int 
18 (irq 10)
uhci3 at pci0 dev 29 function 3 Intel 82801FB USB rev 0x03: apic 1 int 
16 (irq 5)
ehci0 at pci0 dev 29 function 7 Intel 82801FB USB rev 0x03: apic 1 int 
23 (irq 11)

usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb1 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0xd3
pci2 at ppb1 bus 10
xl0 at pci2 dev 10 function 0 3Com 3c905C 100Base-TX rev 0x74: apic 1 
int 22 (irq 3), address 00:04:76:0b:90:9f

bmtphy0 at xl0 phy 24: 3C905C internal PHY, rev. 6
bge0 at pci2 dev 11 function 0 Broadcom BCM5705K rev 0x03, BCM5705 A3 
(0x3003): apic 1 int 16 (irq 5), address 00:11:25:4f:9a:f4

brgphy0 at bge0 phy 1: BCM5705 10/100/1000baseT PHY, rev. 2
xl1 at pci2 dev 12 function 0 3Com 3c905C 100Base-TX rev 0x74: apic 1 
int 

Re: network bandwith with em(4)

2011-03-02 Thread Ryan McBride
On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
 OK. Anyway NIC buffers restrict buffered packets number. But the problem
 remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
 (82576) can't route 150kpps without Ierr :-)
 http://www.oxymium.net/tmp/core3-dmesg

I've done some more comprehensive testing and talked to some other
developers, and it seems that 150kpps is in the range of what is
expected for such hardware with an unoptimized install.

One thing that seems to have a big performance impact is
net.inet.ip.ifq.maxlen. If and only if your network cards are all
supported by MCLGETI (i.e., they show LWM/CWM/HWM values in 'systat
mbufs'), you can try increasing ifq.maxlen until you don't see
net.inet.ip.ifq.drops incrementing anymore under constant load.

On my test box here - Intel(R) Xeon(R) CPU 5140 @ 2.33GHz with em(4), pf
disabled - increasing net.inet.ip.ifq.maxlen to 8192 gets more than
double the performance compared with the default of 256.
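
A minimal sketch of that tuning loop, assuming root and the stock sysctl
names mentioned above (8192 is just the value used on this test box, not a
general recommendation):

sysctl net.inet.ip.ifq.drops          # note the current drop counter
sysctl net.inet.ip.ifq.maxlen=8192    # raise the queue, then re-run the load test
sysctl net.inet.ip.ifq.drops          # stop raising maxlen once this stays flat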

We're looking at making the ifq.maxlen tune itself so you don't have to
twiddle this knob anymore, not sure if and when that will happen though.



Re: network bandwith with em(4)

2011-03-02 Thread Manuel Guesdon
On Wed, 2 Mar 2011 21:52:03 +0900
Ryan McBride mcbr...@openbsd.org wrote:

| On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
|  OK. Anyway NIC buffers restrict buffered packets number. But the problem
|  remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
|  (82576) can't route 150kpps without Ierr :-)
|  http://www.oxymium.net/tmp/core3-dmesg
| 
| I've done some more comprehensive testing and talked to some other
| developers, and it seems that 150kpps is in the range of what is
| expected for such hardware with an unoptimized install.

Thank you for the help !


| One thing that seems to have a big performance impact is
| net.inet.ip.ifq.maxlen. If and only if your network cards are all
| supported by MCLGETI (ie, they show LWM/CWM/HWM values in 'systat
| mbufs', you can try increasing ifq.maxlen until you don't see
| net.inet.ip.ifq.drops incrementing anymore under constant load.

Yes, all my NIC interfaces show LWM/CWM/HWM values:
IFACE       LIVELOCKS  SIZE  ALIVE  LWM  HWM  CWM
System256 837715502
   2k   1601252
em0                37    2k      4    4  256    4
em1               258    2k      4    4  256    4
em2            372751    2k      7    4  256    7
em3              8258    2k      4    4  256    4
em4             25072    2k     63    4  256   63
em5              3658    2k      8    4  256    8
em6            501288    2k     24    4  256   24
em7                22    2k      4    4  256    4
em8             36551    2k     23    4  256   23
em9             52053    2k      5    4  256    4


I already increased it to 2048 some time ago with good effect on ifq.drops,
but even when ifq.drops doesn't increase, I still see Ierrs on the
interfaces (I've just verified this right now) :-)
I made some changes to em some time ago to dump the card stats with a
-debug option, and it gives me output like this:
---
em4: Dropped PKTS = 0
em4: Excessive collisions = 0
em4: Symbol errors = 0
em4: Sequence errors = 0
em4: Defer count = 3938
em4: Missed Packets = 17728103
em4: Receive No Buffers = 21687370
em4: Receive Length Errors = 0
em4: Receive errors = 0
em4: Crc errors = 0
em4: Alignment errors = 0
em4: Carrier extension errors = 0
em4: RX overruns = 1456725
em4: watchdog timeouts = 0
em4: XON Rcvd = 31813
em4: XON Xmtd = 2304158
em4: XOFF Rcvd = 935928
em4: XOFF Xmtd = 20031226
em4: Good Packets Rcvd = 33772245185
em4: Good Packets Xmtd = 20662758161
---
em4: Dropped PKTS = 0
em4: Excessive collisions = 0
em4: Symbol errors = 0
em4: Sequence errors = 0
em4: Defer count = 3938
em4: Missed Packets = 17728457
em4: Receive No Buffers = 21687421
em4: Receive Length Errors = 0
em4: Receive errors = 0
em4: Crc errors = 0
em4: Alignment errors = 0
em4: Carrier extension errors = 0
em4: RX overruns = 1456730
em4: watchdog timeouts = 0
em4: XON Rcvd = 31813
em4: XON Xmtd = 2304166
em4: XOFF Rcvd = 935928
em4: XOFF Xmtd = 20031588
em4: Good Packets Rcvd = 33772265127
em4: Good Packets Xmtd = 20662759039

So if I understand this correctly, the card indicates that there are Missed Packets
because the NIC sometimes does not have enough buffer space to store them, which
seems strange with 8000 int/s and a 40K buffer (40K for Rx, 24K for Tx, as
seen in if_em.c).



One question I have is how to tell whether the system is heavily loaded.
systat -s 2 vmstat gives me this:

Proc:r  d  s  wCsw   Trp   Sys   Int   Sof  Flt
  14   149 2   509 2011898   31
   
   3.5%Int   0.5%Sys   0.0%Usr   0.0%Nic  96.0%Idle
|||||||||||

which makes me think that the system is really not very loaded, but I may be
missing a point.
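
For what it's worth, vmstat -i is a quick cross-check here (a sketch,
assuming the stock vmstat(8); it prints per-device interrupt totals plus a
rate column, so a NIC eating its whole interrupt budget shows up even when
the CPU looks mostly idle):

vmstat -i       # look at the rate column for the em(4) devices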


Manuel 

--
__
Manuel Guesdon - OXYMIUM



Re: network bandwith with em(4)

2011-03-02 Thread Claudio Jeker
On Wed, Mar 02, 2011 at 08:34:02PM +0100, Manuel Guesdon wrote:
 On Wed, 2 Mar 2011 21:52:03 +0900
 Ryan McBride mcbr...@openbsd.org wrote:
 
 | On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
 |  OK. Anyway NIC buffers restrict buffered packets number. But the problem
 |  remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
 |  (82576) can't route 150kpps without Ierr :-)
 |  http://www.oxymium.net/tmp/core3-dmesg
 | 
 | I've done some more comprehensive testing and talked to some other
 | developers, and it seems that 150kpps is in the range of what is
 | expected for such hardware with an unoptimized install.
 
 Thank you for the help !

Hmpf. My last tests were done with ix(4) and it performed way better. Not
sure if something got back into em(4) that makes the driver slow or if it
is something different.

 
 
 | One thing that seems to have a big performance impact is
 | net.inet.ip.ifq.maxlen. If and only if your network cards are all
 | supported by MCLGETI (ie, they show LWM/CWM/HWM values in 'systat
 | mbufs', you can try increasing ifq.maxlen until you don't see
 | net.inet.ip.ifq.drops incrementing anymore under constant load.
 
 Yes all my nic interfaces have LWM/CWM/HWM values:
 IFACE LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
 System256 837715502
2k   1601252
 em0  372k 4 4   256 4
 em1 2582k 4 4   256 4
 em2  3727512k 7 4   256 7
 em382582k 4 4   256 4
 em4   250722k63 4   25663
 em536582k 8 4   256 8
 em6  5012882k24 4   25624
 em7  222k 4 4   256 4
 em8   365512k23 4   25623
 em9   520532k 5 4   256 4
 

Woohoo. That is a lot of livelocks you hit. In other words, you are losing
ticks to something spinning too long in the kernel. Interfaces with a very
low CWM but a high pps rate are the ones you need to investigate.

Additionally I would like to see your netstat -m and vmstat -m output.
If I see it right you have 83771 mbufs allocated in your system. This
sounds like a serious mbuf leak and could actually be the reason for your
bad performance. It is very well possible that most of your buffer
allocations fail causing the tiny rings and suboptimal performance.
 
 I've already increased to 2048 some time ago with good effect on ifq.drops 
 but even when ifq.drops doesn't increase, I still have
 Ierrs on interfaces (I've just verified this right now) :-)

Having some Ierrs is not a big issue; always put them in perspective with
the number of packets received.
e.g.
em6 1500  Link  00:30:48:9c:3a:80 72007980648 143035 62166589667 0 0

This interface had 143035 Ierrs, but it also passed 72 billion packets, so
this is far less than 1% (roughly 0.0002%) and not a problem.

 I've made some change to em some time ago to dump card stats with -debug
 option and it give me this stuff like this:
 ---
 em4: Dropped PKTS = 0
 em4: Excessive collisions = 0
 em4: Symbol errors = 0
 em4: Sequence errors = 0
 em4: Defer count = 3938
 em4: Missed Packets = 17728103
 em4: Receive No Buffers = 21687370
 em4: Receive Length Errors = 0
 em4: Receive errors = 0
 em4: Crc errors = 0
 em4: Alignment errors = 0
 em4: Carrier extension errors = 0
 em4: RX overruns = 1456725
 em4: watchdog timeouts = 0
 em4: XON Rcvd = 31813
 em4: XON Xmtd = 2304158
 em4: XOFF Rcvd = 935928
 em4: XOFF Xmtd = 20031226
 em4: Good Packets Rcvd = 33772245185
 em4: Good Packets Xmtd = 20662758161
 ---
 em4: Dropped PKTS = 0
 em4: Excessive collisions = 0
 em4: Symbol errors = 0
 em4: Sequence errors = 0
 em4: Defer count = 3938
 em4: Missed Packets = 17728457
 em4: Receive No Buffers = 21687421
 em4: Receive Length Errors = 0
 em4: Receive errors = 0
 em4: Crc errors = 0
 em4: Alignment errors = 0
 em4: Carrier extension errors = 0
 em4: RX overruns = 1456730
 em4: watchdog timeouts = 0
 em4: XON Rcvd = 31813
 em4: XON Xmtd = 2304166
 em4: XOFF Rcvd = 935928
 em4: XOFF Xmtd = 20031588
 em4: Good Packets Rcvd = 33772265127
 em4: Good Packets Xmtd = 20662759039
 
 So If I well understand this, the card indicate that there are Missed Packets
 because the nic have sometime not enough buffer space to store them which
 seems stange with 8000 int/s and an 40K buffer (40K for Rx, 24K for Tx as
 seen in if_em.c)
 

The FIFOs on the card don't matter that much. The problem is the DMA ring
and the number of slots on the ring that are actually usable. This is the
CWM in the systat mbufs output. MCLGETI() reduces the buffers on the ring
to limit the work getting into the system through a specific network card.

 
 One of my interrogation is how to know that the system is 

Re: network bandwith with em(4)

2011-03-02 Thread Manuel Guesdon
On Wed, 2 Mar 2011 21:12:24 +0100
Claudio Jeker cje...@diehard.n-r-g.com wrote:
|  | One thing that seems to have a big performance impact is
|  | net.inet.ip.ifq.maxlen. If and only if your network cards are all
|  | supported by MCLGETI (ie, they show LWM/CWM/HWM values in 'systat
|  | mbufs', you can try increasing ifq.maxlen until you don't see
|  | net.inet.ip.ifq.drops incrementing anymore under constant load.
|  
|  Yes all my nic interfaces have LWM/CWM/HWM values:
|  IFACE LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
|  System256 837715502
| 2k   1601252
|  em0  372k 4 4   256 4
|  em1 2582k 4 4   256 4
|  em2  3727512k 7 4   256 7
|  em382582k 4 4   256 4
|  em4   250722k63 4   25663
|  em536582k 8 4   256 8
|  em6  5012882k24 4   25624
|  em7  222k 4 4   256 4
|  em8   365512k23 4   25623
|  em9   520532k 5 4   256 4
|  
| 
| Woohoo. That is a lot of livelocks you hit. In other words you are losing
| ticks by something spinning to long in the kernel. Interfaces with a very
| low CWM but a high pps rate are the ones you need to investigate about.

Hum, OK.
A strange thing about the livelocks is the big difference between, for example,
em2 and em4:

Name  Mtu   Network      Ipkts  Ierrs  Opkts  Oerrs  Colls
em2 1500  Link   886803460042899  6562765482  0 0
em2 1500  fe80::%em2/  886803460042899  6562765482  0 0
em4 1500  Link  33934108692 19371393 20672882997  0 0
em4 1500  fe80::%em4/ 33934108692 19371393 20672882997 0 0

There are more livelocks on em2 but fewer packets (or maybe the counters were
reset to 0 after reaching their maximum value)



| Additionally I would like to see your netstat -m and vmstat -m output.

netstat -m:
18472 mbufs in use:
18449 mbufs allocated to data
16 mbufs allocated to packet headers
7 mbufs allocated to socket names and addresses
331/4188/6144 mbuf 2048 byte clusters in use (current/peak/max)
0/8/6144 mbuf 4096 byte clusters in use (current/peak/max)
0/8/6144 mbuf 8192 byte clusters in use (current/peak/max)
0/8/6144 mbuf 9216 byte clusters in use (current/peak/max)
0/8/6144 mbuf 12288 byte clusters in use (current/peak/max)
0/8/6144 mbuf 16384 byte clusters in use (current/peak/max)
0/8/6144 mbuf 65536 byte clusters in use (current/peak/max)
30704 Kbytes allocated to network (70% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

vmstat -m:
Memory statistics by bucket size
Size   In Use   Free   Requests  HighWater  Couldfree
  16   113578 195414   324140581280   6712
  32   378705687   74930489 640   6824
  64 7707869   11878746 320  27074
 12811411 45   36424677 160 78
 256 7875973  328666338  80   60487950
 512 1951 656017929  40 413368
1024  3311771947159  20 880831
2048   57  3 496398  10  0
4096 5164 15 260948   5 166561
8192   36  5 226431   5  18240
   16384   12  08279177   5  0
   327685  0 11   5  0
   655362  0  2   5  0

Memory usage type by bucket size
Size  Type(s)
  16  devbuf, pcb, routetbl, sysctl, UFS mount, dirhash, ACPI, exec,
  xform_data, VM swap, UVM amap, UVM aobj, USB, USB device, temp
  32  devbuf, pcb, routetbl, ifaddr, UFS mount, sem, dirhash, ACPI,
  ip_moptions, in_multi, exec, pfkey data, xform_data, UVM amap, USB,
  temp
  64  devbuf, pcb, routetbl, fragtbl, ifaddr, vnodes, UFS mount, dirhash,
  ACPI, proc, VFS cluster, in_multi, ether_multi, VM swap, UVM amap,
  USB, USB device, NDP, temp
 128  devbuf, pcb, routetbl, fragtbl, ifaddr, mount, sem, dirhash, ACPI,
  VFS cluster, MFS node, NFS srvsock, ip_moptions, ttys, pfkey data,
  UVM amap, USB, USB device, NDP, temp
 256  devbuf, routetbl, ifaddr, ioctlops, iov, vnodes, shm, VM map, dirhash,
  ACPI, ip_moptions, exec, UVM amap, USB, USB device, ip6_options, temp
 512  devbuf, ifaddr, sysctl, ioctlops, iov, vnodes, dirhash, file desc,
  NFS daemon, ttys, newblk, UVM amap, USB, USB device, temp
1024  devbuf, pcb, sysctl, ioctlops, iov, mount, UFS mount, shm, ACPI, proc,
  ttys, exec, UVM amap, USB HC, crypto data, 

Re: network bandwith with em(4)

2011-03-02 Thread Alexey Suslikov
Claudio Jeker wrote:

 On Wed, Mar 02, 2011 at 08:34:02PM +0100, Manuel Guesdon wrote:
  On Wed, 2 Mar 2011 21:52:03 +0900
  Ryan McBride mcbr...@openbsd.org wrote:
 
  | On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
  |  OK. Anyway NIC buffers restrict buffered packets number. But the 
  problem
  |  remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
  |  (82576) can't route 150kpps without Ierr :-)
  |  http://www.oxymium.net/tmp/core3-dmesg
  |
  | I've done some more comprehensive testing and talked to some other
  | developers, and it seems that 150kpps is in the range of what is
  | expected for such hardware with an unoptimized install.
 
  Thank you for the help !

 Hmpf. My last tests where done with ix(4) and it performed way better. Not
 sure if something got back into em(4) that makes the driver slow or if it
 is something different.



According to http://www.oxymium.net/tmp/core3-dmesg, interrupts are shared
heavily (see dmesg parts below).

The most problematic interface (wrt livelocks), em6, uses apic 9 int 15, which is
shared by other devices, including PCIe bridges.

Is there any possibility for a PCIe bridge to conflict with a slave device when
the interrupt is shared, with excessive livelocks as a result? How are bridge
interrupts handled inside the kernel?

Alexey
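
A quick way to spot the sharing in a full dmesg is a grep over the boot
messages (a sketch, assuming the usual /var/run/dmesg.boot location):

grep -E 'apic 9 int (13|14|15)' /var/run/dmesg.boot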

ppb6 at pci6 dev 1 function 0 PLX PEX 8533 rev 0xaa: apic 9 int 13 (irq 11)
pci7 at ppb6 bus 7

ppb0 at pci0 dev 1 function 0 Intel X58 PCIE rev 0x13
pci1 at ppb0 bus 1
em0 at pci1 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic 9
int 4 (irq 10), address 00:30:48:9f:17:52
em1 at pci1 dev 0 function 1 Intel PRO/1000 (82576) rev 0x01: apic 9
int 16 (irq 11), address 00:30:48:9f:17:53

ppb7 at pci6 dev 8 function 0 PLX PEX 8533 rev 0xaa: apic 9 int 6 (irq 10)
pci8 at ppb7 bus 8

ppb9 at pci9 dev 1 function 0 PLX PEX 8518 rev 0xac: apic 9 int 13 (irq 11)
pci10 at ppb9 bus 10
em2 at pci10 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic
9 int 13 (irq 11), address 00:25:90:05:53:3c
em3 at pci10 dev 0 function 1 Intel PRO/1000 (82576) rev 0x01: apic
9 int 15 (irq 15), address 00:25:90:05:53:3d

ppb10 at pci9 dev 2 function 0 PLX PEX 8518 rev 0xac: apic 9 int 15 (irq 15)
pci11 at ppb10 bus 11
em4 at pci11 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic
9 int 15 (irq 15), address 00:25:90:05:53:3e
em5 at pci11 dev 0 function 1 Intel PRO/1000 (82576) rev 0x01: apic
9 int 14 (irq 14), address 00:25:90:05:53:3f

ppb11 at pci6 dev 9 function 0 PLX PEX 8533 rev 0xaa: apic 9 int 13 (irq 11)
pci12 at ppb11 bus 12

ppb13 at pci13 dev 1 function 0 PLX PEX 8518 rev 0xac: apic 9 int 15 (irq 15)
pci14 at ppb13 bus 14
em6 at pci14 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic
9 int 15 (irq 15), address 00:25:90:05:51:d8
em7 at pci14 dev 0 function 1 Intel PRO/1000 (82576) rev 0x01: apic
9 int 14 (irq 14), address 00:25:90:05:51:d9

ppb14 at pci13 dev 2 function 0 PLX PEX 8518 rev 0xac: apic 9 int 14 (irq 14)
pci15 at ppb14 bus 15
em8 at pci15 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic
9 int 14 (irq 14), address 00:25:90:05:51:da
em9 at pci15 dev 0 function 1 Intel PRO/1000 (82576) rev 0x01: apic
9 int 6 (irq 10), address 00:25:90:05:51:db



Re: network bandwith with em(4)

2011-03-02 Thread Stuart Henderson
On 2011-02-28, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
 http://www.oxymium.net/tmp/core3-dmesg

ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2 spacing 1

ipmi is disabled in GENERIC. have you tried without it?



Re: network bandwith with em(4)

2011-02-28 Thread Manuel Guesdon
On Thu, 24 Feb 2011 22:03:22 -0700 (MST)
Theo de Raadt dera...@cvs.openbsd.org wrote:

|  We've got same problems (on a routeur, not a firewall). Increasing
|  MAX_INTS_PER_SEC to 24000  increased bandwith and lowered packet loss.
|  Our cards are Intel PRO/1000 (82576) and Intel PRO/1000 FP
|  (82576).
| 
| Did you try to increase the number of descriptor?
| #define EM_MAX_TXD 256
| #define EM_MAX_RXD 256
| 
| I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it looks
| worth.
| 
| Say you increase this.
| 
| That means on a single interrupt, the handler could be forced to handle
| around 2000 packets.
| 
| Nothing else will happen on the machine during that period.
| 
| Can you say 'interrupt latency increase' boys and girls?

OK. Anyway NIC buffers restrict buffered packets number. But the problem
remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
(82576) can't route 150kpps without Ierr :-)
http://www.oxymium.net/tmp/core3-dmesg

Manuel 



Re: network bandwith with em(4)

2011-02-28 Thread Ryan McBride
On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
 OK. Anyway NIC buffers restrict buffered packets number. But the problem
 remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
 (82576) can't route 150kpps without Ierr :-)
 http://www.oxymium.net/tmp/core3-dmesg

Turn off hyperthreading, run a uniprocessor kernel rather than bsd.mp.
I can't immediately tell if you're running i386 or amd64, but i386 will
probably be better.

There may be something else going on here, because 150kpps should be
trivial for a box like this, but the advice above will certainly improve
your situation.

(Yes, it will hurt to know that 7 of your cores are doing nothing. Too
bad, they're just slowing you down now)
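
For reference, testing this does not require rebuilding anything (a sketch;
whether the single-processor GENERIC kernel lives at /bsd or /bsd.sp depends
on how the installer set the machine up, so adjust the path):

echo 'set image /bsd.sp' >> /etc/boot.conf   # make the SP kernel the default,
                                             # or type its name at the boot> prompt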



Re: network bandwith with em(4)

2011-02-28 Thread Manuel Guesdon
On Mon, 28 Feb 2011 21:29:01 +0900
Ryan McBride mcbr...@openbsd.org wrote:

| On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
|  OK. Anyway NIC buffers restrict buffered packets number. But the problem
|  remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
|  (82576) can't route 150kpps without Ierr :-)
|  http://www.oxymium.net/tmp/core3-dmesg
| 
| Turn off hyperthreading, run a uniprocessor kernel rather than bsd.mp.
| I can't immediately tell if you're running i386 or amd64, but i386 will
| probably be better.

amd64 currently.


| There may be something else going on here, because 150kpps should be
| trivial for a box like this, but the advice above will certainly improve
| your situation.

Thank you ! I'll plan to test that !


| (Yes, it will hurt to know that 7 of your cores are doing nothing. Too
| bad, they're just slowing you down now)

Hum, I prefer to see it working well with only 1 core instead of working badly
with 8 cores :-)


Manuel 

--
__
Manuel Guesdon - OXYMIUM



Re: network bandwith with em(4)

2011-02-28 Thread Patrick Lamaiziere
On Sat, 26 Feb 2011 00:23:36 +0900,
Ryan McBride mcbr...@openbsd.org wrote:

How about a _full_ dmesg, so someone can take a wild guess at
what your machine is capable of?
  
  full dmesg : http://user.lamaiziere.net/patrick/dmesg-open48.txt
  
  The box is a Dell R610 server.
 
 This box should be able to fill a gigabit of regular TCP traffic (1500
 MTU) without any problem. Double-check your testing procedures.

I will test this.
 
 I have some additional comments/questions though:
 
 1) you probably don't want to run bsd.mp on a firewall, it'll hurt you
 more than it helps, unless you have significant CPU-bound userland
 stuff going on, for example antivirus scanning of email.

I've tried with an SP kernel (amd64); it does not seem to change anything.
 
 2) You may get better performance running i386.

I will try, but I do not expect a lot of difference on the IErr rate.
 
 3) Besides the the em driver changes you've mentioned, is the source
 code you're building the kernel clean OPENBSD_4_8 -stable, or
 something else (4.8-current from after the 4.8 release, for example)

It's a clean release 4.8/amd64, with the 4.8 errata applied.

Thanks, regards.



Re: network bandwith with em(4)

2011-02-28 Thread Frédéric URBAN

On 28/02/2011 16:51, Patrick Lamaiziere wrote:

On Sat, 26 Feb 2011 00:23:36 +0900,
Ryan McBride mcbr...@openbsd.org wrote:


How about a _full_ dmesg, so someone can take a wild guess at
what your machine is capable of?

full dmesg : http://user.lamaiziere.net/patrick/dmesg-open48.txt

The box is a Dell R610 server.

This box should be able to fill a gigabit of regular TCP traffic (1500
MTU) without any problem. Double-check your testing procedures.

I will test this.

As I said earlier, we have almost the same setup with bnx(4) instead of em(4) (Dell
R510 with a single Intel X5660) and we can send at gigabit full duplex
with only around 25% interrupt CPU.
It looks like the R610 has 4 bnx(4) interfaces (Broadcom BCM 5709); maybe you can
try to use them, just for testing purposes.

I have some additional comments/questions though:

1) you probably don't want to run bsd.mp on a firewall, it'll hurt you
more than it helps, unless you have significant CPU-bound userland
stuff going on, for example antivirus scanning of email.

I've tried with a sp kernel (amd64), does not look to change something.


2) You may get better performance running i386.

I will try, but I do not expect a lot of difference on the IErr rate.


3) Besides the the em driver changes you've mentioned, is the source
code you're building the kernel clean OPENBSD_4_8 -stable, or
something else (4.8-current from after the 4.8 release, for example)

It's a clean release 4.8/amd64, with 4.8 erratas applied.

Thanks, regards.




Re: network bandwith with em(4)

2011-02-28 Thread Daniel Ouellet

OK. Anyway NIC buffers restrict buffered packets number. But the problem
remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
(82576) can't route 150kpps without Ierr :-)
http://www.oxymium.net/tmp/core3-dmesg


Just an idea, but it may very well have something to do with this:

http://www.openbsd.org/want.html

Specifically this part:

# Intel 82576 SFP and 82580 based Gigabit Ethernet devices for improving 
hardware support in em(4). Needed in Hannover, Germany. Contact 
j...@openbsd.org.


I assume that if the wanted entry is still there for that chipset-based
network card, and you use the same chipset on yours, then the support for
it is not as good as it could be?


Just a thought, but I could well be way off as well.

Food for thought.

Best,

Daniel



Re: network bandwith with em(4)

2011-02-28 Thread fredrik danerklint
On Monday, 28 February 2011 23:00:10, Daniel Ouellet wrote:
  OK. Anyway NIC buffers restrict buffered packets number. But the problem
  remain: why a (for exemple) dual Xeon E5520@2.27GHz with Intel PRO/1000
  (82576) can't route 150kpps without Ierr :-)
  http://www.oxymium.net/tmp/core3-dmesg
 
 Just an idea, but may be it very well could have something to do with this:
 
 http://www.openbsd.org/want.html
 
 Specifically this part:
 
 # Intel 82576 SFP and 82580 based Gigabit Ethernet devices for improving
 hardware support in em(4). Needed in Hannover, Germany. Contact
 j...@openbsd.org.

I've sent a 4-port network card (Intel NIC I340-T4, 82580 ethernet chipset)
to jsg, which he received on 2011-01-18.

-- 
//fredan



Re: network bandwith with em(4)

2011-02-25 Thread Gabriel Linder

On 02/24/11 19:28, RLW wrote:

On 2011-02-24 12:11, Patrick Lamaiziere wrote:

On Wed, 23 Feb 2011 22:09:18 +0100,
Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:



| Did you try to increase the number of descriptor?
| #define EM_MAX_TXD 256
| #define EM_MAX_RXD 256
|
| I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it
looks | worth.


Thank you ! I'll investigate this !


As I said it is worth here. The load is increaded and I lose around 50
Mbits of bandwith. I was curious if you've made some tests on this.




ok, so the conclusion might be, that if one want to have transfers 
bigger than 300mbit/s on em(4), one should tuning the em(4) driver 
source code?


I have firewalls with more than 300Mbit/s and standard GENERIC.MP.



Re: network bandwith with em(4)

2011-02-25 Thread Claer
On Thu, Feb 24 2011 at 28:19, RLW wrote:

[...]
 
 ok, so the conclusion might be, that if one want to have transfers
 bigger than 300mbit/s on em(4), one should tuning the em(4) driver
 source code?
False

Here are the tests I've done with a packet generator.
http://marc.info/?l=openbsd-miscm=129534605406967w=2

Claer



Re: network bandwith with em(4)

2011-02-25 Thread Patrick Lamaiziere
On Fri, 25 Feb 2011 08:41:20 +0900,
Ryan McBride mcbr...@openbsd.org wrote:

 On Wed, Feb 23, 2011 at 06:07:16PM +0100, Patrick Lamaiziere wrote:
  I log the congestion counter (each 10s) and there are at max 3 or 4
  congestions per day. I don't think the bottleneck is pf.
 
 The congestion counter doesn't directly mean you have a bottleneck in
 PF; it's triggered by the IP input queue being full, and could
 indicate a bottleneck in other places as well, which PF tries to help
 out with by dropping packets earlier.
 
 
   Interface errors?
  
  Quite a lot.
 
 The output of `systat mbufs` is worth looking at, in particular the
 figure for LIVELOCKS, and the LWM/CWM figures for the interface(s) in
 question. 
 
 If the livelocks value is very high, and the LWM/CWM numbers are very
 small, it is likely that the MCLGETI interface is protecting your
 system from being completly flattened by forcing the em card to drop
 packets (supported by your statement that the error rate is high). If
 it's bad enough MCLGETI will be so effective that the pf congestion
 counter will not get increment.

systat mbufs:
IFACE     LIVELOCKS  SIZE  ALIVE  LWM  HWM  CWM
System 256  375   149
2k  240   1125

em0            1772    2k     80    4  256   80
em1              11    2k      5    4  256    5
em2             293    2k    110    4  256  110
em3
em4              18    2k     11    4  256   11
em5              10    2k     12    4  256   12
em6              14    2k      5    4  256    5
bnx0              3    2k      4    2  510    4
bnx1              1    2k      4    2  510    4
bnx3              1    2k      2    2  510    2
 
 
 You mentioned the following in your initial email:
 
  #define MAX_INTS_PER_SEC8000
 
  Do you think I can increase this value? The interrupt rate of the
  machine is at max ~60% (top).
 
 Increasing this value will likely hurt you. 60% interrupt rate sounds
 about right to me for a firewall system that is running at full tilt;
 100% interrupt is very bad, if your system spends all cycles servicing
 interrupts it will not do very much of anything useful.
 
 
 dmesg:
  em0 at pci5 dev 0 function 0 Intel PRO/1000 QP (82571EB) rev
  0x06: apic 1 int 13 (irq 14), address 00:15:17:ed:98:9d
 
  em4 at pci9 dev 0 function 0 Intel PRO/1000 QP (82575GB) rev 0x02:
  apic 1 int 23 (irq 11), address 00:1b:21:38:e0:80
 
 How about a _full_ dmesg, so someone can take a wild guess at what
 your machine is capable of?
 
 -Ryan
 



-- 
-- 
Patrick Lamaizière
CRI Université de Rennes 1
Tél: 02 23 23 71 45



Re: network bandwith with em(4)

2011-02-25 Thread Patrick Lamaiziere
On Fri, 25 Feb 2011 13:51:32 +0100,
Patrick Lamaiziere patf...@davenulle.org wrote:

(oops, pushed the wrong button)

  How about a _full_ dmesg, so someone can take a wild guess at what
  your machine is capable of?

full dmesg : http://user.lamaiziere.net/patrick/dmesg-open48.txt

The box is a Dell R610 server.

Thanks, regards.



Re: network bandwith with em(4)

2011-02-25 Thread Patrick Lamaiziere
On Fri, 25 Feb 2011 13:51:32 +0100,
Patrick Lamaiziere patf...@davenulle.org wrote:

 systat mbufs:
 IFACELIVELOCKS SIZE ALIVE LWM HWM CWM

What do these counters mean?

Thanks.



Re: network bandwith with em(4)

2011-02-25 Thread Patrick Lamaiziere
On Tue, 22 Feb 2011 18:09:32 +0100,
Patrick Lamaiziere patf...@davenulle.org wrote:

 (4.8/amd64)
 
 Hello,
 
 I'm using two ethernet cards Intel 1000/PRO quad ports (gigabit) on a
 firewall (one fiber and one copper).
 
 The problem is that we don't get more than ~320 Mbits/s of bandwith
 beetween the internal networks and internet (gigabit).
 
 As far I can see, on load there is a number of Ierr on the interface
 connected to Internet (between 1% to 5%).
 
 Also the interrupt rate on this card is around ~7500 (using systat).
 In the em(4) driver, there is a limitation of the interrupt rate at
 8000/s.

...

Well, I've made some tests, and increasing the number of interrupts or
the number of RX descriptors does not help to reduce the Ierr count or
to increase the bandwidth.

So I don't know where the problem is...

Do you think the hardware used is not powerful enough? (dmesg:
http://user.lamaiziere.net/patrick/dmesg-openbsd4.8.txt).

The box is a router/firewall; there are 6 interfaces on the box. One is
connected to the internet (the busiest interface), one is connected to
the LAN (very busy too), and the others are far less busy.

To give an idea, this box replaces an old Cisco 7204 which maxes out at 200
Mbits, no more.

I would be happy to know what kind of hardware you are using to build
a gigabit router with good performance.

Thanks to all. regards.



Re: network bandwith with em(4)

2011-02-25 Thread Ryan McBride
On Fri, Feb 25, 2011 at 02:05:30PM +0100, Patrick Lamaiziere wrote:
 On Fri, 25 Feb 2011 13:51:32 +0100,
 Patrick Lamaiziere patf...@davenulle.org wrote:
 
 (ooops, push the wrong button)
 
   How about a _full_ dmesg, so someone can take a wild guess at what
   your machine is capable of?
 
 full dmesg : http://user.lamaiziere.net/patrick/dmesg-open48.txt
 
 The box is a Dell R610 server.

This box should be able to fill a gigabit of regular TCP traffic (1500
MTU) without any problem. Double-check your testing procedures.

I have some additional comments/questions though:

1) you probably don't want to run bsd.mp on a firewall, it'll hurt you
more than it helps, unless you have significant CPU-bound userland stuff
going on, for example antivirus scanning of email.

2) You may get better performance running i386.

3) Besides the em driver changes you've mentioned, is the source
code you're building the kernel from clean OPENBSD_4_8 -stable, or something
else (4.8-current from after the 4.8 release, for example)?



Re: network bandwith with em(4)

2011-02-25 Thread Manuel Guesdon
Hi,

On Fri, 25 Feb 2011 08:41:20 +0900
Ryan McBride mcbr...@openbsd.org wrote:
..
| The output of `systat mbufs` is worth looking at, in particular the
| figure for LIVELOCKS, and the LWM/CWM figures for the interface(s) in
| question. 
| 
| If the livelocks value is very high, and the LWM/CWM numbers are very
| small, 

Thank you for your help, Ryan.


It seems I'm in this situation:
   5 users    Load 0.17 0.15 0.10    (1-48 of 58)    Fri Feb 25 20:27:44 2011

IFACE     LIVELOCKS  SIZE  ALIVE  LWM  HWM  CWM
System256 820505446
   2k   2571252
lo0
em0              34    2k      4    4  256    4
em1             257    2k      4    4  256    4
em2          338382    2k      7    4  256    7
em3            8258    2k      4    4  256    4
em4           22635    2k     48    4  256   48
em5            3470    2k      6    4  256    6
em6          458241    2k     28    4  256   28
em7               8    2k      4    4  256    4
em8           33232    2k     50    4  256   50
em9           46878    2k      4    4  256    4

systat -s 2 vmstat:
   5 users    Load 0.22 0.17 0.10    Fri Feb 25 20:28:18 2011

memory totals (in KB)PAGING   SWAPPING Interrupts
   real   virtual free   in  out   in  out25589 total
Active   741204741204  1761104   ops   1600 clock
All 1278264   1278264  1761104   pages   11 ipi
  1 em0
Proc:r  d  s  wCsw   Trp   Sys   Int   Sof  Flt   forks em1
  1532 5   117 2286799   33   fkppw2691 em2
  fksvm em3
   3.2%Int   0.1%Sys   0.0%Usr   0.0%Nic  96.8%Idle   pwait6778 em4
|||||||||||   relck 382 em5
||rlkok7328 em6
  noram em7
Namei Sys-cacheProc-cacheNo-cache ndcpy6724 em8
Calls hits%hits %miss   % fltcp  74 em9
3 zfod  uhci1
  cow   ehci0
Disks   wd0   cd0   sd0 25328 fmin  ehci1
seeks   33770 ftarg
pciide0 xfers
itarg com0 speed 241
wired com1 sec
pdfre pckbc0 pdscn
  pzidle
   44 kmapent


  81542 IPKTS
  78860 OPKTS


(it's on a device with MAX_INTS_PER_SEC=8000)


|it is likely that the MCLGETI interface is protecting your system
| from being completly flattened by forcing the em card to drop packets
| (supported by your statement that the error rate is high). 
| 
| How about a _full_ dmesg, so someone can take a wild guess at what
| your machine is capable of?

http://www.oxymium.net/tmp/core3-dmesg

This device is not overloaded but it drops packets :-(

Manuel 



Re: network bandwith with em(4)

2011-02-24 Thread Patrick Lamaiziere
On Wed, 23 Feb 2011 22:09:18 +0100,
Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:


 | Did you try to increase the number of descriptor?
 | #define EM_MAX_TXD 256
 | #define EM_MAX_RXD 256
 | 
 | I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it
 looks | worth.
 
 Thank you ! I'll investigate this !

As I said, it is worse here. The load is increased and I lose around 50
Mbits of bandwidth. I was curious whether you've made some tests on this.



Re: network bandwith with em(4)

2011-02-24 Thread RLW

On 2011-02-24 12:11, Patrick Lamaiziere wrote:

On Wed, 23 Feb 2011 22:09:18 +0100,
Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:



| Did you try to increase the number of descriptor?
| #define EM_MAX_TXD 256
| #define EM_MAX_RXD 256
|
| I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it
looks | worth.


Thank you ! I'll investigate this !


As I said it is worth here. The load is increaded and I lose around 50
Mbits of bandwith. I was curious if you've made some tests on this.




OK, so the conclusion might be that if one wants to have transfers
bigger than 300 Mbit/s on em(4), one should tune the em(4) driver
source code?



best regards,
RLW



Re: network bandwith with em(4)

2011-02-24 Thread Ryan McBride
On Wed, Feb 23, 2011 at 06:07:16PM +0100, Patrick Lamaiziere wrote:
 I log the congestion counter (each 10s) and there are at max 3 or 4
 congestions per day. I don't think the bottleneck is pf.

The congestion counter doesn't directly mean you have a bottleneck in
PF; it's triggered by the IP input queue being full, and could indicate
a bottleneck in other places as well, which PF tries to help out with by
dropping packets earlier.


  Interface errors?
 
 Quite a lot.

The output of `systat mbufs` is worth looking at, in particular the
figure for LIVELOCKS, and the LWM/CWM figures for the interface(s) in
question. 

If the livelocks value is very high, and the LWM/CWM numbers are very
small, it is likely that the MCLGETI interface is protecting your system
from being completely flattened by forcing the em card to drop packets
(supported by your statement that the error rate is high). If it's bad
enough, MCLGETI will be so effective that the pf congestion counter will
not get incremented.
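
A few quick commands that surface the counters discussed here (a sketch,
assuming the stock base tools):

systat mbufs                     # LIVELOCKS plus per-interface LWM/CWM/HWM
pfctl -si | grep -i congestion   # pf's view of IP input queue pressure
netstat -i                       # per-interface Ipkts/Ierrs, to keep errors in perspective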


You mentioned the following in your initial email:

 #define MAX_INTS_PER_SEC8000

 Do you think I can increase this value? The interrupt rate of the
 machine is at max ~60% (top).

Increasing this value will likely hurt you. A 60% interrupt rate sounds
about right to me for a firewall system that is running at full tilt;
100% interrupt is very bad: if your system spends all its cycles servicing
interrupts it will not do very much of anything useful.


dmesg:
 em0 at pci5 dev 0 function 0 Intel PRO/1000 QP (82571EB) rev
 0x06: apic 1 int 13 (irq 14), address 00:15:17:ed:98:9d

 em4 at pci9 dev 0 function 0 Intel PRO/1000 QP (82575GB) rev 0x02:
 apic 1 int 23 (irq 11), address 00:1b:21:38:e0:80

How about a _full_ dmesg, so someone can take a wild guess at what
your machine is capable of?

-Ryan



Re: network bandwith with em(4)

2011-02-24 Thread David Gwynne
i'd like to reiterate ryan's advice to have a look at the systat mbuf output.

as he said, mclgeti will try to protect the host by restricting the number of
packets placed on the rx rings. it turns out you don't need (or can't use) a lot
of packets on the ring, so bumping the ring size is a useless tweak. mclgeti
simply won't let you fill all those descriptors.

if you were allowed to fill all 2048 entries on your modified rings, that
would just mean you spend more time in the interrupt handler pulling packets
off these rings and freeing them immediately because you have no time to
process them. i.e., increasing the ring size would actually slow down your
forwarding rate if mclgeti was disabled.
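
As a rough illustration of the watermark idea described here — a conceptual
sketch only, not the actual OpenBSD MCLGETI code; the names and the back-off
rule are invented for illustration:

/* per-ring bookkeeping: the ring may hold 'size' descriptors, but only
 * 'cwm' of them may be handed to the NIC at any time */
struct rx_ring {
        int size;       /* descriptors physically on the ring, e.g. 256 */
        int lwm;        /* low watermark, the floor cwm can shrink to   */
        int cwm;        /* current watermark                            */
        int filled;     /* descriptors currently owned by the NIC       */
};

/* refuse to refill past the watermark, so an overloaded host starves the
 * NIC of buffers and forces it to drop in hardware instead */
static int
rxr_may_refill(const struct rx_ring *r)
{
        return (r->filled < r->cwm);
}

/* when a livelock is detected, shrink the watermark toward the low mark;
 * when the host keeps up and the ring runs dry, it can be grown again */
static void
rxr_livelock(struct rx_ring *r)
{
        r->cwm = (r->cwm / 2 > r->lwm) ? r->cwm / 2 : r->lwm;
}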

cheers,
dlg

On 25/02/2011, at 9:41 AM, Ryan McBride wrote:

 On Wed, Feb 23, 2011 at 06:07:16PM +0100, Patrick Lamaiziere wrote:
 I log the congestion counter (each 10s) and there are at max 3 or 4
 congestions per day. I don't think the bottleneck is pf.

 The congestion counter doesn't directly mean you have a bottleneck in
 PF; it's triggered by the IP input queue being full, and could indicate
 a bottleneck in other places as well, which PF tries to help out with by
 dropping packets earlier.


 Interface errors?

 Quite a lot.

 The output of `systat mbufs` is worth looking at, in particular the
 figure for LIVELOCKS, and the LWM/CWM figures for the interface(s) in
 question.

 If the livelocks value is very high, and the LWM/CWM numbers are very
 small, it is likely that the MCLGETI interface is protecting your system
 from being completly flattened by forcing the em card to drop packets
 (supported by your statement that the error rate is high). If it's bad
 enough MCLGETI will be so effective that the pf congestion counter will
 not get increment.


 You mentioned the following in your initial email:

 #define MAX_INTS_PER_SEC8000

 Do you think I can increase this value? The interrupt rate of the
 machine is at max ~60% (top).

 Increasing this value will likely hurt you. 60% interrupt rate sounds
 about right to me for a firewall system that is running at full tilt;
 100% interrupt is very bad, if your system spends all cycles servicing
 interrupts it will not do very much of anything useful.


 dmesg:
 em0 at pci5 dev 0 function 0 Intel PRO/1000 QP (82571EB) rev
 0x06: apic 1 int 13 (irq 14), address 00:15:17:ed:98:9d

 em4 at pci9 dev 0 function 0 Intel PRO/1000 QP (82575GB) rev 0x02:
 apic 1 int 23 (irq 11), address 00:1b:21:38:e0:80

 How about a _full_ dmesg, so someone can take a wild guess at what
 your machine is capable of?

 -Ryan



Re: network bandwith with em(4)

2011-02-24 Thread Theo de Raadt
 We've got same problems (on a routeur, not a firewall). Increasing
 MAX_INTS_PER_SEC to 24000  increased bandwith and lowered packet loss.
 Our cards are Intel PRO/1000 (82576) and Intel PRO/1000 FP
 (82576).

Did you try to increase the number of descriptor?
#define EM_MAX_TXD 256
#define EM_MAX_RXD 256

I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it looks
worth.

Say you increase this.

That means on a single interrupt, the handler could be forced to handle
around 2000 packets.

Nothing else will happen on the machine during that period.

Can you say 'interrupt latency increase' boys and girls?



Re: network bandwith with em(4)

2011-02-23 Thread Patrick Lamaiziere
On Tue, 22 Feb 2011 19:13:48 +0100,
Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:

Hello,

 We've got same problems (on a routeur, not a firewall). Increasing
 MAX_INTS_PER_SEC to 24000  increased bandwith and lowered packet loss.
 Our cards are Intel PRO/1000 (82576) and Intel PRO/1000 FP
 (82576).

Did you try to increase the number of descriptors?
#define EM_MAX_TXD 256
#define EM_MAX_RXD 256

I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it looks
worse.

My configuration is two firewalls in master/backup mode. On the first
one, the two busiest links are on the first card (fiber). On the
second, these two links are not on the same card: one is on the fiber
card and the other on the copper card. I've noticed today that the
input Ierr rate is far lower on the second firewall than on the first.

Is it possible to have a bottleneck on the ethernet card or on the bus?

I will make more tests tomorrow...
Thanks, regards.



Re: network bandwith with em(4)

2011-02-23 Thread Patrick Lamaiziere
On Tue, 22 Feb 2011 10:22:16 -0800 (PST),
James A. Peltier jpelt...@sfu.ca wrote:

 Those documents do not necessarily apply any more.  Don't go tweaking
 knobs until you know what they do.  We have machines here that
 transfer nearly a gigabit of traffic/s without tuning in bridge mode
 non-the-less.
 
 Are you seeing any packet congestion markers (counter congestion) in
 systat pf?  If so you might not have sufficient states available

I log the congestion counter (each 10s) and there are at max 3 or 4
congestions per day. I don't think the bottleneck is pf.
 
 What about framentation?

None.

 Interface errors?

Quite a lot.
 
 There are many other non-tweakable issues that could cause this.

Sure, it's hard to know.

Thanks, regards.



Re: network bandwith with em(4)

2011-02-23 Thread Manuel Guesdon
On Wed, 23 Feb 2011 17:52:21 +0100
Patrick Lamaiziere patf...@davenulle.org wrote:

| Le Tue, 22 Feb 2011 19:13:48 +0100,
| Manuel Guesdon ml+openbsd.m...@oxymium.net a icrit :
|
| Hello,
|
|  We've got same problems (on a routeur, not a firewall). Increasing
|  MAX_INTS_PER_SEC to 24000  increased bandwith and lowered packet loss.
|  Our cards are Intel PRO/1000 (82576) and Intel PRO/1000 FP
|  (82576).
|
| Did you try to increase the number of descriptor?
| #define EM_MAX_TXD 256
| #define EM_MAX_RXD 256
|
| I've tried up to 2048 (and with MAX_INTS_PER_SEC = 16000) but it looks
| worth.

Thank you ! I'll investigate this !


| My configuration is two firewalls in master/backup mode. On the first
| one the two most busy links are on the first card (Fiber). On the
| second, these two links are not on the same card, one is on the fiber
| card and the other on the cupper card. I've noticed today that the
| input Ierr rate is far lower on the second firewall than on the first.
|
| Is it possible to have a bottleneck on the ethernet card or on the bus?

Maybe (but I'm not an expert :-). In my case, the bus doesn't seem to be
the problem (the cards are on the PCI #1 64-bit PCI Express slot of an X8DTU:
http://www.supermicro.com/products/motherboard/QPI/5500/X8DTU.cfm).

Manuel

--
__
Manuel Guesdon - OXYMIUM



network bandwith with em(4)

2011-02-22 Thread Patrick Lamaiziere
(4.8/amd64)

Hello,

I'm using two Intel PRO/1000 quad-port ethernet cards (gigabit) on a
firewall (one fiber and one copper).

The problem is that we don't get more than ~320 Mbits/s of bandwidth
between the internal networks and the internet (gigabit).

As far as I can see, under load there are a number of Ierrs on the interface
connected to the Internet (between 1% and 5%).

Also the interrupt rate on this card is around ~7500 (using systat). In
the em(4) driver, there is a limitation of the interrupt rate at 8000/s.

if_em.h
/*
 * MAX_INTS_PER_SEC (ITR - Interrupt Throttle Register)
 * The Interrupt Throttle Register (ITR) limits the delivery of interrupts
 * to a reasonable rate by providing a guaranteed inter-interrupt delay
 * between interrupts asserted by the Ethernet controller.
 */
#define MAX_INTS_PER_SEC        8000
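
For context, the value actually programmed into the ITR register is derived
from this constant; the companion define usually found next to it looks
roughly like the sketch below (the register counts in 256 ns units, so 8000
ints/s maps to a value around 488 — treat the exact form as an assumption
and check your if_em.h):

#define DEFAULT_ITR     (1000000000 / (MAX_INTS_PER_SEC * 256))   /* ~488 for 8000 ints/s */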

Do you think I can increase this value? The interrupt rate of the
machine is at max ~60% (top).

Other ideas to increase the bandwidth would be welcome too. I don't
think the limitation comes from PF because I don't see any congestion.

thanks, regards.

--
dmesg:
em0 at pci5 dev 0 function 0 Intel PRO/1000 QP (82571EB) rev
0x06: apic 1 int 13 (irq 14), address 00:15:17:ed:98:9d

em4 at pci9 dev 0 function 0 Intel PRO/1000 QP (82575GB) rev 0x02:
apic 1 int 23 (irq 11), address 00:1b:21:38:e0:80



Re: network bandwith with em(4)

2011-02-22 Thread Mark Nipper
On 22 Feb 2011, Patrick Lamaiziere wrote:
 The problem is that we don't get more than ~320 Mbits/s of bandwith
 beetween the internal networks and internet (gigabit).

Have you already looked at:
---
https://calomel.org/network_performance.html

-- 
Mark Nipper
ni...@bitgnome.net (XMPP)
+1 979 575 3193



Re: network bandwith with em(4)

2011-02-22 Thread Frédéric URBAN

Hello,

We have kind of the same setup, but with bnx(4) devices, and there is no
problem. I'm used to downloading big files over FTP from all over the world and we
have gigabit connectivity without any pf-related tuning. We are planning
to use em(4) 82876 on another path to another ISP, so if you find
anything else, I'm very interested.


Good evening ;)

On 22/02/2011 18:19, Mark Nipper wrote:

On 22 Feb 2011, Patrick Lamaiziere wrote:

The problem is that we don't get more than ~320 Mbits/s of bandwith
beetween the internal networks and internet (gigabit).

Have you already looked at:
---
https://calomel.org/network_performance.html




Re: network bandwith with em(4)

2011-02-22 Thread Patrick Lamaiziere
On Tue, 22 Feb 2011 11:19:26 -0600,
Mark Nipper ni...@bitgnome.net wrote:

  The problem is that we don't get more than ~320 Mbits/s of bandwith
  beetween the internal networks and internet (gigabit).
 
   Have you already looked at:
 ---
 https://calomel.org/network_performance.html

Yes, thanks. I've already increased the size of
net.inet.ip.ifq.maxlen.

But I don't see the point of these tunings for a firewall. IMHO, they
could help for a host handling TCP/UDP connections.

Anyway, I've tried; it does not change anything and I don't think it
should.

I'm not a network expert, I could be wrong. Let's see:
## Calomel.org  OpenBSD  /etc/sysctl.conf
##
kern.maxclusters=128000 # Cluster allocation limit

= netstat -m reports a peak of *only* 2500 mbufs used.

net.inet.ip.mtudisc=0   # TCP MTU (Maximum Transmission Unit)

= still at 1. I don't use scrub in pf or mss clamping.

net.inet.tcp.ackonpush=1# acks for packets with the push bit

= only one TCP connection on the firewall (ssh).

net.inet.tcp.ecn=1  # Explicit Congestion Notification enabled

net.inet.tcp.mssdflt=1472   # maximum segment size (1472 from scrub
pf.conf)

= same here, I guess the default mss is for connections from the
machine. tcpdump shows that the mss is negotiated at around 1450. Looks
good.

net.inet.tcp.recvspace=262144 # Increase TCP receive window size
to increase performance

= same, no tcp nor udp...

Am I wrong?

Thanks, regards.



Re: network bandwith with em(4)

2011-02-22 Thread Manuel Guesdon
Hi,

On Tue, 22 Feb 2011 18:09:32 +0100
Patrick Lamaiziere patf...@davenulle.org wrote:
| I'm using two ethernet cards Intel 1000/PRO quad ports (gigabit) on a
| firewall (one fiber and one copper).
| 
| The problem is that we don't get more than ~320 Mbits/s of bandwith
| beetween the internal networks and internet (gigabit).
| 
| As far I can see, on load there is a number of Ierr on the interface
| connected to Internet (between 1% to 5%).
| 
| Also the interrupt rate on this card is around ~7500 (using systat). In
| the em(4) driver, there is a limitation of the interrupt rate at 8000/s.
| 
| if_em.h
| /*
|  * MAX_INTS_PER_SEC (ITR - Interrupt Throttle Register)
|  * The Interrupt Throttle Register (ITR) limits the delivery of
| interrupts
|  * to a reasonable rate by providing a guaranteed inter-interrupt delay
|  * between interrupts asserted by the Ethernet controller.
|  */
| #define MAX_INTS_PER_SEC 8000
| 
| Do you think I can increase this value? The interrupt rate of the
| machine is at max ~60% (top).

We've got the same problems (on a router, not a firewall). Increasing
MAX_INTS_PER_SEC to 24000 increased bandwidth and lowered packet loss.
Our cards are Intel PRO/1000 (82576) and Intel PRO/1000 FP (82576).

We still have Ierrs (but a lower count). I don't understand why we still get
errors with a 90+% idle system.
I've made some calculations: for a 1 Gbps link with 600-byte packets, we
have to process 208334 pps. With a 40KB RX buffer on the NIC (40000/600 ≈ 66
packets max in the buffer) we only need 208334/66 ≈ 3157 interrupts/s, so 24000 and
even 8000 interrupts/s should be enough :-(
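
A small self-contained check of that arithmetic (same assumptions as above:
600-byte frames and a 40 KB usable RX FIFO, ignoring framing overhead and
DMA-ring behaviour, which is part of why the real interrupt need is higher):

#include <stdio.h>

int main(void)
{
        double link_bps    = 1e9;      /* 1 Gb/s                      */
        double frame_bytes = 600.0;    /* assumed average packet size */
        double fifo_bytes  = 40000.0;  /* assumed usable RX FIFO      */

        double pps      = link_bps / (frame_bytes * 8);  /* ~208333 packets/s      */
        double per_fifo = fifo_bytes / frame_bytes;      /* ~66.7 packets per FIFO */
        double ints     = pps / per_fifo;                /* ~3125 interrupts/s     */

        printf("%.0f pps, %.0f pkts/FIFO, %.0f ints/s\n", pps, per_fifo, ints);
        return 0;
}

The result lands in the same ballpark as the ~3157 figure above.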

If someone has an explanation...

Manuel 



Re: network bandwith with em(4)

2011-02-22 Thread RLW

On 2011-02-22 18:31, Frédéric URBAN wrote:

Hello,

We kinda have the same setup, but with bnx(4) devices. And there is no
problem. I'm used to download big files on FTP all over the world and we
have gigabit connectivity without any pf related tuning. We are planning
to use em(4) 82876 on another path to another ISP so if you find
anything else, i'm very interested.

Bonne soirie ;)

On 22/02/2011 18:19, Mark Nipper wrote:

On 22 Feb 2011, Patrick Lamaiziere wrote:

The problem is that we don't get more than ~320 Mbits/s of bandwith
beetween the internal networks and internet (gigabit).

Have you already looked at:
---
https://calomel.org/network_performance.html





Hello,

I wrote to this group about the same problem in November
2010 - http://marc.info/?l=openbsd-miscm=128990880310013w=2


After some discussion, Claudio Jeker suggested that there might be a
problem with the TBR (token bucket regulator).


When I tried to set tbrsize in pf.conf as the man page says, I got an error.

altq on em0 cbq bandwidth 1Gb tbrsize 4K queue { q_lan }
queue q_lan bandwidth 950Mb cbq (default)

I got this error:
root@router-test (/root)# pfctl -f /etc/pf.conf
/etc/pf.conf:9: syntax error
/etc/pf.conf:10: queue q_lan has no parent
/etc/pf.conf:10: errors in queue definition
pfctl: Syntax error in config file: pf rules not loaded

Without tbrsize, the altq definition is OK.
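
For comparison, a minimal variant that should parse (a sketch only, assuming
tbrsize takes a plain byte count with no 'K' suffix; the bandwidth figures are
the ones from above):

altq on em0 cbq bandwidth 1Gb tbrsize 4096 queue { q_lan }
queue q_lan bandwidth 950Mb cbq (default)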

The problem also exists for Broadcom cards (bge), but unfortunately the developers
don't have enough time to look into it more deeply.



best regards,
RLW



Re: network bandwith with em(4)

2011-02-22 Thread BSD

On 02/22/11 11:19, Mark Nipper wrote:

On 22 Feb 2011, Patrick Lamaiziere wrote:

The problem is that we don't get more than ~320 Mbits/s of bandwith
beetween the internal networks and internet (gigabit).

Have you already looked at:
---
https://calomel.org/network_performance.html



Henning Brauer has some very interesting thoughts about the content of that
particular page. Recent changes in the network stack make those sysctl settings
useless.

-luis



Re: network bandwith with em(4)

2011-02-22 Thread James A. Peltier
Those documents do not necessarily apply any more.  Don't go tweaking knobs
until you know what they do.  We have machines here that transfer nearly a
gigabit of traffic per second without tuning, in bridge mode nonetheless.

Are you seeing any packet congestion markers (counter congestion) in systat pf? 
 If so you might not have sufficient states available

What about fragmentation?

Interface errors?

There are many other non-tweakable issues that could cause this.

- Original Message -
| Le Tue, 22 Feb 2011 11:19:26 -0600,
| Mark Nipper ni...@bitgnome.net a icrit :
| 
|   The problem is that we don't get more than ~320 Mbits/s of
|   bandwith
|   beetween the internal networks and internet (gigabit).
| 
|  Have you already looked at:
|  ---
|  https://calomel.org/network_performance.html
| 
| Yes thanks. I've already increase the size of the
| net.inet.ip.ifq.maxlen.
| 
| But I don't see the point of these tunings for a firewall. IMHO, it
| could help for a host handling tcp/udp connection.
| 
| Anyway, I've tried, that does not change anything and I don't think it
| should.
| 
| I'm not a network expert, I could be wrong. Let see:
| ## Calomel.org OpenBSD /etc/sysctl.conf
| ##
| kern.maxclusters=128000 # Cluster allocation limit
| 
| = netstat -m reports a peak of *only* 2500 mbufs used.
| 
| net.inet.ip.mtudisc=0 # TCP MTU (Maximum Transmission Unit)
| 
| = still at 1. I don't use scrub in pf or mss clamping.
| 
| net.inet.tcp.ackonpush=1 # acks for packets with the push bit
| 
| = only one TCP connection on the firewall (ssh).
| 
| net.inet.tcp.ecn=1 # Explicit Congestion Notification enabled
| 
| net.inet.tcp.mssdflt=1472 # maximum segment size (1472 from scrub
| pf.conf)
| 
| = same here, I guess the default mss is for connections from the
| machine. tcpdump shows that the mss is negociated around 1450. Looks
| good.
| 
| net.inet.tcp.recvspace=262144 # Increase TCP recieve windows size
| to increase performance
| 
| = same, no tcp nor udp...
| 
| I'm wrong?
| 
| Thanks, regards.

-- 
James A. Peltier
IT Services - Research Computing Group
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax : 778-782-3045
E-Mail  : jpelt...@sfu.ca
Website : http://www.sfu.ca/itservices
  http://blogs.sfu.ca/people/jpeltier



Re: network bandwith with em(4)

2011-02-22 Thread Ted Unangst
On Tue, Feb 22, 2011 at 1:06 PM, Patrick Lamaiziere
patf...@davenulle.org wrote:
 https://calomel.org/network_performance.html

 Yes thanks. I've already increase the size of the
 net.inet.ip.ifq.maxlen.

 But I don't see the point of these tunings for a firewall. IMHO, it
 could help for a host handling tcp/udp connection.

Wow, you're like the first person ever to realize that.  I'm serious.
I wish more people would at least try to think about what they're
doing before they go twisting every dial they can find because the
internet said so.

Sorry I can't give you much useful help, but ignoring the calomel crap
is a great start.



Re: network bandwith with em(4)

2011-02-22 Thread Christiano F. Haesbaert
On 22 February 2011 14:09, Patrick Lamaiziere patf...@davenulle.org wrote:
 (4.8/amd64)

 Hello,

 I'm using two ethernet cards Intel 1000/PRO quad ports (gigabit) on a
 firewall (one fiber and one copper).

 The problem is that we don't get more than ~320 Mbits/s of bandwith
 beetween the internal networks and internet (gigabit).

 As far I can see, on load there is a number of Ierr on the interface
 connected to Internet (between 1% to 5%).

 Also the interrupt rate on this card is around ~7500 (using systat). In
 the em(4) driver, there is a limitation of the interrupt rate at 8000/s.

 if_em.h
 /*
  * MAX_INTS_PER_SEC (ITR - Interrupt Throttle Register)
  * The Interrupt Throttle Register (ITR) limits the delivery of
 interrupts
  * to a reasonable rate by providing a guaranteed inter-interrupt delay
  * between interrupts asserted by the Ethernet controller.
  */
 #define MAX_INTS_PER_SEC8000

 Do you think I can increase this value? The interrupt rate of the
 machine is at max ~60% (top).

 Other ideas to increase the bandwith would be welcome too. I don't
 think the limitation come from PF because I don't see any congestion.

 thanks, regards.

 --
 dmesg:
 em0 at pci5 dev 0 function 0 Intel PRO/1000 QP (82571EB) rev
 0x06: apic 1 int 13 (irq 14), address 00:15:17:ed:98:9d

 em4 at pci9 dev 0 function 0 Intel PRO/1000 QP (82575GB) rev 0x02:
 apic 1 int 23 (irq 11), address 00:1b:21:38:e0:80



How exactly are you measuring the bandwidth?

What does tcpbench tell you?
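
For reference, a minimal tcpbench(1) run from the OpenBSD base system (a
sketch; 192.0.2.1 is just a placeholder address for the receiving side):

tcpbench -s              # on the receiver: listen and report throughput
tcpbench 192.0.2.1       # on the sender: connect and print per-second throughput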