Re: poor routing/nat performance
Dear David,

According to my experience, the IPv4/IPv6 packet forwarding performance of OpenBSD is about an order of magnitude lower than that of Linux, if I use a 16-core server. When I tried to identify the root causes, I found two things:

1. I used an RFC 2544 compliant test with a single IP address pair and RFC 4814 pseudorandom port numbers. However, the interrupts caused by the packet arrivals were processed by two CPU cores (one core per direction), the others did not take part in it. It is so because OpenBSD does not support the setting of the proper RSS (Receive-Side Scaling), please see the details in:
https://marc.info/?l=openbsd-misc&m=166581934723445&w=2

If you forward Internet traffic, then you have different IP addresses, thus this one will not be an issue for you.

2. When I checked the CPU utilization using the top command, I found that only 3 CPU cores (out of the 32 CPU cores of my server) had non-zero load: two of them processed interrupts and had about 25-27% CPU utilization, and very likely the third one did the packet forwarding and it had about 90-95% CPU utilization in my particular experiment. That is, very likely the packet forwarding process can use only a single CPU core.
I have saved the output of the top command, now I copy it here:

36 processes: 35 idle, 1 on processor up 0:12
CPU00 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU01 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
*CPU02 states: 0.0% user, 0.0% nice, 93.8% sys, 6.2% spin, 0.0% intr, 0.0% idle*
CPU03 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU04 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU05 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU06 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU07 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU08 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
*CPU09 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 25.0% intr, 75.0% idle*
CPU10 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU11 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU12 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU13 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU14 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU15 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU16 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU17 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU18 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU19 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU20 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU21 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU22 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU23 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU24 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
*CPU25 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 26.7% intr, 73.3% idle*
CPU26 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU27 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU28 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU29 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU30 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
CPU31 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0% intr, 100% idle
Memory: Real: 32M/1397M act/tot Free: 371G Cache: 712M Swap: 0K/256M

As you can see, I made the lines with non-zero CPU utilization *bold*.

I expect that this issue will be a problem for you, too: the packet forwarding performance of your OpenBSD system will not scale up with the number of CPU cores.

Best regards,
Gábor

On 12/19/2022 5:35 PM, David Hajes wrote:
> hi guys,
>
> I have simple PcEngines APU2 router running latest OpenBSD stable.
>
> em0 is WAN (bridge to CaTV modem with 1Gbps/100Mbps connectivity with normal
> ether connectivity with DHCP...no special stuff like PPPoE)
>
> em1-3 is in vether/bridge mode with NAT routing to local network.
>
> I have complained to ISP about speeds because it supposes to run almost 1Gbps.
>
> results (speedtest.net used by ISP for some reason):
>
> 800+/85 Mbps measured by ISP technician directly from CaTV modem.
> 440MBps/85Mbps simple NAT firewall pf.conf based on OpenBSD suggestions
> 380/80Mbps with my
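
The first point in the message above (a single IP address pair keeps all packets of one direction on one receive queue, however the ports vary) can be illustrated with a minimal Python sketch. This is not how OpenBSD or any particular NIC computes RSS: real hardware typically uses a Toeplitz hash, and zlib.crc32 is only a stand-in here. The test addresses, the port ranges, the queue count and the function names are all assumptions made for illustration.

```python
import random
import zlib


def queue_for(fields, n_queues):
    """Map a tuple of header fields to a receive queue.

    CRC32 is a stand-in for the NIC's real hash function (an assumption).
    """
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % n_queues


def queues_used(n_flows, n_queues, include_ports, seed=1):
    """Return the set of queues hit by n_flows flows between one IP pair."""
    rng = random.Random(seed)
    src_ip, dst_ip = "198.18.0.2", "198.19.0.2"  # hypothetical test addresses
    used = set()
    for _ in range(n_flows):
        sport = rng.randint(1024, 65535)  # illustrative pseudorandom ports
        dport = rng.randint(1, 49151)
        if include_ports:
            fields = (src_ip, dst_ip, sport, dport)
        else:
            fields = (src_ip, dst_ip)  # ports ignored by the hash
        used.add(queue_for(fields, n_queues))
    return used


addr_only = queues_used(10000, 16, include_ports=False)
four_tuple = queues_used(10000, 16, include_ports=True)
print("queues used, hash on addresses only:", len(addr_only))
print("queues used, hash on addresses and ports:", len(four_tuple))
```

When the hash covers only the two addresses, every packet in a given direction produces the same key and therefore the same queue, which matches the "one core per direction" observation. Once the ports are part of the key, the same traffic spreads over several queues.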
Re: poor routing/nat performance
With 7.2 on the APU 2 when I tested it was about 650 or so. I didn't send the info as it is not connected now.

But either way, you can't get Gb speed on it no matter what.

On 12/19/22 2:43 PM, Stuart Henderson wrote:
> On 2022-12-19, Daniel Ouellet wrote:
>> OpenBSD 6.8 (GENERIC.MP) #4: Thu Aug 5 11:02:18 MDT 2021
>
> This is too old for a good comparison, many improvements have been made since then.
Re: poor routing/nat performance
On 2022-12-19, Daniel Ouellet wrote:
> OpenBSD 6.8 (GENERIC.MP) #4: Thu Aug 5 11:02:18 MDT 2021

This is too old for a good comparison, many improvements have been made since then.
Re: poor routing/nat performance
I have the APU 1 and here is what I get

TEST_DATE         TIME_ZONE  DOWNLOAD_MEGABITS  UPLOAD_MEGABITS
12/19/2022 11:52  GMT        429.05             422.17

LATENCY_MS  SERVER_NAME  DISTANCE_MILES  CONNECTION_MODE  SERVER_COUNT
3           Ashburn VA   0               multi            4

I haven't tested with the APU 2 that I have, but with NAT I don't think you can get the full 1Gb speed.

I have 1Gb symmetric line and with NAT I can't come close to the full line speed.

OpenBSD 6.8 (GENERIC.MP) #4: Thu Aug 5 11:02:18 MDT 2021
    t...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4246003712 (4049MB)
avail mem = 4102266880 (3912MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xdf16d820 (7 entries)
bios0: vendor coreboot version "4.0" date 09/08/2014
bios0: PC Engines APU
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP SPCR HPET APIC HEST SSDT SSDT SSDT
acpi0: wakeup devices AGPB(S4) HDMI(S4) PBR4(S4) PBR5(S4) PBR6(S4) PBR7(S4) PE20(S4) PE21(S4) PE22(S4) PE23(S4) PIBR(S4) UOH1(S3) UOH2(S3) UOH3(S3) UOH4(S3) UOH5(S3) [...]
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD G-T40E Processor, 1000.13 MHz, 14-02-00
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 199MHz
cpu0: mwait min=64, max=64, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD G-T40E Processor, 1000.01 MHz, 14-02-00
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC
cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache
cpu1: 8 4MB entries fully associative
cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 21, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (AGPB)
acpiprt2 at acpi0: bus -1 (HDMI)
acpiprt3 at acpi0: bus 1 (PBR4)
acpiprt4 at acpi0: bus 2 (PBR5)
acpiprt5 at acpi0: bus 3 (PBR6)
acpiprt6 at acpi0: bus -1 (PBR7)
acpiprt7 at acpi0: bus 5 (PE20)
acpiprt8 at acpi0: bus -1 (PE21)
acpiprt9 at acpi0: bus -1 (PE22)
acpiprt10 at acpi0: bus -1 (PE23)
acpiprt11 at acpi0: bus 4 (PIBR)
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
acpibtn0 at acpi0: PWRB
acpicpu0 at acpi0: C2(0@100 io@0x841), C1(@1 halt!), PSS
acpicpu1 at acpi0: C2(0@100 io@0x841), C1(@1 halt!), PSS
cpu0: 1000 MHz: speeds: 1000 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "AMD 14h Host" rev 0x00
ppb0 at pci0 dev 4 function 0 "AMD 14h PCIE" rev 0x00: msi
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 "Realtek 8168" rev 0x06: RTL8168E/8111E (0x2c00), msi, address 00:0d:b9:3e:d5:5c
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 PHY, rev. 4
ppb1 at pci0 dev 5 function 0 "AMD 14h PCIE" rev 0x00: msi
pci2 at ppb1 bus 2
re1 at pci2 dev 0 function 0 "Realtek 8168" rev 0x06: RTL8168E/8111E (0x2c00), msi, address 00:0d:b9:3e:d5:5d
rgephy1 at re1 phy 7: RTL8169S/8110S/8211 PHY, rev. 4
ppb2 at pci0 dev 6 function 0 "AMD 14h PCIE" rev 0x00: msi
pci3 at ppb2 bus 3
re2 at pci3 dev 0 function 0 "Realtek 8168" rev 0x06: RTL8168E/8111E (0x2c00), msi, address 00:0d:b9:3e:d5:5e
rgephy2 at re2 phy 7: RTL8169S/8110S/8211 PHY, rev. 4
ahci0 at pci0 dev 17 function 0 "ATI SBx00 SATA" rev 0x40: apic 2 int 19, AHCI 1.2
ahci0: port 0: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: naa.5000
sd0: 15272MB, 512 bytes/sector, 31277232 sectors, thin
ohci0 at pci0 dev 18 function 0 "ATI SB700 USB" rev 0x00: apic 2 int 18, version 1.0, legacy support
ehci0 at pci0 dev 18 function 2 "ATI SB700 USB2" rev 0x00: apic 2 int 17
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "ATI EHCI root hub" rev 2.00/1.00 addr 1
ohci1 at pci0 dev 19 function 0 "ATI SB700 USB" rev 0x00: apic 2 int 18, version 1.0, legacy support
ehci1 at pci0 dev 19 function 2 "ATI SB700 USB2" rev 0x00: apic 2 int 17
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "ATI EHCI root hub" rev
Re: poor routing/nat performance
On 2022-12-19, David Hajes wrote:
> hi guys,
>
> I have simple PcEngines APU2 router running latest OpenBSD stable.
>
> em0 is WAN (bridge to CaTV modem with 1Gbps/100Mbps connectivity with normal
> ether connectivity with DHCP...no special stuff like PPPoE)
>
> em1-3 is in vether/bridge mode with NAT routing to local network.
>
> I have complained to ISP about speeds because it supposes to run almost 1Gbps.
>
> results (speedtest.net used by ISP for some reason):
>
> 800+/85 Mbps measured by ISP technician directly from CaTV modem.
> 440MBps/85Mbps simple NAT firewall pf.conf based on OpenBSD suggestions
> 380/80Mbps with my strict firewall rules

APU2 is not particularly powerful. When running OpenBSD on it, it's pretty much OK for VDSL type speeds / 100M leased line / some slower FTTP. It's not really gigabit-capable.

You aren't helping by making further requirements on the CPU by using it as a bridge as well; you might help things a bit by removing the bridge configuration on em1-3 and just have a standard single em1 interface, connect a real ethernet switch for additional machines. But I don't think that will get you really close to gigabit - maybe you can find another 100-150M or so, but that's probably about it.

You will see better speeds from the APU hardware with other OS, though full gigabit with anything complex in terms of packet filtering is still a bit unlikely. If you want full gigabit from OpenBSD you'll need faster hw.

> I have used following guide
> http://dant.net.ru/calomel/network_performance.html No changes, same
> performance.

That guide is often quoted (though fortunately not quite so often these days). But it's fairly useless.

--
Please keep replies on the mailing list.
Re: poor routing/nat performance
On Mon, Dec 19, 2022, at 10:35 AM, David Hajes wrote:
> I am guessing HW is not issue.

I would not be totally sure on that. The CPU in the APU2 is pretty slow. While you can no doubt find some tweaked Linux or FreeBSD configurations that push it close to wire speed, the best I've ever been able to accomplish with the simplest pf.conf and forwarding between em0-em1 is 500-550 Mbps sustained, with occasional bursts to 600 Mbps. Research indicates others have had similar experiences.

If you check the misc@ list archive, you'll find a bunch of threads with people looking for inexpensive alternatives to the APU2+ platform, and there are plenty in the $100-200 USD range for amd64. Most of my APU2s have been retired to terminal/console server duty.

> CPU bored, max. load 25%

It sounds like 1 of your 4 cores is maxed, which would not be surprising.

Brian Conway
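
The "1 of your 4 cores is maxed" reading above is simple arithmetic: a load figure averaged over all cores can look low while one core is saturated. A small Python sketch of that, assuming a 4-core box with one fully busy core (a hypothetical split, not a measurement), next to the busy percentages from the 32-CPU top output posted elsewhere in this thread. The function name is made up for illustration.

```python
def average_load(per_core_busy):
    """Average busy percentage across all cores, as one summary figure."""
    return sum(per_core_busy) / len(per_core_busy)


# Hypothetical 4-core router: one core saturated, three idle.
four_core_guess = [100.0, 0.0, 0.0, 0.0]

# Busy percentages (100 - idle) from the 32-CPU top output in this thread:
# CPU02 0.0% idle, CPU09 75.0% idle, CPU25 73.3% idle, all others 100% idle.
server = [0.0] * 32
server[2], server[9], server[25] = 100.0, 25.0, 26.7

for name, cores in (("4-core guess", four_core_guess), ("32-CPU server", server)):
    print(f"{name}: average {average_load(cores):.1f}%, "
          f"busiest core {max(cores):.1f}%")
```

In both cases the busiest core is at 100% while the average stays at 25% or below, so per-core figures (for example from top) are the more useful thing to look at when forwarding is handled by a single core.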
Re: poor routing/nat performance
On 19.12.2022. 17:35, David Hajes wrote:
> hi guys,
>
> I have simple PcEngines APU2 router running latest OpenBSD stable.
>
> em0 is WAN (bridge to CaTV modem with 1Gbps/100Mbps connectivity with normal
> ether connectivity with DHCP...no special stuff like PPPoE)
>
> em1-3 is in vether/bridge mode with NAT routing to local network.
>
> I have complained to ISP about speeds because it supposes to run almost 1Gbps.
>
> results (speedtest.net used by ISP for some reason):
>
> 800+/85 Mbps measured by ISP technician directly from CaTV modem.
> 440MBps/85Mbps simple NAT firewall pf.conf based on OpenBSD suggestions
> 380/80Mbps with my strict firewall rules
>
> I have used following guide
> http://dant.net.ru/calomel/network_performance.html No changes, same
> performance.
>
> Checking out router monitoring
>
> 3k packets/s firewall throughput
> pf_states lookup max. 12k/s, ~2k/s
> CPU bored, max. load 25%
> RAM 2.6 GB from 4GB free, swap never used
>
> I am guessing HW is not issue.
>
> Is there any issues with bridging local interfaces, and routing/NAT
> performance, please?
>
> I tried to Google answers, and there is lots of whining but no real info. It
> supposes to run double speed, at least 800Mbps as shown by ISP technicians.
>
> Any suggestions for bottleneck, please?
>

Could you try veb(4) instead of bridge(4)? Bridge is quite slow.
https://undeadly.org/cgi?action=article;sid=20220319123157
poor routing/nat performance
hi guys,

I have simple PcEngines APU2 router running latest OpenBSD stable.

em0 is WAN (bridge to CaTV modem with 1Gbps/100Mbps connectivity with normal ether connectivity with DHCP...no special stuff like PPPoE)

em1-3 is in vether/bridge mode with NAT routing to local network.

I have complained to ISP about speeds because it supposes to run almost 1Gbps.

results (speedtest.net used by ISP for some reason):

800+/85 Mbps measured by ISP technician directly from CaTV modem.
440MBps/85Mbps simple NAT firewall pf.conf based on OpenBSD suggestions
380/80Mbps with my strict firewall rules

I have used following guide http://dant.net.ru/calomel/network_performance.html No changes, same performance.

Checking out router monitoring

3k packets/s firewall throughput
pf_states lookup max. 12k/s, ~2k/s
CPU bored, max. load 25%
RAM 2.6 GB from 4GB free, swap never used

I am guessing HW is not issue.

Is there any issues with bridging local interfaces, and routing/NAT performance, please?

I tried to Google answers, and there is lots of whining but no real info. It supposes to run double speed, at least 800Mbps as shown by ISP technicians.

Any suggestions for bottleneck, please?

Regards
DavidH

Sent with [Proton Mail](https://proton.me/) secure email.
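
The monitoring figures in the post above are in packets per second while the speed tests are in Mbps. A rough way to relate the two is the Ethernet frame-rate calculation below, a sketch only: it assumes untagged Ethernet frames, counts 20 bytes of per-frame wire overhead (preamble plus inter-frame gap), and treats a speed-test result as if it were the wire rate, which it is not exactly. The function and constant names are made up for illustration.

```python
WIRE_OVERHEAD_BYTES = 20  # 8 bytes preamble/SFD + 12 bytes inter-frame gap


def frames_per_second(link_bps, frame_bytes):
    """Upper bound on the Ethernet frame rate for a bit rate and frame size.

    frame_bytes counts the frame from destination MAC through FCS
    (64 to 1518 bytes for untagged Ethernet).
    """
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


for mbps in (1000, 440):
    for size in (1518, 64):
        fps = frames_per_second(mbps * 1_000_000, size)
        print(f"{mbps} Mbps, {size}-byte frames: about {fps:,.0f} frames/s")
```

With full-size frames, a few hundred Mbps already corresponds to tens of thousands of frames per second, and small frames push the rate far higher. Since forwarding cost is mostly per packet, the packet rate is usually the more telling number when comparing routers.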