The Ifail and Ofail columns are a sum of queue drops and errors. Could you run that netstat command with -d and -e so we can see the drops and errors separately?
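For reference, the split counters asked for above could be collected with something like the sketch below. It assumes OpenBSD's netstat(1), where -d and -e limit the interface display to drops and errors respectively. The helper name split_counters is made up for illustration and is not part of any tool.

```sh
#!/bin/sh
# Hypothetical helper; the name is illustrative and not from the thread.
# Assumption: OpenBSD netstat(1), where with -i the -d flag shows
# dropped packets and -e shows errors, instead of the combined
# Ifail/Ofail columns.
split_counters() {
	echo "== drops =="
	netstat -i -d
	echo "== errors =="
	netstat -i -e
}
```

Calling split_counters on the firewall shortly after a latency spike should show whether the Ofail counts on em1 and em2 are made up of queue drops or of errors.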
Cheers,
dlg

> On 11 Jun 2020, at 2:21 pm, Gabri Tofano <[email protected]> wrote:
>
> After extensive testing the latency spikes have shown up again:
>
> To the inside interface of the firewall:
>
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time=132ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>
> And to the firewall's next hop (ISP ONT) at the same time:
>
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=242ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
>
> Interface errors are now showing up just on the output:
>
> # netstat -i
> Name    Mtu   Network     Address             Ipkts Ifail   Opkts Ofail Colls
> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   22655     0   41589     0     0
> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   22655     0   41589     0     0
> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   39924     0   20476     1     0
> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   39924     0   20476     1     0
> em2     1500  <Link>      XX:XX:XX:XX:XX:XX     427     0     330     2     0
> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX     427     0     330     2     0
> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX       0     0       0     0     0
> enc0*   0     <Link>                              0     0       0     0     0
> pflog0  33136 <Link>                              0     0    1294     0     0
>
> UDP real-time traffic is the most affected, as it is very sensitive, and I
> keep having spikes while playing online.
>
> Thank you!
> Gabri
>
> On 2020-06-10 22:50, Gabri Tofano wrote:
>> Another user pointed out to me that in the OpenBSD 6.7 release notes
>> there is a statement regarding the em(4) driver: "Improvements in
>> the em(4) driver." and so I gave it a try and reinstalled with
>> OpenBSD 6.6. It looks like the system is now stable and latency
>> spikes/interface errors are not present at all even under heavy
>> traffic loads. I am not sure what introduced the issue but maybe one
>> of the devs can give it a look?
>> Thank you!
>> Gabri
>> On 2020-06-09 13:01, Gabri Tofano wrote:
>>> Hi all,
>>> I'm using a "Protectli FW1" with FreeBSD 12.1 amd64 as a firewall
>>> which is serving me with great performance and no issues at all. The
>>> appliance has 4 Intel Gigabit 82583V Ethernet NIC ports which are
>>> working very well. I have used PFsense as well prior to FreeBSD and it
>>> worked without issues too.
>>> I took the decision to move to OpenBSD 6.7 amd64 in order to benefit
>>> from the latest pf (and other) features but unfortunately the OS is
>>> giving me an issue which I guess is related to the NIC drivers. When I
>>> was connected via ssh I felt some glitches while I was
>>> typing/moving around with the editor, so I started to ping the inside
>>> interface from a wired connected pc and found out that from time to time
>>> the appliance is responding with a 100+/200+ ms response (I have cut
>>> some 1ms replies to make it shorter):
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=163ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=2ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=3ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=43ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=4ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=257ms TTL=254
>>> With FreeBSD 12.1 it is steady at <1/1ms all the time, even under load.
>>> Looking at the interface statistics on OpenBSD I found out that
>>> inbound/outbound errors are present (this was taken a few
>>> minutes after a reinstall to test it again):
>>> # netstat -i
>>> Name    Mtu   Network     Address             Ipkts Ifail   Opkts Ofail Colls
>>> em0     1500  <Link>      xx:xx:xx:xx:xx:xx 1317600  2351  466114     0     0
>>> em0     1500  74.215.235/ xxx.xxx.xxx.xxx   1317600  2351  466114     0     0
>>> em1     1500  <Link>      xx:xx:xx:xx:xx:xx  392782    18 1199871     1     0
>>> em1     1500  172.16.200. 172.16.200.1       392782    18 1199871     1     0
>>> em2     1500  <Link>      xx:xx:xx:xx:xx:xx     156     0      55     1     0
>>> em2     1500  172.16.103/ 172.16.103.254        156     0      55     1     0
>>> em3*    1500  <Link>      xx:xx:xx:xx:xx:xx       0     0       0     0     0
>>> enc0*   0     <Link>                              0     0       0     0     0
>>> pflog0  33136 <Link>                              0     0       0     0     0
>>> Looking at the Cisco 3560G where the ports are connected there are no
>>> errors at all. I have also double-checked the drivers, and the firmware
>>> installed by fw_update is the following:
>>> vmm-firmware-1.11.0p2
>>> inteldrm-firmware-20181218
>>> intel-firmware-20200508v0
>>> I have done multiple reinstalls with different OSes to make sure that
>>> this is related to OpenBSD 6.7 itself and found the following:
>>> PFsense 2.4.5: no issues at all
>>> FreeBSD 12.1: no issues at all
>>> OPNsense: interface errors
>>> OpenBSD: interface errors and interface latency spikes
>>> I have also swapped the ethernet cables and contacted Protectli which
>>> has confirmed that this appliance has been tested on OpenBSD (it looks
>>> like 6.3).
>>> Here is the dmesg output:
>>> OpenBSD 6.7 (GENERIC.MP) #2: Thu Jun 4 09:55:08 MDT 2020
>>> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>>> real mem = 4163854336 (3970MB)
>>> avail mem = 4025044992 (3838MB)
>>> mpath0 at root
>>> scsibus0 at mpath0: 256 targets
>>> mainbus0 at root
>>> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xecea0 (51 entries)
>>> bios0: vendor American Megatrends Inc. version "5.6.5" date 10/24/2018
>>> bios0: Protectli FW1
>>> acpi0 at bios0: ACPI 5.0
>>> acpi0: sleep states S0 S3 S4 S5
>>> acpi0: tables DSDT FACP APIC FPDT FIDT MCFG LPIT HPET SSDT SSDT SSDT UEFI
>>> acpi0: wakeup devices PS2K(S3) PS2M(S3) XHC1(S4) RP01(S4) PXSX(S4)
>>> RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) BRCM(S0)
>>> acpitimer0 at acpi0: 3579545 Hz, 24 bits
>>> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
>>> cpu0 at mainbus0: apid 0 (boot processor)
>>> cpu0: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz, 2000.47 MHz, 06-37-09
>>> cpu0:
>>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,TSC_ADJUST,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,SENSOR,ARAT,MELTDOWN
>>> cpu0: 1MB 64b/line 16-way L2 cache
>>> cpu0: smt 0, core 0, package 0
>>> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
>>> cpu0: apic clock running at 83MHz
>>> cpu0: mwait min=64, max=64, C-substates=0.2.0.0.0.0.3.3, IBE
>>> cpu1 at mainbus0: apid 2 (application processor)
>>> cpu1: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz, 2000.01 MHz, 06-37-09
>>> cpu1:
>>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,TSC_ADJUST,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,SENSOR,ARAT,MELTDOWN
>>> cpu1: 1MB 64b/line 16-way L2 cache
>>> cpu1: smt 0, core 1, package 0
>>> cpu2 at mainbus0: apid 4 (application processor)
>>> cpu2: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz, 2000.03 MHz, 06-37-09
>>> cpu2:
>>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,TSC_ADJUST,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,SENSOR,ARAT,MELTDOWN
>>> cpu2: 1MB 64b/line 16-way L2 cache
>>> cpu2: smt 0, core 2, package 0
>>> cpu3 at mainbus0: apid 6 (application processor)
>>> cpu3: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz, 2000.01 MHz, 06-37-09
>>> cpu3:
>>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,TSC_ADJUST,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,SENSOR,ARAT,MELTDOWN
>>> cpu3: 1MB 64b/line 16-way L2 cache
>>> cpu3: smt 0, core 3, package 0
>>> ioapic0 at mainbus0: apid 1 pa 0xfec00000, version 20, 87 pins
>>> acpimcfg0 at acpi0
>>> acpimcfg0: addr 0xe0000000, bus 0-255
>>> acpihpet0 at acpi0: 14318179 Hz
>>> acpiprt0 at acpi0: bus 0 (PCI0)
>>> acpiprt1 at acpi0: bus 1 (RP01)
>>> acpiprt2 at acpi0: bus 2 (RP02)
>>> acpiprt3 at acpi0: bus 3 (RP03)
>>> acpiprt4 at acpi0: bus 4 (RP04)
>>> acpiec0 at acpi0: not present
>>> acpicpu0 at acpi0: C3(10@1500 mwait.1@0x52), C2(10@500 mwait.1@0x51),
>>> C1(1000@1 mwait.1), PSS
>>> acpicpu1 at acpi0: C3(10@1500 mwait.1@0x52), C2(10@500 mwait.1@0x51),
>>> C1(1000@1 mwait.1), PSS
>>> acpicpu2 at acpi0: C3(10@1500 mwait.1@0x52), C2(10@500 mwait.1@0x51),
>>> C1(1000@1 mwait.1), PSS
>>> acpicpu3 at acpi0: C3(10@1500 mwait.1@0x52), C2(10@500 mwait.1@0x51),
>>> C1(1000@1 mwait.1), PSS
>>> acpipwrres0 at acpi0: PLPE
>>> acpipwrres1 at acpi0: PLPE
>>> acpipwrres2 at acpi0: USBC, resource for EHC1, OTG1
>>> acpipwrres3 at acpi0: CLK0, resource for CAM1
>>> acpipwrres4 at acpi0: CLK1, resource for CAM0, CAM2
>>> acpicmos0 at acpi0
>>> acpipci0 at acpi0 PCI0: 0x00000010 0x00000011 0x00000000
>>> "DMA0F28" at acpi0 not configured
>>> acpibtn0 at acpi0: SLPB
>>> "BCM2E1A" at acpi0 not configured
>>> "BCM4752" at acpi0 not configured
>>> "INTCF0B" at acpi0 not configured
>>> "INTCF1A" at acpi0 not configured
>>> "INTCF1C" at acpi0 not configured
>>> "SMO91D0" at acpi0 not configured
>>> "ATML1000" at acpi0 not configured
>>> "ATML2000" at acpi0 not configured
>>> "INT33BD" at acpi0 not configured
>>> acpivideo0 at acpi0: GFX0
>>> acpivout0 at acpivideo0: DD1F
>>> cpu0: using VERW MDS workaround
>>> cpu0: Enhanced SpeedStep 2000 MHz: speeds: 1993, 1992, 1909, 1826,
>>> 1743, 1660, 1577, 1494, 1411, 1328 MHz
>>> pci0 at mainbus0 bus 0
>>> pchb0 at pci0 dev 0 function 0 "Intel Bay Trail Host" rev 0x11
>>> inteldrm0 at pci0 dev 2 function 0 "Intel Bay Trail Video" rev 0x11
>>> drm0 at inteldrm0
>>> inteldrm0: msi, VALLEYVIEW, gen 7
>>> ahci0 at pci0 dev 19 function 0 "Intel Bay Trail AHCI" rev 0x11: msi, AHCI
>>> 1.3
>>> ahci0: port 0: 3.0Gb/s
>>> scsibus1 at ahci0: 32 targets
>>> sd0 at scsibus1 targ 0 lun 0: <ATA, Hoodisk SSD, SBFM> naa.0000000000000000
>>> sd0: 15272MB, 512 bytes/sector, 31277232 sectors, thin
>>> xhci0 at pci0 dev 20 function 0 "Intel Bay Trail xHCI" rev 0x11: msi, xHCI
>>> 1.0
>>> usb0 at xhci0: USB revision 3.0
>>> uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev
>>> 3.00/1.00 addr 1
>>> "Intel Bay Trail TXE" rev 0x11 at pci0 dev 26 function 0 not configured
>>> azalia0 at pci0 dev 27 function 0 "Intel Bay Trail HD Audio" rev 0x11: msi
>>> azalia0: no supported codecs
>>> ppb0 at pci0 dev 28 function 0 "Intel Bay Trail PCIE" rev 0x11: msi
>>> pci1 at ppb0 bus 1
>>> em0 at pci1 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address
>>> xx:xx:xx:xx:xx:xx
>>> ppb1 at pci0 dev 28 function 1 "Intel Bay Trail PCIE" rev 0x11: msi
>>> pci2 at ppb1 bus 2
>>> em1 at pci2 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address
>>> xx:xx:xx:xx:xx:xx
>>> ppb2 at pci0 dev 28 function 2 "Intel Bay Trail PCIE" rev 0x11: msi
>>> pci3 at ppb2 bus 3
>>> em2 at pci3 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address
>>> xx:xx:xx:xx:xx:xx
>>> ppb3 at pci0 dev 28 function 3 "Intel Bay Trail PCIE" rev 0x11: msi
>>> pci4 at ppb3 bus 4
>>> em3 at pci4 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address
>>> xx:xx:xx:xx:xx:xx
>>> pcib0 at pci0 dev 31 function 0 "Intel Bay Trail LPC" rev 0x11
>>> ichiic0 at pci0 dev 31 function 3 "Intel Bay Trail SMBus" rev 0x11:
>>> apic 1 int 18
>>> iic0 at ichiic0
>>> spdmem0 at iic0 addr 0x50: 4GB DDR3 SDRAM PC3-12800 SO-DIMM
>>> isa0 at pcib0
>>> isadma0 at isa0
>>> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
>>> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
>>> pckbd0 at pckbc0 (kbd slot)
>>> wskbd0 at pckbd0: console keyboard
>>> pcppi0 at isa0 port 0x61
>>> spkr0 at pcppi0
>>> it0 at isa0 port 0x2e/2: IT8772F rev 1, EC port 0xa40
>>> vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation)
>>> vscsi0 at root
>>> scsibus2 at vscsi0: 256 targets
>>> softraid0 at root
>>> scsibus3 at softraid0: 256 targets
>>> root on sd0a (78fa67e12e212447.a) swap on sd0b dump on sd0b
>>> inteldrm0: 1024x768, 32bpp
>>> wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
>>> wsdisplay0: screen 1-5 added (std, vt100 emulation)
>>> Any clue what the issue could be? I had a tip from another user
>>> that it might be related to msi-x.
>>> Thanks!
>>> Gabri
