Re: 7.3 regression: high network latency every 12 seconds on all interfaces

2023-05-01 Thread Jonathan Matthew
On Sat, Apr 29, 2023 at 07:32:27AM +0200, Harald Dunkel wrote:
> >Synopsis:    7.3 regression: high network latency every 12 seconds on all interfaces
> >Category:    network
> >Environment:
>   System  : OpenBSD 7.3
>   Details : OpenBSD 7.3 (GENERIC.MP) #1125: Sat Mar 25 10:36:29 MDT 2023
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
>   Since the upgrade to 7.3 of a HA gateway ("redgatea" and "redgateb", one
> external network, 2 internal networks, carp on all interfaces) I see a
> high network latency for incoming network traffic every 12 seconds.
> dmesg:
 ...
> inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics" rev 0x35
> drm0 at inteldrm0
> inteldrm0: msi, CHERRYVIEW, gen 8

This generation often has problems with HDMI detection polling, which causes
latency spikes for network traffic and everything else.
Can you plug in a monitor (or a headless HDMI plug), or disable inteldrm?



7.3 regression: high network latency every 12 seconds on all interfaces

2023-04-28 Thread Harald Dunkel
>Synopsis:      7.3 regression: high network latency every 12 seconds on all interfaces
>Category:  network
>Environment:
System  : OpenBSD 7.3
Details : OpenBSD 7.3 (GENERIC.MP) #1125: Sat Mar 25 10:36:29 MDT 2023
 
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
Since the upgrade to 7.3 of a HA gateway ("redgatea" and "redgateb", one
external network, 2 internal networks, carp on all interfaces) I see a
high network latency for incoming network traffic every 12 seconds.
Trying to ping redgatea from redgateb over the pfsync interface, for
example:

redgateb # ping 192.168.23.2
PING 192.168.23.2 (192.168.23.2): 56 data bytes
64 bytes from 192.168.23.2: icmp_seq=0 ttl=255 time=0.585 ms
64 bytes from 192.168.23.2: icmp_seq=1 ttl=255 time=48.559 ms
64 bytes from 192.168.23.2: icmp_seq=2 ttl=255 time=153.323 ms
64 bytes from 192.168.23.2: icmp_seq=3 ttl=255 time=0.233 ms
64 bytes from 192.168.23.2: icmp_seq=4 ttl=255 time=0.230 ms
64 bytes from 192.168.23.2: icmp_seq=5 ttl=255 time=0.227 ms
64 bytes from 192.168.23.2: icmp_seq=6 ttl=255 time=1.001 ms
64 bytes from 192.168.23.2: icmp_seq=7 ttl=255 time=1.253 ms
64 bytes from 192.168.23.2: icmp_seq=8 ttl=255 time=0.224 ms
64 bytes from 192.168.23.2: icmp_seq=9 ttl=255 time=0.229 ms
64 bytes from 192.168.23.2: icmp_seq=10 ttl=255 time=0.231 ms
64 bytes from 192.168.23.2: icmp_seq=11 ttl=255 time=0.228 ms
64 bytes from 192.168.23.2: icmp_seq=12 ttl=255 time=0.267 ms
64 bytes from 192.168.23.2: icmp_seq=13 ttl=255 time=259.893 ms
64 bytes from 192.168.23.2: icmp_seq=14 ttl=255 time=364.299 ms
64 bytes from 192.168.23.2: icmp_seq=15 ttl=255 time=0.228 ms
64 bytes from 192.168.23.2: icmp_seq=16 ttl=255 time=0.230 ms
64 bytes from 192.168.23.2: icmp_seq=17 ttl=255 time=0.231 ms
64 bytes from 192.168.23.2: icmp_seq=18 ttl=255 time=1.349 ms
64 bytes from 192.168.23.2: icmp_seq=19 ttl=255 time=1.113 ms
64 bytes from 192.168.23.2: icmp_seq=20 ttl=255 time=0.232 ms
64 bytes from 192.168.23.2: icmp_seq=21 ttl=255 time=0.232 ms
64 bytes from 192.168.23.2: icmp_seq=22 ttl=255 time=0.225 ms
64 bytes from 192.168.23.2: icmp_seq=23 ttl=255 time=0.223 ms
64 bytes from 192.168.23.2: icmp_seq=24 ttl=255 time=0.224 ms
64 bytes from 192.168.23.2: icmp_seq=25 ttl=255 time=469.175 ms
64 bytes from 192.168.23.2: icmp_seq=26 ttl=255 time=571.747 ms
64 bytes from 192.168.23.2: icmp_seq=27 ttl=255 time=0.253 ms
64 bytes from 192.168.23.2: icmp_seq=28 ttl=255 time=0.225 ms
64 bytes from 192.168.23.2: icmp_seq=29 ttl=255 time=0.229 ms
64 bytes from 192.168.23.2: icmp_seq=30 ttl=255 time=0.227 ms
64 bytes from 192.168.23.2: icmp_seq=31 ttl=255 time=1.222 ms
64 bytes from 192.168.23.2: icmp_seq=32 ttl=255 time=0.995 ms
64 bytes from 192.168.23.2: icmp_seq=33 ttl=255 time=0.238 ms
64 bytes from 192.168.23.2: icmp_seq=34 ttl=255 time=0.238 ms
64 bytes from 192.168.23.2: icmp_seq=35 ttl=255 time=0.230 ms
64 bytes from 192.168.23.2: icmp_seq=36 ttl=255 time=0.230 ms
64 bytes from 192.168.23.2: icmp_seq=37 ttl=255 time=679.469 ms
64 bytes from 192.168.23.2: icmp_seq=38 ttl=255 time=781.050 ms
64 bytes from 192.168.23.2: icmp_seq=39 ttl=255 time=0.221 ms
64 bytes from 192.168.23.2: icmp_seq=40 ttl=255 time=0.240 ms
^C
--- 192.168.23.2 ping statistics ---
41 packets transmitted, 41 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.221/81.489/781.050/195.848 ms

There is no switch involved in this pfsync connection, just a
single cable from NIC to NIC.
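
The 12-second period can be read off the ping output above mechanically.
The following is only a rough sketch, not part of the original report: the
function names and the 10 ms threshold are assumptions chosen for
illustration, and the sample holds only a few of the lines shown above. It
finds the sequence number at which each latency burst starts and prints the
gaps between bursts. Since ping sends one packet per second by default, a
gap in sequence numbers corresponds to the same number of seconds.

```python
import re

# Matches the "icmp_seq=N ttl=T time=X ms" part of a ping reply line.
PING_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")

# A few lines copied from the ping output above (subset only).
SAMPLE = """\
64 bytes from 192.168.23.2: icmp_seq=0 ttl=255 time=0.585 ms
64 bytes from 192.168.23.2: icmp_seq=1 ttl=255 time=48.559 ms
64 bytes from 192.168.23.2: icmp_seq=2 ttl=255 time=153.323 ms
64 bytes from 192.168.23.2: icmp_seq=3 ttl=255 time=0.233 ms
64 bytes from 192.168.23.2: icmp_seq=12 ttl=255 time=0.267 ms
64 bytes from 192.168.23.2: icmp_seq=13 ttl=255 time=259.893 ms
64 bytes from 192.168.23.2: icmp_seq=14 ttl=255 time=364.299 ms
64 bytes from 192.168.23.2: icmp_seq=15 ttl=255 time=0.228 ms
64 bytes from 192.168.23.2: icmp_seq=24 ttl=255 time=0.224 ms
64 bytes from 192.168.23.2: icmp_seq=25 ttl=255 time=469.175 ms
64 bytes from 192.168.23.2: icmp_seq=26 ttl=255 time=571.747 ms
64 bytes from 192.168.23.2: icmp_seq=27 ttl=255 time=0.253 ms
64 bytes from 192.168.23.2: icmp_seq=36 ttl=255 time=0.230 ms
64 bytes from 192.168.23.2: icmp_seq=37 ttl=255 time=679.469 ms
64 bytes from 192.168.23.2: icmp_seq=38 ttl=255 time=781.050 ms
64 bytes from 192.168.23.2: icmp_seq=39 ttl=255 time=0.221 ms
"""


def parse_ping(text):
    """Return a list of (icmp_seq, rtt_ms) tuples found in ping output."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in PING_RE.finditer(text)]


def spike_starts(samples, threshold_ms=10.0):
    """Return the icmp_seq at which each run of slow replies begins.

    threshold_ms is an arbitrary cut-off (assumption) separating the
    sub-millisecond baseline from the delayed replies.
    """
    starts = []
    prev_slow = False
    for seq, rtt in samples:
        slow = rtt > threshold_ms
        if slow and not prev_slow:
            starts.append(seq)
        prev_slow = slow
    return starts


def spike_intervals(starts):
    """Return the differences between consecutive burst start numbers."""
    return [b - a for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    starts = spike_starts(parse_ping(SAMPLE))
    print("burst starts at icmp_seq:", starts)
    print("gaps between bursts:", spike_intervals(starts))
```

On the sample lines this reports bursts starting at icmp_seq 1, 13, 25 and
37, i.e. a gap of 12 packets each time. Feeding it a complete ping log
instead of SAMPLE should work the same way, as long as the log uses the
reply format shown above.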

I see the same performance problem for incoming traffic on all
other network interfaces of redgatea and redgateb, MASTER and
BACKUP, even on the external connection. For outgoing traffic
(e.g. if I try to ping a third host *from* redgatea) there is a
performance impact, too, but it is much smaller:

redgatea# ping 10.100.100.101
PING 10.100.100.101 (10.100.100.101): 56 data bytes
64 bytes from 10.100.100.101: icmp_seq=0 ttl=64 time=0.281 ms
64 bytes from 10.100.100.101: icmp_seq=1 ttl=64 time=0.238 ms
64 bytes from 10.100.100.101: icmp_seq=2 ttl=64 time=0.235 ms
64 bytes from 10.100.100.101: icmp_seq=3 ttl=64 time=0.231 ms
64 bytes from 10.100.100.101: icmp_seq=4 ttl=64 time=0.239 ms
64 bytes from 10.100.100.101: icmp_seq=5 ttl=64 time=0.228 ms
64 bytes from 10.100.100.101: icmp_seq=6 ttl=64