On 10/10/25 12:52 AM, Claudio Jeker wrote:
On Thu, Oct 09, 2025 at 02:43:03PM -0400, Nick Holland wrote:
this time, without the ACPI dumps, to reduce the size of the message.
Nick.
-------- Forwarded Message --------
Subject: ixl issue on current snapshots
Date: Thu, 9 Oct 2025 14:18:00 -0400
From: Nick Holland <[email protected]>
To: [email protected] <[email protected]>
> Synopsis: ixl/carp stops working a few minutes after boot on very recent snapshots
> Category: amd64
> Environment:
System : OpenBSD 7.8
Details : OpenBSD 7.8-beta (GENERIC.MP) #35: Thu Sep 18 16:01:31 MDT 2025
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
Architecture: OpenBSD.amd64
Machine : amd64
> Description:
After upgrading to the Oct 4 snapshot, the ixl(4) port that is part of
a carp(4) failover system stops working. A second ixl port that is not
part of a carp group (but also gets little traffic) seems to keep
working.
The Sep 18 snapshot works properly; the next snapshot, Sep 28, shows
the problem.
Sep 18: good
Sep 28: bad
Oct 4: bad
Oct 8: bad
System will come up well enough to start serving, but will soon (within
a couple minutes) stop responding to new traffic. Old connections
sometimes stay active for some time, but new connections cannot be
made. At the moment, I can ping my personal mail server from this
machine (they aren't even in the same country), but a machine on the
same subnet does not respond to pings. CARP fails over properly to the
machine running an older snapshot.
Please share the kstat ixlX::: output.
# kstat ixl0:::
ixl0:0:ixl-port:0
rx bytes: 18633653 bytes
mac local errs: 0
mac remote errs: 0
mac short: 0 packets
crc errs: 0 packets
rx len errs: 0 packets
byte errs: 0 packets
illegal byte: 0 packets
rx undersize: 0 packets
rx oversize: 0 packets
rx link xon: 0 packets
rx link xoff: 0 packets
rx 64B: 69324 packets
rx 65-127B: 71693 packets
rx 128-255B: 11901 packets
rx 256-511B: 1823 packets
rx 512-1023B: 925 packets
rx 1024-1522B: 3466 packets
rx 1523-9522B: 0 packets
rx fragment: 0 packets
rx jabber: 0 packets
rx ucasts: 93625 packets
rx mcasts: 2668 packets
rx bcasts: 62839 packets
rx discards: 0 packets
rx lo discards: 0 packets
rx no dest: 159160 packets
tx bytes: 382257053 bytes
tx 64B: 8247 packets
tx 65-127B: 21262 packets
tx 128-255B: 4052 packets
tx 256-511B: 1719 packets
tx 512-1023B: 1797 packets
tx 1024-1522B: 248819 packets
tx 1523-9522B: 0 packets
tx link xon: 0 packets
tx link xoff: 0 packets
tx ucasts: 285868 packets
tx mcasts: 6 packets
tx bcasts: 22 packets
tx link down: 0 packets
ixl0:0:ixl-vsi:0
rx discards: 333 packets
tx bytes: 381063696 bytes
tx ucasts: 285868 packets
tx mcasts: 6 packets
tx bcasts: 22 packets
tx errs: 0 packets
tx discards: 0 packets
rx bytes: 18626375 bytes
rx ucasts: 93625 packets
rx mcasts: 2569 packets
rx bcasts: 62839 packets
rx noproto: 0 packets
ixl0:0:rxq:0
packets: 158687 packets
bytes: 17967485 bytes
fdrops: 0 packets
qdrops: 0 packets
errors: 0 packets
qlen: 0 packets
enqueues: 126142
dequeues: 126129
ixl0:0:rxq:1
packets: 0 packets
bytes: 0 bytes
fdrops: 0 packets
qdrops: 0 packets
errors: 0 packets
qlen: 0 packets
enqueues: 0
dequeues: 0
ixl0:0:rxq:2
packets: 0 packets
bytes: 0 bytes
fdrops: 0 packets
qdrops: 0 packets
errors: 0 packets
qlen: 0 packets
enqueues: 0
dequeues: 0
ixl0:0:rxq:3
packets: 0 packets
bytes: 0 bytes
fdrops: 0 packets
qdrops: 0 packets
errors: 0 packets
qlen: 0 packets
enqueues: 0
dequeues: 0
ixl0:0:txq:0
packets: 37900 packets
bytes: 158613810 bytes
qdrops: 48356 packets
errors: 0 packets
qlen: 1024 packets
maxqlen: 1024 packets
oactive: true
oactives: 1
ixl0:0:txq:1
packets: 21738 packets
bytes: 146797431 bytes
qdrops: 48703 packets
errors: 0 packets
qlen: 1024 packets
maxqlen: 1024 packets
oactive: true
oactives: 1
ixl0:0:txq:2
packets: 7230 packets
bytes: 13521265 bytes
qdrops: 49807 packets
errors: 0 packets
qlen: 1024 packets
maxqlen: 1024 packets
oactive: true
oactives: 1
ixl0:0:txq:3
packets: 11933 packets
bytes: 51592464 bytes
qdrops: 49257 packets
errors: 0 packets
qlen: 1024 packets
maxqlen: 1024 packets
oactive: true
oactives: 1
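One way to read the output above: all four transmit queues report qlen equal to maxqlen (1024) with oactive set to true, which would be consistent with the tx side being stuck rather than the link being down. The following is a minimal sketch, not part of the original report, of how that check could be scripted against saved kstat text. The function names (parse_kstat, stalled_txqs) and the "qlen >= maxqlen and oactive" heuristic are assumptions made for illustration, not an official diagnostic. The sample is a trimmed copy of two queues from the output above.

```python
import re

# A kstat block header looks like "ixl0:0:txq:0": provider:instance:name:unit.
HEADER = re.compile(r"^([^:\s]+):(\d+):([^:\s]+):(\d+)$")

# Trimmed copy of two queues from the report, used only as sample input.
SAMPLE = """\
ixl0:0:rxq:0
packets: 158687 packets
qdrops: 0 packets
qlen: 0 packets
ixl0:0:txq:0
packets: 37900 packets
qdrops: 48356 packets
qlen: 1024 packets
maxqlen: 1024 packets
oactive: true
oactives: 1
"""


def parse_kstat(text):
    """Parse kstat text into {header: {field: first value token}}."""
    stats = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()  # tolerate any leading indentation on field lines
        if not line:
            continue
        if HEADER.match(line):
            current = stats.setdefault(line, {})
            continue
        if current is None or ":" not in line:
            continue
        name, _, value = line.partition(":")
        tokens = value.split()
        # Keep only the value itself and drop the trailing unit, if any.
        current[name.strip()] = tokens[0] if tokens else ""
    return stats


def stalled_txqs(stats):
    """Return txq headers whose queue is full and marked oactive.

    Heuristic only (an assumption for this sketch): a full queue with
    oactive set suggests the driver is not draining the tx ring.
    """
    stalled = []
    for header, fields in stats.items():
        if ":txq:" not in header:
            continue
        try:
            qlen = int(fields["qlen"])
            maxqlen = int(fields["maxqlen"])
        except (KeyError, ValueError):
            continue
        if qlen >= maxqlen and fields.get("oactive") == "true":
            stalled.append(header)
    return stalled


if __name__ == "__main__":
    print(stalled_txqs(parse_kstat(SAMPLE)))
```

Fed the full output from this report, the same check would flag txq 0 through 3, since each shows qlen 1024 of maxqlen 1024 with oactive true.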
Looking at your timeline, please try reverting the last diff to
if_ixl.c, from Sep 17.
Diff below is what got committed (use patch -R).
Reverting that diff seems to have settled the system down.
It has now been running for 90 minutes; considering that it typically
lasted no more than two minutes before, I think it is good.
Nick.