Dear sir or madam,

We run multiple Intel E810-CQDA2 100G adapters (2x QSFP28) in our fleet of servers. The machines are running Ubuntu 22.04 LTS (Jammy) with Linux kernel 6.2.0-36-generic (Ubuntu HWE kernel).

This is the output from ethtool:

--- cut ---
# ethtool -i eth2
driver: ice
version: 6.2.0-36-generic
firmware-version: 4.30 0x8001af29 1.3429.0
expansion-rom-version:
bus-info: 0000:a1:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

--- cut ---

We observe strange, completely unrealistic traffic spikes (multiple terabits per second) in our monitoring. We use the Prometheus Node Exporter with its netdev collector (https://github.com/prometheus/node_exporter/blob/ed1b8e3d88851806627e4f8262ee26232ca56c2c/collector/netdev_common.go#L39). I found issue https://github.com/prometheus/node_exporter/issues/1849, and it appears that others have noticed similar problems with the counters.

To show that the issue actually originates from the "ice" kernel driver and not from any of our other tooling, I have dumped "/proc/net/dev" on one of the machines once per second into one logfile per interface.
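
For reference, a per-second dump of this kind could be produced roughly as follows. This is only a minimal sketch and not the exact script we used: the function names, the log path and the timestamp format string are assumptions chosen to match the log excerpt, and "%b" assumes an English/C locale.

```python
import time
from datetime import datetime


def read_iface_line(iface, proc_path="/proc/net/dev"):
    """Return the raw counter line for iface, or None if it is not listed."""
    with open(proc_path) as f:
        for line in f:
            # Counter lines look like "  eth2: <16 counters>"; the two
            # header lines contain no colon and are skipped.
            name, sep, _ = line.partition(":")
            if sep and name.strip() == iface:
                return line.strip()
    return None


def log_counters(iface, log_path, samples, interval=1.0,
                 proc_path="/proc/net/dev"):
    """Append `samples` timestamped counter lines for iface to log_path."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            line = read_iface_line(iface, proc_path)
            if line is not None:
                stamp = datetime.now().strftime("%b %d %H:%M:%S")
                log.write(f"{stamp}   {line}\n")
                log.flush()
            time.sleep(interval)
```

For example, log_counters("eth2", "eth2.log", samples=3600) would record one hour of samples; "eth2.log" is a hypothetical file name.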

I can provide the complete files, but if you look at two timestamps in particular, you can see two jumps in the counters between consecutive one-second samples:

--- cut ---
Inter-|   Receive |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
[...]
Nov 16 14:44:17   eth2: 322480275246795 161202637791 12245 2396226    0     0          0  71204126 497958797609464 188500340907    0    0    0     0       0          0
Nov 16 14:44:18   eth2: 386617853382565 193953665830 12245 2396226    0     0          0  71204282 593586606935949 223802656120    0    0    0     0       0          0
[...]
Nov 16 14:49:10   eth2: 386662845936810 193977501895 12247 2396226    0     0          0  71230993 593637495306092 223827197609    0    0    0     0       0          0
Nov 16 14:49:11   eth2: 450845520538932 226752438356 12247 2396226    0     0          0  71230993 689316465134429 259154140003    0    0    0     0       0          0
[...]
--- cut ---
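
To put a number on these jumps, the implied bit rate between two consecutive samples can be computed from the byte counters. The following is a minimal sketch, not part of our monitoring: the helper names are made up for illustration, the year is a placeholder because the log lines carry none, and the 100 Gbit/s limit is simply the nominal port speed.

```python
from datetime import datetime

LINK_RATE_BPS = 100e9  # nominal speed of one 100G port


def parse_sample(line):
    """Parse 'Nov 16 14:44:17   eth2: <16 counters>' into (time, iface, counters)."""
    tok = line.split()
    # The log has no year; 2000 is a placeholder so strptime gets a full date.
    ts = datetime.strptime("2000 " + " ".join(tok[:3]), "%Y %b %d %H:%M:%S")
    iface = tok[3].rstrip(":")
    counters = [int(x) for x in tok[4:]]
    return ts, iface, counters


def implied_rates(prev_line, cur_line):
    """Return (rx_bps, tx_bps) implied by two samples of the same interface."""
    t0, _, c0 = parse_sample(prev_line)
    t1, _, c1 = parse_sample(cur_line)
    dt = (t1 - t0).total_seconds()
    # Column 0 is received bytes, column 8 is transmitted bytes.
    rx_bps = (c1[0] - c0[0]) * 8 / dt
    tx_bps = (c1[8] - c0[8]) * 8 / dt
    return rx_bps, tx_bps


prev = ("Nov 16 14:44:17   eth2: 322480275246795 161202637791 12245 2396226 "
        "0 0 0 71204126 497958797609464 188500340907 0 0 0 0 0 0")
cur = ("Nov 16 14:44:18   eth2: 386617853382565 193953665830 12245 2396226 "
       "0 0 0 71204282 593586606935949 223802656120 0 0 0 0 0 0")

rx_bps, tx_bps = implied_rates(prev, cur)
print(f"rx {rx_bps / 1e12:.1f} Tbit/s, tx {tx_bps / 1e12:.1f} Tbit/s")
print("exceeds line rate:", rx_bps > LINK_RATE_BPS or tx_bps > LINK_RATE_BPS)
```

By my arithmetic, the 14:44:17 to 14:44:18 pair corresponds to about 513 Tbit/s received and about 765 Tbit/s transmitted within one second, which a 100G port cannot physically carry.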


If you require any more information to narrow down the issue, please don't hesitate to contact me.



Regards


Christian Rohmann


_______________________________________________
Intel-wired-lan mailing list
[email protected]
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan