On 9 Nov 2020, at 9:24, Jesper Dangaard Brouer wrote:
On Sat, 07 Nov 2020 14:00:04 +0100
Thomas Rosenstein via Bloat <[email protected]> wrote:
Here's an extract from the ethtool https://pastebin.com/cabpWGFz just
in case there's something hidden.
Yes, there is something hiding in the data from ethtool_stats.pl[1]:
(10G Mellanox Connect-X cards via 10G SFP+ DAC)
stat: 1 ( 1) <= outbound_pci_stalled_wr_events /sec
stat: 339731557 (339,731,557) <= rx_buffer_passed_thres_phy /sec
I've not seen this counter 'rx_buffer_passed_thres_phy' before; looking
in the kernel driver code, it is related to "rx_buffer_almost_full".
The number per second is excessive (but it may be related to a driver
bug, as it ends up reading "high" -> rx_buffer_almost_full_high in the
extended counters).
stat: 29583661 ( 29,583,661) <= rx_bytes /sec
stat: 30343677 ( 30,343,677) <= rx_bytes_phy /sec
You are receiving at 236 Mbit/s on a 10 Gbit/s link. There is a
difference between what the OS sees (rx_bytes) and what the NIC
hardware sees (rx_bytes_phy) (diff approx. 6 Mbit/s).
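
The rates quoted here follow directly from the two byte counters. A
minimal sketch of the conversion, assuming 1 Mbit = 10^6 bits (the
variable and function names are only illustrative):

```python
# Counter values copied from the ethtool_stats.pl output (bytes per second).
RX_BYTES = 29_583_661      # what the OS sees
RX_BYTES_PHY = 30_343_677  # what the NIC hardware sees


def to_mbit(bytes_per_sec: int) -> float:
    """Convert bytes/sec to Mbit/s, using 1 Mbit = 10**6 bits."""
    return bytes_per_sec * 8 / 1e6


os_rate = to_mbit(RX_BYTES)
phy_rate = to_mbit(RX_BYTES_PHY)
diff_rate = to_mbit(RX_BYTES_PHY - RX_BYTES)

print(f"OS rate:    {os_rate:.1f} Mbit/s")
print(f"PHY rate:   {phy_rate:.1f} Mbit/s")
print(f"Difference: {diff_rate:.1f} Mbit/s")
```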
stat: 19552 ( 19,552) <= rx_packets /sec
stat: 19950 ( 19,950) <= rx_packets_phy /sec
Could these packets be from VLAN interfaces that are not used in the OS?
The RX packet counters above also indicate that the HW is seeing more
packets than the OS is receiving.
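
To put a number on that gap, a small sketch using only the two packet
counters above (the names are illustrative, nothing here comes from the
driver):

```python
# Packet counters copied from the ethtool_stats.pl output (packets per second).
RX_PACKETS = 19_552      # packets delivered to the OS
RX_PACKETS_PHY = 19_950  # packets seen by the NIC hardware

# Packets the hardware saw that never showed up in the OS counter.
missing_pps = RX_PACKETS_PHY - RX_PACKETS
missing_pct = 100.0 * missing_pps / RX_PACKETS_PHY

print(f"Not seen by the OS: {missing_pps} pkt/s ({missing_pct:.2f}% of PHY packets)")
```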
The next counters are likely your problem:
stat: 718 ( 718) <= tx_global_pause /sec
stat: 954035 ( 954,035) <= tx_global_pause_duration /sec
stat: 714 ( 714) <= tx_pause_ctrl_phy /sec
As far as I can see that's only the TX, and we are only doing RX on this
interface - so maybe that's irrelevant?
It looks like you have enabled Ethernet Flow-Control, and something is
causing pause frames to be generated. It seems strange that this
happens on a 10 Gbit/s link carrying only 236 Mbit/s.
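
A rough way to judge how severe the pausing is. This is only a sketch,
and it ASSUMES that tx_global_pause_duration is reported in
microseconds; that unit is not stated anywhere in this thread and should
be checked against the driver documentation before trusting the result:

```python
# Counters copied from the ethtool_stats.pl output (per second).
TX_GLOBAL_PAUSE = 718               # pause events per second
TX_GLOBAL_PAUSE_DURATION = 954_035  # pause duration per second

# ASSUMPTION: the duration counter is in microseconds (not confirmed here).
USEC_PER_SEC = 1_000_000

# Fraction of each second covered by a pause request, under that assumption.
paused_fraction = TX_GLOBAL_PAUSE_DURATION / USEC_PER_SEC
# Average duration attributed to each pause event, in the counter's own unit.
avg_pause = TX_GLOBAL_PAUSE_DURATION / TX_GLOBAL_PAUSE

print(f"Paused fraction of time: {paused_fraction:.1%}")
print(f"Average per pause event: {avg_pause:.0f} (assumed usec)")
```

If the unit assumption holds, the NIC is asking its link partner to
stop sending for most of every second, which would be hard to reconcile
with a lightly loaded link.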
The TX byte counters are also very strange:
stat: 26063 ( 26,063) <= tx_bytes /sec
stat: 71950 ( 71,950) <= tx_bytes_phy /sec
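
One possible explanation worth checking is whether the pause frames
themselves account for the gap between tx_bytes and tx_bytes_phy. A
sketch, assuming each pause frame is a minimum-size 64-byte Ethernet
frame and that tx_bytes_phy counts them while tx_bytes does not (the
second part is an assumption, not something confirmed in this thread):

```python
# Counters copied from the ethtool_stats.pl output (per second).
TX_BYTES = 26_063
TX_BYTES_PHY = 71_950
TX_PAUSE_CTRL_PHY = 714  # pause control frames per second

# ASSUMPTION: each pause frame is a minimum-size Ethernet frame of 64 bytes.
PAUSE_FRAME_BYTES = 64

gap = TX_BYTES_PHY - TX_BYTES
pause_bytes = TX_PAUSE_CTRL_PHY * PAUSE_FRAME_BYTES
unexplained = gap - pause_bytes

print(f"PHY-vs-OS TX gap:    {gap} bytes/s")
print(f"Pause frame bytes:   {pause_bytes} bytes/s")
print(f"Left unexplained:    {unexplained} bytes/s")
```

Under these assumptions the pause frames cover nearly all of the gap,
so the TX byte counters may be less strange than they first appear.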
Also, it's TX, and we are only doing RX. As I said already somewhere,
it's asymmetric routing, so the TX data comes back via another router.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
[1] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
Strange size distribution:
stat: 19922 ( 19,922) <= rx_1519_to_2047_bytes_phy /sec
stat: 14 ( 14) <= rx_65_to_127_bytes_phy /sec
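
The average frame size implied by the byte and packet counters can be
cross-checked against this histogram. A minimal sketch using only the
counters quoted in this mail (the names are illustrative):

```python
# Counters copied from the ethtool_stats.pl output (per second).
RX_BYTES_PHY = 30_343_677
RX_PACKETS_PHY = 19_950
RX_BYTES = 29_583_661
RX_PACKETS = 19_552
RX_1519_TO_2047_PHY = 19_922  # frames per second in the 1519..2047 byte bucket

# Average frame size as seen by the hardware and by the OS.
avg_phy = RX_BYTES_PHY / RX_PACKETS_PHY
avg_os = RX_BYTES / RX_PACKETS

# Share of all PHY packets that fall into the 1519..2047 byte bucket.
big_share = 100.0 * RX_1519_TO_2047_PHY / RX_PACKETS_PHY

print(f"Average PHY frame size: {avg_phy:.1f} bytes")
print(f"Average OS packet size: {avg_os:.1f} bytes")
print(f"Share in 1519-2047 bucket: {big_share:.2f}%")
```

The PHY average lands just inside the 1519 to 2047 byte bucket, so the
histogram and the byte counters agree with each other: almost every
received frame is slightly larger than a standard 1518-byte untagged
Ethernet frame.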