On Fri, 06 Nov 2020 12:45:31 +0100 Toke Høiland-Jørgensen <[email protected]> wrote:
> "Thomas Rosenstein" <[email protected]> writes: > > > On 6 Nov 2020, at 12:18, Jesper Dangaard Brouer wrote: > > > >> On Fri, 06 Nov 2020 10:18:10 +0100 > >> "Thomas Rosenstein" <[email protected]> wrote: > >> > >>>>> I just tested 5.9.4 seems to also fix it partly, I have long > >>>>> stretches where it looks good, and then some increases again. (3.10 > >>>>> Stock has them too, but not so high, rather 1-3 ms) > >>>>> > >> > >> That you have long stretches where latency looks good is interesting > >> information. My theory is that your system have a periodic userspace > >> process that does a kernel syscall that takes too long, blocking > >> network card from processing packets. (Note it can also be a kernel > >> thread). > > [...] > > > > Could this be related to netlink? I have gobgpd running on these > > routers, which injects routes via netlink. > > But the churn rate during the tests is very minimal, maybe 30 - 40 > > routes every second. Yes, this could be related. The internal data-structure for FIB lookups is a fibtrie which is a compressed patricia tree, related to radix tree idea. Thus, I can imagine that the kernel have to rebuild/rebalance the tree with all these updates. > > > > Otherwise we got: salt-minion, collectd, node_exporter, sshd > > collectd may be polling the interface stats; try turning that off? It should be fairly easy for you to test the theory if any of these services (except sshd) is causing this, by turning them off individually. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer _______________________________________________ Bloat mailing list [email protected] https://lists.bufferbloat.net/listinfo/bloat
