Hi,
I'm seeing these warnings quite frequently on a system that has the full internet table programmed into the Linux kernel:

Sep 20 11:50:48 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:50:57 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:50:59 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:00 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:01 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:05 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:05 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:07 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:10 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:10 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:14 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:17 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:19 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
Sep 20 11:51:19 ganges bird: Kernel dropped some netlink messages, will resync on next scan.

First off, I'm not expecting wonders here. I don't expect close to 1M routes in the Linux kernel to work flawlessly; I'm just looking to improve on the situation.

My setup:

- bird 1.6.6 (Debian 10) with Linux 4.9.xxx
- Internet routes are segregated into a dedicated kernel routing table (#99 below); no non-bird routes are in this table.
- The kernel protocol is export only (again, no non-bird routes in this kernel table).
- bird table "internet" contains a protocol kernel, a protocol pipe, and 3 protocol bgps (for 3 iBGP peers).

The protocol kernel config is:

protocol kernel kernel_internet {
        debug { states, interfaces };
        table internet;
        kernel table 99;
        scan time 60;
        persist;
        learn off;
        graceful restart on;
        import none;
        export filter {
                if net ~ IP_MY_NET_PLUS then reject;
                if net ~ IP_CORE_NET then reject;
                accept;
        };
}

I'm seeing netlink drops when upstream internet churn is, say, more than 200 updates/sec or so. That's not huge, but the drops are quite frequent and can continue for minutes or hours.

Some items I've investigated so far:

- Increasing the net.core.rmem_max and net.core.wmem_max sysctls doesn't seem to help much [1].
- strace of bird doesn't indicate any EAGAIN or blocking when writing to the netlink sockets.
- strace shows some room for optimization in the kernel protocol (these would obviously be code changes). For example, when a route changes next hop/interface, 2 netlink messages are sent, a delete followed by an add, instead of a single change/replace [2]. This would complicate bird, but would cut the netlink messages for updates in half.
- There are plenty of CPU cycles available; bird is at <1%, etc.

Any pointers on tuning or config changes that may help here are appreciated.

--
Dave
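
[1] To illustrate why I suspect the rmem_max sysctl alone isn't enough: as far as I understand, net.core.rmem_max only raises the cap, and the application still has to request the larger buffer itself with SO_RCVBUF (or bypass the cap with SO_RCVBUFFORCE, which needs CAP_NET_ADMIN). A minimal standalone sketch, not bird code, using an arbitrary 8 MB buffer purely as an example:

/* Open an rtnetlink socket subscribed to IPv4 route notifications
 * and try to enlarge its receive buffer. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_nl sa;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_IPV4_ROUTE;   /* async route change notifications */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) { perror("bind"); return 1; }

    int rcvbuf = 8 * 1024 * 1024;       /* 8 MB, arbitrary example value */

    /* Honoured only up to the net.core.rmem_max cap... */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
        perror("SO_RCVBUF");

    /* ...whereas this ignores the cap, but requires CAP_NET_ADMIN. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &rcvbuf, sizeof(rcvbuf)) < 0)
        perror("SO_RCVBUFFORCE");

    int actual; socklen_t len = sizeof(actual);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len);
    printf("effective rcvbuf: %d bytes\n", actual);   /* kernel reports double the request */
    return 0;
}

So unless bird itself asks for a bigger buffer on its netlink sockets, I wouldn't expect the sysctl change alone to do much, which seems to match what I observed.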
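
[2] To make the delete-followed-by-add point concrete: rtnetlink does allow a single-message update, an RTM_NEWROUTE with NLM_F_REPLACE overwrites an existing route in one shot. A rough standalone sketch, not bird code, with example prefix/next-hop/table values only:

/* Replace the next hop of 192.0.2.0/24 in table 99 with one netlink
 * message instead of an RTM_DELROUTE followed by an RTM_NEWROUTE. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Append one attribute to the request. */
static void add_attr(struct nlmsghdr *nlh, int type, const void *data, int len)
{
    struct rtattr *rta = (struct rtattr *)((char *)nlh + NLMSG_ALIGN(nlh->nlmsg_len));
    rta->rta_type = type;
    rta->rta_len = RTA_LENGTH(len);
    memcpy(RTA_DATA(rta), data, len);
    nlh->nlmsg_len = NLMSG_ALIGN(nlh->nlmsg_len) + RTA_ALIGN(rta->rta_len);
}

int main(void)
{
    char buf[256];
    memset(buf, 0, sizeof(buf));

    struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
    nlh->nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    nlh->nlmsg_type = RTM_NEWROUTE;
    /* NLM_F_REPLACE: overwrite the existing route, no separate delete. */
    nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_REPLACE | NLM_F_ACK;

    struct rtmsg *rtm = NLMSG_DATA(nlh);
    rtm->rtm_family = AF_INET;
    rtm->rtm_dst_len = 24;
    rtm->rtm_table = RT_TABLE_UNSPEC;    /* real table id goes in RTA_TABLE */
    rtm->rtm_protocol = RTPROT_BIRD;
    rtm->rtm_scope = RT_SCOPE_UNIVERSE;
    rtm->rtm_type = RTN_UNICAST;

    struct in_addr dst, gw;
    inet_pton(AF_INET, "192.0.2.0", &dst);      /* example prefix */
    inet_pton(AF_INET, "198.51.100.1", &gw);    /* example new next hop */
    uint32_t table = 99;

    add_attr(nlh, RTA_DST, &dst, sizeof(dst));
    add_attr(nlh, RTA_GATEWAY, &gw, sizeof(gw));
    add_attr(nlh, RTA_TABLE, &table, sizeof(table));

    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_nl kernel = { .nl_family = AF_NETLINK };
    if (sendto(fd, nlh, nlh->nlmsg_len, 0,
               (struct sockaddr *)&kernel, sizeof(kernel)) < 0) {
        perror("sendto"); return 1;
    }

    /* A real implementation would read back and check the NLMSG_ERROR ack. */
    close(fd);
    return 0;
}

This is essentially what "ip route replace" does, so I assume the kernel side is fine with it; whether it fits bird's internals is a different question, hence the "code changes" caveat above.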