We set our scan time to 2s and then create many routes using "ip route add". With many (say 50k) routes, the scan starts taking a second or more, so BIRD ignores about 50% of the route updates and only picks them up on the next scan.
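As a rough sanity check on that 50% figure, here is a back-of-the-envelope model in Python. It assumes scans start every scan-time seconds, that each scan blocks for a fixed duration, and that updates arrive evenly spread over time; the function names are made up for illustration and nothing here is BIRD code.

```python
def missed_fraction(scan_interval_s: float, scan_duration_s: float) -> float:
    """Expected share of updates that arrive while a scan is running.

    Assumes a scan starts every scan_interval_s seconds and runs for
    scan_duration_s seconds, with updates arriving uniformly in time.
    """
    if scan_interval_s <= 0:
        raise ValueError("scan interval must be positive")
    return min(scan_duration_s / scan_interval_s, 1.0)


def simulate(scan_interval_s: float, scan_duration_s: float,
             step_ms: int = 100, total_s: int = 20) -> float:
    """Count evenly spaced updates (one every step_ms) that land in a scan."""
    interval_ms = round(scan_interval_s * 1000)
    duration_ms = round(scan_duration_s * 1000)
    total = 0
    missed = 0
    for t_ms in range(0, total_s * 1000, step_ms):
        total += 1
        # An update is "missed" if it arrives inside the scan window.
        if t_ms % interval_ms < duration_ms:
            missed += 1
    return missed / total


if __name__ == "__main__":
    # 2s scan time with a scan that takes about 1s.
    print(missed_fraction(2.0, 1.0))
    print(simulate(2.0, 1.0))
```

With a 2s scan time and a 1s scan, both the formula and the simulation give one half, which matches what we observe.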
We're running pings at 100ms intervals between the containers that we're routing, so we see a cluster of ping times around 2s and another cluster around 0ms.

-Shaun

On 13/08/2015 17:00, "Ondrej Zajicek" <[email protected]> wrote:

>On Thu, Aug 13, 2015 at 03:12:26PM +0000, Shaun Crampton wrote:
>> Hi,
>>
>> We're using BIRD to redistribute routes that are programmed into the
>> Linux kernel for routing to local containers or VMs. We set a scan
>> time in the kernel section of the config in order to notice when
>> routes are removed.
>>
>> Normally, BIRD picks up routes that are added extremely quickly.
>> However, if a route is added during a scan, it seems to be missed and
>> it is not picked up until the next scan, many seconds later.
>
>Hi
>
>I was not aware of this issue but the cause is pretty clear - scans
>are implemented in a synchronous way and BIRD ignores all non-related
>messages during these scans.
>
>The proper solution would be to make the BIRD netlink code fully
>asynchronous, but that means rewriting half of netlink and route
>scanning code. As a workaround we could just queue these asynchronous
>messages and process them after scans (and other netlink operations).
>
>BTW, the issue is likely not limited to route scans but may happen with
>any netlink operation (like request for route change), but other
>operations are probably too quick to cause the problem in practice.
>
>Do you have a simple way how to trigger the issue?
>
>--
>Elen sila lumenn' omentielvo
>
>Ondrej 'Santiago' Zajicek (email: [email protected])
>OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
>"To err is human -- to blame it on a computer is even more so."
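
To check my understanding of the queueing workaround suggested above, here is a toy model in Python. It is only a sketch of the idea (hold asynchronous notifications that arrive during a scan, then apply them once the scan finishes) and is not BIRD's netlink code. The class, method names and message format are all hypothetical.

```python
from collections import deque


class KernelSyncModel:
    """Toy model of deferring async route messages during a scan."""

    def __init__(self):
        self.routes = set()      # prefixes currently known
        self.scanning = False    # True while a synchronous scan runs
        self.deferred = deque()  # messages that arrived mid-scan

    def _apply(self, op, prefix):
        if op == "add":
            self.routes.add(prefix)
        elif op == "del":
            self.routes.discard(prefix)
        else:
            raise ValueError(f"unknown operation: {op}")

    def on_async_message(self, op, prefix):
        """Handle a route notification, queueing it if a scan is running."""
        if self.scanning:
            self.deferred.append((op, prefix))
        else:
            self._apply(op, prefix)

    def begin_scan(self):
        self.scanning = True

    def end_scan(self, dumped_routes):
        """Adopt the scan result, then replay the queued messages in order."""
        self.routes = set(dumped_routes)
        self.scanning = False
        while self.deferred:
            self._apply(*self.deferred.popleft())


if __name__ == "__main__":
    model = KernelSyncModel()
    model.on_async_message("add", "10.0.0.1/32")
    model.begin_scan()
    model.on_async_message("add", "10.0.0.2/32")  # arrives mid-scan
    model.end_scan({"10.0.0.1/32"})               # dump predates the add
    print(sorted(model.routes))
```

In this model the route added mid-scan shows up as soon as the scan ends rather than on the following scan. One thing the sketch glosses over is whether a queued message can be stale relative to the dump; a real implementation would have to decide how to order the two.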
