Hi.

> Using the appearance of a log message as an indicator of precise timing of
> when a RIB update happened is handwavy at best, and flat-out wrong at worst.

Exactly.


Once I replaced BIRD with FRR on router "rZ" (the bystander;
https://gist.github.com/tonusoo/1cced39aa6ae53143d12623a05f02331),
I did indeed observe propagation delays of 10+ seconds. The bgpd input
queues were constantly full and the related TCP receive queues were
extremely high, which caused "rZ" to frequently send TCP segments with
the receive window set to zero to its BGP neighbors "rM" and "rN".
Setting the scheduling priority of bgpd and zebra (the daemon
responsible for updating the kernel routing table) to the highest
possible value did lower the bgpd input queue, but it was still very
high. Dropping the 5k oscillating routes in the ingress route-map of
FRR allowed bgpd to consume all the CPU resources of the virtual
machine, as it no longer had to compete with zebra, but even that was
not enough to keep the bgpd input queue low.
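
For anyone who wants to watch the TCP receive queues on their own
Linux router, here is a rough sketch of one way to do it. It is not
the tooling I used for the observations above. It assumes the IPv4
/proc/net/tcp layout, where the fifth column is tx_queue:rx_queue in
hex, and the function name bgp_rx_queues is made up for illustration.

```python
# Sketch: pull the receive-queue depth of established BGP (TCP/179)
# sockets out of text in the /proc/net/tcp format (IPv4 only).
BGP_PORT = 179
TCP_ESTABLISHED = "01"  # socket state code used in /proc/net/tcp


def bgp_rx_queues(proc_net_tcp_text, port=BGP_PORT):
    """Return (local_port, remote_port, rx_queue_bytes) tuples for
    established sockets where either end uses `port`."""
    results = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip header line
        fields = line.split()
        if len(fields) < 5 or fields[3] != TCP_ESTABLISHED:
            continue
        # Addresses look like "0100000A:00B3"; the port is the hex part
        # after the colon.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if port not in (local_port, remote_port):
            continue
        # Fifth column is "tx_queue:rx_queue", both in hex.
        _tx_hex, rx_hex = fields[4].split(":")
        results.append((local_port, remote_port, int(rx_hex, 16)))
    return results
```

Feeding it the contents of /proc/net/tcp every second or so and
printing the result should show whether the receive queue of a BGP
session stays pinned high, which is what tends to precede zero-window
advertisements. Sessions over IPv6 live in /proc/net/tcp6 and are not
covered by this sketch.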


The research paper includes scripts for generating the router
configurations: https://zenodo.org/records/16739858 These
configurations include a few elements that are uncommon in production
routers and which may, to a greater or lesser extent, affect the
performance of the routing daemon:

  * FRR on the "bystander" dumps the UPDATE messages to an MRT file

  * processed BGP route advertisements are logged. This was already
explained by Matthew Petach.



Martin