For 20 years in the industry, I’ve misunderstood an element of BGP.
My understanding was that “multihop” was needed any time you receive routes from one device, but the next hops in those routes point at other devices.
In some cases, that meant routers multiple hops away, yes. But in my incorrect understanding, it also meant route-servers at Internet Exchanges (which are considered directly connected).
So for all of my time at this job and the last one, I set nearly everything in Quagga/zebra/BIRD to multihop…and it worked.
But…recently something in Linux changed, it seems: the algorithm that recursively resolves next hops when “multihop” is used now appears to ignore directly-connected interfaces completely, maybe...?
So…when BIRD did its reconfig…all of a sudden it was unable to resolve the route lookups for anything labeled “multihop”. This resulted in every route received over a multihop BGP session with a route-server at an IXP being marked "!", unreachable, reject.
Thus, removing the line “multihop” from all of our route-server peers fixed the issue: routes were again received and added to the kernel properly, and life went on.
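For anyone who wants to see roughly what that change looks like, here is a minimal sketch of a route-server peer, assuming BIRD 2.x syntax. The protocol name, ASNs, and neighbor address are placeholders from the documentation ranges, not our real config, and the import/export policy is deliberately trivial.

```
# Hypothetical BIRD 2.x peer definition for an IXP route server.
# Name, ASNs, and address are made-up documentation values.
protocol bgp ixp_rs1 {
    local as 64496;
    neighbor 192.0.2.1 as 64511;

    # multihop;   # the line I used to have; removing it was the fix
    direct;       # spelled out for clarity; direct is the default for eBGP sessions

    ipv4 {
        import all;     # real import filters omitted for brevity
        export none;
    };
}
```

As I understand the BIRD docs, a direct session expects the received next hop to be directly reachable, while a multihop session defaults to resolving the next hop recursively through the routing table. That would fit what I saw, but I have not confirmed the exact cause.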
Maybe nobody else out there has the same misunderstanding I did, and nobody will ever experience this again...but if posting here in my embarrassment helps one of you down the line, I suppose it was all worth it.
-jake
