Re: [j-nsp] BGP output queue priorities between RIBs/NLRIs

Rob Foehl Mon, 09 Nov 2020 18:56:45 -0800

On Mon, 9 Nov 2020, Jeffrey Haas wrote:

As the source of this particular bit of difficulty, a bit of explanation for 
why it simply wasn't done when the initial feature was authored.


Much appreciated -- the explanation, anyway ;)

An immense amount of work in the BGP code is built around the need to not have 
to keep full state on EVERYTHING.  We're already one of the most stateful BGP 
implementations on the planet.  Many times that helps us, sometimes it doesn't.

But as a result of such designs, for certain kinds of large work it is 
necessary to have a consistent work list and build a simple iterator on that.  
One of the more common patterns that is impacted by this is the walk of the 
various routing tables.  As noted, we start roughly at inet.0 and go forward 
based on internal table order.

Makes sense, but also erases the utility of output queue priorities when multiple tables are involved. Is there any feasibility of moving the RIB walking in the direction of more parallelism, or at least something like round robin between tables, without incurring too much overhead / bug surface / et cetera?
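
To make the difference concrete, here is a rough sketch of the two walk orders being discussed. This is an illustration only and says nothing about how rpd is actually implemented: the table contents, route strings, and function names are all made up, and the real walk involves far more state than a list per table.

```python
from collections import deque

# Hypothetical tables and routes, for illustration only.
TABLES = {
    "inet.0": ["10.0.0.0/8", "10.1.0.0/16", "10.2.0.0/16"],
    "bgp.l3vpn.0": ["65000:1:192.0.2.0/24"],
    "bgp.evpn.0": ["evpn-type2-route", "evpn-type3-route"],
}


def sequential_walk(tables):
    """Walk each table to completion before starting the next one."""
    for name, routes in tables.items():
        for route in routes:
            yield name, route


def round_robin_walk(tables):
    """Take one route from each table in turn until every table is exhausted."""
    pending = deque((name, iter(routes)) for name, routes in tables.items())
    while pending:
        name, it = pending.popleft()
        try:
            route = next(it)
        except StopIteration:
            continue  # this table is done; do not requeue it
        yield name, route
        pending.append((name, it))


def first_index(walk, table):
    """Position in the walk at which the first route from `table` appears."""
    return next(i for i, (name, _) in enumerate(walk) if name == table)


seq = list(sequential_walk(TABLES))
rr = list(round_robin_walk(TABLES))
print(first_index(seq, "bgp.evpn.0"), first_index(rr, "bgp.evpn.0"))
```

With the sequential walk, the first EVPN route is only reached after every route in the earlier tables; with round robin it is reached after one route from each earlier table. Neither addresses the coupling with route resolution described above.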

The primary challenge for populating the route queues in user desired orders is 
to move that code out of the pattern that is used for quite a few other things. 
 While you may want your evpn routes to go first, you likely don't want route 
resolution which is using earlier tables to be negatively impacted.  Decoupling 
the iterators for the overlapping table impacts is challenging, at best.  Once 
we're able to achieve that, the user configuration becomes a small thing.

I'm actually worried that if the open ER goes anywhere, it'll result in the ability to specify a table order only, and that's an awfully big hammer when what's really needed is the equivalent of the output queue priorities covering the entire process. Some of these animals are more equal than others.

I don't recall seeing the question about the route refreshes, but I can offer a 
small bit of commentary: The CLI for our route refresh isn't as fine-grained as 
it could be.  The BGP extension for route refresh permits per afi/safi 
refreshing and honestly, we should expose that to the user.  I know I flagged 
this for PLM at one point in the past.

The route refresh issue mostly causes trouble when bringing new PEs into existing instances, and is presumably a consequence of the same behavior: the refresh message includes the correct AFI/SAFI, but the remote winds up walking every RIB before it starts emitting routes for the requested family (and no others). The open case for the output queue issue has a note from 9/2 wherein TAC was able to reproduce this behavior and collect packet captures of both the specific refresh message and the long period of silence before any routes were sent.
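
For reference, the refresh message itself is tiny and carries exactly one AFI/SAFI, which is why the full RIB walk on the remote side is surprising. Below is a minimal sketch of the RFC 2918 wire format as I understand it; the function names are my own, and the EVPN values (AFI 25, SAFI 70) are just an example family, not taken from the captures in the case.

```python
import struct

BGP_MARKER = b"\xff" * 16   # 16-byte all-ones marker in every BGP header
BGP_HEADER_LEN = 19         # marker (16) + length (2) + type (1)
MSG_ROUTE_REFRESH = 5       # ROUTE-REFRESH message type, RFC 2918


def build_route_refresh(afi: int, safi: int) -> bytes:
    """Encode a ROUTE-REFRESH for a single AFI/SAFI pair."""
    # Body: AFI (2 bytes), reserved (1 byte, zero), SAFI (1 byte).
    body = struct.pack("!HBB", afi, 0, safi)
    header = BGP_MARKER + struct.pack(
        "!HB", BGP_HEADER_LEN + len(body), MSG_ROUTE_REFRESH
    )
    return header + body


def parse_route_refresh(msg: bytes) -> tuple[int, int]:
    """Return (afi, safi) from an encoded ROUTE-REFRESH message."""
    length, msg_type = struct.unpack("!HB", msg[16:19])
    if msg_type != MSG_ROUTE_REFRESH or length != len(msg):
        raise ValueError("not a well-formed ROUTE-REFRESH message")
    afi, _reserved, safi = struct.unpack("!HBB", msg[19:23])
    return afi, safi


# Example: request a refresh of EVPN only (AFI 25 = L2VPN, SAFI 70 = EVPN).
msg = build_route_refresh(25, 70)
print(len(msg), parse_route_refresh(msg))
```

The whole message is 23 bytes, so the request is unambiguous about the family; the delay described above sits entirely on the responder's side.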


-Rob

_______________________________________________
juniper-nsp mailing list [email protected]
https://puck.nether.net/mailman/listinfo/juniper-nsp
