On Sun, 17 May 2015, David Lamparter wrote:
[meme images were inserted into this mail after i switched into f*ck it
mode.]
I can see. The mode that is. I generally read email with a terminal
based MUA.
Because this is a scenario with only one router in the R set, yet we're
generating a significant amount of network-propagated churn that can
impact larger parts of the DFZ.
Compared to what? Not to normal BGP.
Could an update-delay with a global wait do better? Yes. *IF* the timer
has been tuned *just right*.
Tell me how we do that for all Quagga users so we can enable update-delay
by default?
I don't believe we can. To enable update-delay, the max-delay would have
to be so low that it'd be plain-BGP for many people. In which case, the
peer-variant would be better.
I say we can support all the use-cases. You say only your use-case should
be supported.
We can end up telling A that our best path to 6 is B, then D, then C, in
rapid succession - and it's quite possible this is actually the best
path for A, meaning it'll readvertise this to its own peers.
So that's fine.
If you update a peer with routes that are still via you, its FIB doesn't
change - it should still go via you (your own FIB may change needlessly
though).
All the granularities achieve that. They all avoid the spurious
"advertise-withdraw" that happens when the restarted router sends an
ultimately-non-best route to the stable peer before it has heard that
peer's own routes.
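
To make the granularity point concrete, here is a rough toy model in Python. It is not Quagga code and not a faithful RFC 4724 implementation: the `Event` type, the `simulate` function, the mode names and the simplified split-horizon rule are all made up for illustration, and it tracks a single prefix only. It just counts what a restarting router would send to each peer under plain BGP, a per-peer deferral, and a global deferral.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    peer: str      # peer the event arrives from
    kind: str      # "route" or "eor" (End-of-RIB)
    pref: int = 0  # higher is better; only meaningful for "route"


def simulate(events, peers, mode):
    """Return {peer: [messages]} sent by the restarting router for one prefix.

    mode (hypothetical names, for illustration only):
      "plain"  - advertise after every best-path change (plain BGP-4)
      "peer"   - hold updates to a peer until that peer's EoR arrives
      "global" - hold all updates until every peer's EoR has arrived
    """
    rib = {}                         # peer -> preference of its route
    eor = set()                      # peers whose End-of-RIB has been seen
    told = {p: None for p in peers}  # what each peer currently believes
    sent = {p: [] for p in peers}

    for ev in events:
        if ev.kind == "route":
            rib[ev.peer] = ev.pref
        else:
            eor.add(ev.peer)
        best = max(rib, key=rib.get) if rib else None
        for p in peers:
            if mode == "peer" and p not in eor:
                continue
            if mode == "global" and eor != set(peers):
                continue
            # Toy assumption: never advertise a route back to its source.
            want = best if best != p else None
            if want == told[p]:
                continue
            sent[p].append(f"UPDATE via {want}" if want else "WITHDRAW")
            told[p] = want
    return sent


# Stable peer A answers first; B's better route arrives afterwards.
events = [
    Event("A", "route", 100), Event("A", "eor"),
    Event("B", "route", 200), Event("B", "eor"),
]
for mode in ("plain", "peer", "global"):
    print(mode, simulate(events, ["A", "B"], mode))
```

In this scenario the "plain" run sends B an UPDATE followed by a WITHDRAW, while both the "peer" and "global" runs send B nothing and send A a single UPDATE. Where the two deferral modes differ is when a peer whose End-of-RIB is already in keeps seeing the best path change as other peers' routes trickle in; that is the same behaviour as plain BGP, which is the trade-off being argued about here.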
Coalescing more over a longer time-span. I do *NOT* object to that.
However, I would object to making it so every peer /always/ has to be
subject to that delay.
Further, we will (I strongly suspect) never be able to enable the global
update-delay by default.
We can though have the default be much better. Maybe not as good as
update-delay with a suitable max-delay for certain use cases, but
certainly better than default, plain BGP-4.
Yet you seem to want to make this an either/or thing. Either plain-BGP or
your way, and no other. I really don't get that.
Remember, in general, you have:
No, I don't remember that, because I'm assuming the number of routers
that have restarted to be 0 or 1.
[snip large bulk of text that assumes more than 1 router in R]
Yes, the general case.
If you limit the case so there are no other restarted peers, then this
doesn't matter. If you constrain things to just the case that suits
your argument, then sure, your argument does indeed win. Is that your
concern here?
The GR RFC explicitly says the restarting speaker "MUST defer route
selection" and "noted that prior to route selection, the speaker has no
routes to advertise to its peers and no routes to update the forwarding
state." Where 'its peers' includes other peers that are restarting.
Great. So what. We can do better, by not applying this optimisation where
it's not needed, and so allowing us to provide a timer-free (and hence
friendlier to use, easier to enable) GR that will still optimise out some
churn without harming convergence time. And perfectly interoperably.
And while still providing the update-delay mode you want.
I'm still arguing my (1.) and (2.) from previous mail, i.e.:
1. global deferral is necessary to avoid network-propagating churn in
the situation where only 1 router is restarting
Well, yes, if you want to limit the R-set to just 1-router you can ignore
my arguments. Sure.
In reality, in general, there will be 0...m peers that have restarted.
BGP consists of more than bytes on the wire.
I look forward to your patches to store received routes in a separate
AdjIn RIB, as per RFC4271. ;) Until then, gosh, look, our RIB /does/ in
fact have routes to advertise. ;)
What extra churn though? There is no extra churn relative to BGP-4.
It's either extra delay relative to BGP-4, or extra churn relative to
4724 GR. Feel free to pick one.
Yes, there are trade-offs. I acknowledged that at least in our IRC
discussion.
Minimising CPU churn vs optimising for best convergence. Etc.
it filters out the worst transients (sending a route that the remote-peer
has a better path for before you've got it, leading to
UPDATE-then-WITHDRAW to that remote peer)
Those are actually the least problematic because the peer won't select
them and they won't continue to travel through the BGP domain.
I am not arguing against the mode you want at all!
I'm saying other modes are also useful, because not everyone wants to
minimise CPU churn. Some people are not running massive route-servers. You
want to optimise for one case only.
Note, the "UPDATE, oops, yours was better. WITHDRAW" case that GR can
avoid (whether peer-specific or global) can cause issues locally, because
local FIB update-loads from such transients can be a problem in
themselves. Including for us, for those who use Quagga for forwarding.
Or we can discuss how our community works overall.
Yes, please do kick that off.
regards,
--
Paul Jakma [email protected] @pjakma Key ID: 64A2FF6A
Fortune:
"If you are afraid of loneliness, don't marry."
-- Chekhov