On Fri, 15 May 2015, David Lamparter wrote:

On Thu, May 14, 2015 at 10:45:58PM +0100, Paul Jakma wrote:
a) GR peers that are both in update-delay just immediately send EoR to
    each other, and so go out of update-delay mode with each other.

This is incorrect; you seem to have forgotten update-delay is global.

No I didn't, it is the global aspect of delaying all UPDATEs that I have the issue with.

Each side will stay in update-delay until *all* of its peers are in an
acceptable state, where
acceptable := EoR || R=1 || (NoGR && Keepalive)
or alternatively the timer expires.

Yes, understood.
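
To pin down the condition quoted above, here is a minimal sketch in Python. It is not the bgpd code: the Peer class, its field names, and both function names are made up for illustration, and the timer is reduced to a boolean.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    # All field names here are hypothetical, chosen for this sketch only.
    gr_capable: bool                  # peer advertised the GR capability
    restart_bit: bool = False         # R=1 seen in the peer's GR capability
    eor_received: bool = False        # End-of-RIB marker received
    keepalive_received: bool = False  # a Keepalive received since Established


def acceptable(peer: Peer) -> bool:
    # acceptable := EoR || R=1 || (NoGR && Keepalive)
    return (
        peer.eor_received
        or peer.restart_bit
        or (not peer.gr_capable and peer.keepalive_received)
    )


def in_update_delay(peers: list[Peer], timer_expired: bool) -> bool:
    # Global condition: stay in update-delay until *all* peers are
    # acceptable, or until the timer expires.
    return not timer_expired and not all(acceptable(p) for p in peers)
```

Note that one slow peer keeps the whole box in update-delay, which is the global aspect under discussion.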

b) Non-GR peers sit in update-delay mode until one or both send keepalives
    to kick the other out of update-delay mode.

Again, they stay in update-delay mode until all peers globally are in an acceptable state, or the timer expires. The "look for Keepalive instead of EoR" thing seems to be a Cumulus invention, we could strip that out if needed. It's not a core aspect of the functionality, really.

The keepalive thing is sort of neat, though it kind of depends on what you're optimising for.

My issue with it is that this is optimising for one specific case. One router restarts (R in the centre), and this is optimising for where all its neighbours are stable (stable in the uptime sense, not the "quiescent RIB" sense; RIB-stability and uptime-stability may well be inversely related):

 S S S
  \|/
   R
  /|\
 S S S

I.e. one router has restarted.

More generally, the restarted router has 0 or more non-restarted peers and 0 or more restarted peers:

   0...
  S S S
   \|/
    R
   / \
  R   R
   0...

To minimise the impact of the restarted routing domain on the stable domain, you would want the restarted domain to converge fully first (internally and on the information from the stable domain), and only then allow routing information to go from the restarted domain to the stable domain. That'd be hard to do, even with free rein to change BGP. So you can only consider local information.

Locally, you want to send updates to other routers in the R-domain as quickly as possible, and delay sending to the S-domain as long as possible.

Stopping all RIB processing is fine in the first case (you could argue this single-router R-domain is more common, and it's fine for route-servers), but it's less good in general for routing convergence.

What I would like is to defer UPDATEs only from the R-domain to S-domain peers. This is what the current code does.
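
As a rough sketch of that per-peer policy (Python, not the bgpd code): the function and parameter names are hypothetical, "peer is restarting" is assumed to mean the peer indicated R=1, and local_converged stands in for whatever local condition ends the delay, which this sketch does not try to define.

```python
def defer_updates_to(peer_is_restarting: bool,
                     local_is_restarting: bool,
                     local_converged: bool) -> bool:
    """Return True if outgoing UPDATEs to this peer should be held back.

    Only the R-domain -> S-domain direction is deferred:
    - a stable (non-restarted) router never defers;
    - a restarted router sends to other restarted peers straight away;
    - a restarted router holds UPDATEs to stable peers until it
      considers itself converged (hypothetical flag).
    """
    if not local_is_restarting:
        return False
    if peer_is_restarting:
        return False
    return not local_converged
```

The decision is made per peer rather than once per box, so R<->R sessions keep exchanging routes while the S-domain is shielded.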

With the global update-delay, CPU churn is being traded off for worse convergence (I had one other concern about it, I initially thought it was adding a queue - but that wasn't the case). I'm not sure that that is a trade-off that everyone wants. I also dislike configurable options - we have lots already. It'd be better to "do the right thing" as much as possible.

What I've been trying to do is explore how to get there.

(With Non-GR, we don't really know when the peer is done sending us their full table... using keepalives for that seems, hm, "innovative", though I agree with you it might turn out to be a bad idea. Should investigate this further.)

It's a neat trick, if you want to wait. Not everyone wants their routers to "wait and see" though, sometimes they want them to get back to passing UPDATEs ASAP.

Uh, no, they don't. The patch creates a global per-box M/L state; a box starts out believing it is in L and transitions into M based on above "acceptable" condition on *all* peers, or timer expiry. As long as we're L, we don't send any outgoing updates as an automatic side-effect of not running bestpath selection.

And that is precisely the issue.

Nothing gets sent globally, even though it would make sense to still send where both sides are equal. The L is a /relative/ thing - not absolute.

Please reread the patchset; bgp_update_restarted_peers() handles this by not waiting for peers that indicate R=1, thus taking care of a L<>L session. Also, a peer in M will never wait. There's your asymmetry ;)

Ok, so it doesn't include them in the calculation. It still defers sending updates to them, needlessly - if "fast convergence, while minimising disruption to the network that stayed up" is a goal.

This patchset is pretty exactly RFC 4724, aside from the Non-GR Keepalive thing. Peers in M will never wait, thus "go first". Peers in L will wait until they think they have a reasonable view of the network. That includes not waiting for other peers in L.

If delaying, it sends EoR immediately though, doesn't it? That's not RFC GR, and that needn't play nice with other implementations.

Yeah, maybe we should do that...  instead of arguing higher authority...

I'm not arguing to higher authority. I'm asking that you stop treating me as if I'm an idiot, and at least _listen_ to me.

E.g.:

- You NACK my patch to remove the startup timer and make the R-bit be
  dependent on state (any state is better than that damn timer - and
  I was perfectly happy to refine exactly what state, as discussed via
  IRC) with:

  "The R bit is intended to be based on wallclock time."

  After a (good) off-list discussion, and me posting a follow-up to
  address a concern you raised, which you reply to again arguing the need
  for a timer, you then post your own patch which, gosh, goes and removes
  the startup timer and makes the R-bit timer be, gosh, dependent on
  state, just as I had argued.

  WTF dude? Is that working together?

- We had a (productive I thought) discussion on IRC. We didn't resolve
  everything, but I thought we were going in the right direction, and I
  thought it became clear that we had different use-cases in mind. That
  you were concerned with minimising CPU churn on the restarted router,
  while I was concerned with minimising transient churn on the
  non-restarted router.

  The conclusion of that seemed to be for me to post a patch to fix the
  concrete issue you raised (restricting to the startup peers) and we
  could discuss further. I did that, you reply again with NACK on all the
  patches. WTF, how is that working together?

regards,
--
Paul Jakma      [email protected]  @pjakma Key ID: 64A2FF6A
Fortune:
You're never too old to become younger.
                -- Mae West
