First of all, I'm supportive of the work and I think it's of solid practical value, albeit strictly speaking it's not IETF territory (it's not strictly necessary for interop).
My first comment is very blunt: if you really manage to make all the routers in the area compute at precisely the same time, you may not be doing yourself the favor you seek ;-) What I mean is that generating perfectly synchronized peaks in a network tends to generate strange attractors; a good example was the synchronization of HELLOs on all links over time, which had to be jittered. Peaks can stress the infrastructure unexpectedly and lead to, e.g., synchronized re-advertisement of LSAs (or anything else that SPF can trigger now and in the future). Given, on top of that, that an SPF in the future is not necessarily the 2-3 msec SPF seen today (rLFA and similar runs seem to be becoming the new flavor of SPF), I suggest including a small configurable jitter before the first SPF is triggered (a couple of msecs should do the trick, but I'm willing to hear the argument that flooding already de-syncs the SPF runs enough).

The other issue is far more subtle but may merit a section in the draft. This work is pushing the protocol in a very specific direction along the CAP paradigm, i.e. a link-state routing protocol is roughly

1. Always 100% P (partitioned)
2. Basically 100% available A (a tad hard to define given FIBs)
3. _eventually_ consistent C

Now, it is fairly well understood that having all 3 is not possible across a very wide set of CS problems, and we are not exempt from that. We cannot move P, so pushing on the C will cause A to move to the negative.

Now, what do I mean by that? Triggering the SPFs more aggressively will give you better consistency & availability in the scenario of a single link failure if things go well. But compare that to, e.g., a batching algorithm that computes every 500 msecs without backing off and shows linear consistency & availability even with fast link flapping, many links failing consecutively and so on. Exponential backoff will cause massively lower consistency after several link failures, and this network-wide, so certain people may lose big time when using it.
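To make both points a bit more concrete, here is a rough Python sketch, a toy model rather than the draft's actual algorithm. It assumes a generic doubling backoff; the function names and the parameters (initial_ms, max_ms, reset_ms, interval_ms, max_jitter_ms) are all made up for illustration. For each topology event it computes how long the event waits before an SPF run picks it up, once with a fixed 500 msec batching tick and once with the doubling backoff, and it shows where a small jitter on the first delay would slot in.

```python
import random


def jittered_initial_delay(initial_ms, max_jitter_ms, rng=random):
    """Hypothetical first-SPF delay: base delay plus a small random jitter."""
    return initial_ms + rng.uniform(0, max_jitter_ms)


def fixed_batch_waits(events_ms, interval_ms=500):
    """SPF runs on a fixed tick; each event waits for the next tick."""
    return [(t // interval_ms + 1) * interval_ms - t for t in sorted(events_ms)]


def backoff_waits(events_ms, initial_ms=50, max_ms=10_000, reset_ms=5_000):
    """Toy doubling backoff (an assumption, not the draft's state machine).

    The first event after a quiet period schedules an SPF after initial_ms.
    Every later SPF scheduled before the network has been quiet for reset_ms
    doubles the delay, capped at max_ms.
    """
    waits = []
    scheduled = None  # time of the most recently scheduled SPF run
    delay = initial_ms
    for t in sorted(events_ms):
        if scheduled is not None and t <= scheduled:
            # An SPF is still pending; this event is batched into it.
            waits.append(scheduled - t)
            continue
        if scheduled is None or t - scheduled >= reset_ms:
            delay = initial_ms  # quiet long enough: start over
        else:
            delay = min(delay * 2, max_ms)  # still unstable: back off
        scheduled = t + delay
        waits.append(delay)
    return waits


if __name__ == "__main__":
    single_failure = [0]
    flapping = list(range(0, 3000, 300))  # one event every 300 msec
    for name, events in (("single", single_failure), ("flapping", flapping)):
        print(name, "fixed  :", max(fixed_batch_waits(events)))
        print(name, "backoff:", max(backoff_waits(events)))
    print("jittered first delay:", jittered_initial_delay(50, 5))
```

In this toy model the single failure is served much faster by the backoff scheme, while under sustained flapping the worst-case wait of the fixed tick stays bounded by the interval and the backoff wait keeps growing towards max_ms.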
Besides that, the quick SPFs can block lots of other things in the protocol that are not parallelized, block other protocols waiting for SPFs to finish, or leave the next SPF (2nd failure) stuck behind a FIB download that is still running (all hypothetical, but availability in the widest sense will go down as you push for more consistency). Again, the work is good, but the section will show people that it's not a 'universal' improvement but something tuned ideally for a seldom-occurring failure of 1 or 2 links.

Thanks

--- tony

"FUTURE, n. That period of time in which our affairs prosper, our friends are true and our happiness is assured." ― Ambrose Bierce<http://www.goodreads.com/author/show/14403.Ambrose_Bierce>, The Unabridged Devil's Dictionary<http://www.goodreads.com/work/quotes/865289>
