First of all, I'm supportive of the work & I think it's of solid practical value, 
albeit strictly speaking it's not IETF territory (it's not necessary for 
interop).

The first point is very blunt: if you manage to really make all the routers in 
the area compute @ precisely the same time, you may not be doing yourself the 
favor you seek ;-)  What I mean is that generating perfectly synchronized peaks 
in a network tends to generate strange attractors; a good example was the 
synchronization of the HELLOs on all links over time, which had to be jittered. 
Peaks can stress infra unexpectedly & lead to e.g. synchronized 
re-advertisement of LSAs (or anything that SPF can trigger, now and in the 
future).  Given on top of that that a future SPF is not necessarily the 2-3 
msec SPF seen today (rLFA & such runs seem to be becoming the new flavor of 
SPF), I suggest including a small configurable jitter before the first SPF is 
triggered (a couple of msecs should do the trick, but I'm willing to hear the 
argument that flooding de-syncs the SPF runs enough already).

The other issue is far more subtle but may merit a section in the draft.  This 
work pushes the protocol in a very specific direction along the CAP 
paradigm, i.e. a link-state routing protocol is roughly:


1.      Always 100% partition-tolerant (P)

2.      Basically 100% available (A) (a tad hard to define given FIBs)

3.      _eventually_ consistent (C)

Now, it is fairly well understood that having all 3 is not possible across a 
very wide set of CS problems, and we are not exempt from that.  We cannot move 
P, so pushing on C will cause A to move to the negative.  What do I mean by 
that?  Triggering SPFs more aggressively will give you better 
consistency & availability in the scenario of a single link failure, if things 
go well.  But compared to e.g. a batching algorithm that computes every 500 
msecs without backing off, and which will show linear consistency & 
availability even in the case of fast link flapping, many links failing 
consecutively, and so on, exponential backoff will cause massively lower 
consistency after several link failures, and this network-wide, so certain 
people may lose big time when using it.  Besides that, the quick SPFs can block 
lots of other things in the protocol that are not parallelized, or block other 
protocols waiting for SPFs to finish, or the next SPF (2nd failure) can get 
stuck behind a running FIB download (all hypothetical, but availability in the 
widest sense will go down if you push for more consistency).  Again, the work 
is good, but such a section will show people that it's not a 'universal' 
improvement but something targeted at the ideally seldom-occurring one- or 
two-link failure.
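The backoff-vs-batching contrast is easy to see in numbers. A rough sketch (initial delay, multiplier, and cap are illustrative, not taken from the draft):

```python
def backoff_delays_ms(n_events, initial_ms=50, factor=2, max_ms=10000):
    """Per-event SPF delay under exponential backoff.

    Each successive trigger within a failure burst waits longer,
    up to a cap -- so late failures see much staler routing.
    """
    delays, d = [], initial_ms
    for _ in range(n_events):
        delays.append(d)
        d = min(d * factor, max_ms)
    return delays

def batching_delays_ms(n_events, interval_ms=500):
    """Per-event worst-case delay under fixed-interval batching:
    always at most one interval, regardless of how many failures
    came before."""
    return [interval_ms] * n_events
```

With these toy numbers, after eight consecutive failures the backoff scheme is waiting 6400 msecs before recomputing, while the flat batcher is still at 500 msecs for every event. That's the "massively lower consistency after several link failures" in miniature.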

Thanks

--- tony



"FUTURE, n.
That period of time in which our affairs prosper, our friends are true and our 
happiness is assured."
― Ambrose Bierce<http://www.goodreads.com/author/show/14403.Ambrose_Bierce>, 
The Unabridged Devil's Dictionary<http://www.goodreads.com/work/quotes/865289>

_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg