Hi Tony, I have rethought this topic and I take back my suggestions. Moreover, I tend to agree with you that supporting a single flooding algorithm at a given time would be sufficient.
I was a bit riding on a wave of another Tony, but to change the flooding algorithm in a safe way - say do it via provisioning system or even via leader** - the simplest solution would be to utilize the MI-ISIS feature. That way the new algorithm can be deployed safely in a new instance, tested, and then the instance feeding the RIB flipped. And that would never put the network back into a full flooding pattern, which is indeed a very valid concern. So with that said, I stand by my comment that flooding algorithms can be smoothly enabled by NETCONF, moving from full to (semi-)optimal flooding, by using multiple ISIS instances.

Cheers,
R.

PS. Of course sending and processing multiple copies of the same info is a waste. But with multithreaded ISIS implementations, do we have any sufficient data that dedicating an RE/RP CPU core to link-state flooding, plus adding sufficient RAM, is really a terribly burning problem? When Petr pushed BGP as IGP, the main issue was not ISIS scale. The main issue was the uniform availability of ISIS on off-the-shelf OEM platforms.

** I am very sceptical that the leader from vendor X will always work smoothly with all boxes from vendors Y, Z, W, Q ...

On Thu, Nov 21, 2024 at 12:49 PM Tony Przygienda <[email protected]> wrote:

> Robert, in case of using -prz- you don't have to go either way. You can
> just roll over node by node and you are guaranteed that the whole graph
> will be _sufficiently_ covered at all times, and both components will do
> their own "optimal" CDS, but as TonyL points out there is -always- a chance
> you will see much less than "optimal" flooding during such a transition, since
> on the component edges you have full flooding (optimal meaning here really
> "minimum number of links covering the CDS", I guess). That is from an OPEX
> standpoint probably the most appealing (though in _most_ very large networks
> the discussion is more about -double- coverage and avoiding articulations,
> IME, than anything about a "minimal number of links CDS").
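[Editor's note: a minimal sketch of the safety check implicit in "tested, then what goes to RIB flipped" above - before a new flooding algorithm goes live, verify that the link set it keeps still reaches every node. The function and topology names below are illustrative assumptions, not taken from any draft or implementation.]

```python
# Hedged sketch: check that a pruned flooding topology still spans all
# nodes (i.e. flooding is "sufficient") before flipping it into the RIB.
# covers_all_nodes and the toy topology are invented for illustration.
from collections import deque

def covers_all_nodes(nodes, kept_links):
    """True if the pruned flooding topology still spans all nodes."""
    adj = {n: [] for n in nodes}
    for a, b in kept_links:
        adj[a].append(b)
        adj[b].append(a)
    start = next(iter(nodes))
    seen, todo = {start}, deque([start])
    while todo:                      # plain BFS over the kept links
        for nb in adj[todo.popleft()]:
            if nb not in seen:
                seen.add(nb)
                todo.append(nb)
    return seen == set(nodes)

# Toy topology: two chains joined by three cross links.
nodes = {"A1", "A2", "A3", "B1", "B2", "B3"}
full = {("A1", "A2"), ("A2", "A3"), ("B1", "B2"), ("B2", "B3"),
        ("A1", "B1"), ("A2", "B2"), ("A3", "B3")}
pruned = full - {("A1", "B1")}           # candidate algorithm drops one link
print(covers_all_nodes(nodes, pruned))   # True - coverage preserved
```

The same check answers "would this pruning melt connectivity" for any candidate link set, which is the test one would want to pass in the new MI-ISIS instance before the flip.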
> Of course a large IP fabric and a large SP network are not the same thing,
> and different "optimality" may be desired ;-)
>
> If you don't want to mix algorithms at _any_ time then A is fine. More
> OPEX intensive of course, and at a certain point there is full flooding again
> (which may melt the network if it relies on flooding throughput achievable
> with reduction in place only). The leader solution (discarding all other
> considerations) simplifies it, since the rollover needs to touch every node
> just once and then the leader once, and the network never "fully floods",
> though the blast radius on a leader reconfig is the full network (all nodes
> fall over at the same time). This is a consideration for customers, as has
> been voiced, due to the increasing number of incidents in recent years -
> part of it being scale, part of it being automation, but often only possible
> due to a single node having a network-wide blast radius when being touched.
>
> Scenario B is juggling chainsaws. Some people are very good at it; those
> that are not are not at it for significant periods of time ;-)
>
> my 2c
>
> -- tony
>
> Footnote: the -prz- thingy allows to use the fact that a node _not_
> advertising any supported algorithm is a zero prunner, and hence that _may_
> be leveraged by the algorithm on the other side, but that's a different
> kettle of fish discussion.
>
> On Thu, Nov 21, 2024 at 10:47 AM Robert Raszuk <[email protected]> wrote:
>
>> Hi Tony,
>>
>> > And it's perfectly fine AFAIS if the WG decides that working on "multiple
>> > algorithm mix in the network" is not something to be pursued,
>>
>> Scenario - A:
>>
>> Are you saying that when there is a need for migration from one algorithm
>> to another, the recommended prescription would be to first migrate to full
>> flooding and then enable the 2nd algorithm, such that there is never a case
>> where two algorithms co-exist?
>>
>> Or
>>
>> Scenario - B:
>>
>> Are you saying that it would be fine for two algorithms to co-exist during
>> such a migration (limited time, under close operator control), in spite of
>> the chance of creating a single point of failure? If the latter, I presume a
>> good implementation will allow one to manually force flooding on a link in
>> spite of the algorithm suppressing it during such special migration times.
>>
>> Personally I would be in favor of scenario B.
>>
>> Thx,
>> Robert
>>
>> On Thu, Nov 21, 2024 at 10:16 AM Tony Przygienda <[email protected]> wrote:
>>
>>> Thanks Tony, good drill down. I see two points here:
>>>
>>> 1. The point I take here is that in the example the resulting prunner
>>> framework flooding covers the full graph, i.e. correctness as in
>>> "sufficient flooding" is still assured.
>>> 2. The solution may be _not_ optimal in terms of constructing a single
>>> CDS, i.e. on the boundaries basically full flooding is mandated by the
>>> prunner framework. Actually the most extreme case is where _every_ node in
>>> the network runs a different algorithm and the prunner framework says
>>> "well, flood on all links with a different algorithm on the other side".
>>> Then it all collapses into full flooding again.
>>>
>>> If that's my correct reading, then please observe that the -prz- draft
>>> does NOT state that in mixed-algorithm scenarios _optimal_ flooding in any
>>> sense is guaranteed (optimality here seems to mean "CDS with a minimal
>>> number of links"); it only says that the prunner framework will guarantee
>>> "sufficient" flooding to build an overall CDS, not less and not more. In
>>> fact that's the paragraph that is possibly a bit cryptic to most, saying
>>> that you'd need a "meta-prunner" algorithm for such stuff, or
>>> synchronization of the boundaries of a component (funny enough, the
>>> considerations in such a design start to be closely related to arbitrary
>>> hierarchy principles ;-).
>>> There are other considerations, but they become even more arcane, and
>>> AFAIS achieving an "optimal" CDS when components with multiple algorithms
>>> are mixed is in pragmatic terms not possible.
>>>
>>> So, if we agree that the prunner framework (i.e. mixing multiple
>>> algorithms) does guarantee "sufficient" flooding (i.e. a full CDS) but does
>>> NOT guarantee any "only necessary" flooding, then we're in sync. And it's
>>> perfectly fine AFAIS if the WG decides that working on "multiple algorithm
>>> mix in the network" is not something to be pursued; it will be sufficient
>>> then to e.g. extend the 97xx to provide leader-based and leaderless
>>> signalling as two options (just like there is centralized computed and
>>> distributed already) and say that "mixing both modes or multiple algorithms
>>> under leaderless is outside the scope of the document". Not every problem
>>> under the sun needs to be solved by a WG, and the practical implications of
>>> such scope limitations AFAIS are limited, since for practical purposes
>>> mixing limits blast radius on migrations and nothing else really, AFAIU ;-)
>>>
>>> So I guess I wasn't specific enough when I said that I don't see a
>>> counter-example for the -prz- framework not being correct. By correctness I
>>> always meant "any mix of algorithms being prunners in a network will always
>>> deliver _sufficient_ flooding" and did not imply any kind of flooding
>>> optimality.
>>>
>>> thanks
>>>
>>> --- tony
>>>
>>> On Wed, Nov 20, 2024 at 10:57 PM Tony Li <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Tony P. asked for a counter-example to why neighbor-only algorithm
>>>> information is sufficient. This email attempts to articulate just such an
>>>> example.
>>>>
>>>> Suppose that we have a bi-partite network, with two halves, A and B.
>>>> Part A contains nodes A1, A2, A3, …. Part B contains nodes B1, B2, B3, ….
>>>>
>>>> The two halves are connected by three links (A1, B1), (A2, B2), and
>>>> (A3, B3).
>>>>
>>>> The correct flooding topology in this situation is to select exactly
>>>> two of the three links. Selecting only one of the links would create a
>>>> single point of failure. Selecting all three links leads to unacceptable
>>>> and unnecessary flooding.
>>>>
>>>> Suppose that A1 and B1 are running algorithm X. All other nodes are
>>>> running algorithm Y.
>>>>
>>>> Suppose that under algorithm X, links 2 and 3 are selected. Therefore,
>>>> A1 and B1 choose to prune (A1, B1). Further, suppose that under algorithm
>>>> Y, links 1 and 2 are selected. Therefore, nodes A3 and B3 choose to prune
>>>> (A3, B3). Now, only (A2, B2) is selected, creating a single point of
>>>> failure.
>>>>
>>>> The key points here are simple:
>>>>
>>>> - An algorithm makes assumptions about how other nodes in the topology
>>>> are going to behave. If multiple algorithms are in play, those assumptions
>>>> may not hold.
>>>>
>>>> - Two concurrent algorithms, while each individually correct, can still
>>>> produce a flawed flooding topology if they are asked to interoperate.
>>>>
>>>> - Full flooding at the boundary between the algorithms is not
>>>> sufficient to correct the situation.
>>>>
>>>> Regards,
>>>> Tony
>>>>
>>>> _______________________________________________
>>>> Lsr mailing list -- [email protected]
>>>> To unsubscribe send an email to [email protected]
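[Editor's note: Tony Li's counter-example is small enough to check mechanically. A minimal sketch, with node names taken from the email; the `keeps` sets are one reading of "links 2 and 3 selected" for X and "links 1 and 2 selected" for Y, and the data structures are invented for illustration.]

```python
# Reconstruction of the bi-partite counter-example: two halves joined by
# three links, A1/B1 running algorithm X, everyone else algorithm Y.
inter_links = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]

keeps = {
    "X": {("A2", "B2"), ("A3", "B3")},  # X selects links 2 and 3, prunes link 1
    "Y": {("A1", "B1"), ("A2", "B2")},  # Y selects links 1 and 2, prunes link 3
}
algo = {"A1": "X", "B1": "X", "A2": "Y", "B2": "Y", "A3": "Y", "B3": "Y"}

# A link survives only if neither endpoint's algorithm prunes it.
surviving = [link for link in inter_links
             if link in keeps[algo[link[0]]] and link in keeps[algo[link[1]]]]

print(surviving)  # [('A2', 'B2')] - a single point of failure
```

Each algorithm on its own keeps two of the three links, which is correct; it is only the intersection of the two individually correct decisions that collapses to one link, which is exactly the email's point.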
