Hi Tony, I have rethought this topic and I take back my suggestions. Moreover, I tend to agree with you that supporting a single flooding algorithm at a given time would be sufficient.
I was a bit riding on a wave of another Tony, but to change the flooding algorithm in a safe way - say do it via provisioning system or even via leader** - the simplest solution would be to utilize the MI-ISIS feature. That way the new algorithm can be deployed safely in a new instance, tested, and then the instance feeding the RIB flipped. And that would never put the network back into a full flooding pattern, which is indeed a very valid concern. So with that said, I stand by my comment that flooding algorithms can be smoothly enabled by NETCONF, moving from full to (semi-)optimal flooding, by using multiple ISIS instances.

Cheers,
R.

PS. Of course sending and processing multiple copies of the same info is a waste. But with multithreaded ISIS implementations, do we have any sufficient data that dedicating an RE/RP CPU core to link-state flooding, plus adding sufficient RAM, is really a terribly burning problem? When Petr pushed BGP as IGP, the main issue was not ISIS scale. The main issue was the uniform availability of ISIS on off-the-shelf OEM platforms.

** I am very sceptical that the leader from vendor X will always work smoothly with all boxes from vendors Y, Z, W, Q ...

On Thu, Nov 21, 2024 at 12:49 PM Tony Przygienda <[email protected]> wrote:

> Robert, in case of using -prz- you don't have to go either way. You can
> just roll over node by node and you are guaranteed that the whole graph
> will be _sufficiently_ covered at all times, and both components will do
> their own "optimal" CDS, but as TonyL points out there is -always- a chance
> you will see much less than "optimal" flooding during such a transition, since
> on the component edges you have full flooding (optimal meaning here really
> "minimum number of links covering the CDS", I guess). That is from an OPEX
> standpoint probably the most appealing (though in _most_ very large networks
> the discussion is more about -double- coverage and avoiding articulations,
> IME, than anything about a "minimal number of links CDS").
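[Editor's note: a minimal sketch of the safety check implicit in "tested, then what goes to RIB flipped" above - before a new flooding algorithm goes live, verify that the link set it keeps still reaches every node. The function and topology names below are illustrative assumptions, not taken from any draft or implementation.]

```python
# Hedged sketch: check that a pruned flooding topology still spans all
# nodes (i.e. flooding is "sufficient") before flipping it into the RIB.
# covers_all_nodes and the toy topology are invented for illustration.
from collections import deque

def covers_all_nodes(nodes, kept_links):
    """True if the pruned flooding topology still spans all nodes."""
    adj = {n: [] for n in nodes}
    for a, b in kept_links:
        adj[a].append(b)
        adj[b].append(a)
    start = next(iter(nodes))
    seen, todo = {start}, deque([start])
    while todo:                      # plain BFS over the kept links
        for nb in adj[todo.popleft()]:
            if nb not in seen:
                seen.add(nb)
                todo.append(nb)
    return seen == set(nodes)

# Toy topology: two chains joined by three cross links.
nodes = {"A1", "A2", "A3", "B1", "B2", "B3"}
full = {("A1", "A2"), ("A2", "A3"), ("B1", "B2"), ("B2", "B3"),
        ("A1", "B1"), ("A2", "B2"), ("A3", "B3")}
pruned = full - {("A1", "B1")}           # candidate algorithm drops one link
print(covers_all_nodes(nodes, pruned))   # True - coverage preserved
```

The same check answers "would this pruning melt connectivity" for any candidate link set, which is the test one would want to pass in the new MI-ISIS instance before the flip.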
> Of course a large IP fabric and a large SP network are not the same thing,
> and different "optimality" may be desired ;-)
>
> If you don't want to mix algorithms at _any_ time then A is fine. More
> OPEX intensive of course, and at a certain point there is full flooding again
> (which may melt the network if it relies on flooding throughput achievable
> with reduction in place only). The leader solution (discarding all other
> considerations) simplifies it, since the rollover needs to touch every node
> just once and then the leader once, and the network never "fully floods",
> though the blast radius on a leader reconfig is the full network (all nodes
> fall over at the same time). This is a consideration for customers, as has
> been voiced, due to the increasing number of incidents in recent years -
> part of it being scale, part of it being automation, but often only possible
> due to a single node having a network-wide blast radius when being touched.
>
> Scenario B is juggling chainsaws. Some people are very good at it; those
> that are not are not at it for significant periods of time ;-)
>
> my 2c
>
> -- tony
>
> Footnote: the -prz- thingy allows to use the fact that a node _not_
> advertising any supported algorithm is a zero prunner, and hence that _may_
> be leveraged by the algorithm on the other side, but that's a different
> kettle of fish discussion.
>
> On Thu, Nov 21, 2024 at 10:47 AM Robert Raszuk <[email protected]> wrote:
>
>> Hi Tony,
>>
>> > And it's perfectly fine AFAIS if the WG decides that working on "multiple
>> > algorithm mix in the network" is not something to be pursued,
>>
>> Scenario - A:
>>
>> Are you saying that when there is a need for migration from one algorithm
>> to another, the recommended prescription would be to first migrate to full
>> flooding and then enable the 2nd algorithm, such that there is never a case
>> where two algorithms co-exist?
>>
>> Or
>>
>> Scenario - B:
>>
>> Are you saying that it would be fine for two algorithms to co-exist during
>> such a migration (limited time, under close operator control), in spite of
>> the chance of creating a single point of failure? If the latter, I presume a
>> good implementation will allow one to manually force flooding on a link in
>> spite of the algorithm suppressing it during such special migration times.
>>
>> Personally I would be in favor of scenario B.
>>
>> Thx,
>> Robert
>>
>> On Thu, Nov 21, 2024 at 10:16 AM Tony Przygienda <[email protected]> wrote:
>>
>>> Thanks Tony, good drill down. I see two points here:
>>>
>>> 1. The point I take here is that in the example the resulting prunner
>>> framework flooding covers the full graph, i.e. correctness as in
>>> "sufficient flooding" is still assured.
>>> 2. The solution may be _not_ optimal in terms of constructing a single
>>> CDS, i.e. on the boundaries basically full flooding is mandated by the
>>> prunner framework. Actually the most extreme case is where _every_ node in
>>> the network runs a different algorithm and the prunner framework says
>>> "well, flood on all links with a different algorithm on the other side".
>>> Then it all collapses into full flooding again.
>>>
>>> If that's my correct reading, then please observe that the -prz- draft
>>> does NOT state that in mixed-algorithm scenarios _optimal_ flooding in any
>>> sense is guaranteed (optimality here seems to mean "CDS with a minimal
>>> number of links"); it only says that the prunner framework will guarantee
>>> "sufficient" flooding to build an overall CDS, not less and not more. In
>>> fact that's the paragraph that is possibly a bit cryptic to most, saying
>>> that you'd need a "meta-prunner" algorithm for such stuff, or
>>> synchronization of the boundaries of a component (funny enough, the
>>> considerations in such a design start to be closely related to arbitrary
>>> hierarchy principles ;-).
>>> There are other considerations, but they become even more arcane, and
>>> AFAIS achieving an "optimal" CDS when components with multiple algorithms
>>> are mixed is in pragmatic terms not possible.
>>>
>>> So, if we agree that the prunner framework (i.e. mixing multiple
>>> algorithms) does guarantee "sufficient" flooding (i.e. a full CDS) but does
>>> NOT guarantee any "only necessary" flooding, then we're in sync. And it's
>>> perfectly fine AFAIS if the WG decides that working on "multiple algorithm
>>> mix in the network" is not something to be pursued; it will be sufficient
>>> then to e.g. extend the 97xx to provide leader-based and leaderless
>>> signalling as two options (just like there is centralized computed and
>>> distributed already) and say that "mixing both modes or multiple algorithms
>>> under leaderless is outside the scope of the document". Not every problem
>>> under the sun needs to be solved by a WG, and the practical implications of
>>> such scope limitations AFAIS are limited, since for practical purposes
>>> mixing limits blast radius on migrations and nothing else really, AFAIU ;-)
>>>
>>> So I guess I wasn't specific enough when I said that I don't see a
>>> counter-example for the -prz- framework not being correct. By correctness I
>>> always meant "any mix of algorithms being prunners in a network will always
>>> deliver _sufficient_ flooding" and did not imply any kind of flooding
>>> optimality.
>>>
>>> thanks
>>>
>>> --- tony
>>>
>>> On Wed, Nov 20, 2024 at 10:57 PM Tony Li <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Tony P. asked for a counter-example to why neighbor-only algorithm
>>>> information is sufficient. This email attempts to articulate just such an
>>>> example.
>>>>
>>>> Suppose that we have a bi-partite network, with two halves, A and B.
>>>> Part A contains nodes A1, A2, A3, …. Part B contains nodes B1, B2, B3, ….
>>>>
>>>> The two halves are connected by three links (A1, B1), (A2, B2), and
>>>> (A3, B3).
>>>>
>>>> The correct flooding topology in this situation is to select exactly
>>>> two of the three links. Selecting only one of the links would create a
>>>> single point of failure. Selecting all three links leads to unacceptable
>>>> and unnecessary flooding.
>>>>
>>>> Suppose that A1 and B1 are running algorithm X. All other nodes are
>>>> running algorithm Y.
>>>>
>>>> Suppose that under algorithm X, links 2 and 3 are selected. Therefore,
>>>> A1 and B1 choose to prune (A1, B1). Further, suppose that under algorithm
>>>> Y, links 1 and 2 are selected. Therefore, nodes A3 and B3 choose to prune
>>>> (A3, B3). Now, only (A2, B2) is selected, creating a single point of
>>>> failure.
>>>>
>>>> The key points here are simple:
>>>>
>>>> - An algorithm makes assumptions about how other nodes in the topology
>>>> are going to behave. If multiple algorithms are in play, those assumptions
>>>> may not hold.
>>>>
>>>> - Two concurrent algorithms, while each individually correct, can still
>>>> produce a flawed flooding topology if they are asked to interoperate.
>>>>
>>>> - Full flooding at the boundary between the algorithms is not
>>>> sufficient to correct the situation.
>>>>
>>>> Regards,
>>>> Tony
>>>>
>>>> _______________________________________________
>>>> Lsr mailing list -- [email protected]
>>>> To unsubscribe send an email to [email protected]
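[Editor's note: Tony Li's counter-example is small enough to check mechanically. A minimal sketch, with node names taken from the email; the `keeps` sets are one reading of "links 2 and 3 selected" for X and "links 1 and 2 selected" for Y, and the data structures are invented for illustration.]

```python
# Reconstruction of the bi-partite counter-example: two halves joined by
# three links, A1/B1 running algorithm X, everyone else algorithm Y.
inter_links = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]

keeps = {
    "X": {("A2", "B2"), ("A3", "B3")},  # X selects links 2 and 3, prunes link 1
    "Y": {("A1", "B1"), ("A2", "B2")},  # Y selects links 1 and 2, prunes link 3
}
algo = {"A1": "X", "B1": "X", "A2": "Y", "B2": "Y", "A3": "Y", "B3": "Y"}

# A link survives only if neither endpoint's algorithm prunes it.
surviving = [link for link in inter_links
             if link in keeps[algo[link[0]]] and link in keeps[algo[link[1]]]]

print(surviving)  # [('A2', 'B2')] - a single point of failure
```

Each algorithm on its own keeps two of the three links, which is correct; it is only the intersection of the two individually correct decisions that collapses to one link, which is exactly the email's point.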
