[Lsr] Re: Consensus Call on LSR WG work on "Leaderless Flooding Algorithm for Distributed Flood Reduction to allow reduced configuration, minimal blast radius, and ease of incremental deployment"

Tony Przygienda Tue, 19 Nov 2024 09:48:45 -0800

Hey Tony,

On Tue, Nov 19, 2024 at 5:28 PM Tony Li <[email protected]> wrote:


>
> Hi Tony,
>
> Please see inline.  I’ve taken the liberty of reordering issues.
>
>
> As to "optimal" flooding reduction, this is largely a rathole IMO. Any CDS
> will basically generate the same amount of copies (which is what we
> optimize for primarily IMO) and arguing that a certain CDS is better than
> "another" CDS will lead to all kind of discussions about speed of
> replication depending on fanout, CPU vs. bandwidth weighting and ultimate
> resiliency against temporary CDS partitioning in case of node failures. We
> had discussions behind the scenes to e.g. take into account link BW when
> computing the graph, however it seems that beside causing very significant
> amount of computation, especially on link failures (and I don't think FRR
> for flood reduction is a good idea BTW ;-) such optimizations are hard to
> quantify. Throw into the mix different configurations/dynamic flooding
> speeds per link for fast flooding and we basically have NP hard
> entertainment on our hand of barely any practical, pragmatic value.
>
>
>
> I agree that trying to objectively determine the “optimal” flooding
> reduction algorithm is a rathole.  There are many dimensions that an
> algorithm can and should be evaluated on.  These are articulated in RFC
> 9667, Section 3. Different networks may place different importance on each
> of these requirements, so discussing optimality without regard to a given
> network and management policies would be ultimately fruitless.  Rather, it
> would be good for each algorithm to articulate how it meets each of the
> requirements listed in that section.
>

We agree that an "objective" or "generic" function for evaluating
"optimality" is a rathole. Good ;-)

Otherwise yes, people can have different preferred ponies based on the
requirements driving their topology/solution space, and that speaks for
the relevance of different algorithms and/or signalling approaches. But to
be precise, it does not necessarily imply that people or the WG should care
about multiple algorithms/signalling running at the same time in the
network ;-) (As a footnote: anyone who wants to migrate from one
algorithm/version to another node by node, w/o first disabling the first
algorithm everywhere, will however end up with "multiple algorithms at the
same time in the network".)


>
> As to correctness of leaderless/leader based/any signalling I refer to
> draft
> https://datatracker.ietf.org/doc/draft-lsr-prz-interop-flood-reduction-architecture/
> and until a counter example is provided (or the assumptions therein
> challenged) the solution should allow to support a generic framework with
> mix of algorithms and signaling schemes (if such a need even arises) as
> long every node indicates what it is configured to do or runs.
>
>
>
> Several points here:
>
> 1) The need is present.  We have multiple shipping implementations that do
> not interoperate presently. Networks that adopt any one of them are
> effectively in a state of vendor lock until we have a selection mechanism.
> This is an unacceptable situation.
>
> 2) I find your draft extremely challenging to read. This is very likely a
> language issue.
>
> 3) Perhaps as a result of point 2, I am still not seeing a mechanism in
> this draft. I see no signalling whatsoever.
>

Well, this draft would be one possible place to put the signalling if/when
we agree to work on leaderless signalling. But as I wrote, let's first agree
whether we should work on leaderless signalling at all, and then look for a
good place for it: a disjoint experiment, joining the RFC 9667 experiment,
or whatever the WG productively agrees on as the placeholder and the
solution it wants to deliver.


>
> 4) Would you be willing to stipulate that two algorithms running
> simultaneously in the same network without regard to one another is
> unacceptable, as it is likely to cause gaps in flooding or massive
> over-flooding?
>

Hmm, since the first comment I get is that it's unreadable, here is the
kernel of the thing. In simple terms, the draft says that
algorithms/signalling can be arbitrarily mixed (again, if the need even
arises) and will guarantee flooding coverage as long as they are
"prunners", which requires 2 straightforward properties:

1. Any node running flood reduction needs to advertise either that it is
"configured to run an algorithm" or that it "is running a certain
algorithm", so other nodes know what they participate or can participate in.
2. Any adjacency with a different configured or running algorithm on the
other side needs to be fully flooded.

Obviously, within the "component", i.e. everyone signalling the same
algorithm, the algorithm MUST guarantee flooding coverage (CDS).
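
To make the two properties and the per-component requirement concrete, here
is a minimal Python sketch. It is an illustration only and is not taken from
the draft or from RFC 9667: the function names, the "algo1"/"algo2" labels,
the toy topology and the set of links each algorithm keeps are all
hypothetical, and treating a node that advertises nothing as "different" is
an assumption of the sketch.

```python
from collections import deque


def must_flood_fully(local_algo, neighbor_algo):
    """Property 2: flood fully unless both ends advertise the same algorithm.

    Assumption of this sketch: None means "advertises nothing", and such an
    adjacency is treated like a mismatch, i.e. it is fully flooded.
    """
    return local_algo is None or neighbor_algo is None or local_algo != neighbor_algo


def flooding_links(adjacencies, advertised, kept_by_algorithm):
    """Return the links that carry flooding.

    advertised:        node -> advertised algorithm (property 1), or None
    kept_by_algorithm: links each algorithm kept inside its own component
    """
    links = set()
    for a, b in adjacencies:
        link = frozenset((a, b))
        if must_flood_fully(advertised.get(a), advertised.get(b)) or link in kept_by_algorithm:
            links.add(link)
    return links


def reaches_all(nodes, links, source):
    """Breadth-first check that flooding from source reaches every node."""
    neighbors = {n: set() for n in nodes}
    for link in links:
        a, b = tuple(link)
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, queue = {source}, deque([source])
    while queue:
        for peer in neighbors[queue.popleft()]:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen == set(nodes)


# Hypothetical topology: two triangles joined by the C-D adjacency.
advertised = {"A": "algo1", "B": "algo1", "C": "algo1",
              "D": "algo2", "E": "algo2", "F": "algo2"}
adjacencies = [("A", "B"), ("B", "C"), ("A", "C"),
               ("C", "D"),
               ("D", "E"), ("E", "F"), ("D", "F")]
# Each algorithm prunes one link of its own triangle but keeps it connected.
kept = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("D", "E"), ("E", "F")]}

links = flooding_links(adjacencies, advertised, kept)
print(len(links), reaches_all(advertised, links, "A"))
```

In this toy case the C-D adjacency sits between two different advertised
algorithms, so it is always flooded, and coverage then follows from each
algorithm covering its own component.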

Obviously, AFAIS, if a node violates 1. and does not indicate what it is
configured to run or is running, then we have "ships in the night", and
ships in the night never interoperate. Nodes or leaders "assuming" that
nodes support the algorithm are basically ships in the night, which means a
single algorithm that is not a prunner will work fine on its own but won't
mix (and again, that's assuming we even care about the presence of multiple
algorithms/signalling running at the same time in the network).

Since you mention specific drafts: I don't really want to drag this thread
into solutions, since that's a follow-up depending on consensus here, but
AFAIS an RFC 9667 respin can actually fairly easily include the leaderless
mode with some clarifications (and possibly even w/o adding new sub-TLVs
;-). That would be further beneficial since the RFC is underspecified in
certain respects and possibly has defects as it stands, which we could then
roll up. Again, that's a discussion to be had if that's the path the WG
starts to pursue, and I'm happy to suggest a strawman of the necessary
changes then.

-- tony
