Hey Renato, great to see a new implementation coming up and results that justify the work. And as usual when implementing, you found the interesting questions that are partially implied and partially omitted in the draft ;-) Rest inline.
On Wed, Nov 5, 2025 at 3:54 AM Renato Westphal <[email protected]> wrote:

> Hi all,
>
> I really like the idea behind this draft and decided to build a prototype
> implementation. In my tests, I observed an 80%-90% reduction in flooding
> volume across different Clos topologies, which is pretty nice given how
> simple these computations are and how most of them can be cached in advance.
>
> To provide a concrete example, I tested the following three-tier Clos
> topology:
> * super spines: 4
> * pods: 12
> * spines per pod: 4
> * leaves per pod: 16
> * total number of nodes: 244
>
> Using vanilla IS-IS, any LSP update in a leaf node causes around 1680
> refloods across the network. With the proposed algorithm, the number drops
> to 280, nearly one per node.
>
> Now I have a few questions:
>
> 1 - The technique used in this draft relies heavily on computing truncated
> SPTs from the perspective of neighbors, using hop count as the metric. One
> thing that is unclear is how networks with multiple topologies (RFC 5120)
> are handled. Defaulting to the standard topology (MT ID #0) could break
> flooding if that topology does not cover the entire network (aside from the
> CSNP fallback).
>

Good observation, but overthought a bit ;-) Flooding in MT always happens "on all the topologies" regardless, and hence we can disregard it for the purpose of this draft. Or simpler: RFC 5120 does NOT filter any flooding based on topologies, and hence the reduction can disregard it. Merits a small note in the draft maybe.

> 2 - In Section 1.2.3, step 1.C of the algorithm describes iterating over
> all IS nodes two hops away from TN and checking whether each node is on the
> shortest path from TN to the LSP originator. How can that check be
> performed if the SPT from the perspective of TN is truncated to two hops?
>

The SPT truncated to two hops is only enough for rule one.
Rule two says: "The second stage is simpler, consisting of a single rule: do not flood modified LSPs along any of the shortest paths towards the origin of the modified LSP." That does in fact imply an SPT from the view of the originator. Anything else will not lead to a full reduction (Shraddha may chime in since she had a good number of examples) and will overflood. I think she also had an example where flooding could actually fail to cover the whole graph if the full SPT from the originator is not computed. I assume she will answer in here further.

> 3 - Section 1.2.7 states:
>
> "An implementation should pay particular attention that the case of a
> stale LSP with a higher version that persists in the network still works
> correctly in case the originator reboots and starts with lower version.
> Though the flooding of an LSP back to originator is suppressed by this
> extension the normal PSNP and CSNP procedures should trigger re-origination
> by the source of a higher version correctly".
>
> I don't quite follow this paragraph. If the received self-originated LSP
> exists in the database and the received LSP is considered more recent, the
> local IS will update the sequence number of the database LSP and start a
> new flood, which differs from a reflood and is not affected by the
> optimizations in this draft. Have I misunderstood what the actual problem
> is?
>

With rule 2 you would NOT flood back at the originator, and with that the LSDB would not get fixed by the originator issuing a higher sequence number. Acee mentioned the problem obliquely, and having thought it through, it needs the special paragraph. The problem persists recursively in case the old version is somewhere further in the network and rule 2 prevents "back flooding".
We saw it BTW in RIFT as well, in heavy testing breaking links. It was caused by a mix of reduction and flooding scopes and needed the corresponding rules, but then I forgot all about it ;-) In fact, this flood reduction is just a more generic form of RIFT flood reduction, and both are largely children of MANET work.

> 4 - In section 1.2.3, there are two references to "move to Step 5", but
> that step no longer exists. I checked and it was removed in version -07 of
> the draft. Was it removed by accident?
>

My bad. xml2rfc probably changed and went to letters. I need to put a proper reference in the XML source, I guess.

> 5 - draft-prz-lsr-interop-flood-reduction-architecture-01 defines the
> framework for flooding reduction algorithms on which this document is
> based. It would be beneficial to reference that draft somewhere in the text
> to give readers additional context on distributed flooding reduction
> mechanisms.
>

I'm fine either way. To move forward, this draft now focuses on the algorithm only, and it's probably better to reference from the architecture draft to this one as a specific example. I'll see whether there will be further comments on this on the list.

> Best regards,
> --
> Renato Westphal
> _______________________________________________
> Lsr mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>
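
PS: for anyone following the rule-two discussion, here is a rough sketch of one way to read the quoted rule ("do not flood modified LSPs along any of the shortest paths towards the origin"). It is not the draft's algorithm, only an illustration under stated assumptions: hop count is the metric, the graph is connected, and the names `FABRIC`, `hop_counts` and `reflood_targets` are hypothetical. It shows why the check needs distances computed from the originator, and why nothing ever floods back at the originator.

```python
from collections import deque

# Hypothetical toy two-tier fabric: three leaves, two spines, full mesh between tiers.
FABRIC = {
    "L1": {"S1", "S2"},
    "L2": {"S1", "S2"},
    "L3": {"S1", "S2"},
    "S1": {"L1", "L2", "L3"},
    "S2": {"L1", "L2", "L3"},
}

def hop_counts(adj, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def reflood_targets(adj, node, received_from, originator):
    """Neighbors `node` still refloods to under this reading of rule two."""
    dist = hop_counts(adj, originator)  # distances as seen from the originator
    targets = []
    for nbr in sorted(adj[node]):
        if nbr == received_from:
            continue  # never send the LSP back to the neighbor it came from
        if dist[nbr] < dist[node]:
            continue  # nbr is one hop closer to the originator: on a shortest path towards it
        targets.append(nbr)
    return targets

# L1 originates. S1 got the LSP directly from L1 and passes it on to the other leaves.
print(reflood_targets(FABRIC, "S1", "L1", "L1"))  # ['L2', 'L3']
# L2 got it from S1; S2 is closer to L1, so L2 does not reflood to it.
print(reflood_targets(FABRIC, "L2", "S1", "L1"))  # []
# Even if S2 heard it from L2, it would not flood back at the originator L1.
print(reflood_targets(FABRIC, "S2", "L2", "L1"))  # ['L3']
```

The last case is the "back flooding" suppression discussed under question 3: since the originator is always the closest node to itself, this filter never selects it, so a stale higher-version LSP has to be caught by PSNP/CSNP instead.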
