I don't see anything substantially new in this email, and you did not respond to the Jul 19 thread refuting largely the same arguments. Your #1 in particular seems simply incorrect: as Tony Li pointed out, a CSNP will _never_ detect a checksum collision if the seqnr is the same, and hence the whole argument 'it's unreliable with this but it was reliable before' is a fallacy, especially once you do some probability math. Hence I allow myself to ignore this thread. As a mildly acerbic note, an _enormous_ amount of data is being sync'ed using Merkle trees in things like Dynamo DB that is holding up most of Amazon.
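For readers who want to do the probability math themselves, here is a minimal sketch. It assumes an idealized, uniformly distributed hash, which neither the Fletcher checksum nor any particular HSNP hash is guaranteed to be. The 16-bit width matches the IS-IS LSP checksum field; the 32- and 64-bit widths and the function names are hypothetical and are not taken from the draft.

```python
import math


def per_comparison_collision(bits: int) -> float:
    """Chance that two *different* inputs produce the same value,
    assuming an ideal uniform hash of the given width (an assumption)."""
    return 2.0 ** -bits


def any_undetected(bits: int, comparisons: int) -> float:
    """Chance that at least one of `comparisons` independent comparisons
    of differing data goes undetected: 1 - (1 - p)^n.

    Computed with log1p/expm1 so that very small p does not round to zero.
    """
    p = per_comparison_collision(bits)
    return -math.expm1(comparisons * math.log1p(-p))


if __name__ == "__main__":
    # 16 bits: width of the LSP checksum field.
    # 32 and 64 bits: hypothetical widths, for comparison only.
    for bits in (16, 32, 64):
        print(bits, per_comparison_collision(bits), any_undetected(bits, 1_000_000))
```

The same formula applies to both sides of the argument: a CSNP entry whose sequence number matches relies on the 16-bit checksum alone, so any comparison of the two schemes comes down to the hash width and the number of comparisons made.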
As to #2 and #3, it is up to the implementation to use such strategies and they don't need standardization, though AFAIS a 'further considerations' section could include such advice, albeit I doubt its value. E.g., bringing up an adjacency with a huge database (which is what this spec is aiming at) where 99% of the database is already the same (a flap) is a scenario that can benefit massively.

thanks

-- tony

On Mon, Oct 6, 2025 at 6:15 AM Les Ginsberg (ginsberg) <ginsberg= [email protected]> wrote:

> At a high level, this is the way I am viewing this draft.
>
> IS-IS has 100% reliable flooding defined in the base specification.
>
> As scale requirements increase, there has been an interest in optimizing flooding. See:
>
> https://datatracker.ietf.org/doc/rfc9667/
>
> https://datatracker.ietf.org/doc/draft-ietf-lsr-isis-flood-reduction-arch/
>
> Flooding optimizations introduce the possibility that the reliability of flooding may be compromised. Therefore, the use of CSNPs is of interest as this can restore the 100% reliable flooding guarantee while allowing flooding optimizations to be deployed. However, at scale, the size of a complete set of CSNPs can become large, so there is interest in finding a way to reduce the cost of using CSNPs.
>
> This draft is a proposal which can significantly reduce the number and size of PDUs required to convey a summary of the state of the LSPDB (which is what CSNPs do today). However, it does not provide a guarantee that all LSPDB inconsistencies will be detected.
>
> I do not believe we are (or should be) in the business of defining solutions which work "most of the time".
>
> I cannot support this proposal.
>
> Below are specific issues I see in the proposed solution. But my objections are fundamentally about not being able to provide 100% reliability. Addressing the issues below will not alter my opinion unless it also provides 100% reliability.
>
> Issue #1: Unreliability
>
> The draft proposes to use a simple hash to summarize the state of a range of LSPs. The possibility of "hash collision" is not insignificant. When it occurs it will be undetectable - which compromises the reliability of the Update process.
>
> It has been mentioned that even the existing PDU checksum mechanism used by IS-IS (fletcher) can produce collisions - which is true. But in such a case, the raw data is still present in the PDU and can be used to detect LSPDB inconsistencies even in the presence of a checksum collision. In the HSNP proposal, because only a summary of the data is present it is not possible to detect or recover from a hash collision.
>
> Issue #2: Solution becomes less useful in the presence of LSPDB Differences
>
> The choice of system ID ranges to advertise in the HSNP is optimized for cases where the neighbors LSPDBs are mostly in sync. In the case of an established adjacency, this is likely to be true. But in the case of adjacency bringup this is less likely.
>
> If one neighbor has LSPs from nodes A, B, C and the other neighbor has not yet received any LSPs from B, then the choice of a system ID range greater than 1 is likely to trigger a hash mismatch and result in either flooding of LSPs from all nodes unnecessarily or require reversion to traditional CSNPs.
>
> This makes the solution unusable in the case of adjacency bringup - which is a case also worthy of optimization. A good solution to this issue should be usable both for adjacency bringup and periodic CSNPs.
>
> Issue #3: The solution degrades as scale (size of the LSPDB) increases
>
> When the LSPDBs are mismatched the likelihood of hash mismatches increases. Even in a stable network, there is a base level of LSP refresh flooding that occurs.
> Assuming an LSP lifetime of 65535 seconds and an LSP refresh time of 65000 seconds we can expect a base level of LSP updates as shown below:
>
> Size of LSPDB    Average LSP flooding rate
> -------------------------------------------
> 1000             1 LSP/65 seconds
> 10000            1 LSP/6.5 seconds
> 20000            1 LSP/3.25 seconds
> ...
>
> This means as scale increases, the likelihood that hash mismatches will occur increases. Even in the absence of any LSP flooding pathology this is likely to trigger redundant LSP flooding or a reversion to traditional SNPs.
>
> To overcome this, one could imagine a strategy that suppresses HSNPs when SRM bits are currently set on an interface - but as one of the primary use cases for HSNPs is in the presence of flooding optimizations where flooding is intentionally suppressed on some interfaces that strategy will not be applicable in such cases.
>
> Les
>
> _______________________________________________
> Lsr mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
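
The refresh-rate table quoted under Issue #3 can be reproduced with a short calculation. This is only a sketch of the arithmetic: it assumes every LSP is refreshed exactly once per refresh interval and that refreshes are spread evenly over time, and the function name is hypothetical.

```python
def seconds_between_refreshes(lspdb_size: int,
                              refresh_interval_s: float = 65000.0) -> float:
    """Average gap between LSP refreshes seen network-wide.

    Assumes each of `lspdb_size` LSPs is refreshed once per interval and
    that refreshes are evenly spread (real networks will be burstier).
    """
    return refresh_interval_s / lspdb_size


if __name__ == "__main__":
    # LSPDB sizes taken from the quoted table.
    for size in (1000, 10000, 20000):
        print(f"{size:>6} LSPs: 1 LSP/{seconds_between_refreshes(size):g} seconds")
```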
