instead of reply-all

---------- Forwarded message ---------
From: Tony Przygienda <[email protected]>
Date: Fri, Jul 18, 2025 at 7:40 PM
Subject: Re: [Lsr] Re: Review comments for draft-prz-lsr-hierarchical-snps-00: High Level Concerns
To: Les Ginsberg (ginsberg) <[email protected]>
On Fri, Jul 18, 2025 at 7:14 PM Les Ginsberg (ginsberg) <[email protected]> wrote:

> Tony –
>
> Thanx for the quick response.
> Please see inline.
>
> *From:* Tony Li <[email protected]>
> *Sent:* Friday, July 18, 2025 12:45 AM
> *To:* Les Ginsberg (ginsberg) <[email protected]>
> *Cc:* [email protected]; lsr <[email protected]>
> *Subject:* Re: [Lsr] Review comments for draft-prz-lsr-hierarchical-snps-00: High Level Concerns
>
> Hi Les,
>
> *1) The uniqueness of the calculated hash is an essential component for this to work. Given that you are using a simple XOR on a 64-bit number - and then "compressing" it to 32 bits for advertisement - uniqueness is NOT guaranteed. The danger of false positives (i.e., hashes that match when they should not) would compromise the solution. Can you provide more detail on the efficacy of the hash?*
>
> I’m sorry, you’re a bit confused here. We do NOT need uniqueness of the hash. In fact, one of the essential properties of all hashes is that they are not unique. Multiple inputs will always produce hash collisions. This is necessarily true: the size of the input is larger than the size of the output. Information is necessarily lost.
>
> This is already true for the Fletcher checksum that is used as part of CSNPs.
>
> What we do want is to ensure that the hashing function is sensitive to the inputs. That is, for a small change in the input, there is a change in the hash value.
>
> Since we are not doing security here, we do NOT care about the ability to compute a hash collision.
>
> That said, I don’t think that we are particularly sensitive to the specific hashing function. My personal preference would be to continue to use the Fletcher checksum just because the code is already there in all implementations. One could also reasonably use CRC-16, CRC-32, etc.
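[Editor's note: for readers unfamiliar with the Fletcher checksum referenced above, here is a minimal Fletcher-16 sketch. This is an illustration only — IS-IS LSPs use a checksum from this family over the LSP contents, but the exact variant and byte handling are defined by ISO 10589, not by this sketch.]

```python
def fletcher16(data: bytes) -> int:
    """Fletcher-16 over a byte string (illustrative only; the IS-IS
    LSP checksum is the variant defined in ISO 10589)."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255   # running sum of bytes
        sum2 = (sum2 + sum1) % 255   # running sum of sums -> position-sensitive
    return (sum2 << 8) | sum1

# A 16-bit output over arbitrary-length input necessarily collides for
# some distinct inputs -- exactly the "information is lost" point above.
print(hex(fletcher16(b"abcde")))  # -> 0xc8f0
```

The second running sum is what makes Fletcher position-sensitive (unlike a plain byte sum), which is the "sensitive to the inputs" property Tony Li asks for.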
> *[LES:] Let’s use a very simple example.*
>
> *A and B are neighbors.*
> *For LSPs originated by node C, here is the current state of the LSPDB:*
>
> *A has (C.00-00 (Seq 10), C.00-01 (Seq 8), C.00-02 (Seq 7)) Merkle hash: 0xABCD*
> *B has (C.00-00 (Seq 10), C.00-01 (Seq 9), C.00-02 (Seq 6)) Merkle hash: 0xABCD*
> *(unlikely that the hashes match - but possible)*
>
> *When A and B exchange hash TLVs they will think they have the same set of LSPs originated by C even though they don’t.*
> *They would clear any SRM bits currently set to send updated LSPs received from C on the interface connecting A-B.*
> *We have just broken the reliability of the update process.*
>
> *The analogy to the use of the Fletcher checksum on PDU contents is not a good one. The checksum allows a receiver to determine whether any bit errors occurred in the transmission. If a bit error occurs and is undetected by the checksum, that is bad – but it just means that a few bits in the data are wrong – not that we are missing the entire LSP.*
>
> *I appreciate there is no magic here – but I think we can easily agree that improving scalability at the expense of reliability is not a tradeoff we can accept.*

Well, we already have this problem today, as I described. The more stuff the hash/checksum covers, the more likely it becomes, of course, that hashes collide. The only way to do better here is to distribute bigger or more hashes/checksums.
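[Editor's note: Les's false-positive scenario can be made quantitative. The sketch below is my own illustration, not the draft's actual hash: it XOR-combines a 64-bit per-LSP hash and folds the result down, then shows the birthday effect — with enough distinct databases, two of them compare "equal". A 16-bit fold is used here purely so the demo is tiny; with the 32-bit aggregate under discussion the same effect appears after roughly 2^16 distinct database states.]

```python
import hashlib
import random

def lsp_hash(lsp_id: str, seq: int) -> int:
    """64-bit hash of one LSP header entry (SHA-256 prefix is an
    arbitrary illustrative choice)."""
    digest = hashlib.sha256(f"{lsp_id}/{seq}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def db_hash(db: dict, bits: int = 16) -> int:
    """XOR-combine per-LSP hashes, then fold 64 bits down to `bits`.

    16 bits (instead of 32) keeps the birthday effect visible in a
    tiny demo; the mechanism is identical."""
    acc = 0
    for lsp_id, seq in db.items():
        acc ^= lsp_hash(lsp_id, seq)
    folded = 0
    while acc:                         # XOR the 16-bit chunks together
        folded ^= acc & ((1 << bits) - 1)
        acc >>= bits
    return folded

# Birthday bound: ~2^(bits/2) distinct databases suffice for a likely
# collision, i.e. two *different* LSPDBs whose aggregate hashes match.
random.seed(42)
seen = {}
collision = None
for _ in range(5000):
    db = {"C.00-00": random.randrange(1, 1 << 32)}
    h = db_hash(db)
    if h in seen and seen[h] != db:
        collision = (seen[h], db)
        break
    seen[h] = db
print(collision)
```

With 5000 random database states against a 16-bit hash, a collision is found with overwhelming probability — which is the shape of Les's concern, while Tony P.'s reply is that the only lever is a bigger or finer-grained hash, not collision-freedom.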
And shifted XORs are actually some of the best "entropy generators", based on work done on MAC hashes for SPT, AFAIR.

What we could suggest is "every 5th time, send a full CSNP instead of an HSNP", just to have "in the worst case still CSNP-reliable, but slower (no free lunch)".

> *3) I would like to raise the question as to whether we should prioritize a solution that aids initial LSPDB sync on adjacency bringup over a solution which works well after LSPDB synchronization (periodic CSNPs).*
>
> Our solution works well in both cases. In the case of initial bringup, our mechanism exchanges a logarithmic number of packets to isolate the exact LSPs that are inconsistent. In the case where databases are already synchronized, this means that only a single top-level HSNP is required.
>
> This is also true in the case of continuing verification of synchronized databases.
>
> *[LES:] The solution you have proposed works much better when the LSPDBs on the neighbors are "almost the same", because the ranges of LSPs covered in each hash are more likely to be the same.*
>
> *At adjacency bringup this is less likely to be the case – meaning that every time I receive an HSNP from you I am more likely to need to calculate the hash the way you did rather than simply check a cached hash value.*
>
> *(BTW – the use of cached hash values is mentioned in the draft as desirable – I did not invent this goal. **😊**)*
>
> *One way of improving this is to limit the hash TLV to LSPs from a single node (no range required).*
>
> *This improves xSNP scalability from per-LSP to per-node.*

So it's all assumptions. You can assume they are almost identical, you can assume they are mostly very different, and either way you can claim that sending or not sending is the better solution. AFAIS we either say "on bringup, dump CSNPs so at least it's done once" or we say "just send HSNPs" (with the footnote above). Same as to one hash for one node.
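[Editor's note: the "logarithmic number of packets" claim above corresponds to a Merkle-style range descent: compare hashes over halves of the LSP-ID space and recurse only into the half that disagrees. The sketch below is my own illustration of that idea — the function names, the flat range split, and the per-entry hash are assumptions, not the draft's encoding.]

```python
import hashlib

def entry_hash(lsp_id, seq):
    # Illustrative per-LSP hash; any input-sensitive hash works here.
    d = hashlib.sha256(f"{lsp_id}/{seq}".encode()).digest()
    return int.from_bytes(d[:8], "big")

def range_hash(db, ids):
    """XOR-combined hash over the LSPs in `ids` present in `db`."""
    acc = 0
    for i in ids:
        if i in db:
            acc ^= entry_hash(i, db[i])
    return acc

def find_mismatched_lsps(db_a, db_b, ids, stats):
    """Recursively isolate the LSP-IDs on which the databases disagree.

    Each range comparison corresponds to one hash exchanged in an HSNP,
    so isolating one stale LSP among n costs O(log n) comparisons
    instead of a full CSNP walk."""
    stats["comparisons"] += 1
    if range_hash(db_a, ids) == range_hash(db_b, ids):
        return []                      # range agrees -- prune it
    if len(ids) == 1:
        return list(ids)               # isolated the disagreeing LSP
    mid = len(ids) // 2
    return (find_mismatched_lsps(db_a, db_b, ids[:mid], stats)
            + find_mismatched_lsps(db_a, db_b, ids[mid:], stats))

ids = [f"N{i:02d}.00-00" for i in range(64)]
db_a = {i: 10 for i in ids}
db_b = dict(db_a, **{"N37.00-00": 11})   # one stale sequence number
stats = {"comparisons": 0}
print(find_mismatched_lsps(db_a, db_b, ids, stats))  # -> ['N37.00-00']
print(stats["comparisons"])                          # -> 13
```

This also illustrates Les's point: the pruning only pays off when most ranges already agree. When the databases are mostly different (adjacency bringup), nearly every range mismatches and the descent degenerates toward visiting everything.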
The really large deployments are _oodles_ of routers, most of them with 1-2 LSPs only, so this will basically buy barely anything in terms of improving scale in practical large backbones, given the data I mostly see.

> *The need for periodic CSNPs arose from early attempts at flooding optimizations (mesh groups) where an error in the manual configuration could jeopardize the reliability of the Update Process. In deployments where standards-based flooding optimizations are used, the need for periodic CSNPs is lessened, as the standards-based solution should be well tested. Periodic CSNPs become the "suspenders" in a "belt"-based deployment (or, if you prefer, the "belt" in a "suspenders"-based deployment). I am wondering if we should deemphasize the use of periodic CSNPs? In any case, the size of a full CSNP set is a practical issue in scale deployments - especially where a node has a large number of neighbors. Sending the full CSNP set on adjacency UP is a necessary step and therefore I would like to see this use case get greater attention over the optional periodic CSNP case.*
>
> Since this now reduces to sending a single top-level HSNP, and I like having a belt and suspenders (figuratively), things are already much cheaper and I would favor retaining that.
>
> *4) You choose to define new PDUs - which is certainly a viable option. But I am wondering if you considered simply defining a new TLV to be included in existing xSNPs. I can imagine cases - especially in PSNP usage - where a mixture of existing LSP entries and new Merkle hash entries could usefully be sent in a PSNP to request/ack LSPs as we do today. The use of the hash TLV in PSNPs could add some efficiency to LSP acknowledgments.*
>
> We chose to go to new PDUs to not risk interoperability problems. We could easily see ourselves wanting to generate packets that only include HSNP information and no legacy CSNP/PSNP information.
> *[LES:] I am cautious about new PDUs because it translates into new PDUs per level and – somewhere down the road – new PDUs to support new scopes (RFC 7356). (The 256-LSP limit per node is another limitation that we may yet have to deal with.)*
>
> *Given that we are already negotiating the use of the new TLV per neighbor – and that in IS-IS unsupported TLVs are always ignored – I don’t see that the new TLV approach is more risky.*

I agree with Tony Li. Confounding things that way is just laying traps for ourselves.

> *Les*
>
> T
>
> _______________________________________________
> Lsr mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
