Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Van De Velde, Gunter (Nokia - BE/Antwerp) Thu, 16 Jun 2022 01:10:13 -0700

Hi Gyan, Daniel, Peter, All,

Thanks for sharing your insights; I mostly agree with your feedback.


I agree and understand that summarization is needed to reduce the size of the 
LSDB. I also agree that summarization is good design practice, especially with 
IPv6 and SRv6 in mind. There has never been any doubt about that.
I am not sure I agree that PUA/UPA is ‘optimal-design’. Maybe it is the best we 
can do; however, I have a healthy worry that we could be suffering from tunnel 
vision and that the proposed solution may not be good enough.
We should not blindly believe that advertising a UPA/PUA comes without a cost. 
The architectural complexity of using PUA/UPA may not be worth the effort 
(none of the integration of PUA/UPA event triggers comes for free). Do we 
really believe that PUA/UPA solves all the SID reachability problems for all 
IGP network designs and SR use-cases elegantly? 
Maybe some use-case design constraints and assumptions should be documented to 
clarify architecturally where PUA/UPA is most beneficial for operators? Just 
stating “outside scope of the draft” seems unfair to operators interested in 
PUA/UPAs.

Let me give two examples where PUA/UPA benefit is unclear:

(1) Multiple-ABRs

I was wondering, for example: if an ingress router receives a PUA signaling 
that a given locator has become unreachable, does that really signal that the 
SID ‘really’ is unreachable for that router?

For example (simple design to illustrate the corner-case):

ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
     |                                                      |
     |                                                      |
     +--------area#1---ABR#3---area---ABR#4---area#3--------+

What if ABR#4 loses connectivity to egressPE#2 and ABR#2 does not?
In that case ABR#4 will originate a UPA/PUA and ABR#2 will not originate a 
PUA/UPA.
How is ingressPE#1 supposed to handle this situation? The only thing 
ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may not 
have changed at all and the locator may remain perfectly reachable.
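
The corner case above can be sketched in a few lines of Python. This is only an 
illustration, not behaviour specified by either draft: the class name, the 
field names, and the rule "unreachable only when every ABR has signalled it" 
are all assumptions. It also assumes the ingress can still tell which area#3 
ABR originated the PUA/UPA, which may not hold once the announcement is 
re-advertised across areas.

```python
# Hypothetical model of the multiple-ABR corner case; all names are illustrative.
from dataclasses import dataclass, field


@dataclass
class IngressView:
    """What an ingress PE knows about one remote locator hidden behind a summary."""

    # ABRs whose summary covers the locator
    summary_abrs: set = field(default_factory=set)
    # ABRs that have originated a PUA/UPA for the locator
    upa_abrs: set = field(default_factory=set)

    def usable_abrs(self):
        """ABRs that cover the locator and have not signalled unreachability."""
        return self.summary_abrs - self.upa_abrs

    def is_unreachable(self):
        """Assumed policy: unreachable only when no covering ABR remains."""
        return not self.usable_abrs()


# ABR#2 and ABR#4 both summarize area#3; only ABR#4 loses egressPE#2.
view = IngressView(summary_abrs={"ABR#2", "ABR#4"})
view.upa_abrs.add("ABR#4")

print(view.is_unreachable())   # False: the path via ABR#2 is still valid
print(view.usable_abrs())      # {'ABR#2'}
```

Under this assumed rule the ingress would have to keep per-ABR state for every 
protected locator, which is exactly the kind of extra machinery whose cost 
should be weighed.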


(2) With sr-policy or SRv6 SRTE
What if we have an inter-area/domain/level SRTE or sr-policy and suddenly there 
is a PUA/UPA for one of the SIDs in the SID list of the path?
Will this impact the SRTE or sr-policy in any way? Will transit routers do 
anything with the UPA/PUA and drop packets? Will transit routers trigger 
fast-restoration?
Can PCEs/controllers still use the SID for crafting paths? Will all 
SRTE/sr-policy paths using the locator be pruned or re-signaled?
Will the ingress router do something with the PUA information? Should the 
PUA/UPA draft give guidelines around this?
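
To make the question concrete, here is a minimal sketch of how a head-end or 
controller could at least find the policies touched by a PUA/UPA. The function 
name, the policy names, and the documentation-range addresses are hypothetical, 
and the sketch deliberately stops at identifying the affected policies: what to 
do with them (prune, re-signal, ignore) is the open question.

```python
# Hypothetical helper; policy names and addresses are made up for illustration.
import ipaddress


def affected_policies(policies, upa_prefix):
    """Return the names of policies with at least one SID inside upa_prefix."""
    net = ipaddress.ip_network(upa_prefix)
    hit = []
    for name, sid_list in policies.items():
        if any(ipaddress.ip_address(sid) in net for sid in sid_list):
            hit.append(name)
    return hit


# Each policy maps to its SID list (SRv6 SIDs written as IPv6 addresses).
policies = {
    "policy-a": ["2001:db8:1::1", "2001:db8:2::1"],
    "policy-b": ["2001:db8:3::1"],
}

# A PUA/UPA arrives for the locator 2001:db8:2::/48.
print(affected_policies(policies, "2001:db8:2::/48"))   # ['policy-a']
```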

Be well,
G/










From: Gyan Mishra <[email protected]>
Sent: Thursday, June 16, 2022 6:12 AM
To: Voyer, Daniel <[email protected]>
Cc: Van De Velde, Gunter (Nokia - BE/Antwerp) <[email protected]>; 
[email protected]; 
draft-wang-lsr-prefix-unreachable-annoucement 
<[email protected]>; [email protected]
Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?


Summarization has always been a best practice for network scalability thereby 
reducing the size of the RIB and LSDB.

So in this case as Dan pointed out,  the summary route is an abstraction of the 
area and so if a component prefix of the summary became unreachable we need a 
way to signal that the PE next hop is no longer reachable to help optimize 
convergence.

We are just trying to make summarization work better than it does today so we 
don’t have to rely on domain-wide flooding of host routes.

Thanks

Gyan


On Wed, Jun 15, 2022 at 4:42 PM Voyer, Daniel 
<[email protected]<mailto:[email protected]>> wrote:
Hi Gunter,

Thanks for your comments,

The idea here, with summarization, is to "reduce" the LSDB quite a lot, make 
a given backbone much more scalable / flexible, and considerably simplify the 
NNIs within that backbone.
Summarization is "needed" for better scale and, in the context of IPv6, will 
help prevent blowing up the IGP. With the size of an IPv6 prefix range 
(e.g. /64) allocated per domain, summarization will help to contain the LSDB to 
that domain.

What we are "highlighting" in draft-ppsenak-lsr-igp-ureach-prefix-announce-00, 
is an easy way to overcome the fact that PEs are hidden behind a summary route 
and need a fast way to notify other PEs when they become unreachable.

I don't see "over-engineering" here, I see "optimal-engineering" instead.

Thanks
Dan

On 2022-06-14, 4:59 AM, "Van De Velde, Gunter (Nokia - BE/Antwerp)" 
<[email protected]<mailto:[email protected]>> wrote:

    Hi All,

    When reading both proposals about PUAs:
    * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
    * draft-wang-lsr-prefix-unreachable-annoucement-09

    The identified problem space seems a correct observation: summaries do 
indeed hide remote area network instabilities. That is one of the perceived 
benefits of using summaries. The place in the network where this hiding has 
the most impact upon convergence is at service nodes (PEs for L3/L2/transport), 
where due to the summarization it is difficult to detect that the transport 
tunnel end-point has suddenly become unreachable. My concern, however, is 
whether it really is a problem that is worthy for the LSR WG to solve.

    To me the draft "draft-wang-lsr-prefix-unreachable-annoucement-09" is not a 
preferred solution due to the expectation that all nodes in an area must be 
upgraded to support the IGP capability. From this operational perspective the 
draft "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as 
only the A(S)BRs and particular PEs must be upgraded to support PUAs. I do 
have concerns about the number of PUA advertisements in hierarchically 
summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More 
specifically, in the /16 backbone area, how many of these PUAs will be floating 
around creating LSP/LSDB update churn? How do we keep the potentially 
exponential number of observed PUAs from floating everywhere? (Will this lead 
to OSPF NSSA-type areas, where areas are purged of these PUAs for scaling 
stability?)
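
A rough sketch of the arithmetic behind that concern, using the /24 -> /20 -> 
/16 hierarchy above. The 10.0.0.0/16 range, the function name, and the number 
of PEs per site are assumptions chosen only for illustration; neither draft 
states how many PUAs a real failure would generate.

```python
# Illustrative arithmetic only; the address range and PE counts are assumptions.
import ipaddress

core = ipaddress.ip_network("10.0.0.0/16")

# Regions (/20) inside the core summary, and sites (/24) inside one region.
regions = list(core.subnets(new_prefix=20))
sites_per_region = len(list(regions[0].subnets(new_prefix=24)))
total_sites = len(regions) * sites_per_region


def worst_case_puas(pes_per_site):
    """Upper bound on PUAs in the core if every hidden PE became unreachable."""
    return total_sites * pes_per_site


print(len(regions), sites_per_region, total_sites)   # 16 16 256
print(worst_case_puas(pes_per_site=4))               # 1024
```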

    Long story short, should we not take a step back and re-think this 
identified problem space? Is the proposed solution space not more evil than the 
problem space? We do summarization because it brings stability and reduces the 
number of link state updates within an area. And now with PUA we re-introduce 
additional link state updates (PUAs), and we blow up the LSDB with information 
opaque to SPF best-path calculation. In addition there is a suggestion of new 
state-machinery to track the IGP reachability of 'protected' prefixes, and 
there is maybe a desire to contain or filter updates across inter-area 
boundaries. And finally, how will we represent and track PUAs in the RTM?

    What is wrong with simply not doing summaries and forgetting about these 
PUAs to punch holes in the summary prefixes? This worked very well during the 
last two decades. Are we not over-engineering with PUAs?

    G/
    


_______________________________________________
Lsr mailing list
[email protected]<mailto:[email protected]>
https://www.ietf.org/mailman/listinfo/lsr
--


Gyan Mishra

Network Solutions Architect

Email [email protected]<mailto:[email protected]>

M 301 502-1347

