Hi Robert,

I agree with you that the operator problem space is not limited to 
multi-area/levels with IGP summarisation.

With the PUA/UPA proposals I get the feeling that the LSR WG is jumping into the 
deep end and is re-vectoring the IGP to carry opaque information not used for 
SPF/cSPF.
I believe we should be conservative about such a decision, if the LSR WG 
progresses with it.

It could very well be that re-vectoring is the best solution, but I guess we 
need to agree first on understanding the operator problem space.

G/

From: Robert Raszuk <[email protected]>
Sent: Tuesday, June 14, 2022 11:51 AM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) <[email protected]>
Cc: lsr <[email protected]>; [email protected]; 
draft-wang-lsr-prefix-unreachable-annoucement 
<[email protected]>
Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hello Gunter,

I agree with pretty much all you said except the conclusion - do nothing :).

To me, if you need to accelerate connectivity restoration upon an unlikely event 
like a complete PE failure, the right vehicle to signal this is within the 
service layer itself. Let's keep in mind that links do fail a lot in the 
networks - routers do not (or if they do, it is a multiple orders of magnitude 
less frequent event). Links on the PE-CE boundaries especially do fail a lot.

Removal of next hop reachability can be done with BGP and, based on BGP native 
recursion, will have the exact same effect as the presented ideas. Moreover, it 
will be stateful for the endpoints, which again to me is a feature, not a bug.

Some suggested defining a new extension in BGP to signal it even without using 
double recursion - well, one of them has been proposed in the past - 
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt At that time 
the feedback received was that native BGP withdrawals are fast enough, so no 
need to bother. Well, those native withdrawals are working today, and some 
claim that specific implementations can withdraw RD:* when the PE hosting such 
RDs fails and RDs are allocated in a unique per-VRF fashion.

Then we have the DROID proposal, which again may look like overkill for this 
very problem, but if you consider the bigger picture of what network control 
plane pub-sub signalling needs, it establishes the foundation for such.

Many thanks,
Robert


On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) 
<[email protected]> wrote:
Hi All,

When reading both proposals about PUAs:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed summaries 
hide remote area network instabilities. It is one of the perceived benefits of 
using summaries. The place in the network where this hiding has the most 
impact on convergence is at service nodes (PEs for L3/L2/transport) where, 
due to the summarization, it is difficult to detect that the transport tunnel 
end-point suddenly becomes unreachable. My concern, however, is whether it 
really is a problem worth solving in the LSR WG.
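
To make the hiding effect concrete, here is a minimal sketch of the next-hop 
lookup a remote PE would perform. It is only an illustration, not part of 
either draft: the prefixes 10.1.0.0/16 and 10.1.2.3/32 and the helper 
longest_match are hypothetical, and the RIB is modelled as a plain list of 
prefixes.

```python
import ipaddress

def longest_match(rib, address):
    """Return the most specific prefix in rib that covers address, or None."""
    addr = ipaddress.ip_address(address)
    covering = [prefix for prefix in rib if addr in prefix]
    return max(covering, key=lambda prefix: prefix.prefixlen, default=None)

# Hypothetical addressing: one area summary and one PE loopback inside it.
SUMMARY = ipaddress.ip_network("10.1.0.0/16")
PE_HOST_ROUTE = ipaddress.ip_network("10.1.2.3/32")
PE_LOOPBACK = "10.1.2.3"

# Remote PE's view WITHOUT summarisation: the /32 is withdrawn when the PE fails.
no_summary_before = longest_match([PE_HOST_ROUTE], PE_LOOPBACK)
no_summary_after = longest_match([], PE_LOOPBACK)

# Remote PE's view WITH summarisation: only the summary crosses the ABR,
# so the lookup result is identical before and after the PE failure.
summary_before = longest_match([SUMMARY], PE_LOOPBACK)
summary_after = longest_match([SUMMARY], PE_LOOPBACK)

print("no summary:", no_summary_before, "->", no_summary_after)
print("summary:   ", summary_before, "->", summary_after)
```

Without the summary the lookup goes from a match to no match, which is the 
signal the service layer can react to; with the summary nothing changes at the 
remote PE, which is the gap the PUA proposals try to fill.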

To me the draft "draft-wang-lsr-prefix-unreachable-annoucement-09" is not a 
preferred solution due to the expectation that all nodes in an area must be 
upgraded to support the IGP capability. From this operational perspective the 
draft "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as 
only the A(S)BRs and particular PEs must be upgraded to support PUAs. I do 
have concerns about the number of PUA advertisements in hierarchically 
summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More 
specifically, in the /16 backbone area, how many of these PUAs will be floating 
around creating LSP/LSDB update churn? How do we keep the potentially 
exponential number of observed PUAs from floating everywhere? (Will this lead 
to OSPF NSSA-type areas, where areas are purged of these PUAs for scaling 
stability?)

Long story short, should we not take a step back and re-think this identified 
problem space? Is the proposed solution space not more evil than the problem 
space? We do summarization because it brings stability and reduces the number 
of link state updates within an area. And now with PUA we re-introduce 
additional link state updates (PUAs), and we blow up the LSDB with information 
opaque to SPF best-path calculation. In addition, there is a suggestion of new 
state machinery to track the IGP reachability of 'protected' prefixes, and 
there may be a desire to contain or filter updates across inter-area 
boundaries. And finally, how will we represent and track PUAs in the RTM?

What is wrong with simply not doing summaries and forgetting about these PUAs 
to punch holes in the summary prefixes? This has worked very well during the 
last two decades. Are we not over-engineering with PUAs?

G/

_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr