Hi Gunter,

On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:
Hi Robert,

I agree with you that the operator problem space is not limited to multi-area/levels with IGP summarisation.

With the PUA/UPA proposals I get the feeling that the LSR WG is jumping in at the deep end and is re-vectoring the IGP to carry opaque information not used for SPF/cSPF.

I believe we should be conservative about such a change, and about whether the LSR WG progresses with such a decision.

Please note that the UPA draft builds on the existing protocol specification defined in RFC5305 and RFC5308, which allows a metric larger than MAX_PATH_METRIC to be used "for purposes other than building the normal IP routing table". We are just documenting one of those purposes.
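
As an illustration of the rule referred to above, here is a minimal sketch (not taken from either draft) of how a receiver might separate prefixes that take part in normal SPF from those advertised for other purposes. The value 0xFE000000 is MAX_PATH_METRIC from RFC5305; the function names and the sample prefixes are hypothetical.

```python
# MAX_PATH_METRIC as defined for wide metrics in RFC 5305.
MAX_PATH_METRIC = 0xFE000000


def considered_in_spf(metric: int) -> bool:
    """True if a prefix with this metric takes part in normal SPF."""
    return metric <= MAX_PATH_METRIC


def split_prefixes(advertised: dict) -> tuple:
    """Split {prefix: metric} into (routable, other_purpose) dicts."""
    routable, other = {}, {}
    for prefix, metric in advertised.items():
        target = routable if considered_in_spf(metric) else other
        target[prefix] = metric
    return routable, other


# Hypothetical advertisements: a summary plus one host prefix whose
# metric is above MAX_PATH_METRIC.
example = {"192.0.2.0/24": 20, "192.0.2.1/32": MAX_PATH_METRIC + 1}
routable, other = split_prefixes(example)
print("used for routing:", routable)
print("other purposes:  ", other)
```

What a node does with the second group is exactly what the UPA draft sets out to document; the sketch only shows that such prefixes stay out of the routing table.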

thanks,
Peter



It could very well be that re-vectoring is the best solution, but I guess we need to agree first on understanding the operator problem space.

G/

*From:*Robert Raszuk <[email protected]>
*Sent:* Tuesday, June 14, 2022 11:51 AM
*To:* Van De Velde, Gunter (Nokia - BE/Antwerp) <[email protected]>
*Cc:* lsr <[email protected]>; [email protected]; draft-wang-lsr-prefix-unreachable-annoucement <[email protected]>
*Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hello Gunter,

I agree with pretty much all you said except the conclusion - do nothing :).

To me, if you need to accelerate connectivity restoration upon an unlikely event like a complete PE failure, the right vehicle to signal this is the service layer itself. Let's keep in mind that links fail a lot in networks; routers do not (or if they do, it is an event that is multiple orders of magnitude less frequent). Links on the PE-CE boundaries in particular fail a lot.

Removal of next hop reachability can be done with BGP and, based on BGP native recursion, will have the exact same effect as the presented ideas. Moreover, it will be stateful for the endpoints, which again to me is a feature, not a bug.
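
To make the recursion argument concrete, here is a rough sketch under stated assumptions: service routes are usable only while their next hop resolves, and next hop reachability is carried as host routes that can be withdrawn individually. All names, prefixes and the table layout are hypothetical and do not describe any particular implementation.

```python
import ipaddress


def resolve(next_hop: str, table: list):
    """Longest-prefix match of next_hop in table; None if nothing covers it."""
    addr = ipaddress.ip_address(next_hop)
    nets = [ipaddress.ip_network(p) for p in table]
    matches = [n for n in nets if addr in n]
    return max(matches, key=lambda n: n.prefixlen, default=None)


def usable_routes(service_routes: dict, table: list) -> dict:
    """Keep only service routes whose next hop still resolves."""
    return {pfx: nh for pfx, nh in service_routes.items()
            if resolve(nh, table) is not None}


# Hypothetical service routes and their PE next hops.
service_routes = {"10.1.0.0/24": "192.0.2.1", "10.2.0.0/24": "192.0.2.2"}

# Next hops resolved via per-PE host routes: withdrawing the host route
# of the failed PE (192.0.2.1) invalidates the service routes behind it.
after_withdraw = usable_routes(service_routes, ["192.0.2.2/32"])

# Next hops resolved via an IGP summary only: the failed PE still
# resolves, which is the problem the PUA/UPA drafts start from.
via_summary = usable_routes(service_routes, ["192.0.2.0/24"])

print(after_withdraw)
print(via_summary)
```

The sketch ignores timing and the stateful behaviour mentioned above; it only shows why withdrawing a specific next hop route has the same end effect on dependent service routes.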

Some suggested defining a new extension in BGP to signal it even without using double recursion. One such extension has been proposed in the past: https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt. At that time the feedback received was that native BGP withdrawals are fast enough, so there was no need to bother. Well, those native withdrawals still work today, and some claim that specific implementations can withdraw RD:* when the PE hosting such RDs fails and RDs are allocated uniquely per VRF.

Then we have the DROID proposal, which again may look like overkill for this very problem, but if you consider the bigger picture of what a network's control plane needs in terms of pub-sub signalling, it establishes the foundation for that.

Many thanks,

Robert

On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) <[email protected] <mailto:[email protected]>> wrote:

    Hi All,

    When reading both proposals about PUAs:
    * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
    * draft-wang-lsr-prefix-unreachable-annoucement-09

    The identified problem space seems a correct observation, and indeed
    summaries hide remote area network instabilities. It is one of the
    perceived benefits of using summaries. The place in the network
    where this hiding has the most impact on convergence is at
    service nodes (PEs for L3/L2/transport) where, due to the
    summarization, it is difficult to detect that the transport tunnel
    end-point has suddenly become unreachable. My concern, however, is
    whether it really is a problem worthy of the LSR WG solving.

    To me, "draft-wang-lsr-prefix-unreachable-annoucement-09"
    is not a preferred solution due to the expectation that all nodes in
    an area must be upgraded to support the IGP capability. From this
    operational perspective,
    "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant,
    as only the A(S)BRs and particular PEs must be upgraded to support
    PUAs. I do have concerns about the number of PUA advertisements in
    hierarchically summarized networks (/24 (site) -> /20 (region) ->
    /16 (core)). More specifically, in the /16 backbone area, how many
    of these PUAs will be floating around, creating LSP/LSDB update
    churn? How do we keep a potentially exponential number of PUAs
    from floating everywhere? (Will this lead to something like OSPF
    NSSA areas, where areas are purged of these PUAs for scaling
    stability?)

    Long story short, should we not take a step back and re-think this
    identified problem space? Is the proposed solution space not more
    evil than the problem space? We do summarization because it brings
    stability and reduces the number of link state updates within an
    area. And now with PUA we re-introduce additional link state updates
    (PUAs), and we blow up the LSDB with information opaque to SPF
    best-path calculation. In addition, there is a suggestion of new
    state machinery to track the IGP reachability of 'protected'
    prefixes, and there is maybe a desire to contain or filter updates
    across inter-area boundaries. And finally, how will we represent
    and track PUA in the RTM?

    What is wrong with simply not doing summaries and forgetting about
    these PUAs that punch holes in the summary prefixes? This worked
    very well during the last two decades. Are we not over-engineering
    with PUAs?

    G/

    _______________________________________________
    Lsr mailing list
    [email protected] <mailto:[email protected]>
    https://www.ietf.org/mailman/listinfo/lsr
    <https://www.ietf.org/mailman/listinfo/lsr>

