Re: [Lsr] BGP vs PUA/PULSE

Robert Raszuk Mon, 24 Jan 2022 01:58:01 -0800

Chris,

I would like to state one important point ...


Some folks used the terms "for those special prefixes" or "super important
prefixes" only to smooth the discussion, but there is no such thing.
What is being discussed is all PEs. Some also want to add SR segment
endpoints.

No one will be selecting a subset of the above. That's even more drastic
for the PUA/PULSE approach as it inherently contains a cliff effect.

Thx,
R.
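
To make the trade-off in this thread concrete, here is a minimal Python sketch of what a remote router can and cannot see when PE loopbacks are summarized. The prefixes (10.0.0.0/24, 10.0.0.1/32, 10.0.0.2/32) and the lookup() helper are illustrative assumptions, not taken from any draft or implementation. It models only longest-prefix match, not IGP flooding or PUA/PULSE itself.

```python
import ipaddress


def lookup(rib, address):
    """Return the longest matching prefix in rib for address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [prefix for prefix in rib if addr in prefix]
    return max(matches, key=lambda prefix: prefix.prefixlen, default=None)


# Hypothetical addressing: two PE loopbacks covered by one area summary.
summary = ipaddress.ip_network("10.0.0.0/24")
pe1 = ipaddress.ip_network("10.0.0.1/32")
pe2 = ipaddress.ip_network("10.0.0.2/32")

# Case 1: the remote area sees only the summary.
# If PE1 fails, the summary is assumed to stay advertised because other
# components (PE2) are still reachable, so the remote RIB does not change.
rib_summarized = {summary}
after_failure_summarized = lookup(rib_summarized, "10.0.0.1")

# Case 2: no summarization, so the remote area sees each loopback.
# If PE1 fails, its /32 is withdrawn and the lookup no longer matches.
rib_specific = {pe1, pe2}
rib_specific.discard(pe1)
after_failure_specific = lookup(rib_specific, "10.0.0.1")

print("summarized:", after_failure_summarized)
print("specific:  ", after_failure_specific)
```

In the summarized case the lookup for the failed PE still resolves to the covering summary, which is the gap PUA/PULSE tries to signal. In the unsummarized case the route simply disappears, at the cost of carrying every loopback domain-wide.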

On Mon, Jan 24, 2022 at 10:40 AM Christian Hopps <[email protected]> wrote:

>
> "Aijun Wang" <[email protected]> writes:
>
> > Hi, Chris:
> > We should note here that it is not "a prefix"; it could be "all nodes'
> > loopback addresses, or even some links' addresses".
> > Gyan's reference to RFC 5302 clearly states the disadvantage of
> > non-summarization, and operators have followed this approach for about 20
> > years. Now you propose to divert in another direction?
> >
>
> For 20 years we haven't needed PUA/PULSE, now you're saying we do, so I'm
> saying don't use summarization *for these special prefixes it suddenly
> doesn't work for*.
>
> I have *never* said do not use summarization. I have tried very hard to
> say very clearly "for those special prefixes" every time I have responded
> to this thread. It's very frustrating.
>
> I'm saying do not summarize these "super important prefixes" -- the
> prefixes for which you want to modify the summarization process because
> summarization doesn't work for them.
>
> Again KISS applies here:
>
>       If the summarization process *doesn't work* for a given prefix P,
> then *don't use summarization* for prefix P!
>
> Thanks,
> Chris.
> [As wg member]
>
> > Best Regards
> >
> > Aijun Wang
> > China Telecom
> >
> > -----Original Message-----
> > From: Christian Hopps <[email protected]>
> > Sent: Monday, January 24, 2022 1:50 PM
> > To: Gyan Mishra <[email protected]>
> > Cc: Christian Hopps <[email protected]>; Aijun Wang <[email protected]>;
> > Hannes Gredler <[email protected]>; John E Drake <[email protected]>;
> > Les Ginsberg (ginsberg) <[email protected]>; Peter Psenak (ppsenak)
> > <[email protected]>; Robert Raszuk <[email protected]>; Shraddha Hegde
> > <[email protected]>; Tony Li <[email protected]>; lsr <[email protected]>
> > Subject: Re: [Lsr] BGP vs PUA/PULSE
> >
> >
> > Ok, I guess I'll repeat what I said, as I don't believe anything new was
> > presented here.
> >
> >     Yes, having worked intimately with these IGPs for > 20 years now,
> >     I understand the use and the implications of using summary
> >     routes. :)
> >
> >     My opinion remains unchanged.
> >
> > "If a prefix is important enough to consider seriously hacking the routing
> > protocol to signal the prefix being unreachable, then that prefix is important
> > enough to not summarize to begin with." IOW; KISS
> >
> > I'd prefer to not keep repeating this when presented with the same
> > arguments, so please take any silence on my part as my opinion being
> > unchanged.
> >
> > Thanks,
> > Chris.
> > [As WG member]
> >
> >
> >
> > Gyan Mishra <[email protected]> writes:
> >
> >> Hi Chris
> >>
> >>
> >> Just about every vendor's recommended best practice is to lay out
> >> the address plan to take advantage of summarization wherever
> >> possible, and that includes the PE loopback next-hop addresses, to
> >> limit router load as well as the size of the LSDB, both in the
> >> backbone and domain-wide.
> >>
> >> I think you would be hard pressed to find any vendor that would say go
> >> ahead and flood loopbacks domain wide and don’t summarize.
> >>
> >> In large domains, domain-wide flooding is not feasible, and for most
> >> operators summarization is a requirement even for the critical
> >> loopback BGP next hops.
> >>
> >> RFC 5302 discusses the ramifications of flooding in an IS-IS domain
> >> in Section 1.2, excerpted below:
> >>
> >>
> >> 1.2.  Scalability
> >>
> >>    The disadvantage to performing the domain-wide prefix distribution
> >>    described above is that it has an impact on the scalability of IS-IS.
> >>    Areas within IS-IS help scalability in that LSPs are contained within
> >>    a single area.  This limits the size of the link state database,
> >>    which in turn limits the complexity of the shortest path computation.
> >>
> >>    Further, the summarization of the prefix information aids scalability
> >>    in that the abstraction of the prefix information removes the sheer
> >>    number of data items to be transported and the number of routes to be
> >>    computed.
> >>
> >>    It should be noted quite strongly that the distribution of prefixes
> >>    on a domain-wide basis impacts the scalability of IS-IS in the second
> >>    respect.  It will increase the number of prefixes throughout the
> >>    domain.  This will result in increased memory consumption,
> >>    transmission requirements, and computation requirements throughout
> >>    the domain.
> >>
> >>    It must also be noted that the domain-wide distribution of prefixes
> >>    has no effect whatsoever on the first aspect of scalability, namely
> >>    the existence of areas and the limitation of the distribution of the
> >>    link state database.
> >>
> >>
> >>
> >>
> >> Gyan
> >> On Fri, Jan 14, 2022 at 9:07 PM Christian Hopps <[email protected]>
> >> wrote:
> >>
> >>     Yes, having worked intimately with these IGPs for > 20 years now,
> >>     I understand the use and the implications of using summary
> >>     routes. :)
> >>
> >>     My opinion remains unchanged.
> >>
> >>     Thanks,
> >>     Chris.
> >>     [as wg member]
> >>
> >>     > On Jan 14, 2022, at 8:50 PM, Aijun Wang <
> >>     [email protected]> wrote:
> >>     >
> >>     > Hi, Christian:
> >>     >
> >>     > We should consider the balance and efficiency of summarizing
> >>     versus not summarizing.
> >>     > If you don’t summarize, then all the areas will be filled with
> >>     the specific detailed routes (all PEs’ loopbacks, and possibly
> >>     all Ps’ loopbacks too). This certainly increases the burden on
> >>     the routers.
> >>     >
> >>     > But with summarization, these specific routes need not exist in
> >>     the routing table. The nodes within the IGP need only be notified
> >>     when a node fails, to accelerate the switchover of the
> >>     overlay service.
> >>     > You can also choose not to use such a mechanism; the
> >>     service will then be blackholed for some time until the service/
> >>     application detects the abnormality.
> >>     > PUA/PULSE are just mechanisms to reduce the duration of the
> >>     abnormal state; they are a kind of FRR technique.
> >>     >
> >>     > Aijun Wang
> >>     > China Telecom
> >>     >
> >>     >> On Jan 15, 2022, at 09:26, Christian Hopps <[email protected]>
> >>     wrote:
> >>     >>
> >>     >>
> >>     >>
> >>     >>> On Jan 14, 2022, at 8:25 PM, Christian Hopps <
> >>     [email protected]> wrote:
> >>     >>>
> >>     >>> I understand the proposal. As I've stated elsewhere, I do not
> >>     believe there is a problem here that needs solving. The "problem"
> >>     was created by the user by summarizing prefixes that should not
> >>     have been summarized -- they mis-configured their network. The
> >>     routing protocols work just fine (acting very quickly) if you don't
> >>     incorrectly summarize "really important prefixes".
> >>     >>>
> >>     >>> I was simply pointing out that IGPs also don't deal in
> >>     liveness since that keeps coming up.
> >>     >>
> >>     >> Sorry that was "as wg member".
> >>     >>
> >>     >>>
> >>     >>> Thanks,
> >>     >>> Chris.
> >>     >>>
> >>     >>>>> On Jan 14, 2022, at 8:06 PM, Aijun Wang <
> >>     [email protected]> wrote:
> >>     >>>>
> >>     >>>> Hi, Christian and John:
> >>     >>>>
> >>     >>>> No. I think you all may have misunderstood the proposal. What
> >>     we are detecting is actually the reachability/liveness of the node
> >>     that is connected to the application, not the application itself.
> >>     >>>> And I think node liveness is the same as node
> >>     reachability. Either will influence or break the path to the
> >>     connected service if the node's forwarding function fails.
> >>     >>>>
> >>     >>>> Aijun Wang
> >>     >>>> China Telecom
> >>     >>>>
> >>     >>>>> On Jan 15, 2022, at 08:56, Christian Hopps <
> >>     [email protected]> wrote:
> >>     >>>>>
> >>     >>>>> Indeed, and in fact the IGP should only be dealing with the
> >>     reachability to the node, not with the node's or application's
> >>     liveness.
> >>     >>>>>
> >>     >>>>> Thanks,
> >>     >>>>> Chris.
> >>     >>>>> [as wg member]
> >>     >>>>>
> >>     >>>>>> On Jan 14, 2022, at 7:47 PM, John E Drake <
> >>     [email protected]> wrote:
> >>     >>>>>>
> >>     >>>>>> I don’t think so.  Today things just work, at a given time
> >>     scale.  What you said you are trying to do is reduce the time
> >>     scale for detecting that an application on a node has failed.
> >>     However, conflating the health of a node with the health of an
> >>     application on that node seems to be inherently flawed.
> >>     >>>>>>
> >>     >>>>>> Yours Irrespectively,
> >>     >>>>>>
> >>     >>>>>> John
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> Juniper Business Use Only
> >>     >>>>>> From: Aijun Wang <[email protected]>
> >>     >>>>>> Sent: Friday, January 14, 2022 7:40 PM
> >>     >>>>>> To: John E Drake <[email protected]>
> >>     >>>>>> Cc: Les Ginsberg (ginsberg) <[email protected]>; Robert
> >>     Raszuk <[email protected]>; Christian Hopps <[email protected]>;
> >>     Shraddha Hegde <[email protected]>; Tony Li <[email protected]>;
> >>     Hannes Gredler <[email protected]>; lsr <[email protected]>; Peter
> >>     Psenak (ppsenak) <[email protected]>
> >>     >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>     >>>>>>
> >>     >>>>>> [External Email. Be cautious of content]
> >>     >>>>>>
> >>     >>>>>> When the node is up, all subsequent processing is passed
> >>     to the application layer. This is the normal procedure the
> >>     IGP should follow.
> >>     >>>>>> According to your logic, are IGPs solving the wrong problem
> >>     now?
> >>     >>>>>>
> >>     >>>>>> Aijun Wang
> >>     >>>>>> China Telecom
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> On Jan 15, 2022, at 08:30, John E Drake <jdrake=
> >>     [email protected]> wrote:
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> Correct, but as Tony, Robert and I have noted, a node
> >>     being up does not mean that an application on that node is up,
> >>     which means that your proposed solution is probably a solution to
> >>     the wrong problem.  Further, Robert’s solution is probably a
> >>     solution to the right problem.
> >>     >>>>>>
> >>     >>>>>> Yours Irrespectively,
> >>     >>>>>>
> >>     >>>>>> John
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> From: Aijun Wang <[email protected]>
> >>     >>>>>> Sent: Friday, January 14, 2022 5:53 PM
> >>     >>>>>> To: John E Drake <[email protected]>
> >>     >>>>>> Cc: Robert Raszuk <[email protected]>; Les Ginsberg
> >>     (ginsberg) <[email protected]>; Christian Hopps <
> >>     [email protected]>; Shraddha Hegde <[email protected]>; Tony
> >>     Li <[email protected]>; Hannes Gredler <[email protected]>; lsr <
> >>     [email protected]>; Peter Psenak (ppsenak) <[email protected]>
> >>     >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>     >>>>>>
> >>     >>>>>> Hi, John:
> >>     >>>>>> Please note that if the node is down, the service cannot be
> >>     accessed.
> >>     >>>>>> We are discussing the “DOWN” notification, not the “UP”
> >>     notification.
> >>     >>>>>>
> >>     >>>>>> Aijun Wang
> >>     >>>>>> China Telecom
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> On Jan 15, 2022, at 00:25, John E Drake <jdrake=
> >>     [email protected]> wrote:
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> Hi,
> >>     >>>>>>
> >>     >>>>>> Comment inline below.
> >>     >>>>>>
> >>     >>>>>> Yours Irrespectively,
> >>     >>>>>>
> >>     >>>>>> John
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> From: Lsr <[email protected]> On Behalf Of Robert
> >>     Raszuk
> >>     >>>>>> Sent: Monday, January 10, 2022 7:15 PM
> >>     >>>>>> To: Les Ginsberg (ginsberg) <[email protected]>
> >>     >>>>>> Cc: Christian Hopps <[email protected]>; Aijun Wang <
> >>     [email protected]>; Shraddha Hegde <[email protected]>;
> >>     Tony Li <[email protected]>; Hannes Gredler <[email protected]>;
> >>     lsr <[email protected]>; Peter Psenak (ppsenak) <[email protected]>
> >>     >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>     >>>>>>
> >>     >>>>>> Hi Les,
> >>     >>>>>>
> >>     >>>>>>> You seem focused on the notification delivery mechanism
> >>     only.
> >>     >>>>>>
> >>     >>>>>> Not really. For me, an advertised summary is like a prefix
> >>     when you are dialing a country code. Call signaling knows to go
> >>     north if you are calling a crab shop in Alaska.
> >>     >>>>>>
> >>     >>>>>> Now such direction does not indicate if the shop is open
> >>     or has crabs.
> >>     >>>>>>
> >>     >>>>>> That info you need to get over the top as a service. So I
> >>     am much more in favor of making the service tell you directly or
> >>     indirectly that it is available.
> >>     >>>>>>
> >>     >>>>>> [JD]  Right.  Just because a node is up and connected to
> >>     the network does not imply that a given application is active on
> >>     it.
> >>     >>>>>>
> >>     >>>>>> Best,
> >>     >>>>>> R.
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> On Tue, Jan 11, 2022 at 1:07 AM Les Ginsberg (ginsberg) <
> >>     [email protected]> wrote:
> >>     >>>>>> Robert -
> >>     >>>>>>
> >>     >>>>>> From: Robert Raszuk <[email protected]>
> >>     >>>>>> Sent: Monday, January 10, 2022 2:56 PM
> >>     >>>>>> To: Les Ginsberg (ginsberg) <[email protected]>
> >>     >>>>>> Cc: Tony Li <[email protected]>; Christian Hopps <
> >>     [email protected]>; Peter Psenak (ppsenak) <[email protected]>;
> >>     Shraddha Hegde <[email protected]>; Aijun Wang <
> >>     [email protected]>; Hannes Gredler <[email protected]>;
> >>     lsr <[email protected]>
> >>     >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>     >>>>>>
> >>     >>>>>> Les,
> >>     >>>>>>
> >>     >>>>>> We have received requests from real customers who both
> >>     need to summarize AND would like better response time to loss of
> >>     reachability to individual nodes.
> >>     >>>>>>
> >>     >>>>>> We all agree the request is legitimate.
> >>     >>>>>>
> >>     >>>>>> [LES:] It does not seem to me that everyone does agree on
> >>     that – but I appreciate that you agree.
> >>     >>>>>>
> >>     >>>>>> But do they realize that to practically employ what you
> >>     are proposing (new PDU flooding) requires a 100% software upgrade
> >>     of all IGP nodes in the entire network? Do they also realize
> >>     that to effectively use it requires a data plane change (sure,
> >>     software, but data plane code is not as simple as PI) on all
> >>     ingress PEs?
> >>     >>>>>>
> >>     >>>>>> [LES:] As far as forwarding, as Peter has indicated, we
> >>     have a POC and it works fine. And there are many possible ways
> >>     for implementations to go.
> >>     >>>>>> It may or may not require 100% software upgrade – but I
> >>     agree a significant number of nodes have to be upgraded to at
> >>     least support pulse flooding.
> >>     >>>>>>
> >>     >>>>>>
> >>     >>>>>> And with the scale requirements you are describing, it seems
> >>     this would be 1000s of nodes (if not more). That's massive
> >>     compared to alternative approaches that achieve the same or even
> >>     better results.
> >>     >>>>>>
> >>     >>>>>> [LES:] Be happy to review other solutions if/when someone
> >>     writes them up.
> >>     >>>>>> I think what is overlooked in the discussion of other
> >>     solutions is that reachability info is provided by the IGP. If
> >>     all the IGP advertises is a summary then how would individual
> >>     loss of reachability become known at scale?
> >>     >>>>>> You seem focused on the notification delivery mechanism
> >>     only.
> >>     >>>>>>
> >>     >>>>>> Les
> >>     >>>>>>
> >>     >>>>>> Many thx,
> >>     >>>>>> Robert
> >>     >>>>>>
> >>     >>>>>> _______________________________________________
> >>     >>>>>> Lsr mailing list
> >>     >>>>>> [email protected]
> >>     >>>>>> https://www.ietf.org/mailman/listinfo/lsr
> >>     >>>>>
> >>     >>>>
> >>     >>>
> >>     >>
> >>     >
> >>
>
>
