Hi, Chris:

Note that this is not just "a prefix"; it can be "all nodes' loopback addresses, or even some links' addresses". Gyan's reference to RFC 5302 clearly states the disadvantage of not summarizing, and operators have followed this approach for about 20 years. Are you now proposing to go in a different direction?
Best Regards

Aijun Wang
China Telecom

-----Original Message-----
From: Christian Hopps <[email protected]>
Sent: Monday, January 24, 2022 1:50 PM
To: Gyan Mishra <[email protected]>
Cc: Christian Hopps <[email protected]>; Aijun Wang <[email protected]>; Hannes Gredler <[email protected]>; John E Drake <[email protected]>; Les Ginsberg (ginsberg) <[email protected]>; Peter Psenak (ppsenak) <[email protected]>; Robert Raszuk <[email protected]>; Shraddha Hegde <[email protected]>; Tony Li <[email protected]>; lsr <[email protected]>
Subject: Re: [Lsr] BGP vs PUA/PULSE

Ok, I guess I'll repeat what I said, as I don't believe anything new was presented here.

Yes, having worked intimately with these IGPs for > 20 years now, I understand the use and the implications of using summary routes. :)

My opinion remains unchanged.

"If a prefix is important enough to consider seriously hacking the routing protocol to signal the prefix being unreachable, then that prefix is important enough to not summarize to begin with."

IOW; KISS

I'd prefer to not keep repeating this when presented with the same arguments, so please take any silence on my part as my opinion being unchanged.

Thanks,
Chris.
[As WG member]

Gyan Mishra <[email protected]> writes:

> Hi Chris
>
> Just about every vendor out there recommended best practice is to layout address plan to take advantage of summarization wherever possible and that as well includes PE loopback next hop attribute to limit the router load as well as size of LSDB in the backbone as well as domain wide.
>
> I think you would be hard pressed to find any vendor that would say go ahead and flood loopbacks domain wide and don’t summarize.
>
> In large domains flooding domain wide is not feasible and summarization is requirement even for the critical loopback BGP next hops for most operators.
>
> RFC 5302 talks about the ramifications of flooding in ISIS domain in section 1.2 excerpt below:
>
> 1.2. Scalability
>
> The disadvantage to performing the domain-wide prefix distribution described above is that it has an impact on the scalability of IS-IS. Areas within IS-IS help scalability in that LSPs are contained within a single area. This limits the size of the link state database, which in turn limits the complexity of the shortest path computation.
>
> Further, the summarization of the prefix information aids scalability in that the abstraction of the prefix information removes the sheer number of data items to be transported and the number of routes to be computed.
>
> It should be noted quite strongly that the distribution of prefixes on a domain-wide basis impacts the scalability of IS-IS in the second respect. It will increase the number of prefixes throughout the domain. This will result in increased memory consumption, transmission requirements, and computation requirements throughout the domain.
>
> It must also be noted that the domain-wide distribution of prefixes has no effect whatsoever on the first aspect of scalability, namely the existence of areas and the limitation of the distribution of the link state database.
>
> Gyan
>
> On Fri, Jan 14, 2022 at 9:07 PM Christian Hopps <[email protected]> wrote:
>
> Yes, having worked intimately with these IGPs for > 20 years now, I understand the use and the implications of using summary routes. :)
>
> My opinion remains unchanged.
>
> Thanks,
> Chris.
> [as wg member]
>
> > On Jan 14, 2022, at 8:50 PM, Aijun Wang <[email protected]> wrote:
> >
> > Hi, Christian:
> >
> > We should consider the balance and efficiency for the summary or not summary.
> > If you don’t summary, then all the areas will be filled with the specified detail routes(all PE’s loopback, may also include all P’s loopback). This can certainly increase the burden of the routers.
> >
> > But with summary, all these specific routes need not exist in the routing table. The nodes within the IGP need only be notified when one node is failure to accelerate the switchover of the overlay service.
> > And, you can also select to not using such mechanism, then the service will be backhole for some time until the service/application find this abnormal phenomenon.
> > PUA/PULSE are just the mechanism to reduce the abnormal durations, it is one kind of FRR technique.
> >
> > Aijun Wang
> > China Telecom
> >
> >> On Jan 15, 2022, at 09:26, Christian Hopps <[email protected]> wrote:
> >>
> >>> On Jan 14, 2022, at 8:25 PM, Christian Hopps <[email protected]> wrote:
> >>>
> >>> I understand the proposal. As I've stated elsewhere, I do not believe there is a problem here that needs solving. The "problem" was created by the user by summarizing prefixes that should not have been summarized -- they mis-configured their network. The routing protocols works just fine (act very quickly) if you don't incorrectly summarize "really important prefixes".
> >>>
> >>> I was simply pointing out that IGPs also don't deal in liveness since that keeps coming up.
> >>
> >> Sorry that was "as wg member".
> >>
> >>>
> >>> Thanks,
> >>> Chris.
> >>>
> >>>>> On Jan 14, 2022, at 8:06 PM, Aijun Wang <[email protected]> wrote:
> >>>>
> >>>> Hi, Christian and John:
> >>>>
> >>>> No. I think you all may misunderstand the proposal. What we are detecting is actually the reachability/liveness of node that connected to the application, not the application itself.
> >>>> And, I think the node liveness is same as the node reachability. They will all influence or break the path to their connected service if their forwarding function is failed.
> >>>>
> >>>> Aijun Wang
> >>>> China Telecom
> >>>>
> >>>>> On Jan 15, 2022, at 08:56, Christian Hopps <[email protected]> wrote:
> >>>>>
> >>>>> Indeed, and in fact the IGP should only be dealing with the reachability to the node, not with the node or applications liveness.
> >>>>>
> >>>>> Thanks,
> >>>>> Chris.
> >>>>> [as wg member]
> >>>>>
> >>>>>> On Jan 14, 2022, at 7:47 PM, John E Drake <[email protected]> wrote:
> >>>>>>
> >>>>>> I don’t think so. Today things just work, at a given time scale. What you said you are trying to do is reduce the time scale for detecting that an application on a node has failed. However, conflating the health of a node with the health of an application on that node seems to be inherently flawed.
> >>>>>>
> >>>>>> Yours Irrespectively,
> >>>>>>
> >>>>>> John
> >>>>>>
> >>>>>> Juniper Business Use Only
> >>>>>> From: Aijun Wang <[email protected]>
> >>>>>> Sent: Friday, January 14, 2022 7:40 PM
> >>>>>> To: John E Drake <[email protected]>
> >>>>>> Cc: Les Ginsberg (ginsberg) <[email protected]>; Robert Raszuk <[email protected]>; Christian Hopps <[email protected]>; Shraddha Hegde <[email protected]>; Tony Li <[email protected]>; Hannes Gredler <[email protected]>; lsr <[email protected]>; Peter Psenak (ppsenak) <[email protected]>
> >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>>>>>
> >>>>>> [External Email. Be cautious of content]
> >>>>>>
> >>>>>> When the node is up, all the following process are passed to the application layer. This is the normal procedures of the IGP should do.
> >>>>>> According to your logic, IGP are solving the wrong problem now?
> >>>>>>
> >>>>>> Aijun Wang
> >>>>>> China Telecom
> >>>>>>
> >>>>>> On Jan 15, 2022, at 08:30, John E Drake <jdrake=[email protected]> wrote:
> >>>>>>
> >>>>>> Correct, but as Tony, Robert and I have noted, a node being up does not mean that an application on that node is up, which means that your proposed solution is probably a solution to the wrong problem. Further, Robert’s solution is probably a solution to the right problem.
> >>>>>>
> >>>>>> Yours Irrespectively,
> >>>>>>
> >>>>>> John
> >>>>>>
> >>>>>> Juniper Business Use Only
> >>>>>> From: Aijun Wang <[email protected]>
> >>>>>> Sent: Friday, January 14, 2022 5:53 PM
> >>>>>> To: John E Drake <[email protected]>
> >>>>>> Cc: Robert Raszuk <[email protected]>; Les Ginsberg (ginsberg) <[email protected]>; Christian Hopps <[email protected]>; Shraddha Hegde <[email protected]>; Tony Li <[email protected]>; Hannes Gredler <[email protected]>; lsr <[email protected]>; Peter Psenak (ppsenak) <[email protected]>
> >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>>>>>
> >>>>>> [External Email. Be cautious of content]
> >>>>>>
> >>>>>> Hi, John:
> >>>>>> Please note if the node is down, the service will not be accessed.
> >>>>>> We are discussing the “DOWN” notification, not the “UP” notification.
> >>>>>>
> >>>>>> Aijun Wang
> >>>>>> China Telecom
> >>>>>>
> >>>>>> On Jan 15, 2022, at 00:25, John E Drake <jdrake=[email protected]> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Comment inline below.
> >>>>>>
> >>>>>> Yours Irrespectively,
> >>>>>>
> >>>>>> John
> >>>>>>
> >>>>>> Juniper Business Use Only
> >>>>>> From: Lsr <[email protected]> On Behalf Of Robert Raszuk
> >>>>>> Sent: Monday, January 10, 2022 7:15 PM
> >>>>>> To: Les Ginsberg (ginsberg) <[email protected]>
> >>>>>> Cc: Christian Hopps <[email protected]>; Aijun Wang <[email protected]>; Shraddha Hegde <[email protected]>; Tony Li <[email protected]>; Hannes Gredler <[email protected]>; lsr <[email protected]>; Peter Psenak (ppsenak) <[email protected]>
> >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>>>>>
> >>>>>> [External Email. Be cautious of content]
> >>>>>>
> >>>>>> Hi Les,
> >>>>>>
> >>>>>>> You seem focused on the notification delivery mechanism only.
> >>>>>>
> >>>>>> Not really. For me, an advertised summary is like a prefix when you are dialing a country code. Call signaling knows to go north if you are calling a crab shop in Alaska.
> >>>>>>
> >>>>>> Now such direction does not indicate if the shop is open or has crabs.
> >>>>>>
> >>>>>> That info you need to get over the top as a service. So I am much more in favor to make the service to tell you directly or indirectly that it is available.
> >>>>>>
> >>>>>> [JD] Right. Just because a node is up and connected to the network does not imply that a given application is active on it.
> >>>>>>
> >>>>>> Best,
> >>>>>> R.
> >>>>>>
> >>>>>> On Tue, Jan 11, 2022 at 1:07 AM Les Ginsberg (ginsberg) <[email protected]> wrote:
> >>>>>> Robert -
> >>>>>>
> >>>>>> From: Robert Raszuk <[email protected]>
> >>>>>> Sent: Monday, January 10, 2022 2:56 PM
> >>>>>> To: Les Ginsberg (ginsberg) <[email protected]>
> >>>>>> Cc: Tony Li <[email protected]>; Christian Hopps <[email protected]>; Peter Psenak (ppsenak) <[email protected]>; Shraddha Hegde <[email protected]>; Aijun Wang <[email protected]>; Hannes Gredler <[email protected]>; lsr <[email protected]>
> >>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
> >>>>>>
> >>>>>> Les,
> >>>>>>
> >>>>>> We have received requests from real customers who both need to summarize AND would like better response time to loss of reachability to individual nodes.
> >>>>>>
> >>>>>> We all agree the request is legitimate.
> >>>>>>
> >>>>>> [LES:] It does not seem to me that everyone does agree on that – but I appreciate that you agree.
> >>>>>>
> >>>>>> But do they realize that to practically employ what you are proposing (new PDU flooding) requires 100% software upgrade to all IGP nodes in the entire network ? Do they also realize that to effectively use it requires data plane change (sure software but data plane code is not as simple as PI) on all ingress PEs ?
> >>>>>>
> >>>>>> [LES:] As far as forwarding, as Peter has indicated, we have a POC and it works fine. And there are many possible ways for implementations to go.
> >>>>>> It may or may not require 100% software upgrade – but I agree a significant number of nodes have to be upgraded to at least support pulse flooding.
> >>>>>>
> >>>>>> And with scale requirements you are describing it seems this would be 1000s of nodes (if not more). That's massive if compared to alternative approaches to achieve the same or even better results.
> >>>>>>
> >>>>>> [LES:] Be happy to review other solutions if/when someone writes them up.
> >>>>>> I think what is overlooked in the discussion of other solutions is that reachability info is provided by the IGP. If all the IGP advertises is a summary then how would individual loss of reachability become known at scale?
> >>>>>> You seem focused on the notification delivery mechanism only.
> >>>>>>
> >>>>>> Les
> >>>>>>
> >>>>>> Many thx,
> >>>>>> Robert
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Lsr mailing list
> >>>>>> [email protected]
> >>>>>> https://www.ietf.org/mailman/listinfo/lsr
