Not every piece of information carried in an LSP is consumed by every node 
in the IGP, and a PUA/PULSE message does not trigger an SPF calculation.

So what meltdown effect are you worried about?

The PUB/SUB model reintroduces connection state on the network devices, 
especially on the ABRs. Shouldn't this be considered seriously in a 
large-scale network?

Also, we are discussing control-plane message transport, not forwarding-plane 
flooding. So why would bridged L2 replacing the IP network be a relevant 
comparison?

Best Regards

Aijun Wang
China Telecom

From: [email protected] <[email protected]> 
Sent: Thursday, January 27, 2022 2:13 PM
To: Les Ginsberg (ginsberg) <[email protected]>
Cc: Christian Hopps <[email protected]>; [email protected]; Greg Mirsky 
<[email protected]>; Aijun Wang <[email protected]>
Subject: Re: [Lsr] How to forward the solutions for "Prefixes Unreachable 
Notification" problem

Right, I have also seen requirements of up to 0.5M nodes in a flat IGP core 
from some customers who think they'll have the money & need for such things 
(though I know of only one company in the world right now that runs anything 
deployed at this scale, give it a 2-3x multiplier for total networking devices 
including switches ;-). Now, generally IME those are customers that have not 
worked with IGPs extensively since the beginning and put lots of assumptions 
forward which may look good on paper, but it is not clear whether they can be 
built with the technology currently available (or maybe ever, given we're 
starting to talk gin-ormous ;-) amounts of state needing synchronization over 
what is basically a broadcast). My dry observation was that we either need to 
break up the network (theoretically, with enough areas that should be 
possible, but @ this IGP scale nothing is obvious ;-) or the other approach is 
to build it as a very regular graph and use the properties of the graph to 
reduce the amount of state in smart ways, so in short not really use 
"traditional" IGPs. 

Now, just because someone wants a huge IGP and lots of scale problems are 
being hit, the answer is not to make it go "faster" by increasing flooding & 
pushing more and more stuff into the broadcast, especially with flat 
addresses. If that "strategy" worked we would not have IP; the planet would 
be bridged L2 ;-) 

Sounds abstract until the ugliness in the field overruns you (and the 
assertions of "rate limited" and "timer expiring" and such stuff are basically 
assumptions which experience shows will be invalidated by customers fiddling 
with knobs or by the 'unexpected' meltdown that will expose the one huge blast 
radius this all encompasses). 

IME at this scale you are really best served by a subscribe/publish mechanism, 
or to put it differently, scoped queries on the network. Or you limit the 
domain by e.g. carrying nhops in their own dedicated, reflected IBGP or some 
such thing (roughly the stuff Robert was ruminating on). And @ the 
publish/subscribe architectural fork in the road I truly do think that the 
IGP's role should be limited to exposing the SSAPs (roughly Tony's stuff). A 
possible variation on the theme is an mp2mp substrate that is an S-PMSI at the 
same time, but that's probably too far out for people's thinking of today ;-) 

The discussion is currently bogged down, as a couple of folks observed. Given 
the variety of assumptions and experience levels & claims extended happily 
(where I often ask myself how they have been confabulated), I don't see that 
much progress will be made. A good option may be to observe that there is no 
ultimate need for the IGP to get involved here and things can be solved by an 
overlay solution of PEs just fine, and with that move on to practical problems 
we can agree the IGP must deal with (as others did already). 

--- tony 

On Thu, Jan 27, 2022 at 1:49 AM Les Ginsberg (ginsberg) <[email protected]> 
wrote:

Chris -

The scale request comes from real customers. So, it is understandable for you 
to be "aghast" - but it is a real request.

As far as BFD goes, my opinion is this won’t scale. There is a significant 
difference between operating sessions which continuously monitor liveness in a 
full mesh versus using some approach which only triggers network-wide traffic 
when some topology change is locally detected. There are multiple approaches 
being discussed which do the latter - but BFD is not one of them.
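
As a rough check on the scale under discussion, the full-mesh session count 
can be sketched as below. This is only back-of-the-envelope arithmetic for the 
scenario given in this thread (100 areas, 1000 PEs each); the assumption that 
every pair of PEs shares exactly one session, and the names used in the code, 
are illustrative and not taken from any proposal.

```python
# Rough session-count arithmetic for the scenario discussed in this thread.
# Assumption (illustrative): one liveness session per unordered pair of PEs.
AREAS = 100
PES_PER_AREA = 1000


def full_mesh_sessions(nodes):
    """Return (sessions terminated per node, unique sessions in the mesh)."""
    per_node = nodes - 1              # every node peers with all the others
    total = nodes * per_node // 2     # each session is shared by two nodes
    return per_node, total


total_pes = AREAS * PES_PER_AREA
per_pe, network_wide = full_mesh_sessions(total_pes)
print(f"{total_pes:,} PEs: {per_pe:,} sessions per PE, "
      f"{network_wide:,} unique sessions network-wide")
```

Under this assumption each PE terminates roughly 100K sessions, and the count 
of unique sessions network-wide comes out in the billions; any smaller 
network-wide figure would need a sparser session layout than a strict full 
mesh.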

You can disagree - or - as Greg has done - say we don’t really have to consider 
this scale. I am not going to try to convince you otherwise.
But if so you aren’t solving the problem we have been asked to solve.

   Les


> -----Original Message-----
> From: Christian Hopps <[email protected]>
> Sent: Wednesday, January 26, 2022 2:15 PM
> To: Les Ginsberg (ginsberg) <[email protected]>
> Cc: Greg Mirsky <[email protected]>; Aijun Wang <[email protected]>;
> [email protected]
> Subject: Re: [Lsr] How to forward the solutions for "Prefixes Unreachable
> Notification" problem
> 
> 
> "Les Ginsberg (ginsberg)" <[email protected]> writes:
> 
> > Greg –
> >
> > With 100K PE scale, we are talking about 100K BFD sessions/PE and
> > close to 5 million BFD sessions network-wide.
> >
> > Eliminating one of the options we are discussing is admittedly a
> > small step, but still worthwhile.
> 
> Hang on a sec. :)
> 
> We are starting off with this GINORMOUS network with 100,000 PE routers!
> Why would 5 million sessions of anything over this gigantic network of
> routers be a reason to disregard it as a solution? (How many total routers are
> there BTW?)
> 
> If you build something gigantic *everything* is going to scale way up. To use
> an oldie but a goodie: TANSTAAFL.
> 
> Thanks,
> Chris.
> 
> 
> >
> >
> > However, If you still want to continue to advocate for BFD, I will
> > say no more.
> >
> >
> >
> >    Les
> >
> >
> >
> > From: Lsr <[email protected]> On Behalf Of Greg Mirsky
> > Sent: Tuesday, January 25, 2022 7:06 PM
> > To: Aijun Wang <[email protected]>
> > Cc: lsr <[email protected]>
> > Subject: Re: [Lsr] How to forward the solutions for "Prefixes
> > Unreachable Notification" problem
> >
> >
> >
> > Hi Aijun,
> >
> > I believe that under Option D you can add multihop BFD per RFC 5883.
> > No new protocols needed.
> >
> >
> >
> > Regards,
> >
> > Greg
> >
> >
> >
> > On Tue, Jan 25, 2022, 18:17 Aijun Wang <[email protected]> wrote:
> >
> >     Hi, All:
> >
> >
> >
> >     Following Peter’s example and Acee’s suggestions, let’s focus on
> >     the following problem and think about how to solve it efficiently
> >     and reasonably:
> >
> >     Scenario: 100 areas each with 1000 PEs (100K total PEs) with 2
> >     ABRs per area
> >
> >     Problem: Overlay services (BGP or tunnel) that rely on the IGP
> >     need to be notified immediately when the remote peer fails, to
> >     help such overlay services accomplish fast switchover (how to
> >     switch over is out of scope for this discussion)
> >
> >     Potential Solutions:
> >
> >        There are now mainly four categories of solutions, described
> >     below along with a brief analysis:
> >
> >        Category A: PUA/PULSE. Utilize the existing IGP mechanism to
> >     transport/flood the notification message.
> >
> >        Category B: Detailed/Important Prefix Leaking. Bypass the
> >     summarization side effect for some detailed/important prefixes by
> >     leaking them (not summarizing them) into each area.
> >
> >        Category C: BGP-based solution. Utilize the existing BGP
> >     infrastructure to transport the notification message.
> >
> >        Category D: OOB solution. Design a new OOB protocol to
> >     transport the notification message.
> >
> >
> >
> >     Because we are in the LSR WG, and people here are all IGP experts,
> >     can we now, after the intense discussion, focus on Categories A/B?
> >
> >     It would be odd for the LSR WG to produce a BGP-based or OOB-based
> >     solution. I think they may be feasible, but they should be
> >     evaluated/discussed by other WGs.
> >
> >     Otherwise, I think we can’t converge on one standard solution.
> >
> >
> >
> >     From the POV of the operator, we prefer the IGP-based solution.
> >     If there are no unsolvable concerns, let’s accept it. I think
> >     there is enough interest and expertise to accomplish this task.
> >
> >
> >
> >     Best Regards
> >
> >
> >
> >     Aijun Wang
> >
> >     China Telecom
> >
> >
> >
> >     _______________________________________________
> >     Lsr mailing list
> >     [email protected]
> >     https://www.ietf.org/mailman/listinfo/lsr