Re: [Lsr] BGP vs PUA/PULSE

Tony Li Mon, 24 Jan 2022 23:25:04 -0800

Hi Aijun,

> 1) Consider in the BGP scenario: every PE may receive the routes from other 
> PEs, right? So, using the PUB/SUB model, every PE should subscribe the status 
> of the other PEs, right?



My understanding is that a PE typically only has tunnels to some other select 
number of PEs. Yes, each PE would register for the other PEs that it connects 
to.


> 2) Consider in the tunnel scenario: every PE/P may select other PE/P as the 
> tunnel endpoint, right?, So, using the PUB/SUB model, every P/PE should 
> subscribe the status of other P/PEs, right?
> Then, with the approach of PUB/SUB direction, the network will eventually 
> evolved into full mesh like subscription. That is, every device in the 
> network will care about every other device's status. Then, isn't the flooding 
> mechanism the most efficient one?


Efficient in what metric? In terms of the number of unique messages initiated, 
yes. However, that is not the metric that matters. What matters is the load on 
the network when there are failures and PUA dumps things into the L2 LSDB.

We specifically have to design routing protocols to operate in worst case 
scenarios. PUA means that there is no real upper bound to the worst case. Bad 
news that we weren’t expecting can just keep piling up. The worst scale point 
is when everything has failed. For ensuring stability, that’s disastrous. And 
stability is way, way, way more important than efficiency.


> If we take the no summary solution, for the above so called "important 
> prefixes", then:
> 1) All the devices within the network will be filled with these detail 
> prefixes in the normal state, right? 


Yes.


> 2) When there is the massive failure as that you often worried, the status of 
> such detailed prefixes will influence the IGP convergence, right?


Unlikely. If there are massive failures then prefixes will be withdrawn, not 
added.  State goes down. The stress on the network goes down. This is a highly 
desirable property.  Withdrawing state is not going to significantly affect the 
performance of the IGP.  SPF performance is O(n log n) for n = (# edges + 
# nodes).  By comparison, the number of prefixes is noise.
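
To make the scaling point concrete, here is a minimal sketch, assuming a 
heap-based Dijkstra and a toy four-node topology. The names spf and 
prefix_metrics, the graph, and the prefixes are all hypothetical and not taken 
from any implementation. The heap work depends only on nodes and edges; 
prefixes are attached afterwards in a single linear pass.

```python
import heapq

def spf(graph, root):
    """Dijkstra with a binary heap. graph: {node: [(neighbor, cost), ...]}."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def prefix_metrics(dist, prefixes):
    """prefixes: {prefix: (advertising_node, metric)}. Linear pass, no heap work."""
    return {
        prefix: dist[node] + metric
        for prefix, (node, metric) in prefixes.items()
        if node in dist  # prefixes behind unreachable nodes are simply skipped
    }

# Hypothetical topology and loopbacks, for illustration only.
graph = {
    "A": [("B", 10), ("C", 5)],
    "B": [("A", 10), ("D", 1)],
    "C": [("A", 5), ("D", 20)],
    "D": [("B", 1), ("C", 20)],
}
prefixes = {"192.0.2.1/32": ("D", 0), "192.0.2.2/32": ("C", 0)}

dist = spf(graph, "A")
routes = prefix_metrics(dist, prefixes)
```

Withdrawing a prefix in this model removes one entry from the linear pass and 
leaves the SPF itself untouched.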


> 3) And, when the massive failure recovery, the status of such detailed 
> prefixes will influence AGAIN the performance of IGP, right?


No.  Same reasoning.


> But with the summary+PUA/PULSE(with the threshold control on ABRs as 
> described in 
> https://mailarchive.ietf.org/arch/msg/lsr/fnP1dwvWhT3oRduwXGK73NzQAUo/), 
> 1) There is NO stress for all the devices in the network to keep the above 
> detailed "important prefixes" in the normal state, right?


True. Also irrelevant.


> 2) There is CONTROLLED influence to the IGP when massive failure occur, right?


Perhaps, but the grave concern is whether the ‘controlled influence’ is adequate 
to maintain stability while still providing some benefit.  Again, the feature is 
architected backwards: it adds stress under failure conditions. Exactly what we 
don’t want.

And whatever control is installed, some customer will dial it up to 11 and then 
call my CEO when their network melts down. No thank you.
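
The direction of the state change can be shown with a toy model. This is a 
deliberately simplified sketch, assuming one loopback per PE and one 
unreachability announcement per failed PE, with no ABR threshold applied. Both 
function names are hypothetical.

```python
def leaked_loopback_entries(total_pes, failed):
    # Leaked loopbacks: every live PE is advertised, failures withdraw entries.
    return total_pes - failed

def pua_extra_entries(failed):
    # PUA (simplified): each failed PE adds one unreachability announcement.
    return failed

total = 1000
for failed in (0, 10, 500, 1000):
    print(failed,
          leaked_loopback_entries(total, failed),
          pua_extra_entries(failed))
```

With leaking, advertised state is largest in the steady state and shrinks as 
failures accumulate. In this model, PUA state is zero in the steady state and 
largest when everything has failed.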


> 3) There is NO influence to the IGP performance when massive failure 
> recovery, right?


Irrelevant. The recovery time is irrelevant. Again, the primary requirement is 
stability.


> Which one is the best option then?


As we’ve been saying for months now, the ordering is:

1) Leak PE loopbacks
2) Pub/Sub
3) Carry loopbacks in BGP and recurse
4) Multi-hop BFD 
5) Pulse
6) PUA

Stability, stability, stability, and stability. Get the message?

Tony


_______________________________________________
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr
