Re: [Lsr] BGP vs PUA/PULSE

Christian Hopps Fri, 14 Jan 2022 18:07:12 -0800

Yes, having worked intimately with these IGPs for > 20 years now, I understand 
the use and the implications of using summary routes. :)


My opinion remains unchanged.

Thanks,
Chris.
[as wg member]

> On Jan 14, 2022, at 8:50 PM, Aijun Wang <[email protected]> wrote:
> 
> Hi, Christian:
> 
> We should weigh the trade-off between summarizing and not summarizing.
> If you don’t summarize, then all the areas will be filled with the specific 
> detailed routes (all PEs’ loopbacks, and possibly all Ps’ loopbacks). This 
> certainly increases the burden on the routers. 
> 
> But with summarization, these specific routes need not exist in the routing 
> table. The nodes within the IGP need only be notified when a node fails, in 
> order to accelerate the switchover of the overlay service. 
> You can also choose not to use such a mechanism, in which case traffic to 
> the service will be blackholed for some time until the service/application 
> detects the failure.
> PUA/PULSE are just mechanisms to reduce the duration of the outage; they are 
> one kind of FRR technique.
> 
> Aijun Wang
> China Telecom
> 
>> On Jan 15, 2022, at 09:26, Christian Hopps <[email protected]> wrote:
>> 
>> 
>> 
>>> On Jan 14, 2022, at 8:25 PM, Christian Hopps <[email protected]> wrote:
>>> 
>>> I understand the proposal. As I've stated elsewhere, I do not believe there 
>>> is a problem here that needs solving. The "problem" was created by the user 
>>> by summarizing prefixes that should not have been summarized -- they 
>>> mis-configured their network. The routing protocols work just fine (act 
>>> very quickly) if you don't incorrectly summarize "really important 
>>> prefixes".
>>> 
>>> I was simply pointing out that IGPs also don't deal in liveness since that 
>>> keeps coming up.
>> 
>> Sorry that was "as wg member".
>> 
>>> 
>>> Thanks,
>>> Chris.
>>> 
>>>>> On Jan 14, 2022, at 8:06 PM, Aijun Wang <[email protected]> wrote:
>>>> 
>>>> Hi, Christian and John:
>>>> 
>>>> No. I think you may all be misunderstanding the proposal. What we are 
>>>> detecting is actually the reachability/liveness of the node that is 
>>>> connected to the application, not the application itself.
>>>> And I think node liveness is the same as node reachability. Either one 
>>>> will affect or break the path to the connected service if the node's 
>>>> forwarding function fails.
>>>> 
>>>> Aijun Wang
>>>> China Telecom
>>>> 
>>>>> On Jan 15, 2022, at 08:56, Christian Hopps <[email protected]> wrote:
>>>>> 
>>>>> Indeed, and in fact the IGP should only be dealing with the reachability 
>>>>> to the node, not with the node's or the application's liveness.
>>>>> 
>>>>> Thanks,
>>>>> Chris.
>>>>> [as wg member]
>>>>> 
>>>>>> On Jan 14, 2022, at 7:47 PM, John E Drake <[email protected]> wrote:
>>>>>> 
>>>>>> I don’t think so.  Today things just work, at a given time scale.  What 
>>>>>> you said you are trying to do is reduce the time scale for detecting 
>>>>>> that an application on a node has failed.  However, conflating the 
>>>>>> health of a node with the health of an application on that node seems to 
>>>>>> be inherently flawed.   
>>>>>> 
>>>>>> Yours Irrespectively,
>>>>>> 
>>>>>> John
>>>>>> 
>>>>>> 
>>>>>> From: Aijun Wang <[email protected]> 
>>>>>> Sent: Friday, January 14, 2022 7:40 PM
>>>>>> To: John E Drake <[email protected]>
>>>>>> Cc: Les Ginsberg (ginsberg) <[email protected]>; Robert Raszuk 
>>>>>> <[email protected]>; Christian Hopps <[email protected]>; Shraddha Hegde 
>>>>>> <[email protected]>; Tony Li <[email protected]>; Hannes Gredler 
>>>>>> <[email protected]>; lsr <[email protected]>; Peter Psenak (ppsenak) 
>>>>>> <[email protected]>
>>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
>>>>>> 
>>>>>> When the node is up, all subsequent processing is handed to the 
>>>>>> application layer. This is the normal procedure the IGP should follow.
>>>>>> By your logic, are IGPs solving the wrong problem today?
>>>>>> 
>>>>>> Aijun Wang 
>>>>>> China Telecom
>>>>>> 
>>>>>> 
>>>>>> On Jan 15, 2022, at 08:30, John E Drake 
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>>  
>>>>>> Correct, but as Tony, Robert and I have noted, a node being up does not 
>>>>>> mean that an application on that node is up, which means that your 
>>>>>> proposed solution is probably a solution to the wrong problem.  Further, 
>>>>>> Robert’s solution is probably a solution to the right problem.
>>>>>> 
>>>>>> Yours Irrespectively,
>>>>>> 
>>>>>> John
>>>>>> 
>>>>>> 
>>>>>> From: Aijun Wang <[email protected]> 
>>>>>> Sent: Friday, January 14, 2022 5:53 PM
>>>>>> To: John E Drake <[email protected]>
>>>>>> Cc: Robert Raszuk <[email protected]>; Les Ginsberg (ginsberg) 
>>>>>> <[email protected]>; Christian Hopps <[email protected]>; Shraddha 
>>>>>> Hegde <[email protected]>; Tony Li <[email protected]>; Hannes Gredler 
>>>>>> <[email protected]>; lsr <[email protected]>; Peter Psenak (ppsenak) 
>>>>>> <[email protected]>
>>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
>>>>>> 
>>>>>> Hi, John: 
>>>>>> Please note that if the node is down, the service cannot be accessed.
>>>>>> We are discussing the “DOWN” notification, not the “UP” notification.
>>>>>> 
>>>>>> Aijun Wang 
>>>>>> China Telecom
>>>>>> 
>>>>>> 
>>>>>> On Jan 15, 2022, at 00:25, John E Drake 
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>>  
>>>>>> Hi,
>>>>>> 
>>>>>> Comment inline below.
>>>>>> 
>>>>>> Yours Irrespectively,
>>>>>> 
>>>>>> John
>>>>>> 
>>>>>> 
>>>>>> From: Lsr <[email protected]> On Behalf Of Robert Raszuk
>>>>>> Sent: Monday, January 10, 2022 7:15 PM
>>>>>> To: Les Ginsberg (ginsberg) <[email protected]>
>>>>>> Cc: Christian Hopps <[email protected]>; Aijun Wang 
>>>>>> <[email protected]>; Shraddha Hegde <[email protected]>; Tony 
>>>>>> Li <[email protected]>; Hannes Gredler <[email protected]>; lsr 
>>>>>> <[email protected]>; Peter Psenak (ppsenak) <[email protected]>
>>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
>>>>>> 
>>>>>> Hi Les, 
>>>>>> 
>>>>>>> You seem focused on the notification delivery mechanism only.
>>>>>> 
>>>>>> Not really. For me, an advertised summary is like a prefix when you are 
>>>>>> dialing a country code. Call signaling knows to go north if you are 
>>>>>> calling a crab shop in Alaska. 
>>>>>> 
>>>>>> Now such direction does not indicate if the shop is open or has crabs. 
>>>>>> 
>>>>>> That info you need to get over the top as a service. So I am much more 
>>>>>> in favor of having the service tell you, directly or indirectly, that 
>>>>>> it is available. 
>>>>>> 
>>>>>> [JD]  Right.  Just because a node is up and connected to the network 
>>>>>> does not imply that a given application is active on it.
>>>>>> 
>>>>>> Best,
>>>>>> R.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Jan 11, 2022 at 1:07 AM Les Ginsberg (ginsberg) 
>>>>>> <[email protected]> wrote:
>>>>>> Robert -
>>>>>> 
>>>>>> From: Robert Raszuk <[email protected]> 
>>>>>> Sent: Monday, January 10, 2022 2:56 PM
>>>>>> To: Les Ginsberg (ginsberg) <[email protected]>
>>>>>> Cc: Tony Li <[email protected]>; Christian Hopps <[email protected]>; 
>>>>>> Peter Psenak (ppsenak) <[email protected]>; Shraddha Hegde 
>>>>>> <[email protected]>; Aijun Wang <[email protected]>; Hannes 
>>>>>> Gredler <[email protected]>; lsr <[email protected]>
>>>>>> Subject: Re: [Lsr] BGP vs PUA/PULSE
>>>>>> 
>>>>>> Les,
>>>>>> 
>>>>>> We have received requests from real customers who both need to summarize 
>>>>>> AND would like better response time to loss of reachability to 
>>>>>> individual nodes.
>>>>>> 
>>>>>> We all agree the request is legitimate. 
>>>>>> 
>>>>>> [LES:] It does not seem to me that everyone does agree on that – but I 
>>>>>> appreciate that you agree.
>>>>>> 
>>>>>> But do they realize that to practically employ what you are proposing 
>>>>>> (new PDU flooding) requires a 100% software upgrade of all IGP nodes in 
>>>>>> the entire network? Do they also realize that using it effectively 
>>>>>> requires a data plane change (software, sure, but data plane code is 
>>>>>> not as simple as PI) on all ingress PEs? 
>>>>>> 
>>>>>> [LES:] As far as forwarding, as Peter has indicated, we have a POC and 
>>>>>> it works fine. And there are many possible ways for implementations to 
>>>>>> go.
>>>>>> It may or may not require a 100% software upgrade – but I agree that a 
>>>>>> significant number of nodes have to be upgraded to at least support 
>>>>>> pulse flooding.
>>>>>> 
>>>>>> 
>>>>>> And with the scale requirements you are describing, it seems this would 
>>>>>> be 1000s of nodes (if not more). That's massive compared to alternative 
>>>>>> approaches that achieve the same or even better results. 
>>>>>> 
>>>>>> [LES:] Be happy to review other solutions if/when someone writes them up.
>>>>>> I think what is overlooked in the discussion of other solutions is that 
>>>>>> reachability info is provided by the IGP. If all the IGP advertises is a 
>>>>>> summary then how would individual loss of reachability become known at 
>>>>>> scale?
>>>>>> You seem focused on the notification delivery mechanism only.
>>>>>> 
>>>>>> Les
>>>>>> 
>>>>>> Many thx,
>>>>>> Robert
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Lsr mailing list
>>>>>> [email protected]
>>>>>> https://www.ietf.org/mailman/listinfo/lsr
>>>>> 
>>>> 
>>> 
>> 
> 