Re: [Lsr] BGP vs PUA/PULSE

Aijun Wang Wed, 01 Dec 2021 15:59:06 -0800

Hi, Robert:

Aijun Wang
China Telecom


> On Dec 2, 2021, at 04:42, Robert Raszuk <[email protected]> wrote:
> 
> 
> Apologies 2 corrections:
> 
> 1)  s/to their inter-as/ to their inter-area/
> 
> 2)  "service stops for configured PULSE timeout (as discussed 200 sec)."  
> Actually in the described case it is much worse ... Service to such an area 
> stops forever, as the service layer may not be aware of this kind of false 
> positive at all! 
> 
> Btw this is also not an implementation detail as all multi vendor ABRs better 
> work in the same manner. 
> 
> And the robust solution to this case seems to be along the lines of the logic 
> you have described. PULSES must be acted on by L2 ABRs or by remote PEs 
> *only* when all sources of the summaries inject identical PULSE. 

[WAJ] 
https://datatracker.ietf.org/doc/html/draft-wang-lsr-prefix-unreachable-annoucement-08#section-4
 already describes such situations. I also presented it at the IETF 112 
meeting.
Please see the last paragraph of that section.
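
As a rough illustration of the "act only when all sources of the summary 
have pulsed" rule discussed above, here is a minimal sketch. It is not taken 
from either draft: the class name, method names and the string identifiers 
for ABRs and prefixes are all made up for illustration, and it deliberately 
ignores how long pulse state should be remembered.

```python
from collections import defaultdict


class PulseTracker:
    """Hypothetical per-receiver state: who advertises a summary, who pulsed."""

    def __init__(self):
        # summary prefix -> set of ABR ids currently advertising that summary
        self.advertisers = defaultdict(set)
        # (summary, pulsed prefix) -> set of ABR ids that pulsed that prefix
        self.pulsed = defaultdict(set)

    def add_summary(self, summary, abr):
        """Record that `abr` is a source of `summary`."""
        self.advertisers[summary].add(abr)

    def on_pulse(self, summary, prefix, abr):
        """Record a pulse for `prefix` from `abr`.

        Returns True only once every ABR advertising the covering summary
        has pulsed the same prefix; otherwise the pulse is not acted upon.
        """
        if abr not in self.advertisers[summary]:
            # Pulse from a node that is not a source of the summary: ignore.
            return False
        self.pulsed[(summary, prefix)].add(abr)
        return self.pulsed[(summary, prefix)] == self.advertisers[summary]


tracker = PulseTracker()
tracker.add_summary("10.1.0.0/16", "ABR1")
tracker.add_summary("10.1.0.0/16", "ABR2")
first = tracker.on_pulse("10.1.0.0/16", "10.1.0.7/32", "ABR1")   # one source only
second = tracker.on_pulse("10.1.0.0/16", "10.1.0.7/32", "ABR2")  # all sources now
```

With two ABRs advertising the same summary, a pulse from only one of them 
(for example, the one whose L1-facing line card failed) is not acted upon; 
the receiver reacts only after the second ABR pulses the same prefix.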

> 
> That makes the feature a bit more complex ....
> 
> Thx,
> R.
> 
> 
> 
> 
> 
> 
> 
>> On Wed, Dec 1, 2021 at 9:25 PM Robert Raszuk <[email protected]> wrote:
>> Hi Tony,
>> 
>> I have been thinking about your email a bit more. Actually the destructive 
>> issue you have described can happen not only in the case of partitioned L1 
>> areas. 
>> 
>> Deployment scenario: 
>> 
>> It is quite often the case that ABRs' intra-area connectivity is very 
>> different to their inter-as connections. That usually means that different 
>> line cards are used to connect to other routers in the local area than those 
>> in the core area. 
>> 
>> So when anything happens to the line card which connects L1 (for example it 
>> goes down, there is massive congestion, protocol queue is full etc ...), then 
>> once previously received LSPs expire such an ABR may trigger PULSE of all PE routers 
>> domain wide. And all the fuses discussed to prevent massive flooding will 
>> not kick in as there may be just say 10 PEs in the area - all working just 
>> fine. 
>> 
>> The other ABRs will happily continue to inject summaries but service stops 
>> for configured PULSE timeout (as discussed 200 sec). Note that it is full 
>> service stop not switching to a backup path as all PEs in the area PULSED 
>> domain wide. Not good. 
>> 
>> I have not seen any discussion about such a failure case so far. And only 
>> your mail triggered it ! 
>> 
>> Many thx,
>> R.
>> 
>> 
>> 
>>> On Wed, Dec 1, 2021 at 5:04 PM Robert Raszuk <[email protected]> wrote:
>>> Hi Tony, 
>>> 
>>> On #2 you are right in the case of the source L1 getting partitioned. Yes, it 
>>> will kill anycast design. If this is a showstopper ... not sure. AFAIK only 
>>> sourcing ABRs need to keep track of whether all links to the PE are down. That 
>>> requirement does not propagate any further upstream. 
>>> 
>>> Thx
>>> 
>>> On Wed, Dec 1, 2021 at 4:58 PM Tony Przygienda <[email protected]> wrote:
>>>> 1. my question is different. why does the draft say that seqnr# & IDs have 
>>>> to be preserved between restarts 
>>>> 
>>>> 2. I'm still concerned about L1/L2 hierarchy. If an L2 border sees same 
>>>> prefix negative pulses from two different L1/L2s  it still has to keep 
>>>> state to only pulse into L1 after _all_ the guys pulsed negative (which is 
>>>> basically impossible since the _negative_ cannot persist it seems). Now 
>>>> how will it even know that? it has to keep track who advertised the same 
>>>> summary & who pulsed or otherwise it will pulse on anyone with a summary 
>>>> giving a pulse and with that anycast won't work AFAIS and worse you get 
>>>> into weird situations where you have 2 L1/L2 into same L1 area, one lost 
>>>> link to reach the PE (arguably L1 got partitioned) and pulses & then the 
>>>> L1/L2 on the border of the down L1 pulses and tears the session down 
>>>> albeit the prefix is perfectly reachable through the other L1/L2. I assume 
>>>> that parses for the cognoscenti ... 
>>>> 
>>>> -=--- tony 
>>>> 
>>>>> On Wed, Dec 1, 2021 at 4:00 PM Peter Psenak <[email protected]> wrote:
>>>>> Tony,
>>>>> 
>>>>> On 01/12/2021 15:31, Tony Przygienda wrote:
>>>>> 
>>>>> > 
>>>>> > Or maybe I missed something in the draft or between the lines in the 
>>>>> > whole thing ... Do we assume the negative just quickly tears down the 
>>>>> > BGP session & then it loses any relevance and we rely on BGP to retry 
>>>>> > after reset automatically or something? 
>>>>> 
>>>>> yes.
>>>>> 
>>>>> 
>>>>> > But then why do we even care about retaining the LSP IDs & SeqNr# would 
>>>>> > I ask?
>>>>> 
>>>>> it's used for the purpose of flooding, so that during the flooding you 
>>>>> do not flood the same pulse LSP multiple times.
>>>>> 
>>>>> thanks,
>>>>> Peter
>>>>> 
>>>>> 
>>>>> > 
>>>>> > -- tony
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > On Tue, Nov 30, 2021 at 11:19 PM Les Ginsberg (ginsberg) 
>>>>> > <[email protected] 
>>>>> > <mailto:[email protected]>> wrote:
>>>>> > 
>>>>> >     Hannes -
>>>>> > 
>>>>> >     Please see
>>>>> >     
>>>>> > https://datatracker.ietf.org/doc/html/draft-ppsenak-lsr-igp-event-notification-00#section-4.1
>>>>> > 
>>>>> >     The new Pulse LSPs don't have remaining lifetime - quite 
>>>>> > intentionally.
>>>>> >     They are only retained long enough to support flooding.
>>>>> > 
>>>>> >     But, you remind me that we need to specify how the checksum is
>>>>> >     calculated. Will do that in the next revision.
>>>>> > 
>>>>> >     Thanx.
>>>>> > 
>>>>> >          Les
>>>>> > 
>>>>> >      > -----Original Message-----
>>>>> >      > From: Hannes Gredler <[email protected] 
>>>>> > <mailto:[email protected]>>
>>>>> >      > Sent: Tuesday, November 30, 2021 11:22 AM
>>>>> >      > To: Peter Psenak (ppsenak) <[email protected]
>>>>> >     <mailto:[email protected]>>
>>>>> >      > Cc: Robert Raszuk <[email protected] <mailto:[email protected]>>;
>>>>> >     Les Ginsberg (ginsberg)
>>>>> >      > <[email protected] <mailto:[email protected]>>; Aijun Wang
>>>>> >     <[email protected] <mailto:[email protected]>>; lsr
>>>>> >      > <[email protected] <mailto:[email protected]>>; Tony Li <[email protected]
>>>>> >     <mailto:[email protected]>>; Shraddha Hegde
>>>>> >      > <[email protected] <mailto:[email protected]>>
>>>>> >      > Subject: Re: [Lsr] BGP vs PUA/PULSE
>>>>> >      >
>>>>> >      > hi peter,
>>>>> >      >
>>>>> >      > Just curious: Do you have an idea how to make short-lived LSPs
>>>>> >     compatible
>>>>> >      > with the problem stated in
>>>>> >      > https://datatracker.ietf.org/doc/html/rfc7987
>>>>> >      >
>>>>> >      > Would like to hear your thoughts on that.
>>>>> >      >
>>>>> >      > thanks,
>>>>> >      >
>>>>> >      > /hannes
>>>>> >      >
>>>>> >      > On Tue, Nov 30, 2021 at 01:15:04PM +0100, Peter Psenak wrote:
>>>>> >      > | Hi Robert,
>>>>> >      > |
>>>>> >      > | On 30/11/2021 12:40, Robert Raszuk wrote:
>>>>> >      > | > Hey Peter,
>>>>> >      > | >
>>>>> >      > | >      > #1 - I am not ok with the ephemeral nature of the
>>>>> >     advertisements. (I
>>>>> >      > | >      > proposed an alternative).
>>>>> >      > | >
>>>>> >      > | >     LSPs have their age today. One can generate LSP with the
>>>>> >     lifetime of 1
>>>>> >      > | >     min. Protocol already allows that.
>>>>> >      > | >
>>>>> >      > | >
>>>>> >      > | > That's a pretty clever comparison indeed. I had a feeling it
>>>>> >     will come
>>>>> >      > | > up here and here you go :)
>>>>> >      > | >
>>>>> >      > | > But I am afraid this is not comparing apple to apples.
>>>>> >      > | >
>>>>> >      > | > In LSPs or LSA flooding you have a bunch of mechanisms to
>>>>> >     make sure the
>>>>> >      > | > information stays fresh
>>>>> >      > | > and does not time out. And the default refresh in ISIS if I
>>>>> >     recall was
>>>>> >      > | > something like 15 minutes ?
>>>>> >      > |
>>>>> >      > | yes, default refresh is 900 for the default lifetime of 1200
>>>>> >     sec. Most
>>>>> >      > | people change both to much larger values.
>>>>> >      > |
>>>>> >      > | If I send the LSP with the lifetime of 1 min, there will never
>>>>> >     be any
>>>>> >      > | refresh of it. It will last 1 min and then will be purged and
>>>>> >     removed from
>>>>> >      > | the database. The only difference with the Pulse LSP is that it
>>>>> >     is not
>>>>> >      > | purged to avoid additional flooding.
>>>>> >      > |
>>>>> >      > |
>>>>> >      > | >
>>>>> >      > | >     Today in all MPLS networks host routes from all areas are
>>>>> >     "spread"
>>>>> >      > | >     everywhere including all P and PE routers, that's how LS
>>>>> >     protocols
>>>>> >      > | >     distribute data, we have no other way to do that in LS 
>>>>> > IGPs.
>>>>> >      > | >
>>>>> >      > | >
>>>>> >      > | > Can't you run OSPF over GRE ? For ISIS Henk had proposal not
>>>>> >     so long ago
>>>>> >      > | > to run it over TCP too.
>>>>> >      > | >
>>>>> >      > | > https://datatracker.ietf.org/doc/html/draft-hsmit-lsr-isis-flooding-over-tcp-00
>>>>> >      > |
>>>>> >      > | you can run anything over GRE, including IGPs, and you don't
>>>>> >     need TCP
>>>>> >      > | transport for that. I don't see the relevance here. Are you
>>>>> >     suggesting to
>>>>> >      > | create GRE tunnels to all PEs that need the pulses? Nah, that
>>>>> >     would be an
>>>>> >      > | ugly requirement.
>>>>> >      > |
>>>>> >      > | thanks,
>>>>> >      > | Peter
>>>>> >      > |
>>>>> >      > |
>>>>> >      > | >
>>>>> >      > | > Seems like a perfect fit !
>>>>> >      > | >
>>>>> >      > | > Thx,
>>>>> >      > | > R.
>>>>> >      > |
>>>>> > 
>>>>> >     _______________________________________________
>>>>> >     Lsr mailing list
>>>>> >     [email protected] <mailto:[email protected]>
>>>>> >     https://www.ietf.org/mailman/listinfo/lsr
>>>>> > 
>>>>> 
