Re: [Lsr] BGP vs PUA/PULSE

Tony Przygienda Wed, 01 Dec 2021 06:32:29 -0800

Having thought this through a bit more, I think it's a significantly worse
idea than I originally thought. You are requiring a node to save a possibly
very large amount of state across reboots, in transitive fashion, for this
to even have a chance of working half-way reliably (assuming that
signalling "down" means "it went down and stays down", which surely has to
include "positive pulses" as well, BTW, AFAIS?).

Imagine the following scenario:

L1 router A summarizes 10.1/16 and realizes that a 10.1.1.1 behind it went
down (the reason may not be in ISIS, e.g. a redistributed 10.1.1.1 was one
of the triggers for 10.1/16 and is no longer redistributed). It pulses
negative and has to retain:

* that A pulsed a negative under 10.1/16 for 10.1.1.1 (and what the
"origin" of 10.1.1.1 was whose loss triggered the pulse)
* that A pulsed it in some ID @ some seqnr#, and possibly more stuff
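
A rough sketch of what one such retained record on A might look like, in
Python. Every name here (NegativePulseRecord, its fields, the example ID
and sequence number) is hypothetical and not taken from the draft; it only
illustrates how much has to survive a reboot per pulsed negative:

```python
from dataclasses import dataclass
from ipaddress import IPv4Network


@dataclass(frozen=True)
class NegativePulseRecord:
    """Hypothetical record A keeps for each negative it has pulsed."""
    summary: IPv4Network   # the summary the negative sits under
    prefix: IPv4Network    # the component that went away
    origin: str            # what originally contributed the prefix
    pulse_id: str          # ID the negative was pulsed in (made-up format)
    seqnr: int             # sequence number it was pulsed at

    def __post_init__(self):
        # A negative only makes sense for something the summary covers.
        if not self.prefix.subnet_of(self.summary):
            raise ValueError(f"{self.prefix} is not under {self.summary}")


record = NegativePulseRecord(
    summary=IPv4Network("10.1.0.0/16"),
    prefix=IPv4Network("10.1.1.1/32"),
    origin="redistributed",
    pulse_id="A.00-01",
    seqnr=7,
)
```

One such record is needed per lost component, which is where the "possibly
very large amount of state" comes from.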

At the border it hits an L1/L2 router B summarizing 10/8, which now
re-pulses into L2 (or leaks it up; it's the same thing, you have to keep
state since the LSP will "expire" within a short period):

* keep the seqnr# of the "re"-pulsed negative and which pulse from whom
triggered it (there may be multiple folks, etc.)
* keep the ID of the "re"-pulsed negative

I don't think you get away from doing that, since A is not visible in L2,
and if there is an A' with 10.1.1.1/32 in the same L1, B MUST suppress A's
pulse; otherwise the other end of the network cannot tell whether "all" or
just "some" of 10.1.1.1 in the area went away.

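The suppression rule at B can be sketched as below. This is only an
illustration under stated assumptions: the function name should_repulse
and the l1_originators bookkeeping are hypothetical and not part of the
draft.

```python
from ipaddress import IPv4Network


def should_repulse(prefix, l1_originators, pulsing_router):
    """Return True if B may re-pulse the negative for `prefix` into L2.

    l1_originators maps each prefix to the set of L1 routers that B still
    sees as reaching it (hypothetical bookkeeping, not from the draft).
    """
    remaining = l1_originators.get(prefix, set()) - {pulsing_router}
    # If anyone else still reaches the prefix, only "some" of it went
    # away, so A's pulse has to be suppressed at the border.
    return not remaining


host = IPv4Network("10.1.1.1/32")
only_a = should_repulse(host, {host: {"A"}}, "A")          # A was the sole origin
with_a_prime = should_repulse(host, {host: {"A", "A'"}}, "A")  # A' still has it
```

Note that B can only answer this by tracking every originator per prefix,
which is exactly the retained state discussed above.
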
So in a sense the state in B will be "stuck" if A reboots (unless you start
to generate lots of rules about re-computing reachability to A and, based
on that, removing that state to make sure the L2 "re"-pulse is withdrawn.
Or should it be? A link failure here can quickly cause a good amount of
"positive pulse" storms for all the negative pulse state kept @ the border.
Is A going away a kind of confirmation that the negative is still negative,
or does it invalidate A's pulse? If it doesn't and A goes down, is the
negative stuck forever, and will B re-pulse it after a reboot? Is there a
transitive algebra of some kind here?)

The same goes for a C @ another L2/L1 border trying to "leak it down", of
course.

Now comes a more interesting part.

A is shut down and its configuration is changed to no longer include
10.1/16 (but possibly something subsuming/supersuming). On reboot one has
to build a whole logic computing all the retained negatives so they can be
pulsed "positive" to remove the state from B (unless B pulses positive the
moment A goes away, which kind of seems to defeat the purpose of the whole
negative and is the opposite of what one does if A advertises 10.1.1.1/32
directly, where A going away kind of should pulse it negative from B).

And I won't even ask what happens if A reboots with a different router ID.
Well, maybe I will: do you start to pulse with the old router ID, since
otherwise B will not remove the retained state? That seems like a recipe
for spectacular meltdowns a la proxy purging, which had to be removed.
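
A toy model of the router-ID problem, assuming B keys its retained
negatives by the originating router ID. That keying is an assumption made
for illustration (the thread does not say how the draft would do it), and
BorderState and its methods are hypothetical names:

```python
class BorderState:
    """Hypothetical store of re-pulsed negatives at B, keyed by originator."""

    def __init__(self):
        # (router_id, prefix) -> seqnr of the "re"-pulsed negative
        self._negatives = {}

    def on_negative(self, router_id, prefix, seqnr):
        self._negatives[(router_id, prefix)] = seqnr

    def on_positive(self, router_id, prefix):
        # Only the router ID that pulsed the negative can clear it.
        return self._negatives.pop((router_id, prefix), None) is not None

    def stuck(self):
        return sorted(self._negatives)


b = BorderState()
b.on_negative("A-old", "10.1.1.1/32", seqnr=7)
# A reboots with a new router ID and pulses positive under that ID.
cleared = b.on_positive("A-new", "10.1.1.1/32")
leftover = b.stuck()
```

Under this assumption the positive pulse from the new ID clears nothing,
and the entry for the old ID stays behind at B.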

Just a bunch of thoughts, at best to improve the draft, but most likely
showing, IMO, the futility of trying to construct a clean
algebra/architecture for this stuff under summarization on a global
leak-up/leak-down scope.

Or maybe I missed something in the draft or between the lines in the whole
thing ... Do we assume the negative just quickly tears down the BGP
session, then loses any relevance, and we rely on BGP to retry
automatically after the reset or something? But then why do we even care
about retaining the LSP IDs & SeqNr#, I would ask?

-- tony

On Tue, Nov 30, 2021 at 11:19 PM Les Ginsberg (ginsberg) <ginsberg=
[email protected]> wrote:

> Hannes -
>
> Please see
> https://datatracker.ietf.org/doc/html/draft-ppsenak-lsr-igp-event-notification-00#section-4.1
>
> The new Pulse LSPs don't have remaining lifetime - quite intentionally.
> They are only retained long enough to support flooding.
>
> But, you remind me that we need to specify how the checksum is calculated.
> Will do that in the next revision.
>
> Thanx.
>
>     Les
>
> > -----Original Message-----
> > From: Hannes Gredler <[email protected]>
> > Sent: Tuesday, November 30, 2021 11:22 AM
> > To: Peter Psenak (ppsenak) <[email protected]>
> > Cc: Robert Raszuk <[email protected]>; Les Ginsberg (ginsberg)
> > <[email protected]>; Aijun Wang <[email protected]>; lsr
> > <[email protected]>; Tony Li <[email protected]>; Shraddha Hegde
> > <[email protected]>
> > Subject: Re: [Lsr] BGP vs PUA/PULSE
> >
> > hi peter,
> >
> > Just curious: Do you have an idea how to make short-lived LSPs compatible
> > with the problem stated in
> > https://datatracker.ietf.org/doc/html/rfc7987
> >
> > Would like to hear your thoughts on that.
> >
> > thanks,
> >
> > /hannes
> >
> > On Tue, Nov 30, 2021 at 01:15:04PM +0100, Peter Psenak wrote:
> > | Hi Robert,
> > |
> > | On 30/11/2021 12:40, Robert Raszuk wrote:
> > | > Hey Peter,
> > | >
> > | >      > #1 - I am not ok with the ephemeral nature of the
> > | >      > advertisements. (I proposed an alternative).
> > | >
> > | >     LSPs have their age today. One can generate LSP with the
> > | >     lifetime of 1 min. Protocol already allows that.
> > | >
> > | >
> > | > That's a pretty clever comparison indeed. I had a feeling it will
> > | > come up here and here you go :)
> > | >
> > | > But I am afraid this is not comparing apple to apples.
> > | >
> > | > In LSPs or LSA flooding you have a bunch of mechanisms to make sure
> > | > the information stays fresh and does not time out. And the default
> > | > refresh in ISIS if I recall was something like 15 minutes ?
> > |
> > | yes, default refresh is 900 for the default lifetime of 1200 sec. Most
> > | people change both to much larger values.
> > |
> > | If I send the LSP with the lifetime of 1 min, there will never be any
> > | refresh of it. It will last 1 min and then will be purged and removed
> > | from the database. The only difference with the Pulse LSP is that it
> > | is not purged to avoid additional flooding.
> > |
> > |
> > | >
> > | >     Today in all MPLS networks host routes from all areas are
> > | >     "spread" everywhere including all P and PE routers, that's how
> > | >     LS protocols distribute data, we have no other way to do that
> > | >     in LS IGPs.
> > | >
> > | >
> > | > Can't you run OSPF over GRE ? For ISIS Henk had proposal not so long
> > | > ago to run it over TCP too.
> > | >
> > | > https://datatracker.ietf.org/doc/html/draft-hsmit-lsr-isis-flooding-over-tcp-00
> > |
> > | you can run anything over GRE, including IGPs, and you don't need TCP
> > | transport for that. I don't see the relevance here. Are you suggesting
> > | to create GRE tunnels to all PEs that need the pulses? Nah, that would
> > | be an ugly requirement.
> > |
> > | thanks,
> > | Peter
> > |
> > |
> > | >
> > | > Seems like a perfect fit !
> > | >
> > | > Thx,
> > | > R.
> > |
>
> _______________________________________________
> Lsr mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/lsr
>

