Re: [Lsr] BGP vs PUA/PULSE

Les Ginsberg (ginsberg) Tue, 30 Nov 2021 14:16:19 -0800

Hannes -

Inline.


> -----Original Message-----
> From: Hannes Gredler <[email protected]>
> Sent: Tuesday, November 30, 2021 11:15 AM
> To: Les Ginsberg (ginsberg) <[email protected]>
> Cc: Aijun Wang <[email protected]>; 'Robert Raszuk'
> <[email protected]>; 'lsr' <[email protected]>; 'Tony Li' <[email protected]>;
> 'Shraddha Hegde' <[email protected]>; Peter Psenak (ppsenak)
> <[email protected]>
> Subject: Re: [Lsr] BGP vs PUA/PULSE
> 
> hi les,
> 
> please see inline.
> 
> On Mon, Nov 29, 2021 at 10:39:17PM +0000, Les Ginsberg (ginsberg) wrote:
> |    Hannes -
> |
> |
> |
> |    Thanx for bringing a new voice into the discussion.
> |
> |    Please see inline.
> |
> |
> |
> |    > -----Original Message-----
> |
> |    > From: Hannes Gredler <[email protected]>
> |
> |    > Sent: Monday, November 29, 2021 1:27 AM
> |
> |    > To: Aijun Wang <[email protected]>
> |
> |    > Cc: 'Robert Raszuk' <[email protected]>; 'lsr' <[email protected]>; Les
> |    Ginsberg
> |
> |    > (ginsberg) <[email protected]>; 'Tony Li' <[email protected]>; 'Shraddha
> |
> |    > Hegde' <[email protected]>; Peter Psenak (ppsenak)
> |
> |    > <[email protected]>
> |
> |    > Subject: Re: [Lsr] BGP vs PUA/PULSE
> |
> |    >
> |
> |    > On Mon, Nov 29, 2021 at 09:42:57AM +0800, Aijun Wang wrote:
> |
> |    >
> |
> |    > [ ... ]
> |
> |    >
> |
> |    > |    Option 3: The “DOWN” detection on ABR is same as PUA/PULSE, the
> |
> |    > different
> |
> |    > |    is how to propagate such “DOWN” information. Considering we have
> |
> |    > observed
> |
> |    > |    that all P/PE router in other areas may be interested such
> |    information,
> |
> |    > |    your proposal will require every P/PE router run BGP-LS, which is
> |    not the
> |
> |    > |    aimed deploy scenarios for BGP-LS.
> |
> |    >
> |
> |    > HG> BGP-LS has been conceived to solve the very problem of providing
> |
> |    > visibility of other
> |
> |    > area's link state. I fail to see what is out of scope here.
> |
> |    >
> |
> |    [LES:] BGP-LS only advertises what the IGPs themselves advertise.
> 
> HG> That is an implementation choice; the protocol allows multiple sources
> of information.
> https://www.iana.org/assignments/bgp-ls-parameters/bgp-ls-parameters.xhtml#protocol-ids
> And even standalone implementations are fine as the LSVR WG suggests.
> 
[LES:] Yes - I thought you might bring that up. 😊
If you want to use BGP-LS to advertise things on its own, that's fine and - as 
you point out - even supported by RFC 7752.
But it isn’t part of the LSR WG discussion where we are discussing possible IGP 
solutions.


> |    In this case, both IGP proposals involve ephemeral advertisements - so
> |    even if we were to define BGP-LS support for these new advertisements -
> |    they wouldn't persist long enough to be reflected in BGP-LS.
> 
> HG> you could model the liveliness tracker as a "source protocol" and
> advertise the state of your endpoint.
> 
> |    So, I really don’t know why we are discussing BGP-LS in the context of
> |    this thread.
> 
> HG> because some people think it's a bad idea to put this corner-case app
> in the very core of our networks.
> 
[LES:] Again, if you want to propose BGP-LS as a possible solution, feel free 
to do so. I believe IDR would be the right WG for that discussion.

> |    (This seems to be one example of what Acee correctly identified as this
> |    discussion going "off track".)
> 
> HG> please do not derail the discussion. It is well within the mandate of LSR
> to discuss solution space and have a healthy discussion
> about the scaling aspects of a proposal. If the WG has concerns on the scaling
> properties and brings up alternatives to make a given
> use-case work that is IMO fine and we may just use some airtime for that.
> After all it's link-state routing, right ?
> 
> |
> |
> |
> |    > |    Then, if IGP has such capabilities, why bother BGP? What is the
> |    benefit?
> |
> |    >
> |
> |    > HG> simply put: separation of concerns. Agreed consensus is to mostly
> |    use
> |
> |    > the
> |
> |    > IGP for topology discovery and put the bulk of carrying reachability
> |
> |    > information
> |
> |    > into BGP which gives us:
> |
> |    >
> |
> |    [LES:] I am not convinced either side can claim "consensus" in this
> |    discussion. That is a work in progress. 😊
> 
> HG> by consensus i meant that the BGP/IGP split is common practise of
> building large scale networks.
> 
> |    However, when you say IGPs are (exclusively?) for topology discovery - it
> |    seems to suggest that IGP shouldn’t be advertising prefix reachability at
> |    all. Hopefully, that is not what you intend.
> 
> HG> did not say that. of course we need minimal IP reachability for
> bootstrapping the iBGP transport mesh.
>     but we certainly do not use the IGP for carrying bulk routes at (Internet)
> scale.

[LES:] No one is proposing that.
WE are proposing to signal loss of reachability to prefixes that are already 
being learned by the IGPs.

> 
> |    One of the points that still baffles me is the assertion of an
> |    architectural violation in the IGP proposals.
> |
> 
> HG> did not say that.
> 
[LES:] That was in response to earlier comments from other folks. I used your 
post as an opportunity to discuss that further.
Apologies for unintentionally implying that you had said this.

> |
> |    It is OK for IGPs to advertise all prefixes covered by a summary (i.e.,
> |    do not summarize).
> |
> |    It is OK for IGPs to advertise multiple summaries (e.g., multiple /24s
> |    instead of a single /16).
> |
> |    It is even OK for IGPs to advertise some prefixes covered by a summary
> |    along with the summary (don’t know if any implementations do this - but
> |    they could).
> |
> |    None of this is an "architectural violation".
> |
> 
> HG> Again - Don't think it's an architectural violation - I just think in its
> current state it has a lot of havoc potential under load.
> 
> |    But advertising a summary and signaling the loss of reachability to a
> |    specific prefix covered by the summary is seen by some as an
> |    architectural violation.
> |
> |    Sorry, I still don't understand this argument.
> |
> |
> |
> |    You can not like the approach. You can be concerned about scaling
> |    properties (more on that below). You can question the effectiveness of
> |    ephemeral advertisements.
> |
> |    These kinds of objections/concerns I can easily understand - even if we
> |    don’t agree on their significance.
> |
> |    But claiming that "IGPs are not supposed to do this"??
> |
> |    Not grokking this.
> |
> |
> |
> |    We have not added any new information to the IGP itself. We are only
> |    suggesting a new form of advertisement to signal some information
> |    already known to the IGP, but which is currently not advertised (in some
> |    deployments) by the configuration of summaries.
> |
> |
> |
> |
> |
> |    > 1) flow-control capabilities (=by virtue of TCP) and
> |
> |    > 2) furthermore operators can scale and isolate the distribution vehicle
> |    for a
> |
> |    > given AFI/SAFI service
> |
> |    >    using a dedicated RR infrastructure which does not mess with your
> |    bread
> |
> |    > and butter service
> |
> |    >    infra.
> |
> |    >
> |
> |    > IMO it is not a good idea to put (negative) reachability information
> |    back into
> |
> |    > the IGP as you
> |
> |    > would lose this "separation of concerns" aspect and potentially
> |    de-stabilize
> |
> |    > your topology discovery
> |
> |    > tool and hence *all* your bread-and-butter services.
> |
> |    >
> |
> |    [LES:] The questions of scale (as I have previously commented) are very
> |    legitimate - and more has to be specified before an IGP solution would be
> |    considered ready for deployment. But there are tools easily applicable to
> |    address this (rate limiting, embedded summarization, perhaps others).
> |
> |    The more significant point is to focus on the goal - which in this usage
> |    is improved convergence time.
> |
> |    When the network is largely stable, convergence improvements can be
> |    achieved w/o risk.
> 
> 
> HG> Ok. To me scaling under load is *the* issue.
> The question is what happens if the network gets unstable and your vanilla LSPs
> have to compete for I/O and CPU resources with your "negative updates".
> Can you see situations where this resource contention has destabilizing
> potential?
> 
[LES:] We have acknowledged the potential for destabilization and are committed 
to addressing it.
We have also highlighted that the problem to be solved isn’t how to converge 
faster when a network melts down in a significant way. It is how to converge 
faster when there are modest topology changes in an otherwise stable network.
Some folks think if you can't deal with the meltdown that the solution isn’t 
worth doing - but I think this overlooks when improved convergence can actually 
be achieved and when it can’t.

But "doing no harm" - absolutely agree with that goal.

   Les


> |    When widespread failures occur, real time signaling of any type is
> |    unlikely to provide improved convergence - which is why the IGPs today
> |    shift the focus from convergence to stability by slowing down the rate of
> |    updates sent and SPFs performed. This is STILL true even in the fast
> |    convergence/FRR era.
> |
> |    I see no reason why the same tools should not be used in this case.
> 
> HG> well when the fun starts - and 1000s of LSPs fly like bullets, good luck
> pacing that ...
