Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-21 Thread Aijun Wang
Hi, Gunter:

Thanks for your careful consideration of the partition scenarios, although it is 
a rare event in the network.
Regarding your statements, one point I want to correct is that in the 
mentioned scenario, R2 is also detached from R4 in area 2. This is the reason 
that only R2 sends out the PUA message; otherwise, if R2 could reach Pt2 via R4, it 
shouldn't send out the PUA for Pt2.
So, in this scenario, R4 and R2 can only exchange LSAs in area 0. When R2 
receives the specific prefix for Pt2 from R4, it just installs this 
specific route into its routing table as usual. The PUA message that 
it advertised has already stopped by then: as noted in the draft, the PUA message 
only lasts a short time, to assist service convergence.
Regarding your churn concerns, as discussed before on the list, not every 
link failure will cause a PUA advertisement; this can be configured on the 
ABR. Currently, we are mainly interested in node reachability (that is, the 
loopback addresses of the routers).

Aijun Wang
China Telecom

> On Jun 21, 2022, at 20:40, Van De Velde, Gunter (Nokia - BE/Antwerp) 
>  wrote:
> 
> 
> wrt partitioned area’s and UPA’s. The wang PUA draft has an interesting 
> proposal for assuring that even with UPAs and partitioned area’s 
> non-conflicting information exist.
> The solution proposal does come at cost of increased ISIS churn. This could 
> be an acceptable cost, especially considering that UPAs will only appear by 
> exception and even more rare in combination with area partitioning.
>  
> Lets consider the following (based upon section “3.2 from 
> draft-wang-lsr-prefix-unreachable”):
>  
> For ISIS,  R2 L1/L2 would advertise the PUA for Pt2.
> R4 L1/L2 router receives the L2 PUA route for Pt2 and because it is also 
> summarizing Pt2 from L1, it now decides to advertise a specific route to Pt2.
> R2 now sees that R4 advertises a specific route for Pt2 and programs Pt2 next 
> hop to R4. This will stop the PUA advertisement on R2 immediately.
>  
> Although this enhancement as described is in the 
> draft-wang-lsr-prefix-unreachable draft it should work with 
> draft-ppsenak-lsr-igp-ureach-prefix. A side effect is more churn.
> This setup where R2 and R4 only see each other in L2 causes a lot of churn as 
> a PUA needs to be advertised for Pt2 and all downstream routes of T2.
> On top , R4 now advertises specific route for each of the PUA’s and a bit 
> later R2 floods the same LSPs again but without the PUA’s.
>  
>  
> ***
> [Topology figure: whitespace was collapsed in the archive and the drawing
> is not recoverable. Surviving labels: nodes S1-S4, R0-R4, T1-T4; prefixes
> Ps1, Ps4, Pt2, Pt4; a link marked L2; regions marked ISIS L2 and ISIS L1.]
>  
> Inter-Area Prefix Unreachable Announcement Scenario
>  
>  
> 3.2.  Inter-Area Links Failure Scenario
>  
>In a link failure scenario, if the link between T1/T2 and T1/T3 are
>down, R2 will not be able to reach node T2.  But as R2 and R4 do the
>summary announcement, and the summary address covers the bgp next hop
>prefix of Pt2, other nodes in area 0 area 1 will still send traffic
>to T2 bgp next hop prefix Pt2 via the border router R2, thus black
>hole sink routing the traffic.
>  
>In such a situation, the border router R2 should notify other routers
>that it can't reach the prefix Pt2, and lets the other ABRs(R4) that
>can reach prefix Pt2 advertise one specific route to Pt2, then the
>internal routers will select R4 as the bypass router to reach prefix
>Pt2.
> ***
>  
> G/
>  
>  
> -Original Message-
> From: Peter Psenak  
> Sent: Thursday, June 16, 2022 12:04 PM
> To: Van De Velde, Gunter (Nokia - BE/Antwerp) 
> ; Gyan Mishra ; Voyer, 
> Daniel 
> Cc: draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
> draft-wang-lsr-prefix-unreachable-annoucement 
> ; lsr@ietf.org
> Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-21 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
wrt partitioned areas and UPAs. The wang PUA draft has an interesting 
proposal for assuring that even with UPAs and partitioned areas 
non-conflicting information exists.
The proposed solution does come at the cost of increased ISIS churn. This could be 
an acceptable cost, especially considering that UPAs will only appear by 
exception and even more rarely in combination with area partitioning.

Let's consider the following (based upon section 3.2 of 
draft-wang-lsr-prefix-unreachable):

For ISIS, the R2 L1/L2 router would advertise the PUA for Pt2.
The R4 L1/L2 router receives the L2 PUA route for Pt2 and, because it is also 
summarizing Pt2 from L1, it now decides to advertise a specific route to Pt2.
R2 now sees that R4 advertises a specific route for Pt2 and programs the Pt2 next 
hop to R4. This will stop the PUA advertisement on R2 immediately.

Although this enhancement is described in the 
draft-wang-lsr-prefix-unreachable draft, it should also work with 
draft-ppsenak-lsr-igp-ureach-prefix. A side effect is more churn.
This setup, where R2 and R4 only see each other in L2, causes a lot of churn, as a 
PUA needs to be advertised for Pt2 and all downstream routes of T2.
On top of that, R4 now advertises a specific route for each of the PUAs, and a bit 
later R2 floods the same LSPs again but without the PUAs.
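
To make the churn argument concrete, here is a minimal sketch of the three-step exchange just described. It is only an illustration: the function name, the event labels, and the prefix names are assumptions, and it counts one event per prefix per step, whereas real LSP counts depend on how prefixes are packed into LSPs.

```python
def partition_churn_events(prefixes):
    """Return the ordered advertisements for the R2/R4 exchange.

    Hypothetical model: one event per affected prefix per step.
    """
    events = []
    # Step 1: R2 loses reachability and announces a PUA per affected prefix.
    events += [("R2", "PUA", p) for p in prefixes]
    # Step 2: R4, which still reaches the prefixes, advertises specific routes.
    events += [("R4", "specific-route", p) for p in prefixes]
    # Step 3: R2 sees the specifics and floods again without the PUAs.
    events += [("R2", "PUA-withdrawn", p) for p in prefixes]
    return events


if __name__ == "__main__":
    affected = ["Pt2", "T2-downstream-1", "T2-downstream-2"]  # illustrative names
    for origin, kind, prefix in partition_churn_events(affected):
        print(f"{origin}: {kind} {prefix}")
```

Under this simplified model the number of advertisements grows as three times the number of affected prefixes.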


***
[Topology figure: whitespace was collapsed in the archive and the drawing
is not recoverable. Surviving labels: nodes S1-S4, R0-R4, T1-T4; prefixes
Ps1, Ps4, Pt2, Pt4; a link marked L2; regions marked ISIS L2 and ISIS L1.]

Inter-Area Prefix Unreachable Announcement Scenario


3.2.  Inter-Area Links Failure Scenario

   In a link failure scenario, if the link between T1/T2 and T1/T3 are
   down, R2 will not be able to reach node T2.  But as R2 and R4 do the
   summary announcement, and the summary address covers the bgp next hop
   prefix of Pt2, other nodes in area 0 area 1 will still send traffic
   to T2 bgp next hop prefix Pt2 via the border router R2, thus black
   hole sink routing the traffic.

   In such a situation, the border router R2 should notify other routers
   that it can't reach the prefix Pt2, and lets the other ABRs(R4) that
   can reach prefix Pt2 advertise one specific route to Pt2, then the
   internal routers will select R4 as the bypass router to reach prefix
   Pt2.
***
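
The bypass behaviour in the excerpt relies on longest-prefix match: once R4 advertises a specific route for Pt2, it wins over the summary regardless of which ABR originated the summary. A minimal sketch, assuming made-up IPv4 addresses for the summary and for Pt2 (none of these values come from the draft):

```python
import ipaddress


def best_route(destination, routes):
    """Pick the next hop by longest-prefix match.

    routes is a list of (prefix, next_hop) pairs; returns None if nothing matches.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, next_hop))
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0])[1]


if __name__ == "__main__":
    summary_only = [("10.2.0.0/16", "R2")]                  # hypothetical summary
    with_specific = summary_only + [("10.2.0.2/32", "R4")]  # hypothetical Pt2
    print(best_route("10.2.0.2", summary_only))
    print(best_route("10.2.0.2", with_specific))
```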

G/


-Original Message-
From: Peter Psenak 
Sent: Thursday, June 16, 2022 12:04 PM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) ; 
Gyan Mishra ; Voyer, Daniel 

Cc: draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 
; lsr@ietf.org
Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hi Gunter,

please see inline (##PP):

On 16/06/2022 10:09, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:
> Hi Gyan, Daniel, Peter, All,
>
> Thanks for sharing your insights and I agree mostly with your feedback
>
> I agree and understand that summarization is needed to reduce the size
> of the LSDB. I also agree summarization good design practice,
> especially with IPv6 and SRv6 in mind. There never has been doubt about that.
>
> I am not sure I agree that UAP/UPA is ‘optimal-design’. Maybe it is
> the best we can do, however I have a healthy worry we could be
> suffering tunnel vision and that proposed solution may not be good enough.
>
> We should not be blind and believe that advertising UPA/PUA does not
> come without a cost. The architectural PUA/UPA usage complexity cost
> may not be worth the effort (none of the integration of using a
> PUA/UPA event triggers come for free). Do we really believe that
> PUA/UPA solve all the SID reachability problems for all IGP network
> design and SR use-cases elegantly? Maybe some use-case design
> constraints and assumptions should be documented to clarify
> architecturally where PUA/UPA is most beneficial for operators? Just
> stating “outside scope of the draft” seems unfair to operators
> interested in PUA/UPAs

##PP
we are trying to solve a particular problem of remote PE going down in network 
where summarization is used. I believe that is stated clearly in the UPA draft.

>
> Let me give tw

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-20 Thread Anup MalenaaDu
Thanks Aijun.

So, in places where BFD can be reasonably deployed, would PUA really help?

- Anup

On Mon, Jun 20, 2022 at 4:06 PM Aijun Wang 
wrote:

> Hi, Anup:
>
> The advantage of PUA over BFD is that the operator needs not deploy o(n^2)
> BFD sessions for the services that rely on the IGP reachablity.
> Such comparisons have been discussed on the list.
>
> Aijun Wang
> China Telecom
>
> On Jun 18, 2022, at 12:55, Anup MalenaaDu  wrote:
>
> 
> Hi,
>
> BGP uses BFD to track the remote PEs.
> So, how does PUA really help?
>
> To be precise,
> 1. what are the advantages of having PUAs for IGPs
> 2. what are the advantages for services like BGP, Tunnels, LSPs etc going
> over IGPs
>
> Thanks,
> Anup
>
> On Thu, Jun 16, 2022 at 7:41 PM Voyer, Daniel  40bell...@dmarc.ietf.org> wrote:
>
>> Hi Gunter, see [DV]
>>
>>
>>
>> *From: *"Van De Velde, Gunter (Nokia - BE/Antwerp)" <
>> gunter.van_de_ve...@nokia.com>
>> *Date: *Thursday, June 16, 2022 at 6:38 AM
>> *To: *Robert Raszuk 
>> *Cc: *Gyan Mishra , Dan Voyer <
>> daniel.vo...@bell.ca>, "
>> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" <
>> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org>,
>> draft-wang-lsr-prefix-unreachable-annoucement <
>> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>, "lsr@ietf.org" <
>> lsr@ietf.org>
>> *Subject: *[EXT]RE: [Lsr] Thoughts about PUAs - are we not
>> over-engineering?
>>
>>
>>
>> Hi Robert, Peter, Bruno
>>
>>
>>
>> You wrote:
>>
>> “Aas there is no association between node_id (perhaps loopback) and SIDs
>> (note that egress can use many SIDs) UPA really does not signal anything
>> about SIDs reachability or liveness. “
>>
>>
>>
>> Sure, but UPA signals that a locator is unreachable, would that not
>> result for the SRv6 SID to become unreachable as well?
>>
>>
>>
>> I understood that UPA have increased value add benefit when using with
>> SRv6. If suddenly a locator becomes unreachable, then it I guess the
>> associated 128 bit SIDs become unreachable too, causing an event for
>> something to happen in the transport network to fix the problem.
>>
>>
>>
>> That being said, Peter makes a good point stating that UPA is not solving
>> the problem of partitioning areas, and hence, maybe my use-case is not
>> overly relevant.
>>
>>
>>
>> So progressing, an operator using ABR based summarization then there are
>> few options:
>>
>>1. No summarization at all at ABRs
>>2. Summarize on ABR all prefixes that can be summarized
>>3. Summarize all prefixes that are not associated with PEs and remain
>>advertising individual PE addresses
>>4. Summarize all prefixes and use UPA’s to advertise unreachability
>>of protected prefixes
>>
>>
>>
>> [DV] if “an operator using ABR based summarization” then option 1 is
>> out, right ? Also, option 4 is the point of this draft – but furthermore,
>> if an aggregation device, inside a domain, is also being summarized – as
>> the entire domain get summarized – but this agg device doesn’t have any
>> services, because it’s an aggregation device, “then it’s up to the operator
>> designing the network to implement” a form of policy/filter. So if that agg
>> device reload, due to a maintenance, we don’t care about the unreachability
>> advertisement (adding unnecessary LSP in the LSDB).
>>
>>
>>
>> We all know that option 1 -3 work well and has been working well for long
>> time. Behavior is very well understood
>>
>>
>>
>> With the new option 4, to add value, applications need to get what is
>> being referenced as ‘vendor secret sauce’ … I can already see the fun
>> caused by inconsistent behavior and interop issues due to under
>> specification.
>>
>> [DV] not sure I am following your “secret sauce” point here. Following
>> the RFC5305/RFC5308 should be clear.
>>
>>
>>
>> The question I remain to have is if the UPA provide higher benefit as the
>> tax it introduces. I can see operators suffer due to under specification,
>> causing interop and inconsistent behaviors.
>>
>>
>>
>> I agree with Bruno’s statement “If you believe that all you need is
>> RFC5305/RFC5308 I guess this means that we don't need
>> draft-ppsenak-lsr-igp-ureach-prefix-announce”
>>
>>
>>
>> [DV] well, “draft-ppsenak-lsr-igp-ureach-prefix-an

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-20 Thread Aijun Wang
Hi, Anup:

The advantage of PUA over BFD is that the operator need not deploy O(n^2) BFD 
sessions for the services that rely on IGP reachability. 
Such comparisons have been discussed on the list.
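
To put a rough number on the O(n^2) point, here is a minimal sketch that assumes a full mesh of BFD sessions, one per pair of PEs. The function name and the PE counts are illustrative only.

```python
def full_mesh_bfd_sessions(num_pes: int) -> int:
    """Sessions needed so that every pair of PEs monitors each other."""
    if num_pes < 0:
        raise ValueError("num_pes must be non-negative")
    return num_pes * (num_pes - 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_bfd_sessions(n))
```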

Aijun Wang
China Telecom

> On Jun 18, 2022, at 12:55, Anup MalenaaDu  wrote:
> 
> 
> Hi,
> 
> BGP uses BFD to track the remote PEs. 
> So, how does PUA really help?
> 
> To be precise, 
> 1. what are the advantages of having PUAs for IGPs 
> 2. what are the advantages for services like BGP, Tunnels, LSPs etc going 
> over IGPs
> 
> Thanks,
> Anup
> 
>> On Thu, Jun 16, 2022 at 7:41 PM Voyer, Daniel 
>>  wrote:
>> Hi Gunter, see [DV]
>> 
>>  
>> 
>> From: "Van De Velde, Gunter (Nokia - BE/Antwerp)" 
>> 
>> Date: Thursday, June 16, 2022 at 6:38 AM
>> To: Robert Raszuk 
>> Cc: Gyan Mishra , Dan Voyer , 
>> "draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" 
>> , 
>> draft-wang-lsr-prefix-unreachable-annoucement 
>> , "lsr@ietf.org" 
>> 
>> Subject: [EXT]RE: [Lsr] Thoughts about PUAs - are we not over-engineering?
>> 
>>  
>> 
>> Hi Robert, Peter, Bruno
>> 
>>  
>> 
>> You wrote:
>> 
>> “Aas there is no association between node_id (perhaps loopback) and SIDs 
>> (note that egress can use many SIDs) UPA really does not signal anything 
>> about SIDs reachability or liveness. “
>> 
>>  
>> 
>> Sure, but UPA signals that a locator is unreachable, would that not result 
>> for the SRv6 SID to become unreachable as well?
>> 
>>  
>> 
>> I understood that UPA have increased value add benefit when using with SRv6. 
>> If suddenly a locator becomes unreachable, then it I guess the associated 
>> 128 bit SIDs become unreachable too, causing an event for something to 
>> happen in the transport network to fix the problem.
>> 
>>  
>> 
>> That being said, Peter makes a good point stating that UPA is not solving 
>> the problem of partitioning areas, and hence, maybe my use-case is not 
>> overly relevant.
>> 
>>  
>> 
>> So progressing, an operator using ABR based summarization then there are few 
>> options:
>> 
>> No summarization at all at ABRs
>> Summarize on ABR all prefixes that can be summarized
>> Summarize all prefixes that are not associated with PEs and remain 
>> advertising individual PE addresses
>> Summarize all prefixes and use UPA’s to advertise unreachability of 
>> protected prefixes
>>  
>> 
>> [DV] if “an operator using ABR based summarization” then option 1 is out, 
>> right ? Also, option 4 is the point of this draft – but furthermore, if an 
>> aggregation device, inside a domain, is also being summarized – as the 
>> entire domain get summarized – but this agg device doesn’t have any 
>> services, because it’s an aggregation device, “then it’s up to the operator 
>> designing the network to implement” a form of policy/filter. So if that agg 
>> device reload, due to a maintenance, we don’t care about the unreachability 
>> advertisement (adding unnecessary LSP in the LSDB).
>> 
>>  
>> 
>> We all know that option 1 -3 work well and has been working well for long 
>> time. Behavior is very well understood
>> 
>>  
>> 
>> With the new option 4, to add value, applications need to get what is being 
>> referenced as ‘vendor secret sauce’ … I can already see the fun caused by 
>> inconsistent behavior and interop issues due to under specification.
>> 
>> [DV] not sure I am following your “secret sauce” point here. Following the 
>> RFC5305/RFC5308 should be clear.
>> 
>>  
>> 
>> The question I remain to have is if the UPA provide higher benefit as the 
>> tax it introduces. I can see operators suffer due to under specification, 
>> causing interop and inconsistent behaviors.
>> 
>>  
>> 
>> I agree with Bruno’s statement “If you believe that all you need is 
>> RFC5305/RFC5308 I guess this means that we don't need 
>> draft-ppsenak-lsr-igp-ureach-prefix-announce”
>> 
>>  
>> 
>> [DV] well, “draft-ppsenak-lsr-igp-ureach-prefix-announce”, is describing a 
>> use case/architecture and what you can do w/ RFC5305/RFC5308 – its 
>> “informational” 
>> 
>>  
>> 
>> G/
>> 
>>  
>> 
>>  
>> 
>> From: Robert Raszuk  
>> Sent: Thursday, June 16, 2022 11:54 AM
>> To: Van De Velde, Gunter (Nokia - BE/Antwerp) 
>> Cc: Gyan Mishra ; Voyer, Daniel 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-17 Thread Anup MalenaaDu
Hi,

BGP uses BFD to track the remote PEs.
So, how does PUA really help?

To be precise,
1. what are the advantages of having PUAs for IGPs
2. what are the advantages for services like BGP, Tunnels, LSPs etc going
over IGPs

Thanks,
Anup

On Thu, Jun 16, 2022 at 7:41 PM Voyer, Daniel  wrote:

> Hi Gunter, see [DV]
>
>
>
> *From: *"Van De Velde, Gunter (Nokia - BE/Antwerp)" <
> gunter.van_de_ve...@nokia.com>
> *Date: *Thursday, June 16, 2022 at 6:38 AM
> *To: *Robert Raszuk 
> *Cc: *Gyan Mishra , Dan Voyer ,
> "draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" <
> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org>,
> draft-wang-lsr-prefix-unreachable-annoucement <
> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>, "lsr@ietf.org" <
> lsr@ietf.org>
> *Subject: *[EXT]RE: [Lsr] Thoughts about PUAs - are we not
> over-engineering?
>
>
>
> Hi Robert, Peter, Bruno
>
>
>
> You wrote:
>
> “Aas there is no association between node_id (perhaps loopback) and SIDs
> (note that egress can use many SIDs) UPA really does not signal anything
> about SIDs reachability or liveness. “
>
>
>
> Sure, but UPA signals that a locator is unreachable, would that not result
> for the SRv6 SID to become unreachable as well?
>
>
>
> I understood that UPA have increased value add benefit when using with
> SRv6. If suddenly a locator becomes unreachable, then it I guess the
> associated 128 bit SIDs become unreachable too, causing an event for
> something to happen in the transport network to fix the problem.
>
>
>
> That being said, Peter makes a good point stating that UPA is not solving
> the problem of partitioning areas, and hence, maybe my use-case is not
> overly relevant.
>
>
>
> So progressing, an operator using ABR based summarization then there are
> few options:
>
>1. No summarization at all at ABRs
>2. Summarize on ABR all prefixes that can be summarized
>3. Summarize all prefixes that are not associated with PEs and remain
>advertising individual PE addresses
>4. Summarize all prefixes and use UPA’s to advertise unreachability of
>protected prefixes
>
>
>
> [DV] if “an operator using ABR based summarization” then option 1 is out,
> right ? Also, option 4 is the point of this draft – but furthermore, if an
> aggregation device, inside a domain, is also being summarized – as the
> entire domain get summarized – but this agg device doesn’t have any
> services, because it’s an aggregation device, “then it’s up to the operator
> designing the network to implement” a form of policy/filter. So if that agg
> device reload, due to a maintenance, we don’t care about the unreachability
> advertisement (adding unnecessary LSP in the LSDB).
>
>
>
> We all know that option 1 -3 work well and has been working well for long
> time. Behavior is very well understood
>
>
>
> With the new option 4, to add value, applications need to get what is
> being referenced as ‘vendor secret sauce’ … I can already see the fun
> caused by inconsistent behavior and interop issues due to under
> specification.
>
> [DV] not sure I am following your “secret sauce” point here. Following the 
> RFC5305/RFC5308
> should be clear.
>
>
>
> The question I remain to have is if the UPA provide higher benefit as the
> tax it introduces. I can see operators suffer due to under specification,
> causing interop and inconsistent behaviors.
>
>
>
> I agree with Bruno’s statement “If you believe that all you need is
> RFC5305/RFC5308 I guess this means that we don't need
> draft-ppsenak-lsr-igp-ureach-prefix-announce”
>
>
>
> [DV] well, “draft-ppsenak-lsr-igp-ureach-prefix-announce”, is describing
> a use case/architecture and what you can do w/ RFC5305/RFC5308 – its
> “informational” 
>
>
>
> G/
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Thursday, June 16, 2022 11:54 AM
> *To:* Van De Velde, Gunter (Nokia - BE/Antwerp) <
> gunter.van_de_ve...@nokia.com>
> *Cc:* Gyan Mishra ; Voyer, Daniel  40bell...@dmarc.ietf.org>;
> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org;
> draft-wang-lsr-prefix-unreachable-annoucement <
> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>; lsr@ietf.org
> *Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>
>
>
> Gunter,
>
>
>
> (1) Multiple-ABRs
>
>
>
> I was wondering for example if a ingress router receives a PUA signaling
> that a given locator becomes unreachable, does that actually really signals
> that the SID ‘really’ is unreachable for a router?
>
>
>
> Aas there is no association be

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Voyer, Daniel
Hi Gunter, see [DV]

From: "Van De Velde, Gunter (Nokia - BE/Antwerp)" 

Date: Thursday, June 16, 2022 at 6:38 AM
To: Robert Raszuk 
Cc: Gyan Mishra , Dan Voyer , 
"draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" 
, 
draft-wang-lsr-prefix-unreachable-annoucement 
, "lsr@ietf.org" 

Subject: [EXT]RE: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hi Robert, Peter, Bruno

You wrote:
“As there is no association between node_id (perhaps loopback) and SIDs (note 
that egress can use many SIDs) UPA really does not signal anything about SIDs 
reachability or liveness.”

Sure, but UPA signals that a locator is unreachable; would that not result in 
the SRv6 SID becoming unreachable as well?

I understood that UPA has increased value when used with SRv6. If 
a locator suddenly becomes unreachable, then I guess the associated 128-bit 
SIDs become unreachable too, causing an event for something to happen in the 
transport network to fix the problem.

That being said, Peter makes a good point stating that UPA is not solving the 
problem of partitioned areas, and hence maybe my use-case is not overly 
relevant.

So, progressing: for an operator using ABR-based summarization there are a few 
options:

  1.  No summarization at all at ABRs
  2.  Summarize on ABR all prefixes that can be summarized
  3.  Summarize all prefixes that are not associated with PEs and keep 
advertising individual PE addresses
  4.  Summarize all prefixes and use UPAs to advertise unreachability of 
protected prefixes

[DV] if “an operator using ABR based summarization” then option 1 is out, 
right? Also, option 4 is the point of this draft. But furthermore, if an 
aggregation device inside a domain is also being summarized (as the entire 
domain gets summarized) but this agg device doesn’t have any services, because 
it’s an aggregation device, “then it’s up to the operator designing the network 
to implement” a form of policy/filter. So if that agg device reloads due to 
maintenance, we don’t care about the unreachability advertisement (which would 
only add an unnecessary LSP to the LSDB).

We all know that options 1-3 work well and have been working well for a long time. 
The behavior is very well understood.

With the new option 4, to add value, applications need to get what is being 
referred to as ‘vendor secret sauce’ … I can already see the fun caused by 
inconsistent behavior and interop issues due to under-specification.
[DV] not sure I am following your “secret sauce” point here. Following 
RFC5305/RFC5308 should be clear.

The question that remains for me is whether the UPA provides more benefit than the 
tax it introduces. I can see operators suffering due to under-specification, causing 
interop issues and inconsistent behaviors.


I agree with Bruno’s statement “If you believe that all you need is 
RFC5305/RFC5308 I guess this means that we don't need 
draft-ppsenak-lsr-igp-ureach-prefix-announce”


[DV] well, “draft-ppsenak-lsr-igp-ureach-prefix-announce” describes a use 
case/architecture and what you can do w/ RFC5305/RFC5308; it's “informational” 

G/


From: Robert Raszuk 
Sent: Thursday, June 16, 2022 11:54 AM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) 
Cc: Gyan Mishra ; Voyer, Daniel 
; 
draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 
; lsr@ietf.org
Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Gunter,

(1) Multiple-ABRs

I was wondering, for example: if an ingress router receives a PUA signaling that a 
given locator has become unreachable, does that actually signal that the 
SID ‘really’ is unreachable for a router?

As there is no association between node_id (perhaps loopback) and SIDs (note 
that egress can use many SIDs) UPA really does not signal anything about SIDs 
reachability or liveness.

For example (simple design to illustrate the corner-case):

ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
     |                                                     |
     |                                                     |
     +---area#1---ABR#3---area---ABR#4---area#3------------+

What if ABR#4 were to lose connectivity to egressPE#2 and ABR#2 does not?
In that case ABR#4 will originate a UPA/PUA and ABR#2 will not originate a 
PUA/UPA.
How is ingressPE#1 supposed to handle this situation? The only thing 
ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may not 
have changed at all and the egress may remain perfectly reachable.

Valid case. But PE1 should only switch when an alternative backup path exists. If 
there is a single path it should do nothing on receiving a UPA. We 
have discussed that case before and as you know the formal answer was "out of 
scope" or "vendor's secret sauce" :).

The justification here is that switching to a healthy backup is better than 
continuing to use a perhaps semi-sick path.

Best,
R.


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Robert Raszuk
UPAs may not even contain the advertised locator in SIDs. It is not
clearly spelled out what exactly ABRs should advertise.

I presume:
a) something which was flooded in the local domain and was not being leaked
AND
b) something which stopped to be flooded in a local domain
AND
c) there is local policy specifying such range
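
The three presumed conditions can be sketched as a single predicate. This is only a reading of the presumption above, not behaviour specified by either draft; the function name, the set-based bookkeeping, and the example prefixes are all assumptions.

```python
import ipaddress


def should_originate_upa(prefix, previously_flooded, currently_flooded,
                         leaked, policy_ranges):
    """Hypothetical ABR check combining conditions a), b) and c)."""
    net = ipaddress.ip_network(prefix)
    # c) a local policy range covers the prefix (same address family only).
    in_policy = False
    for policy in policy_ranges:
        policy_net = ipaddress.ip_network(policy)
        if policy_net.version == net.version and net.subnet_of(policy_net):
            in_policy = True
            break
    # a) it was flooded in the local domain and was not being leaked.
    was_local_only = prefix in previously_flooded and prefix not in leaked
    # b) it has stopped being flooded in the local domain.
    stopped = prefix not in currently_flooded
    return was_local_only and stopped and in_policy


if __name__ == "__main__":
    loopback = "2001:db8:2::2/128"  # illustrative prefix
    print(should_originate_upa(loopback, {loopback}, set(), set(),
                               ["2001:db8:2::/48"]))
```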

> I agree with Bruno’s statement “If you believe that all you need is
> RFC5305/RFC5308 I guess this means that we don't need
> draft-ppsenak-lsr-igp-ureach-prefix-announce”
>

Well at this time this is an Informational draft.

But based on Bruno's comments, I am worried that a node receiving a prefix
with MAX_PATH_METRIC, one that was never advertised before as a valid and
reachable prefix and never made it into the LSDB or RIB/FIB, will face a
new unknown: implementations do not state how to handle such a prefix, which
may result in various interesting undefined behaviour(s).
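
The concern can be illustrated with a small receiver-side sketch. RFC 5305 defines MAX_PATH_METRIC as 0xFE000000 and says a prefix advertised with a larger metric must not be considered during the normal SPF computation; everything else here (the function name, the three labels, the idea of tracking "known" prefixes) is an assumption made for illustration.

```python
MAX_PATH_METRIC = 0xFE000000  # value defined in RFC 5305


def classify_prefix(prefix, metric, known_prefixes):
    """Label a received prefix advertisement (hypothetical labels)."""
    if metric <= MAX_PATH_METRIC:
        return "reachable"
    if prefix in known_prefixes:
        # Previously seen as reachable; now signalled as unreachable.
        return "unreachable-known"
    # Never seen before as reachable: the case with no defined handling.
    return "unreachable-unknown"


if __name__ == "__main__":
    known = {"2001:db8:2::2/128"}  # illustrative prefix
    print(classify_prefix("2001:db8:2::2/128", 0xFE000001, known))
    print(classify_prefix("2001:db8:9::9/128", 0xFE000001, known))
```

The third label is where implementations could plausibly diverge.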

Thx,
R.
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
Hi Robert, Peter, Bruno

You wrote:
“As there is no association between node_id (perhaps loopback) and SIDs (note 
that egress can use many SIDs) UPA really does not signal anything about SIDs 
reachability or liveness.”

Sure, but UPA signals that a locator is unreachable; would that not result in 
the SRv6 SID becoming unreachable as well?

I understood that UPA has increased value when used with SRv6. If 
a locator suddenly becomes unreachable, then I guess the associated 128-bit 
SIDs become unreachable too, causing an event for something to happen in the 
transport network to fix the problem.

That being said, Peter makes a good point stating that UPA is not solving the 
problem of partitioned areas, and hence maybe my use-case is not overly 
relevant.

So, progressing: for an operator using ABR-based summarization there are a few 
options:

  1.  No summarization at all at ABRs
  2.  Summarize on ABR all prefixes that can be summarized
  3.  Summarize all prefixes that are not associated with PEs and keep 
advertising individual PE addresses
  4.  Summarize all prefixes and use UPAs to advertise unreachability of 
protected prefixes
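
The four options can be compared with a small sketch of what an ABR would advertise into the other area under each one. This is a simplification under stated assumptions: the function name, the string prefixes, and the "UPA" label are illustrative, and the summary is taken as given rather than computed.

```python
def abr_advertisements(option, area_prefixes, pe_loopbacks, summary,
                       unreachable=frozenset()):
    """Return what a hypothetical ABR advertises under options 1-4."""
    reachable = set(area_prefixes) - set(unreachable)
    if option == 1:
        # No summarization: every reachable prefix is leaked individually.
        return sorted(reachable)
    if option == 2:
        # Summary only: the loss of a single prefix is invisible outside.
        return [summary]
    if option == 3:
        # Summary plus individual PE addresses that are still reachable.
        return [summary] + sorted(set(pe_loopbacks) & reachable)
    if option == 4:
        # Summary plus an unreachability signal for protected prefixes.
        lost = sorted(set(pe_loopbacks) & set(unreachable))
        return [summary] + [f"UPA {p}" for p in lost]
    raise ValueError("option must be 1, 2, 3 or 4")


if __name__ == "__main__":
    area = {"10.2.0.1/32", "10.2.0.2/32", "10.2.1.0/24"}  # illustrative
    pes = {"10.2.0.2/32"}
    for opt in (1, 2, 3, 4):
        print(opt, abr_advertisements(opt, area, pes, "10.2.0.0/16",
                                      unreachable={"10.2.0.2/32"}))
```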

We all know that options 1-3 work well and have been working well for a long time. 
The behavior is very well understood.

With the new option 4, to add value, applications need to get what is being 
referred to as ‘vendor secret sauce’ … I can already see the fun caused by 
inconsistent behavior and interop issues due to under-specification.

The question that remains for me is whether the UPA provides more benefit than the 
tax it introduces. I can see operators suffering due to under-specification, causing 
interop issues and inconsistent behaviors.


I agree with Bruno’s statement “If you believe that all you need is 
RFC5305/RFC5308 I guess this means that we don't need 
draft-ppsenak-lsr-igp-ureach-prefix-announce”

G/


From: Robert Raszuk 
Sent: Thursday, June 16, 2022 11:54 AM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) 
Cc: Gyan Mishra ; Voyer, Daniel 
; 
draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 
; lsr@ietf.org
Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Gunter,

(1) Multiple-ABRs

I was wondering, for example: if an ingress router receives a PUA signaling that a 
given locator has become unreachable, does that actually signal that the 
SID ‘really’ is unreachable for a router?

As there is no association between node_id (perhaps loopback) and SIDs (note 
that egress can use many SIDs) UPA really does not signal anything about SIDs 
reachability or liveness.

For example (simple design to illustrate the corner-case):

ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
     |                                                     |
     |                                                     |
     +---area#1---ABR#3---area---ABR#4---area#3------------+

What if ABR#4 were to lose connectivity to egressPE#2 and ABR#2 does not?
In that case ABR#4 will originate a UPA/PUA and ABR#2 will not originate a 
PUA/UPA.
How is ingressPE#1 supposed to handle this situation? The only thing 
ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may not 
have changed at all and the egress may remain perfectly reachable.

Valid case. But PE1 should only switch when an alternative backup path exists. If 
there is a single path it should do nothing on receiving a UPA. We 
have discussed that case before and as you know the formal answer was "out of 
scope" or "vendor's secret sauce" :).

The justification here is that switching to a healthy backup is better than 
continuing to use a perhaps semi-sick path.
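
The rule stated above (switch only if a backup exists, otherwise do nothing) can be sketched as follows. It is a hypothetical ingress-PE reaction, not something either draft specifies; the function name and the way next hops are represented are assumptions.

```python
def next_hop_after_upa(active_next_hop, upa_prefix, backup_next_hops):
    """Return the next hop to use once a UPA for upa_prefix has been received."""
    if active_next_hop != upa_prefix:
        # The UPA does not concern the next hop currently in use.
        return active_next_hop
    for candidate in backup_next_hops:
        if candidate != upa_prefix:
            # A backup that is not the signalled prefix exists: switch to it.
            return candidate
    # Single path only: do nothing and keep using the current next hop.
    return active_next_hop


if __name__ == "__main__":
    print(next_hop_after_upa("egressPE#2", "egressPE#2", ["egressPE#3"]))
    print(next_hop_after_upa("egressPE#2", "egressPE#2", []))
```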

Best,
R.



Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Peter Psenak

Hi Gunter,

please see inline (##PP):

On 16/06/2022 10:09, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi Gyan, Daniel, Peter, All,

Thanks for sharing your insights and I agree mostly with your feedback

I agree and understand that summarization is needed to reduce the size 
of the LSDB. I also agree summarization is good design practice, especially 
with IPv6 and SRv6 in mind. There has never been any doubt about that.


I am not sure I agree that PUA/UPA is ‘optimal-design’. Maybe it is the 
best we can do; however, I have a healthy worry that we could be suffering 
from tunnel vision and that the proposed solution may not be good enough.


We should not be blind and believe that advertising UPA/PUA comes 
without a cost. The architectural complexity cost of using PUA/UPA may 
not be worth the effort (none of the integration work for PUA/UPA 
event triggers comes for free). Do we really believe that PUA/UPA solves 
all the SID reachability problems for all IGP network designs and SR 
use-cases elegantly? Maybe some use-case design constraints and 
assumptions should be documented to clarify architecturally where 
PUA/UPA is most beneficial for operators? Just stating “outside scope of 
the draft” seems unfair to operators interested in PUA/UPAs.


##PP
we are trying to solve a particular problem of a remote PE going down in 
a network where summarization is used. I believe that is stated clearly in 
the UPA draft.




Let me give two examples where PUA/UPA benefit is unclear:

(1) Multiple-ABRs

I was wondering, for example, if an ingress router receives a PUA signaling 
that a given locator has become unreachable, does that actually 
signal that the SID ‘really’ is unreachable for a router?


For example (simple design to illustrate the corner case):

ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
    |                                                      |
    |                                                      |
    +--------area#1---ABR#3---area---ABR#4---area#3--------+

What if ABR#4 loses connectivity to egressPE#2 and ABR#2 does not?

In that case ABR#4 will originate a UPA/PUA and ABR#2 will not.


How is ingressPE#1 supposed to handle this situation? The only thing 
ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may 
not have changed at all and the egress may remain perfectly reachable.


##PP
we are not trying to solve the area partitioning problem with UPA.

Clearly, if you summarize on both ABRs and your area partitions, your 
connectivity is broken, as you have no control over which ABR the traffic 
will use to enter the partitioned area. If you hit the one that has no 
connectivity to the egress PE, your traffic will be dropped.


With UPA, at least the service traffic can be switched to an alternate 
egress PE, if there is one, preserving the connectivity for the service 
prefixes.
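
The partition case can be made concrete with a small reachability check over 
area#3. This is a sketch under assumptions: the interior routers P1 and P2 and 
the link list are invented for illustration, and a real ABR would derive 
reachability from its SPF run rather than from a breadth-first search.

```python
from collections import deque


def reachable(links, start):
    """Return the set of nodes reachable from `start` over undirected `links`."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def abrs_without_path(links, abrs, target):
    """ABRs that can no longer reach `target` inside the area."""
    return sorted(abr for abr in abrs if target not in reachable(links, abr))


# Hypothetical area#3 after the P2--PE2 link has failed (area is partitioned).
area3 = [("ABR2", "P1"), ("P1", "PE2"), ("ABR4", "P2")]
print(abrs_without_path(area3, ["ABR2", "ABR4"], "PE2"))  # ['ABR4']
```

Only ABR4 has lost the egress here, so only ABR4 would have a reason to 
originate a UPA, while both ABRs keep advertising the same summary.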




(2) with sr-policy or SRv6 SRTE

What if we have an inter-area/domain/level SRTE or sr-policy and 
suddenly there is a PUA/UPA for one of the SIDs in the sid-list of the path?


Will this impact the srte or sr-policy in any way? Will transit routers 
do anything with the UPA/PUA and drop packets? Will transit routers 
trigger fast-restoration?


##PP
we are not specifying any of that. If an implementation decides to use 
UPA on transit routers for some application, we do not prohibit it.




Can PCEs/controllers use the SID for crafting paths? Will all 
SRTE/sr-policy using the locator be pruned or re-signaled?


Will ingress router do something with the PUA information? Should 
PUA/UPA draft give guidelines around this?


##PP
UPA draft only describes the ISIS signalling part, not the external 
application handling of the UPA. That would not be appropriate in an IGP draft.


thanks,
Peter



Be well,

G/

If there is an SRTE or sr-policy using a given SID somewhere in the SID 
list… and suddenly


*From:*Gyan Mishra 
*Sent:* Thursday, June 16, 2022 6:12 AM
*To:* Voyer, Daniel 
*Cc:* Van De Velde, Gunter (Nokia - BE/Antwerp) 
; 
draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 
; lsr@ietf.org

*Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Summarization has always been a best practice for network scalability 
thereby reducing the size of the RIB and LSDB.


So in this case as Dan pointed out,  the summary route is an abstraction 
of the area and so if a component prefix of the summary became 
unreachable we need a way to signal that the PE next hop is no longer 
reachable to help optimize convergence.


We are just trying to make summarization work better than it does today 
so we don’t have to rely on domain wide flooding of host routes.


Thanks

Gyan

On Wed, Jun 15, 2022 at 4:42 PM Voyer, Daniel 
<40bell...@dmarc.ietf.org> wrote:


Hi Gunter,

Thanks for your comments,

The idea, here, with summarization is to "reduce" the LSDB quite 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Robert Raszuk
Gunter,

(1) Multiple-ABRs
>
>
>
> I was wondering, for example, if an ingress router receives a PUA signaling
> that a given locator has become unreachable, does that actually signal
> that the SID ‘really’ is unreachable for a router?
>

As there is no association between node_id (perhaps loopback) and SIDs
(note that an egress can use many SIDs), a UPA really does not signal anything
about SID reachability or liveness.


>  For example (simple design to illustrate the corner-case):
>
>
>
> ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
>     |                                                      |
>     |                                                      |
>     +--------area#1---ABR#3---area---ABR#4---area#3--------+
>
> What if ABR#4 loses connectivity to egressPE#2 and ABR#2 does not?
>
> In that case ABR#4 will originate a UPA/PUA and ABR#2 will not.
>
> How is ingressPE#1 supposed to handle this situation? The only thing
> ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may
> not have changed at all and the egress may remain perfectly reachable.
>

Valid case. But PE1 should only switch when an alternative backup path exists.
If there is a single path, it should do nothing on receiving a UPA. We have
discussed that case before and, as you know, the formal answer
was "out of scope" or "vendor's secret sauce" :).

The justification here is that switching to a healthy backup is better than
continuing to use a perhaps semi-sick path.

Best,
R.


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Aijun Wang
Hi, Gunter:

Thanks for your thorough thoughts.
For your case 1), the PUA draft already has the considerations: 
https://datatracker.ietf.org/doc/html/draft-wang-lsr-prefix-unreachable-annoucement-09#section-4

“If the nodes in the area receive the PUAM flood from all of its ABR
   routers, they will start BGP convergence process if there exist BGP
   session on this PUAM prefix.  The PUAM creates a forced fail over
   action to initiate immediate control plane convergence switchover to
   alternate egress PE.  Without the PUAM forced convergence the down
   prefix will yield black hole routing resulting in loss of
   connectivity.

   When only some of the ABRs can't reach the failure node/link, as that
   described in Section 3.2, the ABR that can reach the PUAM prefix
   should advertise one specific route to this PUAM prefix.  The
   internal routers within another area can then bypass the ABRs that
   can't reach the PUAM prefix, to reach the PUAM prefix.”
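
The two paragraphs quoted above amount to a three-way decision at a receiving 
node. The sketch below assumes the node knows the full set of ABRs for its 
area. The function name puam_action and the action labels are hypothetical and 
do not appear in the draft.

```python
def puam_action(area_abrs, puam_senders):
    """Classify what a receiving node would do for one PUAM prefix.

    area_abrs:    all ABRs of the local area
    puam_senders: ABRs from which a PUAM for the prefix was received
    Returns (action, abrs_expected_to_advertise_a_specific_route).
    """
    abrs = set(area_abrs)
    senders = set(puam_senders) & abrs
    if not senders:
        return ("none", [])
    if senders == abrs:
        # PUAM from every ABR: the prefix is gone, fail over to another egress PE.
        return ("start_bgp_convergence", [])
    # Only some ABRs lost the prefix: the others should advertise a specific route.
    return ("expect_specific_route", sorted(abrs - senders))


print(puam_action(["ABR1", "ABR3"], ["ABR1", "ABR3"]))
print(puam_action(["ABR1", "ABR3"], ["ABR3"]))
```

The second call models the Section 3.2 case, where ABR1 can still reach the 
prefix and traffic is expected to follow its more specific route.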

For your case 2), we think the transit router receiving the PUA should do a 
local bypass if a PLR is configured, or the ingress router/PCE should switch 
the traffic to another path that avoids the failed node. These are all 
applications of the PUA/UPA messages, and we can add some statements to the 
deployment considerations section if necessary.

Aijun Wang
China Telecom

> On Jun 16, 2022, at 16:10, Van De Velde, Gunter (Nokia - BE/Antwerp) 
>  wrote:
> 
> 
> Hi Gyan, Daniel, Peter, All,
>  
> Thanks for sharing your insights and I agree mostly with your feedback.
>  
> I agree and understand that summarization is needed to reduce the size of the 
> LSDB. I also agree summarization is good design practice, especially with 
> IPv6 and SRv6 in mind. There has never been any doubt about that.
> I am not sure I agree that PUA/UPA is ‘optimal-design’. Maybe it is the best 
> we can do, however I have a healthy worry that we could be suffering from 
> tunnel vision and that the proposed solution may not be good enough.
> We should not be blind and believe that advertising UPA/PUA comes 
> without a cost. The architectural PUA/UPA usage complexity cost may not be 
> worth the effort (none of the integration of PUA/UPA event triggers 
> comes for free). Do we really believe that PUA/UPA solves all the SID 
> reachability problems for all IGP network designs and SR use-cases elegantly? 
> Maybe some use-case design constraints and assumptions should be documented 
> to clarify architecturally where PUA/UPA is most beneficial for operators? 
> Just stating “outside scope of the draft” seems unfair to operators 
> interested in PUA/UPAs.
>  
> Let me give two examples where PUA/UPA benefit is unclear:
>  
> (1) Multiple-ABRs
>  
> I was wondering, for example, if an ingress router receives a PUA signaling 
> that a given locator has become unreachable, does that actually signal that 
> the SID ‘really’ is unreachable for a router?
>  
> For example (simple design to illustrate the corner case):
>  
> ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
>     |                                                      |
>     |                                                      |
>     +--------area#1---ABR#3---area---ABR#4---area#3--------+
>  
> What if ABR#4 loses connectivity to egressPE#2 and ABR#2 does not?
> In that case ABR#4 will originate a UPA/PUA and ABR#2 will not.
> How is ingressPE#1 supposed to handle this situation? The only thing 
> ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may 
> not have changed at all and the egress may remain perfectly reachable.
>  
>  
> (2) with sr-policy or SRv6 SRTE
> What if we have an inter-area/domain/level SRTE or sr-policy and suddenly 
> there is a PUA/UPA for one of the SIDs in the sid-list of the path?
> Will this impact the srte or sr-policy in any way? Will transit routers do 
> anything with the UPA/PUA and drop packets? Will transit routers trigger 
> fast-restoration?
> Can PCEs/controllers use the SID for crafting paths? Will all SRTE/sr-policy 
> using the locator be pruned or re-signaled?
> Will the ingress router do something with the PUA information? Should the 
> PUA/UPA draft give guidelines around this?
>  
> Be well,
> G/
>  
>  
>  
>  
>  
>  
>  
> If there is an SRTE or sr-policy using a given SID somewhere in the SID list… 
> and suddenly
>  
>  
>  
> From: Gyan Mishra  
> Sent: Thursday, June 16, 2022 6:12 AM
> To: Voyer, Daniel 
> Cc: Van De Velde, Gunter (Nokia - BE/Antwerp) 
> ; 
> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
> draft-wang-lsr-prefix-unreachable-annoucement 
> ; lsr@ietf.org
> Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>  
>

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
Hi Gyan, Daniel, Peter, All,

Thanks for sharing your insights and I agree mostly with your feedback.

I agree and understand that summarization is needed to reduce the size of the 
LSDB. I also agree summarization is good design practice, especially with IPv6 
and SRv6 in mind. There has never been any doubt about that.
I am not sure I agree that PUA/UPA is ‘optimal-design’. Maybe it is the best we 
can do, however I have a healthy worry that we could be suffering from tunnel 
vision and that the proposed solution may not be good enough.
We should not be blind and believe that advertising UPA/PUA comes 
without a cost. The architectural PUA/UPA usage complexity cost may not be 
worth the effort (none of the integration of PUA/UPA event triggers 
comes for free). Do we really believe that PUA/UPA solves all the SID 
reachability problems for all IGP network designs and SR use-cases elegantly? 
Maybe some use-case design constraints and assumptions should be documented to 
clarify architecturally where PUA/UPA is most beneficial for operators? Just 
stating “outside scope of the draft” seems unfair to operators interested in 
PUA/UPAs.

Let me give two examples where PUA/UPA benefit is unclear:

(1) Multiple-ABRs

I was wondering, for example, if an ingress router receives a PUA signaling 
that a given locator has become unreachable, does that actually signal that 
the SID ‘really’ is unreachable for a router?

For example (simple design to illustrate the corner case):

ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
    |                                                      |
    |                                                      |
    +--------area#1---ABR#3---area---ABR#4---area#3--------+

What if ABR#4 loses connectivity to egressPE#2 and ABR#2 does not?
In that case ABR#4 will originate a UPA/PUA and ABR#2 will not.
How is ingressPE#1 supposed to handle this situation? The only thing 
ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may not 
have changed at all and the egress may remain perfectly reachable.


(2) with sr-policy or SRv6 SRTE
What if we have an inter-area/domain/level SRTE or sr-policy and suddenly there 
is a PUA/UPA for one of the SIDs in the sid-list of the path?
Will this impact the srte or sr-policy in any way? Will transit routers do 
anything with the UPA/PUA and drop packets? Will transit routers trigger 
fast-restoration?
Can PCEs/controllers use the SID for crafting paths? Will all SRTE/sr-policy 
using the locator be pruned or re-signaled?
Will the ingress router do something with the PUA information? Should the 
PUA/UPA draft give guidelines around this?

Be well,
G/







If there is an SRTE or sr-policy using a given SID somewhere in the SID list… 
and suddenly



From: Gyan Mishra 
Sent: Thursday, June 16, 2022 6:12 AM
To: Voyer, Daniel 
Cc: Van De Velde, Gunter (Nokia - BE/Antwerp) ; 
draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 
; lsr@ietf.org
Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?


Summarization has always been a best practice for network scalability thereby 
reducing the size of the RIB and LSDB.

So in this case as Dan pointed out,  the summary route is an abstraction of the 
area and so if a component prefix of the summary became unreachable we need a 
way to signal that the PE next hop is no longer reachable to help optimize 
convergence.

We are just trying to make summarization work better than it does today so we 
don’t have to rely on domain wide flooding of host routes.

Thanks

Gyan


On Wed, Jun 15, 2022 at 4:42 PM Voyer, Daniel 
<40bell...@dmarc.ietf.org> wrote:
Hi Gunter,

Thanks for your comments,

The idea here with summarization is to "reduce" the LSDB quite a lot, make a 
given backbone much more scalable / flexible, and allow the NNIs within that 
backbone to be simplified considerably.
Summarization is "needed" for better scale and, in the context of IPv6, will 
help prevent blowing up the IGP.  With the size of an IPv6 prefix range 
(e.g. /64) allocated per domain, summarization will help to contain the LSDB to 
that domain.

What we are "highlighting" in draft-ppsenak-lsr-igp-ureach-prefix-announce-00, 
is an easy way to overcome the fact that PEs are hidden behind a summary route 
and need a fast way to notify other PEs when they become unreachable.

I don't see "over-engineering" here, I see "optimal-engineering" instead.

Thanks
Dan

On 2022-06-14, 4:59 AM, "Van De Velde, Gunter (Nokia - BE/Antwerp)" 
<gunter.van_de_ve...@nokia.com> wrote:

Hi All,

When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed 
summaries hide remote area networ

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Gyan Mishra
Summarization has always been a best practice for network scalability
thereby reducing the size of the RIB and LSDB.

So in this case as Dan pointed out,  the summary route is an abstraction of
the area and so if a component prefix of the summary became unreachable we
need a way to signal that the PE next hop is no longer reachable to help
optimize convergence.

We are just trying to make summarization work better than it does today so
we don’t have to rely on domain wide flooding of host routes.

Thanks

Gyan


On Wed, Jun 15, 2022 at 4:42 PM Voyer, Daniel  wrote:

> Hi Gunter,
>
> Thanks for your comments,
>
> The idea here with summarization is to "reduce" the LSDB quite a lot,
> make a given backbone much more scalable / flexible, and allow the NNIs
> within that backbone to be simplified considerably.
> Summarization is "needed" for better scale and, in the context of IPv6,
> will help prevent blowing up the IGP.  With the size of an IPv6
> prefix range (e.g. /64) allocated per domain, summarization will help to
> contain the LSDB to that domain.
>
> What we are "highlighting" in
> draft-ppsenak-lsr-igp-ureach-prefix-announce-00, is an easy way to overcome
> the fact that PEs are hidden behind a summary route and need a fast way to
> notify other PEs when they become unreachable.
>
> I don't see "over-engineering" here, I see "optimal-engineering" instead.
>
> Thanks
> Dan
>
> On 2022-06-14, 4:59 AM, "Van De Velde, Gunter (Nokia - BE/Antwerp)" <
> gunter.van_de_ve...@nokia.com> wrote:
>
> Hi All,
>
> When reading both proposals about PUA's:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
>
> The identified problem space seems a correct observation, and indeed
> summaries hide remote area network instabilities. It is one of the
> perceived benefits of using summaries. The place in the network where this
> hiding has the most impact on convergence is at service nodes (PEs for
> L3/L2/transport), where due to the summarization it is difficult to detect
> that the transport tunnel end-point has suddenly become unreachable. My
> concern, however, is whether it really is a problem that is worthy for the
> LSR WG to solve.
>
> To me, draft-wang-lsr-prefix-unreachable-annoucement-09 is
> not a preferred solution due to the expectation that all nodes in an area
> must be upgraded to support the IGP capability. From this operational
> perspective, draft-ppsenak-lsr-igp-ureach-prefix-announce-00 is
> more elegant, as only the A(S)BRs and particular PEs must be upgraded to
> support PUAs. I do have concerns about the number of PUA advertisements in
> hierarchically summarized networks (/24 (site) -> /20 (region) -> /16
> (core)). More specifically, in the /16 backbone area, how many of these PUAs
> will be floating around creating LSP/LSDB update churn? How do we keep the
> potentially exponential number of observed PUAs from floating everywhere?
> (Will this lead to OSPF NSSA-type areas, where areas are purged of
> these PUAs for scaling stability?)
>
> Long story short, should we not take a step back and re-think this
> identified problem space? Is the proposed solution space not more evil than
> the problem space? We do summarization because it brings stability and
> reduces the number of link state updates within an area. And now with PUA we
> re-introduce additional link state updates (PUAs), and we blow up the LSDB
> with information opaque to SPF best-path calculation. In addition, there is
> a suggestion of new state machinery to track the IGP reachability of
> 'protected' prefixes, and there is maybe a desire to contain or filter
> updates across inter-area boundaries. And finally, how will we represent and
> track PUAs in the RTM?
>
> What is wrong with simply not doing summaries and forgetting about these
> PUAs to punch holes in the summary prefixes? This worked very well during
> the last two decades. Are we not over-engineering with PUAs?
>
> G/
>
> --
> External Email: Please use caution when opening links and attachments
> / Courriel externe: Soyez prudent avec les liens et documents joints
>
>
>
-- 



Gyan Mishra
Network Solutions Architect
Email gyan.s.mis...@verizon.com
M 301 502-1347


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Voyer, Daniel
Hi Gunter,

Thanks for your comments,

The idea here with summarization is to "reduce" the LSDB quite a lot, make a 
given backbone much more scalable / flexible, and allow the NNIs within that 
backbone to be simplified considerably.
Summarization is "needed" for better scale and, in the context of IPv6, will 
help prevent blowing up the IGP.  With the size of an IPv6 prefix range 
(e.g. /64) allocated per domain, summarization will help to contain the LSDB to 
that domain.

What we are "highlighting" in draft-ppsenak-lsr-igp-ureach-prefix-announce-00, 
is an easy way to overcome the fact that PEs are hidden behind a summary route 
and need a fast way to notify other PEs when they become unreachable.

I don't see "over-engineering" here, I see "optimal-engineering" instead.

Thanks
Dan

On 2022-06-14, 4:59 AM, "Van De Velde, Gunter (Nokia - BE/Antwerp)" 
 wrote:

Hi All,

When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed 
summaries hide remote area network instabilities. It is one of the perceived 
benefits of using summaries. The place in the network where this hiding has 
the most impact on convergence is at service nodes (PEs for L3/L2/transport), 
where due to the summarization it is difficult to detect that the transport 
tunnel end-point has suddenly become unreachable. My concern, however, is 
whether it really is a problem that is worthy for the LSR WG to solve.

To me, draft-wang-lsr-prefix-unreachable-annoucement-09 is not a 
preferred solution due to the expectation that all nodes in an area must be 
upgraded to support the IGP capability. From this operational perspective, 
draft-ppsenak-lsr-igp-ureach-prefix-announce-00 is more elegant, as 
only the A(S)BRs and particular PEs must be upgraded to support PUAs. I do 
have concerns about the number of PUA advertisements in hierarchically 
summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More 
specifically, in the /16 backbone area, how many of these PUAs will be floating 
around creating LSP/LSDB update churn? How do we keep the potentially 
exponential number of observed PUAs from floating everywhere? (Will this lead 
to OSPF NSSA-type areas, where areas are purged of these PUAs for scaling 
stability?)
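
To put rough numbers on the /24 -> /20 -> /16 hierarchy mentioned above, the 
sketch below counts how many prefixes of each length fit under one core 
summary. It assumes IPv4 and uses 10.0.0.0/16 purely as an illustrative 
prefix. The helper summary_fanout is hypothetical.

```python
import ipaddress


def summary_fanout(core, levels):
    """Count how many prefixes of each length in `levels` fit inside `core`."""
    net = ipaddress.ip_network(core)
    return {plen: 2 ** (plen - net.prefixlen) for plen in levels}


# 10.0.0.0/16 is an arbitrary example prefix, not taken from either draft.
print(summary_fanout("10.0.0.0/16", [20, 24, 32]))
# {20: 16, 24: 256, 32: 65536}
```

The /32 figure is only an upper bound on the host routes that one /16 summary 
can hide. How many PUAs are actually originated depends on how many of those 
prefixes exist and fail, and on what the ABRs are configured to announce.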

Long story short, should we not take a step back and re-think this 
identified problem space? Is the proposed solution space not more evil than the 
problem space? We do summarization because it brings stability and reduces the 
number of link state updates within an area. And now with PUA we re-introduce 
additional link state updates (PUAs), and we blow up the LSDB with information 
opaque to SPF best-path calculation. In addition, there is a suggestion of new 
state machinery to track the IGP reachability of 'protected' prefixes, and 
there is maybe a desire to contain or filter updates across inter-area 
boundaries. And finally, how will we represent and track PUAs in the RTM?

What is wrong with simply not doing summaries and forgetting about these PUAs 
to punch holes in the summary prefixes? This worked very well during the last 
two decades. Are we not over-engineering with PUAs?

G/



Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
>
> looks to me that you are trying to steer the discussion towards the BGP
> based solution. Not something I'm interested in on this thread.
>

Not at all. It was you, not me, who used the argument that UPA/PUA is useful for
networks with no BGP ... example:

Quote:



"I have explained that several times to you. There are SP networks running
the services on top of p2p IP sec tunnels for example, with no BGP."



> > Also not all tunnels have keepalives. I am talking about mGRE
> > encapsulation as an example where you simply encapsulate and have no
> > idea other than consulting RIB if the dst node is up or down.
>
> in such case you can not use summarization at all.
>

Ok. Good to know :).

Best,
R.

PS.

Btw, an important point. Yes, networks experience scale limits. But those limits
are usually not due to exponential growth in the number of PEs. Such growth is
often associated with moving network services from routers to compute
blades. And guess what protocol is used in the underlay to those compute blades
... BGP :).


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

On 15/06/2022 15:41, Robert Raszuk wrote:

Traffic will initially switch to an alternate path, if any, and
later the native mechanism (BGP signalling, tunnel keepalive, etc.) will
take over and bring it to its final state.


On one hand you are saying that UPA is useful where there is no BGP. So 
let's talk about such a scenario.


looks to me that you are trying to steer the discussion towards the BGP 
based solution. Not something I'm interested in on this thread.




Also not all tunnels have keepalives. I am talking about mGRE 
encapsulation as an example where you simply encapsulate and have no 
idea other than consulting RIB if the dst node is up or down.


in such case you can not use summarization at all.

thanks,
Peter




In this discussed case it will keep sending packets to the remote area only 
to drop them there ... not good.


Thx,
R.




Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
> Traffic will initially switch to an alternate path, if any, and
> later the native mechanism (BGP signalling, tunnel keepalive, etc.) will
> take over and bring it to its final state.
>

On one hand you are saying that UPA is useful where there is no BGP. So
let's talk about such a scenario.

Also not all tunnels have keepalives. I am talking about mGRE encapsulation
as an example where you simply encapsulate and have no idea other than
consulting RIB if the dst node is up or down.

In this discussed case it will keep sending packets to the remote area only to
drop them there ... not good.

Thx,
R.


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

Robert,

On 15/06/2022 14:47, Robert Raszuk wrote:

Peter,

My question is precise; your answer is pretty loose :)

Imagine I use summarization and, as you have said many times, there is no BGP 
running. So how do I indicate planned scheduled maintenance in such 
cases? Say from either the ABRs or the PEs/Ps themselves?


if nothing special is done, UPA will be triggered for prefixes that are 
advertised by the node which undergoes planned restart - as the 
reachable prefixes that are summarized will become unreachable as a 
result of the node going down.
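
The trigger described above can be sketched as a comparison of the ABR's view 
before and after the node goes down. This is only an illustration: the 
function upa_candidates and the 2001:db8::/32 documentation prefixes are 
assumptions, and the draft leaves the exact trigger policy to the ABR.

```python
import ipaddress


def upa_candidates(summary, before, after):
    """Prefixes covered by `summary` that were reachable before but not after."""
    net = ipaddress.ip_network(summary)
    lost = set(before) - set(after)
    return sorted(p for p in lost if ipaddress.ip_network(p).subnet_of(net))


# Hypothetical view of an ABR that summarizes 2001:db8:3::/48 out of its area.
before = {"2001:db8:3::1/128", "2001:db8:3::2/128", "2001:db8:9::1/128"}
after = {"2001:db8:3::2/128"}
print(upa_candidates("2001:db8:3::/48", before, after))
# ['2001:db8:3::1/128']
```

The lost prefix outside the summary is not a candidate, since nothing hides 
its withdrawal from the rest of the network.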




In fact, looking at it practically, that may be much more useful and needed 
than signalling node failures.


And the issue I observed with using UPA is that as it is ephemeral it 
may not work well during extended maintenance windows. Stateful 
solutions however would work fine.


it works just fine for any node down case, be it a failure or planned 
maintenance. Traffic will initially switch to an alternate path, if any, and 
later the native mechanism (BGP signalling, tunnel keepalive, etc.) will 
take over and bring it to its final state.


thanks,
Peter



Thx,
R.







On Wed, Jun 15, 2022 at 2:34 PM Peter Psenak wrote:


Robert,

On 15/06/2022 14:13, Robert Raszuk wrote:
 > Peter,
 >
 >     the meaning of LSInfinity has been defined decades ago. No
matter how
 >
 >     much you may not like it, but it means unreachable.
 >
 >
 > True. But that brings another question ... Do you envision to use
UPA
 > also to indicate planned maintenance of a node ?

depends on how the planned maintenance is performed. If you just turn the
node off, UPA will catch it. If you instead set OL-bit, or use link max
metric initially, it may or may not be used, depending on what the
ABR/ASBR is programmed to do. There is quite some flexibility if needed.

thanks,
Peter


 >
 > Thx,
 > R.
 >





Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
Peter,

My question is precise  your answer is pretty loose :)

Imagine I use summarization and, as you have said many times, there is no BGP
running. So how do I indicate planned scheduled maintenance in such cases ?
Say from either ABRs or PEs/Ps themselves ?

In fact, looking practically that may be much more useful and needed than
signalling node failures.

And the issue I observed with using UPA is that as it is ephemeral it may
not work well during extended maintenance windows. Stateful solutions
however would work fine.

Thx,
R.







On Wed, Jun 15, 2022 at 2:34 PM Peter Psenak  wrote:

> Robert,
>
> On 15/06/2022 14:13, Robert Raszuk wrote:
> > Peter,
> >
> > the meaning of LSInfinity has been defined decades ago. No matter how
> >
> > much you may not like it, but it means unreachable.
> >
> >
> > True. But that brings another question ... Do you envision to use UPA
> > also to indicate planned maintenance of a node ?
>
> depends on how the planned maintenance is performed. If you just turn the
> node off, UPA will catch it. If you instead set OL-bit, or use link max
> metric initially, it may or may not be used, depending on what the
> ABR/ASBR is programmed to do. There is quite some flexibility if needed.
>
> thanks,
> Peter
>
>
> >
> > Thx,
> > R.
> >
>
>


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

Robert,

On 15/06/2022 14:13, Robert Raszuk wrote:

Peter,

the meaning of LSInfinity has been defined decades ago. No matter how

much you may not like it, but it means unreachable.


True. But that brings another question ... Do you envision to use UPA 
also to indicate planned maintenance of a node ?


depends on how the planned maintenance is performed. If you just turn the 
node off, UPA will catch it. If you instead set OL-bit, or use link max 
metric initially, it may or may not be used, depending on what the 
ABR/ASBR is programmed to do. There is quite some flexibility if needed.


thanks,
Peter




Thx,
R.





Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Aijun Wang
Hi, Peter:

Please review your document carefully:

https://datatracker.ietf.org/doc/html/draft-ppsenak-lsr-igp-ureach-prefix-announce-00#section-2.1
(UPA in IS-IS):

As per the definitions referenced in the preceding section, any
   prefix advertisement with a metric value greater than 0xFE000000 can
   be used for purposes other than normal routing calculations.  Such an
   advertisement can be interpreted by the receiver as a UPA.
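
A minimal sketch of the receiver-side check quoted above, assuming
MAX_PATH_METRIC is 0xFE000000 as defined in RFC 5305. The function names are
hypothetical and are not taken from the draft.

```python
# MAX_PATH_METRIC value from RFC 5305; everything else here is illustrative.
MAX_PATH_METRIC = 0xFE000000


def usable_for_spf(metric: int) -> bool:
    """Return True if a 32-bit wide metric may be used in normal route calculation."""
    if not 0 <= metric <= 0xFFFFFFFF:
        raise ValueError("wide metric must fit in 32 bits")
    return metric <= MAX_PATH_METRIC


def may_be_isis_upa(metric: int) -> bool:
    """Return True if a receiver could interpret the advertisement as a UPA."""
    return not usable_for_spf(metric)
```

Note that the boundary value itself (0xFE000000) is still a normal metric in
this sketch; only strictly larger values are treated as candidates for UPA.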

https://datatracker.ietf.org/doc/html/draft-ppsenak-lsr-igp-ureach-prefix-announce-00#section-3.1
(UPA in OSPF):

Using the existing mechanism already defined in the standards, as
   described in previous section, an advertisement of the inter-area or
   external prefix inside OSPF or OSPFv3 LSA that has the age set to
   value lower than MaxAge and metric set to LSInfinity can be
   interpreted by the receiver as a UPA.
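
The OSPF condition quoted above can be sketched the same way. This assumes
the RFC 2328 constants (LSInfinity = 0xffffff for summary and AS-external
LSAs, MaxAge = 3600 seconds); the function name is hypothetical.

```python
# Constants as defined in RFC 2328; the helper itself is only an illustration.
LS_INFINITY = 0xFFFFFF  # 24-bit all-ones metric
MAX_AGE = 3600          # seconds


def may_be_ospf_upa(ls_age: int, metric: int) -> bool:
    """Return True if an inter-area/external prefix could be read as a UPA.

    The LSA must not be aged out (age lower than MaxAge) and the metric
    must be exactly LSInfinity.
    """
    return ls_age < MAX_AGE and metric == LS_INFINITY
```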

Aijun Wang
China Telecom

> On Jun 15, 2022, at 20:09, Peter Psenak  
> wrote:
> 
> On 15/06/2022 13:39, Aijun Wang wrote:
>> Hi, Peter:
>> My point is that if you redefine or reuse the meaning of 
>> LSInfinity, there will be issues for other scenarios that want to use 
>> this field.
>> In the mentioned example, the prefixes associated with LSInfinity are 
>> certainly reachable, which contradicts your assumption.
> 
> not at all, you are interpreting it that way.
> 
> Peter
> 
> 
>> Aijun Wang
>> China Telecom
>>>> On Jun 15, 2022, at 19:18, Peter Psenak 
>>>>  wrote:
>>> 
>>> Aijun,
>>> 
>>>> On 15/06/2022 12:12, Aijun Wang wrote:
>>>> Hi, Peter:
>>>> If you use LSInfinity as the indicator of the prefixes unreachable, then 
>>>> how about you solve the situations that described in 
>>>> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4
>>>>  
>>>> <https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4>,
>>>>  in which the metric in the parent TLV MUST be set to LSInfinity?
>>> 
>>> if the IP Algorithm Prefix Reachability Sub-TLV is present the metric from 
>>> that Sub-TLV is used instead. There is no problem.
>>> 
>>>> Will you consider all such prefixes unreachable? This is certainly not the 
>>>> aim of the IP FlexAlgo document.
>>>> In conclusion, the prefixes unreachable information should be indicated 
>>>> explicitly by other means, as that introduced in the PUA draft.
>>> 
>>> the meaning of LSInfinity has been defined decades ago. No matter how much 
>>> you may not like it, but it means unreachable.
>>> 
>>> thanks,
>>> Peter
>>> 
>>>> Aijun Wang
>>>> China Telecom
>>>>>> On Jun 15, 2022, at 17:09, Peter Psenak 
>>>>>>  wrote:
>>>>> 
>>>>> Hi Gunter,
>>>>> 
>>>>> On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:
>>>>>> Hi Robert,
>>>>>> I agree with you that the operator problem space is not limited to 
>>>>>> multi-area/levels with IGP summarisation.
>>>>>> With the PUA/UPA proposals I get the feeling that LSR WG is jumping into 
>>>>>> the deep-end and is re-vectoring the IGP to carry opaque information not 
>>>>>> used for SPF/cSPF.
>>>>>> I believe we should be conservative for such and if LSR WG progresses 
>>>>>> with such decision.
>>>>> 
>>>>> please note that UPA draft builds on existing protocol specification 
>>>>> defined in RFC5305 and RFC5308 that allow the metric larger than 
>>>>> MAX_PATH_METRIC to be used "for purposes other than building the normal 
>>>>> IP routing table". We are just documenting one of them.
>>>>> 
>>>>> thanks,
>>>>> Peter
>>>>> 
>>>>> 
>>>>>> It could very well be that re-vectoring is the best solution, but I 
>>>>>> guess we need to agree first on understanding the operator problem space.
>>>>>> G/
>>>>>> *From:*Robert Raszuk 
>>>>>> *Sent:* Tuesday, June 14, 2022 11:51 AM
>>>>>> *To:* Van De Velde, Gunter (Nokia - BE/Antwerp) 
>>>>>> 
>>>>>> *Cc:* lsr ; 
>>>>>> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
>>>>>> draft-wang-lsr-prefix-unreachable-annoucement 
>>>>>> 
>>>>>> *Subject:* Re: [Lsr]

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
Peter,

> the meaning of LSInfinity has been defined decades ago. No matter how
> much you may not like it, but it means unreachable.


True. But that brings another question ... Do you envision to use UPA also
to indicate planned maintenance of a node ?

Thx,
R.
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

On 15/06/2022 13:39, Aijun Wang wrote:

Hi, Peter:
My point is that if you redefine or reuse the meaning of LSInfinity, 
there will be issues for other scenarios that want to use this field.
In the mentioned example, the prefixes associated with LSInfinity are 
certainly reachable, which contradicts your assumption.


not at all, you are interpreting it that way.

Peter




Aijun Wang
China Telecom


On Jun 15, 2022, at 19:18, Peter Psenak  
wrote:

Aijun,


On 15/06/2022 12:12, Aijun Wang wrote:
Hi, Peter:
If you use LSInfinity as the indicator of prefix unreachability, then how would 
you solve the situation described in 
https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4 
<https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4>, 
in which the metric in the parent TLV MUST be set to LSInfinity?


if the IP Algorithm Prefix Reachability Sub-TLV is present the metric from that 
Sub-TLV is used instead. There is no problem.


Will you consider all such prefixes unreachable? This is certainly not the aim 
of the IP FlexAlgo document.
In conclusion, prefix unreachability information should be indicated 
explicitly by other means, such as the one introduced in the PUA draft.


the meaning of LSInfinity has been defined decades ago. No matter how much you 
may not like it, but it means unreachable.

thanks,
Peter


Aijun Wang
China Telecom

On Jun 15, 2022, at 17:09, Peter Psenak  
wrote:


Hi Gunter,

On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi Robert,
I agree with you that the operator problem space is not limited to 
multi-area/levels with IGP summarisation.
With the PUA/UPA proposals I get the feeling that LSR WG is jumping into the 
deep-end and is re-vectoring the IGP to carry opaque information not used for 
SPF/cSPF.
I believe we should be conservative for such and if LSR WG progresses with such 
decision.


please note that UPA draft builds on existing protocol specification defined in RFC5305 
and RFC5308 that allow the metric larger than MAX_PATH_METRIC to be used "for 
purposes other than building the normal IP routing table". We are just documenting 
one of them.

thanks,
Peter



It could very well be that re-vectoring is the best solution, but I guess we 
need to agree first on understanding the operator problem space.
G/
*From:*Robert Raszuk 
*Sent:* Tuesday, June 14, 2022 11:51 AM
*To:* Van De Velde, Gunter (Nokia - BE/Antwerp) 
*Cc:* lsr ; draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 

*Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
Hello Gunter,
I agree with pretty much all you said except the conclusion - do nothing :).
To me if you need to accelerate connectivity restoration upon an unlikely event 
like a complete PE failure the right vehicle to signal this is within the 
service layer itself. Let's keep in mind that links do fail a lot in the 
networks - routers do not (or if they do, it is a multiple orders of magnitude less 
frequent event). Especially links on the PE-CE boundaries do fail a lot.
Removal of next hop reachability can be done with BGP and based on BGP native 
recursion will have the exact same effect as presented ideas. Moreover it will 
be stateful for the endpoints which again to me is a feature not a bug.
Some suggested to define a new extension in BGP to signal it even without using 
double recursion - well one of them has been proposed in the past - 
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt 
<https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt> At that 
time the feedback received was that native BGP withdraws are fast enough so no need 
to bother. Well those native withdrawals are working today as well as some claim that 
specific implementations can withdraw RD:* when PE hosting such RDs fail and RDs are 
allocated in a unique per VRF fashion.
Then we have the DROID proposal which again may look like overkill for this 
very problem, but if you consider the bigger picture of what networks control 
plane pub-sub signalling needs, it establishes the foundation for such.
Many thanks,
Robert
On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) 
<gunter.van_de_ve...@nokia.com> wrote:
Hi All,
When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09
The identified problem space seems a correct observation, and indeed
summaries hide remote area network instabilities. It is one of the
perceived benefits of using summaries. The place in the network
where this hiding takes the most impact upon convergence is at
service nodes (PE's for L3/L2/transport) where due to the
summarization it is difficult to detect that the transport tunnel
end-point suddenly becomes unr

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Aijun Wang
Hi, Peter:
My point is that if you redefine or reuse the meaning of LSInfinity, 
there will be issues for other scenarios that want to use this field.
In the mentioned example, the prefixes associated with LSInfinity are 
certainly reachable, which contradicts your assumption.

Aijun Wang
China Telecom

> On Jun 15, 2022, at 19:18, Peter Psenak  
> wrote:
> 
> Aijun,
> 
>> On 15/06/2022 12:12, Aijun Wang wrote:
>> Hi, Peter:
>> If you use LSInfinity as the indicator of the prefixes unreachable, then how 
>> about you solve the situations that described in 
>> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4 
>> <https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4>,
>>  in which the metric in the parent TLV MUST be set to LSInfinity?
> 
> if the IP Algorithm Prefix Reachability Sub-TLV is present the metric from 
> that Sub-TLV is used instead. There is no problem.
> 
>> Will you consider all such prefixes unreachable? This is certainly not the 
>> aim of the IP FlexAlgo document.
>> In conclusion, the prefixes unreachable information should be indicated 
>> explicitly by other means, as that introduced in the PUA draft.
> 
> the meaning of LSInfinity has been defined decades ago. No matter how much 
> you may not like it, but it means unreachable.
> 
> thanks,
> Peter
> 
>> Aijun Wang
>> China Telecom
>>>> On Jun 15, 2022, at 17:09, Peter Psenak 
>>>>  wrote:
>>> 
>>> Hi Gunter,
>>> 
>>> On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:
>>>> Hi Robert,
>>>> I agree with you that the operator problem space is not limited to 
>>>> multi-area/levels with IGP summarisation.
>>>> With the PUA/UPA proposals I get the feeling that LSR WG is jumping into 
>>>> the deep-end and is re-vectoring the IGP to carry opaque information not 
>>>> used for SPF/cSPF.
>>>> I believe we should be conservative for such and if LSR WG progresses with 
>>>> such decision.
>>> 
>>> please note that UPA draft builds on existing protocol specification 
>>> defined in RFC5305 and RFC5308 that allow the metric larger than 
>>> MAX_PATH_METRIC to be used "for purposes other than building the normal IP 
>>> routing table". We are just documenting one of them.
>>> 
>>> thanks,
>>> Peter
>>> 
>>> 
>>>> It could very well be that re-vectoring is the best solution, but I guess 
>>>> we need to agree first on understanding the operator problem space.
>>>> G/
>>>> *From:*Robert Raszuk 
>>>> *Sent:* Tuesday, June 14, 2022 11:51 AM
>>>> *To:* Van De Velde, Gunter (Nokia - BE/Antwerp) 
>>>> 
>>>> *Cc:* lsr ; 
>>>> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
>>>> draft-wang-lsr-prefix-unreachable-annoucement 
>>>> 
>>>> *Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>>>> Hello Gunter,
>>>> I agree with pretty much all you said except the conclusion - do nothing 
>>>> :).
>>>> To me if you need to accelerate connectivity restoration upon an unlikely 
>>>> event like a complete PE failure the right vehicle to signal this is 
>>>> within the service layer itself. Let's keep in mind that links do fail a 
>>>> lot in the networks - routers do not (or if they do, it is a multiple orders of 
>>>> magnitude less frequent event). Especially links on the PE-CE boundaries 
>>>> do fail a lot.
>>>> Removal of next hop reachability can be done with BGP and based on BGP 
>>>> native recursion will have the exact same effect as presented ideas. 
>>>> Moreover it will be stateful for the endpoints which again to me is a 
>>>> feature not a bug.
>>>> Some suggested to define a new extension in BGP to signal it even without 
>>>> using double recursion - well one of them has been proposed in the past - 
>>>> https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt 
>>>> <https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt> At 
>>>> that time the feedback received was that native BGP withdraws are fast 
>>>> enough so no need to bother. Well those native withdrawals are working 
>>>> today as well as some claim that specific implementations can withdraw 
>>>> RD:* when PE hosting such RDs fail and RDs are allocated in a unique per 
>&

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

Aijun,

On 15/06/2022 12:12, Aijun Wang wrote:

Hi, Peter:

If you use LSInfinity as the indicator of prefix unreachability, then 
how would you solve the situation described in 
https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4 
<https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4>, 
in which the metric in the parent TLV MUST be set to LSInfinity?


if the IP Algorithm Prefix Reachability Sub-TLV is present the metric 
from that Sub-TLV is used instead. There is no problem.
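
A minimal sketch of the metric selection described in the reply above,
assuming a simplified in-memory model of a prefix advertisement. The class
and field names are hypothetical; only the rule itself (use the Sub-TLV
metric when the Sub-TLV is present) comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

LS_INFINITY = 0xFFFFFF  # OSPF LSInfinity per RFC 2328


@dataclass
class PrefixAdvert:
    prefix: str
    parent_metric: int                      # metric carried in the parent TLV
    algo_subtlv_metric: Optional[int] = None  # metric from the IP Algorithm
                                              # Prefix Reachability Sub-TLV, if present


def effective_metric(adv: PrefixAdvert) -> int:
    """Pick the metric a FlexAlgo-aware receiver would evaluate."""
    if adv.algo_subtlv_metric is not None:
        # Sub-TLV present: the parent metric (LSInfinity) is not used.
        return adv.algo_subtlv_metric
    return adv.parent_metric


def looks_unreachable(adv: PrefixAdvert) -> bool:
    """Only a prefix whose effective metric is LSInfinity looks unreachable."""
    return effective_metric(adv) == LS_INFINITY
```

Under this model a FlexAlgo prefix with LSInfinity in the parent TLV and a
Sub-TLV metric of 20 is evaluated with metric 20, so it is not mistaken for
an unreachable prefix.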


Will you consider all such prefixes unreachable? This is certainly not 
the aim of the IP FlexAlgo document.


In conclusion, prefix unreachability information should be indicated 
explicitly by other means, such as the one introduced in the PUA draft.


the meaning of LSInfinity has been defined decades ago. No matter how 
much you may not like it, but it means unreachable.


thanks,
Peter



Aijun Wang
China Telecom

On Jun 15, 2022, at 17:09, Peter Psenak 
 wrote:


Hi Gunter,

On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi Robert,
I agree with you that the operator problem space is not limited to 
multi-area/levels with IGP summarisation.
With the PUA/UPA proposals I get the feeling that LSR WG is jumping 
into the deep-end and is re-vectoring the IGP to carry opaque 
information not used for SPF/cSPF.
I believe we should be conservative for such and if LSR WG progresses 
with such decision.


please note that UPA draft builds on existing protocol specification 
defined in RFC5305 and RFC5308 that allow the metric larger than 
MAX_PATH_METRIC to be used "for purposes other than building the 
normal IP routing table". We are just documenting one of them.


thanks,
Peter


It could very well be that re-vectoring is the best solution, but I 
guess we need to agree first on understanding the operator problem space.

G/
*From:*Robert Raszuk 
*Sent:* Tuesday, June 14, 2022 11:51 AM
*To:* Van De Velde, Gunter (Nokia - BE/Antwerp) 

*Cc:* lsr ; 
draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 


*Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
Hello Gunter,
I agree with pretty much all you said except the conclusion - do 
nothing :).
To me if you need to accelerate connectivity restoration upon an 
unlikely event like a complete PE failure the right vehicle to signal 
this is within the service layer itself. Let's keep in mind that 
links do fail a lot in the networks - routers do not (or if they do, it 
is a multiple orders of magnitude less frequent event). Especially 
links on the PE-CE boundaries do fail a lot.
Removal of next hop reachability can be done with BGP and based on 
BGP native recursion will have the exact same effect as presented 
ideas. Moreover it will be stateful for the endpoints which again to 
me is a feature not a bug.
Some suggested to define a new extension in BGP to signal it even 
without using double recursion - well one of them has been proposed 
in the past - 
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt 
<https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt> 
At that time the feedback received was that native BGP withdraws are 
fast enough so no need to bother. Well those native withdrawals are 
working today as well as some claim that specific implementations can 
withdraw RD:* when PE hosting such RDs fail and RDs are allocated in 
a unique per VRF fashion.
Then we have the DROID proposal which again may look like overkill 
for this very problem, but if you consider the bigger picture of what 
networks control plane pub-sub signalling needs, it establishes the 
foundation for such.

Many thanks,
Robert
On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - 
BE/Antwerp) <gunter.van_de_ve...@nokia.com> wrote:

   Hi All,
   When reading both proposals about PUA's:
   * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
   * draft-wang-lsr-prefix-unreachable-annoucement-09
   The identified problem space seems a correct observation, and indeed
   summaries hide remote area network instabilities. It is one of the
   perceived benefits of using summaries. The place in the network
   where this hiding takes the most impact upon convergence is at
   service nodes (PE's for L3/L2/transport) where due to the
   summarization it is difficult to detect that the transport tunnel
   end-point suddenly becomes unreachable. My concern however is if it
   really is a problem that is worthy for LSR WG to solve.
   To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09"
   is not a preferred solution due to the expectation that all nodes in
   an area must be upgraded to support the IGP capability. From this
   operational perspective the draft
   "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant,
   as only the A(S)BR's and particular PEs must 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Aijun Wang
Hi, Peter:

If you use LSInfinity as the indicator of prefix unreachability, then how 
would you solve the situation described in 
https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-6.4, 
in which the metric in the parent TLV MUST be set to LSInfinity?
Will you consider all such prefixes unreachable? This is certainly not the aim 
of the IP FlexAlgo document.

In conclusion, prefix unreachability information should be indicated 
explicitly by other means, such as the one introduced in the PUA draft.

Aijun Wang
China Telecom

> On Jun 15, 2022, at 17:09, Peter Psenak  
> wrote:
> 
> Hi Gunter,
> 
>> On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:
>> Hi Robert,
>> I agree with you that the operator problem space is not limited to 
>> multi-area/levels with IGP summarisation.
>> With the PUA/UPA proposals I get the feeling that LSR WG is jumping into the 
>> deep-end and is re-vectoring the IGP to carry opaque information not used 
>> for SPF/cSPF.
>> I believe we should be conservative for such and if LSR WG progresses with 
>> such decision.
> 
> please note that UPA draft builds on existing protocol specification defined 
> in RFC5305 and RFC5308 that allow the metric larger than MAX_PATH_METRIC to 
> be used "for purposes other than building the normal IP routing table". We 
> are just documenting one of them.
> 
> thanks,
> Peter
> 
> 
>> It could very well be that re-vectoring is the best solution, but I guess we 
>> need to agree first on understanding the operator problem space.
>> G/
>> *From:*Robert Raszuk 
>> *Sent:* Tuesday, June 14, 2022 11:51 AM
>> *To:* Van De Velde, Gunter (Nokia - BE/Antwerp) 
>> 
>> *Cc:* lsr ; 
>> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
>> draft-wang-lsr-prefix-unreachable-annoucement 
>> 
>> *Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>> Hello Gunter,
>> I agree with pretty much all you said except the conclusion - do nothing :).
>> To me if you need to accelerate connectivity restoration upon an unlikely 
>> event like a complete PE failure the right vehicle to signal this is within 
>> the service layer itself. Let's keep in mind that links do fail a lot in the 
>> networks - routers do not (or if they do, it is a multiple orders of magnitude 
>> less frequent event). Especially links on the PE-CE boundaries do fail a lot.
>> Removal of next hop reachability can be done with BGP and based on BGP 
>> native recursion will have the exact same effect as presented ideas. 
>> Moreover it will be stateful for the endpoints which again to me is a 
>> feature not a bug.
>> Some suggested to define a new extension in BGP to signal it even without 
>> using double recursion - well one of them has been proposed in the past - 
>> https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt 
>> <https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt> At that 
>> time the feedback received was that native BGP withdraws are fast enough so 
>> no need to bother. Well those native withdrawals are working today as well 
>> as some claim that specific implementations can withdraw RD:* when PE 
>> hosting such RDs fail and RDs are allocated in a unique per VRF fashion.
>> Then we have the DROID proposal which again may look like overkill for this 
>> very problem, but if you consider the bigger picture of what networks 
>> control plane pub-sub signalling needs, it establishes the foundation for 
>> such.
>> Many thanks,
>> Robert
>> On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) 
>> <gunter.van_de_ve...@nokia.com> wrote:
>>Hi All,
>>When reading both proposals about PUA's:
>>* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
>>* draft-wang-lsr-prefix-unreachable-annoucement-09
>>The identified problem space seems a correct observation, and indeed
>>summaries hide remote area network instabilities. It is one of the
>>perceived benefits of using summaries. The place in the network
>>where this hiding takes the most impact upon convergence is at
>>service nodes (PE's for L3/L2/transport) where due to the
>>summarization it is difficult to detect that the transport tunnel
>>end-point suddenly becomes unreachable. My concern however is if it
>>really is a problem that is worthy for LSR WG to solve.
>>To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09"
>>is not a preferred solution due to the expectation that all nodes in
>>an area must

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

Hi Gunter,

On 15/06/2022 11:02, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi Robert,

I agree with you that the operator problem space is not limited to 
multi-area/levels with IGP summarisation.


With the PUA/UPA proposals I get the feeling that LSR WG is jumping into 
the deep-end and is re-vectoring the IGP to carry opaque information not 
used for SPF/cSPF.


I believe we should be conservative for such and if LSR WG progresses 
with such decision.


please note that UPA draft builds on existing protocol specification 
defined in RFC5305 and RFC5308 that allow the metric larger than 
MAX_PATH_METRIC to be used "for purposes other than building the normal 
IP routing table". We are just documenting one of them.


thanks,
Peter




It could very well be that re-vectoring is the best solution, but I 
guess we need to agree first on understanding the operator problem space.


G/

*From:*Robert Raszuk 
*Sent:* Tuesday, June 14, 2022 11:51 AM
*To:* Van De Velde, Gunter (Nokia - BE/Antwerp) 

*Cc:* lsr ; 
draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 


*Subject:* Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hello Gunter,

I agree with pretty much all you said except the conclusion - do nothing 
:).


To me if you need to accelerate connectivity restoration upon an 
unlikely event like a complete PE failure the right vehicle to signal 
this is within the service layer itself. Let's keep in mind that links 
do fail a lot in the networks - routers do not (or if they do, it is a 
multiple orders of magnitude less frequent event). Especially links on 
the PE-CE boundaries do fail a lot.


Removal of next hop reachability can be done with BGP and based on BGP 
native recursion will have the exact same effect as presented ideas. 
Moreover it will be stateful for the endpoints which again to me is a 
feature not a bug.


Some suggested to define a new extension in BGP to signal it even 
without using double recursion - well one of them has been proposed in 
the past - 
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt 
<https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt> At 
that time the feedback received was that native BGP withdraws are fast 
enough so no need to bother. Well those native withdrawals are working 
today as well as some claim that specific implementations can withdraw 
RD:* when PE hosting such RDs fail and RDs are allocated in a unique per 
VRF fashion.


Then we have the DROID proposal which again may look like overkill for 
this very problem, but if you consider the bigger picture of what 
networks control plane pub-sub signalling needs, it establishes the 
foundation for such.


Many thanks,

Robert

On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - 
BE/Antwerp) <gunter.van_de_ve...@nokia.com> wrote:


Hi All,

When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed
summaries hide remote area network instabilities. It is one of the
perceived benefits of using summaries. The place in the network
where this hiding takes the most impact upon convergence is at
service nodes (PE's for L3/L2/transport) where due to the
summarization it is difficult to detect that the transport tunnel
end-point suddenly becomes unreachable. My concern however is if it
really is a problem that is worthy for LSR WG to solve.

To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09"
is not a preferred solution due to the expectation that all nodes in
an area must be upgraded to support the IGP capability. From this
operational perspective the draft
"draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant,
as only the A(S)BR's and particular PEs must be upgraded to support
PUA's. I do have concerns about the number of PUA advertisements in
hierarchically summarized networks (/24 (site) -> /20 (region) ->
/16 (core)). More specifically, in the /16 backbone area, how many of
these PUAs will be floating around creating LSP LSDB update churns?
How to control the potentially exponential number of observed PUAs
from floating everywhere? (will this lead to OSPF type NSSA areas
where areas will be purged from these PUAs for scaling stability?)

Long story short, should we not take a step back and re-think this
identified problem space? Is the proposed solution space not more
evil than the problem space? We do summarization because it brings
stability and reduces the number of link state updates within an
area. And now with PUA we re-introduce additional link state updates
(PUAs), we blow up the LSDB with information opaque to SPF best-path
calculation. In 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Peter Psenak

Hi Gunter,

please see inline:


On 15/06/2022 10:38, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi Peter, All,


From a BGP perspective (PE service nodes) the event detection when transport 
tunnel end-point suddenly becomes unreachable is an operational problem. I 
think we all agree.

This problem exists in any multi-domain network, and is not limited to a 
multi-area/level IGP with summarization. Hence my doubts that simple encodings 
using the IGP as API for unreachability signaling is an optimal solution.


we are solving the problem for inter-area and/or inter-domain IGP 
networks. There are plenty of them.




Churning the LSDB for these things doesn't seem right.  Would this mean that we 
hack the IGP implementation so we don't trigger SPFs on rx of these updates?


I would not call adding a UPA announcement for a very rare event 
churning the LSDB. I really do not see the problem there.


UPA is a prefix advertisement with an unreachable metric. Given that the 
prefix was never advertised with a valid metric before (due to 
summarization), not even a PRC is required.
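
A minimal sketch of this point, assuming IS-IS, where RFC 5305 states that a prefix advertised with a metric above MAX_PATH_METRIC (0xFE000000) is not considered during the normal SPF computation. The PrefixAdvertisement record and both function names are hypothetical and do not come from either draft:

```python
from dataclasses import dataclass

# RFC 5305: a prefix advertised with a metric larger than this value
# is not considered during the normal SPF computation.
MAX_PATH_METRIC = 0xFE000000


@dataclass(frozen=True)
class PrefixAdvertisement:
    """Hypothetical record for illustration, not a real TLV layout."""
    prefix: str
    metric: int


def is_unreachable(adv: PrefixAdvertisement) -> bool:
    """True when the advertisement carries an unreachable metric."""
    return adv.metric > MAX_PATH_METRIC


def needs_route_recalculation(adv: PrefixAdvertisement, installed: set) -> bool:
    """Simplified model: an unreachable advertisement only matters to the
    route calculation if that exact prefix is currently installed."""
    if is_unreachable(adv):
        return adv.prefix in installed
    return True


# Only the summary is installed, so the UPA for a component prefix
# does not require any recalculation in this model.
installed_routes = {"10.1.0.0/16"}
upa = PrefixAdvertisement("10.1.1.1/32", MAX_PATH_METRIC + 1)
print(needs_route_recalculation(upa, installed_routes))
```

The addresses are made up; the only value taken from a specification is MAX_PATH_METRIC.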



> Another concern is how we hook into BGP sideways to update it. Typically a
> router just looks at RTM and tunnel-tables for reachability. Now it would have
> to check a separate bypass-list all the time.


that is a matter of implementation.


> What about the pseudo-state. On startup I would imagine we would have to
> originate this PUA until a certain point?



UPA is only advertised if the component prefix of the summary that was 
reachable in its source area/domain becomes unreachable. Nothing is sent 
on startup.
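
The origination rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: the SummaryTracker class, its method names, and the example addresses are hypothetical, and a real ABR would derive reachability from its SPF results rather than from explicit calls.

```python
import ipaddress


class SummaryTracker:
    """Hypothetical ABR-side state for one configured summary."""

    def __init__(self, summary: str):
        self.summary = ipaddress.ip_network(summary)
        self.reachable = set()  # component prefixes seen as reachable

    def on_reachable(self, prefix: str) -> None:
        """Record a component prefix that is reachable in the source area."""
        net = ipaddress.ip_network(prefix)
        if net.subnet_of(self.summary):
            self.reachable.add(net)

    def on_unreachable(self, prefix: str) -> bool:
        """Return True if a UPA should be originated for this prefix."""
        net = ipaddress.ip_network(prefix)
        if net in self.reachable:
            self.reachable.discard(net)
            return True
        # Never seen as reachable (for example right after startup):
        # nothing is sent.
        return False


tracker = SummaryTracker("10.1.0.0/16")
at_startup = tracker.on_unreachable("10.1.1.1/32")
tracker.on_reachable("10.1.1.1/32")
after_failure = tracker.on_unreachable("10.1.1.1/32")
print(at_startup, after_failure)
```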




> Regarding installing the PUA route as a blackhole route: it does not seem to
> be an option, because resolution of BGP next-hops with blackhole /32 routes
> has to continue to mean “drop” matching traffic, given the widespread way
> this is used for DDoS protection. So there is a need for another “install”
> type for the “unreachable” IGP prefix, which does not exist yet.


again, UPA processing is a matter of the implementation and is out of 
the scope of the draft. All you need to do is to trigger BGP PIC for 
destinations that use the UPA prefix as their NH. It isn't that hard.
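
As a rough illustration of that trigger, the sketch below models BGP routes as a plain dictionary of service prefix to (primary next hop, backup next hop) and moves affected routes to their backup when a UPA arrives. This is an assumption-laden toy: the function names and addresses are hypothetical, and real BGP PIC implementations typically repoint a shared path-list in the FIB rather than walking routes one by one.

```python
import ipaddress


def routes_using_next_hop(bgp_routes: dict, upa_prefix: str) -> list:
    """Return service prefixes whose primary next hop falls inside the UPA prefix."""
    upa = ipaddress.ip_network(upa_prefix)
    return [
        prefix
        for prefix, (primary_nh, _backup_nh) in bgp_routes.items()
        if ipaddress.ip_address(primary_nh) in upa
    ]


def switch_to_backup(bgp_routes: dict, upa_prefix: str) -> dict:
    """Return a copy of the table with affected routes moved to their backup."""
    updated = dict(bgp_routes)
    for prefix in routes_using_next_hop(bgp_routes, upa_prefix):
        _primary_nh, backup_nh = bgp_routes[prefix]
        if backup_nh is not None:
            updated[prefix] = (backup_nh, None)
    return updated


# Hypothetical table: service prefix -> (primary PE loopback, backup PE loopback)
table = {
    "192.0.2.0/24": ("10.1.1.1", "10.2.2.2"),
    "198.51.100.0/24": ("10.3.3.3", "10.2.2.2"),
}
after_upa = switch_to_backup(table, "10.1.1.1/32")
print(after_upa)
```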




> To make IGP based Prefix-unreachability-signal successful seems not a trivial
> task pe-to-pe, and involves more than simplistic dumping of opaque link-state
> messages into IGP and to re-vector interior routing as an API. I'm a bit
> tormented regarding the potential evil caused to IGP for signaling
> prefix-unreachability. It may not be worth the effort. Especially when
> realizing that the problem space is not limited to multi-area/level
> summarization but instead exists in any multi-domain network.


once you implement it, you realize that it was not that hard at all.

thanks,
Peter




> Maybe IETF should consider looking at the bigger picture, at service level, and
> document a full service level solution framework instead of looking only at IGP
> in atomic fashion.
>
> G/

-Original Message-
From: Peter Psenak 
Sent: Tuesday, June 14, 2022 5:46 PM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) ; lsr 

Cc: draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: Thoughts about PUAs - are we not over-engineering?

Hi Gunter,

please see inline:

On 14/06/2022 10:59, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi All,

When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed summaries 
hide remote area network instabilities. It is one of the perceived benefits of 
using summaries. The place in the network where this hiding takes the most 
impact upon convergence is at service nodes (PE's for L3/L2/transport) where 
due to the summarization its difficult to detect that the transport tunnel 
end-point suddenly becomes unreachable. My concern however is if it really is a 
problem that is worthy for LSR WG to solve.


the request to address the problem is coming from the field. The scale of the 
networks in the field is growing significantly and the summarization is being 
implemented to keep the prefix scale under control.




To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09" is
not a preferred solution due to the expectation that all nodes in an
area must be upgraded to support the IGP capability. From this
operational perspective the draft
"draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as
only the A(S)BR's and particular PEs must be upgraded to support
PUA's. I do have concerns about the number of PUA advertisements in
hierarchically summarized networks (/24 (site) -> /20 (region) -> /16
(core)). More specific, in the /16 backbone area, how many of these
PUAs will be floating around creating LSP LSDB update churns? How to
control the potentially exponential number of observed PUAs from

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
Hi Robert,

I agree with you that the operator problem space is not limited to 
multi-area/levels with IGP summarisation.

With the PUA/UPA proposals I get the feeling that the LSR WG is jumping into the 
deep end and is re-vectoring the IGP to carry opaque information not used for 
SPF/cSPF.
I believe we should be conservative about this if the LSR WG progresses with such a 
decision.

It could very well be that re-vectoring is the best solution, but I guess we 
need to agree first on understanding the operator problem space.

G/

From: Robert Raszuk 
Sent: Tuesday, June 14, 2022 11:51 AM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) 
Cc: lsr ; draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hello Gunter,

I agree with pretty much all you said except the conclusion - do nothing :).

To me if you need to accelerate connectivity restoration upon an unlikely event 
like a complete PE failure the right vehicle to signal this is within the 
service layer itself. Let's keep in mind that links do fail a lot in the 
networks - routers do not (or if they do, it is a multiple orders of magnitude 
less frequent event). Especially links on the PE-CE boundaries do fail a lot.

Removal of next hop reachability can be done with BGP and based on BGP native 
recursion will have the exact same effect as presented ideas. Moreover it will 
be stateful for the endpoints which again to me is a feature not a bug.

Some suggested defining a new extension in BGP to signal it even without using 
double recursion - well one of them has been proposed in the past - 
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt At that time 
the feedback received was that native BGP withdraws are fast enough so no need 
to bother. Well, those native withdrawals are working today, and some 
claim that specific implementations can withdraw RD:* when the PE hosting such 
RDs fails and RDs are allocated in a unique per-VRF fashion.

Then we have the DROID proposal which again may look like overkill for this 
very problem, but if you consider the bigger picture of what networks control 
plane pub-sub signalling needs, it establishes the foundation for such.

Many thanks,
Robert


On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) 
<gunter.van_de_ve...@nokia.com> wrote:
Hi All,

When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed summaries 
hide remote area network instabilities. It is one of the perceived benefits of 
using summaries. The place in the network where this hiding has the most 
impact upon convergence is at service nodes (PE's for L3/L2/transport) where 
due to the summarization it's difficult to detect that the transport tunnel 
end-point suddenly becomes unreachable. My concern however is whether it really 
is a problem that is worthy for LSR WG to solve.

To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09" is not a 
preferred solution due to the expectation that all nodes in an area must be 
upgraded to support the IGP capability. From this operational perspective the 
draft "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as 
only the A(S)BR's and particular PEs must be upgraded to support PUA's. I do 
have concerns about the number of PUA advertisements in hierarchically 
summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More specifically, 
in the /16 backbone area, how many of these PUAs will be floating around 
creating LSP LSDB update churn? How to control the potentially exponential 
number of observed PUAs from floating everywhere? (Will this lead to OSPF-type 
NSSA areas where areas will be purged of these PUAs for scaling stability?)
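
To put numbers on the /24 -> /20 -> /16 hierarchy, the following sketch counts how many more-specific prefixes can sit behind a single summary, which is an upper bound on the unreachable announcements that one summary could ever give rise to. The 10.0.0.0/16 block and the function name are illustrative assumptions only.

```python
import ipaddress


def component_count(summary: str, component_len: int) -> int:
    """Number of prefixes of length component_len covered by the summary."""
    net = ipaddress.ip_network(summary)
    if component_len < net.prefixlen:
        raise ValueError("component prefix must not be shorter than the summary")
    return 2 ** (component_len - net.prefixlen)


core = "10.0.0.0/16"  # hypothetical core summary
regions_per_core = component_count(core, 20)
sites_per_core = component_count(core, 24)
hosts_per_core = component_count(core, 32)
print(regions_per_core, sites_per_core, hosts_per_core)
```

How many of these are ever announced at once depends on how many components fail simultaneously, which is the point debated in this thread.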

Long story short, should we not take a step back and re-think this identified 
problem space? Is the proposed solution space not more evil than the problem 
space? We do summarization because it brings stability and reduces the number of 
link state updates within an area. And now with PUA we re-introduce additional 
link state updates (PUAs), we blow up the LSDB with information opaque to SPF 
best-path calculation. In addition there is a suggestion of new state-machinery 
to track the IGP reachability of 'protected' prefixes and there is maybe a desire 
to contain or filter updates across inter-area boundaries. And finally, how will 
we represent and track PUA in the RTM?

What is wrong with simply not doing summaries and forgetting about these PUAs to 
punch holes in the summary prefixes? This worked very well during the last two 
decennia. Are we not over-engineering with PUAs?

G/

___
Lsr mailing list
Lsr@ietf.org
htt

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
Hi Aijun,

Thanks for sharing your thoughts.
I agree that the observed problem is valid and is service impacting for 
operators.

It is wise to be conservative about using the IGP as an API to advertise opaque 
properties. The PUA/UPA have nothing to do with calculating SPF/cSPF.

Maybe we should first try to understand and agree on the full problem space 
before we directly jump into IGP encodings and risk over-engineering a solution?

G/




From: Aijun Wang 
Sent: Tuesday, June 14, 2022 12:27 PM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) 
Cc: lsr ; draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hi, Gunter:

Let me try to answer some of your concerns.

The reason we prefer the Summary+PUA/UPA solution is that node failure (the 
main scenario we focus on now) is a rare event in the network. An 
unreachability-triggered mechanism is therefore more efficient than advertising 
all of the nodes' reachable addresses. This point has been discussed on the 
mailing list in the past.

In 
https://datatracker.ietf.org/doc/html/draft-wang-lsr-prefix-unreachable-annoucement-09#section-8
we have illustrated how to control the advertisement of PUA messages on the 
ABR. If this doesn't settle your concerns, we can consider more policy on the ABR.

Regarding the tracking and representation of PUA in the RTM, we proposed in an 
earlier version of this draft to install a black hole route for the specific 
detailed prefix.

The reason that PUA requires the routers within an area to be upgraded is to 
avoid situations in which a router doesn't recognize the PUA message and 
misbehaves. We are considering the convergence of the PUA/UPA solutions, which 
may relax such requirements during deployment.


Aijun Wang
China Telecom


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
Hi Peter, All,

From a BGP perspective (PE service nodes) the event detection when transport 
tunnel end-point suddenly becomes unreachable is an operational problem. I 
think we all agree.
This problem exists in any multi-domain network, and is not limited to a 
multi-area/level IGP with summarization. Hence my doubts that simple encodings 
using the IGP as API for unreachability signaling is an optimal solution.  

Churning the LSDB for these things doesn't seem right. Would this mean that we 
hack the IGP implementation so we don't trigger SPFs on rx of these updates? 
Another concern is how we hook into BGP sideways to update it. Typically a 
router just looks at RTM and tunnel-tables for reachability. Now it would have 
to check a separate bypass-list all the time. 
What about the pseudo-state? On startup I would imagine we would have to 
originate this PUA until a certain point?

Regarding installing the PUA route as a blackhole route: it does not seem to 
be an option, because resolution of BGP next-hops with blackhole /32 routes 
has to continue to mean “drop” matching traffic, given the widespread way 
this is used for DDoS protection. So there is a need for another “install” 
type for the “unreachable” IGP prefix, which does not exist yet.

To make IGP based Prefix-unreachability-signal successful seems not a trivial 
task pe-to-pe, and involves more than simplistic dumping of opaque link-state 
messages into IGP and to re-vector interior routing as an API. I'm a bit 
tormented regarding the potential evil caused to IGP for signaling 
prefix-unreachability. It may not be worth the effort. Especially when 
realizing that the problem space is not limited to multi-area/level 
summarization but instead exists in any multi-domain network. 

Maybe IETF should consider looking at the bigger picture, at service level, and 
document a full service level solution framework instead of looking only at IGP 
in atomic fashion.

G/

-Original Message-
From: Peter Psenak  
Sent: Tuesday, June 14, 2022 5:46 PM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) ; 
lsr 
Cc: draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: Thoughts about PUAs - are we not over-engineering?

Hi Gunter,

please see inline:

On 14/06/2022 10:59, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:
> Hi All,
> 
> When reading both proposals about PUA's:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
> 
> The identified problem space seems a correct observation, and indeed 
> summaries hide remote area network instabilities. It is one of the perceived 
> benefits of using summaries. The place in the network where this hiding takes 
> the most impact upon convergence is at service nodes (PE's for 
> L3/L2/transport) where due to the summarization its difficult to detect that 
> the transport tunnel end-point suddenly becomes unreachable. My concern 
> however is if it really is a problem that is worthy for LSR WG to solve.

the request to address the problem is coming from the field. The scale of the 
networks in the field is growing significantly and the summarization is being 
implemented to keep the prefix scale under control.


> 
> To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09" is 
> not a preferred solution due to the expectation that all nodes in an 
> area must be upgraded to support the IGP capability. From this 
> operational perspective the draft 
> "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as 
> only the A(S)BR's and particular PEs must be upgraded to support 
> PUA's. I do have concerns about the number of PUA advertisements in 
> hierarchically summarized networks (/24 (site) -> /20 (region) -> /16 
> (core)). More specific, in the /16 backbone area, how many of these 
> PUAs will be floating around creating LSP LSDB update churns? How to 
> control the potentially exponential number of observed PUAs from 
> floating everywhere? (will this lead to OSPF type NSSA areas where 
> areas will be purged from these PUAs for scaling stability?)

A node going down is a rare event. The expected number of UPAs at any given time 
is very small. Implementations can limit the number of UPAs on the ABR/ASBR in 
case of a catastrophic event, in which case the UPAs would hardly help anyway.

> 
> Long story short, should we not take a step back and re-think this identified 
> problem space? Is the proposed solution space not more evil as the problem 
> space? We do summarization because it brings stability and reduce the number 
> of link state updates within an area. And now with PUA we re-introduce 
> additional link state updates (PUAs), we blow up the LSDB with information 
> opaque to SPF best-path calculation. In addition there is suggestion of new 
> state-machinery to track the igp reachability of 'protected' prefixes and 
> there is maybe desire to contain or 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Aijun Wang
Hi, Robert:

Agreed. The potential uses of PUA/UPA cover not only PE routers (for BGP PIC) 
but also prevalent tunnel technologies (GRE/SRv6).
All of these nodes are important, and we can’t punch so many holes in the 
summary range.

Aijun Wang
China Telecom

> On Jun 14, 2022, at 22:43, Robert Raszuk  wrote:
> 
> 
> Acee, 
> 
> > Note that any good implementation will allow one to punch holes in their 
> > area ranges so that critical prefixes are advertised or
> 
> Every PE address is critical. The story that one PE can be more important 
> than any other is just to mislead you at best. 
> 
> And we are (I hope) scoped the discussion to summaries. 
> 
> I realize  PUE also wants to cover P failures so in this case each P is also 
> equally important. 
> 
> Thx,
> R,
> 
> 
>> On Tue, Jun 14, 2022 at 3:57 PM Acee Lindem (acee)  wrote:
>> Speaking as WG member:
>> 
>>  
>> 
>> From: Lsr  on behalf of Robert Raszuk 
>> 
>> Date: Tuesday, June 14, 2022 at 9:27 AM
>> To: Christian Hopps 
>> Cc: Gunter Van de Velde , lsr , 
>> "draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" 
>> , 
>> draft-wang-lsr-prefix-unreachable-annoucement 
>> 
>> Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>> 
>>  
>> 
>> All,
>> 
>>  
>> 
>> > What is wrong with simply not doing summaries
>> 
>>  
>> 
>> What's wrong is that you are reaching the scaling issue much sooner than 
>> when you inject summaries. 
>> 
>>  
>> 
>> Note that any good implementation will allow one to punch holes in their 
>> area ranges so that critical prefixes are advertised or included in the 
>> range existence criteria.
>> 
>>  
>> 
>> Thanks,
>> 
>> Acee
>> 
>>  
>> 
>>  
>> 
>> Note that the number of those host routes is flooded irrespective of the 
>> actual need everywhere based on the sick assumption that perhaps they may be 
>> needed there. There is no today to the best of my knowledge controlled 
>> leaking to only subset to what is needed. 
>> 
>>  
>> 
>> But this is not the main worry. Main worry is that in redundant networks you 
>> are seeing many copies of the very same route being flooded all over the 
>> place. So in a not so big 1000 node network the number of host routes may 
>> exceed 8000 easily.
>> 
>>  
>> 
>> Sure when things are stable all is cool. But we should prepare for the 
>> worst, not the best. In fact, the ability to encapsulate to an aggregate 
>> switch IP (GRE or UDP) or nowadays SRv6 has been one of the strongest 
>> advantages. 
>> 
>>  
>> 
>> So as started before the problem does exist. Neither PULSE nor PUE solve it 
>> which are both limited to PE failures detection which is not enough (maybe 
>> even not worth). But PE-CE failures need to be signalled in the case of 
>> injecting summaries. Maybe as I said in previous msg just BGP withdrawal is 
>> fine. If not we should seek a solution which addresses the real problem, not 
>> an infrequent one. 
>> 
>>  
>> 
>> Best,
>> 
>> R.
>> 
>>  
>> 
>>  
>> 
>>  
>> 
>> On Tue, Jun 14, 2022 at 2:51 PM Christian Hopps  wrote:
>> 
>> 
>> 
>> > On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp) 
>> >  wrote:
>> > 
>> > What is wrong with simply not doing summaries and forget about these PUAs 
>> > to pinch holes in the summary prefixes? this worked very well during last 
>> > two decennia. Are we not over-engineering with PUAs?
>> 
>> 100% yes, IMO.
>> 
>> Thanks,
>> Chris.
>> [as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Peter Psenak

Hi Gunter,

please see inline:

On 14/06/2022 10:59, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

> Hi All,
>
> When reading both proposals about PUA's:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
>
> The identified problem space seems a correct observation, and indeed summaries
> hide remote area network instabilities. It is one of the perceived benefits of
> using summaries. The place in the network where this hiding has the most
> impact upon convergence is at service nodes (PE's for L3/L2/transport) where
> due to the summarization it's difficult to detect that the transport tunnel
> end-point suddenly becomes unreachable. My concern however is whether it really
> is a problem that is worthy for LSR WG to solve.


the request to address the problem is coming from the field. The scale 
of the networks in the field is growing significantly and the 
summarization is being implemented to keep the prefix scale under control.





> To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09" is not a
> preferred solution due to the expectation that all nodes in an area must be
> upgraded to support the IGP capability. From this operational perspective the
> draft "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as
> only the A(S)BR's and particular PEs must be upgraded to support PUA's. I do
> have concerns about the number of PUA advertisements in hierarchically
> summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More
> specifically, in the /16 backbone area, how many of these PUAs will be floating
> around creating LSP LSDB update churn? How to control the potentially
> exponential number of observed PUAs from floating everywhere? (Will this lead
> to OSPF-type NSSA areas where areas will be purged of these PUAs for scaling
> stability?)


A node going down is a rare event. The expected number of UPAs at any 
given time is very small. Implementations can limit the number of UPAs 
on the ABR/ASBR in case of a catastrophic event, in which case the UPAs 
would hardly help anyway.
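
One way such a limit could look is a simple cap on concurrently active announcements. This is a sketch under assumptions, not the behaviour of any particular implementation: the UpaLimiter class, its method names, and the cap value are hypothetical.

```python
class UpaLimiter:
    """Hypothetical cap on concurrently active UPAs at one ABR/ASBR."""

    def __init__(self, max_active: int):
        self.max_active = max_active
        self.active = set()

    def try_originate(self, prefix: str) -> bool:
        """Return True if a UPA for this prefix is (or becomes) active."""
        if prefix in self.active:
            return True
        if len(self.active) >= self.max_active:
            # Catastrophic event: too many failures at once, suppress further UPAs.
            return False
        self.active.add(prefix)
        return True

    def withdraw(self, prefix: str) -> None:
        """Stop announcing the prefix, for example once the UPA lifetime ends."""
        self.active.discard(prefix)


limiter = UpaLimiter(max_active=2)
results = [limiter.try_originate(p) for p in ("10.1.1.1/32", "10.1.1.2/32", "10.1.1.3/32")]
print(results)
```

Once the cap is reached, remote nodes simply fall back to whatever convergence they had without UPAs.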




> Long story short, should we not take a step back and re-think this identified
> problem space? Is the proposed solution space not more evil than the problem
> space? We do summarization because it brings stability and reduces the number
> of link state updates within an area. And now with PUA we re-introduce
> additional link state updates (PUAs), we blow up the LSDB with information
> opaque to SPF best-path calculation. In addition there is a suggestion of new
> state-machinery to track the IGP reachability of 'protected' prefixes and
> there is maybe a desire to contain or filter updates across inter-area
> boundaries. And finally, how will we represent and track PUA in the RTM?


the problem space is valid, as confirmed by the field. As described 
above, the number of UPAs will be low, so there is no danger of 
defeating the purpose of the summarization.




> What is wrong with simply not doing summaries and forgetting about these PUAs
> to punch holes in the summary prefixes? This worked very well during the last
> two decennia. Are we not over-engineering with PUAs?


it's the scale of the current networks, which is growing exponentially, 
which demands the use of the summarization.



thanks,
Peter



G/





Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Robert Raszuk
Acee,

> Note that any good implementation will allow one to punch holes in their
> area ranges so that critical prefixes are advertised or

Every PE address is critical. The story that one PE can be more important
than any other is misleading at best.

And we have (I hope) scoped the discussion to summaries.

I realize PUE also wants to cover P failures, so in this case each P is
also equally important.

Thx,
R,


On Tue, Jun 14, 2022 at 3:57 PM Acee Lindem (acee)  wrote:

> Speaking as WG member:
>
>
>
> *From: *Lsr  on behalf of Robert Raszuk <
> rob...@raszuk.net>
> *Date: *Tuesday, June 14, 2022 at 9:27 AM
> *To: *Christian Hopps 
> *Cc: *Gunter Van de Velde , lsr <
> lsr@ietf.org>, "draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" <
> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org>,
> draft-wang-lsr-prefix-unreachable-annoucement <
> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>
> *Subject: *Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>
>
>
> All,
>
>
>
> > What is wrong with simply not doing summaries
>
>
>
> What's wrong is that you are reaching the scaling issue much sooner than
> when you inject summaries.
>
>
>
> Note that any good implementation will allow one to punch holes in their
> area ranges so that critical prefixes are advertised or included in the
> range existence criteria.
>
>
>
> Thanks,
>
> Acee
>
>
>
>
>
> Note that the number of those host routes is flooded irrespective of the
> actual need everywhere based on the sick assumption that perhaps they may
> be needed there. There is no today to the best of my knowledge controlled
> leaking to only subset to what is needed.
>
>
>
> But this is not the main worry. Main worry is that in redundant networks
> you are seeing many copies of the very same route being flooded all over
> the place. So in a not so big 1000 node network the number of host routes
> may exceed 8000 easily.
>
>
>
> Sure when things are stable all is cool. But we should prepare for the
> worst, not the best. In fact, the ability to encapsulate to an aggregate
> switch IP (GRE or UDP) or nowadays SRv6 has been one of the strongest
> advantages.
>
>
>
> So as started before the problem does exist. Neither PULSE nor PUE solve
> it which are both limited to PE failures detection which is not enough
> (maybe even not worth). But PE-CE failures need to be signalled in the case
> of injecting summaries. Maybe as I said in previous msg just BGP withdrawal
> is fine. If not we should seek a solution which addresses the real problem,
> not an infrequent one.
>
>
>
> Best,
>
> R.
>
>
>
>
>
>
>
> On Tue, Jun 14, 2022 at 2:51 PM Christian Hopps  wrote:
>
>
>
> > On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp) <
> > gunter.van_de_ve...@nokia.com> wrote:
> >
> > What is wrong with simply not doing summaries and forget about these
> > PUAs to pinch holes in the summary prefixes? this worked very well during
> > last two decennia. Are we not over-engineering with PUAs?
>
> 100% yes, IMO.
>
> Thanks,
> Chris.
> [as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Acee Lindem (acee)
Speaking as WG member:

From: Lsr  on behalf of Robert Raszuk 
Date: Tuesday, June 14, 2022 at 9:27 AM
To: Christian Hopps 
Cc: Gunter Van de Velde , lsr , 
"draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" 
, 
draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

All,

> What is wrong with simply not doing summaries

What's wrong is that you are reaching the scaling issue much sooner than when 
you inject summaries.

Note that any good implementation will allow one to punch holes in their area 
ranges so that critical prefixes are advertised or included in the range 
existence criteria.
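
A small sketch of what punching a hole in an area range amounts to: the ABR keeps advertising the range and additionally leaks the prefixes marked as critical, so that longest-prefix match and withdrawal of the specific work as usual. The function name, the notion of a "critical" set, and the addresses are illustrative assumptions, not a vendor configuration model.

```python
import ipaddress


def abr_advertisements(area_range: str, area_prefixes: list, critical: set) -> list:
    """Return what the ABR advertises: the range plus critical specifics inside it."""
    rng = ipaddress.ip_network(area_range)
    covered = [p for p in area_prefixes if ipaddress.ip_network(p).subnet_of(rng)]
    adverts = []
    if covered:
        # The range is advertised while at least one component exists.
        adverts.append(area_range)
    # The "holes": critical components are advertised as specifics too.
    adverts.extend(p for p in covered if p in critical)
    return adverts


adverts = abr_advertisements(
    "10.1.0.0/16",
    ["10.1.1.1/32", "10.1.1.2/32"],
    critical={"10.1.1.1/32"},
)
print(adverts)
```

The objection raised elsewhere in the thread is that if every PE loopback is critical, the critical set grows to cover all of them and the summary no longer saves anything.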

Thanks,
Acee


Note that the number of those host routes is flooded irrespective of the actual 
need everywhere based on the sick assumption that perhaps they may be needed 
there. There is no today to the best of my knowledge controlled leaking to only 
subset to what is needed.

But this is not the main worry. Main worry is that in redundant networks you 
are seeing many copies of the very same route being flooded all over the place. 
So in a not so big 1000 node network the number of host routes may exceed 8000 
easily.

Sure when things are stable all is cool. But we should prepare for the worst, 
not the best. In fact, the ability to encapsulate to an aggregate switch IP 
(GRE or UDP) or nowadays SRv6 has been one of the strongest advantages.

So as started before the problem does exist. Neither PULSE nor PUE solve it 
which are both limited to PE failures detection which is not enough (maybe even 
not worth). But PE-CE failures need to be signalled in the case of injecting 
summaries. Maybe as I said in previous msg just BGP withdrawal is fine. If not 
we should seek a solution which addresses the real problem, not an infrequent 
one.

Best,
R.



On Tue, Jun 14, 2022 at 2:51 PM Christian Hopps <cho...@chopps.org> wrote:


> On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp) 
> <gunter.van_de_ve...@nokia.com> wrote:
>
> What is wrong with simply not doing summaries and forget about these PUAs to 
> pinch holes in the summary prefixes? this worked very well during last two 
> decennia. Are we not over-engineering with PUAs?

100% yes, IMO.

Thanks,
Chris.
[as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Robert Raszuk
All,

> What is wrong with simply not doing summaries

What's wrong is that you are reaching the scaling issue much sooner than
when you inject summaries.

Note that those host routes are flooded everywhere irrespective of the
actual need, based on the sick assumption that perhaps they may be needed
there. To the best of my knowledge, there is no controlled leaking today
of only the subset that is needed.

But this is not the main worry. The main worry is that in redundant
networks you see many copies of the very same route being flooded all over
the place. So in a not-so-big 1000-node network the number of host routes
may easily exceed 8000.
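
A back-of-the-envelope sketch of that count, assuming one host route (loopback) per node and a hypothetical redundancy factor of 8 copies per route. The factor, the area count, and the ABR count are illustrative figures, not numbers given in this thread:

```python
def flooded_host_routes(nodes, copies_per_route):
    """Host-route advertisements seen when every node's loopback is flooded
    and each one arrives in several redundant copies."""
    return nodes * copies_per_route


def flooded_summaries(areas, abrs_per_area):
    """Advertisements seen when each area is instead represented by a single
    summary originated by each of its ABRs."""
    return areas * abrs_per_area


# 1000 nodes with an assumed 8 redundant copies of each host route.
print(flooded_host_routes(1000, 8))  # 8000
# The same network summarized as an assumed 10 areas with 2 ABRs each.
print(flooded_summaries(10, 2))      # 20
```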

Sure when things are stable all is cool. But we should prepare for the
worst, not the best. In fact, the ability to encapsulate to an aggregate
switch IP (GRE or UDP) or nowadays SRv6 has been one of the strongest
advantages.

So as stated before, the problem does exist. Neither PULSE nor PUE solves
it; both are limited to PE failure detection, which is not enough (maybe
not even worth it). But PE-CE failures need to be signalled when injecting
summaries. Maybe, as I said in a previous msg, just BGP withdrawal is
fine. If not, we should seek a solution which addresses the real problem,
not an infrequent one.

Best,
R.



On Tue, Jun 14, 2022 at 2:51 PM Christian Hopps wrote:

>
>
> > On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp) <
> gunter.van_de_ve...@nokia.com> wrote:
> >
> > What is wrong with simply not doing summaries and forgetting about
> these PUAs to punch holes in the summary prefixes? This worked very well
> during the last two decades. Are we not over-engineering with PUAs?
>
> 100% yes, IMO.
>
> Thanks,
> Chris.
> [as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Christian Hopps



> On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp)
> wrote:
>
> What is wrong with simply not doing summaries and forgetting about these
> PUAs to punch holes in the summary prefixes? This worked very well during
> the last two decades. Are we not over-engineering with PUAs?

100% yes, IMO.

Thanks,
Chris.
[as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Aijun Wang
Hi, Gunter:

Let me try to answer some of your concerns.

The reason we prefer the Summary+PUA/UPA solution is that node failure (the
main scenario we focus on now) is a rare event in the network. A mechanism
triggered by the unreachable event is therefore more efficient than
advertising all of the nodes' reachable addresses. This point has been
discussed on the mailing list in the past.

In
https://datatracker.ietf.org/doc/html/draft-wang-lsr-prefix-unreachable-annoucement-09#section-8
we have illustrated how to control the advertisement of PUA messages on the
ABR. If this doesn’t settle your concerns, we can consider more policy on the
ABR.

Regarding the tracking and representation of PUA in the RTM: as proposed in an
earlier version of this draft, the approach is to install a black hole route
for the specific detailed prefix.
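
A minimal sketch of how such a black hole route interacts with the summary under longest-prefix match. The table layout, the `lookup` helper, and the 10.2.0.0/16 addressing are hypothetical and only illustrate the idea; this is not the draft's specified procedure:

```python
import ipaddress


def lookup(rib, destination):
    """Return the action of the longest matching prefix, or None if no match."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), action)
        for prefix, action in rib.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest-prefix match: the most specific covering prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Before the PUA: only the summary from the ABR is present.
rib = {"10.2.0.0/16": "via ABR"}
before = lookup(rib, "10.2.0.7")

# After the PUA for 10.2.0.7/32: a more-specific black hole route is installed.
rib["10.2.0.7/32"] = "blackhole"
after = lookup(rib, "10.2.0.7")
neighbor = lookup(rib, "10.2.0.8")  # other addresses still follow the summary

print(before, after, neighbor)
```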

The reason PUA requires the routers within an area to be upgraded is to avoid
situations where a router doesn’t recognize the PUA message and misbehaves.
We are considering the convergence of the PUA/UPA solutions, which may relax
such requirements during deployment.


Aijun Wang
China Telecom

> On Jun 14, 2022, at 16:59, Van De Velde, Gunter (Nokia - BE/Antwerp)
> wrote:
> Hi All,
>
> When reading both proposals about PUAs:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
>
> The identified problem space seems a correct observation, and indeed
> summaries hide remote area network instabilities. It is one of the perceived
> benefits of using summaries. The place in the network where this hiding has
> the most impact on convergence is at service nodes (PEs for
> L3/L2/transport), where due to the summarization it's difficult to detect
> that the transport tunnel end-point suddenly becomes unreachable. My
> concern, however, is whether it really is a problem worth solving in the
> LSR WG.
>
> To me, "draft-wang-lsr-prefix-unreachable-annoucement-09" is not a preferred
> solution due to the expectation that all nodes in an area must be upgraded
> to support the IGP capability. From this operational perspective,
> "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as only
> the A(S)BRs and particular PEs must be upgraded to support PUAs. I do have
> concerns about the number of PUA advertisements in hierarchically
> summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More
> specifically, in the /16 backbone area, how many of these PUAs will be
> floating around creating LSP/LSDB update churn? How do we keep the
> potentially exponential number of observed PUAs from floating everywhere?
> (Will this lead to OSPF NSSA-type areas, where areas will be purged of
> these PUAs for scaling stability?)
>
> Long story short, should we not take a step back and re-think this
> identified problem space? Is the proposed solution space not more evil than
> the problem space? We do summarization because it brings stability and
> reduces the number of link state updates within an area. And now with PUA
> we re-introduce additional link state updates (PUAs), and we blow up the
> LSDB with information opaque to SPF best-path calculation. In addition,
> there is a suggestion of new state machinery to track the IGP reachability
> of 'protected' prefixes, and there may be a desire to contain or filter
> updates across inter-area boundaries. And finally, how will we represent
> and track PUA in the RTM?
>
> What is wrong with simply not doing summaries and forgetting about these
> PUAs to punch holes in the summary prefixes? This worked very well during
> the last two decades. Are we not over-engineering with PUAs?
> 
> G/
> 


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Robert Raszuk
Hello Gunter,

I agree with pretty much all you said except the conclusion - do nothing
:).

To me, if you need to accelerate connectivity restoration upon an unlikely
event like a complete PE failure, the right vehicle to signal this is
within the service layer itself. Let's keep in mind that links do fail a
lot in networks - routers do not (or if they do, it is an event multiple
orders of magnitude less frequent). Links on the PE-CE boundaries
especially do fail a lot.

Removal of next hop reachability can be done with BGP and, based on BGP
native recursion, will have the exact same effect as the presented ideas.
Moreover, it will be stateful for the endpoints, which again to me is a
feature, not a bug.

Some suggested defining a new extension in BGP to signal it even without
using double recursion - well, one of them has been proposed in the past -
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt At that
time the feedback received was that native BGP withdraws are fast enough so
no need to bother. Well, those native withdrawals work today as well; some
claim that specific implementations can withdraw RD:* when the PE hosting
such RDs fails and RDs are allocated in a unique per-VRF fashion.

Then we have the DROID proposal, which again may look like overkill for this
very problem, but if you consider the bigger picture of what a network's
control plane pub-sub signalling needs, it establishes the foundation for
such.

Many thanks,
Robert


On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) <
gunter.van_de_ve...@nokia.com> wrote:

> Hi All,
>
> When reading both proposals about PUAs:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
>
> The identified problem space seems a correct observation, and indeed
> summaries hide remote area network instabilities. It is one of the
> perceived benefits of using summaries. The place in the network where this
> hiding has the most impact on convergence is at service nodes (PEs for
> L3/L2/transport), where due to the summarization it's difficult to detect
> that the transport tunnel end-point suddenly becomes unreachable. My
> concern, however, is whether it really is a problem worth solving in the
> LSR WG.
>
> To me, "draft-wang-lsr-prefix-unreachable-annoucement-09" is not a
> preferred solution due to the expectation that all nodes in an area must
> be upgraded to support the IGP capability. From this operational
> perspective, "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more
> elegant, as only the A(S)BRs and particular PEs must be upgraded to
> support PUAs. I do have concerns about the number of PUA advertisements in
> hierarchically summarized networks (/24 (site) -> /20 (region) -> /16
> (core)). More specifically, in the /16 backbone area, how many of these
> PUAs will be floating around creating LSP/LSDB update churn? How do we
> keep the potentially exponential number of observed PUAs from floating
> everywhere? (Will this lead to OSPF NSSA-type areas, where areas will be
> purged of these PUAs for scaling stability?)
>
> Long story short, should we not take a step back and re-think this
> identified problem space? Is the proposed solution space not more evil than
> the problem space? We do summarization because it brings stability and
> reduces the number of link state updates within an area. And now with PUA
> we re-introduce additional link state updates (PUAs), and we blow up the
> LSDB with information opaque to SPF best-path calculation. In addition,
> there is a suggestion of new state machinery to track the IGP reachability
> of 'protected' prefixes, and there may be a desire to contain or filter
> updates across inter-area boundaries. And finally, how will we represent
> and track PUA in the RTM?
>
> What is wrong with simply not doing summaries and forgetting about these
> PUAs to punch holes in the summary prefixes? This worked very well during
> the last two decades. Are we not over-engineering with PUAs?
>
> G/
>


[Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Van De Velde, Gunter (Nokia - BE/Antwerp)
Hi All,

When reading both proposals about PUAs:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed summaries
hide remote area network instabilities. It is one of the perceived benefits of
using summaries. The place in the network where this hiding has the most
impact on convergence is at service nodes (PEs for L3/L2/transport), where
due to the summarization it's difficult to detect that the transport tunnel
end-point suddenly becomes unreachable. My concern, however, is whether it
really is a problem worth solving in the LSR WG.
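
To make the hiding effect concrete, here is a minimal sketch under stated assumptions: the helper names and the 10.2.0.0/16 addressing are hypothetical, and the ABR is assumed to advertise the summary as long as any loopback inside it is still reachable. A remote PE that sees only the summary keeps resolving the tunnel end-point after that end-point has failed:

```python
import ipaddress


def remote_view(area_range, reachable_loopbacks):
    """Prefixes a remote PE learns: the summary, while any component is up."""
    rng = ipaddress.ip_network(area_range)
    if any(ipaddress.ip_address(lb) in rng for lb in reachable_loopbacks):
        return [area_range]
    return []


def next_hop_resolves(next_hop, table):
    """True if some prefix in the table covers the tunnel end-point."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in ipaddress.ip_network(prefix) for prefix in table)


# Both PEs in the remote area are up.
before = remote_view("10.2.0.0/16", ["10.2.0.7", "10.2.0.8"])
# PE 10.2.0.7 fails; 10.2.0.8 keeps the summary alive.
after = remote_view("10.2.0.0/16", ["10.2.0.8"])

# The remote PE's view is unchanged, so the failed end-point still resolves.
print(before == after)
print(next_hop_resolves("10.2.0.7", after))
```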

To me, "draft-wang-lsr-prefix-unreachable-annoucement-09" is not a preferred
solution due to the expectation that all nodes in an area must be upgraded to
support the IGP capability. From this operational perspective,
"draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as only the
A(S)BRs and particular PEs must be upgraded to support PUAs. I do have
concerns about the number of PUA advertisements in hierarchically summarized
networks (/24 (site) -> /20 (region) -> /16 (core)). More specifically, in the
/16 backbone area, how many of these PUAs will be floating around creating
LSP/LSDB update churn? How do we keep the potentially exponential number of
observed PUAs from floating everywhere? (Will this lead to OSPF NSSA-type
areas, where areas will be purged of these PUAs for scaling stability?)
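
As a rough upper bound on that growth, the sketch below counts how many distinct /32 more-specifics fit under each summary level. It assumes PUAs are per-host (/32) and uses hypothetical 10.0.0.0 addressing; real networks will of course use only a fraction of the address space:

```python
import ipaddress


def max_more_specifics(summary, specific_len=32):
    """Upper bound on distinct more-specific prefixes of length specific_len
    that a single summary can cover."""
    net = ipaddress.ip_network(summary)
    return 2 ** (specific_len - net.prefixlen)


levels = [
    ("site", "10.0.0.0/24"),
    ("region", "10.0.0.0/20"),
    ("core", "10.0.0.0/16"),
]
for name, summary in levels:
    print(name, summary, max_more_specifics(summary))

# Number of region and site summaries that nest under the core summary.
core = ipaddress.ip_network("10.0.0.0/16")
regions_per_core = len(list(core.subnets(new_prefix=20)))
sites_per_core = len(list(core.subnets(new_prefix=24)))
print(regions_per_core, sites_per_core)
```

The bound doubles with every bit of summarization, which is the sense in which the number of PUAs visible in the /16 backbone can grow exponentially with the summary depth.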

Long story short, should we not take a step back and re-think this identified
problem space? Is the proposed solution space not more evil than the problem
space? We do summarization because it brings stability and reduces the number
of link state updates within an area. And now with PUA we re-introduce
additional link state updates (PUAs), and we blow up the LSDB with information
opaque to SPF best-path calculation. In addition, there is a suggestion of new
state machinery to track the IGP reachability of 'protected' prefixes, and
there may be a desire to contain or filter updates across inter-area
boundaries. And finally, how will we represent and track PUA in the RTM?

What is wrong with simply not doing summaries and forgetting about these PUAs
to punch holes in the summary prefixes? This worked very well during the last
two decades. Are we not over-engineering with PUAs?

G/
