Re: [bess] Questions on draft-sajassi-bess-evpn-ip-aliasing-07

Jorge Rabadan (Nokia) Fri, 20 Oct 2023 11:51:28 -0700

Hi Jeffrey,

Thanks for reviewing.
Please see my comments inline, marked with [Jorge]. All of them are addressed 
in version 08.

Thx
Jorge

From: Jeffrey (Zhaohui) Zhang <[email protected]>
Date: Tuesday, September 26, 2023 at 1:55 PM
To: 'Ali Sajassi (sajassi)' <[email protected]>, John E Drake 
<[email protected]>, Jorge Rabadan (Nokia) <[email protected]>
Cc: 'BESS' <[email protected]>
Subject: Questions on draft-sajassi-bess-evpn-ip-aliasing-07

Hi,

To prepare for the adoption call, I read the draft and have some questions and 
nit comments. If a point has been discussed before, a link to the email archive 
is appreciated.

   ... If H1 is
   locally learned only at one of the multi-homing PEs, PE1 or PE2, due
   to LAG hashing, PE3 will not be able to build an IP ECMP list for the
   H1 host route.

Perhaps remove the red text so that it is clear that "due to LAG hashing" is 
for "H1 is locally learned only …" rather than for "PE3 will not be able to …".
[Jorge] ok, done.
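
A minimal sketch of the problem quoted above, seen from PE3. It assumes a 
deliberately simplified model in which a PE becomes an eligible next hop for a 
host route as soon as it has advertised an IP A-D route for the route's ESI. 
The HostRoute class, the ecmp_next_hops helper and the address 10.0.0.1 are 
hypothetical names used only for illustration; they are not taken from the 
draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRoute:
    """Simplified stand-in for a MAC/IP Advertisement route (hypothetical)."""
    ip: str
    esi: str       # "0" means the route is not tied to an Ethernet Segment
    next_hop: str  # the PE that advertised the route


def ecmp_next_hops(route, ad_routes_by_esi):
    """Return the sorted next-hop list a remote PE could install.

    ad_routes_by_esi maps an ESI to the set of PEs that advertised an
    IP A-D route for it (an assumed, simplified eligibility check).
    """
    next_hops = {route.next_hop}
    if route.esi != "0":
        next_hops |= ad_routes_by_esi.get(route.esi, set())
    return sorted(next_hops)


# H1 is learned only at PE1 because of LAG hashing.
h1 = HostRoute(ip="10.0.0.1", esi="ESI-1", next_hop="PE1")

without_aliasing = ecmp_next_hops(h1, {})
with_aliasing = ecmp_next_hops(h1, {"ESI-1": {"PE1", "PE2"}})

print(without_aliasing)  # ['PE1']
print(with_aliasing)     # ['PE1', 'PE2']
```

Without the A-D routes PE3 only sees PE1; with them, PE2 joins the list even 
though it never advertised H1 itself.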

For the following three subsections:

1.1.  Ethernet Segments for Host Routes in Symmetric IRB
   …
   With Asymmetric IRB [RFC9135], …

1.2.  Inter-subnet Forwarding for Prefix Routes in the Interface-less
      IP-VRF-to-IP-VRF Model

   In the Interface-less IP-VRF-to-IP-VRF model described in [RFC9136]
   …

1.3.  Ethernet Segments for Prefix routes in IP-VRF-to-IP-VRF use-cases

   This document also enables fast convergence and aliasing/backup path
   to be used even when the ESI is used exclusively as an L3 construct,
   in an Interface-less IP-VRF-to-IP-VRF scenario [RFC9136].

Do we need to discuss the asymmetric model? It seems to be irrelevant.
[Jorge] in the asymmetric model, the remote nodes are attached to the same 
broadcast domain as the multi-homing PEs. Hence, Aliasing/Backup functions are 
fully defined in the existing specs and you are right, we don’t need to specify 
new procedures. The asymmetric model would only be relevant for section 1.1, 
and that’s why section 1.1 describes how the existing procedures provide a 
solution for asymmetric IRB.

Both 1.2 and 1.3 mention “Interface-less IP-VRF-to-IP-VRF scenario” in RFC9136 
though they are about different scenarios. Perhaps the section titles could be 
more accurate – in fact, they could be more consistent with the a/b/c use cases 
preceding 1.1. The following is a suggestion:

1.1. MAC/IP routes with symmetric IRB
1.2. IP Prefix routes with interface-less model
1.3. IP Prefix routes with ESI being a pure L3 construct
[Jorge] it’s a fair point. I modified the titles, inspired by your suggestion, 
as follows:
“1.1.  Multi-Homing for MAC/IP Advertisement Routes in Symmetric IRB
1.2.  Multi-Homing for IP Prefix Routes in the Interface-less IP-VRF-to-IP-VRF 
Model
1.3.  Multi-Homing for IP Prefix routes with Layer 3 Ethernet Segments”

In 1.3.1:

   In these use-cases, sometimes the CE supports a single BGP session to
   one of the PEs (through which it advertises a number of IP Prefixes
   seating behind itself) and yet, it is desired that remote PEs can
   build an IP ECMP list or backup IP list including all the PEs multi-
   homed to the same CE.

I initially wondered how PE2 would know to forward traffic to the CE, since it 
does not learn the routes from the CE, until I realized that PE1 will 
re-advertise type-5 routes to every PE. I also see that this is explicitly 
mentioned in 4.2. It would be good to briefly mention it in 1.3.1 as well.

It’s also worth pointing out that both PE1 and PE2 can multi-path via each 
other.

[Jorge] ok, I added the following. Let me know if it helps. The multi-path bit 
across a local and a RT-5 is probably feasible if there are other CEs attached 
to PE2, but I believe it would complicate the use case.

“This document provides a solution so that PE3 considers PE2 as a next-hop in 
the IP ECMP list for CE1's prefixes, even if PE2 did not advertise the IP 
Prefix routes for those prefixes in the first place. The solution uses an ESI 
in the IP Prefix routes advertised from PE1 so that, when imported by PE2, PE2 
installs the route as local, since PE2 is also attached to the Ethernet Segment 
identified by the ESI.”
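
The import decision described in the added text can be sketched as follows. 
This is only an illustration under a simplifying assumption: the importing PE 
compares the ESI carried in the IP Prefix route with the set of Ethernet 
Segments it is attached to. The function name install_action and the 
"local"/"remote" labels are hypothetical.

```python
def install_action(route_esi, local_esis):
    """Decide how an importing PE installs an IP Prefix route (simplified).

    route_esi:  ESI carried in the received IP Prefix route ("0" if none).
    local_esis: set of ESIs for the Ethernet Segments this PE is attached to.
    """
    if route_esi != "0" and route_esi in local_esis:
        # The PE is attached to the same Ethernet Segment as the
        # advertising PE, so it can forward straight to the CE.
        return "local"
    # Otherwise the route is resolved towards the remote PE(s).
    return "remote"


pe2_action = install_action("ESI-1", {"ESI-1"})  # PE2 is attached to ES-1
pe3_action = install_action("ESI-1", set())      # PE3 is not

print(pe2_action)  # local
print(pe3_action)  # remote
```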

1.3.2 does not seem to be a different use case from 1.3.1. It can be viewed as 
a special case of 1.3.1 – PEC’s attachment to the ES is down. Perhaps fold 
1.3.2 into 1.3.1 as a special case?
[Jorge] that’s correct, I added the following text, let me know if it helps:

“There are two use cases analyzed and supported by this document:
IP Aliasing for EVPN IP Prefix routes
IP Aliasing in a Centralized Routing Model
Both use cases are resolved by the same procedures, and the scenario in Section 
1.3.2 can be considered a special case of Section 1.3.1.”

For the following:

4.1.2.  IP A-D per ES route and SRv6 Transport

   When an SRv6 transport is used, each IP A-D per ES route MUST carry
   an SRv6 L3 Service TLV within the BGP Prefix-SID attribute [RFC9252].
   The Service SID MUST be of value 0.  The SRv6 Endpoint Behavior
   SHOULD be one of these End.DT46, End.DT4, End.DT6, End.DX4, or
   End.DX6.

What is the purpose of the above?
[Jorge] A-D per ES routes carry a BGP encapsulation extended community in case 
of VXLAN, MPLS, etc, and an SRv6 Service TLV in case of SRv6, as also described 
in draft-trr-bess-bgp-srv6-args.

4.1.3.  IP A-D per ES route and ESI Label Extended Community

   Each IP A-D per ES route MUST be sent with the ESI Label extended
   community [I-D.ietf-bess-rfc7432bis].  The ESI Label field of the
   extended community SHOULD be set to zero when sending and MUST be
   ignored on reception.

I assume the purpose is to advertise flags – like whether it is all-active or 
single-active. Good to point that out.
[Jorge] ok, I added:
“(the flags in the ESI Label extended community are processed to determine if 
the Ethernet Segment works in all-active or single-active multi-homing mode).”
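
As a rough sketch of how a receiver might read those flags, the snippet below 
assumes the ESI Label extended community layout of RFC 7432 (type 0x06, 
sub-type 0x01, one Flags octet whose low-order bit is the Single-Active bit, 
two reserved octets and a three-octet label). The draft cites rfc7432bis, 
which may refine the Flags octet, so this is illustrative only. The function 
name redundancy_mode is hypothetical.

```python
def redundancy_mode(ext_community: bytes) -> str:
    """Return the multi-homing mode signalled in an ESI Label ext community.

    Assumes the RFC 7432 encoding: low-order bit of the Flags octet set
    means single-active, clear means all-active.
    """
    if len(ext_community) != 8:
        raise ValueError("an extended community is 8 octets long")
    ec_type, sub_type, flags = ext_community[0], ext_community[1], ext_community[2]
    if (ec_type, sub_type) != (0x06, 0x01):
        raise ValueError("not an ESI Label extended community")
    # Octets 5..7 carry the ESI Label. For IP A-D per ES routes the quoted
    # text says it is ignored on reception, so it is not inspected here.
    return "single-active" if flags & 0x01 else "all-active"


all_active = redundancy_mode(bytes([0x06, 0x01, 0x00, 0, 0, 0, 0, 0]))
single_active = redundancy_mode(bytes([0x06, 0x01, 0x01, 0, 0, 0, 0, 0]))

print(all_active)     # all-active
print(single_active)  # single-active
```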

4.3.  Handling Silent Host MAC/IP route for IP Aliasing
   …
   Thus to avoid blackholing, when PE2 detects loss of reachability to
   PE1, it should trigger ARP/ND requests for all remote IP prefixes
   received from PE1 across all affected IP-VRFs.  This will force host
   H1 to reply to the solicited ARP/ND messages from PE2 and refresh
   both MAC and IP for the corresponding host in its tables.

This section talks about the silent host MAC/IP route. I suppose there is no 
similar mechanism for Section 4.2 when there is only a single routing session 
from the CE to one of the PEs?
[Jorge] 4.2 talks about the synchronization of the IP routes on the PEs 
attached to the same ES, and 4.3 about the synchronization of the ARP/ND 
entries on the PEs. Unless a use case is explicitly mentioned, the sections 
from 2 onwards are relevant to all use cases (1.1, 1.2 and 1.3). I added a 
sentence at the end of section 2.
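
The ARP/ND refresh quoted from Section 4.3 could be sketched like this. The 
table layout (IP-VRF name, host IP, advertising PE), the VRF names and the 
helper probes_on_pe_loss are assumptions made for the example; the draft does 
not define such a structure.

```python
def probes_on_pe_loss(lost_pe, remote_hosts):
    """List the (ip_vrf, host_ip) pairs to re-resolve via ARP/ND.

    remote_hosts: iterable of (ip_vrf, host_ip, advertising_pe) tuples
    describing host entries learned from remote PEs (hypothetical layout).
    """
    return sorted(
        (vrf, ip) for vrf, ip, pe in remote_hosts if pe == lost_pe
    )


table = [
    ("vrf-red", "10.0.0.1", "PE1"),
    ("vrf-blue", "10.1.0.7", "PE1"),
    ("vrf-red", "10.0.0.9", "PE3"),
]

# PE2 detects loss of reachability to PE1 and probes only PE1's entries,
# across every affected IP-VRF.
probes = probes_on_pe_loss("PE1", table)
print(probes)  # [('vrf-blue', '10.1.0.7'), ('vrf-red', '10.0.0.1')]
```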

For the following:

  5.  Determining Reachability to Unicast IP Addresses

Perhaps change “IP Addresses” to “IP Destinations”?
[Jorge] ok, done.

For the following sections:

5.2.  Remote Learning

   The procedures for remote learning do not change from [RFC7432] or
   [RFC9136].

5.3.  Constructing the EVPN IP Routes

   The procedures for constructing MAC/IP Address or IP Prefix
   Advertisements do not change from [RFC7432] or [RFC9136].

5.3.1.  Route Resolution

5.3.1 is about Route Resolution on a receiving PE, while 5.3 is about 
constructing the routes on an originating PE. Seems that 5.3.1 should be moved 
out of 5.3.
[Jorge] ok, moved to its own section

What is the definition of “remote learning”? Both on a remote PE (e.g. PE3) and 
on a multi-homed PE (e.g., PE2 learning from PE1)? Both need to follow the 
updated route resolution procedures, so “The procedures for remote learning do 
not change from [RFC7432] or [RFC9136]”  does not seem right.
[Jorge] you’re right, it is confusing. It was meant to describe the process of 
importing routes, but it adds nothing except confusion. I removed this 
section.

What’s the difference between sections 7 and 8?

7.  Load Balancing of Unicast Packets

   The procedures for load balancing of Unicast Packets do not change
   from [RFC7432]

8.  IP Aliasing and Unequal ECMP for IP Prefix Routes

Both seem to be about Load Balancing.
[Jorge] ok, I put them into the same section.

For the following normalization rule:

      … If the
      ingress PE learns a prefix P via a non-reserved ESI RT-5 route
      with a weight (for which IP A-D per ES routes also signal a
      weight) and a zero ESI RT-5 that includes a weight, the ingress PE
      will consider all the PEs attached to the ES as a single PE when
      normalizing weights.

      As an example, consider PE1 and PE2 are attached to ES-1 and PE1
      advertises an RT-5 for prefix P with ESI-1 (and EVPN Link
      Bandwidth of 1).  Consider PE3 advertises an RT-5 for P with ESI=0
      and EVPN Link Bandwidth of 2.  If PE1 and PE2 advertise an EVPN
      Link Bandwidth of 1 and 2, respectively, in the IP A-D per ES
      routes for ES-1, an ingress PE4 SHOULD assign a normalized weight
      of 1 to ES-1 and a normalized weight of 2 to PE3.

What is the rationale for normalizing the weight of ES-1 to 1?
[Jorge] the ES represents a single CE, and the weight of the RT-5 with ESI-1 
influences the number of flows for that CE. So if the remote PE gets two 
RT-5s, RT-5 (weight=1) and RT-5 (weight=2), it should apply the weights based 
on those RT-5s.
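
A worked version of the example above, as a sketch. The first step follows the 
quoted rule: each non-zero ESI counts as one path group whose weight comes 
from the RT-5, which gives ES-1 a weight of 1 and PE3 a weight of 2. The 
second step, splitting the share of ES-1 between PE1 and PE2 in proportion to 
their IP A-D per ES weights, is an assumption made here for illustration and 
is not stated in the thread. The function name normalize is hypothetical.

```python
from fractions import Fraction


def normalize(rt5_routes, ad_weights):
    """Compute per-group weights and per-PE flow shares (simplified).

    rt5_routes: list of (pe, esi, weight) for one prefix; esi "0" = no ES.
    ad_weights: {esi: {pe: weight}} taken from the IP A-D per ES routes.
    """
    groups = {}
    for pe, esi, weight in rt5_routes:
        # All PEs on the same ES are treated as a single path group.
        name = pe if esi == "0" else esi
        groups[name] = weight

    total = sum(groups.values())
    shares = {}
    for name, weight in groups.items():
        group_share = Fraction(weight, total)
        # Assumption: inside an ES, split by the A-D per ES weights.
        members = ad_weights.get(name, {name: 1})
        member_total = sum(members.values())
        for pe, w in members.items():
            shares[pe] = group_share * Fraction(w, member_total)
    return groups, shares


groups, shares = normalize(
    [("PE1", "ESI-1", 1), ("PE3", "0", 2)],
    {"ESI-1": {"PE1": 1, "PE2": 2}},
)

print(groups)  # {'ESI-1': 1, 'PE3': 2}
for pe in sorted(shares):
    print(pe, shares[pe])  # PE1 1/9, PE2 2/9, PE3 2/3
```

Under these assumptions ES-1 as a whole receives one third of the flows and 
PE3 two thirds, which matches the normalized weights of 1 and 2 in the draft 
text.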

Thanks.
Jeffrey
