Hi Jeffrey,

Thanks for reviewing. Please see my comments in-line with [Jorge]. All those are addressed in version 08.

Thx
Jorge

From: Jeffrey (Zhaohui) Zhang <zzh...@juniper.net>
Date: Tuesday, September 26, 2023 at 1:55 PM
To: 'Ali Sajassi (sajassi)' <saja...@cisco.com>, John E Drake <jdr...@juniper.net>, Jorge Rabadan (Nokia) <jorge.raba...@nokia.com>
Cc: 'BESS' <bess@ietf.org>
Subject: Questions on draft-sajassi-bess-evpn-ip-aliasing-07

Hi,

To prepare for the adoption call, I read the draft and have some questions and nit comments. If a point has been discussed before, a link to the email archive is appreciated.

...

If H1 is locally learned only at one of the multi-homing PEs, PE1 or PE2, due to LAG hashing, PE3 will not be able to build an IP ECMP list for the H1 host route.

Perhaps remove the red text so that it is clear that "due to LAG hashing" is for "H1 is locally learned only …" rather than for "PE3 will not be able to …".

[Jorge] ok, done.

For the following three subsections:

1.1. Ethernet Segments for Host Routes in Symmetric IRB
…
With Asymmetric IRB [RFC9135], …

1.2. Inter-subnet Forwarding for Prefix Routes in the Interface-less IP-VRF-to-IP-VRF Model
In the Interface-less IP-VRF-to-IP-VRF model described in [RFC9136] …

1.3. Ethernet Segments for Prefix routes in IP-VRF-to-IP-VRF use-cases
This document also enables fast convergence and aliasing/backup path to be used even when the ESI is used exclusively as an L3 construct, in an Interface-less IP-VRF-to-IP-VRF scenario [RFC9136].

Do we need to discuss the asymmetric model? It seems to be irrelevant.

[Jorge] in the asymmetric model, the remote nodes are attached to the same broadcast domain as the multi-homing PEs. Hence, Aliasing/Backup functions are fully defined in the existing specs and you are right, we don’t need to specify new procedures.
The asymmetric model would only be relevant for section 1.1, and that’s why section 1.1 describes how the existing procedures provide a solution for asymmetric IRB.

Both 1.2 and 1.3 mention “Interface-less IP-VRF-to-IP-VRF scenario” in RFC9136 though they are about different scenarios. Perhaps the section titles could be more accurate – in fact, they could be more consistent with the a/b/c use cases preceding 1.1. The following is a suggestion:

1.1. MAC/IP routes with symmetric IRB
1.2. IP Prefix routes with interface-less model
1.3. IP Prefix routes with ESI being a pure L3 construct

[Jorge] it’s a fair point, I modified the titles, inspired by your suggestion, as follows:

“1.1. Multi-Homing for MAC/IP Advertisement Routes in Symmetric IRB
1.2. Multi-Homing for IP Prefix Routes in the Interface-less IP-VRF-to-IP-VRF Model
1.3. Multi-Homing for IP Prefix routes with Layer 3 Ethernet Segments”

In 1.3.1:

In these use-cases, sometimes the CE supports a single BGP session to one of the PEs (through which it advertises a number of IP Prefixes seating behind itself) and yet, it is desired that remote PEs can build an IP ECMP list or backup IP list including all the PEs multi-homed to the same CE.

I initially wondered how PE2 would know to forward traffic to the CE since it does not learn the routes from the CE, until it came to me that PE1 will re-advertise type-5 routes to every PE. I also see it is explicitly mentioned in 4.2. It would be good to briefly mention it in 1.3.1 as well. It’s also worth pointing out that both PE1 and PE2 can multi-path via each other.

[Jorge] ok, I added the following. Let me know if it helps. The multi-path bit across a local route and an RT-5 is probably feasible if there are other CEs attached to PE2, but I believe it would complicate the use case.
“This document provides a solution so that PE3 considers PE2 as a next-hop in the IP ECMP list for CE1's prefixes, even if PE2 did not advertise the IP Prefix routes for those prefixes in the first place. The solution uses an ESI in the IP Prefix routes advertised from PE1 so that, when imported by PE2, PE2 installs the route as local, since PE2 is also attached to the Ethernet Segment identified by the ESI.”

1.3.2 does not seem to be a different use case from 1.3.1. It can be viewed as a special case of 1.3.1 – PEC’s attachment to the ES is down. Perhaps fold 1.3.2 into 1.3.1 as a special case?

[Jorge] that’s correct, I added the following text, let me know if it helps:

“There are two use cases analyzed and supported by this document:

IP Aliasing for EVPN IP Prefix routes
IP Aliasing in a Centralized Routing Model

Both use cases are resolved by the same procedures, and the scenario in Section 1.3.2 can be considered a special case of Section 1.3.1.”

For the following:

4.1.2. IP A-D per ES route and SRv6 Transport

When an SRv6 transport is used, each IP A-D per ES route MUST carry an SRv6 L3 Service TLV within the BGP Prefix-SID attribute [RFC9252]. The Service SID MUST be of value 0. The SRv6 Endpoint Behavior SHOULD be one of these End.DT46, End.DT4, End.DT6, End.DX4, or End.DX6.

What is the purpose of the above?

[Jorge] A-D per ES routes carry a BGP encapsulation extended community in case of VXLAN, MPLS, etc, and an SRv6 Service TLV in case of SRv6, as also described in draft-trr-bess-bgp-srv6-args.

4.1.3. IP A-D per ES route and ESI Label Extended Community

Each IP A-D per ES route MUST be sent with the ESI Label extended community [I-D.ietf-bess-rfc7432bis]. The ESI Label field of the extended community SHOULD be set to zero when sending and MUST be ignored on reception.

I assume the purpose is to advertise flags – like whether it is all-active or single-active. Good to point that out.
[Jorge] ok, I added: “(the flags in the ESI Label extended community are processed to determine if the Ethernet Segment works in all-active or single-active multi-homing mode).”

4.3. Handling Silent Host MAC/IP route for IP Aliasing
…
Thus to avoid blackholing, when PE2 detects loss of reachability to PE1, it should trigger ARP/ND requests for all remote IP prefixes received from PE1 across all affected IP-VRFs. This will force host H1 to reply to the solicited ARP/ND messages from PE2 and refresh both MAC and IP for the corresponding host in its tables.

This section talks about the silent host MAC/IP route. I suppose there is no similar mechanism for Section 4.2 if there is a single routing session from the CE to one of the PEs?

[Jorge] 4.2 talks about the synchronization of the IP routes on the PEs attached to the same ES, and 4.3 about the synchronization of the ARP/ND entries on the PEs. Unless a use case is explicitly mentioned, the sections from 2 onwards are relevant to all use cases – 1.1, 1.2 and 1.3. I added a sentence at the end of section 2.

For the following:

5. Determining Reachability to Unicast IP Addresses

Perhaps change “IP Addresses” to “IP Destinations”?

[Jorge] ok, done.

For the following sections:

5.2. Remote Learning
The procedures for remote learning do not change from [RFC7432] or [RFC9136].

5.3. Constructing the EVPN IP Routes
The procedures for constructing MAC/IP Address or IP Prefix Advertisements do not change from [RFC7432] or [RFC9136].

5.3.1. Route Resolution

5.3.1 is about Route Resolution on a receiving PE, while 5.3 is about constructing the routes on an originating PE. Seems that 5.3.1 should be moved out of 5.3.

[Jorge] ok, moved to its own section.

What is the definition of “remote learning”? Both on a remote PE (e.g. PE3) and on a multi-homed PE (e.g., PE2 learning from PE1)? Both need to follow the updated route resolution procedures, so “The procedures for remote learning do not change from [RFC7432] or [RFC9136]” does not seem right.
[Jorge] you’re right, it is confusing. It was meant to describe the process of importing, but it is not adding anything, only confusion. I removed this section.

What’s the difference between sections 7 and 8?

7. Load Balancing of Unicast Packets
The procedures for load balancing of Unicast Packets do not change from [RFC7432]

8. IP Aliasing and Unequal ECMP for IP Prefix Routes

Both seem to be about Load Balancing.

[Jorge] ok, I put them into the same section.

For the following normalization rule:

… If the ingress PE learns a prefix P via a non-reserved ESI RT-5 route with a weight (for which IP A-D per ES routes also signal a weight) and a zero ESI RT-5 that includes a weight, the ingress PE will consider all the PEs attached to the ES as a single PE when normalizing weights. As an example, consider PE1 and PE2 are attached to ES-1 and PE1 advertises an RT-5 for prefix P with ESI-1 (and EVPN Link Bandwidth of 1). Consider PE3 advertises an RT-5 for P with ESI=0 and EVPN Link Bandwidth of 2. If PE1 and PE2 advertise an EVPN Link Bandwidth of 1 and 2, respectively, in the IP A-D per ES routes for ES-1, an ingress PE4 SHOULD assign a normalized weight of 1 to ES-1 and a normalized weight of 2 to PE3.

What is the rationale for normalizing the weight of ES-1 to 1?

[Jorge] the ES represents a single CE, and the weight of the RT-5 with ESI-1 influences the number of flows for that CE. So if the remote PE gets two RT-5s, RT-5 (weight=1) and RT-5 (weight=2), it should apply the weights based on those RT-5s.

Thanks.
Jeffrey
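
As an illustration of the normalization example above, here is a minimal Python sketch of how an ingress PE such as PE4 might turn the advertised weights into per-PE flow shares. The RT-5 weights (1 for ES-1, 2 for PE3) and the IP A-D per ES weights (1 for PE1, 2 for PE2) come from the example. The second step, splitting the ES-1 share between PE1 and PE2 in proportion to their IP A-D per ES weights, is an assumption made for illustration and is not stated in this thread. The function name flow_shares and the dictionary layout are hypothetical.

```python
from fractions import Fraction


def flow_shares(rt5_weights, es_member_weights):
    """Return the fraction of flows sent to each PE for one prefix.

    rt5_weights: weight per RT-5 for the prefix, keyed by ES name
        (non-zero ESI) or by PE name (ESI=0).
    es_member_weights: for each ES name, the weight each attached PE
        signals in its IP A-D per ES route.
    """
    total = sum(rt5_weights.values())
    shares = {}
    for key, weight in rt5_weights.items():
        # Step 1: the ES counts as a single PE when normalizing weights.
        top_share = Fraction(weight, total)
        members = es_member_weights.get(key)
        if members is None:
            # ESI=0 route: the whole share goes to the advertising PE.
            shares[key] = top_share
            continue
        # Step 2 (assumption): split the ES share by A-D per ES weights.
        es_total = sum(members.values())
        for pe, member_weight in members.items():
            shares[pe] = top_share * Fraction(member_weight, es_total)
    return shares


# Values from the example: RT-5 with ESI-1 (weight 1), RT-5 from PE3 (weight 2).
rt5 = {"ES-1": 1, "PE3": 2}
ad_per_es = {"ES-1": {"PE1": 1, "PE2": 2}}
shares = flow_shares(rt5, ad_per_es)
for pe in sorted(shares):
    print(pe, shares[pe])
```

Under these assumptions ES-1 as a whole receives one third of the flows and PE3 two thirds, which matches the normalized weights of 1 and 2 in the example.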