Hi Jorge and other co-authors,
I am giving another round of review as the document shepherd before I do the
shepherd write-up.
Please see some nits/comments/questions below.
Thanks!
Jeffrey
--------------------------------------------------------------------------------------
EVPN provides a flexible control plane that allows intra-subnet
connectivity in an IP/MPLS and/or an NVO-based network.
Isn't NVO based on IP? There is no pure-IP based EVPN, right? So perhaps either
"in an IP/MPLS based overlay network" or "in an MPLS and/or NVO-based network"?
EVI: EVPN Instance spanning the NVE and PE devices that are
participating on that EVPN.
"NVE/PE"?
IP-VRF: A VPN Routing and Forwarding table for IP addresses on an
NVE/PE, similar to the VRF concept defined in [RFC4364], however,
in this document, the IP routes are always populated by the EVPN
address family.
Do we really want to distinguish between the IP-VRF in RFC4364 and the one in
this document? I think it's really the same IP-VRF - routes could be populated
from both the EVPN and IP-VPN address families, especially on the DGWs.
If we use the term Tenant System (TS) to designate a physical or
virtual system identified by MAC and IP addresses, and connected to a
MAC-VRF by an Attachment Circuit, the following considerations apply:
...
o Although these VAs provide IP connectivity to VMs and subnets
behind them, they do not always have their own IP interface
connected to the EVPN NVE, e.g. layer-2 firewalls are examples
of VAs not supporting IP interfaces.
In the above two paragraphs, the first one says the TS is identified by
MAC "and IP addresses", then the second paragraph says "do not always
have their own IP interface". Should "and IP addresses" be changed to
"and maybe IP address as well"?
o TS2 and TS3 are Virtual Appliances (VA) that generate/receive
traffic from/to the subnets and hosts sitting behind them
s/generate/send/
o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3 have
their own IP addresses that belong to the EVI-10 subnet too. These
IRB interfaces connect the EVI-10 subnet to Virtual Routing and
Forwarding (IP-VRF) instances that can route the traffic to other
connected subnets for the same tenant (within the DC or at the
other end of the WAN).
s/connected subnets/subnets/
One example of such use cases is the "floating IP" example described
in section 2.1. In this example we need to decouple the advertisement
of the prefixes from the advertisement of the floating IP (vIP23 in
Figure 1) and MAC associated to it, otherwise the solution gets
highly inefficient and does not scale.
I understand what the above is trying to say, but had trouble parsing the
sentence before "otherwise". I think it would be better to say "decouple ...
from the advertisement of the MAC address of either M2 or M3", as we're
advertising with the floating IP as the overlay index (but not the MAC).
o The GW IP (Gateway IP Address) will be a 32 or 128-bit field (ipv4
or ipv6), and will encode an overlay IP index for the IP Prefixes.
s/encode an overlay IP index/encode an IP address as an overlay index/
o The MPLS Label field is encoded as 3 octets, where the high-order
20 bits contain the label value. When sending, the label value
SHOULD be zero to indicate that recursive resolution is needed. If
the received MPLS Label value is zero, the route MUST contain an
Overlay Index and the ingress NVE/PE MUST do recursive resolution
to find the egress NVE/PE. If the received Label value is non-zero,
the route will not be used for recursive resolution unless a local
policy says so.
How about changing the second sentence to the following:
... SHOULD be zero if recursive resolution based on overlay index is used.
Notice the "if".
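To make sure we read the encoding the same way, here is a minimal Python
sketch of the Label field as quoted above (3 octets, label value in the
high-order 20 bits). The function names are mine, and setting the low-order
4 bits to zero on encoding is my assumption; the quoted text does not say
what goes there.

```python
def encode_label_field(label: int) -> bytes:
    """Pack a 20-bit label value into the high-order bits of 3 octets.

    Assumption: the low-order 4 bits are left as zero.
    """
    if not 0 <= label < (1 << 20):
        raise ValueError("label value must fit in 20 bits")
    return (label << 4).to_bytes(3, "big")


def decode_label_field(field: bytes) -> int:
    """Extract the 20-bit label value from a 3-octet Label field."""
    if len(field) != 3:
        raise ValueError("Label field must be exactly 3 octets")
    return int.from_bytes(field, "big") >> 4


def label_is_zero(field: bytes) -> bool:
    """True when the received label value is zero, i.e. the case where the
    quoted text requires an Overlay Index and recursive resolution."""
    return decode_label_field(field) == 0
```

With this reading, a label value of 16 is carried as octets 00 01 00, and an
all-zero field is the "recursive resolution" case.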
o An Overlay Index can be an ESI, IP address in the address space of
the tenant or MAC address and it is used by an NVE as the next-hop
for a given IP Prefix.
I like it that a MAC address can be used as an overlay index, but I don't see
how a MAC address used as an overlay index is encoded.
I see the following later:
* MAC with Zero value means no Router's MAC extended community is
present along with the RT-5. Non-Zero indicates that the extended
community is present and carries a valid MAC address. Examples of
invalid MAC addresses are broadcast or multicast MAC addresses.
It would be good to point out up front (right where the RT-5 format is given)
that a Router's MAC EC may be attached to the RT-5.
It is important to note that recursive
resolution of the Overlay Index applies upon installation into an
IP-VRF, and not upon BGP propagation.
What does the above sentence mean? Why is it important to note? Nothing is
resolved upon propagation, right?
o Irrespective of the recursive resolution, if there is no IGP or BGP
route to the BGP next-hop of an RT-5, BGP may fail to install the
RT-5 even if the Overlay Index can be resolved.
May? Should? Must?
The indirection provided by the Overlay Index and its recursive
lookup resolution is required to achieve fast convergence in case of
a failure of the object represented by the Overlay Index. For
instance: in Figure 1, let's assume NVE2/NVE3 advertise 1k RT-5
routes associated to the floating IP address (GWIP=vIP23) and NVE2
advertises an RT-2 claiming the ownership of the floating IP, i.e.
NVE2 encodes vIP23 and M2 in the RT-2. When the floating IP owner
changes from M2 to M3, a single RT-2 withdraw/update is required to
indicate the change. The remote DGW will not change any of the 1k
prefixes associated to vIP23, but will only update the ARP resolution
entry for vIP23 (now pointing at M3).
The "for instance" part is a repetition of section 2.2. How about simply
referring to section 2.2?
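For reference, the indirection being described can be sketched in a few lines
of Python. This is only my illustration of the quoted example, not text from
the draft: the table names, the prefix values and the helper functions are
all hypothetical.

```python
def build_ip_vrf(gw_ip: str, count: int = 1000) -> dict:
    """Hypothetical IP-VRF on the remote DGW: `count` prefixes learned via
    RT-5, all pointing at the same GW IP overlay index."""
    return {f"10.{i // 256}.{i % 256}.0/24": gw_ip for i in range(count)}


def resolve(ip_vrf: dict, arp: dict, prefix: str) -> str:
    """Recursive resolution: prefix -> overlay index (GW IP) -> MAC."""
    return arp[ip_vrf[prefix]]


def apply_rt2_update(arp: dict, ip: str, mac: str) -> None:
    """A single RT-2 update only touches the ARP entry for the floating IP."""
    arp[ip] = mac


ip_vrf = build_ip_vrf("vIP23")
arp = {"vIP23": "M2"}                  # NVE2 currently owns the floating IP
before = resolve(ip_vrf, arp, "10.0.0.0/24")
apply_rt2_update(arp, "vIP23", "M3")   # ownership moves from M2 to M3
after = resolve(ip_vrf, arp, "10.0.0.0/24")
```

None of the 1k entries in `ip_vrf` is modified by the update; only the one
ARP entry changes.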
+----------+----------+----------+------------+----------------+
| ESI | GW-IP | MAC* | Label | Overlay Index |
|--------------------------------------------------------------|
| Non-Zero | Zero | Zero | Don't Care | ESI |
| Non-Zero | Zero | Non-Zero | Don't Care | ESI |
| Zero | Non-Zero | Zero | Don't Care | GW-IP |
| Zero | Zero | Non-Zero | Zero | MAC |
| Zero | Zero | Non-Zero | Non-Zero | MAC or None** |
| Zero | Zero | Zero | Non-Zero | None(IP NVO)***|
+----------+----------+----------+------------+----------------+
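To check my reading of the table, here it is as a small Python lookup. It is
only a sketch: the function name and the boolean "non-zero" flags are mine,
and combinations that the table does not list simply return None here, which
is my choice and not something the draft states.

```python
def overlay_index(esi_nonzero: bool, gw_ip_nonzero: bool,
                  mac_nonzero: bool, label_nonzero: bool):
    """Return the Overlay Index type for one row of the table, or None
    when the combination does not appear in the table."""
    key = (esi_nonzero, gw_ip_nonzero, mac_nonzero)
    if key in ((True, False, False), (True, False, True)):
        return "ESI"                   # rows 1 and 2, label is Don't Care
    if key == (False, True, False):
        return "GW-IP"                 # row 3, label is Don't Care
    if key == (False, False, True):
        # rows 4 and 5 differ only in the Label value
        return "MAC or None" if label_nonzero else "MAC"
    if key == (False, False, False) and label_nonzero:
        return "None (IP NVO)"         # row 6
    return None                        # not covered by the table
```

Written this way, it is visible that only the MAC rows depend on the Label
value.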
It seems that the MAC address is a more specific overlay index, so if an ESI is
also present, shouldn't the MAC address be used as the overlay index?
The fifth row is like a variation of the fourth row; why isn't there a
corresponding variation for each of the first three rows? The following
paragraph mentioned earlier seems to apply to all situations.
o The MPLS Label field is encoded as 3 octets, where the high-order
20 bits contain the label value. When sending, the label value
SHOULD be zero to indicate that recursive resolution is needed. If
the received MPLS Label value is zero, the route MUST contain an
Overlay Index and the ingress NVE/PE MUST do recursive resolution
to find the egress NVE/PE. If the received Label value is non-zero,
the route will not be used for recursive resolution unless a local
policy says so.
I struggled with the "IP NVO" in the sixth row because clearly this is an MPLS
tunnel, not an IP tunnel. Then I realized that "IP" here refers to the payload,
not the tunnel type:
IP NVO tunnel: it refers to Network Virtualization Overlay tunnels
with IP payload (no MAC header in the payload).
I have to say that "IP NVO tunnel" is a little misleading.
4. IP Prefix Overlay Index use-cases
4.1 TS IP address Overlay Index use-case
If you compare the two section titles above, you may realize the first one is a
little misleading ("IP Prefix" used as overlay index?). Perhaps change to "4.
Overlay Index use-cases"?
In section 4.1:
o Based on the MAC-VRF10 route-target in DGW1 and DGW2, the IP
Prefix route is also imported and SN1/24 is added to the IP-
VRF with Overlay Index IP2 pointing at the local MAC-VRF10. We
assume the RT-5 from NVE2 is preferred over the RT-5 from
NVE3. Should ECMP be enabled in the IP-VRF and both routes
equally preferable, SN1/24 would also be added to the routing
table with Overlay Index IP3.
The last two sentences seem to contradict each other. One says "preferred over"
and the other says "equally preferable".
(5) When the packet arrives at NVE2:
o Based on the tunnel information (VNI for the VXLAN case), the
MAC-VRF10 context is identified for a MAC lookup.
o Encapsulation is stripped-off and based on a MAC lookup
(assuming MAC forwarding on the egress NVE), the packet is
forwarded to TS2, where it will be properly routed.
If the destination is actually on the TS3 side, how does TS2 send traffic to
the final destination? Unless the topology is actually like the one in section
4.2, won't the traffic get blackholed? But then, is the only difference between
4.1 and 4.2 whether two overlay indexes (in 4.1, with ECMP) or one overlay
index (in 4.2) are used?
In section 4.3:
. Destination inner MAC = M2 (this MAC will be obtained
from the Router's MAC Extended Community received along
with the RT-5 for SN1).
My understanding is that section 4 is descriptive (use cases). The above really
should be "specified" somewhere else, not "described" here. OK, as I read on,
the section becomes more and more of a specification.
I do see some text about the Router's MAC EC in 4.4.1, but should that be
pulled out to somewhere that covers all cases (not just 4.4.1)?
BTW - it's important to emphasize that the Router's MAC EC here is used to
carry a TS MAC address, not the "Router's MAC address" :-)
Section 4.4:
In order to provide connectivity for (1), MAC/IP routes (RT-2) are
needed so that IRB or TS MACs and IPs can be distributed.
Connectivity type (2) is accomplished by the exchange of IP Prefix
routes (RT-5) for IPs and subnets sitting behind certain Overlay
Indexes, e.g. GW IP or ESI.
"e.g. GW IP or ESI or TS MAC"
... If
no recursive resolution is needed, the core EVI may not be needed and
the IP-VRFs may be connected directly by Ethernet or IP NVO tunnels.
Even if the core EVI is needed, the tunnels are still Ethernet tunnels, right?
Perhaps the last sentence should really be "... connected directly by tenant
(non-core) EVIs"?
Depending on the existence and characteristics of the core-facing IRB
interface in the core EVI, there are three different IP-VRF-to-IP-VRF
scenarios identified and described in this document:
1) Interface-less model
2) Interface-ful with core-facing IRB model
3) Interface-ful with unnumbered core-facing IRB model
I once commented that the terms "interface-less" and "interface-ful" here are
convoluted. What they really indicate is whether a core EVI and core-VRF IRBs
are used. While I am not requesting to change the terms, it would be good to
point out what they really mean. Proposed new text:
Depending on the existence and characteristics of the core EVI and
IRB interfaces for the core-VRFs, there are three different IP-VRF-to-IP-VRF
scenarios identified and described in this document:
1) Interface-less model: no core EVI, no overlay index.
2) Interface-ful with core-VRF IRB model: core EVI, IP address as overlay
index.
3) Interface-ful with unnumbered core-VRF IRB model: core EVI, MAC address
as overlay index.
BTW, I would still prefer to rename the "core EVI" to "Supplemental BD" for two
reasons:
- The "core" wording is confusing/misleading, because all the EVIs go over the
core.
- The "core EVI" is really the same as the "Supplemental BD" in draft-lin.
So why not take this opportunity to use the proper name?
4.4.2:
d) The core EVI is composed of the NVE/DGW MAC-VRFs and may contain
other MAC-VRFs without IRB interfaces. Those non-IRB MAC-VRFs will
typically connect TSes that need layer-3 connectivity to remote
subnets.
Can you elaborate on the "other MAC-VRFs w/o IRB interfaces"? Two things
confuse me here:
- you already mention "NVE/DGW MAC-VRFs", so what are the "other" MAC-VRFs?
- If you want to say some MAC-VRFs do not have IRB interfaces, perhaps just say:
d) The core EVI is composed of NVE/DGW MAC-VRFs w/ or w/o IRB interfaces.
But how does remote traffic reach those NVEs w/o core-VRF IRBs in this model?
o Label value SHOULD be zero since the RT-5 route requires a
recursive lookup resolution to an RT-2 route. The MPLS label
or VNI to be used when forwarding packets will be derived from
the RT-2's MPLS Label1 field. The RT-5's Label field will be
ignored on reception.
Perhaps swap the last two sentences:
o Label value SHOULD be zero since the RT-5 route requires a
recursive lookup resolution to an RT-2 route. It is ignored on
reception, and the MPLS label or VNI from the RT-2's MPLS
Label1 field is used when forwarding packets.
Section 5:
c) Allows a flexible implementation where the prefix can be linked to
different types of Overlay Indexes: overlay IP address, overlay
MAC addresses, overlay ESI, underlay BGP next-hops, etc.
Perhaps:
c) Allows a flexible implementation where the prefix can be linked to
different types of Overlay/Underlay Indexes: overlay IP address, overlay
MAC addresses, overlay ESI, underlay BGP next-hops, etc.