Dear Authors,
I have some comments on the draft; please find the detailed comments
below, under the section headers they belong to.
Many of these comments relate to L2 problems mentioned in the draft.
These problem statements are out of date because they do not take the
state of the art into account. I think it is not useful to refer to old
technology; it is more beneficial for future development and research to
consider the state of the art. Furthermore, I see neither value nor need
in mentioning L2 problems (especially invalid ones) as motivation for
the NVO3 work. Please note that it would be very easy to avoid any
discussion of whether or not a problem is valid for L2 by simply not
mentioning L2 problems in the drafts.
2.2. Virtual Machine Mobility Requirements
“In traditional data centers, …”
I guess “traditional data center” here means a data center where
virtualization is not applied on the servers, i.e. there are no virtual
machines. However, “traditional” may mean different things to different
people. I think it would be useful to be entirely clear and use a more
exact term, e.g. “In data centers without server virtualization …”.
2.4. Inadequate Forwarding Table Sizes in Switches
“This places a much larger demand on the switches' forwarding table
capacity compared to non-virtualized environments, causing more traffic
to be flooded or dropped when the addresses in use exceeds the
forwarding table capacity.”
Today’s switches are able to provide network virtualization and thus
address this problem.
It would be better to replace the title of the section e.g. with
“Forwarding Table Size Considerations” or with “Inadequate Forwarding
Table Sizes in Switches not Providing Network Virtualization”.
Accordingly, the above sentence could be replaced e.g. with “This places
a demand on the switches to apply network virtualization in order to
keep forwarding tables tractable.”
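To give a rough feel for the scale behind the forwarding table concern, a back-of-the-envelope calculation may help; the server and VM counts below are my own illustrative assumptions, not figures from the draft:

```python
# Illustrative forwarding-table arithmetic; the counts below are
# assumptions for the sake of example, not figures from the draft.
servers = 1000
vms_per_server = 20

# Without server virtualization: roughly one MAC address per server.
macs_without_virtualization = servers

# With server virtualization: every VM adds its own MAC address
# that a flat L2 network would have to learn.
macs_with_virtualization = servers * vms_per_server

print(macs_without_virtualization)  # 1000
print(macs_with_virtualization)     # 20000
```

Network virtualization hides the VM addresses from the core switches, which is exactly why virtualization-capable switches address this problem.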
2.5. Decoupling Logical and Physical Configuration
“However, in order to limit the broadcast domain of each VLAN,
multi-destination frames within a VLAN should optimally flow only to
those devices that have that VLAN configured.“
This is already solved today; that is how L2 virtual overlays work: a
broadcast is limited to its VLAN. Therefore, it would be better not to
list this as a problem, e.g. to remove the sentence.
2.7. Communication Between Virtual and Traditional Networks
I’m afraid “Traditional Networks” is not specific enough. Note that
network virtualization was already available in the last century. If
“traditional” means “old”, that does not imply that the old network did
not provide network virtualization (furthermore “old” is not exact
either). It would be better to use another term, e.g. replace the
section header to “Communication Between Virtual and Non-virtualized
Networks”, and remove the word “traditional” from the section as well.
“Additional identification, such as VLAN tags, could be used on the
non-virtualized side of such a gateway to enable forwarding of traffic
for multiple virtual networks over a common non-virtualized link.”
This sentence seems to be self-contradictory. As described in the first
paragraph of Section 3.1, VLANs implement virtual networks identified by
VLAN tags. If there is no network virtualization on the non-virtualized
side of a gateway, then VLANs are not available on that side either. The
simplest resolution of the issue would be the removal of the sentence.
2.8. Communication Between Virtual Networks
NVE (Network Virtualization Edge) is not expanded at its first
appearance in the document.
2.9. Overlay Design Characteristics
“There are existing layer 2 overlay protocols in existence, but they
were not necessarily designed to solve the problem in the environment of
a highly virtualized data center.“
Scalability was a key design principle for the Layer 2 overlay solutions
available today, e.g. Provider Backbone Bridges (PBB). That is, Layer 2
overlay solutions also address provider backbone networks, whose
customers are often themselves providers of services to their own
customers; therefore, network virtualization is applied. Although most
of the protocols were not designed for data centers in the first place,
the overlay solutions provided today by 802.1Q are satisfactory for
highly virtualized data centers due to the scalability they provide.
Moreover, there are Layer 2 protocols clearly designed for data centers;
please refer to http://www.ieee802.org/1/pages/dcbridges.html.
Furthermore, the first sentence of this section suggests that the
existing L2 overlay protocols do not address the bullets listed in the
section, which is not the case.
Therefore, it would be better to remove the sentence: “There are
existing layer 2 overlay protocols in existence, but they were not
necessarily designed to solve the problem in the environment of a highly
virtualized data center.“
“The first hop switch that adds and removes the overlay header will
require new equipment and/or new software.”
A new switch is not required if an existing technology base is used in
the first hop switch. Deployed standard Ethernet switches support the
overlays specified by 802.1Q, e.g. PBB; it is not necessary to replace them.
It would be better to remove the sentence “The first hop switch that
adds and removes the overlay header will require new equipment and/or
new software.”
Bullet 4 also says “Work with existing, widely deployed network Ethernet
switches and IP routers without requiring wholesale replacement.”, which
contradicts the second sentence, besides being inexact. (Replacement of
the first hop switches is not wholesale, but it is still a replacement.)
The term “widely deployed” is not exact. What standards are these
“widely deployed” Ethernet switches and IP routers compliant with? Are
they only compliant with old (potentially expired/superseded) standards?
If one has a network built upon an old technology and is not satisfied
with it, then some upgrade is needed, as hinted by the second sentence;
either to a more up-to-date standard or to one that will be specified in
the future.
The simplest way to resolve these issues is to remove bullet 4.
3.1. Limitations of Existing Virtual Network Models
“A VLAN is an L2 bridging construct that provides some of the semantics
of virtual networks mentioned above.”
VLANs as of today provide all the semantics described by bullets 1 and 2
right above. Note that VLANs cover B-VLANs, I-VLANs identified by
I-SIDs, S-VLANs, and C-VLANs.
Therefore, it would be better to remove “some of” from the sentence.
This section says:
“But there are problems and limitations with L2 VLANs. VLANs are a pure
L2 bridging construct and VLAN identifiers are carried along with data
frames to allow each forwarding point to know what VLAN the frame
belongs to.”
which seems to express that carrying the virtual network identifier as
part of the overlay header is a problem. This seems to be in conflict
with Section 3.2, which says:
“With the overlay, a virtual network identifier (or VNID) can be carried
as part of the overlay header so that every data frame explicitly
identifies the specific virtual network the frame belongs to.“
Furthermore, “In the data plane, an overlay header provides a place to
carry either the VNID, or a locally-significant identifier. In both
cases, the identifier in the overlay header specifies which virtual
network the data packet belongs to.” This text suggests that it is not a
problem to carry the virtual network identifier in the overlay header.
Therefore, it would be beneficial to resolve this self-contradiction
within the document, e.g. by removing the statement that this L2 feature
is a problem, i.e. by erasing “But there are problems and limitations
with L2 VLANs.” from 3.1.
“A VLAN today is defined as a 12 bit number, limiting the total number
of VLANs to 4096 (though typically, this number is 4094 since 0 and 4095
are reserved). Due to the large number of tenants that a cloud provider
might service, the 4094 VLAN limit is often inadequate. In addition,
there is often a need for multiple VLANs per tenant, which exacerbates
the issue.”
Existing Layer 2 overlays have not been limited to 12 bits for a while.
As of today, 60 bits are provided for network virtualization by L2
(12-bit B-VID + 24-bit I-SID + 12-bit S-VID + 12-bit C-VID); among
these, the virtual overlay provided by the 24-bit I-SID clearly does not
have the 4K limit mentioned in the text.
As the 4K limit no longer exists, it would be better to remove the text
stating that it is a problem from the draft, e.g. from the paragraph above.
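To make the arithmetic above explicit (field widths as listed in the comment; this is just an illustration of the identifier space):

```python
# Identifier space of the 802.1Q tag stack referred to above.
B_VID, I_SID, S_VID, C_VID = 12, 24, 12, 12  # field widths in bits

total_bits = B_VID + I_SID + S_VID + C_VID
i_sid_space = 2 ** I_SID          # overlay instances from the I-SID alone
single_tag_space = 2 ** C_VID     # the often-cited single-tag limit

print(total_bits)        # 60
print(i_sid_space)       # 16777216
print(single_tag_space)  # 4096
```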
“This means that the limit of 4096 VLANs is associated with an
individual tenant service edge”
Note that the 4K limit is being eliminated for E-VPN as well. Please
refer to I-D.ietf-l2vpn-pbb-evpn.
It would be good to update the text to be in-line with E-VPN today, e.g.
by removing “limit of 4096 VLANs” for E-VPN.
3.2. Benefits of Network Overlays
“The use of a sufficiently large VNID would address current VLAN
limitations associated with single 12-bit VLAN tags.” The 12-bit
limitation is not a current limitation; it was a limitation only years
ago. Current VLANs are specified by the current 802.1Q
(http://standards.ieee.org/getieee802/download/802.1Q-2011.pdf). The
I-SID is 24 bits and specifies a network overlay (also referred to as an
I-VLAN) that clearly does not have the 12-bit limitation. I think the
I-SID is a sufficiently large VNID.
It would be better to remove the sentence.
“a VM can now be located anywhere in the data center that the overlay
reaches without regards to traditional constraints implied by L2
properties such as VLAN numbering, or the span of an L2 broadcast domain
scoped to a single pod or access switch.”
These L2 limitations no longer exist; please check e.g. PBB in 802.1Q.
The VM can be anywhere in the data center, e.g. in any pod; the 24-bit
I-VLAN does not seem to be limiting, and the L2 broadcast domain is not
limited to a single pod or access switch: it can span several pods
without any problem.
It would be better to remove “without regards to traditional constraints
implied by L2 properties such as VLAN numbering, or the span of an L2
broadcast domain scoped to a single pod or access switch.”
3.3. Overlay Networking Work Areas
“One approach is to build mapping tables entirely via learning (as is
done in 802.1 networks). But to provide better scaling properties, a
more sophisticated approach is needed, i.e., the use of a specialized
control plane protocol.”
It is not clear here why a control plane protocol scales better than
learning with respect to the size of the mapping tables in the edge
nodes. For proper operation, the mapping tables need to have exactly the
same entries whether they are built up by a control protocol or by data
plane learning. (Note that no entry is required at all in the
point-to-point case.)
Note further that the “address mapping dissemination problem” mentioned
later in the document does not exist in case of learning.
Perhaps the disadvantages of a control plane protocol dedicated to
maintaining the mapping tables in all NVEs should be listed too, e.g.
complexity, convergence, and VM migration issues.
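For illustration, data plane learning that builds a mapping table can be sketched as follows; this is a minimal sketch of my own, and the class and method names are made up for the example:

```python
# Minimal sketch of data-plane learning at an overlay edge node.
# All names here are illustrative, not taken from any standard.

class OverlayEdge:
    def __init__(self):
        # Mapping table: inner (customer) MAC -> remote edge identifier.
        self.mapping = {}

    def learn(self, inner_src_mac, remote_edge):
        # On receiving an encapsulated frame, associate the inner
        # source address with the edge that encapsulated it.
        self.mapping[inner_src_mac] = remote_edge

    def lookup(self, inner_dst_mac):
        # Known destination: forward to the learned edge.
        # Unknown destination: None, i.e. the frame would be flooded.
        return self.mapping.get(inner_dst_mac)

edge = OverlayEdge()
edge.learn("aa:aa:aa:aa:aa:aa", "edge-1")
print(edge.lookup("aa:aa:aa:aa:aa:aa"))  # edge-1
print(edge.lookup("bb:bb:bb:bb:bb:bb"))  # None -> flood
```

A control plane protocol would populate the very same table via explicit dissemination; the entries end up identical, which is the point made above.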
“a standardized interaction between the NVE and hypervisor may be
needed, for example in the case where the NVE resides on a separate
device from the VM.”
VDP is such a standardized interaction. VDP is designed to communicate
from the end system to the device on the edge of the network that a VM
is moving.
The current text suggests that there is no standardized protocol for
that. It would be better to make the existence of a standard protocol
clear, e.g. by referring to VDP.
4.1. IEEE 802.1aq - Shortest Path Bridging
“SPB is entirely L2 based, extending the L2 Ethernet bridging model.”
SPB extends IS-IS to be able to control L2 as well. SPB does not remove
any IP capabilities from IS-IS, everything is still available.
It would be better to rephrase the sentence, e.g. to “SPB extends IS-IS
in order to be able to perform L2 control in addition to the existing
IS-IS capabilities.”
4.3. TRILL
“TRILL is an L2-based approach aimed at improving deficiencies and
limitations with current Ethernet networks and STP in particular.”
The limitations discussed here are not limitations of current Ethernet,
especially given that there has been no living standard specifying STP
since 2004. (Current Ethernet is specified by the active standards:
http://standards.ieee.org/develop/wg/WG802.1.html. STP is only supported
by means of backwards compatibility.)
Furthermore, TRILL defines its own header format instead of using the
existing L2 headers specified by 802.1Q, so it is not clearly an L2
approach.
It would be better to rephrase the sentence, e.g. “TRILL is a Local Area
Network protocol, which establishes the forwarding paths using IS-IS
routing and encapsulation of traffic with its own TRILL header.”
“Although it differs from Shortest Path Bridging in many architectural
and implementation details, it is similar in that is provides an L2
based service to end systems.”
There is no need to mention SPB here; Section 4.1 is about SPB.
It would be better to remove SPB from this sentence, e.g. replace the
sentence with: “It provides an L2 based service to end systems.”
“TRILL as defined today, supports only the standard (and limited) 12-bit
VLAN model.“
The standard (802.1Q) is not limited to the 12-bit VLAN model, as
described above. There is e.g. the 24-bit I-SID.
It would be better to update the sentence, e.g.: “TRILL as defined
today, supports only the 12-bit C-VLAN.”
New subsection for VDP
Data Center Bridging results are not mentioned in the related work. As
its focus is clearly data center networks, and VDP addresses Virtual
Machine migration, it is within the scope of this document. It would be
good to add a paragraph on VDP, e.g. along these lines:
4.8 VDP
VDP is the Virtual Station Interface (VSI) Discovery and Configuration
Protocol specified by IEEE P802.1Qbg. VDP is a protocol that supports
the association of a VSI with a port. VDP is run between the end system
(e.g. a hypervisor) and its adjacent switch, i.e. the device on the edge
of the network. VDP is used, for example, to communicate to the switch
that a Virtual Machine (Virtual Station) is moving, i.e. it is designed
for VM migration.
5. Further Work
“to reduce the overall amount of flooding and other multicast and
broadcast related traffic (e.g, ARP and ND) currently experienced within
current data centers with a large flat L2 network.”
This is not necessarily the case: a properly constructed large L2
network based on PBB has no such issues.
It would be better to remove “currently experienced within current data
centers with a large flat L2 network”
Best regards,
János
_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3