Dear Authors,

I have some comments on the draft; please find the detailed comments below, under the section headers they belong to.

Many of these comments relate to L2 problems mentioned in the draft. These problems are out of date because they do not take the state of the art into account. I think it is not useful to refer to old technology; it is more beneficial for future development and research to consider the state of the art. Furthermore, I see neither value nor need in mentioning L2 problems (especially invalid ones) to motivate the NVO3 work. Please note that it would be very easy to avoid discussion of whether or not a problem is valid for L2 by simply not mentioning any L2 problem in the drafts.


2.2. Virtual Machine Mobility Requirements

“In traditional data centers, …”
I guess “traditional data center” here means a data center where virtualization is not applied on the servers, i.e. there are no virtual machines. However, “traditional” may mean different things to different people. I think it would be useful to be entirely clear and use a more exact term, e.g. “In data centers without server virtualization …”.


2.4. Inadequate Forwarding Table Sizes in Switches

“This places a much larger demand on the switches' forwarding table capacity compared to non-virtualized environments, causing more traffic to be flooded or dropped when the addresses in use exceeds the forwarding table capacity.” Today’s switches are able to provide network virtualization and thus address this problem. It would be better to replace the title of the section, e.g. with “Forwarding Table Size Considerations” or “Inadequate Forwarding Table Sizes in Switches not Providing Network Virtualization”. Accordingly, the above sentence could be replaced, e.g. with “This places a demand on the switches to apply network virtualization in order to keep forwarding tables tractable.”


2.5. Decoupling Logical and Physical Configuration

“However, in order to limit the broadcast domain of each VLAN, multi-destination frames within a VLAN should optimally flow only to those devices that have that VLAN configured.” This is already solved today; that is how L2 virtual overlays work: a broadcast is limited to its VLAN. Therefore, it would be better not to list this as a problem, e.g. by removing the sentence.


2.7. Communication Between Virtual and Traditional Networks

I’m afraid “Traditional Networks” is not specific enough. Note that network virtualization was already available in the last century. If “traditional” means “old”, that does not imply that the old network did not provide network virtualization (and “old” is not exact either). It would be better to use another term, e.g. change the section header to “Communication Between Virtual and Non-virtualized Networks”, and remove the word “traditional” from the section as well.

“Additional identification, such as VLAN tags, could be used on the non-virtualized side of such a gateway to enable forwarding of traffic for multiple virtual networks over a common non-virtualized link.” This sentence seems to be self-contradictory. As described in the first paragraph of Section 3.1, VLANs implement virtual networks identified by VLAN tags. If there is no network virtualization on the non-virtualized side of a gateway, then VLANs are not available on that side either. The simplest resolution of the issue would be to remove the sentence.


2.8. Communication Between Virtual Networks

NVE (Network Virtualization Edge) is not expanded at its first appearance in the document.


2.9. Overlay Design Characteristics

“There are existing layer 2 overlay protocols in existence, but they were not necessarily designed to solve the problem in the environment of a highly virtualized data center.” Scalability was a key design principle for the Layer 2 overlay solutions available today, e.g. Provider Backbone Bridges (PBB). That is, Layer 2 overlay solutions also address provider backbone networks, whose customers are often themselves providers of services to their own customers; therefore, network virtualization is applied. Although most of these protocols were not designed for data centers in the first place, the overlay solutions provided today by 802.1Q are satisfactory for highly virtualized data centers due to the scalability they provide. Moreover, there are Layer 2 protocols clearly designed for data centers; please refer to http://www.ieee802.org/1/pages/dcbridges.html. Furthermore, keeping the first sentence in this section suggests that the existing L2 overlay protocols do not address the bullets listed in the section, which is not the case. Therefore, it would be better to remove the sentence: “There are existing layer 2 overlay protocols in existence, but they were not necessarily designed to solve the problem in the environment of a highly virtualized data center.”

“The first hop switch that adds and removes the overlay header will require new equipment and/or new software.” A new switch is not required if the existing technology base is used in the first hop switch. Deployed standard Ethernet switches support the overlays specified by 802.1Q, e.g. PBB; it is not necessary to replace them. It would be better to remove the sentence “The first hop switch that adds and removes the overlay header will require new equipment and/or new software.”

Bullet 4 also says “Work with existing, widely deployed network Ethernet switches and IP routers without requiring wholesale replacement.”, which contradicts the second sentence, besides being inexact. (Replacement of the first hop switches is not wholesale, but it is still a replacement.) The term “widely deployed” is not exact. What standards are these “widely deployed” Ethernet switches and IP routers compliant to? Are they only compliant to old (potentially expired/superseded) standards? If one has a network built upon an old technology and is not satisfied with it, then some upgrade is needed, as hinted by the second sentence: either to a more up-to-date standard or to one that will be specified in the future.
The simplest way to resolve these issues is to remove bullet 4.


3.1. Limitations of Existing Virtual Network Models

“A VLAN is an L2 bridging construct that provides some of the semantics of virtual networks mentioned above.” VLANs as of today provide all the semantics described by bullets 1 and 2 right above. Note that VLANs cover B-VLANs, I-VLANs identified by I-SIDs, S-VLANs and C-VLANs.
Therefore, it would be better to remove “some of” from the sentence.

This section says:
“But there are problems and limitations with L2 VLANs. VLANs are a pure L2 bridging construct and VLAN identifiers are carried along with data frames to allow each forwarding point to know what VLAN the frame belongs to.” This seems to express that carrying the virtual network identifier along with the data frames is a problem. It appears to conflict with Section 3.2, which says: “With the overlay, a virtual network identifier (or VNID) can be carried as part of the overlay header so that every data frame explicitly identifies the specific virtual network the frame belongs to.” Furthermore: “In the data plane, an overlay header provides a place to carry either the VNID, or a locally-significant identifier. In both cases, the identifier in the overlay header specifies which virtual network the data packet belongs to.” This text suggests that it is not a problem to carry the virtual network identifier in the overlay header. Therefore, it would be beneficial to resolve this internal conflict within the document, e.g. by removing the statement that the L2 feature is a problem, i.e. erasing “But there are problems and limitations with L2 VLANs.” from 3.1.
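The equivalence being argued here can be illustrated with a small sketch (illustrative only: the VLAN TCI layout follows the 802.1Q tag format, while the overlay header below is a generic example carrying a 24-bit VNID, not any specific protocol's wire format). Both mechanisms carry the virtual network identifier inside every frame:

```python
import struct

# Illustrative only: a VLAN tag and an overlay header both carry the
# virtual-network identifier in the frame. The TCI layout (3-bit PCP,
# 1-bit DEI, 12-bit VID) follows 802.1Q; the overlay header is a
# generic 3-byte VNID field, not a specific protocol's format.

def vlan_tci(pcp, dei, vid):
    """Pack an 802.1Q Tag Control Information field (16 bits)."""
    assert 0 <= vid < 2 ** 12
    return struct.pack("!H", (pcp << 13) | (dei << 12) | vid)

def overlay_header(vnid):
    """Pack a hypothetical overlay header carrying a 24-bit VNID."""
    assert 0 <= vnid < 2 ** 24
    return struct.pack("!I", vnid)[1:]  # keep the low 3 bytes

print(vlan_tci(0, 0, 100).hex())   # '0064' -> VID 100
print(overlay_header(100).hex())   # '000064' -> VNID 100
```

In both cases each forwarding point can read the virtual-network identifier directly from the frame, which is the point of the quoted Section 3.2 text.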

“A VLAN today is defined as a 12 bit number, limiting the total number of VLANs to 4096 (though typically, this number is 4094 since 0 and 4095 are reserved). Due to the large number of tenants that a cloud provider might service, the 4094 VLAN limit is often inadequate. In addition, there is often a need for multiple VLANs per tenant, which exacerbates the issue.” Existing Layer 2 overlays have not been limited to 12 bits for some time. As of today, 60 bits are provided for network virtualization by L2 (12-bit B-VID + 24-bit I-SID + 12-bit S-VID + 12-bit C-VID); among these, the virtual overlay provided by the 24-bit I-SID clearly does not have the 4K limit mentioned in the text. As the 4K limit no longer exists, it would be better to remove text stating it as a problem from the draft, e.g. from the paragraph above.
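The arithmetic behind these figures can be checked with a short sketch (the bit widths are the 802.1Q tag fields listed above):

```python
# Sizes of the virtual-network ID spaces discussed above.
# Bit widths per IEEE 802.1Q tag fields: B-VID (12), I-SID (24),
# S-VID (12), C-VID (12).

def id_space(bits):
    """Number of distinct identifiers a field of `bits` bits can carry."""
    return 2 ** bits

c_vlan = id_space(12)               # classic VLAN tag: 4096 values
i_sid = id_space(24)                # PBB service ID: ~16.8 million values
combined_bits = 12 + 24 + 12 + 12   # B-VID + I-SID + S-VID + C-VID

print(c_vlan)          # 4096 (4094 usable: 0 and 4095 are reserved)
print(i_sid)           # 16777216
print(combined_bits)   # 60
```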

“This means that the limit of 4096 VLANs is associated with an individual tenant service edge” Note that the 4K limit is being eliminated for E-VPN as well; please refer to I-D.ietf-l2vpn-pbb-evpn. It would be good to update the text to be in line with E-VPN today, e.g. by removing “limit of 4096 VLANs” for E-VPN.


3.2. Benefits of Network Overlays

“The use of a sufficiently large VNID would address current VLAN limitations associated with single 12-bit VLAN tags.” The 12-bit limitation is not a current limitation; it was a limitation only years ago. Current VLANs are specified by the current 802.1Q (http://standards.ieee.org/getieee802/download/802.1Q-2011.pdf). The I-SID is 24 bits and specifies a network overlay (also referred to as an I-VLAN) that clearly does not have the 12-bit limitation. I think the I-SID is a sufficiently large VNID.
It would be better to remove the sentence.

“a VM can now be located anywhere in the data center that the overlay reaches without regards to traditional constraints implied by L2 properties such as VLAN numbering, or the span of an L2 broadcast domain scoped to a single pod or access switch.” These L2 limitations no longer exist; please check e.g. PBB in 802.1Q. The VM can be anywhere in the data center, e.g. in any pod; the 24-bit I-VLAN does not seem to be limiting, and the L2 broadcast domain is not limited to a single pod or access switch: it can span several pods without any problem. It would be better to remove “without regards to traditional constraints implied by L2 properties such as VLAN numbering, or the span of an L2 broadcast domain scoped to a single pod or access switch.”


3.3. Overlay Networking Work Areas

“One approach is to build mapping tables entirely via learning (as is done in 802.1 networks). But to provide better scaling properties, a more sophisticated approach is needed, i.e., the use of a specialized control plane protocol.” It is not clear here why a control plane protocol scales better than learning with respect to the size of mapping tables in the edge nodes. For proper operation, the mapping tables need to have exactly the same entries whether they are built up by a control protocol or by data-plane learning. (Note that no entry is required at all in a point-to-point case.) Note further that the “address mapping dissemination problem” mentioned later in the document does not exist in the case of learning. Perhaps the disadvantages of a control plane protocol dedicated to maintaining the mapping tables in all NVEs should be listed too, e.g. complexity, convergence and VM migration issues.
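The observation that learning and a control protocol must converge to the same table contents can be illustrated with a minimal sketch (hypothetical names, not from the draft). Here the mapping table is filled by observing the source addresses of arriving frames; a control-plane protocol would have to install exactly the same entries:

```python
# Minimal sketch of data-plane learning at an overlay edge node
# (hypothetical example, not taken from the draft). The table binds an
# inner (tenant) address to the locator it was last seen behind; a
# control-plane protocol would need to install the same entries.

mapping_table = {}  # inner address -> locator (e.g. remote edge address)

def learn(inner_src, locator):
    """Data-plane learning: remember where a source address was seen."""
    mapping_table[inner_src] = locator

def lookup(inner_dst):
    """Return the locator for a destination, or None (flood) if unknown."""
    return mapping_table.get(inner_dst)

# Frames arriving from two remote edges populate the table:
learn("vm-a", "edge-1")
learn("vm-b", "edge-2")

print(lookup("vm-a"))  # edge-1
print(lookup("vm-c"))  # None -> unknown destination would be flooded
```

Either way the table ends up with one entry per remote address in active use, which is why the table-size argument alone does not favour a control protocol.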

“a standardized interaction between the NVE and hypervisor may be needed, for example in the case where the NVE resides on a separate device from the VM.” VDP is such a standardized interaction: it is designed to communicate from the end system to the device on the edge of the network that a VM is moving. The current text implies that there is no standardized protocol for this. It would be better to make the existence of a standard protocol clear, e.g. by referring to VDP.


4.1. IEEE 802.1aq - Shortest Path Bridging

“SPB is entirely L2 based, extending the L2 Ethernet bridging model.”
SPB extends IS-IS so that it can control L2 as well; SPB does not remove any IP capabilities from IS-IS, so everything is still available. It would be better to rephrase the sentence, e.g. to “SPB extends IS-IS in order to perform L2 control in addition to the existing IS-IS capabilities.”


4.3. TRILL

“TRILL is an L2-based approach aimed at improving deficiencies and limitations with current Ethernet networks and STP in particular.” The limitations discussed here are not limitations of current Ethernet, especially given that there has been no active standard specifying STP since 2004. (Current Ethernet is specified by the active standards: http://standards.ieee.org/develop/wg/WG802.1.html; STP is only supported by means of backwards compatibility.) Furthermore, TRILL defines its own header format instead of using the existing L2 headers specified by 802.1Q, so it is not clearly an L2 approach. It would be better to rephrase the sentence, e.g. “TRILL is a Local Area Network protocol that establishes forwarding paths using IS-IS routing and encapsulates traffic with its own TRILL header.”

“Although it differs from Shortest Path Bridging in many architectural and implementation details, it is similar in that is provides an L2 based service to end systems.”
There is no need to mention SPB here; Section 4.1 is about SPB. It would be better to remove SPB from this sentence, e.g. replace it with: “It provides an L2 based service to end systems.”

“TRILL as defined today, supports only the standard (and limited) 12-bit VLAN model.” The standard (802.1Q) is not limited to a 12-bit VLAN model, as described above; there is e.g. the 24-bit I-SID. It would be better to update the sentence, e.g. to: “TRILL as defined today supports only the 12-bit C-VLAN.”


New subsection for VDP

Data Center Bridging results are not mentioned in the related work. As its focus is clearly data center networks, and VDP addresses Virtual Machine migration, it is within the scope of this document. It would be good to add a paragraph on VDP, e.g. along these lines:

4.8 VDP

VDP is the Virtual Station Interface (VSI) Discovery and Configuration Protocol specified by IEEE P802.1Qbg. VDP supports the association of a VSI with a port. It runs between the end system (e.g. a hypervisor) and its adjacent switch, i.e. the device on the edge of the network. VDP is used, for example, to communicate to the switch that a Virtual Machine (Virtual Station) is moving, i.e. it is designed to support VM migration.


5. Further Work

“to reduce the overall amount of flooding and other multicast and broadcast related traffic (e.g, ARP and ND) currently experienced within current data centers with a large flat L2 network.” This is not necessarily the case: a properly constructed large L2 network based on PBB has no such issues. It would be better to remove “currently experienced within current data centers with a large flat L2 network”.



Best regards,
János



_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3