Dear Authors,

I have some comments on the draft; please find the detailed comments below, under the section headers they belong to.

Many of these comments relate to L2 problems mentioned in the draft. These problems are out of date because they do not take the state of the art into account. I think it is not useful to refer to old technology; it is more beneficial for future development and research to consider the state of the art. Furthermore, I see neither value nor need in mentioning L2 problems (especially invalid ones) to motivate the NVO3 work. Please note that it would be very easy to avoid discussion of whether or not a problem is valid for L2 by simply not mentioning any L2 problem in the drafts.


2.2. Virtual Machine Mobility Requirements

“In traditional data centers, …”
I guess “traditional data center” here means a data center where virtualization is not applied on the servers, i.e. there are no virtual machines. However, “traditional” may mean different things to different people. I think it would be useful to be entirely clear and use a more exact term, e.g. “In data centers without server virtualization …”.


2.4. Inadequate Forwarding Table Sizes in Switches

“This places a much larger demand on the switches' forwarding table capacity compared to non-virtualized environments, causing more traffic to be flooded or dropped when the addresses in use exceeds the forwarding table capacity.” Today’s switches are able to provide network virtualization and thus address this problem. It would be better to replace the title of the section, e.g. with “Forwarding Table Size Considerations” or “Inadequate Forwarding Table Sizes in Switches not Providing Network Virtualization”. Accordingly, the above sentence could be replaced, e.g. with “This places a demand on the switches to apply network virtualization in order to keep forwarding tables tractable.”


2.5. Decoupling Logical and Physical Configuration

“However, in order to limit the broadcast domain of each VLAN, multi-destination frames within a VLAN should optimally flow only to those devices that have that VLAN configured.” This is already solved today; that is how L2 virtual overlays work: a broadcast is limited to its VLAN. Therefore, it would be better not to list this as a problem, e.g. by removing the sentence.


2.7. Communication Between Virtual and Traditional Networks

I’m afraid “Traditional Networks” is not specific enough. Note that network virtualization was already available in the last century. If “traditional” means “old”, that does not imply that the old network did not provide network virtualization (and “old” is not exact either). It would be better to use another term, e.g. change the section header to “Communication Between Virtual and Non-virtualized Networks”, and remove the word “traditional” from the section as well.

“Additional identification, such as VLAN tags, could be used on the non-virtualized side of such a gateway to enable forwarding of traffic for multiple virtual networks over a common non-virtualized link.” This sentence seems to be self-contradictory. As described in the first paragraph of Section 3.1, VLANs implement virtual networks identified by VLAN tags. If there is no network virtualization on the non-virtualized side of a gateway, then VLANs are not available on that side either. The simplest resolution of the issue would be to remove the sentence.


2.8. Communication Between Virtual Networks

NVE (Network Virtualization Edge) is not expanded at its first appearance in the document.


2.9. Overlay Design Characteristics

“There are existing layer 2 overlay protocols in existence, but they were not necessarily designed to solve the problem in the environment of a highly virtualized data center.” Scalability was a key design principle for the Layer 2 overlay solutions available today, e.g. Provider Backbone Bridges (PBB). That is, Layer 2 overlay solutions also address provider backbone networks, whose customers are often themselves providers of services to their own customers; therefore, network virtualization is applied. Although most of these protocols were not designed for data centers in the first place, the overlay solutions provided today by 802.1Q are satisfactory for highly virtualized data centers due to the scalability they provide. Moreover, there are Layer 2 protocols clearly designed for data centers; please refer to http://www.ieee802.org/1/pages/dcbridges.html. Furthermore, keeping the first sentence in this section suggests that the existing L2 overlay protocols do not address the bullets listed in the section, which is not the case. Therefore, it would be better to remove the sentence: “There are existing layer 2 overlay protocols in existence, but they were not necessarily designed to solve the problem in the environment of a highly virtualized data center.”

“The first hop switch that adds and removes the overlay header will require new equipment and/or new software.” A new switch is not required if the existing technology base is used in the first hop switch. Deployed standard Ethernet switches support the overlays specified by 802.1Q, e.g. PBB; it is not necessary to replace them. It would be better to remove the sentence “The first hop switch that adds and removes the overlay header will require new equipment and/or new software.”

Bullet 4 also says “Work with existing, widely deployed network Ethernet switches and IP routers without requiring wholesale replacement.”, which contradicts the second sentence, besides being inexact. (Replacement of the first hop switches is not wholesale, but it is still a replacement.) The term “widely deployed” is not exact. What standards are these “widely deployed” Ethernet switches and IP routers compliant to? Are they only compliant to old (potentially expired/superseded) standards? If one has a network built upon an old technology and is not satisfied with it, then some upgrade is needed, as hinted by the second sentence: either to a more up-to-date standard or to one that will be specified in the future.
The simplest way to resolve these issues is to remove bullet 4.


3.1. Limitations of Existing Virtual Network Models

“A VLAN is an L2 bridging construct that provides some of the semantics of virtual networks mentioned above.” VLANs as of today provide all the semantics described by bullets 1 and 2 right above. Note that VLANs cover B-VLANs, I-VLANs identified by I-SIDs, S-VLANs and C-VLANs.
Therefore, it would be better to remove “some of” from the sentence.

This section says:
“But there are problems and limitations with L2 VLANs. VLANs are a pure L2 bridging construct and VLAN identifiers are carried along with data frames to allow each forwarding point to know what VLAN the frame belongs to.” This seems to express that carrying the virtual network identifier along with the data frames is a problem. It appears to conflict with Section 3.2, which says: “With the overlay, a virtual network identifier (or VNID) can be carried as part of the overlay header so that every data frame explicitly identifies the specific virtual network the frame belongs to.” Furthermore: “In the data plane, an overlay header provides a place to carry either the VNID, or a locally-significant identifier. In both cases, the identifier in the overlay header specifies which virtual network the data packet belongs to.” This text suggests that it is not a problem to carry the virtual network identifier in the overlay header. Therefore, it would be beneficial to resolve this internal conflict within the document, e.g. by removing the statement that the L2 feature is a problem, i.e. erasing “But there are problems and limitations with L2 VLANs.” from 3.1.
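The equivalence being argued here can be illustrated with a small sketch (illustrative only: the VLAN TCI layout follows the 802.1Q tag format, while the overlay header below is a generic example carrying a 24-bit VNID, not any specific protocol's wire format). Both mechanisms carry the virtual network identifier inside every frame:

```python
import struct

# Illustrative only: a VLAN tag and an overlay header both carry the
# virtual-network identifier in the frame. The TCI layout (3-bit PCP,
# 1-bit DEI, 12-bit VID) follows 802.1Q; the overlay header is a
# generic 3-byte VNID field, not a specific protocol's format.

def vlan_tci(pcp, dei, vid):
    """Pack an 802.1Q Tag Control Information field (16 bits)."""
    assert 0 <= vid < 2 ** 12
    return struct.pack("!H", (pcp << 13) | (dei << 12) | vid)

def overlay_header(vnid):
    """Pack a hypothetical overlay header carrying a 24-bit VNID."""
    assert 0 <= vnid < 2 ** 24
    return struct.pack("!I", vnid)[1:]  # keep the low 3 bytes

print(vlan_tci(0, 0, 100).hex())   # '0064' -> VID 100
print(overlay_header(100).hex())   # '000064' -> VNID 100
```

In both cases each forwarding point can read the virtual-network identifier directly from the frame, which is the point of the quoted Section 3.2 text.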

“A VLAN today is defined as a 12 bit number, limiting the total number of VLANs to 4096 (though typically, this number is 4094 since 0 and 4095 are reserved). Due to the large number of tenants that a cloud provider might service, the 4094 VLAN limit is often inadequate. In addition, there is often a need for multiple VLANs per tenant, which exacerbates the issue.” Existing Layer 2 overlays have not been limited to 12 bits for some time. As of today, 60 bits are provided for network virtualization by L2 (12-bit B-VID + 24-bit I-SID + 12-bit S-VID + 12-bit C-VID); among these, the virtual overlay provided by the 24-bit I-SID clearly does not have the 4K limit mentioned in the text. As the 4K limit no longer exists, it would be better to remove text stating it as a problem from the draft, e.g. from the paragraph above.
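The arithmetic behind these figures can be checked with a short sketch (the bit widths are the 802.1Q tag fields listed above):

```python
# Sizes of the virtual-network ID spaces discussed above.
# Bit widths per IEEE 802.1Q tag fields: B-VID (12), I-SID (24),
# S-VID (12), C-VID (12).

def id_space(bits):
    """Number of distinct identifiers a field of `bits` bits can carry."""
    return 2 ** bits

c_vlan = id_space(12)               # classic VLAN tag: 4096 values
i_sid = id_space(24)                # PBB service ID: ~16.8 million values
combined_bits = 12 + 24 + 12 + 12   # B-VID + I-SID + S-VID + C-VID

print(c_vlan)          # 4096 (4094 usable: 0 and 4095 are reserved)
print(i_sid)           # 16777216
print(combined_bits)   # 60
```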

“This means that the limit of 4096 VLANs is associated with an individual tenant service edge” Note that the 4K limit is being eliminated for E-VPN as well; please refer to I-D.ietf-l2vpn-pbb-evpn. It would be good to update the text to be in line with E-VPN today, e.g. by removing “limit of 4096 VLANs” for E-VPN.


3.2. Benefits of Network Overlays

“The use of a sufficiently large VNID would address current VLAN limitations associated with single 12-bit VLAN tags.” The 12-bit limitation is not a current limitation; it was a limitation only years ago. Current VLANs are specified by the current 802.1Q (http://standards.ieee.org/getieee802/download/802.1Q-2011.pdf). The I-SID is 24 bits and specifies a network overlay (also referred to as an I-VLAN) that clearly does not have the 12-bit limitation. I think the I-SID is a sufficiently large VNID.
It would be better to remove the sentence.

“a VM can now be located anywhere in the data center that the overlay reaches without regards to traditional constraints implied by L2 properties such as VLAN numbering, or the span of an L2 broadcast domain scoped to a single pod or access switch.” These L2 limitations no longer exist; please check e.g. PBB in 802.1Q. The VM can be anywhere in the data center, e.g. in any pod; the 24-bit I-VLAN does not seem to be limiting, and the L2 broadcast domain is not limited to a single pod or access switch: it can span several pods without any problem. It would be better to remove “without regards to traditional constraints implied by L2 properties such as VLAN numbering, or the span of an L2 broadcast domain scoped to a single pod or access switch.”


3.3. Overlay Networking Work Areas

“One approach is to build mapping tables entirely via learning (as is done in 802.1 networks). But to provide better scaling properties, a more sophisticated approach is needed, i.e., the use of a specialized control plane protocol.” It is not clear here why a control plane protocol scales better than learning with respect to the size of mapping tables in the edge nodes. For proper operation, the mapping tables need to have exactly the same entries whether they are built up by a control protocol or by data-plane learning. (Note that no entry is required at all in a point-to-point case.) Note further that the “address mapping dissemination problem” mentioned later in the document does not exist in the case of learning. Perhaps the disadvantages of a control plane protocol dedicated to maintaining the mapping tables in all NVEs should be listed too, e.g. complexity, convergence and VM migration issues.
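The observation that learning and a control protocol must converge to the same table contents can be illustrated with a minimal sketch (hypothetical names, not from the draft). Here the mapping table is filled by observing the source addresses of arriving frames; a control-plane protocol would have to install exactly the same entries:

```python
# Minimal sketch of data-plane learning at an overlay edge node
# (hypothetical example, not taken from the draft). The table binds an
# inner (tenant) address to the locator it was last seen behind; a
# control-plane protocol would need to install the same entries.

mapping_table = {}  # inner address -> locator (e.g. remote edge address)

def learn(inner_src, locator):
    """Data-plane learning: remember where a source address was seen."""
    mapping_table[inner_src] = locator

def lookup(inner_dst):
    """Return the locator for a destination, or None (flood) if unknown."""
    return mapping_table.get(inner_dst)

# Frames arriving from two remote edges populate the table:
learn("vm-a", "edge-1")
learn("vm-b", "edge-2")

print(lookup("vm-a"))  # edge-1
print(lookup("vm-c"))  # None -> unknown destination would be flooded
```

Either way the table ends up with one entry per remote address in active use, which is why the table-size argument alone does not favour a control protocol.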

“a standardized interaction between the NVE and hypervisor may be needed, for example in the case where the NVE resides on a separate device from the VM.” VDP is such a standardized interaction: it is designed to communicate from the end system to the device on the edge of the network that a VM is moving. The current text implies that there is no standardized protocol for this. It would be better to make the existence of a standard protocol clear, e.g. by referring to VDP.


4.1. IEEE 802.1aq - Shortest Path Bridging

“SPB is entirely L2 based, extending the L2 Ethernet bridging model.”
SPB extends IS-IS so that it can control L2 as well; SPB does not remove any IP capabilities from IS-IS, so everything is still available. It would be better to rephrase the sentence, e.g. to “SPB extends IS-IS in order to perform L2 control in addition to the existing IS-IS capabilities.”


4.3. TRILL

“TRILL is an L2-based approach aimed at improving deficiencies and limitations with current Ethernet networks and STP in particular.” The limitations discussed here are not limitations of current Ethernet, especially given that there has been no active standard specifying STP since 2004. (Current Ethernet is specified by the active standards: http://standards.ieee.org/develop/wg/WG802.1.html; STP is only supported by means of backwards compatibility.) Furthermore, TRILL defines its own header format instead of using the existing L2 headers specified by 802.1Q, so it is not clearly an L2 approach. It would be better to rephrase the sentence, e.g. “TRILL is a Local Area Network protocol that establishes forwarding paths using IS-IS routing and encapsulates traffic with its own TRILL header.”

“Although it differs from Shortest Path Bridging in many architectural and implementation details, it is similar in that is provides an L2 based service to end systems.”
There is no need to mention SPB here; Section 4.1 is about SPB. It would be better to remove SPB from this sentence, e.g. replace it with: “It provides an L2 based service to end systems.”

“TRILL as defined today, supports only the standard (and limited) 12-bit VLAN model.” The standard (802.1Q) is not limited to a 12-bit VLAN model, as described above; there is e.g. the 24-bit I-SID. It would be better to update the sentence, e.g. to: “TRILL as defined today supports only the 12-bit C-VLAN.”


New subsection for VDP

Data Center Bridging results are not mentioned in the related work. As its focus is clearly data center networks, and VDP addresses Virtual Machine migration, it is within the scope of this document. It would be good to add a paragraph on VDP, e.g. along these lines:

4.8 VDP

VDP is the Virtual Station Interface (VSI) Discovery and Configuration Protocol specified by IEEE P802.1Qbg. VDP supports the association of a VSI with a port. It runs between the end system (e.g. a hypervisor) and its adjacent switch, i.e. the device on the edge of the network. VDP is used, for example, to communicate to the switch that a Virtual Machine (Virtual Station) is moving, i.e. it is designed to support VM migration.


5. Further Work

“to reduce the overall amount of flooding and other multicast and broadcast related traffic (e.g, ARP and ND) currently experienced within current data centers with a large flat L2 network.” This is not necessarily the case: a properly constructed large L2 network based on PBB has no such issues. It would be better to remove “currently experienced within current data centers with a large flat L2 network”.



Best regards,
János



_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3