Inline at [David-2]. Thanks, --David
From: Linda Dunbar [mailto:[email protected]]
Sent: Thursday, September 18, 2014 5:09 PM
To: Black, David; [email protected]
Cc: Yakov Rekhter
Subject: RE: draft-ietf-nvo3-vm-mobility-issues - Technical WG LC comments

David,

Thank you very much for the detailed comments. Please see the replies inserted below:

-----Original Message-----
From: nvo3 [mailto:[email protected]] On Behalf Of Black, David

-- Section 2.1 Terminology

   In this document the term "Top of Rack Switch (ToR)" is used to
   refer to a switch in a data center that is connected to the servers
   that host VMs.

That's a rather poor choice of terminology, as when there are two or more layers of switching in a rack, a switch at the top of the rack is not actually a "Top of Rack Switch (ToR)" as defined by this draft. Something based on "First Hop" would be better than "Top of Rack" - and see below on whether this draft is assuming external NVEs.

[Linda] Like it or not, the term "Top of Rack Switch (ToR)" is commonly used in data centers, referring to the switches connected to servers. It is true that some servers have blade switches and some servers have virtual switches. The NVE could be on a ToR, a virtual switch, or a blade switch. Those switches and the like make up the "L2 physical domain" attached to an NVE.

[David-2] There’s an important but subtle distinction here. I agree that servers are often connected directly to ToR switches, but there are exceptions. A definition has to be broadly applicable - it’s not sufficient to cover only the common case.

[David-2] Defining the ToR switch as above results in the ToR switch, as defined by this draft, not being the switch at the top of the rack in data centers where there is switching in the rack (blade servers are an example) - in such data centers, the servers are not directly connected to the ToR switches. That’s the problem - the definition needs to be comprehensively applicable, and is flawed when there is switching in the rack.
A different term should be used to designate the switch that is connected to the servers that host VMs, so that it covers all the cases.

Nit: In the figure: "DBCR" -> "DCBR" (twice).

[Linda] Thanks.

I'm concerned about the implied requirement for a DCBR as the means of external data center connectivity, as it implies an IP (L3) forwarding hop at the data center boundary, and there are multi-data-center structures in which that does not happen.

[Linda] The figure shows the most common network design of most DCs today. I am sure you can find some DCs that are not designed this way.

[David-2] There’s only one use of the “DCBR” acronym outside of Section 2.1 - Section 3.4 makes this parenthetical comment: “(the router of the VM's L2-based CUG may be either DCBR or ToR itself).” I suggest deleting that parenthetical comment from Section 3.4 rather than trying to figure out how to generalize it to cover all the cases.

I don't see any mention of software switches in hypervisors in this section. I think this draft is assuming that NVEs are external, i.e., not located in the server - e.g., otherwise Section 3.3 doesn't make much sense. If so, that crucial assumption needs to be stated.

[Linda] We can add a statement saying that NVEs can be on a server as a virtual switch, or on a blade switch.

[David-2] That misses the point. Section 3.3 assumes that the NVE is not in a virtual switch in the server, and I think 3.1 does as well, although 3.1 is somewhat unclear to me. Adding that statement will not suffice.

The term Closed User Group (CUG) appears to be close to, if not the same as, "virtual network" - use of the latter term would be significantly clearer.

[Linda] Yes, CUG → VN.

-- Section 3.1 Usage of VLAN-IDs

This appears to be a discussion of background that doesn't state a problem, hence does not belong in Section 3 (Problem Statement). Perhaps this should be Section 2.2?

[Linda] This section describes the issues with the traditional way of using VLANs to represent L2 VNs.
   To support tens of thousands of virtual networks, the local VID
   associated with client payload under each NVE has to be locally
   significant. Therefore, the same L2-based VN MAY have different
   VLAN-IDs under different NVEs.

[David-2] This assumes that the NVE is not in the server - that assumption needs to be stated, see above.

   Thus when a given VM moves from one non-trivial L2 physical domain
   to another, the VLAN-ID of the traffic from/to the VM in the former
   may be different than in the latter, and thus cannot be assumed to
   stay the same. For data frames traversing the underlay network, if
   the ingress NVE simply encapsulates an outer header onto data frames
   received from VMs and forwards the encapsulated data frames to the
   egress NVE via the underlay network, the egress NVE can't simply
   decapsulate the outer header and send the decapsulated data frames
   to attached VMs, as done by TRILL.

[David-2] If the NVE is in the virtual switch in the server, the NVE absolutely, positively can do exactly that, hence the assumption about NVE location needs to be clearly stated.

Nit:

   This document assumes that within a given non-trivial L2 physical
   domain, traffic from/to VMs that are in that domain and belong to
   the same L2-based CUG MUST have the same VLAN-ID.

That's true on a per-link basis, but not L2-physical-domain-wide, although using constant VLAN-IDs domain-wide is a common practice because it's simple. Perhaps the assumption should be stated on a per-link basis, followed by a "common practice" observation.

[Linda] The section is to emphasize the problem of using a common VLAN-ID for the VNs. How about we use the text above instead?

[David-2] Maybe - I want to see the entire proposed rewrite of Section 3.1 along with any text on NVE location assumptions in order to comment further.
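To make the locally-significant-VID point above concrete, here is an illustrative sketch (not from the draft; the names VNI_BLUE, nve1_vid_to_vni, etc. are hypothetical) of why an egress NVE with its own VLAN-ID table cannot simply decapsulate and forward: the same VN is known by different VLAN-IDs under different NVEs, so the egress NVE has to re-tag from its own mapping.

```python
VNI_BLUE = 5001  # one virtual network, identified in the overlay by its VNI

# Each NVE keeps its own VLAN-ID <-> VNI mapping; a VID is significant
# only on the links below that NVE.
nve1_vid_to_vni = {100: VNI_BLUE}
nve2_vid_to_vni = {200: VNI_BLUE}  # same VN, different local VLAN-ID

def encapsulate(nve_map, frame_vid):
    """Ingress NVE: strip the local VLAN tag, carry the VNI in the outer header."""
    return {"vni": nve_map[frame_vid]}

def decapsulate(nve_map, packet):
    """Egress NVE: re-tag with *its own* local VLAN-ID for that VNI."""
    vni_to_vid = {vni: vid for vid, vni in nve_map.items()}
    return vni_to_vid[packet["vni"]]

pkt = encapsulate(nve1_vid_to_vni, 100)          # enters as VLAN 100 under NVE1
assert decapsulate(nve2_vid_to_vni, pkt) == 200  # exits as VLAN 200 under NVE2
```

A TRILL-style blind decapsulation would emit the frame still carrying VID 100, which is meaningless (or worse, belongs to a different VN) in the egress NVE's L2 physical domain.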
I'm missing the point of the 3rd paragraph, specifically the "contrast" in this text:

   In other words, the VLAN-IDs used by a tagged VM network interface
   are part of the VM's state and cannot be changed when the VM moves
   from one L2 physical domain to another, even though it is possible
   for an entity, such as hypervisor virtual switch, to change the
   VLAN-ID from the value used by NVE to the value expected by the VM
   (in contrast, a VLAN tag assigned by a hypervisor for use with an
   untagged VM network interface can change).

Both hardware and software switches can add, remove and map VLAN-IDs, so the VLAN-ID used by the VM can be limited to the link between the VM and the first switch.

[Linda] Correct. How about changing the text to:

   In other words, the VLAN-IDs used by a tagged VM network interface
   are part of the VM's state and may not be changed when the VM moves
   from one L2 physical domain to another. Therefore, it is necessary
   for an entity, such as hypervisor virtual switch, to change the
   VLAN-ID from the value used by NVE to the value expected by the VM
   (in contrast, a VLAN tag assigned by a hypervisor for use with an
   untagged VM network interface can change).

[David-2] “It is necessary” is too simple. I would add “when the NVE does not use the VLAN-ID expected by the VM” before “it is necessary”.

I don't understand this sentence:

   If the L2 physical domain is extended to include VM tagged
   interfaces, the hypervisor virtual switch, and the DC bridged
   network, then special consideration is needed in assignment of VLAN
   tags for the VMs, the L2 physical domain and other domains into
   which the VM may move.

What is meant by "special consideration"?

[Linda] The “special consideration” should be “having an entity, such as the hypervisor virtual switch, change the VLAN-ID from the value used by the NVE to the value expected by the VM”.

[David-2] I’d suggest merging this sentence with the new sentence proposed above.
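The translation being discussed here (a hypervisor virtual switch rewriting between the NVE-side VLAN-ID and the one a tagged VM interface expects, only when they differ) can be sketched as follows. This is an illustrative simplification; the parameter names vm_vid and nve_vid are assumptions, not draft terminology.

```python
def to_vm(frame_vid, nve_vid, vm_vid):
    """NVE -> VM direction: rewrite the tag only when the NVE-side VID
    differs from the VID the tagged VM interface expects."""
    if frame_vid == nve_vid and nve_vid != vm_vid:
        return vm_vid
    return frame_vid

def to_nve(frame_vid, vm_vid, nve_vid):
    """VM -> NVE direction: the inverse rewrite."""
    if frame_vid == vm_vid and vm_vid != nve_vid:
        return nve_vid
    return frame_vid

# After a move, the new L2 physical domain uses VID 200 under its NVE,
# while the VM still expects its original VID 100 (part of the VM's state).
assert to_vm(200, nve_vid=200, vm_vid=100) == 100
assert to_nve(100, vm_vid=100, nve_vid=200) == 200

# When the NVE happens to use the VID the VM expects, no rewrite occurs.
assert to_vm(100, nve_vid=100, vm_vid=100) == 100
```

This matches David's qualification: the rewrite is needed only "when the NVE does not use the VLAN-ID expected by the VM".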
Nit:

   This document assumes that within a given non-trivial L2 physical
   domain, traffic from/to VMs that are in that domain and belong to
   different L2-based CUGs MUST have different VLAN-IDs.

Again, per-link + "common practice", see above.

-- Section 3.2 Maintaining Connectivity in the Presence of VM Mobility

Again, I don't see a problem stated here. There's also significant overlap between this section and both:

- Section 3.2 of the problem statement draft, and
- Section 3.3 of the framework draft.

[Linda] None of the drafts above mentions that a VM's ARP cache gets flushed once the VM moves to another server.

[David-2] Only for cold migration, as the draft text suggests. The framework draft does mention ARP effects for “hot” mobility, although this refers primarily to use of ARP or RARP:

   Solutions to maintain connectivity while a VM is moved are necessary
   in the case of "hot" mobility. This implies that connectivity among
   VMs is preserved. For instance, for L2 VNs, ARP caches are updated
   accordingly.

What is being added here that isn't already covered in those sections of those drafts?

-- Section 3.3 Layer 2 extension

This text is hard to parse:

   Consider a scenario where a VM that is a member of a given L2-based
   CUG moves from one server to another, and these two servers are in
   different L2 physical domains, where these domains may be located in
   the same or different data centers. In order to enable communication
   between this VM and other VMs of that L2-based CUG, the new L2
   physical domain must become interconnected with the other L2
   physical domain(s) that presently contain the rest of the VMs of
   that CUG, and the interconnect must not violate the L2-based CUG
   requirement to preserve source and destination MAC addresses in the
   Ethernet header of the packets exchanged between this VM and other
   members of that CUG.

I'm missing the point here - two NVEs and an overlay-based encapsulation should solve this.
Also, "must become" and "presently" are wrong, as the interconnect may exist prior to the move. And I think this text assumes an external NVE (i.e., not in the server), see above.

[Linda] This paragraph is to emphasize that VM mobility causes changes to the NVEs that interconnect the VN.

[David-2] Really?? I don’t see the term NVE anywhere in that paragraph. If that was the point, the paragraph needs a complete rewrite. I’d like to see that rewrite before commenting further.

There seems to be significant overlap with Section 3.4 of the problem statement draft, and I'm not clear on what this draft is adding.

-- Section 3.4 Optimal IP Routing

This has massive overlap with Section 3.7 of the problem statement draft. I'm not sure what the text in this draft adds beyond what's already in the problem statement.

-- Section 3.5 Preserving Policies

This appears to be covered (albeit in less detail) in Section 3.3 of the framework draft.

[Linda] Those policies refer to the zone policies, FW policies, etc., that the framework draft hasn't covered.

[David-2] OK, there are policies beyond those that “define connectivity among VMs” that need to be preserved.

Thanks, --David

----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA 01748
+1 (508) 293-7953  FAX: +1 (508) 293-7786
[email protected]<mailto:[email protected]>  Mobile: +1 (978) 394-7754
----------------------------------------------------

_______________________________________________
nvo3 mailing list
[email protected]<mailto:[email protected]>
https://www.ietf.org/mailman/listinfo/nvo3
