On 31/03/14 21:31, Black, David wrote:
> Anton,
>
> Thanks for the reply - it helps clarify things.
I agree, there may be a need for a new Lightweight/Container-related
section, and/or a couple of items referring to it. It is mostly about
terminology and about removing a couple of unnecessary restrictions here
and there (probably unintentional ones, too). Otherwise the architecture
is nearly there; all it needs is some extra material on lightweight and
corner Type 2 cases. I will try to formulate a couple of specific
change-sets and unicast them to the draft authors.

A.

>> 1. VM case - It is the hypervisor side of the vNIC, and it is a classic
>> L2 service.
>>
>> See most recent code drop:
>> http://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05830.html
>>
>> It is very clear what it is - it is an L2 service attached through the
>> vNIC on the hypervisor side. The VM has _NO_ code whatsoever, of any
>> shape or form, that is service related. It is the same as if it were
>> attached to a vswitch.
>
> That should fit the current architecture and framework drafts ...
>
>>>> Section 3.1 - Presenting unicast point-to-point as an Ethernet is a
>>>> hideously ugly hack which does not necessarily need to be there. Its
>>>> sole reason to exist is saving IP addresses in hosting. It can and
>>>> should be augmented at a later date with native tunneling interface
>>>> drivers where applicable.
>
> The L2 service is defined by analogy to Ethernet, and to C-VLANs in
> particular. If this is the "same as if it was attached to a vswitch",
> then the vswitch is at least Ethernet-like. Please suggest specific text
> changes, and also see Section 2.3.1 in the framework draft, which may
> also be impacted.
>
>>>> Section 3.2 - We have posted the code that does the encapsulation and
>>>> decapsulation in the vNIC. The code has existed for containers for
>>>> quite a few years now. The architecture diagram is generic enough to
>>>> accommodate it - all it is is colocating the VAP with the Tenant in
>>>> the Tenant vNIC. No rocket science. The text, however, is too
>>>> prohibitive in its present form.
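[The code drop linked above is the L2TPv3 netdev backend for QEMU. A
rough sketch of the "VAP in the vNIC" idea follows - hedged: the
addresses, ports, session IDs and image name are illustrative, and the
option names should be checked against the QEMU version in use:]

```shell
# Attach a guest NIC to a static L2TPv3-over-UDP pseudowire.
# Encap/decap happens entirely in the hypervisor-side netdev; the
# guest sees a plain Ethernet and carries no service-specific code.
qemu-system-x86_64 \
  -m 1024 -drive file=guest.img,if=virtio \
  -netdev l2tpv3,id=overlay0,src=192.0.2.1,dst=192.0.2.2,udp=on,srcport=1701,dstport=1701,txsession=10,rxsession=10 \
  -device virtio-net-pci,netdev=overlay0,mac=52:54:00:12:34:56
```

[The guest-visible device is an ordinary virtio NIC; the overlay
terminates at the hypervisor-side half of the vNIC, which is the point
being argued about Section 3.2.]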
>>>>
>>>> Section 3.3 - Same. I believe we have demonstrated how you can locate
>>>> (without the "co") the NVE function in an off-host big fat
>>>> router/switch of your choice, with the VAPs being colocated with the
>>>> VMs. We have also open-sourced the relevant kvm code for the VAP
>>>> portion, so everyone can use that now.
>
> Please suggest specific text changes. The framework draft already says:
>
>    Note that some NVE functions (e.g., data plane and control plane
>    functions) may reside in one device or may be implemented separately
>    in different devices. For example, the NVE functionality could
>    reside solely on the End Devices, or be distributed between the End
>    Devices and the ToRs.
>
> The architecture draft should not be prohibiting that latter distributed
> approach, although I'm not sure what text you're objecting to.
>
>> 2. Bare metal case. For containers, the kernel itself _IS_ the
>> hypervisor. Representing an overlay interface as an "Ethernet" in a
>> container is a classic L2 service. Same story as above - the VM (in
>> this case an LXC container, a Solaris container or a BSD jail) sees an
>> Ethernet and has _NO_ service-specific code. It cannot, in fact,
>> distinguish e.g. a GRE, L2TPv3 or VXLAN interface given by the kernel
>> from a vETH coming from a switch, or from a tap hooked up into OVS.
>
> That's not consistent with the architecture draft's description of bare
> metal servers in Section 5.2, where there is no NVO3 functionality on
> the server:
>
>    Many data centers will continue to have at least some servers
>    operating as non-virtualized (or "bare metal") machines running a
>    traditional operating system and workload. In such systems, there
>    will be no NVE functionality on the server, and the server will have
>    no knowledge of NVO3 (including whether overlays are even in use).
>
> Section 5.2 was intended to cover the case where the TS is the entire
> physical server.
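[The container case described above can be shown with nothing but stock
iproute2 - no NVO3-specific code inside the container. A minimal sketch,
assuming root, a kernel with VXLAN support, and purely illustrative
names, VNI and addresses (tenant1, vx0, 192.0.2.x, 10.0.0.1):]

```shell
# The host kernel plays the 'hypervisor': it owns the overlay
# encap/decap; the container just receives an Ethernet interface.
ip netns add tenant1                      # stand-in for the container
ip link add vx0 type vxlan id 5001 \
      local 192.0.2.1 remote 192.0.2.2 dstport 4789
ip link set vx0 netns tenant1             # hand the interface to the tenant
ip netns exec tenant1 ip link set vx0 up
ip netns exec tenant1 ip addr add 10.0.0.1/24 dev vx0
# From inside tenant1, vx0 is indistinguishable from a veth or a tap.
```

[Substituting `type gretap` or an `ip l2tp` session for the VXLAN
interface changes the encap without changing anything the container
sees, which is the indistinguishability point made above.]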
> Here, it appears that vETH interfaces are analogous to vNICs and each
> TS is the container, so there is both virtualization and NVE
> functionality on the server. I think that focusing on Section 5.2 to
> cover the use of containers to provide multiple TSs on a single
> physical server is "digging in the wrong place" - this discussion
> probably belongs in Section 4.2 on split-NVEs, or possibly in a new 4.x
> section, as the examples in that section move encap outboard, but this
> email thread is discussing a different functional split.
>
> Thanks,
> --David
>
>> -----Original Message-----
>> From: Anton Ivanov (antivano) [mailto:[email protected]]
>> Sent: Monday, March 31, 2014 12:35 PM
>> To: Black, David
>> Cc: [email protected]
>> Subject: Re: Proposal for some minor amendments of the arch draft
>>
>> On 31/03/14 17:22, Black, David wrote:
>>> Anton,
>>>
>>> This doesn't look like a minor amendment. It looks like the NVE is
>>> being split across the TS and the NVO3 infrastructure, with the
>>> encap/decap processing moved into the TS.
>>>
>>> When a VM is "attach[ed] directly to an overlay", the service provided
>>> to that VM (with the encap/decap code in the VM, or on the bare metal
>>> server, per your 5.2 comments) is neither L2 nor L3 - it's something
>>> new, and NVO3-specific, as opposed to a virtualization of a native
>>> network service.
>>
>> I think you missed my point.
>>
>> 1. VM case - It is the hypervisor side of the vNIC, and it is a classic
>> L2 service.
>>
>> See most recent code drop:
>> http://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05830.html
>>
>> It is very clear what it is - it is an L2 service attached through the
>> vNIC on the hypervisor side. The VM has _NO_ code whatsoever, of any
>> shape or form, that is service related. It is the same as if it were
>> attached to a vswitch.
>>
>> 2. Bare metal case. For containers, the kernel itself _IS_ the
>> hypervisor.
>> Representing an overlay interface as an "Ethernet" in a container is a
>> classic L2 service. Same story as above - the VM (in this case an LXC
>> container, a Solaris container or a BSD jail) sees an Ethernet and has
>> _NO_ service-specific code. It cannot, in fact, distinguish e.g. a GRE,
>> L2TPv3 or VXLAN interface given by the kernel from a vETH coming from a
>> switch, or from a tap hooked up into OVS.
>>
>>> The real issue here is "where is the service boundary?". NVO3 puts
>>> the service boundary at the interface presented *to* the vNIC (or
>>> pNIC) by the infrastructure,
>>
>> Yes? And? As I said - for containers, the host kernel _IS_ the
>> infrastructure, and the container is getting an L2 service.
>>
>> As for VMs and our proposed amendment: if you decap directly on the
>> hypervisor side of the vNIC, it is exactly the minor amendment as
>> proposed.
>>
>> A.
>>
>>> not at some higher layer in the TS (e.g., VM).
>>>
>>> Thanks,
>>> --David
>>>
>>>> -----Original Message-----
>>>> From: nvo3 [mailto:[email protected]] On Behalf Of Anton Ivanov
>>>> (antivano)
>>>> Sent: Tuesday, March 04, 2014 3:11 AM
>>>> To: [email protected]
>>>> Subject: [nvo3] Proposal for some minor amendments of the arch draft
>>>>
>>>> Hi all,
>>>>
>>>> We have now published the code which allows a VM to attach directly
>>>> to an overlay and decouples the virtual switch from the host (in fact
>>>> the switch can now become physical). This has been possible using
>>>> containers all along, so there is nothing new and revolutionary in
>>>> this.
>>>>
>>>> Based on that (as well as on working with that codebase for a while)
>>>> I have a few comments:
>>>>
>>>> Section 3.1 - Presenting unicast point-to-point as an Ethernet is a
>>>> hideously ugly hack which does not necessarily need to be there. Its
>>>> sole reason to exist is saving IP addresses in hosting. It can and
>>>> should be augmented at a later date with native tunneling interface
>>>> drivers where applicable.
>>>>
>>>> Section 3.2 - We have posted the code that does the encapsulation and
>>>> decapsulation in the vNIC. The code has existed for containers for
>>>> quite a few years now. The architecture diagram is generic enough to
>>>> accommodate it - all it is is colocating the VAP with the Tenant in
>>>> the Tenant vNIC. No rocket science. The text, however, is too
>>>> prohibitive in its present form.
>>>>
>>>> Section 3.3 - Same. I believe we have demonstrated how you can locate
>>>> (without the "co") the NVE function in an off-host big fat
>>>> router/switch of your choice, with the VAPs being colocated with the
>>>> VMs. We have also open-sourced the relevant kvm code for the VAP
>>>> portion, so everyone can use that now.
>>>>
>>>> Section 5.2 - Overly restrictive. What exactly prevents me, even
>>>> today, from originating a GRE tunnel (which is in the current arch)?
>>>> Nothing, besides the sysadmin not being bothered to read "man ip". If
>>>> the protocol is supported, and if it can be presented as a vNIC,
>>>> there is nothing to restrict the physical host from originating the
>>>> overlay.
>>>>
>>>> I am not going to barge in with "here we have this wonderful new
>>>> protocol" and support it with marketing slides. However, I also have
>>>> to note - you have missed a wonderful _OLD_ protocol - L2TPv3 static
>>>> tunnels - and we have a working implementation which has been
>>>> open-sourced. We will be augmenting that with GRE at a later date,
>>>> too.
>>>>
>>>> Well,
>>>> _______________________________________________
>>>> nvo3 mailing list
>>>> [email protected]
>>>> https://www.ietf.org/mailman/listinfo/nvo3
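[On the Section 5.2 "man ip" point raised in the thread above:
originating the overlay from the physical host itself really does need
nothing beyond stock iproute2. A minimal sketch, assuming root and a
kernel with gretap support; interface name and addresses are
illustrative:]

```shell
# Ethernet-over-GRE originated by a bare-metal host acting as its
# own NVE: the host terminates the overlay and presents it locally
# as an ordinary Ethernet interface.
ip link add gretap0 type gretap \
      local 198.51.100.10 remote 198.51.100.20
ip link set gretap0 up
ip addr add 10.1.0.1/24 dev gretap0
```

[gretap (rather than plain gre) is used here because it carries
Ethernet frames, matching the L2 service the thread is discussing.]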
