Re: [nvo3] [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Bob Briscoe Tue, 18 Sep 2018 16:03:22 -0700

Linda,

Until we can all understand the answers to the following two questions, I don't think we can discuss what track this draft ought to be on, let alone move on to your responses to all my other points.


1/ Applicability

You say this draft solely applies to connections with both ends within the controlled DC environment. But the draft says it's about multi-tenant DCs. Are there any multi-tenant DCs that restrict all VMs to only communicate with other VMs within the same controlled DC environment?


2/ Purpose of publishing as an RFC

When I said:

#. The introduction does not say what the purpose of publishing this draft is.

you responded:

[Linda] The first paragraph on Page 3 has the description why VM Mobility is needed.

Whether VM Mobility is needed was not my question. My question was what is the purpose of the IETF publishing an RFC about VM Mobility? And particularly, what is /this/ RFC intended to achieve?

Are the authors trying to argue for a particular approach vs. others? Are you trying to write a tutorial? Are you trying to give the pros and cons of different approaches? Are you trying to give advice on good practice (with the implication that alternative practices are less good)? Are you trying to clarify ideas by writing them down? Are you trying to outline the implications of VM Mobility for other protocols being developed within the NVO WG?





Bob

On 10/09/18 19:16, Linda Dunbar wrote:

Bob,
Thank you very much for reviewing the draft and providing in-depth comments. I am very sorry for the delayed response due to traveling.
Replies to your comments are inserted below marked by [Linda]:
-----Original Message-----
From: Bob Briscoe [mailto:[email protected]]
Sent: Monday, September 03, 2018 9:45 PM
To: [email protected]
Cc: [email protected]; [email protected]; [email protected]
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer for this draft. The Transport Directorate seeks to review all transport or transport-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Transport ADs. For more information about the Transport Directorate Reviews and the Transport Area Review Team, please see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to transport issues, although the greatest issue concerns a desire that the network could pause or stop connections during L3 VM Mobility, which is certainly a transport issue.
[Linda] There is “Hot Migration” with transport service continuing, and there is a “Cold Migration”, which is a common practice in many data centers, which stop the task running on the old place and move to the new place before restart as described in the Task Migration.
Is it helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility (within a subnet) seem sound. However, this is only part of the draft, which has the following issues:
#. The introduction does not say what the purpose of publishing this draft is. It seems that, rather than describing a specific protocol or protocols, it intends to describe the overall system procedure that would typically be used in DCs for VM mobility. It is tagged as a BCP, but it does not say who needs this BCP, why it is useful for the IETF to publish this BCP, how wide the authors' knowledge is of current practice (given DCs are private), or why this is a BCP rather than a protocol spec.
[Linda] The first paragraph on Page 3 has the description why VM Mobility is needed. Is it helpful to move this paragraph to the beginning of the Introduction Section?
  “Virtualization which is being used in almost all of today’s data
   centers enables many virtual machines to run on a single physical
   computer or compute server. Virtual machines (VM) need hypervisor
   running on the physical compute server to provide them shared
   processor/memory/storage. Network connectivity is provided by the
   network virtualization edge (NVE) [RFC8014]. Being able to move VMs
   dynamically, or live migration, from one server to another allows for
   dynamic load balancing or work distribution and thus it is a highly
   desirable feature [RFC7364].”
The draft starts out (S.3) as if it intends to say what a good VM Mobility protocol should or shouldn't do, but the rest of the document doesn't give any reasoning for these recommendations, it just asserts what appears to be one view of how a whole VM Mobility system works, sometimes referring to one example protocol RFC for a component part, but more often with no references or details.
[Linda] Is it helpful to move the paragraph above to the beginning of the Introduction Section? So that audience is aware of why VM Mobility is needed. And then follow up with what a good VM Mobility protocol should or shouldn't do?
#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So none of the text is “normative”. Is it Okay?
#. The draft silently slips back and forth between VM mobility and VM redundancy, without recognizing the differences. See detailed comments.
[Linda] There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.
#. Please adopt different terminology than "source NVE" and "destination NVE", which are really poor choices of terms for an intermediate node. See detailed comments. Why not use "old NVE" and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new NVE”.
#. Applicability is fairly clearly outlined, but it is not clear whether hosts corresponding with the mobile VMs are part of the same controlled environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the App running on the VM. It is under the same controlled environment. Not on uncontrolled public internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential half-thought-through ideas on how to solve L3 mobility, rather than current practice, let alone best current practice. Either current practice should be described instead, or the scope of the draft should be narrowed solely to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common practice in many data centers.
#. The VM's file system is described as state that moves with the VM (S.6), but VM mobility solutions often move the VM but stitch it back to its (unmoved) storage. Conversely, the storage can also move independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the storage/file can become inaccessible.
#. The draft omits some of the security, transport and management aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been written by different authors and no-one has edited the whole to give it a coherent structure, or to ensure consistency (both technical and editorial) between the parts. See detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a reviewer to concentrate on the technical aspects rather than the English. It would have been useful if one of the English-speaking co-authors had improved the English before submission for review. See detailed comments.
[Linda] can you help?  Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of normative text (actually two "MUST"s, but both state a common requirement - just written separately for IPv4 and IPv6). This merely serves to imply that everything else the document says is less important or optional, which was probably not the intention.
[Linda] The goal is to indicate any solution in moving the VM “MUST” follow this rule. They make sense, don’t they?
At the start there is a requirements section, which states what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as a set of goals for the rest of the document. If so, these "SHOULDs" are not intended to apply to implementations, so they ought not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support virtual machine mobility in IPv6", is written as a requirement on all DC networks, not on implementations. I assume this was intended to read as "Data center network virtual machine mobility protocols SHOULD support IPv6". Even then, it doesn't really add anything to say VM mobility should support v6 and it should support v4. A L2 solution won't. While undoubtedly, a L3 solution will at least support one of them.
[Linda] Agree. Will change it to “Data center that support IPv6 address should …”
I'm not sure that 'protocol' is the right word anyway; I think 'VM Mobility procedure' would be a better phrase, because it includes steps such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY support host routes to accomplish virtualization", is not followed up at all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support the Virtual Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the most relevant "Requirements" anyway, so one assumes the title of the section ought to have been "Goals". Specifically, even in the simpler case of L2 VM mobility, S.4.1 says that triangular routing and tunnelling persist "until a neighbour cache entry times out". A cache timeout is about 10 orders of magnitude longer than the requirement to only persist "while handling packets in flight", which would be a few milliseconds at most (the time for packets to clear the network that were already launched into flight when the old VM stopped).
Whatever, it would be preferable for the draft to give rationale for these requirements, rather than just assert them. This would help to shed light on the merits of the different trade offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they have different goals. With mobility, it is necessary to know the exact instant when one set of state is identical to the other so it can hand over. With redundancy, the aim is to keep two (or more) sets of state evolving through the same sequence of changes, but there is no need to know the point at which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.
The draft slips from mobility to resilience in the following places:
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is actually all about redundancy, and doesn't address mobility explicitly.
[Linda] Will add the definition “Hot Migration”, “cold migration”, and “warm migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the destination NVE.
In future, no-one will thank you for the apparent contradictions when they continually stumble over phrases like this one in S.4.1: "...send their packets to the source NVE".
The term "packets in flight" is used incorrectly to refer to all the packets sent to the old NVE after the VM has moved, even if they were launched into flight long after the old VM stopped receiving packets.
[Linda] thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or neighbor cache entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based" sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the IP based task migration protocol is ILA. This implies that all hosts corresponding with the mobile VMs are either part of the same controlled environment, or they are proxied via nodes that are part of the same controlled environment (I only have passing knowledge of ILA, but I understand that it depends on ILA routers on the path). If I am correct, this aspect of scope needs to be made clear from the start.
Also under the heading of applicability, the sentence "Since migrations should be relatively rare events" appears very late in the document (S.4.2.1). The assumed level of churn ought to be stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of L2 mappings is delegated to the stack. In contrast, L3 VM mobility is only feasible under certain conditions, because an application needs an IP address to open a socket (resolution of DNS names is not delegated to the stack, and apps can use IP addresses directly anyway).
Examples of the 'certain conditions':
a) /All/ applications used in the whole DC load balancing scheme contain IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration register this fact with the NVA, and it only selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to applications (at both ends) so that the IP addresses that applications use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
The introduction says the draft is about VM mobility in a multi-tenant DC, so the DC admin will not know the range of applications being used. This excludes condition (a) above. When the draft says "...if all applications running are known to handle this gracefully...", it doesn't quantify just how restrictive this condition is, and it gives no explanation of how this knowledge might be 'known' or which function within the system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack during a migration event."
        o There is no TCP connection state in the network stack.
        o Even if the network starts to drop every packet, the TCP connection state persists in the end-points for a duration of the order of 30-90 minutes (OS-dependent) before TCP deems the connection is broken.
        o Other transport protocols have similar designs (including the app-layer of protocols over UDP).
* "More involved approach to connection migration":
        o pausing the connection [does this refer to an actual feature of any L4 protocol?]
        o packaging connection state and sending to target [does this assume logic written into the application, or is this assuming the stack handles this and the app is restricted to using some form of separate identifier/locator addresses?]
        o instantiating connection state in the peer stack [ditto?].
There's some arm-waving in S.7 too:
  "Cold Virtual Machine mobility is facilitated by the VM initially
   sending an ARP or Neighbor Discovery message at the destination NVE
   but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]
And in S.7:
  "In hot
   standby option, regarding TCP connections, one option is to start
   with and maintain TCP connections to two different VMs at the same
   time."
[This sounds like resilience logic has been written into the application, which would be a special case but not something VM mobility infrastructure could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not specific to mobility, but it does not mention any security issues specifically due to VM mobility. It says that address spoofing may arise in a DC (sort-of implying it is worse than in non-DC environments, but not saying why). The handshake at the start of a connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing. So L3 VM mobility would be more vulnerable to source address spoofing in cases where the mobile VM was the connection initiator and there was not a new handshake after the move. However, this draft does not contain any detailed mobility protocols, so it is not possible to identify any specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold mobility introduces significant delay, and other forms less, but still some delay. It should be pointed out that some applications (e.g. real-time) will therefore not be useful if subjected to VM mobility. Similarly, even a short period of delay will drive most congestion controls to severely reduce throughput. These points might be self-evident, but perhaps they should be stated explicitly.
BTW, in the L3 VM mobility case, the draft often refers to TCP connections, but the address bindings of any transport protocols would have to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over UDP; streams over UDP such as with RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend statistics gathering (e.g. time taken, amount of duplicate data) to aid a DC's future decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP does not /have/ to describe management issues, but this document seems to describe a whole system procedure, not just a protocol, which then surely includes the management plane.
[Linda] can you become a co-author and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk about tasks moving, but this is not the distinguishing aspect of these two sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the section heading.
It would also help not to switch from VM to task across these sections - it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends (Subsections might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..." [Not clear that the end-user case has ended and we have returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has ended.] [Also, it doesn't explain why there is no need for an end-user client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas Sections 7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving at the old NVE in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP error message;
  o then later it says they are silently dropped;
  o then in the very next sentence it says either silently drop them or forward them to the new location
* S.5 says they should not be lost, but instead delivered to the destination hypervisor
  o then it describes how they are tunnelled (which is not the same as "forwarding").
The order in which all the stages of mobility are given is jumbled up across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say where the other stages fit between these steps
  o When tunneling packets, it talks about the *migrating* VM not the *migrated* VM, which implies tunnelling has started before the new VM is running. Does this imply there is a huge buffer?
  o It says "Stop Tunneling Packets - When source NVE stops receiving packets destined to..." but it is never clear when a source has stopped sending packets to a destination, unless it explicitly closes the connection (e.g. with a FIN in the case of TCP). Often there are long gaps between packets, because many flows are 'thin' (meaning the application frequently has nothing to send). These gaps can last for milliseconds, hours or even days without any implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not after the previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3 steps for allocation of resources to a VM, then abruptly ends, without even starting the VM, let alone getting to move it.
S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.
===#. Nits===
Nits with the English are too numerous to mention them all. Below are pointers to general problems as well as some individual instances.
S.4
  "Layer 2 and Layer 3 protocols are described next.  In the following
   sections, we examine more advanced features."
        s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon migration. [...] to ensure that the same IPv4 address is assigned to the VM." I think s/IPv4/MAC/ was intended?
"  All NVEs communicating with this virtual machine uses the old ARP
   entry.  If any VM in those NVEs need to talk to the new VM in the
   destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel estabilshment" and "tunneling packets" only differ in the words "IPv4"/"IPv6". So in each case a single para could be given for IP (irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar




--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
