Hi Greg,

Please see inline prefixed by [ag3].

Thanks,
Anoop

On Fri, Nov 16, 2018 at 5:29 PM Greg Mirsky <[email protected]> wrote:

> Hi Anoop,
> thank you for the discussion. Please find my responses tagged GIM3>>.
> Also, attached diff and the updated working version of the draft. Hope
> we're converging.
>
> Regards,
> Greg
>
> On Wed, Nov 14, 2018 at 11:00 PM Anoop Ghanwani <[email protected]>
> wrote:
>
>> Hi Greg,
>>
>> Please see inline prefixed with [ag2].
>>
>> Thanks,
>> Anoop
>>
>> On Wed, Nov 14, 2018 at 9:45 AM Greg Mirsky <[email protected]>
>> wrote:
>>
>>> Hi Anoop,
>>> thank you for the expedient response. I am glad that some of my
>>> responses have addressed your concerns. Please find followup notes in-line
>>> tagged GIM2>>. I've attached the diff to highlight the updates applied in
>>> the working version. Let me know if these are acceptable changes.
>>>
>>> Regards,
>>> Greg
>>>
>>> On Tue, Nov 13, 2018 at 12:30 PM Anoop Ghanwani <[email protected]>
>>> wrote:
>>>
>>>> Hi Greg,
>>>>
>>>> Please see inline prefixed with [ag].
>>>>
>>>> Thanks,
>>>> Anoop
>>>>
>>>> On Tue, Nov 13, 2018 at 11:34 AM Greg Mirsky <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Anoop,
>>>>> many thanks for the thorough review and detailed comments. Please find
>>>>> my answers, this time for real, in-line tagged GIM>>.
>>>>>
>>>>> Regards,
>>>>> Greg
>>>>>
>>>>> On Thu, Nov 8, 2018 at 1:58 AM Anoop Ghanwani <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Here are my comments.
>>>>>>
>>>>>> Thanks,
>>>>>> Anoop
>>>>>>
>>>>>> ==
>>>>>>
>>>>>> Philosophical
>>>>>>
>>>>>> Since VXLAN is not an IETF standard, should we be defining a standard
>>>>>> for running BFD on it? Should we define BFD over Geneve instead which is
>>>>>> the official WG selection? Is that going to be a separate document?
>>>>>> GIM>> IS-IS is not on the Standard track either but that had not
>>>>>> prevented IETF from developing tens of standard track RFCs using RFC 1142
>>>>>> as the normative reference until RFC 7142 re-classified it as historical.
>>>>>> A similar path was followed with IS-IS-TE by publishing RFC 3784 until it
>>>>>> was obsoleted by RFC 5305 four years later. I understand that Down Reference,
>>>>>> i.e., using informational RFC as the normative reference, is not an unusual
>>>>>> situation.
>>>>>>
>>>>>
>>>> [ag] OK. I'm not an expert on this part so unless someone else that is
>>>> an expert (chairs, AD?) can comment on it, I'll just let it go.
>>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Technical
>>>>>>
>>>>>> Section 1:
>>>>>>
>>>>>> This part needs to be rewritten:
>>>>>> >>>
>>>>>> The individual racks may be part of a different Layer 3 network, or
>>>>>> they could be in a single Layer 2 network. The VXLAN segments/overlays
>>>>>> are overlaid on top of Layer 3 network. A VM can communicate with another VM
>>>>>> only if they are on the same VXLAN segment.
>>>>>> >>>
>>>>>> It's hard to parse and, given IRB,
>>>>>>
>>>>> GIM>> Would the following text be acceptable:
>>>>> OLD TEXT:
>>>>> VXLAN is typically deployed in data centers interconnecting
>>>>> virtualized hosts, which may be spread across multiple racks. The
>>>>> individual racks may be part of a different Layer 3 network, or they
>>>>> could be in a single Layer 2 network. The VXLAN segments/overlays
>>>>> are overlaid on top of Layer 3 network.
>>>>> NEW TEXT:
>>>>> VXLAN is typically deployed in data centers interconnecting virtualized
>>>>> hosts of a tenant. VXLAN addresses requirements of the Layer 2 and
>>>>> Layer 3 data center network infrastructure in the presence of VMs in
>>>>> a multi-tenant environment, discussed in section 3 [RFC7348], by
>>>>> providing Layer 2 overlay scheme on a Layer 3 network.
>>>>>
>>>>
>>>> [ag] This is a lot better.
>>>>
>>>>
>>>>>
>>>>> A VM can communicate with another VM only if they are on the same
>>>>> VXLAN segment.
>>>>>>
>>>>>> the last sentence above is wrong.
>>>>>>
>>>>> GIM>> Section 4 in RFC 7348 states:
>>>>> Only VMs within the same VXLAN segment can communicate with each other.
>>>>>
>>>>
>>>> [ag] VMs on different segments can communicate using routing/IRB, so
>>>> even RFC 7348 is wrong. Perhaps the text should be modified to say -- "In
>>>> the absence of a router in the overlay, a VM can communicate...".
>>>>
>>>>
>>>>>
>>>>> Section 3:
>>>>>> >>>
>>>>>> Most deployments will have VMs with only L2 capabilities that
>>>>>> may not support L3.
>>>>>> >>>
>>>>>> Are you suggesting most deployments have VMs with no IP
>>>>>> addresses/configuration?
>>>>>>
>>>>> GIM>> Would re-word as follows:
>>>>> OLD TEXT:
>>>>> Most deployments will have VMs with only L2 capabilities that
>>>>> may not support L3.
>>>>> NEW TEXT:
>>>>> Deployments may have VMs with only L2 capabilities that do not support
>>>>> L3.
>>>>>
>>>>
>>>> [ag] I still don't understand this. What does it mean for a VM to not
>>>> support L3? No IP address, no default GW, something else?
>>>>
>>> GIM2>> VM communicates with its VTEP which, in turn, originates VXLAN
>>> tunnel. VM is not required to have IP address as it is VTEP's IP address
>>> that VM's MAC is associated with. As for gateway, RFC 7348 discusses VXLAN
>>> gateway as the device that forwards traffic between VXLAN and non-VXLAN
>>> domains. Considering all that, would the following change be acceptable:
>>> OLD TEXT:
>>> Most deployments will have VMs with only L2 capabilities that
>>> may not support L3.
>>> NEW TEXT:
>>> Most deployments will have VMs with only L2 capabilities and not have
>>> an IP address assigned.
>>>
>>
>> [ag2] Do you have a reference for this (i.e. that most deployments have
>> VMs without an IP address)? Normally I would think VMs would have an IP
>> address.
>> It's just that they are segregated into segments and, without an
>> intervening router, they are restricted to communicate only within their
>> subnet.
>>
> GIM3>> Would the following text be acceptable:
>
> Deployments might have VMs with only L2 capabilities and not have an IP
> address assigned or,
> in other cases, VMs are assigned IP address but are restricted to
> communicate only within their subnet.
>
>

[ag3] Yes, this is better.

>>>>
>>>>>
>>>>>> >>>
>>>>>> Having a hierarchical OAM model helps localize faults though it
>>>>>> requires additional consideration.
>>>>>> >>>
>>>>>> What are the additional considerations?
>>>>>>
>>>>> GIM>> For example, coordination of BFD intervals across the OAM
>>>>> layers.
>>>>>
>>>>
>>>> [ag] Can we mention them in the draft?
>>>>
>>>>
>>>>>
>>>>>> Would be useful to add a reference to RFC 8293 in case the reader
>>>>>> would like to know more about service nodes.
>>>>>>
>>>>> GIM>> I have to admit that I don't find how RFC 8293 A Framework for
>>>>> Multicast in Network Virtualization over Layer 3 is related to this
>>>>> document. Please help with additional reference to the text of the
>>>>> document.
>>>>>
>>>>
>>>> [ag] The RFC discusses the use of service nodes which is mentioned
>>>> here.
>>>>
>>>>
>>>>>
>>>>>> Section 4
>>>>>> >>>
>>>>>> Separate BFD sessions can be established between the VTEPs (IP1 and
>>>>>> IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
>>>>>> >>>
>>>>>> IMO, the document should mention that this could lead to scaling
>>>>>> issues given that VTEPs can support well in excess of 4K VNIs.
>>>>>> Additionally, we should mention that with IRB, a given VNI may not even
>>>>>> exist on the destination VTEP. Finally, what is the benefit of doing
>>>>>> this? There may be certain corner cases where it's useful (vs a single
>>>>>> BFD session between the VTEPs for all VNIs) but it would be good to explain
>>>>>> what those are.
>>>>>>
>>>>> GIM>> Will add text in the Security Considerations section that VTEPs
>>>>> should have limit on number of BFD sessions.
>>>>>
>>>>
>>>> [ag] I was hoping for two things:
>>>> - A mention about the scalability issue right where per-VNI BFD is
>>>> discussed. (Not sure why that is a security issue/consideration.)
>>>>
>>> GIM2>> I've added the following sentence in both places:
>>> The implementation SHOULD have a reasonable upper bound on the number of
>>> BFD sessions that can be created between the same pair of VTEPs.
>>>
>>
>> [ag2] What are the criteria for determining what is reasonable?
>>
> GIM>> I usually understand that as requirement to make it controllable,
> have configurable limit. Thus it will be up to a network operator to set
> the limit.
>
>>
>>
>>> - What is the benefit of running BFD per VNI between a pair of VTEPs?
>>>>
>>> GIM2>> An alternative would be to run CFM between VMs, if there's the
>>> need to monitor liveness of the particular VM. Again, this is optional.
>>>
>>
>> [ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one
>> to monitor the liveness of VMs.
>>
>

[ag3] I think you missed responding to this. I'm not sure of the value of running BFD per VNI between VTEPs. What am I getting that is not covered by running a single BFD session with VNI 0 between the VTEPs?

>
>>
>>>
>>>>
>>>>>
>>>>>> Sections 5.1 and 6.1
>>>>>>
>>>>>> In 5.1 we have
>>>>>> >>>
>>>>>> The inner MAC frame carrying the BFD payload has the
>>>>>> following format:
>>>>>> ... Source IP: IP address of the originating VTEP. Destination IP: IP
>>>>>> address of the terminating VTEP.
>>>>>> >>>
>>>>>>
>>>>>> In 6.1 we have
>>>>>> >>>
>>>>>>
>>>>>> Since multiple BFD sessions may be running between two
>>>>>> VTEPs, there needs to be a mechanism for demultiplexing received BFD
>>>>>> packets to the proper session.
>>>>>> The procedure for demultiplexing
>>>>>> packets with Your Discriminator equal to 0 is different from [RFC5880
>>>>>> <https://tools.ietf.org/html/rfc5880>].
>>>>>>
>>>>>> *For such packets, the BFD session MUST be identified using the inner
>>>>>> headers, i.e., the source IP and the destination IP present in the IP
>>>>>> header carried by the payload of the VXLAN encapsulated packet.*
>>>>>>
>>>>>>
>>>>>> >>>
>>>>>> How does this work if the source IP and dest IP are the same as
>>>>>> specified in 5.1?
>>>>>>
>>>>> GIM>> You're right, Destination and source IP addresses likely are the
>>>>> same in this case. Will add that the source UDP port number, along with
>>>>> the pair of IP addresses, MUST be used to demux received BFD control packets.
>>>>> Would you agree that will be sufficient?
>>>>>
>>>>
>>>> [ag] Yes, I think that should work.
>>>>
>>>>>
>>>>>> Editorial
>>>>>>
>>>>>
>>>> [ag] Agree with all comments on this section.
>>>>
>>>>>
>>>>>> - Terminology section should be renamed to acronyms.
>>>>>>
>>>>> GIM>> Accepted
>>>>>
>>>>>> - Document would benefit from a thorough editorial scrub, but maybe
>>>>>> that will happen once it gets to the RFC editor.
>>>>>>
>>>>> GIM>> Will certainly have helpful comments from ADs and RFC editor.
>>>>>
>>>>>>
>>>>>> Section 1
>>>>>> >>>
>>>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
>>>>>> <https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
>>>>>> scheme that allows virtual machines (VMs) to communicate in a data center
>>>>>> network.
>>>>>> >>>
>>>>>> This is not accurate. VXLAN allows you to implement an overlay to
>>>>>> decouple the address space of the attached hosts from that of the
>>>>>> network.
>>>>>>
>>>>> GIM>> Thank you for the suggested text. Will change as follows:
>>>>> OLD TEXT:
>>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348].
>>>>> provides an encapsulation scheme that allows virtual machines (VMs) to
>>>>> communicate in a data center network.
>>>>> NEW TEXT:
>>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
>>>>> an encapsulation scheme that allows building an overlay network by
>>>>> decoupling the address space of the attached virtual hosts from that
>>>>> of the network.
>>>>>
>>>>>>
>>>>>> Section 7
>>>>>>
>>>>>> VTEP's -> VTEPs
>>>>>>
>>>>> GIM>> Yes, thank you.
>>>>>
>>>>
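
For readers following the Sections 5.1 and 6.1 discussion in this thread, here is a rough sketch of the demultiplexing rule that was agreed on (source UDP port plus the pair of inner IP addresses when Your Discriminator is 0), combined with the operator-configurable upper bound on sessions per VTEP pair. It is only an illustration, not text or code from the draft: the class name, field names, the default limit of 16, and the assumption that the peer's source UDP port is known when the session is registered are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical lookup key for received packets:
# (inner source IP, inner destination IP, source UDP port).
SessionKey = Tuple[str, str, int]


@dataclass
class BfdSessionTable:
    # Operator-configurable cap per VTEP pair; 16 is an arbitrary example value.
    max_sessions_per_pair: int = 16
    by_discriminator: Dict[int, str] = field(default_factory=dict)
    by_inner_headers: Dict[SessionKey, str] = field(default_factory=dict)

    def add(self, name: str, local_disc: int,
            peer_ip: str, local_ip: str, peer_src_port: int) -> bool:
        """Register a session; refuse it once the per-pair limit is reached."""
        in_use = sum(1 for (src, dst, _port) in self.by_inner_headers
                     if (src, dst) == (peer_ip, local_ip))
        if in_use >= self.max_sessions_per_pair:
            return False
        self.by_discriminator[local_disc] = name
        self.by_inner_headers[(peer_ip, local_ip, peer_src_port)] = name
        return True

    def demux(self, your_disc: int, src_ip: str, dst_ip: str,
              src_port: int) -> Optional[str]:
        """Map a received BFD control packet to a session name, if any."""
        if your_disc != 0:
            # Non-zero Your Discriminator: look up by discriminator alone.
            return self.by_discriminator.get(your_disc)
        # Your Discriminator == 0: the IP pair may be identical for every
        # session between two VTEPs, so the source UDP port disambiguates.
        return self.by_inner_headers.get((src_ip, dst_ip, src_port))
```

With a limit of 2, a third session between the same pair of VTEPs is rejected, and two sessions that share the same IP pair are still told apart by their source UDP ports.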
