Hi Greg,

Please see inline prefixed with [ag2].
Thanks,
Anoop

On Wed, Nov 14, 2018 at 9:45 AM Greg Mirsky <[email protected]> wrote:

> Hi Anoop,
> thank you for the expedient response. I am glad that some of my responses
> have addressed your concerns. Please find followup notes in-line tagged
> GIM2>>. I've attached the diff to highlight the updates applied in the
> working version. Let me know if these are acceptable changes.
>
> Regards,
> Greg
>
> On Tue, Nov 13, 2018 at 12:30 PM Anoop Ghanwani <[email protected]> wrote:
>
>> Hi Greg,
>>
>> Please see inline prefixed with [ag].
>>
>> Thanks,
>> Anoop
>>
>> On Tue, Nov 13, 2018 at 11:34 AM Greg Mirsky <[email protected]> wrote:
>>
>>> Hi Anoop,
>>> many thanks for the thorough review and detailed comments. Please find
>>> my answers, this time for real, in-line tagged GIM>>.
>>>
>>> Regards,
>>> Greg
>>>
>>> On Thu, Nov 8, 2018 at 1:58 AM Anoop Ghanwani <[email protected]> wrote:
>>>
>>>> Here are my comments.
>>>>
>>>> Thanks,
>>>> Anoop
>>>>
>>>> ==
>>>>
>>>> Philosophical
>>>>
>>>> Since VXLAN is not an IETF standard, should we be defining a standard
>>>> for running BFD on it? Should we define BFD over Geneve instead, which
>>>> is the official WG selection? Is that going to be a separate document?
>>>> GIM>> IS-IS is not on the Standards Track either, but that had not
>>>> prevented the IETF from developing tens of Standards Track RFCs using
>>>> RFC 1142 as the normative reference until RFC 7142 re-classified it as
>>>> historic. A similar path was followed with IS-IS-TE by publishing
>>>> RFC 3784 until it was obsoleted by RFC 5305 four years later. I
>>>> understand that a downward reference, i.e., using an Informational RFC
>>>> as a normative reference, is not an unusual situation.
>>
>> [ag] OK. I'm not an expert on this part, so unless someone else who is
>> an expert (chairs, AD?) can comment on it, I'll just let it go.
>>
>>>> Technical
>>>>
>>>> Section 1:
>>>>
>>>> This part needs to be rewritten:
>>>>
>>>>    The individual racks may be part of a different Layer 3 network, or
>>>>    they could be in a single Layer 2 network.  The VXLAN
>>>>    segments/overlays are overlaid on top of Layer 3 network.  A VM can
>>>>    communicate with another VM only if they are on the same VXLAN
>>>>    segment.
>>>>
>>>> It's hard to parse and, given IRB,
>>>>
>>> GIM>> Would the following text be acceptable:
>>> OLD TEXT:
>>>    VXLAN is typically deployed in data centers interconnecting
>>>    virtualized hosts, which may be spread across multiple racks.  The
>>>    individual racks may be part of a different Layer 3 network, or they
>>>    could be in a single Layer 2 network.  The VXLAN segments/overlays
>>>    are overlaid on top of Layer 3 network.
>>> NEW TEXT:
>>>    VXLAN is typically deployed in data centers interconnecting
>>>    virtualized hosts of a tenant.  VXLAN addresses requirements of the
>>>    Layer 2 and Layer 3 data center network infrastructure in the
>>>    presence of VMs in a multi-tenant environment, discussed in
>>>    section 3 of [RFC7348], by providing a Layer 2 overlay scheme on a
>>>    Layer 3 network.
>>
>> [ag] This is a lot better.
>>
>>>>    A VM can communicate with another VM only if they are on the same
>>>>    VXLAN segment.
>>>>
>>>> the last sentence above is wrong.
>>>>
>>> GIM>> Section 4 in RFC 7348 states:
>>>    Only VMs within the same VXLAN segment can communicate with each
>>>    other.
>>
>> [ag] VMs on different segments can communicate using routing/IRB, so
>> even RFC 7348 is wrong. Perhaps the text should be modified to say:
>> "In the absence of a router in the overlay, a VM can communicate...".
>>
>>>> Section 3:
>>>>
>>>>    Most deployments will have VMs with only L2 capabilities that
>>>>    may not support L3.
>>>>
>>>> Are you suggesting most deployments have VMs with no IP
>>>> addresses/configuration?
>>>>
>>> GIM>> Would re-word as follows:
>>> OLD TEXT:
>>>    Most deployments will have VMs with only L2 capabilities that
>>>    may not support L3.
>>> NEW TEXT:
>>>    Deployments may have VMs with only L2 capabilities that do not
>>>    support L3.
>>
>> [ag] I still don't understand this. What does it mean for a VM to not
>> support L3? No IP address, no default GW, something else?
>
> GIM2>> A VM communicates with its VTEP, which, in turn, originates the
> VXLAN tunnel. A VM is not required to have an IP address, as it is the
> VTEP's IP address that the VM's MAC is associated with. As for the
> gateway, RFC 7348 discusses the VXLAN gateway as the device that forwards
> traffic between VXLAN and non-VXLAN domains. Considering all that, would
> the following change be acceptable:
> OLD TEXT:
>    Most deployments will have VMs with only L2 capabilities that
>    may not support L3.
> NEW TEXT:
>    Most deployments will have VMs with only L2 capabilities and not have
>    an IP address assigned.

[ag2] Do you have a reference for this (i.e., that most deployments have
VMs without an IP address)? Normally I would think VMs would have an IP
address. It's just that they are segregated into segments and, without an
intervening router, they are restricted to communicating only within their
subnet.
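As a minimal sketch (hypothetical names, not from the draft or RFC 7348) of
the MAC-based forwarding GIM2 describes above: an L2-only VM needs no IP
address because the VTEP keys its forwarding purely on the VM's MAC and
tunnels the frame to the remote VTEP's IP.

    # Hypothetical per-VNI forwarding table on a VTEP: (vni, vm_mac) maps
    # to the remote VTEP's IP, learned via flood-and-learn or BGP EVPN.
    fdb = {
        (100, "00:aa:bb:cc:dd:01"): "203.0.113.2",
        (100, "00:aa:bb:cc:dd:02"): "203.0.113.3",
    }

    def encapsulate(vni: int, dst_mac: str, inner_frame: bytes) -> tuple[str, bytes]:
        """Return (outer destination IP, VXLAN payload); lookup is MAC-only."""
        remote_vtep = fdb[(vni, dst_mac)]  # unknown MACs would be flooded; omitted
        # RFC 7348 VXLAN header: flags byte with I bit set, reserved bytes,
        # 24-bit VNI, one more reserved byte.
        vxlan_header = bytes([0x08, 0, 0, 0]) + vni.to_bytes(3, "big") + b"\x00"
        return remote_vtep, vxlan_header + inner_frame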
>>>> Having a hierarchical OAM model helps localize faults, though it
>>>> requires additional consideration.
>>>>
>>>> What are the additional considerations?
>>>>
>>> GIM>> For example, coordination of BFD intervals across the OAM layers.
>>
>> [ag] Can we mention them in the draft?
>>
>>>> Would be useful to add a reference to RFC 8293 in case the reader
>>>> would like to know more about service nodes.
>>>>
>>> GIM>> I have to admit that I don't find how RFC 8293, "A Framework for
>>> Multicast in Network Virtualization over Layer 3", is related to this
>>> document. Please help with an additional reference to the text of the
>>> document.
>>
>> [ag] The RFC discusses the use of service nodes, which are mentioned
>> here.
>>
>>>> Section 4
>>>>
>>>>    Separate BFD sessions can be established between the VTEPs (IP1 and
>>>>    IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
>>>>
>>>> IMO, the document should mention that this could lead to scaling
>>>> issues, given that VTEPs can support well in excess of 4K VNIs.
>>>> Additionally, we should mention that with IRB, a given VNI may not
>>>> even exist on the destination VTEP. Finally, what is the benefit of
>>>> doing this? There may be certain corner cases where it's useful (vs. a
>>>> single BFD session between the VTEPs for all VNIs), but it would be
>>>> good to explain what those are.
>>>>
>>> GIM>> Will add text in the Security Considerations section that VTEPs
>>> should have a limit on the number of BFD sessions.
>>
>> [ag] I was hoping for two things:
>> - A mention of the scalability issue right where per-VNI BFD is
>> discussed. (Not sure why that is a security issue/consideration.)
>
> GIM2>> I've added the following sentence in both places:
>    The implementation SHOULD have a reasonable upper bound on the number
>    of BFD sessions that can be created between the same pair of VTEPs.

[ag2] What is the criterion for determining what is reasonable?
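For concreteness, a minimal sketch (hypothetical names; the cap value is an
assumption, and choosing it is exactly the open question above) of a VTEP
enforcing such an upper bound on sessions per VTEP pair:

    from collections import defaultdict

    MAX_SESSIONS_PER_VTEP_PAIR = 128  # assumed operator-configured bound

    sessions_per_pair: dict[tuple[str, str], int] = defaultdict(int)

    def create_bfd_session(local_vtep: str, remote_vtep: str, vni: int) -> bool:
        """Admit a new per-VNI BFD session only while under the cap."""
        pair = (local_vtep, remote_vtep)
        if sessions_per_pair[pair] >= MAX_SESSIONS_PER_VTEP_PAIR:
            return False  # refuse rather than exhaust VTEP resources
        sessions_per_pair[pair] += 1
        # ... allocate My Discriminator, start the session state machine ...
        return True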
>> - What is the benefit of running BFD per VNI between a pair of VTEPs?
>
> GIM2>> An alternative would be to run CFM between VMs, if there's the
> need to monitor the liveliness of a particular VM. Again, this is
> optional.

[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one to
monitor the liveliness of VMs.

>>>> Sections 5.1 and 6.1
>>>>
>>>> In 5.1 we have:
>>>>
>>>>    The inner MAC frame carrying the BFD payload has the following
>>>>    format:
>>>>    ...
>>>>    Source IP: IP address of the originating VTEP.
>>>>    Destination IP: IP address of the terminating VTEP.
>>>>
>>>> In 6.1 we have:
>>>>
>>>>    Since multiple BFD sessions may be running between two VTEPs, there
>>>>    needs to be a mechanism for demultiplexing received BFD packets to
>>>>    the proper session.  The procedure for demultiplexing packets with
>>>>    Your Discriminator equal to 0 is different from [RFC5880].  For
>>>>    such packets, the BFD session MUST be identified using the inner
>>>>    headers, i.e., the source IP and the destination IP present in the
>>>>    IP header carried by the payload of the VXLAN encapsulated packet.
>>>>
>>>> How does this work if the source IP and dest IP are the same as
>>>> specified in 5.1?
>>>>
>>> GIM>> You're right, the destination and source IP addresses likely are
>>> the same in this case. Will add that the source UDP port number, along
>>> with the pair of IP addresses, MUST be used to demux received BFD
>>> control packets. Would you agree that will be sufficient?
>>
>> [ag] Yes, I think that should work.
>>
>>>> Editorial
>>
>> [ag] Agree with all comments on this section.
>>
>>>> - Terminology section should be renamed to Acronyms.
>>> GIM>> Accepted
>>>> - Document would benefit from a thorough editorial scrub, but maybe
>>>> that will happen once it gets to the RFC Editor.
>>> GIM>> Will certainly have helpful comments from the ADs and the RFC
>>> Editor.
>>>>
>>>> Section 1
>>>>
>>>>    "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides
>>>>    an encapsulation scheme that allows virtual machines (VMs) to
>>>>    communicate in a data center network.
>>>>
>>>> This is not accurate. VXLAN allows you to implement an overlay to
>>>> decouple the address space of the attached hosts from that of the
>>>> network.
>>>>
>>> GIM>> Thank you for the suggested text. Will change as follows:
>>> OLD TEXT:
>>>    "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides
>>>    an encapsulation scheme that allows virtual machines (VMs) to
>>>    communicate in a data center network.
>>> NEW TEXT:
>>>    "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides
>>>    an encapsulation scheme that allows building an overlay network by
>>>    decoupling the address space of the attached virtual hosts from
>>>    that of the network.
>>>
>>>> Section 7
>>>>
>>>> VTEP's -> VTEPs
>>>>
>>> GIM>> Yes, thank you.
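To make the demultiplexing agreement above concrete, a minimal sketch
(hypothetical names, not from the draft) of session lookup for received BFD
control packets, falling back to the inner headers plus the inner source UDP
port when Your Discriminator is 0:

    class BfdSession:
        """Placeholder for per-session BFD state."""

    sessions_by_disc: dict[int, BfdSession] = {}
    sessions_by_inner_hdrs: dict[tuple[str, str, int], BfdSession] = {}

    def demux(your_disc: int, inner_src_ip: str, inner_dst_ip: str,
              inner_src_port: int) -> BfdSession | None:
        if your_disc != 0:
            # Normal RFC 5880 demultiplexing on Your Discriminator.
            return sessions_by_disc.get(your_disc)
        # Your Discriminator == 0: the inner IP pair is the same for every
        # session between these two VTEPs (both are VTEP addresses per
        # Section 5.1), so the inner source UDP port is what actually
        # distinguishes the sessions.
        key = (inner_src_ip, inner_dst_ip, inner_src_port)
        return sessions_by_inner_hdrs.get(key)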
