Hi Anoop, apologies if my explanation was not clear. Non-zero VNIs are recommended to be used by a VTEP to demultiplex a received BFD control packet with a zero Your Discriminator value. BFD control packets with a non-zero Your Discriminator value will be demultiplexed using only that value. As for the special role of VNI 0, Section 7 of the draft states the following: BFD session MAY be established for the reserved VNI 0. One way to aggregate BFD sessions between VTEPs is to establish a BFD session with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session with a service node. Would you suggest changing the normative language in this text?
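To make the receive-side behavior I am describing concrete, here is a rough, non-normative sketch of the demultiplexing decision (the function name, parameters, and lookup tables are invented for illustration and are not part of the draft):

    # Non-normative sketch of receive-side demultiplexing for BFD over VXLAN.
    # All data structures and names below are hypothetical.
    def demux_bfd_session(vni, inner_src_ip, inner_dst_ip, inner_src_udp_port,
                          your_discriminator, sessions_by_disc, sessions_by_flow):
        """Return the BFD session a received VXLAN-encapsulated BFD control
        packet belongs to, or None if no matching session exists."""
        if your_discriminator != 0:
            # Non-zero Your Discriminator: demultiplex on that value alone,
            # as in RFC 5880.
            return sessions_by_disc.get(your_discriminator)
        # Your Discriminator is zero (e.g., the remote VTEP has not yet
        # learned our discriminator): identify the session from the inner
        # headers. The inner source and destination IP addresses are both
        # VTEP addresses and may coincide for several sessions, so the inner
        # source UDP port is included in the lookup key, per the discussion
        # further down in this thread. The VNI is used to derive
        # interface-related information; VNI 0 identifies the aggregate
        # VTEP-to-VTEP session.
        key = (vni, inner_src_ip, inner_dst_ip, inner_src_udp_port)
        return sessions_by_flow.get(key)
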
Regards, Greg PS. Happy Thanksgiving to All! On Wed, Nov 21, 2018 at 11:00 PM Anoop Ghanwani <[email protected]> wrote: > Hi Greg, > > See below prefixed with [ag4]. > > Thanks, > Anoop > > On Wed, Nov 21, 2018 at 4:36 PM Greg Mirsky <[email protected]> wrote: > >> Hi Anoop, >> apologies for the miss. Is it the last outstanding? Let's bring it to the >> front then. >> >> - What is the benefit of running BFD per VNI between a pair of VTEPs? >>>>>> >>>>> GIM2>> An alternative would be to run CFM between VMs, if there's the >>>>> need to monitor liveliness of the particular VM. Again, this is optional. >>>>> >>>> >>>> [ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one >>>> to monitor the liveliness of VMs. >>>> >>> >> [ag3] I think you missed responding to this. I'm not sure of the value >> of running BFD per VNI between VTEPs. What am I getting that is not >> covered by running a single BFD session with VNI 0 between the VTEPs? >> >> GIM3>> I've misspoken. Non-zero VNI is recommended to be used to >> demultiplex BFD sessions between the same VTEPs. In section 6.1: >> The procedure for demultiplexing >> packets with Your Discriminator equal to 0 is different from >> [RFC5880]. For such packets, the BFD session MUST be identified >> using the inner headers, i.e., the source IP and the destination IP >> present in the IP header carried by the payload of the VXLAN >> encapsulated packet. The VNI of the packet SHOULD be used to derive >> interface-related information for demultiplexing the packet. >> >> Hope that clarifies the use of non-zero VNI in VXLAN encapsulation of a >> BFD control packet. >> > > [ag4] This tells me how the VNI is used for BFD packets being > sent/received. What is the use case/benefit of doing that? I am creating > a special interface with VNI 0 just for BFD. Why do I now need to run BFD > on any/all of the other VNIs? As a developer, if I read this spec, should > I be building this capability or not? Basically what I'm getting at is I > think the draft should recommend using VNI 0. If there is a convincing use > case for running BFD over other VNIs serviced by that VTEP, then that needs > to be explained. But as I mentioned before, this leads to scaling issues. > So given the scaling issues, it would be good if an implementation only > needed to worry about sending BFD messages on VNI 0. > > >> >> Regards, >> Greg >> >> On Tue, Nov 20, 2018 at 12:14 PM Anoop Ghanwani <[email protected]> >> wrote: >> >>> Hi Greg, >>> >>> Please see inline prefixed by [ag3]. >>> >>> Thanks, >>> Anoop >>> >>> On Fri, Nov 16, 2018 at 5:29 PM Greg Mirsky <[email protected]> >>> wrote: >>> >>>> Hi Anoop, >>>> thank you for the discussion. Please find my responses tagged GIM3>>. >>>> Also, attached diff and the updated working version of the draft. Hope >>>> we're converging. >>>> >>>> Regards, >>>> Greg >>>> >>>> On Wed, Nov 14, 2018 at 11:00 PM Anoop Ghanwani <[email protected]> >>>> wrote: >>>> >>>>> Hi Greg, >>>>> >>>>> Please see inline prefixed with [ag2]. >>>>> >>>>> Thanks, >>>>> Anoop >>>>> >>>>> On Wed, Nov 14, 2018 at 9:45 AM Greg Mirsky <[email protected]> >>>>> wrote: >>>>> >>>>>> Hi Anoop, >>>>>> thank you for the expedient response. I am glad that some of my >>>>>> responses have addressed your concerns. Please find followup notes >>>>>> in-line >>>>>> tagged GIM2>>. I've attached the diff to highlight the updates applied in >>>>>> the working version. Let me know if these are acceptable changes. 
>>>>>> >>>>>> Regards, >>>>>> Greg >>>>>> >>>>>> On Tue, Nov 13, 2018 at 12:30 PM Anoop Ghanwani < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Hi Greg, >>>>>>> >>>>>>> Please see inline prefixed with [ag]. >>>>>>> >>>>>>> Thanks, >>>>>>> Anoop >>>>>>> >>>>>>> On Tue, Nov 13, 2018 at 11:34 AM Greg Mirsky <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Anoop, >>>>>>>> many thanks for the thorough review and detailed comments. Please >>>>>>>> find my answers, this time for real, in-line tagged GIM>>. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Greg >>>>>>>> >>>>>>>> On Thu, Nov 8, 2018 at 1:58 AM Anoop Ghanwani < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> Here are my comments. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Anoop >>>>>>>>> >>>>>>>>> == >>>>>>>>> >>>>>>>>> Philosophical >>>>>>>>> >>>>>>>>> Since VXLAN is not an IETF standard, should we be defining a >>>>>>>>> standard for running BFD on it? Should we define BFD over Geneve >>>>>>>>> instead >>>>>>>>> which is the official WG selection? Is that going to be a separate >>>>>>>>> document? >>>>>>>>> GIM>> IS-IS is not on the Standard track either but that had not >>>>>>>>> prevented IETF from developing tens of standard track RFCs using RFC >>>>>>>>> 1142 >>>>>>>>> as the normative reference until RFC 7142 re-classified it as >>>>>>>>> historical. A >>>>>>>>> similar path was followed with IS-IS-TE by publishing RFC 3784 until >>>>>>>>> it was >>>>>>>>> obsoleted by RFC 5305 four years later. I understand that Down >>>>>>>>> Reference, >>>>>>>>> i.e., using informational RFC as the normative reference, is not an >>>>>>>>> unusual >>>>>>>>> situation. >>>>>>>>> >>>>>>>> >>>>>>> [ag] OK. I'm not an expert on this part so unless someone else that >>>>>>> is an expert (chairs, AD?) can comment on it, I'll just let it go. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Technical >>>>>>>>> >>>>>>>>> Section 1: >>>>>>>>> >>>>>>>>> This part needs to be rewritten: >>>>>>>>> >>> >>>>>>>>> The individual racks may be part of a different Layer 3 network, >>>>>>>>> or they could be in a single Layer 2 network. The VXLAN >>>>>>>>> segments/overlays >>>>>>>>> are overlaid on top of Layer 3 network. A VM can communicate with >>>>>>>>> another >>>>>>>>> VM only if they are on the same VXLAN segment. >>>>>>>>> >>> >>>>>>>>> It's hard to parse and, given IRB, >>>>>>>>> >>>>>>>> GIM>> Would the following text be acceptable: >>>>>>>> OLD TEXT: >>>>>>>> VXLAN is typically deployed in data centers interconnecting >>>>>>>> virtualized hosts, which may be spread across multiple racks. >>>>>>>> The >>>>>>>> individual racks may be part of a different Layer 3 network, or >>>>>>>> they >>>>>>>> could be in a single Layer 2 network. The VXLAN >>>>>>>> segments/overlays >>>>>>>> are overlaid on top of Layer 3 network. >>>>>>>> NEW TEXT: >>>>>>>> VXLAN is typically deployed in data centers interconnecting >>>>>>>> virtualized >>>>>>>> hosts of a tenant. VXLAN addresses requirements of the Layer 2 and >>>>>>>> Layer 3 data center network infrastructure in the presence of VMs >>>>>>>> in >>>>>>>> a multi-tenant environment, discussed in section 3 [RFC7348], by >>>>>>>> providing Layer 2 overlay scheme on a Layer 3 network. >>>>>>>> >>>>>>> >>>>>>> [ag] This is a lot better. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> A VM can communicate with another VM only if they are on the same >>>>>>>> VXLAN segment. >>>>>>>>> >>>>>>>>> the last sentence above is wrong. 
>>>>>>>>> >>>>>>>> GIM>> Section 4 in RFC 7348 states: >>>>>>>> Only VMs within the same VXLAN segment can communicate with each >>>>>>>> other. >>>>>>>> >>>>>>> >>>>>>> [ag] VMs on different segments can communicate using routing/IRB, so >>>>>>> even RFC 7348 is wrong. Perhaps the text should be modified to say -- >>>>>>> "In >>>>>>> the absence of a router in the overlay, a VM can communicate...". >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Section 3: >>>>>>>>> >>> >>>>>>>>> Most deployments will have VMs with only L2 capabilities that >>>>>>>>> may not support L3. >>>>>>>>> >>> >>>>>>>>> Are you suggesting most deployments have VMs with no IP >>>>>>>>> addresses/configuration? >>>>>>>>> >>>>>>>> GIM>> Would re-word as follows: >>>>>>>> OLD TEXT: >>>>>>>> Most deployments will have VMs with only L2 capabilities that >>>>>>>> may not support L3. >>>>>>>> NEW TEXT: >>>>>>>> Deployments may have VMs with only L2 capabilities that do not >>>>>>>> support L3. >>>>>>>> >>>>>>> >>>>>>> [ag] I still don't understand this. What does it mean for a VM to >>>>>>> not support L3? No IP address, no default GW, something else? >>>>>>> >>>>>> GIM2>> A VM communicates with its VTEP which, in turn, originates the VXLAN >>>>>> tunnel. A VM is not required to have an IP address as it is the VTEP's IP address >>>>>> that the VM's MAC is associated with. As for the gateway, RFC 7348 discusses the >>>>>> VXLAN >>>>>> gateway as the device that forwards traffic between VXLAN and non-VXLAN >>>>>> domains. Considering all that, would the following change be acceptable: >>>>>> OLD TEXT: >>>>>> Most deployments will have VMs with only L2 capabilities that >>>>>> may not support L3. >>>>>> NEW TEXT: >>>>>> Most deployments will have VMs with only L2 capabilities and not >>>>>> have an IP address assigned. >>>>>> >>>>> >>>>> [ag2] Do you have a reference for this (i.e. that most deployments >>>>> have VMs without an IP address)? Normally I would think VMs would have an >>>>> IP address. It's just that they are segregated into segments and, without >>>>> an intervening router, they are restricted to communicate only within >>>>> their >>>>> subnet. >>>>> >>>> GIM3>> Would the following text be acceptable: >>>> >>>> Deployments might have VMs with only L2 capabilities and not have an IP >>>> address assigned or, >>>> in other cases, VMs are assigned an IP address but are restricted to >>>> communicate only within their subnet. >>>> >>>> >>> [ag3] Yes, this is better. >>> >>> >>>>>>> >>>>>>>> >>>>>>>>> >>> >>>>>>>>> Having a hierarchical OAM model helps localize faults though it >>>>>>>>> requires additional consideration. >>>>>>>>> >>> >>>>>>>>> What are the additional considerations? >>>>>>>>> >>>>>>>> GIM>> For example, coordination of BFD intervals across the OAM >>>>>>>> layers. >>>>>>>> >>>>>>> >>>>>>> [ag] Can we mention them in the draft? >>>>>>> >>>>>>> >>>>>>>> >>>>>>>>> Would be useful to add a reference to RFC 8293 in case the reader >>>>>>>>> would like to know more about service nodes. >>>>>>>>> >>>>>>>> GIM>> I have to admit that I don't find how RFC 8293 A Framework >>>>>>>> for Multicast in Network Virtualization over Layer 3 is related to this >>>>>>>> document. Please help with an additional reference to the text of the >>>>>>>> document. >>>>>>>> >>>>>>> >>>>>>> [ag] The RFC discusses the use of service nodes which is mentioned >>>>>>> here. 
>>>>>>> >>>>>>> >>>>>>>> >>>>>>>>> Section 4 >>>>>>>>> >>> >>>>>>>>> Separate BFD sessions can be established between the VTEPs (IP1 >>>>>>>>> and IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200). >>>>>>>>> >>> >>>>>>>>> IMO, the document should mention that this could lead to scaling >>>>>>>>> issues given that VTEPs can support well in excess of 4K VNIs. >>>>>>>>> Additionally, we should mention that with IRB, a given VNI may not >>>>>>>>> even >>>>>>>>> exist on the destination VTEP. Finally, what is the benefit of doing >>>>>>>>> this? There may be certain corner cases where it's useful (vs a >>>>>>>>> single BFD >>>>>>>>> session between the VTEPs for all VNIs) but it would be good to >>>>>>>>> explain >>>>>>>>> what those are. >>>>>>>>> >>>>>>>> GIM>> Will add text in the Security Considerations section that >>>>>>>> VTEPs should have a limit on the number of BFD sessions. >>>>>>>> >>>>>>> >>>>>>> [ag] I was hoping for two things: >>>>>>> - A mention about the scalability issue right where per-VNI BFD is >>>>>>> discussed. (Not sure why that is a security issue/consideration.) >>>>>>> >>>>>> GIM2>> I've added the following sentence in both places: >>>>>> The implementation SHOULD have a reasonable upper bound on the number >>>>>> of BFD sessions that can be created between the same pair of VTEPs. >>>>>> >>>>> >>>>> [ag2] What are the criteria for determining what is reasonable? >>>>> >>>> GIM>> I usually understand that as a requirement to make it controllable, >>>> i.e., have a configurable limit. Thus it will be up to a network operator to set >>>> the limit. >>>> >>>>> >>>>> >>>>>> - What is the benefit of running BFD per VNI between a pair of VTEPs? >>>>>>> >>>>>> GIM2>> An alternative would be to run CFM between VMs, if there's the >>>>>> need to monitor liveliness of the particular VM. Again, this is optional. >>>>>> >>>>> >>>>> [ag2] I'm not sure how running per-VNI BFD between the VTEPs allows >>>>> one to monitor the liveliness of VMs. >>>>> >>>> >>> [ag3] I think you missed responding to this. I'm not sure of the value >>> of running BFD per VNI between VTEPs. What am I getting that is not >>> covered by running a single BFD session with VNI 0 between the VTEPs? >>> >>> >>>> >>>>> >>>>>> >>>>>>> >>>>>>>> >>>>>>>>> Sections 5.1 and 6.1 >>>>>>>>> >>>>>>>>> In 5.1 we have >>>>>>>>> >>> >>>>>>>>> The inner MAC frame carrying the BFD payload has the >>>>>>>>> following format: >>>>>>>>> ... Source IP: IP address of the originating VTEP. Destination IP: >>>>>>>>> IP address of the terminating VTEP. >>>>>>>>> >>> >>>>>>>>> >>>>>>>>> In 6.1 we have >>>>>>>>> >>> >>>>>>>>> >>>>>>>>> Since multiple BFD sessions may be running between two >>>>>>>>> VTEPs, there needs to be a mechanism for demultiplexing received BFD >>>>>>>>> >>>>>>>>> packets to the proper session. The procedure for demultiplexing >>>>>>>>> packets with Your Discriminator equal to 0 is different from [RFC5880 >>>>>>>>> <https://tools.ietf.org/html/rfc5880>]. >>>>>>>>> >>>>>>>>> *For such packets, the BFD session MUST be identified* >>>>>>>>> >>>>>>>>> *using the inner headers, i.e., the source IP and the destination IP >>>>>>>>> present in the IP header carried by the payload of the VXLAN* >>>>>>>>> >>>>>>>>> *encapsulated packet.* >>>>>>>>> >>>>>>>>> >>>>>>>>> >>> >>>>>>>>> How does this work if the source IP and dest IP are the same as >>>>>>>>> specified in 5.1? >>>>>>>>> >>>>>>>> GIM>> You're right, the destination and source IP addresses likely are >>>>>>>> the same in this case. 
Will add that the source UDP port number, along >>>>>>>> with >>>>>>>> the pair of IP addresses, MUST be used to demux received BFD control >>>>>>>> packets. Would you agree that will be sufficient? >>>>>>>> >>>>>>> >>>>>>> [ag] Yes, I think that should work. >>>>>>> >>>>>>>> >>>>>>>>> Editorial >>>>>>>>> >>>>>>>> >>>>>>> [ag] Agree with all comments on this section. >>>>>>> >>>>>>>> >>>>>>>>> - Terminology section should be renamed to acronyms. >>>>>>>>> >>>>>>>> GIM>> Accepted >>>>>>>> >>>>>>>>> - Document would benefit from a thorough editorial scrub, but >>>>>>>>> maybe that will happen once it gets to the RFC editor. >>>>>>>>> >>>>>>>> GIM>> Will certainly have helpful comments from ADs and RFC editor. >>>>>>>> >>>>>>>>> >>>>>>>>> Section 1 >>>>>>>>> >>> >>>>>>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348 >>>>>>>>> <https://tools.ietf.org/html/rfc7348>]. provides an encapsulation >>>>>>>>> scheme that allows virtual machines (VMs) to communicate in a data >>>>>>>>> center >>>>>>>>> network. >>>>>>>>> >>> >>>>>>>>> This is not accurate. VXLAN allows you to implement an overlay to >>>>>>>>> decouple the address space of the attached hosts from that of the >>>>>>>>> network. >>>>>>>>> >>>>>>>> GIM>> Thank you for the suggested text. Will change as follows: >>>>>>>> OLD TEXT: >>>>>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. >>>>>>>> provides >>>>>>>> an encapsulation scheme that allows virtual machines (VMs) to >>>>>>>> communicate in a data center network. >>>>>>>> NEW TEXT: >>>>>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. >>>>>>>> provides >>>>>>>> an encapsulation scheme that allows building an overlay network >>>>>>>> by >>>>>>>> decoupling the address space of the attached virtual hosts from >>>>>>>> that of the network. >>>>>>>> >>>>>>>>> >>>>>>>>> Section 7 >>>>>>>>> >>>>>>>>> VTEP's -> VTEPs >>>>>>>>> >>>>>>>> GIM>> Yes, thank you. >>>>>>>> >>>>>>>
