Hi Dinesh,
many thanks for your detailed updates on how some implementations process
the VXLAN header and the inner Ethernet frame. They are very helpful in
arriving at a workable solution for the problem at hand.
You've noted that a path between VTEPs may be monitored in the underlay
network by merely establishing a BFD session. That is true, but by using
BFD with VXLAN encapsulation between the pair of VTEPs we extend the
OAM domain to include, to some extent, the VXLAN forwarding engine. The
Abstract of RFC 5880 defines the goal and the domain in which the BFD
protocol can detect a fault as:
   This document describes a protocol intended to detect faults in the
   bidirectional path between two forwarding engines, including
   interfaces, data link(s), and to the extent possible the forwarding
   engines themselves, with potentially very low latency.
Thus, BFD in the underlay exercises part of the IP forwarding engine,
while BFD with VXLAN encapsulation, run between the same pair of VTEPs,
extends the OAM domain. At the same time, defining BFD between tenant
systems is outside the scope of this specification. But a VXLAN BFD session
between VTEPs may be useful in monitoring the e2e path between tenants, as
described in the update to -07:
   At the same time, a service layer BFD session may be used between the
   tenants of VTEPs IP1 and IP2 to provide end-to-end fault management.
   In such case, for VTEPs BFD control packets of that session are
   indistinguishable from data packets.  If end-to-end defect detection
   is realized as the set of concatenated OAM domains, e.g., VM1-1 - IP1
   -- IP2 - VM2-1, then the BFD session over VXLAN between VTEPs SHOULD
   follow the procedures described in Section 6.8.17 [RFC5880].
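
As a rough illustration of the concatenated-path signalling that
Section 6.8.17 of RFC 5880 describes, the sketch below shows a VTEP
propagating a failure learned from an adjacent segment by changing only
the diagnostic code while the VTEP-to-VTEP session stays Up. The class
and function names (VtepBfdSession, signal_concatenated_failure) are
hypothetical and are not taken from the draft or from any implementation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Diag(IntEnum):
    """Subset of the BFD diagnostic codes defined in RFC 5880."""
    NO_DIAGNOSTIC = 0
    CONCATENATED_PATH_DOWN = 6
    REVERSE_CONCATENATED_PATH_DOWN = 8


@dataclass
class VtepBfdSession:
    """Hypothetical, minimal view of a BFD session between two VTEPs."""
    state: str = "Up"
    local_diag: Diag = Diag.NO_DIAGNOSTIC
    remote_demand_mode: bool = False
    poll_needed: bool = False


def signal_concatenated_failure(session, forward=True):
    """Propagate a failure of an adjacent path segment over the session.

    Only the diagnostic code changes; the session itself is not taken down.
    If the remote system runs Demand mode, a Poll Sequence is needed so
    that the new diagnostic code is actually delivered.
    """
    if session.state != "Up":
        # Nothing to propagate over a session that is not Up.
        return session
    session.local_diag = (
        Diag.CONCATENATED_PATH_DOWN
        if forward
        else Diag.REVERSE_CONCATENATED_PATH_DOWN
    )
    session.poll_needed = session.remote_demand_mode
    return session
```

How a VTEP learns about the failure of the VM-to-VTEP segment is left
out of the sketch on purpose.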
I've attached the current working version of the draft.

Regards,
Greg


On Fri, Aug 2, 2019 at 5:43 PM Dinesh Dutt <[email protected]> wrote:

> What I mean is "How do you infer that it excludes the case I'm talking
> about?".
>
> Dinesh
>
> On Fri, Aug 2, 2019 at 5:41 PM Dinesh Dutt <[email protected]> wrote:
>
>> The abstract reads this: "
>>
>> This document describes the use of the Bidirectional Forwarding
>>    Detection (BFD) protocol in point-to-point Virtual eXtensible Local
>>    Area Network (VXLAN) tunnels forming up an overlay network."
>>
>> How do you infer what you said?
>>
>> Dinesh
>>
>>
>> On Fri, Aug 2, 2019 at 5:38 PM Joel M. Halpern <[email protected]>
>> wrote:
>>
>>> I am going by what the draft says its purpose is.  If you (Dinesh) want
>>> the draft to fulfill a different purpose, then either ask the chairs to
>>> take this draft back to the WG, or write a separate draft.
>>> As currently written, the behavior Greg proposed meets the needs, and
>>> does so in a way that is consistent with VxLAN.
>>>
>>> Yours,
>>> Joel
>>>
>>> On 8/2/2019 8:30 PM, Dinesh Dutt wrote:
>>> > What is the stated purpose of this BFD session? The VTEP reachability is
>>> > determined by the underlay, I don't need a VXLAN-encapsulated packet for
>>> > that. Do we agree?
>>> >
>>> > If I want to test the VXLAN encap/decap functionality alone, picking any
>>> > single VNI may be fine. But is this all any network operator wants? Why?
>>> > In what situations has this been a problem? I suspect operators also
>>> > want to verify path continuity over a specific VNI. If you say this is
>>> > not defined by the document, I disagree because the current version
>>> > talks about controlling the number of BFD sessions between the VTEPs
>>> > (see section 3). More importantly, this is a real problem that operators
>>> > like to verify.
>>> >
>>> > Dinesh
>>> >
>>> > On Fri, Aug 2, 2019 at 5:08 PM Joel M. Halpern <[email protected]>
>>> > wrote:
>>> >
>>> >     What is special about the management VNI is precisely that it is NOT a
>>> >     tenant VNI.  The VxLAN administration does know how it allocates VNI to
>>> >     tenants, and which VNI it has allocated.  In contrast, it does not know
>>> >     which IP addresses or MAC addresses the tenant is using or may plan
>>> >     to use.
>>> >
>>> >     Yours,
>>> >     Joel
>>> >
>>> >     On 8/2/2019 6:41 PM, Dinesh Dutt wrote:
>>> >      > The assumption of an IP address within any VNI is suspect that way.
>>> >      > What's special about a single VNI, the management VNI? The VTEP IP
>>> >      > address does not belong in reality in any VNI.
>>> >      >
>>> >      > Dinesh
>>> >      >
>>> >      > On Fri, Aug 2, 2019 at 3:17 PM Joel M. Halpern
>>> >      > <[email protected]> wrote:
>>> >      >
>>> >      >     Your response seems to miss two points:
>>> >      >
>>> >      >     First, the problem you describe is not what the document says it is
>>> >      >     solving.  To the degree it discusses it at all, the document says "In
>>> >      >     most cases, a single BFD session is sufficient for the given VTEP to
>>> >      >     monitor the reachability of a remote VTEP, regardless of the number of
>>> >      >     VNIs in common."
>>> >      >
>>> >      >     Second, you assume the existence of an IP address for a VTEP within a
>>> >      >     VNI.  As with the MAC address, the VTEP does not have an IP address
>>> >      >     within the VNI.  Some implementations may have created such a thing, but
>>> >      >     the general construct, as defined to date, does not support such.
>>> >      >
>>> >      >     In short, you are requiring a behavior that violates the architectural
>>> >      >     structure of overlay / underlay separation, and common usage.  And you
>>> >      >     are doing so to support a use case that the working group has not
>>> >      >     indicated in the document as important.
>>> >      >
>>> >      >     Yours,
>>> >      >     Joel
>>> >      >
>>> >      >     On 8/2/2019 5:01 PM, Dinesh Dutt wrote:
>>> >      >      > Joel,
>>> >      >      >
>>> >      >      > You understood correctly.
>>> >      >      >
>>> >      >      > The VNIs may not share fate due to misconfiguration. And I strongly
>>> >      >      > suspect someone will want to use BFD for that because it's about
>>> >      >      > checking path continuity as stated by the draft. As long as there's a
>>> >      >      > valid IP (because it's BFD) owned by the VTEP in that VNI, you can use
>>> >      >      > BFD in that VNI. That's all that you need to dictate.  That IP address
>>> >      >      > has a MAC address and you can use that on the inner frame. That is all
>>> >      >      > normal VXLAN processing. The outer IP is always that of the VTEP.
>>> >      >      >
>>> >      >      > Dinesh
>>> >      >      >
>>> >      >      > On Fri, Aug 2, 2019 at 11:03 AM Joel M. Halpern
>>> >      >      > <[email protected]> wrote:
>>> >      >      >
>>> >      >      >     If I am reading your various emails correctly Dinesh (and I may have
>>> >      >      >     missed something) you are trying to use the MAC address because you
>>> >      >      >     want to be able to send these BFD packets over arbitrary VNI to
>>> >      >      >     monitor the VNI.  That is not a requirement identified in the
>>> >      >      >     document.  It is not even a problem I understand, since all the VNI
>>> >      >      >     between an ingress and egress VTEP share fate.
>>> >      >      >
>>> >      >      >     Yours,
>>> >      >      >     Joel
>>> >      >      >
>>> >      >      >     On 8/2/2019 1:44 PM, Dinesh Dutt wrote:
>>> >      >      >      > Thanks for verifying this. On Linux and hardware routers that I'm
>>> >      >      >      > aware of (Cisco circa 2012 and Cumulus), the physical MAC address is
>>> >      >      >      > reused across the VNIs on the VTEP. Did you check on a non-VMW
>>> >      >      >      > device? This is more for my own curiosity.
>>> >      >      >      >
>>> >      >      >      > To address the general case, can we not define a well-known (or
>>> >      >      >      > reserve one) unicast MAC address for use with VTEP? If the MAC
>>> >      >      >      > address is configurable in BFD command, this can be moot.
>>> >      >      >      >
>>> >      >      >      > Dinesh
>>> >      >      >      >
>>> >      >      >      > On Fri, Aug 2, 2019 at 10:27 AM Santosh P K
>>> >      >      >      > <[email protected]> wrote:
>>> >      >      >      >
>>> >      >      >      >     I have cross-checked the point raised about MAC address usage. It
>>> >      >      >      >     is possible that a tenant could be using the physical MAC address,
>>> >      >      >      >     and when a packet comes with a valid VNI and a MAC address that is
>>> >      >      >      >     being used by a tenant, the packet will be sent to that tenant.
>>> >      >      >      >     This rules out using the physical MAC address as the inner MAC to
>>> >      >      >      >     ensure that packets are terminated at the VTEP itself.
>>> >      >      >      >
>>> >      >      >      >     Thanks
>>> >      >      >      >     Santosh P K
>>> >      >      >      >
>>> >      >      >      >     On Wed, Jul 31, 2019 at 11:00 AM Santosh P K
>>> >      >      >      >     <[email protected]> wrote:
>>> >      >      >      >
>>> >      >      >      >         Joel,
>>> >      >      >      >             Thanks for your inputs. I checked the implementation within
>>> >      >      >      >         VMware. Perhaps I should have been clearer about the MAC address
>>> >      >      >      >         space while checking internally. I will cross-check again and
>>> >      >      >      >         get back to this list.
>>> >      >      >      >
>>> >      >      >      >         Thanks
>>> >      >      >      >         Santosh P K
>>> >      >      >      >
>>> >      >      >      >         On Wed, Jul 31, 2019 at 10:54 AM Joel M. Halpern
>>> >      >      >      >         <[email protected]> wrote:
>>> >      >      >      >
>>> >      >      >      >             Sorry to ask a stupid question.  Whose
>>> >      >     implementation?
>>> >      >      >      >
>>> >      >      >      >             The reason I ask is that as far as I can tell, since the
>>> >      >      >      >             tenant does not have any control access to the VTEP, there is
>>> >      >      >      >             no reason for the VTEP to have a MAC address in the tenant
>>> >      >      >      >             space.  Yes, the device has a physical MAC address.  But the
>>> >      >      >      >             tenant could well be using that MAC address.  Yes, they would
>>> >      >      >      >             be violating the Ethernet spec.  But the whole point of
>>> >      >      >      >             segregation is not to care about such issues.
>>> >      >      >      >
>>> >      >      >      >             On the other hand, if you tell me that the VMWare
>>> >      >      >      >             implementation has an Ethernet address that is part of the
>>> >      >      >      >             tenant space, well, they made up this particular game.
>>> >      >      >      >
>>> >      >      >      >             Yours,
>>> >      >      >      >             Joel
>>> >      >      >      >
>>> >      >      >      >             On 7/31/2019 1:44 PM, Santosh P K wrote:
>>> >      >      >      >              > I have checked the implementation in the data path. When we
>>> >      >      >      >              > receive a packet with a valid VNI, a lookup for the MAC
>>> >      >      >      >              > happens, and if it is the VTEP's own MAC the packet is trapped
>>> >      >      >      >              > to the control plane for processing. I think we can have the
>>> >      >      >      >              > following options:
>>> >      >      >      >              > 1. Optional management VNI
>>> >      >      >      >              > 2. Mandatory inner MAC set to VTEP MAC
>>> >      >      >      >              > 3. Inner IP TTL set to 1 to avoid forwarding of the packet
>>> >      >      >      >              >    via the inner IP address.
>>> >      >      >      >              >
>>> >      >      >      >              > Thoughts?
>>> >      >      >      >              >
>>> >      >      >      >              > Thanks
>>> >      >      >      >              > Santosh P K
>>> >      >      >      >              >
>>> >      >      >      >              > On Wed, Jul 31, 2019 at 9:20 AM Greg Mirsky
>>> >      >      >      >              > <[email protected]> wrote:
>>> >      >      >      >              >
>>> >      >      >      >              >     Hi Dinesh,
>>> >      >      >      >              >     thank you for your consideration of the proposal and questions.
>>> >      >      >      >              >     What would you see as the scope of testing the connectivity for
>>> >      >      >      >              >     the specific VNI? If it is tenant-to-tenant, then VTEPs will
>>> >      >      >      >              >     treat these packets as regular user frames. More likely, these
>>> >      >      >      >              >     could be Layer 2 OAM, e.g. CCM frames. The reason to use 127/8
>>> >      >      >      >              >     for IPv4, and 0:0:0:0:0:FFFF:7F00:0/104 for IPv6 is to
>>> >      >      >      >              >     safeguard from leaking Ethernet frames with BFD Control packet
>>> >      >      >      >              >     to a tenant.
>>> >      >      >      >              >     You've suggested using a MAC address to trap the control packet
>>> >      >      >      >              >     at VTEP. What could that address be? We had proposed using the
>>> >      >      >      >              >     dedicated MAC and VTEP's MAC and both raised concerns among
>>> >      >      >      >              >     VXLAN experts. The idea of using Management VNI may be more
>>> >      >      >      >              >     acceptable based on its similarity to the practice of using
>>> >      >      >      >              >     Management VLAN.
>>> >      >      >      >              >
>>> >      >      >      >              >     Regards,
>>> >      >      >      >              >     Greg
>>> >      >      >      >              >
>>> >      >      >      >              >     On Wed, Jul 31, 2019 at 12:03 PM Dinesh Dutt
>>> >      >      >      >              >     <[email protected]> wrote:
>>> >      >      >      >              >
>>> >      >      >      >              >         Hi Greg,
>>> >      >      >      >              >
>>> >      >      >      >              >         As long as the inner MAC address is such that the packet is
>>> >      >      >      >              >         trapped to the CPU, it should be fine for use as an inner
>>> >      >      >      >              >         MAC, is it not? Stating that is better than trying to force
>>> >      >      >      >              >         a management VNI. What if someone wants to test connectivity
>>> >      >      >      >              >         on a specific VNI? I would not pick a loopback IP address
>>> >      >      >      >              >         for this since that address range is host/node local only.
>>> >      >      >      >              >         Is there a reason you're not using the VTEP IP as the inner
>>> >      >      >      >              >         IP address?
>>> >      >      >      >              >
>>> >      >      >      >              >         Dinesh
>>> >      >      >      >              >
>>> >      >      >      >              >         On Wed, Jul 31, 2019 at 5:48 AM Greg Mirsky
>>> >      >      >      >              >         <[email protected]> wrote:
>>> >      >      >      >              >
>>> >      >      >      >              >             Dear All,
>>> >      >      >      >              >             thank you for your comments, suggestions on this issue,
>>> >      >      >      >              >             probably the most challenging for this specification.
>>> >      >      >      >              >             In the course of our discussions, we've agreed to
>>> >      >      >      >              >             abandon the request to allocate the dedicated MAC
>>> >      >      >      >              >             address to be used as the destination MAC address in
>>> >      >      >      >              >             the inner Ethernet frame. Also, earlier using VNI 0 was
>>> >      >      >      >              >             changed from mandatory to one of the options an
>>> >      >      >      >              >             implementation may offer to an operator. The most
>>> >      >      >      >              >             recent discussion was whether VTEP's MAC address might
>>> >      >      >      >              >             be used as the destination MAC address in the inner
>>> >      >      >      >              >             Ethernet frame. As I recall it, the comments from VXLAN
>>> >      >      >      >              >             experts equally split with one for it and one against.
>>> >      >      >      >              >             Hence I would like to propose a new text to resolve the
>>> >      >      >      >              >             issue. The idea is to let an operator select Management
>>> >      >      >      >              >             VNI and use that VNI in VXLAN encapsulation of BFD
>>> >      >      >      >              >             Control packets:
>>> >      >      >      >              >             NEW TEXT:
>>> >      >      >      >              >
>>> >      >      >      >              >                 An operator MUST select a VNI number to be used as
>>> >      >      >      >              >                 Management VNI. VXLAN packet for Management VNI
>>> >      >      >      >              >                 MUST NOT be sent to a tenant. VNI number 1 is
>>> >      >      >      >              >                 RECOMMENDED as the default for Management VNI.
>>> >      >      >      >              >
>>> >      >      >      >              >             With that new text, what can be the value of the
>>> >      >      >      >              >             destination MAC in the inner Ethernet? I tend to
>>> >      >      >      >              >             believe that it can be anything and ignored by the
>>> >      >      >      >              >             receiver VTEP. Also, if the trapping is based on VNI
>>> >      >      >      >              >             number, the destination IP address of the inner IP
>>> >      >      >      >              >             packet can be from the range 127/8 for IPv4, and for
>>> >      >      >      >              >             IPv6 from the range 0:0:0:0:0:FFFF:7F00:0/104. And
>>> >      >      >      >              >             lastly, the TTL to be set to 1 (no change here).
>>> >      >      >      >              >
>>> >      >      >      >              >             Much appreciate your comments, questions, and
>>> >      >      >      >              >             suggestions.
>>> >      >      >      >              >
>>> >      >      >      >              >             Best regards,
>>> >      >      >      >              >             Greg
>>> >      >      >      >              >
>>> >      >      >      >
>>> >      >      >
>>> >      >
>>> >
>>>
>>



BFD                                                   S. Pallagatti, Ed.
Internet-Draft                                                    VMware
Intended status: Standards Track                             S. Paragiri
Expires: January 8, 2020                          Individual Contributor
                                                             V. Govindan
                                                            M. Mudigonda
                                                                   Cisco
                                                               G. Mirsky
                                                               ZTE Corp.
                                                            July 7, 2019


                             BFD for VXLAN
                        draft-ietf-bfd-vxlan-08

Abstract

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in point-to-point Virtual eXtensible Local
   Area Network (VXLAN) tunnels forming up an overlay network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 8, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect



Pallagatti, et al.       Expires January 8, 2020                [Page 1]

Internet-Draft                BFD for VXLAN                    July 2019


   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions used in this document . . . . . . . . . . . . . .   3
     2.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
     2.2.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   3.  Deployment  . . . . . . . . . . . . . . . . . . . . . . . . .   4
   4.  BFD Packet Transmission over VXLAN Tunnel . . . . . . . . . .   5
   5.  Reception of BFD Packet from VXLAN Tunnel . . . . . . . . . .   7
     5.1.  Demultiplexing of the BFD Packet  . . . . . . . . . . . .   7
   6.  Use of the Specific VNI . . . . . . . . . . . . . . . . . . .   8
   7.  Echo BFD  . . . . . . . . . . . . . . . . . . . . . . . . . .   8
   8.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   8
   9.  Security Considerations . . . . . . . . . . . . . . . . . . .   8
   10. Contributors  . . . . . . . . . . . . . . . . . . . . . . . .   8
   11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .   9
   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .   9
     12.1.  Normative References . . . . . . . . . . . . . . . . . .   9
     12.2.  Informational References . . . . . . . . . . . . . . . .   9
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  10

1.  Introduction

   "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides an
   encapsulation scheme that allows building an overlay network by
   decoupling the address space of the attached virtual hosts from that
   of the network.

   One use of VXLAN is in data centers interconnecting virtual machines
   (VMs) of a tenant.  VXLAN addresses requirements of the Layer 2 and
   Layer 3 data center network infrastructure in the presence of VMs in
   a multi-tenant environment by providing a Layer 2 overlay scheme on a
   Layer 3 network [RFC7348].  Another use is as an encapsulation for
   Ethernet VPN [RFC8365].

   This document is written assuming the use of VXLAN for virtualized
   hosts and refers to VMs and VXLAN Tunnel End Points (VTEPs) in
   hypervisors.  However, the concepts are equally applicable to non-
   virtualized hosts attached to VTEPs in switches.

   In the absence of a router in the overlay, a VM can communicate with
   another VM only if they are on the same VXLAN segment.  VMs are
   unaware of VXLAN tunnels as a VXLAN tunnel is terminated on a VTEP.
   VTEPs are responsible for encapsulating and decapsulating frames
   exchanged among VMs.

   The ability to monitor path continuity, i.e., to perform a proactive
   continuity check (CC) for point-to-point (p2p) VXLAN tunnels, is
   important.  The asynchronous mode of BFD, as defined in [RFC5880], is
   used to monitor a p2p VXLAN tunnel.

   In the case where a Multicast Service Node (MSN) (as described in
   Section 3.3 of [RFC8293]) resides behind a Network Virtualization
   Endpoint (NVE), the mechanisms described in this document apply and
   can, therefore, be used to test the connectivity from the source NVE
   to the MSN.

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol to monitor the continuity of the path
   between VXLAN VTEPs acting as Network Virtualization Endpoints,
   and/or the availability of a replicator Multicast Service Node.

2.  Conventions used in this document

2.1.  Terminology

   BFD Bidirectional Forwarding Detection

   CC Continuity Check

   p2p Point-to-point

   MSN Multicast Service Node

   NVE Network Virtualization Endpoint

   VFI Virtual Forwarding Instance

   VM Virtual Machine

   VNI VXLAN Network Identifier (or VXLAN Segment ID)

   VTEP VXLAN Tunnel End Point

   VXLAN Virtual eXtensible Local Area Network

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Deployment

   Figure 1 illustrates the scenario with two servers, each of them
   hosting two VMs.  The servers host VTEPs that terminate two VXLAN
   tunnels with VXLAN Network Identifiers (VNIs) 100 and 200,
   respectively.  Separate BFD sessions can be established between the
   VTEPs (IP1 and IP2) for monitoring each of the VXLAN tunnels (VNI 100
   and 200).  An implementation that supports this specification MUST be
   able to control the number of BFD sessions that can be created
   between the same pair of VTEPs.  BFD packets intended for a
   Hypervisor VTEP MUST NOT be forwarded to a VM, as a VM may drop BFD
   packets, leading to a false negative.  This method is applicable
   whether the VTEP is a virtual or physical device.


      +------------+-------------+
      |        Server 1          |
      | +----+----+  +----+----+ |
      | |VM1-1    |  |VM1-2    | |
      | |VNI 100  |  |VNI 200  | |
      | |         |  |         | |
      | +---------+  +---------+ |
      | Hypervisor VTEP (IP1)    |
      +--------------------------+
                            |
                            |   +-------------+
                            |   |   Layer 3   |
                            +---|   Network   |
                                +-------------+
                                    |
                                    +-----------+
                                                |
                                         +------------+-------------+
                                         |    Hypervisor VTEP (IP2) |
                                         | +----+----+  +----+----+ |
                                         | |VM2-1    |  |VM2-2    | |
                                         | |VNI 100  |  |VNI 200  | |
                                         | |         |  |         | |
                                         | +---------+  +---------+ |
                                         |      Server 2            |
                                         +--------------------------+


                     Figure 1: Reference VXLAN Domain

   At the same time, a service layer BFD session may be used between the
   tenants of VTEPs IP1 and IP2 to provide end-to-end fault management.
   In such a case, for the VTEPs, BFD control packets of that session
   are indistinguishable from data packets.  If end-to-end defect
   detection is realized as a set of concatenated OAM domains, e.g.,
   VM1-1 - IP1 -- IP2 - VM2-1, then the BFD session over VXLAN between
   the VTEPs SHOULD follow the procedures described in Section 6.8.17 of
   [RFC5880].

4.  BFD Packet Transmission over VXLAN Tunnel

   BFD packets MUST be encapsulated and sent to a remote VTEP as
   explained in this section.  Implementations SHOULD ensure that the
   BFD packets follow the same lookup path as VXLAN data packets within
   the sender system.

   BFD packets are encapsulated in VXLAN as described below.  The VXLAN
   packet format is defined in Section 5 of [RFC7348].  The Outer IP/UDP
   and VXLAN headers MUST be encoded by the sender as defined in
   [RFC7348].

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                      Outer Ethernet Header                    ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                        Outer IPvX Header                      ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                        Outer UDP Header                       ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                           VXLAN Header                        ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                    Inner Ethernet Header                      ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                        Inner IPvX Header                      ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                         Inner UDP Header                      ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    ~                       BFD Control Message                     ~
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            FCS                                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 2: VXLAN Encapsulation of BFD Control Message
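
   As an illustration of the VXLAN Header layer in Figure 2, the
   following minimal sketch encodes and decodes the 8-octet header
   layout from [RFC7348] (I flag, 24-bit VNI, reserved bits set to
   zero).  The function names build_vxlan_header and parse_vni are
   hypothetical and are not defined by this document.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned VXLAN destination UDP port
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field carries a valid value


def build_vxlan_header(vni: int) -> bytes:
    """Encode the 8-octet VXLAN header for the given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) followed by 24 reserved bits.
    # Word 1: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Return the VNI from a VXLAN header, rejecting a cleared I flag."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag is not set; VNI is not valid")
    return word1 >> 8
```

   For example, build_vxlan_header(100) yields the header for the VNI
   100 tunnel of Figure 1, and parse_vni recovers the VNI on receipt.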

   The BFD packet MUST be carried inside the inner MAC frame of the
   VXLAN packet.  The inner MAC frame carrying the BFD payload has the
   following format:

      Ethernet Header:

         Destination MAC: This MUST be the MAC address of the
         destination VTEP.  The MAC address MAY be configured or it MAY
         be learned via a control plane protocol.  The details of how
         the MAC address of the destination VTEP is obtained are outside
         the scope of this document.

         Source MAC: MAC address of the originating VTEP.

      IP header:

         Source IP: IP address of the originating VTEP.

         Destination IP: IP address of the terminating VTEP.

         TTL: MUST be set to 1 to ensure that the BFD packet is not
         routed within the L3 underlay network.

      The fields of the UDP header and the BFD control packet are
      encoded as specified in [RFC5881].
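
   The inner frame described above can be sketched as follows.  This
   is an illustration only, under stated assumptions: IPv4 only, the
   inner UDP checksum left at zero for brevity, and illustrative timer
   defaults.  The helper names (build_bfd_control, build_inner_frame,
   ipv4_checksum) are hypothetical.  The BFD control packet layout is
   the 24-octet mandatory section of [RFC5880], and the destination
   UDP port 3784 is taken from [RFC5881].

```python
import socket
import struct

BFD_UDP_PORT = 3784   # BFD control destination port per RFC 5881
INNER_TTL = 1         # value required by this section


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum over the 16-bit words of an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_bfd_control(my_disc: int, your_disc: int, state: int = 1,
                      detect_mult: int = 3, tx_us: int = 1_000_000,
                      rx_us: int = 1_000_000) -> bytes:
    """Mandatory section of a BFD control packet (no authentication)."""
    vers_diag = (1 << 5) | 0            # version 1, diagnostic 0
    state_flags = (state & 0x3) << 6    # state in top two bits; P/F/C/A/D/M clear
    return struct.pack("!BBBBIIIII", vers_diag, state_flags, detect_mult,
                       24, my_disc, your_disc, tx_us, rx_us, 0)


def build_inner_frame(src_mac: bytes, dst_mac: bytes, src_ip: str,
                      dst_ip: str, src_port: int, bfd: bytes) -> bytes:
    """Inner Ethernet/IPv4/UDP frame carrying a BFD control packet."""
    udp_len = 8 + len(bfd)
    # Assumption: UDP checksum set to 0 (not computed) to keep the sketch short.
    udp = struct.pack("!HHHH", src_port, BFD_UDP_PORT, udp_len, 0)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + udp_len, 0, 0,
                     INNER_TTL, 17, 0,
                     socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    ip = ip[:10] + struct.pack("!H", ipv4_checksum(ip)) + ip[12:]
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)
    return eth + ip + udp + bfd
```

   A sender would prepend the outer Ethernet, IP, UDP, and VXLAN
   headers of Figure 2 to the frame returned by build_inner_frame.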

5.  Reception of BFD Packet from VXLAN Tunnel

   Once a packet is received, the VTEP MUST validate the packet.  If the
   Destination MAC of the inner Ethernet frame matches the MAC address
   of the VTEP, the packet MUST be processed further.  If the
   Destination MAC of the inner Ethernet frame doesn't match any of the
   VTEP's MAC addresses, then the processing of the received VXLAN
   packet MUST follow the procedures described in Section 4.1 of
   [RFC7348].

   The UDP destination port and the TTL of the inner IP packet MUST be
   validated to determine if the received packet can be processed by
   BFD.  A BFD packet with the inner Destination MAC set to the VTEP's
   MAC address MUST NOT be forwarded to VMs.
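
   The checks above might be sketched as follows, for IPv4 inner
   packets only.  Two points are assumptions of the sketch rather than
   statements of this document: the expected inner TTL is taken to be
   the value the sender sets in Section 4, and a frame addressed to the
   VTEP that fails the BFD checks is dropped.  The function name
   classify_inner_frame is hypothetical.

```python
import struct

BFD_UDP_PORT = 3784       # BFD control destination port per RFC 5881
EXPECTED_INNER_TTL = 1    # assumption: same value the sender sets in Section 4


def classify_inner_frame(frame: bytes, vtep_macs: set) -> str:
    """Return 'vxlan', 'bfd', or 'drop' for a decapsulated inner frame.

    'vxlan' means regular processing per Section 4.1 of RFC 7348.
    """
    if len(frame) < 14:
        return "drop"
    if frame[0:6] not in vtep_macs:
        return "vxlan"                      # not addressed to this VTEP
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800 or len(frame) < 14 + 20:
        return "drop"                       # sketch handles IPv4 only
    ihl = (frame[14] & 0x0F) * 4            # IPv4 header length in octets
    ttl, proto = frame[22], frame[23]
    if ihl < 20 or proto != 17 or len(frame) < 14 + ihl + 8:
        return "drop"
    udp_start = 14 + ihl
    dst_port = struct.unpack("!H", frame[udp_start + 2:udp_start + 4])[0]
    if dst_port == BFD_UDP_PORT and ttl == EXPECTED_INNER_TTL:
        return "bfd"
    # Addressed to the VTEP but not valid BFD: never handed to a VM.
    return "drop"
```

   Only frames classified as 'vxlan' are candidates for delivery to a
   VM, which keeps BFD packets addressed to the VTEP away from tenants.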

5.1.  Demultiplexing of the BFD Packet

   Demultiplexing of IP BFD packets is defined in Section 3 of
   [RFC5881].  Since multiple BFD sessions may be running between two
   VTEPs, there needs to be a mechanism for demultiplexing received BFD
   packets to the proper session.  The procedure for demultiplexing
   packets with Your Discriminator equal to 0 is different from
   [RFC5880].  For such packets, the BFD session MUST be identified
   using the following three-tuple of fields of the inner headers: the
   source IP, the destination IP, and the source UDP port number
   carried in the payload of the VXLAN-encapsulated packet.  If a BFD
   packet is received with a non-zero Your Discriminator, then the BFD
   session MUST be demultiplexed only with Your Discriminator as the
   key.
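
   One way to picture this lookup is a table with two indexes, as in
   the minimal sketch below.  The class and field names are
   hypothetical, and the sketch assumes the three-tuple for each
   session is already known to the receiver when the first packet
   arrives; how it is provisioned is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class BfdSession:
    local_disc: int      # our My Discriminator; echoed back as Your Discriminator
    peer_ip: str         # inner source IP of packets received from the peer
    local_ip: str        # inner destination IP of packets received from the peer
    peer_src_port: int   # inner source UDP port used by the peer


class SessionTable:
    def __init__(self) -> None:
        self._by_disc: Dict[int, BfdSession] = {}
        self._by_tuple: Dict[Tuple[str, str, int], BfdSession] = {}

    def add(self, session: BfdSession) -> None:
        self._by_disc[session.local_disc] = session
        key = (session.peer_ip, session.local_ip, session.peer_src_port)
        self._by_tuple[key] = session

    def lookup(self, your_disc: int, src_ip: str, dst_ip: str,
               src_port: int) -> Optional[BfdSession]:
        """Select the session for a received BFD control packet."""
        if your_disc != 0:
            # Non-zero Your Discriminator is the only key.
            return self._by_disc.get(your_disc)
        # Your Discriminator is 0: fall back to the inner three-tuple.
        return self._by_tuple.get((src_ip, dst_ip, src_port))
```

   Because distinct sessions between the same pair of VTEPs share the
   inner IP addresses, the source UDP port is what separates them while
   Your Discriminator is still 0.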






6.  Use of the Specific VNI

   In most cases, a single BFD session is sufficient for the given VTEP
   to monitor the reachability of a remote VTEP, regardless of the
   number of VNIs in common.  When the single BFD session is used to
   monitor the reachability of the remote VTEP, an implementation SHOULD
   choose any of the VNIs but MAY choose VNI = 0.

7.  Echo BFD

   Support for echo BFD is outside the scope of this document.

8.  IANA Considerations

   This specification requests no IANA action.  This section may be
   deleted before publication.

9.  Security Considerations

   The document requires setting the inner IP TTL to 1, which could be
   used as a DDoS attack vector.  Thus the implementation MUST have
   throttling in place to control the rate of BFD control packets sent
   to the control plane.  On the other hand, overly aggressive
   throttling of BFD control packets may make it impossible to form and
   maintain BFD sessions at scale.  Hence, throttling of BFD control
   packets SHOULD be adjusted to permit BFD to work according to its
   procedures.
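
   This document does not prescribe a throttling mechanism.  As one
   possible illustration, the sketch below uses a token bucket, a
   common rate-limiting technique chosen here as an assumption.  The
   class name and the rate_pps and burst parameters are hypothetical;
   their values would need to be sized to the number of sessions and
   their negotiated intervals so that legitimate BFD traffic is not
   starved.

```python
class TokenBucket:
    """Admit at most rate_pps packets per second, with bursts up to burst."""

    def __init__(self, rate_pps: float, burst: float) -> None:
        self.rate = float(rate_pps)
        self.capacity = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # timestamp of the previous decision

    def allow(self, now: float) -> bool:
        """Decide whether a packet arriving at time now (seconds) is punted."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

   The caller passes a monotonic timestamp, for example from
   time.monotonic(), and forwards the BFD control packet to the control
   plane only when allow returns True.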

   If the implementation supports establishing multiple BFD sessions
   between the same pair of VTEPs, there SHOULD be a mechanism to
   control the maximum number of such sessions that can be active at the
   same time.

   Other than setting the inner IP TTL to 1 and limiting the number of
   BFD sessions between the same pair of VTEPs, this specification does
   not raise any additional security issues beyond those of the
   specifications referred to in the list of normative references.

10.  Contributors


   Reshad Rahman
   [email protected]
   Cisco

11.  Acknowledgments

   The authors would like to thank Jeff Haas of Juniper Networks for his
   reviews and feedback on this material.

   The authors would also like to thank Nobo Akiya, Marc Binderberger,
   Shahram Davari, Donald E. Eastlake 3rd, and Anoop Ghanwani for their
   extensive reviews and most detailed and helpful comments.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010,
              <https://www.rfc-editor.org/info/rfc5880>.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              DOI 10.17487/RFC5881, June 2010,
              <https://www.rfc-editor.org/info/rfc5881>.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014,
              <https://www.rfc-editor.org/info/rfc7348>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

12.2.  Informational References

   [RFC8293]  Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R.
              Krishnan, "A Framework for Multicast in Network
              Virtualization over Layer 3", RFC 8293,
              DOI 10.17487/RFC8293, January 2018,
              <https://www.rfc-editor.org/info/rfc8293>.

   [RFC8365]  Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
              Uttaro, J., and W. Henderickx, "A Network Virtualization
              Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
              DOI 10.17487/RFC8365, March 2018,
              <https://www.rfc-editor.org/info/rfc8365>.

Authors' Addresses

   Santosh Pallagatti (editor)
   VMware

   Email: [email protected]


   Sudarsan Paragiri
   Individual Contributor

   Email: [email protected]


   Vengada Prasad Govindan
   Cisco

   Email: [email protected]


   Mallik Mudigonda
   Cisco

   Email: [email protected]


   Greg Mirsky
   ZTE Corp.

   Email: [email protected]