Hi Dinesh,
many thanks for your detailed updates on how some implementations process
the VXLAN header and the inner Ethernet frame. These are very helpful in
arriving at a workable solution for the problem at hand.
You've noted that a path between VTEPs may be monitored in the underlay
network by merely establishing a BFD session. That is true, but by using
BFD with VXLAN encapsulation between the pair of VTEPs we extend the
OAM domain to include, to some extent, the VXLAN forwarding engine. The
Abstract of RFC 5880 defines the goal and the domain in which the BFD
protocol can detect a fault as:
This document describes a protocol intended to detect faults in the
bidirectional path between two forwarding engines, including
interfaces, data link(s), and to the extent possible the forwarding
engines themselves, with potentially very low latency.
Thus, BFD in the underlay will exercise a part of the IP forwarding engine,
while BFD with VXLAN encapsulation, run between the same pair of VTEPs,
extends the OAM domain. At the same time, defining BFD between tenant
systems is outside the scope of this specification. But a VXLAN BFD session
between VTEPs may be useful in monitoring the e2e path between tenants, as
described in the update to -07:
At the same time, a service layer BFD session may be used between the
tenants of VTEPs IP1 and IP2 to provide end-to-end fault management.
In such case, for VTEPs BFD control packets of that session are
indistinguishable from data packets. If end-to-end defect detection
is realized as the set of concatenated OAM domains, e.g., VM1-1 - IP1
-- IP2 - VM2-1, then the BFD session over VXLAN between VTEPs SHOULD
follow the procedures described in Section 6.8.17 [RFC5880].
I've attached the current working version of the draft.
Regards,
Greg
On Fri, Aug 2, 2019 at 5:43 PM Dinesh Dutt <[email protected]> wrote:
> What I mean is "How do you infer that it excludes the case I'm talking
> about?".
>
> Dinesh
>
> On Fri, Aug 2, 2019 at 5:41 PM Dinesh Dutt <[email protected]> wrote:
>
>> The abstract reads this: "
>>
>> This document describes the use of the Bidirectional Forwarding
>> Detection (BFD) protocol in point-to-point Virtual eXtensible Local
>> Area Network (VXLAN) tunnels forming up an overlay network."
>>
>> How do you infer what you said?
>>
>> Dinesh
>>
>>
>> On Fri, Aug 2, 2019 at 5:38 PM Joel M. Halpern <[email protected]>
>> wrote:
>>
>>> I am going by what the draft says its purpose is. If you (Dinesh) want
>>> the draft to fulfill a different purpose, then either ask the chairs to
>>> take this draft back to the WG, or write a separate draft.
>>> As currently written, the behavior Greg proposed meets the needs, and
>>> does so in a way that is consistent with VxLAN.
>>>
>>> Yours,
>>> Joel
>>>
>>> On 8/2/2019 8:30 PM, Dinesh Dutt wrote:
>>> > What is the stated purpose of this BFD session? The VTEP reachability is
>>> > determined by the underlay, I don't need a VXLAN-encaped packet for that.
>>> > Do we agree?
>>> >
>>> > If I want to test the VXLAN encap/decap functionality alone, picking any
>>> > single VNI may be fine. But is this all any network operator wants? Why?
>>> > In what situations has this been a problem? I suspect operators also
>>> > want to verify path continuity over a specific VNI. If you say this is
>>> > not defined by the document, I disagree because the current version
>>> > talks about controlling the number of BFD sessions between the VTEPs
>>> > (see section 3). More importantly, this is a real problem that operators
>>> > like to verify.
>>> >
>>> > Dinesh
>>> >
>>> > On Fri, Aug 2, 2019 at 5:08 PM Joel M. Halpern <[email protected]> wrote:
>>> >
>>> > What is special about the management VNI is precisely that it is NOT a
>>> > tenant VNI. The VxLAN administration does know how it allocates VNI to
>>> > tenants, and which VNI it has allocated. In contrast, it does not know
>>> > which IP addresses or MAC addresses the tenant is using or may plan
>>> > to use.
>>> >
>>> > Yours,
>>> > Joel
>>> >
>>> > On 8/2/2019 6:41 PM, Dinesh Dutt wrote:
>>> > > The assumption of an IP address within any VNI is suspect that
>>> way.
>>> > > What's special about a single VNI, the management VNI? The VTEP
>>> IP
>>> > > address does not belong in reality in any VNI.
>>> > >
>>> > > Dinesh
>>> > >
>>> > > On Fri, Aug 2, 2019 at 3:17 PM Joel M. Halpern <[email protected]> wrote:
>>> > >
>>> > > Your response seems to miss two points:
>>> > >
>>> > > First, the problem you describe is not what the document
>>> says
>>> > it is
>>> > > solving. To the degree it discusses it at all, the document
>>> > says "
>>> > > In
>>> > > most cases, a single BFD session is sufficient for the given
>>> > VTEP to
>>> > > monitor the reachability of a remote VTEP, regardless of the
>>> > number of
>>> > > VNIs in common. "
>>> > >
>>> > > Second, you assume the existence of an IP address for a VTEP
>>> > within a
>>> > > VNI. As with the MAC address, the VTEP does not have an IP
>>> > address
>>> > > within the VNI. Some implementations may have created such
>>> a
>>> > thing,
>>> > > but
>>> > > the general construct, as defined to date, does not support
>>> such.
>>> > >
>>> > > In short, you are requiring a behavior that violates the
>>> > architectural
>>> > > structure of overlay / underlay separation, and common
>>> > usage. And you
>>> > > are doing so to support a use case that the working group
>>> has not
>>> > > indicated in the document as important.
>>> > >
>>> > > Yours,
>>> > > Joel
>>> > >
>>> > > On 8/2/2019 5:01 PM, Dinesh Dutt wrote:
>>> > > > Joel,
>>> > > >
>>> > > > You understood correctly.
>>> > > >
>>> > > > The VNIs may not share fate due to misconfiguration. And
>>> I
>>> > strongly
>>> > > > suspect someone will want to use BFD for that because it's
>>> > about
>>> > > checking
>>> > > > path continuity as stated by the draft. As long as
>>> there's a
>>> > > valid IP
>>> > > > (because it's BFD) owned by the VTEP in that VNI, you can
>>> > use BFD in
>>> > > > that VNI. That's all that you need to dictate. That IP
>>> address
>>> > > has a MAC
>>> > > > address and you can use that on the inner frame. That is
>>> > all normal
>>> > > > VXLAN processing. The outer IP is always that of the
>>> VTEP.
>>> > > >
>>> > > > Dinesh
>>> > > >
>>> > > > On Fri, Aug 2, 2019 at 11:03 AM Joel M. Halpern <[email protected]> wrote:
>>> > > >
>>> > > > If I am reading your various emails correctly Dinesh
>>> > (and I
>>> > > may have
>>> > > > missed something) you are trying to use the MAC
>>> address
>>> > > because you
>>> > > > want
>>> > > > to be able to send these BFD packets over arbitrary
>>> VNI to
>>> > > monitor the
>>> > > > VNI. That is not a requirement identified in the
>>> > document.
>>> > > It is not
>>> > > > even a problem I understand, since all the VNI
>>> between an
>>> > > ingress and
>>> > > > egress VTEP share fate.
>>> > > >
>>> > > > Yours,
>>> > > > Joel
>>> > > >
>>> > > > On 8/2/2019 1:44 PM, Dinesh Dutt wrote:
>>> > > > > Thanks for verifying this. On Linux and hardware
>>> > routers
>>> > > that I'm
>>> > > > aware
>>> > > > > of (Cisco circa 2012 and Cumulus), the physical
>>> MAC
>>> > address is
>>> > > > reused
>>> > > > > across the VNIs on the VTEP. Did you check on a
>>> non-VMW
>>> > > device?
>>> > > > This is
>>> > > > > more for my own curiosity.
>>> > > > >
>>> > > > > To address the general case, can we not define a
>>> > > well-known (or
>>> > > > reserve
>>> > > > > one) unicast MAC address for use with VTEP? If
>>> the MAC
>>> > > address is
>>> > > > > configurable in BFD command, this can be moot.
>>> > > > >
>>> > > > > Dinesh
>>> > > > >
>>> > > > > On Fri, Aug 2, 2019 at 10:27 AM Santosh P K <[email protected]> wrote:
>>> > > > >
>>> > > > > I have cross checked point raised about MAC
>>> address
>>> > > usage. It is
>>> > > > > possible that tenant could be using physical
>>> MAC
>>> > > address and
>>> > > > when a
>>> > > > > packet comes with valid VNI with a MAC address
>>> > that is
>>> > > being
>>> > > > used by
>>> > > > > tenant then packet will be sent to that
>>> tenant.
>>> > This rules
>>> > > > out the
>>> > > > > fact that we could use physical MAC address as
>>> > inner
>>> > > MAC to
>>> > > > ensure
>>> > > > > packets get terminated at VTEP itself.
>>> > > > >
>>> > > > > Thanks
>>> > > > > Santosh P K
>>> > > > >
>>> > > > > On Wed, Jul 31, 2019 at 11:00 AM Santosh P K <[email protected]> wrote:
>>> > > > >
>>> > > > > Joel,
>>> > > > > Thanks for your inputs. I checked
>>> > > implementation within
>>> > > > > Vmware. Perhaps I should have been more
>>> clear
>>> > > about MAC
>>> > > > address
>>> > > > > space while checking internally. I will
>>> cross
>>> > > check again for
>>> > > > > the same and get back on this list.
>>> > > > >
>>> > > > > Thanks
>>> > > > > Santosh P K
>>> > > > >
>>> > > > > On Wed, Jul 31, 2019 at 10:54 AM Joel M. Halpern <[email protected]> wrote:
>>> > > > >
>>> > > > > Sorry to ask a stupid question. Whose
>>> > > implementation?
>>> > > > >
>>> > > > > The reason I ask is that as far as I
>>> > can tell,
>>> > > since the
>>> > > > > tenant does not
>>> > > > > have any control access to the VTEP,
>>> > there is no
>>> > > > reason for
>>> > > > > the VTEP to
>>> > > > > have a MAC address in the tenant
>>> > space. Yes, the
>>> > > > device has
>>> > > > > a physical
>>> > > > > MAC address. But the tenant could
>>> well be
>>> > > using that MAC
>>> > > > > address. Yes,
>>> > > > > they would be violating the Ethernet
>>> spec.
>>> > > But the whole
>>> > > > > point of
>>> > > > > segregation is not to care about such
>>> > issues.
>>> > > > >
>>> > > > > On the other hand, if you tell me that
>>> > the VMWare
>>> > > > > implementation has an
>>> > > > > Ethernet address that is part of the
>>> tenant
>>> > > space, well,
>>> > > > > they made up
>>> > > > > this particular game.
>>> > > > >
>>> > > > > Yours,
>>> > > > > Joel
>>> > > > >
>>> > > > > On 7/31/2019 1:44 PM, Santosh P K
>>> wrote:
>>> > > > > > I have checked with implementation
>>> > in data
>>> > > path.
>>> > > > When we
>>> > > > > receive a
>>> > > > > > packet with valid VNI then lookup
>>> > for MAC will
>>> > > > happen and
>>> > > > > it is VTEP own
>>> > > > > > MAC then it will be trapped to
>>> control
>>> > > plane for
>>> > > > > processing. I think we
>>> > > > > > can have following options
>>> > > > > > 1. Optional management VNI
>>> > > > > > 2. Mandatory inner MAC set to VTEP
>>> mac
>>> > > > > > 3. Inner IP TTL set to 1 to avoid
>>> > > forwarding of packet
>>> > > > > via inner IP
>>> > > > > > address.
>>> > > > > >
>>> > > > > >
>>> > > > > > Thoughts?
>>> > > > > >
>>> > > > > > Thanks
>>> > > > > > Santosh P K
>>> > > > > >
>>> > > > > > On Wed, Jul 31, 2019 at 9:20 AM Greg Mirsky <[email protected]> wrote:
>>> > > > > >
>>> > > > > > Hi Dinesh,
>>> > > > > > thank you for your
>>> consideration
>>> > of the
>>> > > > proposal and
>>> > > > > questions. What
>>> > > > > > would you see as the scope of
>>> > testing the
>>> > > > > connectivity for the
>>> > > > > > specific VNI? If it is
>>> > > tenant-to-tenant, then
>>> > > > VTEPs
>>> > > > > will treat these
>>> > > > > > packets as regular user
>>> frames. More
>>> > > likely, these
>>> > > > > could be Layer 2
>>> > > > > > OAM, e.g. CCM frames. The
>>> reason
>>> > to use
>>> > > 127/8 for
>>> > > > > IPv4, and
>>> > > > > > 0:0:0:0:0:FFFF:7F00:0/104 for
>>> > IPv6 is
>>> > > to safeguard
>>> > > > > from leaking
>>> > > > > > Ethernet frames with BFD
>>> Control
>>> > packet
>>> > > to a
>>> > > > tenant.
>>> > > > > > You've suggested using a MAC
>>> > address to
>>> > > trap the
>>> > > > > control packet at
>>> > > > > > VTEP. What that address could
>>> be? We
>>> > > had proposed
>>> > > > > using the
>>> > > > > > dedicated MAC and VTEP's MAC
>>> and
>>> > both
>>> > > raised
>>> > > > concerns
>>> > > > > among VXLAN
>>> > > > > > experts. The idea of using
>>> > Management
>>> > > VNI may
>>> > > > be more
>>> > > > > acceptable
>>> > > > > > based on its similarity to the
>>> > practice
>>> > > of using
>>> > > > > Management VLAN.
>>> > > > > >
>>> > > > > > Regards,
>>> > > > > > Greg
>>> > > > > >
>>> > > > > > On Wed, Jul 31, 2019 at 12:03 PM Dinesh Dutt <[email protected]> wrote:
>>> > > > > >
>>> > > > > > Hi Greg,
>>> > > > > >
>>> > > > > > As long as the inner MAC
>>> > address is
>>> > > such
>>> > > > that the
>>> > > > > packet is
>>> > > > > > trapped to the CPU, it
>>> should be
>>> > > fine for
>>> > > > use as
>>> > > > > an inner MAC is
>>> > > > > > it not? Stating that is
>>> > better than
>>> > > trying to
>>> > > > > force a management
>>> > > > > > VNI. What if someone wants
>>> > to test
>>> > > > connectivity
>>> > > > > on a specific
>>> > > > > > VNI? I would not pick a
>>> > loopback IP
>>> > > > address for
>>> > > > > this since that
>>> > > > > > address range is host/node
>>> local
>>> > > only. Is
>>> > > > there a
>>> > > > > reason you're
>>> > > > > > not using the VTEP IP as
>>> the
>>> > inner IP
>>> > > > address ?
>>> > > > > >
>>> > > > > > Dinesh
>>> > > > > >
>>> > > > > > On Wed, Jul 31, 2019 at 5:48 AM Greg Mirsky <[email protected]> wrote:
>>> > > > > >
>>> > > > > > Dear All,
>>> > > > > > thank you for your
>>> comments,
>>> > > > suggestions on
>>> > > > > this issue,
>>> > > > > > probably the most
>>> > challenging
>>> > > for this
>>> > > > > specification. In the
>>> > > > > > course of our
>>> discussions,
>>> > > we've agreed to
>>> > > > > abandon the
>>> > > > > > request to allocate the
>>> > > dedicated MAC
>>> > > > address
>>> > > > > to be used as
>>> > > > > > the destination MAC
>>> > address in
>>> > > the inner
>>> > > > > Ethernet frame.
>>> > > > > > Also, earlier using VNI
>>> > 0 was
>>> > > changed from
>>> > > > > mandatory to one
>>> > > > > > of the options an
>>> > > implementation may
>>> > > > offer to
>>> > > > > an operator.
>>> > > > > > The most recent
>>> > discussion was
>>> > > whether
>>> > > > VTEP's
>>> > > > > MAC address
>>> > > > > > might be used as the
>>> > > destination MAC
>>> > > > address
>>> > > > > in the inner
>>> > > > > > Ethernet frame. As I
>>> > recall it, the
>>> > > > comments
>>> > > > > from VXLAN
>>> > > > > > experts equally split
>>> > with one
>>> > > for it
>>> > > > and one
>>> > > > > against. Hence
>>> > > > > > I would like to propose
>>> > a new
>>> > > text to
>>> > > > resolve
>>> > > > > the issue. The
>>> > > > > > idea is to let an
>>> > operator select
>>> > > > Management
>>> > > > > VNI and use
>>> > > > > > that VNI in VXLAN
>>> > encapsulation
>>> > > of BFD
>>> > > > > Control packets:
>>> > > > > > NEW TEXT:
>>> > > > > >
>>> > > > > > An operator MUST
>>> > select a VNI
>>> > > > number to
>>> > > > > be used as
>>> > > > > > Management VNI.
>>> VXLAN
>>> > > packet for
>>> > > > > Management VNI MUST NOT
>>> > > > > > be sent to a
>>> tenant. VNI
>>> > > number 1 is
>>> > > > > RECOMMENDED as the
>>> > > > > > default for
>>> > Management VNI.
>>> > > > > >
>>> > > > > > With that new text,
>>> what
>>> > can be the
>>> > > > value of
>>> > > > > the destination
>>> > > > > > MAC in the inner
>>> Ethernet? I
>>> > > tend to
>>> > > > believe
>>> > > > > that it can be
>>> > > > > > anything and ignored
>>> by the
>>> > > receiver VTEP.
>>> > > > > Also, if the
>>> > > > > > trapping is based on
>>> VNI
>>> > > number, the
>>> > > > > destination IP address
>>> > > > > > of the inner IP packet
>>> > can be from
>>> > > the range
>>> > > > > 127/8 for IPv4,
>>> > > > > > and for IPv6 from the
>>> range
>>> > > > > 0:0:0:0:0:FFFF:7F00:0/104. And
>>> > > > > > lastly, the TTL to be
>>> > set to 1 (no
>>> > > > change here).
>>> > > > > >
>>> > > > > > Much appreciate your
>>> > comments,
>>> > > > questions, and
>>> > > > > suggestions.
>>> > > > > >
>>> > > > > > Best regards,
>>> > > > > > Greg
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
BFD S. Pallagatti, Ed.
Internet-Draft VMware
Intended status: Standards Track S. Paragiri
Expires: January 8, 2020 Individual Contributor
V. Govindan
M. Mudigonda
Cisco
G. Mirsky
ZTE Corp.
July 7, 2019
BFD for VXLAN
draft-ietf-bfd-vxlan-08
Abstract
This document describes the use of the Bidirectional Forwarding
Detection (BFD) protocol in point-to-point Virtual eXtensible Local
Area Network (VXLAN) tunnels forming up an overlay network.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 8, 2020.
Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
Pallagatti, et al. Expires January 8, 2020 [Page 1]
Internet-Draft BFD for VXLAN July 2019
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Conventions used in this document . . . . . . . . . . . . . . 3
2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3
2.2. Requirements Language . . . . . . . . . . . . . . . . . . 3
3. Deployment . . . . . . . . . . . . . . . . . . . . . . . . . 4
4. BFD Packet Transmission over VXLAN Tunnel . . . . . . . . . . 5
5. Reception of BFD Packet from VXLAN Tunnel . . . . . . . . . . 7
5.1. Demultiplexing of the BFD Packet . . . . . . . . . . . . 7
6. Use of the Specific VNI . . . . . . . . . . . . . . . . . . . 8
7. Echo BFD . . . . . . . . . . . . . . . . . . . . . . . . . . 8
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
9. Security Considerations . . . . . . . . . . . . . . . . . . . 8
10. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 8
11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 9
12.1. Normative References . . . . . . . . . . . . . . . . . . 9
12.2. Informational References . . . . . . . . . . . . . . . . 9
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10
1. Introduction
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides an
encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that
of the network.
One use of VXLAN is in data centers interconnecting virtual machines
(VMs) of a tenant. VXLAN addresses requirements of the Layer 2 and
Layer 3 data center network infrastructure in the presence of VMs in
a multi-tenant environment by providing a Layer 2 overlay scheme on a
Layer 3 network [RFC7348]. Another use is as an encapsulation for
Ethernet VPN [RFC8365].
This document is written assuming the use of VXLAN for virtualized
hosts and refers to VMs and VXLAN Tunnel End Points (VTEPs) in
hypervisors. However, the concepts are equally applicable to non-
virtualized hosts attached to VTEPs in switches.
In the absence of a router in the overlay, a VM can communicate with
another VM only if they are on the same VXLAN segment. VMs are
unaware of VXLAN tunnels as a VXLAN tunnel is terminated on a VTEP.
VTEPs are responsible for encapsulating and decapsulating frames
exchanged among VMs.
The ability to monitor path continuity, i.e., to perform a proactive
continuity check (CC) for point-to-point (p2p) VXLAN tunnels, is
important. The asynchronous mode of BFD, as defined in [RFC5880], is
used to monitor a p2p VXLAN tunnel.
In the case where a Multicast Service Node (MSN) (as described in
Section 3.3 of [RFC8293]) resides behind a Network Virtualization
Endpoint (NVE), the mechanisms described in this document apply and
can, therefore, be used to test the connectivity from the source NVE
to the MSN.
This document describes the use of the Bidirectional Forwarding Detection
(BFD) protocol to enable monitoring continuity of the path between
VXLAN VTEPs, performing as Network Virtualization Endpoints, and/or
availability of a replicator multicast service node.
2. Conventions used in this document
2.1. Terminology
BFD Bidirectional Forwarding Detection
CC Continuity Check
p2p Point-to-point
MSN Multicast Service Node
NVE Network Virtualization Endpoint
VFI Virtual Forwarding Instance
VM Virtual Machine
VNI VXLAN Network Identifier (or VXLAN Segment ID)
VTEP VXLAN Tunnel End Point
VXLAN Virtual eXtensible Local Area Network
2.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Deployment
Figure 1 illustrates the scenario with two servers, each of them
hosting two VMs. The servers host VTEPs that terminate two VXLAN
tunnels with VXLAN Network Identifier (VNI) number 100 and 200
respectively. Separate BFD sessions can be established between the
VTEPs (IP1 and IP2) for monitoring each of the VXLAN tunnels (VNI 100
and 200). An implementation that supports this specification MUST be
able to control the number of BFD sessions that can be created
between the same pair of VTEPs. BFD packets intended for a
Hypervisor VTEP MUST NOT be forwarded to a VM as a VM may drop BFD
packets leading to a false negative. This method is applicable
whether the VTEP is a virtual or physical device.
+------------+-------------+
|         Server 1         |
| +----+----+  +----+----+ |
| |VM1-1    |  |VM1-2    | |
| |VNI 100  |  |VNI 200  | |
| |         |  |         | |
| +---------+  +---------+ |
| Hypervisor VTEP (IP1)    |
+--------------------------+
             |
             |   +-------------+
             |   |   Layer 3   |
             +---|   Network   |
                 +-------------+
                        |
                        +-----------+
                                    |
                       +------------+-------------+
                       | Hypervisor VTEP (IP2)    |
                       | +----+----+  +----+----+ |
                       | |VM2-1    |  |VM2-2    | |
                       | |VNI 100  |  |VNI 200  | |
                       | |         |  |         | |
                       | +---------+  +---------+ |
                       |         Server 2         |
                       +--------------------------+
Figure 1: Reference VXLAN Domain
At the same time, a service layer BFD session may be used between the
tenants of VTEPs IP1 and IP2 to provide end-to-end fault management.
In such case, for VTEPs BFD control packets of that session are
indistinguishable from data packets. If end-to-end defect detection
is realized as the set of concatenated OAM domains, e.g., VM1-1 - IP1
-- IP2 - VM2-1, then the BFD session over VXLAN between VTEPs SHOULD
follow the procedures described in Section 6.8.17 [RFC5880].
4. BFD Packet Transmission over VXLAN Tunnel
BFD packets MUST be encapsulated and sent to a remote VTEP as
explained in this section. Implementations SHOULD ensure that the
BFD packets follow the same lookup path as VXLAN data packets within
the sender system.
BFD packets are encapsulated in VXLAN as described below. The VXLAN
packet format is defined in Section 5 of [RFC7348]. The Outer IP/UDP
and VXLAN headers MUST be encoded by the sender as defined in
[RFC7348].
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                     Outer Ethernet Header                     ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                       Outer IPvX Header                       ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                       Outer UDP Header                        ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                         VXLAN Header                          ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                     Inner Ethernet Header                     ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                       Inner IPvX Header                       ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                       Inner UDP Header                        ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                      BFD Control Message                      ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              FCS                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: VXLAN Encapsulation of BFD Control Message
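As an illustration of the VXLAN Header layer in Figure 2, the following Python sketch packs and parses the 8-byte header layout given in [RFC7348] (flags, 24-bit VNI, reserved bits). The function names are hypothetical and are not part of this specification; it is a minimal sketch rather than a reference implementation.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned VXLAN destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def pack_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for a 24-bit VNI (hypothetical helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) followed by 24 reserved bits.
    # Word 1: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the I flag."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag is not set; VNI is not valid")
    return word1 >> 8
```

The header would be placed between the outer UDP header (destination port 4789) and the inner Ethernet frame.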
The BFD packet MUST be carried inside the inner MAC frame of the
VXLAN packet. The inner MAC frame carrying the BFD payload has the
following format:
Ethernet Header:
Destination MAC: This MUST be the MAC address of the
destination VTEP. The MAC address MAY be configured or it MAY
be learned via a control plane protocol. The details of how
the MAC address of the destination VTEP is obtained are outside
the scope of this document.
Source MAC: MAC address of the originating VTEP
IP header:
Source IP: IP address of the originating VTEP.
Destination IP: IP address of the terminating VTEP.
TTL: MUST be set to 1 to ensure that the BFD packet is not
routed within the L3 underlay network.
The fields of the UDP header and the BFD control packet are
encoded as specified in [RFC5881].
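The inner frame described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative encoding: IPv4 only, no VLAN tag, the inner UDP checksum left at zero (permitted for IPv4), and one-second timer defaults. All function names are hypothetical. The BFD Control packet layout follows [RFC5880], and the destination port 3784 follows [RFC5881].

```python
import socket
import struct

BFD_SINGLE_HOP_PORT = 3784  # UDP destination port for BFD Control (RFC 5881)
BFD_STATE_DOWN = 1          # RFC 5880 state values: 0 AdminDown, 1 Down, 2 Init, 3 Up


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement checksum over an IPv4 header of even length."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def bfd_control_packet(my_disc: int, your_disc: int, state: int = BFD_STATE_DOWN,
                       detect_mult: int = 3, min_tx_us: int = 1_000_000,
                       min_rx_us: int = 1_000_000) -> bytes:
    """24-byte BFD Control packet without authentication (timer values are assumptions)."""
    vers_diag = 1 << 5        # version 1, diagnostic 0
    state_flags = state << 6  # state in the top two bits, all flags clear
    return struct.pack("!BBBBIIIII", vers_diag, state_flags, detect_mult, 24,
                       my_disc, your_disc, min_tx_us, min_rx_us, 0)


def build_inner_frame(src_mac: bytes, dst_mac: bytes, src_ip: str, dst_ip: str,
                      src_port: int, bfd: bytes) -> bytes:
    """Inner Ethernet + IPv4 + UDP + BFD, with the inner TTL set to 1."""
    udp = struct.pack("!HHHH", src_port, BFD_SINGLE_HOP_PORT, 8 + len(bfd), 0) + bfd
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(udp), 0, 0,
                     1,   # TTL: MUST be 1 per this section
                     17,  # protocol: UDP
                     0,   # checksum placeholder, filled in below
                     socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    ip = ip[:10] + struct.pack("!H", ipv4_header_checksum(ip)) + ip[12:]
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)  # EtherType IPv4
    return eth + ip + udp
```

A sender would prepend the VXLAN, outer UDP, outer IP, and outer Ethernet headers to the returned bytes; the MAC and IP values passed in are those of the originating and terminating VTEPs.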
5. Reception of BFD Packet from VXLAN Tunnel
Once a packet is received, the VTEP MUST validate the packet. If the
Destination MAC of the inner Ethernet frame matches the MAC address
of the VTEP, the packet MUST be processed further. If the Destination
MAC of the inner Ethernet frame doesn't match any of the VTEP's MAC
addresses, then the processing of the received VXLAN packet MUST
follow the procedures described in Section 4.1 of [RFC7348].
The UDP destination port and the TTL of the inner IP packet MUST be
validated to determine if the received packet can be processed by
BFD. A BFD packet with the inner MAC set to the VTEP's MAC address MUST
NOT be forwarded to VMs.
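The receive-side checks can be sketched as a small classifier. The sketch below is illustrative only: the `InnerPacket` type and the result labels are hypothetical, the expected TTL of 1 is taken from the transmit rule in Section 4, and the handling of non-BFD packets addressed to the VTEP's MAC is an assumption, since the text above does not specify it.

```python
from dataclasses import dataclass

BFD_SINGLE_HOP_PORT = 3784  # RFC 5881
EXPECTED_INNER_TTL = 1      # assumption: the value Section 4 requires on transmit


@dataclass(frozen=True)
class InnerPacket:
    """Hypothetical parsed view of the decapsulated inner packet."""
    dst_mac: bytes
    ttl: int
    udp_dst_port: int


def classify(pkt: InnerPacket, vtep_macs: frozenset) -> str:
    """Decide how a decapsulated packet is handled by the receiving VTEP."""
    if pkt.dst_mac not in vtep_macs:
        # Not addressed to the VTEP: regular VXLAN processing
        # (Section 4.1 of RFC 7348), possibly delivery to a VM.
        return "vxlan-forward"
    if pkt.udp_dst_port == BFD_SINGLE_HOP_PORT and pkt.ttl == EXPECTED_INNER_TTL:
        return "bfd"  # hand to the local BFD process; never forwarded to a VM
    # Addressed to the VTEP but not valid BFD; dropping is an assumption here.
    return "drop"
```

Whatever the outcome of the second check, a packet whose inner destination MAC is the VTEP's own address never reaches the "vxlan-forward" branch in this sketch, which mirrors the MUST NOT above.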
5.1. Demultiplexing of the BFD Packet
Demultiplexing of IP BFD packets is defined in Section 3 of
[RFC5881]. Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 differs from the one in
[RFC5880]. For such packets, the BFD session MUST be identified
using the following three-tuple of inner header fields: the source
IP address, the destination IP address, and the source UDP port
number carried in the payload of the VXLAN-encapsulated packet. If
a BFD packet is received with a non-zero Your Discriminator, then the
BFD session MUST be demultiplexed using only Your Discriminator as
the key.
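A minimal Python sketch of this lookup is shown below. The class BfdSessionTable, its methods, and the way the three-tuple is registered ahead of time are assumptions made for illustration; the document only specifies which key is used in each case.

```python
from typing import Dict, Optional, Tuple

# (inner source IP, inner destination IP, inner UDP source port),
# as seen in a received packet.
ThreeTuple = Tuple[str, str, int]


class BfdSessionTable:
    """Hypothetical session table keyed as described in Section 5.1."""

    def __init__(self) -> None:
        self._by_discriminator: Dict[int, str] = {}
        self._by_three_tuple: Dict[ThreeTuple, str] = {}

    def add(self, name: str, local_discriminator: int,
            src_ip: str, dst_ip: str, src_port: int) -> None:
        # The peer echoes our local discriminator as Your Discriminator.
        self._by_discriminator[local_discriminator] = name
        self._by_three_tuple[(src_ip, dst_ip, src_port)] = name

    def lookup(self, your_discriminator: int, src_ip: str,
               dst_ip: str, src_port: int) -> Optional[str]:
        if your_discriminator != 0:
            # Non-zero Your Discriminator is the only key used.
            return self._by_discriminator.get(your_discriminator)
        # Your Discriminator == 0: fall back to the inner three-tuple.
        return self._by_three_tuple.get((src_ip, dst_ip, src_port))
```

With a session added as add("s1", 101, "192.0.2.2", "192.0.2.1", 49152), a packet carrying Your Discriminator 101 resolves to "s1" whatever its addresses are, while a packet with Your Discriminator 0 resolves only if its inner three-tuple matches.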
6. Use of the Specific VNI
In most cases, a single BFD session is sufficient for a given VTEP
to monitor the reachability of a remote VTEP, regardless of the
number of VNIs in common. When a single BFD session is used to
monitor the reachability of the remote VTEP, an implementation SHOULD
choose any of the VNIs but MAY choose VNI = 0.
7. Echo BFD
Support for echo BFD is outside the scope of this document.
8. IANA Considerations
This document makes no requests of IANA. This section may be
deleted before publication.
9. Security Considerations
The document requires setting the inner IP TTL to 1, which could be
used as a DDoS attack vector. Thus the implementation MUST have
throttling in place to control the rate of BFD control packets sent
to the control plane. On the other hand, overly aggressive throttling
of BFD control packets may make it impossible to form and maintain
BFD sessions at scale. Hence, throttling of BFD control packets
SHOULD be adjusted to permit BFD to work according to its
procedures.
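One possible way to realize such throttling is a token bucket, sketched below in Python. The document does not mandate any particular algorithm; the class name, the rate and burst parameters, and the injectable clock are all assumptions made for this illustration.

```python
import time


class TokenBucket:
    """Hypothetical rate limiter for BFD packets punted to the control plane."""

    def __init__(self, rate_pps: float, burst: float, clock=time.monotonic):
        self.rate = float(rate_pps)    # tokens added per second
        self.capacity = float(burst)   # maximum number of stored tokens
        self.tokens = float(burst)     # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Return True if one more packet may be passed to the control plane."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The rate would need to be sized against the number of sessions and their negotiated transmit intervals, so that legitimate control packets are not dropped often enough to exceed the detection time.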
If the implementation supports establishing multiple BFD sessions
between the same pair of VTEPs, there SHOULD be a mechanism to
control the maximum number of such sessions that can be active at the
same time.
Other than setting the inner IP TTL to 1 and limiting the number of
BFD sessions between the same pair of VTEPs, this specification does
not raise any additional security issues beyond those of the
specifications referred to in the list of normative references.
10. Contributors
Reshad Rahman
[email protected]
Cisco
11. Acknowledgments
The authors would like to thank Jeff Haas of Juniper Networks for his
reviews and feedback on this material.
The authors would also like to thank Nobo Akiya, Marc Binderberger,
Shahram Davari, Donald E. Eastlake 3rd, and Anoop Ghanwani for their
extensive reviews and most detailed and helpful comments.
12. References
12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010,
<https://www.rfc-editor.org/info/rfc5880>.
[RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
DOI 10.17487/RFC5881, June 2010,
<https://www.rfc-editor.org/info/rfc5881>.
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
eXtensible Local Area Network (VXLAN): A Framework for
Overlaying Virtualized Layer 2 Networks over Layer 3
Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014,
<https://www.rfc-editor.org/info/rfc7348>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
12.2. Informative References
[RFC8293] Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R.
Krishnan, "A Framework for Multicast in Network
Virtualization over Layer 3", RFC 8293,
DOI 10.17487/RFC8293, January 2018,
<https://www.rfc-editor.org/info/rfc8293>.
[RFC8365] Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
Uttaro, J., and W. Henderickx, "A Network Virtualization
Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
DOI 10.17487/RFC8365, March 2018,
<https://www.rfc-editor.org/info/rfc8365>.
Authors' Addresses
Santosh Pallagatti (editor)
VMware
Email: [email protected]
Sudarsan Paragiri
Individual Contributor
Email: [email protected]
Vengada Prasad Govindan
Cisco
Email: [email protected]
Mallik Mudigonda
Cisco
Email: [email protected]
Greg Mirsky
ZTE Corp.
Email: [email protected]