Re: [nvo3] Support for multi-homed NVEs

Somesh Gupta Fri, 07 Sep 2012 13:18:33 -0700

I know there are a whole bunch of MPLS gurus on this
mailing list. Can we decouple the discussion from
MPLS to make it easier for those barely MPLS-literate?


> -----Original Message-----
> From: Lucy yong [mailto:[email protected]]
> Sent: Friday, September 07, 2012 1:15 PM
> To: Somesh Gupta; Yakov Rekhter; Ivan Pepelnjak
> Cc: 'Balus, Florin Stelian (Florin)'; [email protected]
> Subject: RE: [nvo3] Support for multi-homed NVEs
>
> If this is the requirement, it looks like the server is equivalent
> to a CE site in MPLS/VPN, and the NVE on the server acts like a VRF on
> the CE (VRF-lite). The ToR is the PE. In this case, you do not have to
> run a routing protocol on the server, and it looks like multi-homing
> in E-VPN.
>
> In this configuration, the server is decoupled from the DC physical
> network. The server only connects to a VPN configured on the DC
> physical network.
>
> Lucy
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of
> Somesh Gupta
> Sent: Friday, September 07, 2012 1:50 PM
> To: Yakov Rekhter; Ivan Pepelnjak
> Cc: 'Balus, Florin Stelian (Florin)'; [email protected]
> Subject: Re: [nvo3] Support for multi-homed NVEs
>
> A typical multi-homed server does not run any routing protocol.
> If the problem can be solved without inviting routing protocols
> into the hypervisor-based NVE, then that is the approach that
> will get adopted.
>
> > -----Original Message-----
> > From: Yakov Rekhter [mailto:[email protected]]
> > Sent: Friday, September 07, 2012 11:00 AM
> > To: Ivan Pepelnjak
> > Cc: 'Yakov Rekhter'; 'Balus, Florin Stelian (Florin)'; [email protected];
> > Somesh Gupta
> > Subject: Re: [nvo3] Support for multi-homed NVEs
> >
> > Ivan,
> >
> > > Yakov,
> > >
> > > The scenario that worries me is a hypervisor-based NVE with more
> > > than one transport (underlay) IP address, with each IP address tied
> > > to a NIC (or multiple bonded NICs), and potentially being in a
> > > different IP subnet (because it's connected to a different ToR
> > > switch). I'm further assuming that the hypervisor is not running a
> > > routing protocol and thus behaves like an IP host locally multihomed
> > > to multiple logical networks.
> > >
> > > Assuming we use something like MPLS/VPN for NVO3, the BGP sessions
> > > from NVE to its BGP neighbors (example: a set of route reflectors)
> > > would run from one of the underlay IP addresses. If another underlay
> > > IP address becomes unreachable, we cannot use BGP session drop to
> > > detect that.
> >
> > In a scenario where an NVE has two transport addresses, the BGP
> > session from the NVE to its BGP neighbors will advertise two routes
> > for each VM (each with its own RD, and its own Next Hop).  This is
> > pretty much the same as how multi-homing is done today for 2547
> > VPNs.
> >
> > If an underlay IP address used as a Next Hop is no longer available,
> > the NVE will withdraw the routes that carry this Next Hop. As a
> > side comment, E-VPN has special machinery to speed up this process.
> >
> > Yakov.
> >
> > > Furthermore, we won't have /32 routes to BGP next hops any more
> > > (that would not scale to the several million endpoints from the
> > > NVO3 charter). The first L3 switch to which the server is connected
> > > (a ToR switch or a spine switch) will originate an IP prefix ==>
> > > we cannot use NH tracking.
> > >
> > > Finally, the server might not experience a NIC link loss if it loses
> > > its connection to the first-hop IP gateway. Example: the blade
> > > enclosure switch might be an L2-only switch ==> we cannot expect the
> > > hypervisor-based NVE to revoke prefixes or change the BGP next hop if
> > > its transport address becomes unreachable.
> > >
> > > Using BFD from the NVE to detect first-hop gateway reachability and
> > > using that information to set BGP next hops in BGP prefix origination
> > > process (as proposed by Diego Garcia Del Rio) sounds like the simplest
> > > option.
> > >
> > > Does this make sense?
> > > Ivan
> > >
> > > > -----Original Message-----
> > > > From: Yakov Rekhter [mailto:[email protected]]
> > > > Sent: Friday, September 07, 2012 12:50 AM
> > > > To: Ivan Pepelnjak
> > > > Cc: 'Balus, Florin Stelian (Florin)'; [email protected]; 'Somesh Gupta';
> > > > [email protected]
> > > > Subject: Re: [nvo3] Support for multi-homed NVEs
> > > >
> > > > Ivan,
> > > >
> > > > > Florin,
> > > > >
> > > > > All the VPNs you've mentioned share the data-plane principles with
> > > > > LISP (that's why I mentioned them in the same sentence): an IP
> > > > > packet is encapsulated into an envelope with a known destination
> > > > > endpoint (PE-router, ETR), and sent on its way. The big question
> > > > > is "is the destination endpoint reachable?"
> > > > >
> > > > > In most MPLS-based implementations we assume the endpoint is
> > > > > reachable if we're getting VPN routing/signalling updates from it
> > > > > and if it's reachable through the IP routing table. Some
> > > > > implementations might be more cautious and use GRE encapsulation
> > > > > if there's no LSP to the endpoint, others might use MPLS OAM.
> > > > > Usually these techniques are glossed over and left as
> > > > > implementation details.
> > > >
> > > > In 2547 VPNs (or E-VPNs) it is the PE that originates routes to the
> > > > VPN sites connected to that PE. So, if the PE goes down, the routes
> > > > get withdrawn. This is "vanilla" BGP, and as such has to be supported
> > > > by any non-broken BGP implementation. In addition, there are several
> > > > enhancements to speed up connectivity restoration after egress PE
> > > > failure, such as Next-Hop tracking, egress PE protection, etc.,
> > > > that have been implemented by vendors.
> > > >
> > > > > On the other hand, LISP defines several mechanisms that can be
> > > > > used to check the ETR liveliness.
> > > >
> > > > Yes, except that all these LISP mechanisms are broken in one way or
> > > > another.
> > > >
> > > > > Coming back to nvo3, we'll have serious problems with NVE in a
> > > > > hypervisor using more than one underlay IP address, more so if its
> > > > > control-plane session uses only one of them. We'll never know
> > > > > whether the other IP addresses are reachable (the problem becomes
> > > > > worse if you have a DC transport infrastructure that's a mixture
> > > > > of L2 and L3).
> > > >
> > > > > In a situation where an NVE has more than one IP address used by
> > > > > nvo3, we need (in my opinion) something that checks the liveliness
> > > > > of the remote NVE IP address ... and it's not an implementation
> > > > > detail, it's a mandatory requirement.
> > > >
> > > > "check the liveness" is a bit underspecified, as it does not say what
> > > > the upper bound on the detection time should be.
> > > > While discussing the upper bound, we should probably keep in mind
> > > > that the goal here is to minimize connectivity disruption after a
> > > > failure. Handling connectivity disruption could be done via either
> > > > global or local repair techniques (both of these have been employed
> > > > in 2547 VPNs to deal with PE failures).
> > > >
> > > > Yakov.
> > > > >
> > > > > Hope this makes more sense
> > > > > Ivan
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Balus, Florin Stelian (Florin)
> > > > > > [mailto:florin.balus@alcatel-lucent.com]
> > > > > > Sent: Tuesday, September 04, 2012 11:46 PM
> > > > > > To: Ivan Pepelnjak
> > > > > > Cc: Somesh Gupta; [email protected]
> > > > > > Subject: RE: [nvo3] Support for multi-homed NVEs
> > > > > >
> > > > > > Ivan,
> > > > > > See in-line...
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Ivan Pepelnjak [mailto:[email protected]]
> > > > > > > Sent: Tuesday, September 04, 2012 12:03 PM
> > > > > > > To: Balus, Florin Stelian (Florin)
> > > > > > > Cc: Somesh Gupta; [email protected]
> > > > > > > Subject: Re: [nvo3] Support for multi-homed NVEs
> > > > > > >
> > > > > > > A) Not per prefix (at least not without a serious amount of
> > > > > > > end-to-end BGP multipathing+BGP AddPath or multiple RDs)
> > > > > > >
> > > > > > > B) There's no liveliness check in *VPN (apart from LISP)
> > > > > > [FB>] How are the VPN specifications (IP VPN, VPLS, VPWS or
> > > > > > incoming EVPN) connected to LISP? Also, can you be more specific
> > > > > > on the absence of a liveliness check in *VPN? Are you talking
> > > > > > about a certain deployment/implementation environment?
> > > > > > Thanks,
> > > > > > Florin
> > > > > >
> > > > > > >
> > > > > > > In MPLS/VPN we rely on the host route to PE loopback and/or
> > > > > > > LSP to the /32 prefix to indicate next hop validity. That
> > > > > > > won't work for hypervisor-based NVEs.
> > > > > > >
> > > > > > > On 9/4/12 8:00 PM, Balus, Florin Stelian (Florin) wrote:
> > > > > > > > AFAIK the text in NVO3 framework and requirements drafts
> > > > > > > > allows multiple IPs per NVE. We have multiple IPs per PE in
> > > > > > > > current VPN implementations. This is an implementation
> > > > > > > > matter though in my opinion.
> > > > > > > >
> > > > > > > >> -----Original Message-----
> > > > > > > >> From: Somesh Gupta [mailto:[email protected]]
> > > > > > > >> Sent: Tuesday, September 04, 2012 10:51 AM
> > > > > > > >> To: Ivan Pepelnjak; Balus, Florin Stelian (Florin)
> > > > > > > >> Cc: [email protected]
> > > > > > > >> Subject: RE: [nvo3] Support for multi-homed NVEs
> > > > > > > >>
> > > > > > > >> Florin,
> > > > > > > >>
> > > > > > > >> Regarding the multi-homing, my assumption is that the NVE
> > > > > > > >> in the hypervisor would not (want to) run a routing
> > > > > > > >> protocol. So as Ivan points out, the standard would need to
> > > > > > > >> accommodate multiple IP addresses per NVE.
> > > > > > > >>
> > > > > > > >> Somesh
> > > > > > > >>
> > > > > > > >>> -----Original Message-----
> > > > > > > >>> From: Ivan Pepelnjak [mailto:[email protected]]
> > > > > > > >>> Sent: Tuesday, September 04, 2012 9:58 AM
> > > > > > > >>> To: Balus, Florin Stelian (Florin)
> > > > > > > >>> Cc: Somesh Gupta; [email protected]
> > > > > > > >>> Subject: Re: [nvo3] Support for multi-homed NVEs
> > > > > > > >>>
> > > > > > > >>> Ah, that other can of worms ;) Mine was simpler.
> > > > > > > >>>
> > > > > > > >>> On the underlay side, we might decide that NVEs have a
> > > > > > > >>> single IP address or multiple IP addresses (like some
> > > > > > > >>> NVGRE load balancing proposals). If we decide NVEs have a
> > > > > > > >>> single IP address (potentially per virtual network
> > > > > > > >>> segment), then the rest is implementation details (and
> > > > > > > >>> we're back to MLAG/SMLT land for true redundancy).
> > > > > > > >>> Alternatively we might implement the option of having
> > > > > > > >>> multiple IP addresses per NVE, and the NVEs might use the
> > > > > > > >>> IP-address-per-link option (thus no need for L2 or MLAG
> > > > > > > >>> at all).
> > > > > > > >>>
> > > > > > > >>> On the overlay side, the real problem (as you stated) is
> > > > > > > >>> the multi-homing of NVO3-to-legacy gateways. I don't see
> > > > > > > >>> any other need for overlay NVE multihoming.
> > > > > > > >>>
> > > > > > > >>> BTW, Nicira has nicely solved the NVO3 gateway
> > > > > > > >>> multihoming - the whole NVO3 network works exactly like
> > > > > > > >>> VMware's vSwitch: split horizon bridging (thus no
> > > > > > > >>> forwarding loops through NVO3), with every VM MAC address
> > > > > > > >>> being dynamically assigned to one of the gateways, which
> > > > > > > >>> also solves the return path issues (dynamic MAC learning
> > > > > > > >>> in legacy network takes care of that). Maybe we should
> > > > > > > >>> just use the wheel that has already been invented?
> > > > > > > >>>
> > > > > > > >>> Kind regards,
> > > > > > > >>> Ivan
> > > > > > > >>>
> > > > > > > >>> On 9/4/12 6:45 PM, Balus, Florin Stelian (Florin) wrote:
> > > > > > > >>>> I understand the discussion below is about the NVE
> > > > > > > >>>> multi-homing towards the IP core, on the tunnel side.
> > > > > > > >>>> We did not focus in the framework draft on the core
> > > > > > > >>>> redundancy as in our opinion there was no need to
> > > > > > > >>>> standardize anything here. There are no differences
> > > > > > > >>>> from what is available today in regular IP networks: if
> > > > > > > >>>> NVEs are multi-homed directly to the next IP router,
> > > > > > > >>>> regular routing will take care of it. If there is
> > > > > > > >>>> Ethernet switching in between NVE and the next IP hop,
> > > > > > > >>>> L2 resiliency mechanisms need to be employed.
> > > > > > > >>>> From what I read below it looks more of an implementation
> > > > > > > >>>> discussion than a standardization requirement. Am I
> > > > > > > >>>> right?
> > > > > > > >>>> By Multi-homed NVEs one can also understand a set of
> > > > > > > >>>> NVEs multi-homed on the access side to other devices.
> > > > > > > >>>> That is a discussion we need to have in my opinion. A
> > > > > > > >>>> use case example: NVO3 network - NVE GWs multi-homed to
> > > > > > > >>>> external non-NVO3 networks. Handoff can be VLANs, VPLS
> > > > > > > >>>> PWs, or BGP EVPN labels...
> > > > > > > >>>> I think the latter is worth discussing although there
> > > > > > > >>>> are some mechanisms and some standardization initiatives
> > > > > > > >>>> in place already.
> > > > > > > >>>>
> > > > > > > >>>>> -----Original Message-----
> > > > > > > >>>>> From: [email protected] [mailto:nvo3-[email protected]]
> > > > > > > >>>>> On Behalf Of Ivan Pepelnjak
> > > > > > > >>>>> Sent: Friday, August 31, 2012 12:23 AM
> > > > > > > >>>>> To: 'Somesh Gupta'; [email protected]
> > > > > > > >>>>> Subject: Re: [nvo3] Support for multi-homed NVEs
> > > > > > > >>>>>
> > > > > > > >>>>> This is definitely an interesting can of worms ;)
> > > > > > > >>>>>
> > > > > > > >>>>> While I don't think we should go down the path of
> > > > > > > >>>>> IP-A/IP-B networks similar to some other DC technology,
> > > > > > > >>>>> we will face the reality of some NVE elements
> > > > > > > >>>>> (hypervisor soft switches) not being underlay IP
> > > > > > > >>>>> routers.
> > > > > > > >>>>> We could either:
> > > > > > > >>>>>
> > > > > > > >>>>> (A) ignore the issue and expect the network designer to
> > > > > > > >>>>> solve it using any one of the existing NIC teaming/MLAG
> > > > > > > >>>>> kludges while retaining a single encapsulation IP
> > > > > > > >>>>> address per NVE;
> > > > > > > >>>>>
> > > > > > > >>>>> (B) provide support for multiple encapsulation
> > > > > > > >>>>> addresses per NVE so a multi-homed NVE could have one
> > > > > > > >>>>> IP address per physical interface and send and receive
> > > > > > > >>>>> nvo3-encapsulated frames using more than one address.
> > > > > > > >>>>>
> > > > > > > >>>>> Option (A) is the easy way out similar to existing
> > > > > > > >>>>> MPLS/VPN behavior and would fit well with existing DC
> > > > > > > >>>>> deployments. It would also retain all the server-to-ToR
> > > > > > > >>>>> multihoming complexity.
> > > > > > > >>>>>
> > > > > > > >>>>> Option (B) would reduce the complexity of the underlay
> > > > > > > >>>>> DC network (which would become a simple L3 network with
> > > > > > > >>>>> single-homed IP addresses), but we'd have to deal with
> > > > > > > >>>>> a bunch of additional problems (peer IP address
> > > > > > > >>>>> liveliness check).
> > > > > > > >>>>>
> > > > > > > >>>>> Just speculating ...
> > > > > > > >>>>> Ivan
> > > > > > > >>>>>
> > > > > > > >>>>>> -----Original Message-----
> > > > > > > >>>>>> From: [email protected] [mailto:nvo3-[email protected]]
> > > > > > > >>>>>> On Behalf Of Somesh Gupta
> > > > > > > >>>>>> Sent: Friday, August 31, 2012 6:58 AM
> > > > > > > >>>>>> To: [email protected]
> > > > > > > >>>>>> Subject: [nvo3] Support for multi-homed NVEs
> > > > > > > >>>>>>
> > > > > > > >>>>>> I did not see any mention of multi-homed NVEs in
> > > > > > > >>>>>> draft-lasserre-nvo3-framework-03.txt. NVEs are
> > > > > > > >>>>>> connected together by an L3 network - does that mean
> > > > > > > >>>>>> only one?
> > > > > > > >>>>>> Can it be multi-homed to two L3 networks?
> > > > > > > >>>>>>
> > > > > > > >>>>>> Somesh
> > > > > > > >>>>>> _______________________________________________
> > > > > > > >>>>>> nvo3 mailing list
> > > > > > > >>>>>> [email protected]
> > > > > > > >>>>>> https://www.ietf.org/mailman/listinfo/nvo3
