Re: [nvo3] Support for multi-homed NVEs

thomas.morin Fri, 07 Sep 2012 08:35:24 -0700

Hi Ivan,

Ivan Pepelnjak :
> Furthermore, we won't have /32 routes to BGP next hops any more (that would 
> not scale to the several million endpoints from the NVO3 charter). The first 
> L3 switch to which the server is connected (a ToR switch or a spine switch) 
> will originate an IP prefix ==> we cannot use NH tracking.


For NH tracking to be usable, the underlay routing has to scale to the
number of NVEs (one host route per NVE), not to the number of endpoints.
AFAIK, the target scale for the number of NVEs is not millions, but
rather tens of thousands, up to hundreds of thousands.
So, I would tend to agree with you that this kind of range would rule
out IGP-based NH tracking, but the IGP is not the only tool we have to
scale routing, is it?

-Thomas
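
To make the NH tracking argument above concrete, here is a minimal
sketch (an illustration only: the addresses, the route sets and the
function names longest_match / next_hop_valid are made up for this
example and do not come from any implementation). It models NH tracking
as "the next hop is valid as long as some underlay route covers it" and
compares per-NVE /32 host routes with a prefix originated by the first
L3 switch.

```python
import ipaddress


def longest_match(next_hop, routes):
    """Return the most specific route covering next_hop, or None."""
    nh = ipaddress.ip_address(next_hop)
    covering = [r for r in routes if nh in r]
    return max(covering, key=lambda r: r.prefixlen, default=None)


def next_hop_valid(next_hop, routes):
    """NH tracking in its simplest form: valid if any route covers it."""
    return longest_match(next_hop, routes) is not None


# Hypothetical addressing: two NVEs behind one ToR switch.
FAILED_NVE = "10.0.1.11"

# Case 1: every NVE is visible as a /32 host route in the underlay.
host_routes = {
    ipaddress.ip_network("10.0.1.11/32"),
    ipaddress.ip_network("10.0.1.12/32"),
}
# Case 2: the ToR switch originates one prefix for all attached NVEs.
tor_prefix = {ipaddress.ip_network("10.0.1.0/24")}

# The NVE at 10.0.1.11 fails: its host route is withdrawn, but the
# ToR switch keeps announcing its prefix.
host_routes_after = host_routes - {ipaddress.ip_network("10.0.1.11/32")}
tor_prefix_after = tor_prefix

print(next_hop_valid(FAILED_NVE, host_routes_after))  # False: failure seen
print(next_hop_valid(FAILED_NVE, tor_prefix_after))   # True: failure hidden
```

In the first case the underlay carries one route per NVE, which is
where the scaling question comes from; in the second case the table
stays small but the failure of a single NVE is invisible to NH
tracking.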


>> -----Original Message-----
>> From: Yakov Rekhter [mailto:[email protected]]
>> Sent: Friday, September 07, 2012 12:50 AM
>> To: Ivan Pepelnjak
>> Cc: 'Balus, Florin Stelian (Florin)'; [email protected]; 'Somesh Gupta';
>> [email protected]
>> Subject: Re: [nvo3] Support for multi-homed NVEs
>>
>> Ivan,
>>
>>> Florin,
>>>
>>> All the VPNs you've mentioned share the data-plane principles with
>>> LISP (that's why I mentioned them in the same sentence): IP packet is
>>> encapsulated into an envelope with a known destination endpoint
>>> (PE-router, ETR), and sent on its way. The big question is "is the
>>> destination endpoint reachable?"
>>>
>>> In most MPLS-based implementations we assume the endpoint is reachable
>>> if we're getting VPN routing/signalling updates from it and if it's
>>> reachable through the IP routing table. Some implementations might be
>>> more cautious and use GRE encapsulation if there's no LSP to the
>>> endpoint, others might use MPLS OAM.  Usually these techniques are
>>> glossed over and left as implementation details.
>> In 2547 VPNs (or E-VPNs) it is the PE that originates routes to the VPN
>> sites connected to that PE. So, if the PE goes down, the routes get
>> withdrawn. This is "vanilla" BGP, and as such has to be supported by any
>> non-broken BGP implementation. In addition, there are several
>> enhancements to speed up connectivity restoration after egress PE
>> failure, such as Next-Hop tracking, egress PE protection, etc., that
>> have been implemented by vendors.
>>
>>> On the other hand, LISP defines several mechanisms that can be used to
>>> check the ETR liveliness.
>> Yes, except that all these LISP mechanisms are broken in one way or
>> another.
>>
>>> Coming back to nvo3, we'll have serious problems with NVE in a
>>> hypervisor using more than one underlay IP address, more so if its
>>> control-plane session uses only one of them. We'll never know whether
>>> the other IP addresses are reachable (the problem becomes worse if you
>>> have a DC transport infrastructure that's a mixture of L2 and L3).
>>> In a situation where an NVE has more than one IP address used by nvo3, we
>>> need (in my opinion) something that checks the liveliness of the
>>> remote NVE IP address ... and it's not an implementation detail, it's
>>> a mandatory requirement.
>> "check the liveness" is a bit underspecified, as it does not say what
>> should be the upper bound on the detection time.
>> While discussing the upper bound, we should probably keep in mind that the
>> goal here is to minimize connectivity disruption after a failure. Handling
>> connectivity disruption could be done via either global or local repair
>> techniques (both of these have been employed in 2547 VPNs to deal with PE
>> failures).
>>
>> Yakov.
>>> Hope this makes more sense
>>> Ivan
>>>
>>>> -----Original Message-----
>>>> From: Balus, Florin Stelian (Florin) [mailto:florin.balus@alcatel-lucent.com]
>>>> Sent: Tuesday, September 04, 2012 11:46 PM
>>>> To: Ivan Pepelnjak
>>>> Cc: Somesh Gupta; [email protected]
>>>> Subject: RE: [nvo3] Support for multi-homed NVEs
>>>>
>>>> Ivan,
>>>> See in-line...
>>>>
>>>>> -----Original Message-----
>>>>> From: Ivan Pepelnjak [mailto:[email protected]]
>>>>> Sent: Tuesday, September 04, 2012 12:03 PM
>>>>> To: Balus, Florin Stelian (Florin)
>>>>> Cc: Somesh Gupta; [email protected]
>>>>> Subject: Re: [nvo3] Support for multi-homed NVEs
>>>>>
>>>>> A) Not per prefix (at least not without a serious amount of
>>>>> end-to-end BGP multipathing+BGP AddPath or multiple RDs)
>>>>>
>>>>> B) There's no liveliness check in *VPN (apart from LISP)
>>>> [FB>] How are the VPN specifications (IP VPN, VPLS, VPWS or incoming
>>>> EVPN) connected to LISP? Also can you be more specific on the
>>>> absence of liveliness check in *VPN. Are you talking about a certain
>>>> deployment/implementation environment?
>>>> Thanks,
>>>> Florin
>>>>
>>>>> In MPLS/VPN we rely on the host route to PE loopback and/or LSP to
>>>>> the /32 prefix to indicate next hop validity. That won't work for
>>>>> hypervisor-based NVEs
>>>>>
>>>>> On 9/4/12 8:00 PM, Balus, Florin Stelian (Florin) wrote:
>>>>>> AFAIK the text in NVO3 framework and requirements drafts allows
>>>>>> multiple IPs per NVE. We have multiple IPs per PE in current VPN
>>>>>> implementations. This is an implementation matter though in my
>>>>>> opinion.
>>>>>>> -----Original Message-----
>>>>>>> From: Somesh Gupta [mailto:[email protected]]
>>>>>>> Sent: Tuesday, September 04, 2012 10:51 AM
>>>>>>> To: Ivan Pepelnjak; Balus, Florin Stelian (Florin)
>>>>>>> Cc: [email protected]
>>>>>>> Subject: RE: [nvo3] Support for multi-homed NVEs
>>>>>>>
>>>>>>> Florin,
>>>>>>>
>>>>>>> Regarding the multi-homing, my assumption is that the NVE in
>>>>>>> the hypervisor would not (want to) run a routing protocol. So
>>>>>>> as Ivan points out, the standard would need to accommodate
>>>>>>> multiple IP addresses per NVE.
>>>>>>>
>>>>>>> Somesh
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ivan Pepelnjak [mailto:[email protected]]
>>>>>>>> Sent: Tuesday, September 04, 2012 9:58 AM
>>>>>>>> To: Balus, Florin Stelian (Florin)
>>>>>>>> Cc: Somesh Gupta; [email protected]
>>>>>>>> Subject: Re: [nvo3] Support for multi-homed NVEs
>>>>>>>>
>>>>>>>> Ah, that other can of worms ;) Mine was simpler.
>>>>>>>>
>>>>>>>> On the underlay side, we might decide that NVEs have a single
>>>>>>>> IP address or multiple IP addresses (like some NVGRE load
>>>>>>>> balancing proposals). If we decide NVEs have a single IP
>>>>>>>> address (potentially per virtual network segment), then the rest
>>>>>>>> is implementation details (and we're back to MLAG/SMLT land for
>>>>>>>> true redundancy).
>>>>>>>> Alternatively we might implement the option of having multiple
>>>>>>>> IP addresses per NVE, and the NVEs might use the
>>>>>>>> IP-address-per-link option (thus no need for L2 or MLAG at all).
>>>>>>>>
>>>>>>>> On the overlay side, the real problem (as you stated) is the
>>>>>>>> multi-homing of NVO3-to-legacy gateways. I don't see any other
>>>>>>>> need for overlay NVE multihoming.
>>>>>>>>
>>>>>>>> BTW, Nicira has nicely solved the NVO3 gateway multihoming -
>>>>>>>> the whole NVO3 network works exactly like VMware's vSwitch: split
>>>>>>>> horizon bridging (thus no forwarding loops through NVO3), with
>>>>>>>> every VM MAC address being dynamically assigned to one of the
>>>>>>>> gateways, which also solves the return path issues (dynamic
>>>>>>>> MAC learning in legacy network takes care of that). Maybe we
>>>>>>>> should just use the wheel that has already been invented?
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Ivan
>>>>>>>>
>>>>>>>> On 9/4/12 6:45 PM, Balus, Florin Stelian (Florin) wrote:
>>>>>>>>> I understand the discussion below is about the NVE multi-homing
>>>>>>>>> towards the IP core, on the tunnel side.
>>>>>>>>> We did not focus in the framework draft on the core redundancy
>>>>>>>>> as in our opinion there was no need to standardize anything here.
>>>>>>>>> There are no differences from what is available today in regular
>>>>>>>>> IP networks: if NVEs are multi-homed directly to the next IP
>>>>>>>>> router, regular routing will take care of it. If there is
>>>>>>>>> Ethernet switching in between NVE and the next IP hop, L2
>>>>>>>>> resiliency mechanisms need to be employed.
>>>>>>>>> From what I read below it looks more of an implementation
>>>>>>>>> discussion than a standardization requirement. Am I right?
>>>>>>>>> By Multi-homed NVEs one can also understand a set of NVEs
>>>>>>>>> multi-homed on the access side to other devices. That is a
>>>>>>>>> discussion we need to have in my opinion. A use case example:
>>>>>>>>> NVO3 network - NVE GWs multi-homed to external non-NVO3
>>>>>>>>> networks. Handoff can be VLANs, VPLS PWs, or BGP EVPN labels...
>>>>>>>>> I think the latter is worth discussing although there are some
>>>>>>>>> mechanisms and some standardization initiatives in place already.
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: [email protected] [mailto:[email protected]]
>>>>>>>>>> On Behalf Of Ivan Pepelnjak
>>>>>>>>>> Sent: Friday, August 31, 2012 12:23 AM
>>>>>>>>>> To: 'Somesh Gupta'; [email protected]
>>>>>>>>>> Subject: Re: [nvo3] Support for multi-homed NVEs
>>>>>>>>>>
>>>>>>>>>> This is definitely an interesting can of worms ;)
>>>>>>>>>>
>>>>>>>>>> While I don't think we should go down the path of IP-A/IP-B
>>>>>>>>>> networks similar to some other DC technology, we will face
>>>>>>>>>> the reality of some NVE elements (hypervisor soft switches)
>>>>>>>>>> not being underlay IP routers.
>>>>>>>>>> We could either:
>>>>>>>>>>
>>>>>>>>>> (A) ignore the issue and expect the network designer to
>>>>>>>>>> solve it using any one of the existing NIC teaming/MLAG
>>>>>>>>>> kludges while retaining a single encapsulation IP address
>>>>>>>>>> per NVE;
>>>>>>>>>>
>>>>>>>>>> (B) provide support for multiple encapsulation addresses per
>>>>>>>>>> NVE so a multi-homed NVE could have one IP address per
>>>>>>>>>> physical interface and send and receive nvo3-encapsulated
>>>>>>>>>> frames using more than one address.
>>>>>>>>>>
>>>>>>>>>> Option (A) is the easy way out similar to existing MPLS/VPN
>>>>>>>>>> behavior and would fit well with existing DC deployments. It
>>>>>>>>>> would also retain all the server-to-ToR multihoming
>>>>>>>>>> complexity.
>>>>>>>>>>
>>>>>>>>>> Option (B) would reduce the complexity of the underlay DC
>>>>>>>>>> network (which would become a simple L3 network with
>>>>>>>>>> single-homed IP addresses), but we'd have to deal with a
>>>>>>>>>> bunch of additional problems (peer IP address liveliness
>>>>>>>>>> check).
>>>>>>>>>>
>>>>>>>>>> Just speculating ...
>>>>>>>>>> Ivan
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: [email protected] [mailto:[email protected]]
>>>>>>>>>>> On Behalf Of Somesh Gupta
>>>>>>>>>>> Sent: Friday, August 31, 2012 6:58 AM
>>>>>>>>>>> To: [email protected]
>>>>>>>>>>> Subject: [nvo3] Support for multi-homed NVEs
>>>>>>>>>>>
>>>>>>>>>>> I did not see any mention of multi-homed NVEs in
>>>>>>>>>>> draft-lasserre-nvo3-framework-03.txt. NVEs are connected
>>>>>>>>>>> together by an L3 network - does that mean only one?
>>>>>>>>>>> Can it be multi-homed to two L3 networks?
>>>>>>>>>>>
>>>>>>>>>>> Somesh
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> nvo3 mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> https://www.ietf.org/mailman/listinfo/nvo3


