Re: [openstack-dev] [Neutron] [RFC] Floating IP idea solicitation and collaboration

Thomas Morin Tue, 16 Dec 2014 05:41:46 -0800

Hi Keshava,

2014-12-15 11:52, A, Keshava :

        I have been thinking of "Starting MPLS right from CN" for L2VPN/EVPN scenario also.


        Below are my queries w.r.t supporting MPLS from OVS :
                1. MPLS will be used even for VM-VM traffic across CNs generated by OVS ?

If E-VPN is used only to interconnect outside of a Neutron domain, then MPLS does not have to be used for traffic between VMs.

If E-VPN is used inside one DC for VM-VM traffic, then MPLS is only *one* of the possible encapsulations: E-VPN specs have been defined to use VXLAN (handy because there is native kernel support), and MPLS/GRE or MPLS/UDP are other possibilities.

                2. MPLS will be originated right from OVS and will be mapped at Gateway (it may be NN/Hardware router ) to SP network ?
                        So MPLS will carry 2 Labels ? (one for hop-by-hop, and other one for end to identify network ?)

On "will carry 2 Labels ?": this would be one possibility, but not the one we target. We would actually favor MPLS/GRE (GRE used instead of what you call the MPLS "hop-by-hop" label) inside the DC; this requires only one label.

At the DC edge gateway, depending on the interconnection technique used to connect to the WAN, different options can be used (RFC4364 section 10): Option A with back-to-back VRFs (no MPLS label, but typically VLANs), or Option B (with one MPLS label). A mix of A and B is also possible and is sometimes called Option D (one label). Option C also exists, but is not a good fit here.

Inside one DC, if vswitches see each other across an Ethernet segment, we can also use MPLS with just one label (the VPN label) without a GRE encap.

In a way, you can say that in Option B the labels are "mapped" at the DC/WAN gateway(s), but this is really just MPLS label swapping, not to be misunderstood as mapping a DC label space to a WAN label space (see below, the label space is local to each device).

                3. MPLS will go over even the "network physical infrastructure" also ?

The use of MPLS/GRE means we are doing an overlay, just like your typical VXLAN-based solution, and the network physical infrastructure does not need to be MPLS-aware (it just needs to be able to carry IP traffic).
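
To illustrate why the underlay only has to carry IP, here is a minimal sketch of what an MPLS-over-GRE payload with a single VPN label could look like on the wire. The function names are made up for illustration (this is not BaGPipe or OVS code), the TTL of 64 and label value 42 are arbitrary, and the outer IP header is assumed to be added by the host's IP stack.

```python
import struct

# EtherType for MPLS unicast, carried in the GRE protocol-type field.
GRE_PROTO_MPLS_UNICAST = 0x8847


def mpls_label_entry(label, tc=0, bottom_of_stack=True, ttl=64):
    """Pack one 32-bit MPLS label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("MPLS labels are 20 bits wide")
    word = (label << 12) | (tc << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def mpls_over_gre(vpn_label, payload):
    """Basic 4-byte GRE header (no checksum, key or sequence number),
    followed by a single MPLS label and the tenant packet."""
    gre_header = struct.pack("!HH", 0, GRE_PROTO_MPLS_UNICAST)
    return gre_header + mpls_label_entry(vpn_label) + payload


# Illustrative use: one VPN label, marked as bottom of stack.
packet = mpls_over_gre(42, b"inner IP packet")
```

Routers in the physical infrastructure would only look at the outer IP header; the GRE header and the label are opaque payload to them.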

                4. How the Labels will be mapped a/c virtual and physical world ?


(I don't get the question, I'm not sure what you mean by "mapping labels")

                5. Who manages the label space ? Virtual world or physical world or both ? (OpenStack + ODL ?)

In MPLS*, the label space is local to each device: a label is "downstream-assigned", i.e. allocated by the receiving device for a specific purpose (e.g. forwarding in a VRF). It is then (typically) advertised in a routing protocol; the sender device will use this label to send traffic to the receiving device for this specific purpose. As a result, a sender device may use label 42 to forward traffic in the context of VPN X to a receiving device A, and the same label 42 to forward traffic in the context of another VPN Y to another receiving device B, and locally use label 42 to receive traffic for VPN Z. There is no global label space to manage.
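
The label 42 example can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the `Device` class, its methods, and the choice to start every allocator at 42 are hypothetical, and a real implementation would learn labels from BGP advertisements rather than from direct method calls.

```python
import itertools


class Device:
    """A forwarding device with its own, purely local label space."""

    def __init__(self, name, first_label=42):
        self.name = name
        self._next_label = itertools.count(first_label)
        self.incoming = {}  # local label -> VPN it was allocated for
        self.outgoing = {}  # (receiver name, VPN) -> label chosen by that receiver

    def allocate(self, vpn):
        """Downstream assignment: the receiver picks the label."""
        label = next(self._next_label)
        self.incoming[label] = vpn
        return label

    def learn(self, receiver, vpn, label):
        """Stand-in for receiving the label in a routing advertisement."""
        self.outgoing[(receiver.name, vpn)] = label


a, b, sender = Device("A"), Device("B"), Device("S")
sender.learn(a, "X", a.allocate("X"))  # A hands out 42 for VPN X
sender.learn(b, "Y", b.allocate("Y"))  # B independently hands out 42 for VPN Y
local_label = sender.allocate("Z")     # S also uses 42 locally, for VPN Z
```

The three uses of 42 never collide, because a label is only ever interpreted by the device that allocated it.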

So, while you can design a solution where the label space is managed in a centralized fashion, this is not required.

You could design an SDN controller solution where the controller would manage one label space common to all nodes, or all the label spaces of all forwarding devices, but I think it's hard to derive any interesting property from such a design choice.

In our BaGPipe distributed design (and this is also true in OpenContrail for instance) the label space is managed locally on each compute node (or network node if the BGP speaker is on a network node). More precisely, it is managed in the VPN implementation.

If you take a step back, the only naming space that has to be "managed" in BGP VPNs is the Route Target space. This is only in the control plane. It is a very large space (48 bits), and it is structured (each AS has its own 32-bit space, and there are private AS numbers). The mapping to MPLS labels in the dataplane is per-device and purely local.
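
As a rough illustration of the 48-bit structure, here is how a Route Target of the "two-octet AS" form could be packed into the 8-byte BGP extended community that carries it (2 bytes of type/sub-type, then 16 bits of AS number and 32 bits assigned by that AS). The helper names are my own, and 64512:100 is just an example value using a private AS number.

```python
import struct

# Type 0x00: transitive two-octet-AS-specific extended community.
# Sub-type 0x02: Route Target.
RT_TYPE, RT_SUBTYPE = 0x00, 0x02


def encode_route_target(asn, number):
    """Encode 'asn:number' as an 8-byte extended community."""
    if not 0 <= asn < 2 ** 16:
        raise ValueError("this form only carries 16-bit AS numbers")
    if not 0 <= number < 2 ** 32:
        raise ValueError("the per-AS part is 32 bits wide")
    return struct.pack("!BBHI", RT_TYPE, RT_SUBTYPE, asn, number)


def decode_route_target(data):
    """Inverse of encode_route_target; returns (asn, number)."""
    type_, subtype, asn, number = struct.unpack("!BBHI", data)
    if (type_, subtype) != (RT_TYPE, RT_SUBTYPE):
        raise ValueError("not a two-octet-AS Route Target")
    return asn, number


rt = encode_route_target(64512, 100)  # 64512 is in the private AS range
```

Nothing in this value says anything about MPLS labels; which label a given device binds to the VPN stays a local decision.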

(*: MPLS also allows "upstream-assigned" labels; this is more recent and only used in specific cases where downstream-assigned labels do not work well)

                6. The labels are nested (i.e. Like L3 VPN end to end MPLS connectivity ) will be established ?

In solutions where MPLS/GRE is used, the label stack typically has only one label (the VPN label).

                7. Or it will be label stitching between Virtual-Physical network ?
        How the end-to-end path will be setup ?

Let me know your opinion for the same.


How the end-to-end path is set up may depend on the interconnection choice.
With an inter-AS option B or A+B, you would have the following:
- ingress DC overlay: one MPLS-over-GRE hop from vswitch to DC edge
- ingress DC edge to WAN: one MPLS label (VPN label advertised by eBGP)
- inside the WAN: (typically) two labels (e.g. LDP label to reach the remote edge, and VPN label advertised via iBGP)
- WAN to egress DC edge: one MPLS label (VPN label advertised by eBGP)
- egress DC overlay: one MPLS-over-GRE hop from DC edge to vswitch
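
The same path can be written down as data, which makes it easy to see that the label stack depth only goes above one inside the WAN. This is just a restatement of the list above; the `Segment` type and the label descriptions are illustrative, not taken from any implementation.

```python
from typing import NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    name: str
    tunnel: Optional[str]    # IP tunnel carrying the MPLS packet, if any
    labels: Tuple[str, ...]  # label stack, outermost first


PATH = (
    Segment("ingress DC overlay", "GRE", ("VPN label",)),
    Segment("ingress DC edge to WAN", None, ("VPN label (eBGP)",)),
    Segment("inside the WAN", None,
            ("transport label (e.g. LDP)", "VPN label (iBGP)")),
    Segment("WAN to egress DC edge", None, ("VPN label (eBGP)",)),
    Segment("egress DC overlay", "GRE", ("VPN label",)),
)


def summarize(path):
    """One line per segment: name, stack depth and tunnel type."""
    lines = []
    for seg in path:
        carrier = f" over {seg.tunnel}" if seg.tunnel else ""
        lines.append(f"{seg.name}: {len(seg.labels)} label(s){carrier}")
    return lines


for line in summarize(PATH):
    print(line)
```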

Not sure how well the above answers your questions; please keep asking if it does not! ;)


-Thomas

-----Original Message-----
From: Mathieu Rohon [mailto:[email protected]]
Sent: Monday, December 15, 2014 3:46 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] [RFC] Floating IP idea solicitation and collaboration

Hi Ryan,

We have been working on similar use cases to announce /32 prefixes with the
BaGPipe BGP speaker, which supports EVPN.
Please have a look at use case B in [1][2].
Note also that the L2population mechanism driver for ML2 [3], which is
compatible with OVS, Linuxbridge and the Ryu ofagent, is inspired by EVPN,
and I'm sure it could help in your use case.

[1] http://fr.slideshare.net/ThomasMorin1/neutron-and-bgp-vpns-with-bagpipe
[2] https://www.youtube.com/watch?v=q5z0aPrUZYc&sns
[3] https://blueprints.launchpad.net/neutron/+spec/l2-population

Mathieu

On Thu, Dec 4, 2014 at 12:02 AM, Ryan Clevenger <[email protected]> wrote:

Hi,

At Rackspace, we have a need to create a higher level networking
service primarily for the purpose of creating a Floating IP solution
in our environment. The current solutions for Floating IPs, being tied
to plugin implementations, do not meet our needs at scale for the
following reasons:

1. Limited endpoint H/A, mainly targeting failover only and not
multi-active endpoints.
2. Lack of noisy neighbor and DDoS mitigation.
3. IP fragmentation (with cells, public connectivity is terminated
inside each cell, leading to fragmentation and IP stranding when cell
CPU/memory use doesn't line up with allocated IP blocks. Abstracting
public connectivity away from nova installations allows for much more
efficient use of those precious IPv4 blocks).
4. Diversity in transit (multiple encapsulation and transit types on a
per floating IP basis).

We realize that network infrastructures are often unique and such a
solution would likely diverge from provider to provider. However, we
would love to collaborate with the community to see if such a project
could be built that would meet the needs of providers at scale. We
believe that, at its core, this solution would boil down to
terminating north<->south traffic temporarily at a massively
horizontally scalable centralized core and then encapsulating traffic
east<->west to a specific host based on the association set up via the
current L3 router extension's 'floatingips' resource.

Our current idea involves using Open vSwitch for header rewriting and
tunnel encapsulation, combined with a set of Ryu applications for management:

https://i.imgur.com/bivSdcC.png

The Ryu application uses Ryu's BGP support to announce up to the
Public Routing layer individual floating ips (/32's or /128's) which
are then summarized and announced to the rest of the datacenter. If a
particular floating ip is experiencing unusually large traffic (DDOS,
slashdot effect, etc.), the Ryu application could change the
announcements up to the Public layer to shift that traffic to
dedicated hosts set up for that purpose. It also announces a single /32
"Tunnel Endpoint" ip downstream to the TunnelNet Routing system which
provides transit to and from the cells and their hypervisors. Since
traffic from either direction can then end up on any of the FLIP
hosts, a simple flow table to modify the MAC and IP in either the SRC
or DST fields (depending on traffic direction) allows the system to be
completely stateless. We have proven this out (with static routing and
flows) to work reliably in a small lab setup.
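
A minimal sketch of why such a rewrite can be stateless: the translation depends only on the packet headers and the configured floating IP associations, never on per-connection state, so any FLIP host holding the same table gives the same answer. Everything here is hypothetical (the `Packet` and `Association` types, the addresses, and `FLIP_HOST_MAC`); the real system would express this as OVS flows rather than Python.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


@dataclass(frozen=True)
class Association:
    floating_ip: str
    fixed_ip: str
    vm_mac: str


FLIP_HOST_MAC = "02:00:00:00:00:01"  # hypothetical MAC of the FLIP host


def build_tables(associations):
    """Index the associations by floating IP and by fixed IP."""
    by_floating = {a.floating_ip: a for a in associations}
    by_fixed = {a.fixed_ip: a for a in associations}
    return by_floating, by_fixed


def rewrite(packet, by_floating, by_fixed):
    """Rewrite DST fields for inbound traffic, SRC fields for outbound."""
    inbound = by_floating.get(packet.dst_ip)
    if inbound is not None:
        return replace(packet, dst_ip=inbound.fixed_ip, dst_mac=inbound.vm_mac)
    outbound = by_fixed.get(packet.src_ip)
    if outbound is not None:
        return replace(packet, src_ip=outbound.floating_ip,
                       src_mac=FLIP_HOST_MAC)
    return packet  # no association: leave the packet untouched


tables = build_tables(
    [Association("203.0.113.10", "10.0.0.5", "fa:16:3e:00:00:05")]
)
```

Because `rewrite` is a pure function of the packet and the table, inbound and return traffic for the same flow can land on different FLIP hosts without any state synchronization.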

On the hypervisor side, we currently plumb networks into separate OVS
bridges. Another Ryu application would control the bridge that handles
overlay networking to selectively divert traffic destined for the
default gateway up to the FLIP NAT systems, taking into account any
configured logical routing and local L2 traffic to pass out into the
existing overlay fabric undisturbed.

Adding in support for L2VPN EVPN
(https://tools.ietf.org/html/draft-ietf-l2vpn-evpn-11) and L2VPN EVPN
Overlay (https://tools.ietf.org/html/draft-sd-l2vpn-evpn-overlay-03)
to the Ryu BGP speaker will allow the hypervisor side Ryu application
to advertise up to the FLIP system reachability information to take
into account VM failover, live-migrate, and supported encapsulation
types. We believe that decoupling the tunnel endpoint discovery from
the control plane (Nova/Neutron) will provide for a more robust
solution as well as allow for use outside of OpenStack if desired.



