Hi all,

Aldrin wrote about standardized MESSAGING,
http://www.ietf.org/mail-archive/web/nvo3/current/msg01580.html.

I agree completely with Aldrin, but there is more to the whole picture
than MESSAGING packets. If we want to automate things between the
server operational domain and the network operational domain, more
than MESSAGING packets is needed, i.e. not only a location mapping
database but also some resource database(s).

I'm biased by SIP, and in this post I'm going to use MESSAGING methods
from SIP, together with a reference architecture and use cases, to
explain my vision of an NVO3 architecture. A SIP proxy is usually
stateful, i.e. it can keep track of sessions, and you have a
centralized place from which you can collect data to generate an
overview of your system. SIP leverages distributed databases; you
don't need to have all data in one single database, and the SIP UA
uses the pull model to fetch *desired* location and service related
information from the databases. BGP is sort of one single database; it
is distributed to all peers and each peer decides locally what to do
with the data - it is the push model, IMHO.
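To make the pull vs. push contrast concrete, here is a toy sketch (all
class and variable names are invented for this illustration; nothing
here is SIP or BGP wire format):

```python
class MappingDatabase:
    """Pull model: peers query only the entries they need (SIP-style)."""
    def __init__(self):
        self._locations = {}   # endpoint -> location

    def register(self, endpoint, location):
        self._locations[endpoint] = location

    def lookup(self, endpoint):
        # The caller fetches *desired* data on demand.
        return self._locations.get(endpoint)


class RouteDistributor:
    """Push model: every update is sent to all peers (BGP-style)."""
    def __init__(self):
        self._peers = []

    def add_peer(self, peer_table):
        self._peers.append(peer_table)

    def announce(self, prefix, next_hop):
        # Each peer receives the full update and decides locally.
        for table in self._peers:
            table[prefix] = next_hop
```

In the pull model only the querying NVE learns a mapping; in the push
model every peer's table grows with every announcement, needed or not.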

It is also easier to use SIP RFCs to describe MESSAGING methods in
this post. There is a lot to consider in MESSAGING methods, and this
doesn't mean that the syntax should be like SIP's, but you can get an
overview of the MESSAGING methods in the SIP RFCs, and there are years
of experience behind those RFCs. So it is worthwhile to at least read
the abstract/introduction of the RFCs mentioned in this post.
For example, Sunny wrote,
http://www.ietf.org/mail-archive/web/nvo3/current/msg01596.html , "The
issue here is how to push endpoint updates to the NVE when endpoints
move" - this is the REFER method (if the endpoint in this context is
the NVE) or the INFO method (if the endpoint is the VM).
Remember, there is no need to use the SIP syntax; instead focus on the
MESSAGING methods, i.e. how to set up, move and delete sessions
(tunnels).

Sorry for the long e-mail. I don't have time to write a draft (I soon
need to focus on my customer cases), and for sure there are holes in
the flow of use cases, but hopefully you can get an idea of what I'm
looking for.

The reference architecture:
- an L2 pod where all 4000 VLANs are enabled in the local network,
which is attached via two L3 ToRs to the DC L3 backbone
- L3 routing services are either provided by the ToRs (attached to the
L2 pod) or located "behind" the DC L3 backbone
- hypervisors from several vendors are leveraged, with a standardized NVE stack
- hypervisor based NVEs (hNVE) and asic based NVEs (aNVE) are deployed
in the NVO3 network
- for high bandwidth multicast streams the underlying network supports RFC6513
- two types of NVO proxies (just as an example and to deal with the
organizational silos)
 * tenant NVO (tNVE) proxy, keeping track of tenant related
information (in the server operational domain).
 * network NVO (nNVE) proxy, keeping track of network resources e.g.
NVO3 gateways, NG-MVPN mcast groups etc. (in the network operational
domain).

Use cases for a tenant:

1. Provisioning of tenant information
The server administrator configures the following tenant information
at the tNVE proxy:
- Customer ID
- IP subnet to be used for the servers (to keep it simple in this
study, only one subnet, but of course there will be several subnets if
the web-app-db tiers are used)
- IP default gateway for that subnet
- option to support high bandwidth multicast (yes/no)

2. Provisioning of network resources
The network administrator configures the following network information
at the nNVE proxy:
- location of the NVO gateways (for WAN connectivity and legacy
non-virtualized hosts, i.e. ToR switches attached to the L2 pods) and
their locally assigned VLANs
- list of locally assigned VLANs at each L2 pod
- location of NVO gateways that are capable of producing NG-MVPN
services, their locally assigned VLANs and assigned multicast groups
- Customer ID mapping to VNI, MPLS VRF instances (name, route
distinguisher, route target) etc.
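The provisioning data from use cases 1 and 2 could be modeled roughly
like this (a sketch only; the field names mirror the bullet lists
above, everything else - types, defaults, structure - is my invention):

```python
from dataclasses import dataclass, field

@dataclass
class TenantRecord:
    """Held by the tNVE proxy (use case 1)."""
    customer_id: str
    subnet: str                  # e.g. "10.1.1.0/24"
    default_gateway: str
    high_bw_multicast: bool = False

@dataclass
class NetworkRecord:
    """Held by the nNVE proxy (use case 2)."""
    customer_id: str
    vni: int
    vrf_name: str
    route_distinguisher: str
    route_target: str
    # gateway name -> locally assigned VLANs at that gateway
    gateway_vlans: dict = field(default_factory=dict)
```

The point is only that the two proxies keep disjoint views, joined by
the Customer ID.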

3. Provisioning of a NVE
The server administrator configures a new hNVE at the hypervisor; the
admin must specify which tNVE proxy shall be used, and there might be
some authentication solution that must be obeyed. The hNVE leverages a
MESSAGING solution to register itself with the assigned tNVE proxy.
Ditto for aNVE, the network admin chooses the appropriate nNVE proxy.
Note that a standardized MESSAGING method (register method) is applied
by the NVEs and not by e.g. the centralized console of a hypervisor
solution.
In SIP the REGISTER method is leveraged, see RFC 3261, section 10
http://tools.ietf.org/html/rfc3261#page-56
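A minimal sketch of such a REGISTER-style step, assuming a proxy that
keeps a simple NVE location table (class and method names are
illustrative, and the 401/200 codes are just borrowed from SIP):

```python
class NVOProxy:
    """Toy registrar, loosely modeled on RFC 3261 section 10."""
    def __init__(self):
        self.registrations = {}  # nve_id -> contact address

    def handle_register(self, nve_id, contact, credentials=None):
        # A real deployment would verify credentials here; this sketch
        # only mimics the SIP challenge/accept pattern.
        if credentials is None:
            return 401           # challenge the NVE, as in SIP
        self.registrations[nve_id] = contact
        return 200
```

The same shape would serve an aNVE registering with its nNVE proxy.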

4. Provisioning of a VM
The server admin assigns a VM to the correct CID; the hNVE notices
that a new server is deployed and leverages a MESSAGING method to
fetch connectivity services from the assigned tNVE proxy.
In SIP the INVITE method is applied, see RFC 3261, section 4
http://tools.ietf.org/html/rfc3261#page-10
I'll switch partly to the SIP lingo - the INVITE is sent to the
assigned tNVE proxy together with the MAC address of the VM and the
CID. The tNVE proxy forwards this INVITE to the nNVE proxy (if there
are several, it forks the INVITE to all proxies), and the tNVE adds
the following data to the INVITE:
- IP subnet
- IP default gateway
- high bandwidth multicast requested
- supported encapsulation schemes with priority
Next the nNVE proxy will process the INVITE, assigning the following
in the 200 (OK) message:
- location of the NVO WAN GW, including supported encapsulation schemes
- location of the NVO ToR GW to reach legacy hosts, including
supported encapsulation schemes
- a VLAN to be used for the VM (one VLAN ID shall be used that is
locally available to all three parties, i.e. the WAN GW, the ToR GW
and the L2 pod where the initiating hNVE is located)
- VNI
- NG-MVPN mcast group
The hNVE enables the VLAN and starts to set up the two tunnels to the
WAN GW and ToR GW with the provided parameters. Ditto at the two
aNVEs, but since these are routers, some other mechanism can be used,
for example XMPP, NETCONF etc., to configure all required network
parameters on the routers, including adding the MAC address of the new
VM at the two aNVEs.
If successful, two new unicast tunnels have been established and we
have stateful information about the tunnels at both proxies, including
MAC information of the VM - one with focus on the tenant and one with
focus on the network resources.
The two ToRs that are attached to the L2 pod of the hNVE are assigned
the new VLAN for the VM and are added to the NG-MVPN multicast group;
same for the WAN GW and the ToR GW.
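The INVITE flow above could be sketched as three steps - the hNVE
originates, the tNVE proxy enriches, the nNVE proxy answers with a 200
(OK). All field names and values below are illustrative, not a
proposed wire format:

```python
def hnve_invite(vm_mac, cid):
    # Step 1: the hNVE sends the INVITE with the VM's MAC and the CID.
    return {"method": "INVITE", "vm_mac": vm_mac, "cid": cid}

def tnve_enrich(invite, tenant_db):
    # Step 2: the tNVE proxy adds tenant data before forwarding.
    t = tenant_db[invite["cid"]]
    invite.update({
        "subnet": t["subnet"],
        "default_gw": t["default_gw"],
        "high_bw_mcast": t["high_bw_mcast"],
        "encaps": ["vxlan", "nvgre"],   # supported schemes, by priority
    })
    return invite

def nnve_answer(invite, network_db):
    # Step 3: the nNVE proxy assigns network resources in the 200 (OK).
    n = network_db[invite["cid"]]
    return {
        "status": 200,
        "wan_gw": n["wan_gw"],
        "tor_gw": n["tor_gw"],
        "vlan": n["free_vlan"],   # VLAN ID usable by all three parties
        "vni": n["vni"],
        "mcast_group": n["mcast_group"],
    }
```

Both proxies can store the resulting session state, giving the
stateful tunnel overview mentioned above.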

5. Adding/deleting VMs on an hNVE with existing tunnels
The server admin adds/deletes a VM at the hNVE discussed in use case
4. A MESSAGING method is needed to update the MAC tables on all NVEs
belonging to the VNI. No changes to the tunnels.
In SIP the INFO method can be used, see
https://tools.ietf.org/html/rfc6086
I think there are other similar methods that could be used, but I'm not sure.
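As a sketch, such an INFO-style in-session update could look like this
(illustrative names only): a MAC add/delete is distributed to the MAC
tables of the NVEs in the VNI, with no change to the tunnels.

```python
def make_info(vni, action, mac):
    # Build an INFO-like update for one MAC within an existing session.
    assert action in ("add", "delete")
    return {"method": "INFO", "vni": vni, "action": action, "mac": mac}

def apply_info(mac_table, msg):
    # mac_table: vni -> set of MAC addresses known at this NVE.
    if msg["action"] == "add":
        mac_table.setdefault(msg["vni"], set()).add(msg["mac"])
    else:
        mac_table.get(msg["vni"], set()).discard(msg["mac"])
```

Each NVE belonging to the VNI would apply the same update locally.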

6. The hNVE is moved to another host
If the make-before-break approach is preferred, the existing tunnels
should be gracefully moved from the existing host to another host. A
MESSAGING method is needed for this.
In SIP the REFER method is leveraged, see
http://tools.ietf.org/html/rfc3515
There are other similar methods that could be used.
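A make-before-break move could be sketched like this (the function and
callback names are invented for the illustration): the new tunnel is
established and the session repointed before the old tunnel is torn
down.

```python
def move_session(sessions, session_id, new_host,
                 setup_tunnel, teardown_tunnel):
    # sessions: session_id -> current host for that tunnel.
    old_host = sessions[session_id]
    setup_tunnel(new_host)            # make ...
    sessions[session_id] = new_host   # repoint the session state
    teardown_tunnel(old_host)         # ... before break
    return old_host
```

The REFER-like message would carry the new host as the refer target;
the proxy keeps the session state consistent throughout the move.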

This is a very rough overview, but hopefully you can see what could be
achieved with a standardized MESSAGING solution, including distributed
location databases and network resources databases.

NVO3 has so much more potential than deciding which/how many
encapsulation schemes should be used etc.; here is an opportunity to
bridge the gap between the DC silos and automate provisioning of
resources.

Best regards,

Patrick
_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3
