On Fri, Nov 1, 2013 at 10:16 AM, Pedro Roque Marques
<pedro.r.marq...@gmail.com> wrote:
> Darren,
>
> On Oct 31, 2013, at 10:05 AM, Darren Shepherd
> <darren.s.sheph...@gmail.com> wrote:
>
>> Yeah I think it would be great to talk about this at CCC. I'm
>> hesitant to further narrow down the definition of the network. For
>> example, I think OpenStack's Neutron is fundamentally flawed because
>> they defined a network as an L2 segment.
>
> OpenContrail implements a Neutron plugin. It uses the Neutron API to
> provide the concept of a virtual-network. The virtual-network can be a
> collection of IP subnets that work as a closed user group; by configuring
> a network-policy between virtual-networks the user/admin can define
> additional connectivity for the network. The same functionality can be
> achieved using the AWS VPC API. We have extended the Neutron API with the
> concept of network-policy but have not changed the underlying concept of
> network; the 1.00 release of the software provides an IP-only service to
> the guest (the latest release provides fallback bridging for non-IP
> traffic as well). While I don't have a firm opinion on the Neutron API,
> it does not limit the network to being an L2 segment.
>
>> In the world of SDN, I think it's even more important to keep the
>> definition of a network loose. SDN has the capability of completely
>> changing the way we look at L2 and L3. Currently in networking we group
>> things by L3 and L2 concepts as that is how routers and switches are
>> laid out today. As SDN matures and you see more flow-oriented design it
>> won't make sense to group things using L2 and L3 concepts (as those
>> become more of a physical fabric technology); the groups become looser
>> and thus the definition of a network should be loose.
>
> I don't believe there is an accepted definition of SDN. My perspective
> and the goal for OpenContrail is to decouple the physical network from
> the service provided to the "edge" (the virtual machines in this case).
> The goal is to allow the physical underlay to be designed for throughput
> and high inter-connectivity (e.g. a Clos topology), while implementing
> the functionality traditionally found in an aggregation switch (the
> L2/L3 boundary) in the host.
>
> The logic is that to get the highest server utilization one needs to be
> able to schedule a VM (or LXC) anywhere in the cluster; this implies much
> greater data throughput requirements. The standard operating procedure
> used to be to aim for I/O locality by placing multiple components of an
> application stack in the same rack. In the traditional design you can
> easily find a 20:1 over-subscription ratio between server ports and the
> actual throughput of the network core.
>
> Once you spread the server load around, the network requirements go up to
> design points like 2:1 oversubscription. This requires a different
> physical design for the network, and it means there is no longer a pair
> of aggregation switches conveniently positioned above the rack where
> policies that control network-to-network traffic can be implemented. This
> is the reason that OpenContrail implements network-to-network traffic
> policies in the ingress hypervisor switch and forwards traffic directly,
> without requiring a VirtualRouter appliance.
>
> Just to provide one less fluffy definition of the problem we are trying
> to solve...
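
To put rough numbers on the oversubscription point (purely illustrative,
not from any particular deployment): 40 servers at 10G behind a ToR with
2x10G uplinks is the 20:1 case; getting anywhere near 2:1 means
provisioning on the order of 200G of uplink for the same rack. A quick
back-of-the-envelope sketch in Python:

    # Back-of-the-envelope oversubscription calculator (illustrative numbers only).
    def oversubscription(servers_per_rack, server_port_gbps, uplinks, uplink_gbps):
        """Ratio of server-facing bandwidth to uplink bandwidth for one ToR switch."""
        edge = servers_per_rack * server_port_gbps
        core = uplinks * uplink_gbps
        return edge / core

    # Traditional rack-local design: 40 x 10G servers behind 2 x 10G uplinks.
    print(oversubscription(40, 10, 2, 10))   # 20.0 -> the 20:1 case above

    # Spreading VMs across the cluster pushes the design point toward
    # something like 40 x 10G servers behind 5 x 40G uplinks.
    print(oversubscription(40, 10, 5, 40))   # 2.0  -> the 2:1 design point
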
>> Now that's not to say that a network can't provide L2 and L3
>> information. You should be able to create a network in CloudStack and,
>> based on the configuration, know that it is a single L2 or L3. It is
>> just that the core orchestration system can't make that fundamental
>> assumption. I'd be interested in furthering the model and maybe adding
>> a concept of an L2 network such that a network guru, when designing a
>> network, can define multiple l2networks and associate them with the
>> generic network that was created. That idea I'm still toying with.
>
> I'd encourage you not to think about L2 networks. I've yet to see an
> application that is "cloud-ready" that needs anything but IP
> connectivity. For IP it doesn't matter what the underlying link layer
> looks like... emulating Ethernet is a rat-hole. There is no point in
> doing so.
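
(Just so we're all picturing the same thing before I respond: my reading
of the l2network idea Darren describes above is roughly the sketch below.
The class and field names are hypothetical, not actual CloudStack code.)

    # Hypothetical sketch of the proposed model (names made up): the orchestrator
    # deals only with the generic Network; a network guru may attach zero or more
    # L2 networks when it designs the network, and can later point each nic at
    # one of them from its reserve() method.
    from dataclasses import dataclass, field
    from typing import List, Optional
    from uuid import UUID, uuid4

    @dataclass
    class L2Network:
        id: UUID = field(default_factory=uuid4)
        segment: Optional[str] = None        # e.g. a VLAN id or a VXLAN VNI

    @dataclass
    class Network:
        id: UUID = field(default_factory=uuid4)
        cidr: Optional[str] = None
        l2_networks: List[L2Network] = field(default_factory=list)  # may stay empty

    @dataclass
    class Nic:
        network_id: UUID
        l2_network_id: Optional[UUID] = None  # set by the guru's reserve(), if at all

    # Usage: a generic network with one optional L2 segment attached.
    net = Network(cidr="10.1.0.0/24")
    net.l2_networks.append(L2Network(segment="vlan-100"))
    nic = Nic(network_id=net.id, l2_network_id=net.l2_networks[0].id)
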
That may be true in the sense that 'cloud-ready' applications are generally
just web/application servers that are ephemeral, but I'd just like to point
out that many folks aren't using CloudStack to provide cloud servers;
they're using it to provide traditional or hybrid infrastructure. Throwing
out layer 2 to me seems like throwing away the whole concept of a VPC. Or
perhaps you're just saying that it can be emulated by managing ACLs on a
per-VM basis, like security groups, and that no applications actually need
to be on the same subnet or broadcast domain. I'm not sure that can be
assumed; for example, DSR-style load balancing requires a real layer 2.

>
>> For example, when configuring DHCP on the systemvm: DHCP is an L2-based
>> service.
>
> DHCP is an IP service, typically provided via a DHCP relay service in the
> aggregation switch. For instance, in OpenContrail this is provided in the
> hypervisor switch (aka the vrouter Linux kernel module).
>
>> So to configure DHCP you really need to know, for each nic, what is the
>> L2 it's attached to and what are the VMs associated with that L2. Today,
>> since there is no first-class concept of an L2 network, you have to look
>> at the implied definition of L2. For basic networks, the L2 is the Pod,
>> so you need to list all VMs in that Pod. For guest/VPC networks, the L2
>> is the network object, so you need to list all VMs associated with the
>> network. It would be nice if, when the guru designed the network, it
>> also defined the l2networks, and then when a VM starts the guru's
>> reserve() method could associate the l2network with the nic. So the nic
>> object would have a network_id and an l2_network_id.
>
> With OpenContrail, DHCP is quite simple. The Nic uuid is known by the
> vrouter kernel module on the compute node. When the DHCP request comes
> from the tap/vif interface the vrouter answers locally (it knows the
> relationship between the Nic, its properties and the virtual-network).
> Please do not try to bring L2 into the picture. It would be very
> unhelpful to do so.
>
> For most data centers, the main networking objective is to get rid of L2
> and its limitations. Ethernet is really complex. It has a nice zero-config
> deployment for very simple networks, but at the cost of high complexity if
> you are trying to do redundancy, use multiple links, interoperate with
> other network devices, scale.... not to mention that all state is
> data-driven, which makes it really, really hard to debug. Ethernet as a
> layer 1 point-to-point link is great; not as a network.
>
> Pedro.
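
P.S. The local DHCP answer Pedro describes is conceptually just a per-nic
lookup in the hypervisor switch. Very roughly (an illustrative sketch, not
OpenContrail's vrouter code; every name below is made up):

    # Illustrative sketch of a per-nic DHCP answer in a hypervisor switch.
    # The vswitch already knows which nic sits behind each tap/vif interface,
    # so a DHCP DISCOVER/REQUEST never has to leave the host.
    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass
    class NicInfo:
        mac: str
        ip: str             # address the orchestrator allocated to this nic
        gateway: str
        netmask: str
        dns: str
        virtual_network: str

    # Populated by the local agent when the VM's nic is plugged.
    nic_by_tap: Dict[str, NicInfo] = {
        "tap-3f2a": NicInfo("02:00:00:aa:bb:01", "10.1.0.5", "10.1.0.1",
                            "255.255.255.0", "10.1.0.2", "blue-net"),
    }

    def answer_dhcp(tap_interface: str) -> Optional[dict]:
        """Return DHCP offer parameters for a request seen on a tap interface,
        or None if the nic is unknown (the request would then be dropped)."""
        nic = nic_by_tap.get(tap_interface)
        if nic is None:
            return None
        return {
            "yiaddr": nic.ip,          # offered address
            "router": nic.gateway,
            "subnet_mask": nic.netmask,
            "dns_server": nic.dns,
            "chaddr": nic.mac,
        }

    print(answer_dhcp("tap-3f2a"))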