wido opened a new issue, #12210:
URL: https://github.com/apache/cloudstack/issues/12210

   ### The required feature described as a wish
   
   # Networking without Layer 2
   This proposal is to add a new networking feature to CloudStack where 
Instances are directly assigned /32 (IPv4) and/or /128 (IPv6) addresses without 
a shared Layer 2 domain.
   
   A shared Layer 2 domain in this case would be a VLAN or VXLAN VNI where 
Instances share the same Broadcast/Multicast domain and where they use a 
shared IP-gateway for their routing.
   
   # Layer 3
   By leveraging various features of the Linux kernel, making this a KVM-only 
feature, we can directly route an IPv4 and/or IPv6 address to a virtual machine 
by using a dynamic routing protocol like BGP, but this could also work with 
OSPF(v3).
   
   By eliminating the need for Layer 2 we can create a routed network where no 
Instance has a "network relationship" with another Instance. Every Instance has 
one or more routes installed in the routing table of the network and can be 
routed to any host at any time.
   
   In the examples below I will use two IP-addresses:
   - 2.57.57.30
   - 2001:678:3a4:100::80
   
   # Hypervisor host as gateway
   On the hypervisor the cloudbr0 bridge will be created and assigned an IPv4 
and IPv6 address:
   
   ```
   auto cloudbr0
   iface cloudbr0 inet static
       address 169.254.0.1/32
       address fe80::1/64
       bridge-ports none
       bridge-stp off
       bridge-fd 0
   ```
   
   All Instances will be connected to this bridge and they will be configured 
to use the following IP-gateways:
   - 169.254.0.1
   - fe80::1
   
   # Inside the Instance
   As there is no Layer 2 available, the IP-configuration within the VM has to 
be done using ConfigDrive for cloud-init. A Virtual Router handing out DHCP and 
cloud-init data is not possible in this design: the VM has no way of 
detecting the cloud-init source over the network, as our current CloudStack 
provider within cloud-init relies on the DHCP server as a source.
   
   After the Instance has used cloud-init to fetch the networking information 
from ConfigDrive the Netplan (Ubuntu Linux) configuration would look like this:
   
   ```
   network:
     ethernets:
       ens18:
         accept-ra: no
         nameservers:
             addresses:
                 - 2620:fe::fe
                 - 2620:fe::9
         addresses:
                 - 2.57.57.30/32
                 - 2001:678:3a4:100::80/128
         routes:
         - to: default
           via: fe80::1
         - to: default
           via: 169.254.0.1
           on-link: true
     version: 2
   ```
   
   In this configuration the network inside the Instance is configured to use 
the addresses configured on **cloudbr0** as the gateway, meaning that the 
hypervisor will act as the gateway and route the IP-traffic.
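
As a rough illustration of how such a configuration could be derived from the addresses CloudStack already knows, here is a minimal Python sketch. The function name `build_netplan` and its parameters are hypothetical and not part of CloudStack or cloud-init. The result is printed as JSON, which is also valid YAML, to avoid a third-party YAML dependency.

```python
import ipaddress
import json

# Gateways configured identically on cloudbr0 of every hypervisor (from the proposal).
GATEWAY_V4 = "169.254.0.1"
GATEWAY_V6 = "fe80::1"


def build_netplan(interface, addresses, nameservers):
    """Return a Netplan v2 config dict with host-prefix addresses and default routes.

    Hypothetical helper for illustration only.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    ethernet = {
        "accept-ra": False,
        # max_prefixlen is 32 for IPv4 and 128 for IPv6, i.e. a single host.
        "addresses": [f"{ip}/{ip.max_prefixlen}" for ip in parsed],
        "nameservers": {"addresses": list(nameservers)},
        "routes": [],
    }
    if any(ip.version == 6 for ip in parsed):
        ethernet["routes"].append({"to": "default", "via": GATEWAY_V6})
    if any(ip.version == 4 for ip in parsed):
        # The IPv4 gateway lies outside the /32, so it has to be marked on-link.
        ethernet["routes"].append(
            {"to": "default", "via": GATEWAY_V4, "on-link": True}
        )
    return {"network": {"version": 2, "ethernets": {interface: ethernet}}}


config = build_netplan(
    "ens18",
    ["2.57.57.30", "2001:678:3a4:100::80"],
    ["2620:fe::fe", "2620:fe::9"],
)
print(json.dumps(config, indent=2))
```

Passing only an IPv6 address yields a single-stack configuration with just the `fe80::1` default route.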
   
   All hypervisors will use an identical configuration for cloudbr0.
   
   # ARP and NDP neighbor configuration
   CloudStack is aware of the IPv4 and/or IPv6 addresses assigned to an 
Instance as well as the MAC address. On the hypervisor these entries have to be 
installed into the kernel's routing table and neighbor table. In this example 
the commands would be:
   
   ```
   ip -6 route add 2001:678:3a4:100::80/128 dev cloudbr0
   ip -6 neigh add 2001:678:3a4:100::80 lladdr 52:02:45:76:d2:35 dev cloudbr0 nud permanent
   ip -4 route add 2.57.57.30/32 dev cloudbr0
   ip -4 neigh add 2.57.57.30 lladdr 52:02:45:76:d2:35 dev cloudbr0 nud permanent
   ```
   
   These entries would need to be added upon Instance start on that host and 
removed on Instance stop/migrate. The KVM Agent should handle the orchestration 
of these entries.
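
The start/stop handling could be sketched roughly as follows. This is not the KVM Agent's actual code (the Agent is written in Java). The function `neighbor_commands` is a hypothetical helper that only builds the `ip` argument lists and does not execute them. The bridge name is a parameter, and the example passes `cloudbr0`, the bridge defined earlier in this proposal.

```python
import ipaddress


def neighbor_commands(action, address, mac, bridge):
    """Build the `ip` commands for one Instance address (hypothetical helper).

    action is "add" (Instance start) or "del" (Instance stop/migrate).
    Returns a list of argument lists; nothing is executed here.
    """
    if action not in ("add", "del"):
        raise ValueError(f"unsupported action: {action}")
    ip = ipaddress.ip_address(address)
    family = f"-{ip.version}"  # "-4" or "-6"
    route = ["ip", family, "route", action,
             f"{ip}/{ip.max_prefixlen}", "dev", bridge]
    if action == "add":
        neigh = ["ip", family, "neigh", "add", str(ip),
                 "lladdr", mac, "dev", bridge, "nud", "permanent"]
        return [route, neigh]
    # On removal the neighbor entry goes first, then the host route.
    neigh = ["ip", family, "neigh", "del", str(ip), "dev", bridge]
    return [neigh, route]


for cmd in neighbor_commands(
    "add", "2001:678:3a4:100::80", "52:02:45:76:d2:35", "cloudbr0"
):
    print(" ".join(cmd))
```

Returning argument lists rather than shell strings means they can be passed directly to something like `subprocess.run` without shell quoting concerns.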
   
   # Dynamic Routing
   Configuring these entries in the routing table is not sufficient: they also 
need to be advertised to the upstream network. For this the hypervisor host 
would need to use some form of dynamic routing. BGP is the most commonly used, 
although some operators may prefer OSPF(v3).
   
   In both cases the hypervisor will announce these /32 (IPv4) and /128 (IPv6) 
addresses to the upstream network while receiving a default route (0.0.0.0/0 
and ::/0) from the network to be able to route traffic.
   
   A very simple piece of configuration for FRRouting (BGP or OSPF) could be:
   
   ## BGP
   ```
   router bgp
    redistribute kernel route-map only-cloud
   !
   route-map only-cloud permit 10
    match interface cloudbr0
   ```
   
   ## OSPF
   ```
   router ospf
    redistribute kernel route-map only-cloud
    network YOUR_NETWORK/XX area 0.0.0.0
   !
   route-map only-cloud permit 10
    match interface cloudbr0
   ```
   
   # IP address pools
   As each Instance is assigned an IPv4 and/or IPv6 address there is no need to 
create a "network" inside CloudStack. The concept would be that CloudStack 
simply has a pool of addresses to choose from and allocates them to an Instance.
   
   A pool could be:
   
   - 2.57.57.80
   - 145.31.53.21
   - 90.78.37.15
   - 88.17.11.53
   
   - 2001:db8::100
   - 2001:678:3a4:100::80
   - 2a00:f10:415:27::100
   
   These addresses have no relationship with each other, but they don't have to 
as each individual address is assigned to a VM.
   
   This networking setup also allows for very easy single stack IPv6-only 
Virtual Machines where IPv4 can be added or removed when needed. There is no 
dependency on either of the two protocols.
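
A flat pool like this could be modelled roughly as below. This is a minimal in-memory sketch, not how CloudStack stores or allocates addresses. The class `AddressPool`, its methods and the Instance names are hypothetical.

```python
import ipaddress


class AddressPool:
    """Flat pool of unrelated addresses; each goes to at most one Instance."""

    def __init__(self, addresses):
        self._free = [ipaddress.ip_address(a) for a in addresses]
        self._allocated = {}  # address -> Instance name

    def allocate(self, instance, version):
        """Allocate the first free address of the given IP version (4 or 6)."""
        for ip in self._free:
            if ip.version == version:
                self._free.remove(ip)
                self._allocated[ip] = instance
                return ip
        raise LookupError(f"no free IPv{version} address in pool")

    def release(self, address):
        """Return an address to the pool, e.g. when IPv4 is removed again."""
        ip = ipaddress.ip_address(address)
        del self._allocated[ip]
        self._free.append(ip)


pool = AddressPool(
    ["2.57.57.80", "145.31.53.21", "2001:db8::100", "2001:678:3a4:100::80"]
)
v6 = pool.allocate("vm-1", 6)  # Instance starts as IPv6-only
v4 = pool.allocate("vm-1", 4)  # IPv4 is added later
pool.release(v4)               # and removed again when no longer needed
```

Because no address depends on a subnet or on the other address family, allocation and release are independent per address.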
   
   # Summary
   This networking design completely eliminates the use of Layer 2 
broadcast/multicast domains. Each Instance becomes a full L3 routed part of the 
network where CloudStack's orchestration will make sure the addresses are 
routed towards the host on which the Instance runs.
   
   Using this setup it's very easy to create a massively scalable and reliable 
network spanning multiple datacenters as there is no shared L2 or VXLAN overlay.
   
   The most common use-case for this feature will probably be public cloud 
providers which need to assign public IPv4/IPv6 addresses to Instances and want 
to share nothing between the VMs.
   

