On 16-08-2021 at 11:29, Rohit Yadav wrote:
Thanks Hean, Kristaps, Wido for your feedback.

I think we have quorum and consensus on how we should proceed with IPv6 
support with static routing (phase1). Based on my proof-of-concept and 
discussions, I believe we may target this feature as early as 4.17, and I 
welcome the offer by Kristaps and others who may want to be involved in testing 
the feature as and when we develop it.

As the next step I'll write a short design doc on cwiki including some of the 
new ideas/suggestions and share it on this thread for another iteration.


I was thinking about this and I have another idea for phase 2.

As mentioned earlier it can be quite difficult when you have a lot of VRs running to configure OSPFv3 or BGP in the upstream routers to have this all working.

Instead we could think about using ExaBGP. With ExaBGP you would have some additional tooling which picks up the subnets assigned to VRs from the messagebus.

ExaBGP: https://github.com/Thomas-Mangin/exabgp

This tooling then dynamically injects this route into ExaBGP, where the destination of this /48 (example) points to the VR.

Inside the VR there is no need to configure OSPF or BGP. The admin doesn't need to configure static routes either.

Their routers only peer via BGP with ExaBGP which injects the routes.

See this blogpost to get an idea: https://vincent.bernat.ch/en/blog/2013-exabgp-highavailability

"Redundancy with ExaBGP"
"ExaBGP is a convenient tool to plug scripts into BGP. They can then receive and advertise routes. ExaBGP does the hard work of speaking BGP with your routers. The scripts just have to read routes from standard input or advertise them on standard output."

So we would just need to feed ExaBGP all the routes which then advertises them again towards the upstream routers.

ExaBGP could be running anywhere as long as it receives the messages from CloudStack and has a BGP connection with the routers.
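
A minimal sketch of the glue described above, assuming a helper process started by ExaBGP that writes text-API commands to standard output. The event dictionary and its keys (`action`, `prefix`, `vr_address`) are hypothetical; CloudStack's real event payload is not defined in this thread.

```python
import ipaddress
import sys


def exabgp_command(action, prefix, next_hop):
    """Build one ExaBGP text-API line that announces or withdraws a route."""
    if action not in ("announce", "withdraw"):
        raise ValueError(f"unsupported action: {action}")
    net = ipaddress.IPv6Network(prefix)    # validates the routed subnet
    hop = ipaddress.IPv6Address(next_hop)  # validates the VR's outside address
    return f"{action} route {net} next-hop {hop}"


def handle_event(event, out=sys.stdout):
    """Turn a (hypothetical) message-bus event into a command for ExaBGP."""
    cmd = exabgp_command(event["action"], event["prefix"], event["vr_address"])
    out.write(cmd + "\n")
    out.flush()  # ExaBGP reads the helper's stdout, so do not buffer
    return cmd
```

For example, `handle_event({"action": "announce", "prefix": "2001:db8:500::/48", "vr_address": "2001:db8:100::50"})` would emit a line asking ExaBGP to advertise the /48 with the VR as next hop.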

Might be worth looking into in the future for phase 2.

And again: Also in shared networks it would be very nice to be able to route a subnet towards a single Instance. Native IPv6 with Docker, VPN services with IPv6 inside a VM, etc, etc.

Wido


Regards.

________________________________
From: Wido den Hollander <w...@widodh.nl>
Sent: Friday, August 13, 2021 15:48
To: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Subject: Re: IPV6 in Isolated/VPC networks

Hi,

See my inline responses:

On 11-08-2021 at 14:26, Rohit Yadav wrote:
Hi all,

Thanks for your feedback and ideas, I've gone ahead with discussing them with 
Alex and came up with a PoC/design which can be implemented in the following 
phases:

    *   Phase1: implement ipv6 support in isolated networks and VPC with static 
routing
    *   Phase2: discuss and implement support for dynamic routing (TBD)

For Phase1 here's the high-level proposal:

    *   IPv6 address management:
       *   At the zone level root-admin specifies a /64 public range that will 
be used for VRs, then they can add a /48, or /56 IPv6 range for guest networks 
(to be used by isolated networks and VPC tiers)
       *   On creation of any IPv6 enabled isolated network or VPC tier, from 
the /48 or /56 block a /64 network is allocated/used
       *   We assume SLAAC and autoconfiguration, no DHCPv6 in the zone 
(discuss: is privacy a concern, can privacy extensions rfc4941 of slaac be 
explored?)

Privacy Extensions are only a concern for client devices which roam
between different IPv6 networks.

If the IPv6 address of a client keeps the same suffix (MAC based) and
the client switches networks, then only the prefix (/64) will change.

This way a network like Google, Facebook, etc could track your device
moving from network to network if they only look at the last 64-bits of
the IPv6 address.

For servers this is not a problem as you already know in which network
they are.

    *   Network offerings: root-admin can create new network offerings (with 
VPC too) that specifies a network stack option:
       *   ipv4 only (default, for backward compatibility all 
networks/offerings post-upgrade migrate to this option)
       *   ipv4-and-ipv6
       *   ipv6-only (this can be phase 1.b)
       *   A new routing option: static (phase1), dynamic (phase2, with 
multiple sub-options such as ospf/bgp etc...)

This means that the network admin will need to statically route the IPv6
subnet to the VR's outside IPv6 address, for example, on a JunOS router:

set routing-options rib inet6.0 static route 2001:db8:500::/48 next-hop 2001:db8:100::50

I'm assuming that 2001:db8:100::50 is the address of the VR on the
outside (/64) network. In reality this will probably be a longer
address, but this is for just the example.

    *   VR changes:
       *   VR gets its guest and public nics set to inet6 auto
       *   For each /64 allocated to guest network and VPC tiers, radvd is 
configured to do RA

radvd is fine, but looking at phase 2 with dynamic routing you might
already want to look into FRRouting. FRR can also advertise RAs while
not doing any routing.

interface ens4
    no ipv6 nd suppress-ra
    ipv6 nd prefix 2001:db8:500::/64
    ipv6 nd rdnss 2001:db8:400::53 2001:db8:200::53

See: http://docs.frrouting.org/en/latest/ipv6.html

       *   Firewall: a new ipv6 zone/chain is created for ipv6 where ipv6 
firewall rules (ACLs, ingress, egress) are implemented; ACLs between VPC tiers 
are managed/implemented by ipv6 firewall on VR

Please take a look at the existing security_group.py script which
implements RFC4890

https://datatracker.ietf.org/doc/html/rfc4890

ICMPv6 is a vital part of IPv6 and certain packets should always be allowed.
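
As an illustration only, the sketch below generates ip6tables rules for a conservative subset of the ICMPv6 types that RFC 4890 says should not be dropped. It does not reproduce security_group.py; the function name and the choice of chain are assumptions, and the RFC should be consulted for the full list.

```python
# ICMPv6 types that should keep working (subset of RFC 4890's recommendations)
ESSENTIAL_ICMPV6_TYPES = {
    1: "destination unreachable",
    2: "packet too big (needed for path MTU discovery)",
    3: "time exceeded",
    4: "parameter problem",
    128: "echo request",
    129: "echo reply",
    133: "router solicitation",
    134: "router advertisement",
    135: "neighbour solicitation",
    136: "neighbour advertisement",
}


def icmpv6_accept_rules(chain="INPUT"):
    """Return one ip6tables command per essential ICMPv6 type."""
    return [
        f"ip6tables -A {chain} -p icmpv6 --icmpv6-type {icmp_type} -j ACCEPT"
        for icmp_type in ESSENTIAL_ICMPV6_TYPES
    ]
```

Types 133 to 136 are what make Router Advertisements and neighbour discovery work, so dropping them would break SLAAC on the guest network.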

       *   It is assumed that static routes are created on the core/main router 
by the admin or automated using some scripts/tools; for this CloudStack will 
announce events with details of /64 networks and VR's public IPv6 address that 
can be consumed by a rabbitmq/message bus client (for example), or a custom 
cron job or script as part of orchestration. (this wouldn't be necessary for 
dynamic routing bgp with phase2)

You would only need to announce the /48 or /56 allocated to the VR,
that's all. You don't need to inform the upstream router about the /64
subnets created within that larger subnet.
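
A small sketch of that point, assuming a hypothetical helper that a route-automation script might use: it checks that every guest /64 falls inside the block assigned to the VR and then emits a single upstream route for the whole block (shown here in Linux `ip -6 route` syntax).

```python
import ipaddress


def upstream_route(block, guest_networks, vr_address):
    """Return the one static route the upstream router needs for this VR."""
    block = ipaddress.IPv6Network(block)
    for net in guest_networks:
        # Each guest /64 must be covered by the larger block routed to the VR
        if not ipaddress.IPv6Network(net).subnet_of(block):
            raise ValueError(f"{net} is not inside {block}")
    return f"ip -6 route add {block} via {ipaddress.IPv6Address(vr_address)}"
```

However many /64 networks are created behind the VR, the upstream router still only needs this one entry.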

    *   Guest Networking: With SLAAC, it's easy for CloudStack to calculate, 
allocate and use a /64 and determine the IPv6 address of VR nics and guest VM 
nics
       *   A user creates an isolated network/VPC with an offering that is ipv6 
enabled
       *   A user can manage the firewall for the IPv6 address/guest nics; 
there'll be no port forwarding or LB feature for IPv6, though
       *   Users can run workloads in the guest VMs that listen on publicly 
routable ipv6 addresses
       *   Usage/billing etc continue to work, no change needed

Network layout:

[core/ISP router] -> [VR] -> [guest network or VPC tier on a VLAN] -> [guest 
VMs/nics]
*core/ISP router needs static routes to be added (manually or automated), 
assumes a /48 or /56 configured for the zone

Thoughts, feedback?

Looks doable!

Side-note: It would be very cool if you could use parts of this
implementation to also route /48, /56, or /60 subnets to individual VMs
in Shared networks.

Why? This allows for running Docker containers with native IPv6 inside
the VM or running a (Open)VPN server inside a VM which then assigns
public IPv6 addresses to clients connected.

Instead of routing the subnet to a VR we route the subnet to a single
instance in a shared network.

If we could then also move these subnets between Instances easily one
can quickly migrate to a different instance while keeping the same IPv6
subnet.

Wido


Proof-of-concept commentary: here's what I did to test the idea:

    *   Created an isolated network and deployed a VM in my home lab
The VR running on KVM has following nics
eth0 - guest network
eth1 - link local
eth2 - public network

    *   I set up a custom openwrt router on an RPi4 to serve as a toy core router, 
where I created a wan6 IPv6 tunnel using tunnel broker and got a /48 
allocated. My configuration looks like:
/48 - 2001:470:ed36::/48 (allocated by tunnel broker)
/64 - 2001:470:36:3e2::/64 (default allocated by)

I create a LAN ipv6 (public network for CloudStack VR): at subnet/prefix 0:
LAN IPv6 address: 2001:470:ed36:0::1/64
Address mode: SLAAC+stateless DHCP (no dhcpv6)
    *   In the isolated VR, I enabled ipv6 as:
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
net.ipv6.conf.all.accept_ra = 1
net.ipv6.conf.all.accept_redirects = 1
net.ipv6.conf.all.autoconf = 1

Set up an IPv6 nameserver/dns in /etc/resolv.conf
And configured the nics:
echo iface eth0 inet6 auto >> /etc/network/interfaces
echo iface eth2 inet6 auto >> /etc/network/interfaces
/etc/init.d/networking restart
Next, I restarted the ACS isolated network without cleanup to have it 
reconfigure IPv4 nics, firewall, NAT etc

    *   Next, I created a /64 network for the isolated guest network on eth0 of 
the VR using radvd:

# cat /etc/radvd.conf
interface eth0
{
      AdvSendAdvert on;
      MinRtrAdvInterval 5;
      MaxRtrAdvInterval 15;
      prefix 2001:470:ed36:1::/64
      {
          AdvOnLink on;
          AdvAutonomous on;
      };
};
systemctl restart radvd
All guest VM nics and the VR's eth0 get an IPv6 address (SLAAC) in this 
...:1::/64 network
    *   Finally I added a static route in toy core-router for the new /64 IPv6 
range in the isolated network
2001:470:ed36:1::/64 via <public IPv6 address of the VR on eth2> dev <local lan 
nic>
    *   ... and I enabled firewall rules to allow any traffic to pass for the 
new /64 network

And voila, all done! I created an AAAA record for a domain that points to my 
guest VM's IPv6 address, with a test webserver running at
http://ipv6-isolated-ntwk-demo.yadav.cloud/

(Note: I'll get rid of the tunnel and request a new /48 block after a few days, 
sharing this solely for testing purposes)


Regards.

________________________________
From: Wido den Hollander <w...@widodh.nl>
Sent: Tuesday, July 20, 2021 12:46
To: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Subject: Re: IPV6 in Isolated/VPC networks



On 19-07-2021 at 20:38, Kristaps Cudars wrote:
Hi Wido,

I assume that a floating IP will not work great with ingress/egress ACLs on the VR.

   From a regular ACS user's perspective:
I have a dual-stack instance running a web app on 443.
I want to swap instances for whatever reason.
In the case of IPv4, I change the d-nat rule.
In the case of IPv6, if a floating IP was not created upfront, the user will need 
to change the dns entry, which usually has a 24h ttl. That is an inconvenience 
and a degradation in experience.


Yes, but, keep in mind that the IP you are using can also be terminated
on the VR where HAProxy proxies request to the backend VM (could even be
v4!)

I'm not against DHCPv6, but I have seen many issues with implementing
it. Therefore I always stick to SLAAC.

   From ACS admin perspective:
I don’t want to have these tickets in the helpdesk.
"You needed to create another floating IP for it to be seamless" will not 
work as an answer.


I understand that as well.

Wido


On 2021/07/19 09:05:54, Wido den Hollander <w...@widodh.nl> wrote:


On 16-07-2021 at 21:46, Kristaps Cudars wrote:
Hi Wido,

Your proposal is to sacrifice ability to reassign IPv6 to instance, have 
internal domain prefix, and list/db in ACS what IPv6 has been assigned to what 
instance and go with RA and SLAAC. For route signaling to switch use BGP/OSPFv3 
or manual pre-creation.


You can still list the IPs which have been assigned. You'll know exactly
what IPv6 address a VM has because of the prefix + MAC. Privacy
Extensions need to be disabled in the VM.

This already works in CloudStack in Shared Networks in this way.
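
To make the "prefix + MAC" calculation concrete, here is a sketch of the modified EUI-64 derivation that SLAAC uses when Privacy Extensions are off. The function name and the example MAC are illustrative assumptions, not CloudStack code.

```python
import ipaddress


def slaac_eui64_address(prefix, mac):
    """Derive the SLAAC address a NIC forms from a /64 prefix and its MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC requires a /64 prefix")
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet and insert ff:fe
    # in the middle of the MAC to form the 64-bit interface identifier.
    iid = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return net[int.from_bytes(iid, "big")]
```

With prefix `2001:db8:500:1::/64` and MAC `52:54:00:12:34:56` this yields `2001:db8:500:1:5054:ff:fe12:3456`.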

Using secondary IPs you can always have 'floating' IPv6 addresses.

Wido

Option with RA and managed flag that DHCPv6 is in use to support preset 
information and ability to create route information from ACS is not an option 
as DHCPv6 its failing?


On 2021/07/16 15:17:42, Wido den Hollander <w...@widodh.nl> wrote:


On 16-07-2021 at 16:42, Hean Seng wrote:
Hi Wido,

In the current setup, each CloudStack account has its own VR, so with this new
IPv6 subnet allocation, each VR (which runs FRR) will need to peer with the ISP
router (via either BGP or a static route). If there are 1000 accounts, there
will be 1000 BGP sessions with the ISP router. Am I right about this, or do I
understand it wrong?


Yes, that is correct. A /56 would also be sufficient or a /60 which is
enough to allocate a few /64 subnets.

1000 BGP connections isn't really a problem for a proper router at the
ISP. OSPF(v3) would be better, but as I said that's poorly supported.

The ISP could also install 1000 static routes, but that means that the
ISP's router needs to have those configured.

http://docs.frrouting.org/en/latest/ospf6d.html
(While looking up this URL I see that Frr recently put in a lot of work
in OSPFv3, seems better now)

I understand IPv6 is different than IPv4, and in IPv6 each device is supposed
to have its own IP. The question is just how to realize this in an easy way.













On Fri, Jul 16, 2021 at 8:17 PM Wido den Hollander <w...@widodh.nl> wrote:



On 16-07-2021 at 05:54, Hean Seng wrote:
Hi Wido,

My initial thought was not like this. It was that the /48 is at the ISP router,
and a /64 subnet is assigned to the AdvanceZoneVR. The AdvanceZoneVR is
responsible for distributing IPv6 addresses (from the assigned /64 subnet) to
VMs, not for routing the traffic. A VM that gets an IPv6 address will have a
default route to the ISP router as gateway. It could perhaps be bridged via the
AdvanceZone-VR.


How would you bridge this? That sounds like NAT?

IPv6 is meant to be routed. Not to be translated or bridged in any way.

The way I made the drawing is exactly how IPv6 should work in a VPC
environment.

Traffic flows through the VR where it can do firewalling of the traffic.

However, if we do it the way described in the drawing, then I suppose another
kind of virtual router will be introduced to hold the /48, right?


It can be the same VR. But keep in mind that IPv6 != IPv4.

The VR will get Frr as a new daemon which can talk BGP with the upper
network to route traffic.

After this, will the Advance Zone NAT VR peer with this new IPv6 VR
to get the IPv6 /64 prefix?


IPv4 will be behind NAT, but IPv6 will not be behind NAT.

If we do it this way, then I guess you only need a static route with the
peering IPs at both ends, as one /48 can hold a lot of /64s. And hardware
budgeting for the new IPv6-VR will become very important, as all traffic will
need to pass through it.


Routing or NAT is the same for the VR. You don't need a very beefy VR
for this.

It will be like

ISP Router  ------ >  (new IPV6-VR ) ---- > AdvanceZone-VR ----> VM

The relationship between the new IPv6-VR and the AdvanceZone-VR may need to be
based on OSPF instead of BGP; otherwise a few thousand AdvanceZone-VRs will
have a few thousand BGP sessions on the new IPv6-VR.

Also, I suppose we cannot do ISP router --> AdvanceZone-VR directly; otherwise
the ISP router will be full of /64 prefix routes, either via BGP (many BGP
sessions) or many static routes. With a few thousand accounts, there will be
a few thousand BGP sessions with the ISP router or a few thousand static
routes, which is not possible.






On Thu, Jul 15, 2021 at 10:47 PM Wido den Hollander <w...@widodh.nl>
wrote:

But you still need routing. See the attached PNG (and draw.io XML).

You need to route the /48 subnet TO the VR which can then route it to
the Virtual Networks behind the VR.

There is no other way than routing with either BGP or a Static route.

Wido

On 15-07-2021 at 12:39, Hean Seng wrote:
Or to explain it like this:

1) CloudStack generates a list of /64 subnets from the /48 that the network
admin assigned to CloudStack
2) CloudStack allocates a subnet (generated in step 1) to each Virtual
Router; one Virtual Router has one /64 subnet
3) The Virtual Router allocates a single IPv6 address (within the range of the
/64 allocated to the VR) to each VM






On Thu, Jul 15, 2021 at 6:25 PM Hean Seng <heans...@gmail.com> wrote:

          Hi Wido,

          I think the /48 is at the physical router as gateway, and a /64 subnet
          is at the VR of CloudStack. CloudStack only keeps the /48 prefix and
          the vlan information of this /48, to later split it into /64s for
          the VRs.

          And each instance gets a single IPv6 address from the /64. The VR
          gets the /64. The default gateway should point to the physical
          router's IP in the /48. In this case, no BGP router is needed.


          Similar concept as IPv4 :

          A /48 subnet of IPv6 is equivalent to the current /24 subnet of IPv4
          that is created in Network,
          and a /64 of IPv6 is equivalent to a single IPv4 address assigned to
          a VM.




          On Thu, Jul 15, 2021 at 5:31 PM Wido den Hollander <w...@widodh.nl> wrote:



              On 14-07-2021 at 16:44, Hean Seng wrote:
               > Hi
               >
               > I replied in another thread; I think we do not need to implement
               > BGP or OSPF, as that would be complicated.
               >
               > We only need to assign an IPv6 /64 prefix to the Virtual Router
               > (VR) in the NAT zone, and the VR is responsible for delivering a
               > single IPv6 address to each VM via DHCPv6.
               >
               > In the VR, you need a default IPv6 route to the physical
               > router's /48 IP as the IPv6 gateway. Then it should be done.
               >
               > Example :
               > Physical Router Interface
               >   IPv6 IP : 2000:aaaa::1/48
               >
               > Cloudstack virtual router: 2000:aaaa:200:201::1/64 with a
               > default ipv6 route to router ip 2000:aaaa::1,
               > and the Cloudstack virtual router's dhcp allocates IPs to VMs,
               > and each VM will have a default route to the VR's IPv6 address
               > 2000:aaaa:200:201::1
               >
               > So CloudStack needs to allow the user to enter the IPv6
               > gateway and the /48 IPv6 prefix; then it will allocate the /64s
               > to the VRs itself and make sure allocations do not overlap.
               >
               >

              But NAT is truly not the solution with IPv6. IPv6 is supposed to be
              routable. In addition you should avoid DHCPv6 as much as
              possible, as that's not really the intended use-case for address
              allocation with IPv6.

              In order to route a /48 IPv6 subnet to the VR you have a few
              possibilities:

              - Static route from the upstream routers which are outside of
              CloudStack
              - BGP
              - OSPFv3 (broken in most cases!)
              - DHCPv6 Prefix Delegation

              BGP and/or Static routes are still the best bet here.

              So what you do is tell CloudStack that you will route
              2001:db8::/48 to the VR; the VR can then split it up into
              multiple /64 subnets going towards the instances:

              - 2001:db8::/64
              - 2001:db8:0:1::/64
              - 2001:db8:0:2::/64
              ...
              - 2001:db8:0:f::/64

              And go on.
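
A short sketch of that split using Python's standard `ipaddress` module; the function name is an assumption for illustration. Within 2001:db8::/48 the /64 subnets differ in the fourth 16-bit group of the address.

```python
import ipaddress
from itertools import islice


def guest_subnets(block, count):
    """Return the first `count` /64 subnets carved out of a larger block."""
    block = ipaddress.IPv6Network(block)
    # subnets() is a generator, so only the requested networks are built
    return [str(net) for net in islice(block.subnets(new_prefix=64), count)]
```

`guest_subnets("2001:db8::/48", 3)` returns `2001:db8::/64`, `2001:db8:0:1::/64` and `2001:db8:0:2::/64`. A /48 holds 65536 such /64 networks and a /56 holds 256.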

              In case of BGP you indeed have to tell the VR a few things:

              - Its own AS number
              - The peer's address(es)

              With FRR you can simply say:

              neighbor 2001:db8:4fa::179 remote-as external

              You need to have the /48 at the VR anyway, with either a
              static route or BGP.

              We just need to add a null route on the VR for that /48 so that
              traffic will not be routed to the upstream gateway in case the VR
              can't find a route.
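
The effect of that null route can be sketched with a toy longest-prefix-match lookup. The route table below is a made-up example, not the VR's real routing table.

```python
import ipaddress


def lookup(routes, destination):
    """Pick the target of the most specific route that covers the address."""
    dest = ipaddress.IPv6Address(destination)
    matches = [
        (ipaddress.IPv6Network(prefix), target)
        for prefix, target in routes
        if dest in ipaddress.IPv6Network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix wins, as in a real routing table
    return max(matches, key=lambda match: match[0].prefixlen)[1]


ROUTES = [
    ("::/0", "upstream gateway"),              # default route
    ("2001:db8::/48", "blackhole"),            # null route for the whole block
    ("2001:db8:0:1::/64", "guest network 1"),  # an allocated guest /64
]
```

An address in an allocated /64 reaches its guest network, while an address in an unallocated part of the /48 hits the blackhole. Without the null route it would follow the default route back upstream, and the upstream router would send it to the VR again, creating a loop.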

              Wido

               >
               >
               >
               >
               >
               > On Wed, Jul 14, 2021 at 8:55 PM Alex Mattioli
               > <alex.matti...@shapeblue.com> wrote:
               >
               >     Hi Wido,
               >     That's pretty much in line with our thoughts, thanks for
               >     the input.
               >     I believe we agree on the following points then:
               >
               >     - FRR with BGP (no OSPF)
               >     - Route /48 (or /56) down to the VR
               >     - /64 per network
               >     - SLAAC for IP addressing
               >
               >     I believe the next big question is then "on which level
               >     of ACS do we manage AS numbers?". I see three options:
               >     1) Private AS number on a per-zone basis
               >     2) Root Admin assigned AS number on a domain/account basis
               >     3) End-user driven AS number on a per network basis (for
               >     bring your own AS and IP scenario)
               >
               >     Thoughts?
               >
               >     Cheers
               >     Alex
               >
               >
               >
               >
               >     -----Original Message-----
               >     From: Wido den Hollander <w...@widodh.nl>
               >     Sent: 13 July 2021 15:08
               >     To: dev@cloudstack.apache.org;
               >     Alex Mattioli <alex.matti...@shapeblue.com>
               >     Cc: Wei Zhou <wei.z...@shapeblue.com>; Rohit Yadav
               >     <rohit.ya...@shapeblue.com>;
               >     Gabriel Beims Bräscher <gabr...@pcextreme.nl>

               >     Subject: Re: IPV6 in Isolated/VPC networks
               >
               >
               >
               >     On 7/7/21 1:16 PM, Alex Mattioli wrote:
               >      > Hi all,
               >      > @Wei Zhou <wei.z...@shapeblue.com>, @Rohit Yadav
               >      > <rohit.ya...@shapeblue.com> and myself are investigating
               >      > how to enable IPV6 support on Isolated and VPC networks and
               >      > would like your input on it.
               >      > At the moment we are looking at implementing FRR with
               >      > BGP (and possibly OSPF) on the ACS VR.
               >      >
               >      > We are looking for requirements, recommendations,
               >      > ideas, rants, etc...etc...
               >      >
               >
               >     Ok! Here we go.
               >
               >     I think that you mean that the VR will actually route the
               >     IPv6 traffic and for that you need to have a way of getting
               >     a subnet routed to the VR.
               >
               >     BGP is probably your best bet here. Although OSPFv3
               >     technically supports this, it is very badly implemented in
               >     FRR, for example.
               >
               >     Now FRR is a very good router and one of the fancy
               >     features it supports is BGP Unnumbered. This allows for auto
               >     configuration of BGP over a L2 network when both sides are
               >     sending Router Advertisements. This makes flexible BGP
               >     configurations very easy where both sides have dynamic IPs.
               >
               >     What you want to do is get a /56, /48 or something else
               >     which is larger than a /64 routed to the VR.
               >
               >     Now you can sub-segment this into separate /64 subnets.
               >     You don't want to go smaller than a /64, as that prevents
               >     you from using SLAAC for IPv6 address configuration. This is
               >     how it works for Shared Networks now in Basic and Advanced
               >     Zones.
               >
               >     FRR can now also send out the Router Advertisements on
               >     the downlinks, sending out:
               >
               >     - DNS servers
               >     - DNS domain
               >     - Prefix (/64) to be used
               >
               >     There is no need for DHCPv6. You can calculate the IPv6
               >     address the VM will obtain by using the MAC and the prefix.
               >
               >     So in short:
               >
               >     - Using BGP you route a /48 to the VR
               >     - Now you split this into /64 subnets towards the
               >     isolated networks
               >
               >     Wido
               >
               >      > Alex Mattioli
               >      >
               >      >
               >      >
               >      >
               >
               >
               >
               > --
               > Regards,
               > Hean Seng



          --
          Regards,
          Hean Seng



--
Regards,
Hean Seng









