On 7/21/21 5:08 PM, Vladimir Dombrovski wrote:
> Hello Andrija,
> 
> Thank you for the swift response. You did correctly understand our setup, 
> which is aimed towards disaster recovery (DRP), rather than business 
> continuity (BCP). 
> 
> From a strictly technical point of view, we know that running a single 
> cluster is the only way to migrate compute resources between our datacenters 
> when using KVM. Our Ceph storage is indeed running as a stretch cluster, and 
> is provisioned with a CRUSH map that allows for the loss of a datacenter.
> 

Don't do a stretch Ceph cluster. The performance will be horrible.

Instead use RBD mirroring for this purpose.
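
A rough sketch of what that looks like with snapshot-based mirroring
(untested here; the pool name 'cloudstack', the site names and the image
name are just placeholders, and the rbd-mirror daemon has to run on the
DR cluster):

  # on both clusters: enable per-image mirroring on the CloudStack pool
  rbd mirror pool enable cloudstack image

  # on DC1: create a bootstrap token, then import it on DC2
  rbd mirror pool peer bootstrap create --site-name dc1 cloudstack > token
  rbd mirror pool peer bootstrap import --site-name dc2 cloudstack token

  # per image (IMAGE is a placeholder)
  rbd mirror image enable cloudstack/IMAGE snapshot

Snapshot-based mirroring is asynchronous, so it fits the DRP (not BCP)
goal you described.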

You already have the MySQL DB replicated, so then it's just 'a matter'
of getting a mgmt server running there and having the hosts connect to it.
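
(For what it's worth, the agents find the management server through the
'host=' key in /etc/cloudstack/agent/agent.properties, which as far as I
know also takes a comma-separated list - the IPs below are placeholders:

  # /etc/cloudstack/agent/agent.properties
  host=10.0.1.10,10.0.2.10
  port=8250

so pointing the agents at both managers, or at a VIP, is only a
configuration change on the hosts.)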

You will need to do some fancy routing to get your IP-space working in
the other DC as well, but then it's possible.

> We have currently tested the 4.15 release of Cloudstack, which doesn't yet 
> integrate the changes you've described in (1). However, because the latency 
> between our datacenters is quite low (~2ms), we could accept less isolation 
> between entities.
> 
> What we are looking into, however, is ensuring that in our configuration any 
> and all resources point towards the correct local endpoint. To be more 
> specific, these endpoints include:
> 
> - The address of the manager, which we could technically move to a VIP, as 
> our hosts are on the same subnet. From there, as we have master-master 
> MySQL replication set up, the agents on the DR site should reconnect with 
> the backup manager without much issue (we are still testing this 
> hypothesis).
> - The address of our primary storage, with which we're struggling. 
> Technically we have a Ceph endpoint on each site, and we would like to keep 
> it that way. This implies that we add multiple addresses for the same 
> storage. This is possible in libvirt, as you can define multiple <host> tags 
> in the <source> declaration XML. We haven't found a way to do the same via 
> Cloudstack.
> - The address of our secondary storage, which in our case is NFS and is 
> simply backed by Ceph using a different pool.
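
On the libvirt point above: the disk XML that libvirt accepts does indeed
look roughly like this (sketch only - the pool, volume and monitor names
are placeholders, and the <auth>/cephx part is omitted):

  <disk type='network' device='disk'>
    <source protocol='rbd' name='cloudstack/VOLUME-UUID'>
      <host name='mon-dc1.example.com' port='6789'/>
      <host name='mon-dc2.example.com' port='6789'/>
    </source>
    <target dev='vda' bus='virtio'/>
  </disk>

but as you note, CloudStack generates that XML itself from the single
monitor address configured on the primary storage, so there is no knob
for a second <host> today.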
> 
> All this being said, I believe that we would eventually brute-force our way 
> into a working setup that would look similar to the solution (2) you've 
> described. We are still looking for ways to do this more elegantly, so we 
> would be glad to hear any further ideas.
> 
> As a side note, are the DB hacks you have mentioned simple IP replacements? 
> Or are there deeper modifications to be made?
> 
> Vladimir
> 
> On 2021/07/21 13:48:29, Andrija Panic <andrija.pa...@gmail.com> wrote: 
>> Migration between zones is NOT possible in any shape or form, so this is a
>> route you should, IMO, abandon (you can always export VMs one way or
>> another, but this is not feasible in production)
>>
>> I understand you have 2 DCs and you want VMs to, eventually, come alive
>> in the 2nd DC if a plane crashes into the 1st DC? (well, your data is there,
>> unless CEPH is stretched/distributed across the 2 DCs and could survive the
>> whole of DC1 going down)
>>
>> If you are insisting on that HA level - then you could do it in 2 ways
>> that cross my mind right now.
>> (CEPH as distributed storage, zone-wide, some nodes in DC1, some in DC2 -
>> make sure your CEPH setup survives a whole DC going down; this requires
>> that the CRUSH map is correctly configured, etc.)
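
For completeness, the kind of CRUSH rule meant here ("2 replicas in each
of 2 datacenters") looks roughly like this - sketch only, assuming a pool
with size=4, datacenter buckets in the CRUSH tree and the default root:

  rule replicated_two_dcs {
      id 1
      type replicated
      min_size 1
      max_size 10
      step take default
      # pick both datacenter buckets
      step choose firstn 2 type datacenter
      # then two different hosts inside each datacenter
      step chooseleaf firstn 2 type host
      step emit
  }

I'd still avoid the stretched setup for the performance reasons above.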
>>
>>
>> (1)   DC1 = Pod1 (1/2/3) and DC2 = Pod2 (or Pod 4/5/6 etc) - i.e. multiple
>> Pods per DC - they will all be using zone-wide Ceph storage - your VMs are
>> on your storage, that is the crucial part to not lose data.
>> -- you can't really migrate VMs between Pods, only within a cluster (and in
>> some cases between clusters in the same Pod, starting from 4.16)
>> -- this is OK if you don't have low-enough latency between DC1 and DC2 (but
>> then CEPH will also suffer from that higher latency)
>>
>> (2) A very untypical, not recommended, but technically possible setup -
>> DC1+DC2 = one large DC = 1 POD = 1 cluster (or more clusters if needed) -
>> still using CEPH as before
>> -- requires ultra-low latency between DC1 and DC2 - and if a plane crashes
>> into DC1 (taking this example, as I've been to some Zurich DCs next to the
>> airport...) - you can still start the VMs on hosts in DC2 if it was a
>> single cluster. If you had multiple clusters - then it gets more
>> complicated (minor DB hacks) etc.
>>
>> In both cases you still have to sort out Secondary Storage NFS HA....
>>
>> In general, you can't achieve what you want that easily, nor should you be
>> stretching the possibilities (the ones I just explained, as I would probably
>> never use them in production)
>>
>> I guess I didn't help - but there you go.
>>
>> Andrija
>>
>>
>>
>> On Tue, 20 Jul 2021 at 16:28, Vladimir Dombrovski <
>> vladimir.dombrov...@bso.co> wrote:
>>
>>> Hello,
>>>
>>> We're trying to design a multisite architecture where any VM could be
>>> relocated to the secondary site whenever the primary site fails
>>> (primary/backup for disaster recovery purposes). We don't require live
>>> migration, and we are okay with shutting down machines in order to relocate
>>> them.
>>>
>>> We are using Cloudstack 4.15 on Ubuntu Focal. In our current setup, each
>>> datacenter has a Cloudstack management node, as well as a few hypervisors
>>> running KVM and a Cloudstack agent. We're using Ceph as our primary
>>> storage, and NFS as our secondary storage on each site.
>>>
>>> To ensure metadata resiliency, we've replicated the MySQL database across
>>> both sites, much as described in this guide:
>>>
>>>
>>> https://docs.cloudstack.apache.org/projects/cloudstack-installation/en/4.11/choosing_deployment_architecture.html#multi-site-deployment
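
The usual master-master knobs for that look something like this (sketch
only - server IDs and offsets are placeholders, not taken from the guide):

  # my.cnf on site 1
  [mysqld]
  server-id                = 1
  log-bin                  = mysql-bin
  binlog_format            = ROW
  auto_increment_increment = 2
  auto_increment_offset    = 1

  # site 2 uses server-id = 2 and auto_increment_offset = 2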
>>>
>>> We tried setting up multiple zones, one for each datacenter, each one
>>> having its own primary storage, but we are faced with the issue where we
>>> are not able to migrate VMs across zones (only Pod/Cluster/Host level is
>>> available via the GUI and the Cloudmonkey CLI).
>>>
>>> Are we using the right level of abstraction for our case? If so, how can
>>> we migrate a VM (compute + storage) from one zone to another? If not, what
>>> is the right level to use that allows us to use two separate primary
>>> storage endpoints and ensures that only the primary site gets used for
>>> compute resource allocation in normal conditions?
>>>
>>> Also, we would like to know whether there is some documentation already
>>> touching on the subject of best practices when performing these "more
>>> advanced" deployments.
>>>
>>> Kind regards,
>>>
>>> Vladimir DOMBROVSKI
>>>
>>
>>
>> -- 
>>
>> Andrija Panić
>>
