tverkade commented on issue #12761:
URL: https://github.com/apache/cloudstack/issues/12761#issuecomment-4025033986

   > [@tverkade](https://github.com/tverkade), I am right in understanding 
that you are basically asking for defensive code against an out-of-band change 
in Ceph, am I? As you say/imply we could also natively support snapshot 
mirroring, which would be a good feature.
   
   @DaanHoogland  yes. Even if the scheduling is not managed by CloudStack (it 
would fall back to the default schedule applied to the pool), it would be 
excellent if creating an RBD image in CloudStack on a specific primary storage 
target could also enable the snapshot mirroring feature for that RBD image.
   
   So, for the example in my initial issue, creating an RBD image in CloudStack 
that results in the image ID `df1e2a8b-00ff-4854-b2d4-f8a2d685fa1e` in pool 
`cloudstack` would require snapshot mirroring to be enabled with:
   `rbd mirror image enable cloudstack/df1e2a8b-00ff-4854-b2d4-f8a2d685fa1e snapshot`
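   For illustration only, here is a minimal Python sketch of how a script could 
build and run that per-image call. It assumes the `rbd` CLI and a usable Ceph 
keyring are present on the host; the function names `mirror_enable_cmd` and 
`enable_snapshot_mirroring` are hypothetical and are not part of CloudStack.

```python
import subprocess


def mirror_enable_cmd(pool: str, image_id: str) -> list[str]:
    """Build the rbd command that enables snapshot-based mirroring for one image."""
    return ["rbd", "mirror", "image", "enable", f"{pool}/{image_id}", "snapshot"]


def enable_snapshot_mirroring(pool: str, image_id: str) -> None:
    """Run the command; raises CalledProcessError if rbd exits non-zero."""
    # Assumption: rbd is on PATH and can authenticate against the cluster.
    subprocess.run(mirror_enable_cmd(pool, image_id), check=True)


if __name__ == "__main__":
    # Only prints the command so the sketch can be tried without a Ceph cluster.
    cmd = mirror_enable_cmd("cloudstack", "df1e2a8b-00ff-4854-b2d4-f8a2d685fa1e")
    print(" ".join(cmd))
```

   Keeping the command construction separate from the `subprocess` call makes 
it possible to check the generated command without touching a cluster.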
   
   That would solve the biggest issue I'm dealing with right now, but it only 
addresses enabling mirroring. The other issue is that if you simply enable 
snapshot mirroring, the image defaults to the pool's snapshot schedule. This is 
a problem because without staggering the schedule, all VMs on this storage take 
snapshots and replicate changes at exactly the same time, which could cause 
noticeable performance issues. The other piece is that, in order to maintain 
crash consistency, all disks on a particular VM need to have a snapshot taken 
at the exact same time, or the data on the remote side will be inconsistent 
with the data on the source.
   
   I'm currently using a series of custom systemd units that trigger scripts to:
   1) Query the CloudStack API for all VMs and their disks that are configured 
on the desired RBD pools
   2) Query the images in Ceph to see which already have a snapshot schedule 
and which do not
   3) If a new disk is created on a VM that already has other disks with a 
snapshot schedule, apply a schedule that matches the other disk(s)
   4) If this is a new VM, generate a random offset between 1 and 55 
minutes and use that for an hourly snapshot schedule
   5) Query the CloudStack API for any images marked for deletion that reside 
on RBD pools with mirroring, and disable mirroring so the disks can be expunged
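   The schedule selection in steps 3 and 4 can be sketched roughly as follows. 
This is not my actual script, just a simplified Python illustration: 
`pick_offset` and `schedule_add_cmd` are hypothetical names, and the argument 
layout for `rbd mirror snapshot schedule add` (interval `1h` followed by a 
start time of `00:MM:00`) is an assumption that should be checked against the 
Ceph release in use.

```python
import random
from typing import Optional


def pick_offset(sibling_offsets: list[Optional[int]], rng: random.Random) -> int:
    """Return the minute offset for a disk's hourly snapshot schedule.

    sibling_offsets holds the offset of every other disk on the same VM,
    or None for a disk that has no schedule yet.
    """
    for offset in sibling_offsets:
        if offset is not None:
            # Step 3: reuse the schedule of a disk that is already scheduled,
            # so all disks of the VM snapshot at the same time.
            return offset
    # Step 4: new VM, pick a random offset between 1 and 55 minutes (inclusive).
    return rng.randint(1, 55)


def schedule_add_cmd(pool: str, image_id: str, offset_min: int) -> list[str]:
    """Build an hourly schedule command starting at the given minute offset."""
    # Assumed CLI form; verify against your Ceph version before relying on it.
    return [
        "rbd", "mirror", "snapshot", "schedule", "add",
        "--pool", pool, "--image", image_id,
        "1h", f"00:{offset_min:02d}:00",
    ]


if __name__ == "__main__":
    rng = random.Random()
    # A new disk on a VM whose other disk already snapshots at minute 17.
    offset = pick_offset([None, 17], rng)
    print(" ".join(schedule_add_cmd("cloudstack", "example-image-id", offset)))
```

   Reusing the sibling offset is what keeps all disks of one VM on the same 
schedule, while the random offset for new VMs is what spreads the load across 
the hour.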
   
   While this works, these scripts only run every 5 minutes, and as I 
previously mentioned, expunging directly in the UI is blocked until mirroring 
has been disabled.
   
   I hope this helps explain a bit more of what I'm running into.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

