If the entire gluster volume failed, I'd wipe it, set up a fresh master volume & then copy the VM DR images onto the new volume. To restart each VM after it's been restored, I'd set up a script to connect to the hypervisor's API.
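Something along these lines, for example. This is a rough, untested sketch assuming the oVirt Python SDK (ovirtsdk4); the engine URL, credentials and the "p<N>-" priority prefix in the VM names are just placeholders for whatever naming standard you actually use:

import re
import ovirtsdk4 as sdk

# Connect to the oVirt engine API (URL/credentials are placeholders).
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    insecure=True,            # use ca_file='ca.pem' in production
)
vms_service = connection.system_service().vms_service()

# Restore queue: assume a naming standard like "p1-web01", "p3-test02",
# where the p<N> prefix encodes service priority (lower = start first).
def priority(vm):
    m = re.match(r'p(\d+)-', vm.name)
    return int(m.group(1)) if m else 99

for vm in sorted(vms_service.list(), key=priority):
    vm_service = vms_service.vm_service(vm.id)
    try:
        vm_service.start()    # power the restored VM back on
        print('started %s' % vm.name)
    except sdk.Error as e:
        print('skipping %s: %s' % (vm.name, e))

connection.close()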
Of course, at the level you're speaking of, it could take a fair amount of time before the last VM is restored. As long as you've followed a naming standard, you could easily script a restore queue based on service priority. If you need something quicker than that, then you've got little choice but to go down the HA-with-a-big-fat-pipe route.

On 23 Mar 2017 18:46, "Gandalf Corvotempesta" <[email protected]> wrote:

> The problem is not how to back up, but how to restore.
> How do you restore a whole cluster made of thousands of VMs?
>
> If you move all VMs to a shared storage like gluster, you should
> consider how to recover everything from a gluster failure.
> If you had a bunch of VMs on each server with local disks, you had to
> recover only the VMs affected by a single server failure, but moving
> everything to a shared storage means being prepared for a disaster,
> where you *must* restore everything (hundreds of TB).
>
> 2017-03-23 23:07 GMT+01:00 Gambit15 <[email protected]>:
> > Don't snapshot the entire gluster volume, keep a rolling routine for
> > snapshotting the individual VMs & rsync those.
> > As already mentioned, you need to "itemize" the backups - trying to
> > manage backups for the whole volume as a single unit is just crazy!
> >
> > Also, for long term backups, maintaining just the core data of each
> > VM is far more manageable.
> >
> > I settled on oVirt for our platform, and do the following...
> >
> > A cronjob regularly snapshots & clones each VM, whose image is then
> > rsynced to our backup storage;
> > The backup server snapshots the VM's image backup volume to maintain
> > history/versioning;
> > These full images are only maintained for 30 days, for DR purposes;
> > A separate routine rsyncs the VM's core data to its own data backup
> > volume, which is snapshotted & maintained for 10 years.
> >
> > This could be made more efficient by using guestfish to extract the
> > core data from the backup image, instead of basically rsyncing the
> > data across the network twice.
> >
> > The active storage layer uses Gluster on top of XFS & LVM. The backup
> > storage layer uses a mirrored storage unit running ZFS on FreeNAS.
> > This of course doesn't allow for HA in the case of the entire cloud
> > failing. For that we'd use geo-rep & a big fat pipe.
> >
> > D
> >
> > On 23 March 2017 at 16:29, Gandalf Corvotempesta
> > <[email protected]> wrote:
> >>
> >> Yes, but the biggest issue is how to recover.
> >> You'll need to recover the whole storage, not a single snapshot, and
> >> this can last for days.
> >>
> >> On 23 Mar 2017 at 9:24 PM, "Alvin Starr" <[email protected]> wrote:
> >>>
> >>> For volume backups you need something like snapshots.
> >>>
> >>> If you take a snapshot A of a live volume L, that snapshot stays at
> >>> that moment in time, and you can rsync it to another system or use
> >>> something like deltacp.pl to copy it.
> >>>
> >>> The usual process is to delete the snapshot once it's copied and
> >>> then repeat the process again when the next backup is required.
> >>>
> >>> That process does require rsync/deltacp to read the complete volume
> >>> on both systems, which can take a long time.
> >>>
> >>> I was kicking around the idea of trying to handle snapshot deltas
> >>> better.
> >>>
> >>> The idea is that you could take your initial snapshot A, then sync
> >>> that snapshot to your backup system.
> >>>
> >>> At a later point you could take another snapshot B.
> >>>
> >>> Because snapshots contain the copies of the original data at the
> >>> time of the snapshot, and unmodified data points to the live
> >>> volume, it is possible to tell which blocks of data have changed
> >>> since the snapshot was taken.
> >>>
> >>> Now that you have a second snapshot, you can in essence perform a
> >>> diff on the A and B snapshots to get only the blocks that changed
> >>> up to the time that B was taken.
> >>>
> >>> These blocks could be copied to the backup image, and you should
> >>> have a clone of the B snapshot.
> >>>
> >>> You would not have to read the whole volume image but just the
> >>> changed blocks, dramatically improving the speed of the backup.
> >>>
> >>> At this point you can delete the A snapshot and promote the B
> >>> snapshot to be the A snapshot for the next backup round.
> >>>
> >>> On 03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:
> >>>
> >>> Are backups consistent?
> >>> What happens if the header on shard0 is synced referring to some
> >>> data on shard450, and by the time rsync parses shard450 that data
> >>> has been changed by subsequent writes?
> >>>
> >>> The header would be backed up out of sync with the rest of the
> >>> image.
> >>>
> >>> On 23 Mar 2017 at 8:48 PM, "Joe Julian" <[email protected]> wrote:
> >>>>
> >>>> The rsync protocol only passes blocks that have actually changed.
> >>>> Raw changes fewer bits. You're right, though, that it still has to
> >>>> check the entire file for those changes.
> >>>>
> >>>> On 03/23/17 12:47, Gandalf Corvotempesta wrote:
> >>>>
> >>>> Raw or qcow doesn't change anything about the backup.
> >>>> Georep always has to sync the whole file.
> >>>>
> >>>> Additionally, raw images have far fewer features than qcow.
> >>>>
> >>>> On 23 Mar 2017 at 8:40 PM, "Joe Julian" <[email protected]> wrote:
> >>>>>
> >>>>> I always use raw images. And yes, sharding would also be good.
> >>>>>
> >>>>> On 03/23/17 12:36, Gandalf Corvotempesta wrote:
> >>>>>
> >>>>> Georep exposes another problem:
> >>>>> when using gluster as storage for VMs, the VM file is saved as
> >>>>> qcow. Changes are inside the qcow, thus rsync has to sync the
> >>>>> whole file every time.
> >>>>>
> >>>>> A little workaround would be sharding, as rsync has to sync only
> >>>>> the changed shards, but I don't think this is a good solution.
> >>>>>
> >>>>> On 23 Mar 2017 at 8:33 PM, "Joe Julian" <[email protected]> wrote:
> >>>>>>
> >>>>>> In many cases, a full backup set is just not feasible. Georep to
> >>>>>> the same or a different DC may be an option if the bandwidth can
> >>>>>> keep up with the change set. If not, maybe break the data up
> >>>>>> into smaller, more manageable volumes where you only keep a
> >>>>>> smaller set of critical data and just back that up. Perhaps an
> >>>>>> object store (swift?) might handle fault tolerance and
> >>>>>> distribution better for some workloads.
> >>>>>>
> >>>>>> There's no one right answer.
> >>>>>>
> >>>>>> On 03/23/17 12:23, Gandalf Corvotempesta wrote:
> >>>>>>
> >>>>>> Backing up from inside each VM doesn't solve the problem.
> >>>>>> If you have to back up 500 VMs you need more than a day, and
> >>>>>> what if you have to restore the whole gluster storage?
> >>>>>>
> >>>>>> How many days do you need to restore 1PB?
> >>>>>>
> >>>>>> Probably the only solution would be a georep in the same
> >>>>>> datacenter/rack with a similar cluster, ready to become the
> >>>>>> master storage.
> >>>>>> In this case you don't need to restore anything, as the data is
> >>>>>> already there, only a little bit back in time, but this doubles
> >>>>>> the TCO.
> >>>>>>
> >>>>>> On 23 Mar 2017 at 6:39 PM, "Serkan Çoban" <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Assuming a backup window of 12 hours, you need to send data at
> >>>>>>> 25GB/s to the backup solution.
> >>>>>>> Using 10G Ethernet on the hosts, you need at least 25 hosts to
> >>>>>>> handle 25GB/s.
> >>>>>>> You can create an EC gluster cluster that can handle these
> >>>>>>> rates, or you can just back up the valuable data from inside
> >>>>>>> the VMs using open source backup tools like borg, attic,
> >>>>>>> restic, etc...
> >>>>>>>
> >>>>>>> On Thu, Mar 23, 2017 at 7:48 PM, Gandalf Corvotempesta
> >>>>>>> <[email protected]> wrote:
> >>>>>>> > Let's assume a 1PB storage full of VM images, with each brick
> >>>>>>> > over ZFS, replica 3, sharding enabled.
> >>>>>>> >
> >>>>>>> > How do you backup/restore that amount of data?
> >>>>>>> >
> >>>>>>> > Backing up daily is impossible; you'll never finish a backup
> >>>>>>> > before the following one starts (in other words, you need
> >>>>>>> > more than 24 hours).
> >>>>>>> >
> >>>>>>> > Restoring is even worse. You need more than 24 hours with the
> >>>>>>> > whole cluster down.
> >>>>>>> >
> >>>>>>> > You can't rely on ZFS snapshots due to sharding (the snapshot
> >>>>>>> > taken from one node is useless without all the other nodes
> >>>>>>> > related to the same shard), and you still have the same
> >>>>>>> > restore speed.
> >>>>>>> >
> >>>>>>> > How do you back this up?
> >>>>>>> >
> >>>>>>> > Even georep isn't enough, if you have to restore the whole
> >>>>>>> > storage in case of disaster.
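For what it's worth, Serkan's numbers roughly check out: 1 PB in a 12-hour window is 10^15 bytes / 43,200 s, which is about 23 GB/s, call it 25 GB/s with overhead. A 10GbE host tops out around 1.25 GB/s, so you're looking at 20-25 hosts streaming flat out just to hit the window, before you even ask whether the backup target can absorb it.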
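Alvin's snapshot-delta idea also sounds well worth prototyping. A naive first cut could look like the sketch below (untested; the device paths and the 4 MiB chunk size are made up, and it still reads both snapshots end to end, it just avoids writing or shipping unchanged blocks; the real win he describes would come from pulling the changed-block list out of the snapshot's CoW metadata instead of comparing everything):

#!/usr/bin/env python
# Compare snapshot A (already present on the backup side as a full image)
# with snapshot B and write only the chunks that differ into the backup
# image, turning it into a clone of B.

CHUNK = 4 * 1024 * 1024   # 4 MiB, picked arbitrarily

def sync_delta(snap_a, snap_b, backup_img):
    copied = 0
    with open(snap_a, 'rb') as a, \
         open(snap_b, 'rb') as b, \
         open(backup_img, 'r+b') as out:
        offset = 0
        while True:
            block_a = a.read(CHUNK)
            block_b = b.read(CHUNK)
            if not block_b:
                break
            if block_a != block_b:        # block changed between A and B
                out.seek(offset)
                out.write(block_b)
                copied += len(block_b)
            offset += len(block_b)
    return copied

if __name__ == '__main__':
    # Snapshot devices and backup image path are placeholders.
    changed = sync_delta('/dev/vg0/vm01-snap-a',
                         '/dev/vg0/vm01-snap-b',
                         '/backup/vm01.img')
    print('wrote %d bytes of changed blocks' % changed)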
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
