Yes, but the biggest issue is how to recover. You'll need to recover the whole storage, not a single snapshot, and that can take days.
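To put some rough numbers on that (using the 1 PB figure and the 10G links discussed below, and optimistically assuming fully saturated links with no protocol overhead), a quick back-of-the-envelope estimate:

#!/usr/bin/env python3
# Rough restore-time estimate for 1 PB. Assumes saturated links and zero
# protocol overhead, so real numbers would only be worse.

VOLUME = 1e15  # 1 PB in bytes

for label, rate in [("single 10GbE link", 1.25e9),   # ~1.25 GB/s
                    ("25 GB/s aggregate", 25e9)]:    # the 12h-window rate Serkan mentions below
    seconds = VOLUME / rate
    print(f"{label}: {seconds / 86400:.1f} days ({seconds / 3600:.0f} hours)")

# single 10GbE link: 9.3 days (222 hours)
# 25 GB/s aggregate: 0.5 days (11 hours)

So even with a very aggressive aggregate rate you are looking at half a day of full-speed transfer, and anything less quickly turns into many days of downtime.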
On 23 Mar 2017 9:24 PM, "Alvin Starr" <[email protected]> wrote:

> For volume backups you need something like snapshots.
>
> If you take a snapshot A of a live volume L, that snapshot stays at that
> moment in time and you can rsync it to another system or use something
> like deltacp.pl to copy it.
>
> The usual process is to delete the snapshot once it's copied and then
> repeat the process when the next backup is required.
>
> That process does require rsync/deltacp to read the complete volume on
> both systems, which can take a long time.
>
> I was kicking around the idea of handling snapshot deltas better.
>
> The idea is that you could take your initial snapshot A, then sync that
> snapshot to your backup system.
>
> At a later point you could take another snapshot B.
>
> Because snapshots contain copies of the original data at the time of the
> snapshot, and unmodified data points to the live volume, it is possible
> to tell which blocks of data have changed since the snapshot was taken.
>
> Now that you have a second snapshot you can, in essence, perform a diff
> on the A and B snapshots to get only the blocks that changed up to the
> time that B was taken.
>
> These blocks could be copied to the backup image and you should have a
> clone of the B snapshot.
>
> You would not have to read the whole volume image, just the changed
> blocks, dramatically improving the speed of the backup.
>
> At this point you can delete the A snapshot and promote the B snapshot to
> be the A snapshot for the next backup round.
>
> On 03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:
>
> Are backups consistent?
> What happens if the header on shard0 is synced referring to some data on
> shard450, and by the time rsync reaches shard450 that data has been
> changed by subsequent writes?
>
> The header would be backed up out of sync with the rest of the image.
>
> On 23 Mar 2017 8:48 PM, "Joe Julian" <[email protected]> wrote:
>
>> The rsync protocol only passes blocks that have actually changed. Raw
>> changes fewer bits. You're right, though, that it still has to check the
>> entire file for those changes.
>>
>> On 03/23/17 12:47, Gandalf Corvotempesta wrote:
>>
>> Raw or qcow doesn't change anything about the backup.
>> Georep always has to sync the whole file.
>>
>> Additionally, raw images have far fewer features than qcow.
>>
>> On 23 Mar 2017 8:40 PM, "Joe Julian" <[email protected]> wrote:
>>
>>> I always use raw images. And yes, sharding would also be good.
>>>
>>> On 03/23/17 12:36, Gandalf Corvotempesta wrote:
>>>
>>> Georep exposes another problem:
>>> when using Gluster as storage for VMs, the VM file is saved as qcow.
>>> Changes happen inside the qcow, so rsync has to sync the whole file
>>> every time.
>>>
>>> A small workaround would be sharding, as rsync then has to sync only
>>> the changed shards, but I don't think this is a good solution.
>>>
>>> On 23 Mar 2017 8:33 PM, "Joe Julian" <[email protected]> wrote:
>>>
>>>> In many cases, a full backup set is just not feasible. Georep to the
>>>> same or a different DC may be an option if the bandwidth can keep up
>>>> with the change set. If not, maybe break the data up into smaller,
>>>> more manageable volumes where you only keep a smaller set of critical
>>>> data and just back that up. Perhaps an object store (swift?) might
>>>> handle fault tolerance and distribution better for some workloads.
>>>>
>>>> There's no one right answer.
>>>>
>>>> On 03/23/17 12:23, Gandalf Corvotempesta wrote:
>>>>
>>>> Backing up from inside each VM doesn't solve the problem.
>>>> If you have to back up 500 VMs you need more than one day, and what if
>>>> you have to restore the whole Gluster storage?
>>>>
>>>> How many days do you need to restore 1PB?
>>>>
>>>> Probably the only solution is georep to a similar cluster in the same
>>>> datacenter/rack, ready to become the master storage.
>>>> In that case you don't need to restore anything, as the data is
>>>> already there, only a little bit back in time, but this doubles the
>>>> TCO.
>>>>
>>>> On 23 Mar 2017 6:39 PM, "Serkan Çoban" <[email protected]> wrote:
>>>>
>>>>> Assuming a backup window of 12 hours, you need to send data at 25GB/s
>>>>> to the backup solution.
>>>>> Using 10G Ethernet on the hosts you need at least 25 hosts to handle
>>>>> 25GB/s.
>>>>> You can create an EC gluster cluster that can handle these rates, or
>>>>> you can just back up valuable data from inside the VMs using open
>>>>> source backup tools like borg, attic, restic, etc...
>>>>>
>>>>> On Thu, Mar 23, 2017 at 7:48 PM, Gandalf Corvotempesta
>>>>> <[email protected]> wrote:
>>>>> > Let's assume a 1PB storage full of VM images, with each brick over
>>>>> > ZFS, replica 3, sharding enabled.
>>>>> >
>>>>> > How do you backup/restore that amount of data?
>>>>> >
>>>>> > Backing up daily is impossible: you'll never finish one backup
>>>>> > before the next one is due to start (in other words, you need more
>>>>> > than 24 hours).
>>>>> >
>>>>> > Restoring is even worse. You need more than 24 hours with the whole
>>>>> > cluster down.
>>>>> >
>>>>> > You can't rely on ZFS snapshots due to sharding (a snapshot taken
>>>>> > from one node is useless without all the other nodes holding the
>>>>> > related shards) and you still have the same restore speed.
>>>>> >
>>>>> > How do you back this up?
>>>>> >
>>>>> > Even georep isn't enough if you have to restore the whole storage
>>>>> > in case of disaster.
>
> --
> Alvin Starr    || voice: (905) 513-7688
> Netvel Inc.    || Cell:  (416) 806-0133
> [email protected] ||
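For what it's worth, here is a minimal sketch (in Python) of the delta step Alvin describes above. It assumes both snapshots are readable as block devices (e.g. LVM snapshots), that the backup image already holds a full copy of snapshot A, and the device paths and chunk size are just placeholders. Note that this brute-force version still reads both snapshots end to end; the real speedup Alvin is after would come from pulling the changed-block list out of the snapshot's copy-on-write metadata instead of comparing everything:

#!/usr/bin/env python3
# Sketch only: patch the regions that differ between snapshot A and snapshot B
# into an existing full-image backup of A. Paths and chunk size are hypothetical.

CHUNK = 4 * 1024 * 1024  # 4 MiB comparison window

def sync_delta(snap_a, snap_b, backup_img):
    with open(snap_a, "rb") as a, open(snap_b, "rb") as b, \
         open(backup_img, "r+b") as out:
        offset = 0
        while True:
            block_a = a.read(CHUNK)
            block_b = b.read(CHUNK)
            if not block_b:          # reached the end of snapshot B
                break
            if block_a != block_b:   # region changed since snapshot A
                out.seek(offset)
                out.write(block_b)   # rewrite only the changed region
            offset += len(block_b)

if __name__ == "__main__":
    # Hypothetical paths: old snapshot, new snapshot, full backup image of A.
    sync_delta("/dev/vg0/vm_snap_a", "/dev/vg0/vm_snap_b", "/backup/vm_volume.img")

After a run like this you would delete snapshot A and promote B to be the new A, exactly as described above.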
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
