The rsync protocol only transfers blocks that have actually changed, and raw images change fewer bits. You're right, though, that it still has to scan the entire file to find those changes.
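
Roughly, rsync splits the file into blocks, compares per-block checksums
with the receiver, and ships only the blocks that differ. A simplified
sketch of the idea in Python (real rsync uses a rolling weak checksum
plus a strong hash and negotiates the block list over the wire; the
block size and hashing here are only illustrative):

    import hashlib

    BLOCK = 128 * 1024  # illustrative; rsync picks its own block size

    def block_sums(path):
        """Checksum every fixed-size block -- the whole file still gets read."""
        with open(path, "rb") as f:
            return [hashlib.md5(chunk).hexdigest()
                    for chunk in iter(lambda: f.read(BLOCK), b"")]

    def blocks_to_send(receiver_sums, sender_path):
        """Collect only the blocks whose checksum differs on the receiver."""
        to_send = []
        with open(sender_path, "rb") as f:
            for i, chunk in enumerate(iter(lambda: f.read(BLOCK), b"")):
                if (i >= len(receiver_sums)
                        or hashlib.md5(chunk).hexdigest() != receiver_sums[i]):
                    to_send.append((i * BLOCK, chunk))
        return to_send

Either way, every block of the source file gets read and checksummed,
which is why scanning a volume full of large images is still expensive.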

On 03/23/17 12:47, Gandalf Corvotempesta wrote:
Raw or qcow doesn't change anything about the backup.
Georep always has to sync the whole file

Additionally, raw images have far fewer features than qcow

On 23 Mar 2017 8:40 PM, "Joe Julian" <[email protected]> wrote:

    I always use raw images. And yes, sharding would also be good.


    On 03/23/17 12:36, Gandalf Corvotempesta wrote:
    Georep exposes another problem:
    when using Gluster as storage for VMs, the VM disk is saved as a
    qcow file. Changes happen inside the qcow, so rsync has to sync
    the whole file every time.

    A little workaround would be sharding, since rsync then only has
    to sync the changed shards, but I don't think this is a good
    solution.
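
    To put rough numbers on that workaround, as a sketch only (the
    shard size, image size and change rate below are example values,
    not recommendations):

        GiB = 1024 ** 3
        MiB = 1024 ** 2

        image_size = 200 * GiB   # example VM image
        shard_size = 64 * MiB    # assumed shard size for the volume
        dirty_bytes = 2 * GiB    # data actually rewritten since the last sync

        total_shards = image_size // shard_size       # 3200 shard files
        dirty_shards = dirty_bytes // shard_size + 1  # at most ~33 touched

        # Without sharding, the whole 200 GiB image is one changed file;
        # with sharding, only ~33 * 64 MiB ~= 2 GiB of shard files differ
        # and need to be re-synced.
        print(total_shards, dirty_shards)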

    On 23 Mar 2017 8:33 PM, "Joe Julian" <[email protected]> wrote:

        In many cases, a full backup set is just not feasible. Georep
        to the same or a different DC may be an option if the bandwidth
        can keep up with the change set. If not, maybe break the data
        up into smaller, more manageable volumes, keep only a smaller
        set of critical data, and just back that up. Perhaps an object
        store (Swift?) might handle fault-tolerant distribution better
        for some workloads.

        There's no one right answer.


        On 03/23/17 12:23, Gandalf Corvotempesta wrote:
        Backing up from inside each VM doesn't solve the problem.
        If you have to back up 500 VMs you still need more than a day,
        and what if you have to restore the whole Gluster storage?

        How many days do you need to restore 1PB?

        Probably the only solution is georep to a similar cluster in
        the same datacenter/rack, ready to become the master storage.
        In that case you don't need to restore anything, as the data
        is already there, just a little bit back in time, but this
        doubles the TCO.

        On 23 Mar 2017 6:39 PM, "Serkan Çoban" <[email protected]> wrote:

            Assuming a backup window of 12 hours, you need to send
            data at 25GB/s to the backup solution.
            Using 10G Ethernet on the hosts, you need at least 25
            hosts to handle 25GB/s.
            You can create an EC gluster cluster that can handle this
            rate, or you can just back up the valuable data from
            inside the VMs using open source backup tools like borg,
            attic, restic, etc.
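
            A rough sanity check on those numbers, as a sketch only
            (1PB taken as 10^15 bytes and ~1GB/s of usable throughput
            per 10G link are assumptions, not measurements):

                PB = 10 ** 15           # 1 PB in bytes
                window = 12 * 3600      # 12-hour backup window, in seconds

                rate = PB / window      # bytes/s needed to finish in the window
                print(rate / 1e9)       # ~23 GB/s, i.e. the ~25GB/s figure above

                per_host = 1e9          # assume ~1 GB/s usable per 10G link
                print(rate / per_host)  # ~23 hosts, so "at least 25" with headroom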

            On Thu, Mar 23, 2017 at 7:48 PM, Gandalf Corvotempesta
            <[email protected]> wrote:
            > Let's assume a 1PB storage full of VM images, with each
            > brick on ZFS, replica 3, sharding enabled.
            >
            > How do you backup/restore that amount of data?
            >
            > Backing up daily is impossible: you'll never finish one
            > backup before the next one starts (in other words, you
            > need more than 24 hours).
            >
            > Restoring is even worse. You need more than 24 hours
            > with the whole cluster down.
            >
            > You can't rely on ZFS snapshots due to sharding (a
            > snapshot taken on one node is useless without all the
            > other nodes holding the related shards), and you still
            > have the same restore speed.
            >
            > How do you back this up?
            >
            > Even georep isn't enough if you have to restore the
            > whole storage in case of disaster.
            >



_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
