Steve,

You are exactly right - BackupPC's storage granularity is whole files. So, in the worst case, a single-byte change to a file that is unique will result in a new file in the pool. Rsync will only transfer the deltas, but the full file gets rebuilt on the server.
Before I did the 4.x rewrite, I did some benchmarking of block-level and more granular deltas, but the typical performance improvement was modest and the effort to implement it was large. However, there are two cases where block-level or byte-level deltas would be very helpful: database files (as you mentioned) and VM images.

Perhaps you could use $Conf{DumpPreUserCmd} to run a script that generates byte-level deltas, and exclude the original database files? You could have a weekly schedule where you copy the full database file on, e.g., Sunday, and generate a delta on each other day of the week. Then BackupPC will back up the full file once, and also grab each of the deltas. That way you'll have a complete database file once per week, plus all the daily (cumulative) deltas.
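Something along those lines - treat this as an untested sketch; the paths, file names, and the choice of xdelta3 as the delta tool are placeholders, not anything BackupPC provides:

    #!/bin/sh
    # Hypothetical pre-dump script. On Sunday it refreshes a full "base"
    # copy of the database file; on other days it writes a cumulative
    # binary delta of the current file against that base.
    SRC=/var/lib/db/mydb.sql     # the big file (excluded from the backup)
    DIR=/var/lib/db-deltas       # the directory BackupPC backs up instead
    if [ "$(date +%u)" -eq 7 ]; then
        cp "$SRC" "$DIR/mydb.base"
        rm -f "$DIR"/mydb.day*.xdelta
    else
        # date +%u is 1 (Mon) .. 6 (Sat) here, so each day gets its own delta
        xdelta3 -e -f -s "$DIR/mydb.base" "$SRC" "$DIR/mydb.day$(date +%u).xdelta"
    fi

You'd hook it in from config.pl, along the lines of the ssh examples in the docs, plus a $Conf{BackupFilesExclude} entry for the original file:

    $Conf{DumpPreUserCmd} = '$sshPath -q -x -l root $host /usr/local/bin/db-delta.sh';

To recover, say, Wednesday's state you'd apply that day's delta to the base: xdelta3 -d -s mydb.base mydb.day3.xdelta mydb.sql.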
Craig

On Tue, Nov 20, 2018 at 9:28 AM Steve Richards <b...@boxersoft.com> wrote:
> Thanks. Yes, I had seen that in the docs, but I got the impression that
> the deltas referred to there were at the granularity of whole files. For
> example, let's say backup 1 contains files A, B and C. If B is then
> modified, then during the next backup rsync might only *transfer* the
> deltas needed to change B to B1, e.g. "replace line 5 with [new
> content]". I got the impression that those deltas would be used to
> create B1, though, and that both complete files (B and B1) would be
> stored in the pool as whole files. The deltas referred to in the docs
> would then be how to get from one *backup* to another, e.g. "delete file
> B, insert file B1" (or vice versa, depending on whether it's BackupPC V3
> or V4).
>
> So that's the way I interpreted it, but I'm very new to this so I may
> have got the wrong end of the stick completely. If anyone could confirm
> or correct my understanding, I'd appreciate it either way.
>
> Thanks for the comments on mysqldump, I'll take a look at those options.
>
> SteveR.
>
> On 20/11/2018 14:05, Mike Hughes wrote:
> > Hi Steve,
> >
> > It looks like they are stored using reverse deltas. Maybe you've
> > already seen this from the V4.0 documentation:
> >
> > - Backups are stored as "reverse deltas" - the most recent backup is
> > always filled and older backups are reconstituted by merging all the
> > deltas starting with the nearest future filled backup and working
> > backwards. This is the opposite of V3, where incrementals are stored
> > as "forward deltas" to a prior backup (typically the last full backup
> > or prior lower-level incremental backup, or the last full in the case
> > of rsync).
> > - Since the most recent backup is filled, viewing/restoring that
> > backup (which is the most common backup used) doesn't require merging
> > any deltas from other backups.
> > - The concepts of incr/full backups and unfilled/filled storage are
> > decoupled. The most recent backup is always filled. By default, for
> > the remaining backups, full backups are filled and incremental
> > backups are unfilled, but that is configurable.
> >
> > Additionally, these tips might help apply deltas to the files and
> > reduce transfer bandwidth:
> >
> > mysqldump has an option '--order-by-primary' which sorts before/while
> > dumping the database. Useful if you're trying to limit the amount to
> > be rsync'ed. You'll need to evaluate the usefulness of this based on
> > db design.
> >
> > If you're compressing your database, look into the '--rsyncable'
> > option available in the pigz package.
> >
> > From: Steve Richards <b...@boxersoft.com>
> > Sent: Tuesday, November 20, 2018 04:34
> > To: backuppc-users@lists.sourceforge.net
> > Subject: [BackupPC-users] Large files with small changes
> >
> > I think some backup programs are able to store just the changes
> > ("deltas") in a file when making incrementals. Am I right in thinking
> > that BackupPC doesn't do this, and would instead store the whole of
> > each changed file as separate entries in the pool?
> >
> > Reason for asking is that I want to implement a backup strategy for
> > databases, which is likely to involve multi-megabyte SQL files that
> > differ only slightly from day to day. I'm trying to decide how best to
> > handle them.
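P.S. Putting Mike's two quoted tips together, the dump pipeline itself could look something like this (the database name and output path are made up):

    # Stable row order keeps consecutive dumps similar, and --rsyncable
    # restarts pigz's compression periodically so a small change in the
    # input only perturbs a small region of the compressed output.
    mysqldump --order-by-primary mydb | pigz --rsyncable > /srv/dumps/mydb.sql.gz

That way rsync's delta transfer stays effective even on the compressed dump.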
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/