Re: does the incremental rsync algorithm save on storage?

2007-07-17 Thread Matt McCutchen

On 7/17/07, Noah Leaman wrote:

> From what I understand, the incremental rsync algorithm saves on network
> bandwidth, but does rsync then just merge that delta data to end up with the
> new version and full-sized file on the destination filesystem?


Correct.


> I have these Microsoft Entourage database files that are modified often and
> can be a few gigs in size...  I want to back them up with rsync, but if the
> whole file is backed up each time, it takes up a lot of space (...multiplied
> by many users). I want to store just the initial file and then only the
> deltas for each backup after that.
>
> So my concern is not so much bandwidth conservation as storage
> conservation.


I recommend using rdiff-backup (http://www.nongnu.org/rdiff-backup/).
It produces a destination containing a full copy of the most recent
version of each file and a chain of backward deltas.

If you really want the original file and forward deltas, I recommend
that you use rsync's batch mode.  Pull the files to the receiver with
--only-write-batch, which makes rsync write a batch file containing
forward deltas from the (original) destination files to the current
source files instead of actually updating the destination.  To
reconstruct a non-original version of a file, use --read-batch to
apply the appropriate batch file to a temporary copy of the original
files.  This approach produces independent, cumulative deltas from the
original files; I can't think of an easy way to produce a chain of
forward deltas.
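
As a rough illustration of the batch-mode workflow described above, the
sketch below only builds the two rsync command lines; it does not run
them.  The host name, paths, and batch file name are hypothetical, and
-a is an assumed option choice.  Only --only-write-batch and
--read-batch come from the description above.

```python
import shlex


def write_batch_cmd(source, dest, batch_file):
    """Build a command that records a forward delta from dest to source.

    With --only-write-batch, rsync writes the batch file and leaves the
    files under dest untouched.
    """
    return ["rsync", "-a", f"--only-write-batch={batch_file}", source, dest]


def read_batch_cmd(batch_file, temp_copy):
    """Build a command that applies a recorded batch to a temporary copy."""
    return ["rsync", "-a", f"--read-batch={batch_file}", temp_copy]


if __name__ == "__main__":
    # Hypothetical locations: originals kept in /backup/original/,
    # one batch file per backup run, restores done in /tmp/restore/.
    backup = write_batch_cmd(
        "user@mailhost:Entourage/", "/backup/original/", "/backup/batches/2007-07-17"
    )
    restore = read_batch_cmd("/backup/batches/2007-07-17", "/tmp/restore/")
    print(shlex.join(backup))
    print(shlex.join(restore))
```

Before running the restore command, /tmp/restore/ would need to hold a
fresh copy of the original files, since each batch is a delta against
the originals rather than against the previous batch.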

Matt
--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: does the incremental rsync algorithm save on storage?

2007-07-17 Thread Wayne Davison
On Tue, Jul 17, 2007 at 07:27:51PM -0400, Matt McCutchen wrote:
> I can't think of an easy way to produce a chain of forward deltas.

A chain of forward deltas requires an extra copy of the backup data.
So, you'd need a start point, an end point, and the deltas would be
generated while updating the end point.  To expire old deltas, you
would apply the oldest batch file to the start-point data, and then
delete that batch file.

So, the rdiff-backup approach you mentioned would take less space since
it only needs an end-point copy of the data to maintain its chain of
backward deltas.
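
The two layouts compared above can be made concrete with a toy Python
model.  This is only a sketch: it uses difflib to compute deltas on
in-memory byte strings, which is not how rsync or rdiff-backup compute
or store them, and the class and function names are made up for
illustration.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copies from `old` plus literal new bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif j2 > j1:  # "replace" or "insert": carry the new bytes
            ops.append(("data", new[j1:j2]))
        # "delete" contributes nothing to the new version
    return ops


def apply_delta(base, delta):
    """Rebuild the target version from `base` and a delta."""
    out = bytearray()
    for op in delta:
        out += base[op[1]:op[2]] if op[0] == "copy" else op[1]
    return bytes(out)


class ForwardChain:
    """Start point + end point + forward deltas (oldest first)."""

    def __init__(self, original):
        self.start = original  # oldest retained version
        self.end = original    # extra full copy, needed to diff against
        self.deltas = []

    def backup(self, new):
        self.deltas.append(make_delta(self.end, new))
        self.end = new

    def expire_oldest(self):
        # Fold the oldest delta into the start point, then drop it.
        self.start = apply_delta(self.start, self.deltas.pop(0))

    def restore(self, index):
        data = self.start
        for delta in self.deltas[:index]:
            data = apply_delta(data, delta)
        return data


class BackwardChain:
    """End point only + backward deltas (oldest first)."""

    def __init__(self, original):
        self.end = original
        self.deltas = []

    def backup(self, new):
        # Record how to get from the new version back to the previous one.
        self.deltas.append(make_delta(new, self.end))
        self.end = new

    def expire_oldest(self):
        self.deltas.pop(0)  # no data needs rewriting

    def restore(self, steps_back):
        if steps_back > len(self.deltas):
            raise ValueError("version already expired")
        data = self.end
        for delta in reversed(self.deltas[len(self.deltas) - steps_back:]):
            data = apply_delta(data, delta)
        return data
```

In this model the forward chain always holds two full copies (start and
end) plus the deltas, while the backward chain holds one full copy plus
the deltas, and expiring its oldest version is just a delete.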

..wayne..