Re: Skip creating files in --backup-dir if content has not changed

2019-12-18 Thread Kevin Korb via rsync
The reason for that is pretty simple: rsync isn't reading the
existing file to find out that it is the same.  By default it compares
only size and modification time, and reading every file would normally
be a waste of rsync's time.  A better question is why your files are
changing timestamps when the data is the same.
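
To see which files rsync would treat as changed, a dry run with itemized output can help. The following is only a sketch: the function name `preview_changes` is hypothetical, while `-n` (`--dry-run`) and `-i` (`--itemize-changes`) are standard rsync options.

```shell
# Hypothetical helper: preview what rsync would do to each file without
# changing anything. Extra rsync options (e.g. --checksum) may follow
# the source and destination arguments.
preview_changes() {
    src=$1
    dest=$2
    shift 2
    # -n (--dry-run) makes no changes; -i (--itemize-changes) prints one
    # line per item describing what would be updated.
    rsync -a -n -i "$@" "$src" "$dest"
}
```

Called as `preview_changes ./SRC/ ./DEST/`, a file whose only difference is its timestamp should show up with a line starting with `>f` (it would be transferred). With `preview_changes ./SRC/ ./DEST/ --checksum` the same file should show a line starting with `.f` (only its attributes would be updated).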

On 12/18/19 6:47 PM, Wayne Piekarski via rsync wrote:
> I am using rsync with --backup --backup-dir to keep copies of files
> which have changed as part of an incremental backup system. However, if
> only the timestamp has changed, it creates a copy of the file in
> --backup-dir, and if thousands of large files have their timestamps
> changed, this can waste a lot of disk space on something which hasn't
> really changed.
> 
> Interestingly, if you use --checksum, rsync will not create a file in
> --backup-dir unless the contents are truly different, but it will fix up
> the timestamp on the remote end to match. This is what I want, but I
> just don't want to pay the performance penalty of running --checksum all
> the time.
> 
> Here is an example that shows the problem:
> 
> mkdir ./SRC
> echo hello > ./SRC/a
> echo hello > ./SRC/b
> rsync -av ./SRC/ ./DEST/
> touch ./SRC/*
> ls -al --full-time ./SRC/ ./DEST/
> # Creates copies in BACKUP, even though contents are the same
> rsync -av --backup --backup-dir=`pwd`/BACKUP/ ./SRC/ ./DEST/
> 
> After this run, the BACKUP directory will contain copies of both a and b
> even though neither actually changed. If you add --checksum, then it
> avoids creating a copy, but still syncs the timestamps correctly.
> 
> touch ./SRC/*
> # Does not create any copies in BACKUP since nothing changed
> rsync -av --checksum --backup --backup-dir=`pwd`/BACKUP/ ./SRC/ ./DEST/
> 
> The problem with --checksum is that for hundreds of gigabytes of data,
> it can be very slow to read every file, especially when most timestamps
> have not changed. But without it, rsync has already decided to transfer
> the file, and so to back up the old copy, before the delta algorithm
> finds that nothing has changed.
> 
> Is there a flag I can add to rsync that will tell it to only create a
> backup file if something actually changed, saving lots of wasted backup
> space?
> 
> thanks,
> Wayne
> 
> 

-- 
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
Kevin Korb  Phone:(407) 252-6853
Systems Administrator   Internet:
FutureQuest, Inc.   ke...@futurequest.net  (work)
Orlando, Florida   k...@sanitarium.net (personal)
Web page:   https://sanitarium.net/
PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,



-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Skip creating files in --backup-dir if content has not changed

2019-12-18 Thread Wayne Piekarski via rsync
I am using rsync with --backup --backup-dir to keep copies of files 
which have changed as part of an incremental backup system. However, if 
only the timestamp has changed, it creates a copy of the file in 
--backup-dir, and if thousands of large files have their timestamps 
changed, this can waste a lot of disk space on something which hasn't 
really changed.


Interestingly, if you use --checksum, rsync will not create a file in 
--backup-dir unless the contents are truly different, but it will fix up 
the timestamp on the remote end to match. This is what I want, but I 
just don't want to pay the performance penalty of running --checksum all 
the time.


Here is an example that shows the problem:

mkdir ./SRC
echo hello > ./SRC/a
echo hello > ./SRC/b
rsync -av ./SRC/ ./DEST/
touch ./SRC/*
ls -al --full-time ./SRC/ ./DEST/
# Creates copies in BACKUP, even though contents are the same
rsync -av --backup --backup-dir=`pwd`/BACKUP/ ./SRC/ ./DEST/

After this run, the BACKUP directory will contain copies of both a and b 
even though neither actually changed. If you add --checksum, then it 
avoids creating a copy, but still syncs the timestamps correctly.


touch ./SRC/*
# Does not create any copies in BACKUP since nothing changed
rsync -av --checksum --backup --backup-dir=`pwd`/BACKUP/ ./SRC/ ./DEST/

The problem with --checksum is that for hundreds of gigabytes of data,
it can be very slow to read every file, especially when most timestamps
have not changed. But without it, rsync has already decided to transfer
the file, and so to back up the old copy, before the delta algorithm
finds that nothing has changed.


Is there a flag I can add to rsync that will tell it to only create a 
backup file if something actually changed, saving lots of wasted backup 
space?
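
One possible workaround, sketched here as an assumption rather than an rsync feature: let rsync make its backups as usual, then delete any backup that is byte-identical to the file now in the destination. This reads only the files rsync actually replaced, not the whole tree. The function name `prune_unchanged_backups` is hypothetical, and the sketch assumes file names that contain no newlines.

```shell
# Hypothetical helper (not part of rsync): remove files from a backup
# tree that are byte-for-byte identical to the file that replaced them.
# Usage: prune_unchanged_backups BACKUP_DIR DEST_DIR
prune_unchanged_backups() {
    # Strip one trailing slash so relative paths can be computed reliably.
    backup_dir=${1%/}
    dest_dir=${2%/}
    [ -d "$backup_dir" ] || return 0
    # Walk every regular file in the backup tree.
    find "$backup_dir" -type f | while IFS= read -r old; do
        rel=${old#"$backup_dir"/}
        new=$dest_dir/$rel
        # cmp -s is silent and exits 0 only when the contents are identical.
        if [ -f "$new" ] && cmp -s "$old" "$new"; then
            rm -f -- "$old"
        fi
    done
}
```

It would be run after each rsync pass, for example as `prune_unchanged_backups "$(pwd)/BACKUP" ./DEST`. Backups whose contents really differ are left in place, and directories emptied by the pruning are not removed.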


thanks,
Wayne

