Hi John,

> I've set up an initial use of rsync for making a mirror-type backup
> and I'd like some advice to improve it, please.
> 
> 1.  Does rsync verify the copies it makes?

Well, it does have -c, AKA --checksum, though it's actually MD4 or MD5.
This changes how the list of files that need updating is calculated.
Instead of being just size and modification time, the sender reads every
byte to build the MD5 digest.  This is before transmission starts so it
can delay the start of things quite a bit, and cause a lot more sender
I/O.  The receiver will do the same for any file that's the same length.
I tend to use this option in case anything overwrote bytes in a file and
then touched back its original modification time.

rsync(1) points out

    Note that rsync always verifies that each transferred file was
    correctly reconstructed on the receiving side by checking a
    whole-file checksum that is generated as the file is transferred,
    but that automatic after-the-transfer verification has nothing to do
    with this option’s before-the-transfer "Does this file need to be
    updated?" check.

But my reading is this is confirming that the bytes requested to be
written to disk on the receiver match;  it doesn't then ask the OS to
get them from spinning rust, bypassing any cache.

So, if you want to achieve that "scrub" of the copy, you could do your
backup, with or without -c, up to you, then flush the OS's cache on the
receiver, and repeat with -c.  Since the files will mostly be the same
length, the receiver will pull from disk to calculate the digests.

IIRC to have the kernel discard its cache of files' contents, run sync
first so that dirty pages are written out, then do
    sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches'
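
Put together, the backup-then-scrub sequence might look like the sketch
below.  It assumes source and destination are on the same machine, as
in your command, and it leaves out your excludes for brevity.  The run
wrapper and the GO variable are my own invention: by default the script
only prints what it would do, and GO=1 (as root) makes it do it.

```shell
#!/bin/sh
# Sketch only.  SRC and DST default to the paths from the thread.
set -eu
SRC=${SRC:-/}
DST=${DST:-/media/john/Ubuntu Backup/Mirror-backup}

# Print each command; execute it only when GO=1.
# (The printed form loses quoting, so it's for reading, not pasting.)
run() {
    printf '+ %s\n' "$*"
    [ "${GO:-0}" = 1 ] || return 0
    "$@"
}

run rsync -ax --delete "$SRC" "$DST"          # the normal backup
run sync                                      # write out dirty pages
run sh -c 'echo 1 >/proc/sys/vm/drop_caches'  # discard cached file contents
run rsync -axc --delete "$SRC" "$DST"         # re-read both sides from disk
```

If the second rsync transfers anything, either the file changed in
between or one of the copies didn't read back as written.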

More simply, use -c each time and the next backup will check that the
files transferred in the previous one that are still the same length
read from disk OK on both machines.  And it will keep doing this on
every future backup.

BTW, I find -PacivHAX a convenient mnemonic for what rsync options might
be useful.  I think you might want to consider -HX to add to your -a.

> 2.  What non-user directories should I exclude?  My initial command is:
> 
>     sudo rsync -azvv --delete --exclude=/tmp/
>     --exclude=/home/john/Downloads/ --exclude=/home/john/GM2
>     --exclude=/home/shareddocs/Downloads/
>     --exclude=/home/evelyn/Downloads/ --exclude=proc/ --exclude=dev/
>     --exclude=mnt/ --exclude=media/ --exclude=sys/    /
>     '/media/john/Ubuntu Backup/Mirror-backup'

--exclude takes a pattern so you might find
    --exclude='/home/*/Downloads/'
a useful shortcut.  You've excluded various mount points;  -x stops its
crossing into other filesystems, so you could specify that and list the
top of the ones you do want transferred instead, e.g. / and /home.

> 3.  How many v's should I use, e.g -azv?

Depends how noisy you want the output.  I tend to peruse the listing of
my GNU tar incremental backups sorted by the size of the file so I can
spot any monsters getting in that I didn't intend.  I rectify, delete
the latest backup (and revert the incremental database), and re-run.

> 4.  I (user john) couldn't copy the .thunderbird subdirectories of
> user evelyn.  Why not?  I ran with sudo and other items copied OK.
> Temporarily, I've changed their permissions.

Don't know.  Would help to see the permissions of the directory it's
complaining about, and of every parent, before they were changed, together
with rsync's complaint.
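
Something like this would gather the permissions in one go.  The
function name is made up, and the path you'd pass is whatever rsync
named in its error;  /tmp is only a stand-in default.

```shell
#!/bin/sh
# List the permissions of a path and of each parent directory up to /.
ls_parents() {
    p=$1
    while :; do
        ls -ld "$p"
        # Stop at the root, or at "." if given a relative path.
        if [ "$p" = / ] || [ "$p" = . ]; then
            break
        fi
        p=$(dirname "$p")
    done
}

# Pass the directory as the first argument.
ls_parents "${1:-/tmp}"
```

Run it with sudo if an ordinary user can't descend into the directory
in question.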

Cheers, Ralph.

-- 
Next meeting:  Bournemouth, Tuesday, 2014-10-07 20:00
Meets, Mailing list, IRC, LinkedIn, ...  http://dorset.lug.org.uk/
New thread on mailing list:  mailto:[email protected]
How to Report Bugs Effectively:  http://goo.gl/4Xue
