Good morning, AJ,

On Thu, 14 May 2009, AJ ONeal wrote:

I've been using sbackup (tar) to do backups for a while, but now that I have a
1 TB drive I want to change to an rsync-based backup scheme.

Also, I recently had a hard drive go bad, and I don't know how long it had been
corrupting files, but there were about 5 MB of bad sectors when I replaced it.
Using ddrescue I got the copy on the new drive down to 450 KB of errors, but I
still want to look through some old backups and see what needs recovering.

Now I want to work through my old backups and create two 'views':

1) a change history of the backups over time:
the first full backup is loaded into /backups/CURRENT.d;
each incremental backup replaces files in CURRENT.d, and the replaced files
are copied aside, so CURRENT.d always contains the most current copy of the files:
rsync \
    --archive \
    --exclude-from $RSYNC_EXCLUDE \
    --delete-excluded \
    --backup --backup-dir=../${DATE_OF_LAST}.bak \
    ${DATE_OF_NEXT}.d CURRENT.d
I'll add some control so that the --delete option is used on full backups but
not on incremental backups.
This is very much like leaving the backups untarred and not touching them with
rsync at all, but there are some differences: the full backup (CURRENT) will be
the most recent and the incrementals the least recent; CURRENT will contain all
files changed or added; and there will be only one full backup rather than
periodic ones.
But I'd like it to make hard-link copies of the files rather than duplicating
what I've untarred.

2) I want to make a copy of the directory structure above, but where
every backup is seen as a full backup, again using hard links
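
The mechanics of that second view can be sketched with GNU cp (the directory names view1.d/view2.d are made up for the demo): hard-link-copy the previous view, then overlay only that day's changed files. Unchanged files share one inode across views; changed files get their own:

```shell
#!/bin/sh
# Sketch: each "full-looking" view is a hard-link copy of the previous
# view (cp -al, GNU coreutils) plus that day's changed files on top.
set -e
WORK=$(mktemp -d)
mkdir "$WORK/day1.d"
echo "unchanged" > "$WORK/day1.d/keep"
echo "version 1" > "$WORK/day1.d/changes"
cp -a  "$WORK/day1.d"  "$WORK/view1.d"   # first view is a plain copy
cp -al "$WORK/view1.d" "$WORK/view2.d"   # next view: hard-link copy...
rm "$WORK/view2.d/changes"               # ...then replace the changed
echo "version 2" > "$WORK/view2.d/changes"  # file (simulated by hand)
ls -li "$WORK/view1.d" "$WORK/view2.d"   # "keep" shares one inode
```

Only "changes" costs extra disk space; "keep" appears in both views but is stored once.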

I took a similar but slightly different approach with rsync-backup (see http://www.stearns.org/rsync-backup/ ). The client always backs up to /backups/{client_name_or_ip}/current/ on the backup server. Each day, the server makes a snapshot using hard links from that directory to /backups/{client_name_or_ip}/YYYYMMDD/ . /current/ and each daily snapshot are complete backups, yet only the changed files get transferred each day. Backups happen with rsync-over-ssh using ssh keys, and can cover either the complete system or individual directories, as the client chooses. "--delete" is used on each backup, so /current/ really does match the client drive. BTW, using "--delete" on current doesn't affect the older snapshots.
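
That last point is worth seeing once: the dated snapshot holds its own hard link to each file, so removing or rewriting the name in current/ only drops current/'s link (rsync likewise writes an updated file to a temporary name and renames it, which breaks the link the same way). A minimal sketch under made-up paths:

```shell
#!/bin/sh
# Sketch: deletes/changes in current/ leave the dated snapshot intact,
# because the snapshot is a separate hard link to the same inode.
set -e
BACKUPS=$(mktemp -d)
mkdir "$BACKUPS/current"
echo "version 1" > "$BACKUPS/current/file"
cp -al "$BACKUPS/current" "$BACKUPS/20090514"   # nightly snapshot
# Simulate the next backup replacing (or --delete removing) the file:
rm "$BACKUPS/current/file"
echo "version 2" > "$BACKUPS/current/file"
cat "$BACKUPS/20090514/file"    # prints: version 1
```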


I don't understand the rsync hard-link options, and
I don't think OS X's cp has hard-link options at all.

It doesn't appear to on the mac to which I have access; it uses the BSD cp command, which doesn't support that. You could bring over the Linux cp command and compile it on your mac if it's important enough; that tool is part of the coreutils package at http://www.gnu.org/software/coreutils/ . One alternative would be to use cp -av to make a (non-hardlinked) snapshot every night, then re-link the duplicate files in the backup tree with freedups ( http://www.stearns.org/freedups/ ), which works on macs just fine:


mac:~/test wstearns$ ls -Ali
total 24
736193 -rw-r--r--    1 wstearns  staff    0 May 14 11:18 a
736199 -rw-r--r--    1 wstearns  staff    6 May 14 11:25 hello
736197 -rw-r--r--    1 wstearns  staff    6 May 14 11:25 hello1
736198 -rw-r--r--    1 wstearns  staff    6 May 14 11:25 hello2
mac:~/test wstearns$ ~/bin/freedups.pl -a ~/test/
Freedups Version 0.6.14
Options Chosen: ActuallyLink Paranoid Verbosity=1 CacheFile=/Users/wstearns/md5sum-v1.cache MaxFiles=1 MinSize=0 (only consider files 1 bytes and larger)
Starting to load md5 checksum cache from /Users/wstearns/md5sum-v1.cache.
Finished loading checksums from checksum cache.
Starting to scan /Users/wstearns/test/
 6
        linked /Users/wstearns/test/hello1 /Users/wstearns/test/hello
        linked /Users/wstearns/test/hello1 /Users/wstearns/test/hello2
Finished processing inodes, appending new md5sums.
Finished saving md5sums.
1 file specs searched.
3 Unique files scanned.
3 Unique inodes scanned.
1 filenames were discarded because there were already 1 filenames for that inode.
Cached checksums: 0, From disk checksums: 4.
Space saved: 12
Discarded 2 checksums of small files.
mac:~/test wstearns$ ls -Ali
total 24
736193 -rw-r--r--    1 wstearns  staff    0 May 14 11:18 a
736197 -rw-r--r--    3 wstearns  staff    6 May 14 11:25 hello
736197 -rw-r--r--    3 wstearns  staff    6 May 14 11:25 hello1
736197 -rw-r--r--    3 wstearns  staff    6 May 14 11:25 hello2

(All three files with identical content now share a single inode, 736197.)
        Cheers,
        - Bill



I've seen some examples online of using hard links with rsync and cp -al,
but the ones I've seen weren't clear to me.
Could you point me in the right direction or explain how I might complete
these tasks?


AJ ONeal

P.S. Here's the script that I've got so far, but, of course, it still needs
tweaking:


#!/bin/bash

SB_DIR='ubu-backup-1.d'
UNTAR_DIR='ubu-untar.d'
RSYNC_DIR='twdp.hobby-site.org.d'
RSYNC_CUR='CURRENT.d'
RSYNC_EXCLUDE='exclude-list.txt'

#ls $SB_DIR | sort | while read BAK_DIR
#do
#       DEST_DIR=$UNTAR_DIR/`echo $BAK_DIR | cut -d'.' -f1-3`
#       mkdir -p $DEST_DIR
#       tar xzvf $SB_DIR/$BAK_DIR/files.tgz -C $DEST_DIR
#done

mkdir -p "$RSYNC_DIR"
# FIRST_BACKUP is used by the cp below, so these two lines must stay enabled
FIRST_BACKUP=`ls "$UNTAR_DIR" | sort | head -n 1`
NUM_BACKUPS=$((`ls "$UNTAR_DIR" | wc -l`-1))
cp -a "$UNTAR_DIR/$FIRST_BACKUP" "$RSYNC_DIR/$RSYNC_CUR"

ls $UNTAR_DIR | sort | tail -n $NUM_BACKUPS | while read BAK_DIR
do
        SOURCE_DIR="$UNTAR_DIR/$BAK_DIR/"
        DEST_DIR="$RSYNC_DIR/$RSYNC_CUR"
        rsync \
                --archive --checksum --verbose \
                --exclude-from $RSYNC_EXCLUDE \
                --delete-excluded \
                --backup --backup-dir=../$BAK_DIR.d \
                $SOURCE_DIR $DEST_DIR
done

---------------------------------------------------------------------------
        "Linux is my OS of Free Choice."
        -- Wes Yates <[email protected]>
--------------------------------------------------------------------------
William Stearns ([email protected], tools and papers: www.stearns.org)
Top-notch computer security training at www.sans.org , www.giac.net
--------------------------------------------------------------------------
