Re: rsync script for snapshot backups

2016-06-21 Thread Petros Angelatos
On 19 June 2016 at 10:27, Simon Hobson  wrote:
> Dennis Steinkamp  wrote:
>
>> I tried to create a simple rsync script that should create daily backups
>> from a ZFS storage and put them into a timestamp folder.
>> After creating the initial full backup, the following backups should only
>> contain "new data" and the rest will be referenced via hardlinks (--link-dest).
>> ...
>> Well, it works, but there is a huge flaw with this approach and I am not able
>> to solve it on my own, unfortunately.
>> As long as the backups finish properly, everything is fine, but as
>> soon as one backup job can't be finished for some reason (like it gets
>> aborted accidentally or a power cut occurs),
>> the whole backup chain is messed up and usually the script creates a new
>> full backup, which fills up my backup storage.
>
> Yes indeed, this is a typical flaw with many systems - you often need to
> throw away the partial backup.
> One option that comes to mind is this:
> Create the new backup in a directory called (for example) "new" or
> "in-progress". If, and only if, the backup completes, rename it to a
> timestamp. When you start a new backup and the in-progress folder exists,
> use that, and it'll be freshened to the current source state.

I have an extremely similar script for my backups and that's exactly
what I do to deal with backups that are stopped mid-way, either by
power failures or by me. I rsync to a .tmp-$target directory, where
$target is what I'm backing up. I have separate backups for my rootfs
and /home. I also start the whole thing under ionice so that my
computer doesn't get slow from all this I/O. Lastly, before renaming
the .tmp-$target directory to its final name I do a `sync -f`, because
rsync doesn't seem to call fsync() when copying files, and without the
sync a power failure shortly after the rename() could leave a renamed
but incomplete backup.

Here is my script:

#!/bin/bash

set -o errexit
set -o pipefail

target=$1

case "$target" in
home)
    source=/home
    ;;
root)
    source=/
    ;;
*)
    echo "usage: $0 home|root" >&2
    exit 1
    ;;
esac

PATHTOBACKUP=/root/backup

date=$(date --utc "+%Y-%m-%dT%H:%M:%S")

# Back up into a hidden temporary directory; it is only renamed to its
# final timestamped name once rsync has finished successfully.
ionice --class 3 rsync \
    --archive \
    --verbose \
    --one-file-system \
    --sparse \
    --delete \
    --compress \
    --log-file="$PATHTOBACKUP/.tmp-$target.log" \
    --link-dest="$PATHTOBACKUP/$target-current" \
    "$source" "$PATHTOBACKUP/.tmp-$target"

# Flush the filesystem containing the new backup before renaming it,
# since rsync does not fsync() the files it writes.
sync -f "$PATHTOBACKUP/.tmp-$target"

mv "$PATHTOBACKUP/.tmp-$target.log" "$PATHTOBACKUP/$target-$date.log"
mv "$PATHTOBACKUP/.tmp-$target" "$PATHTOBACKUP/$target-$date"

# Repoint the $target-current symlink at the backup that just completed.
ln --symbolic --force --no-dereference "$target-$date" "$PATHTOBACKUP/$target-current"
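For completeness, a hypothetical way to schedule both targets nightly
(crontab syntax; the times and the /root/bin/backup.sh path are
placeholders, not taken from the mail):

# m h dom mon dow  command
30 2 * * * /root/bin/backup.sh root
45 2 * * * /root/bin/backup.sh home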

-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: rsync script for snapshot backups

2016-06-21 Thread Dennis Steinkamp


On 20.06.2016 at 22:01, Larry Irwin (gmail) wrote:
The scripts I use analyze the rsync log after it completes and then
sftp a summary to the root of the just-completed rsync.
If no summary is found, or the summary says the run failed, the folder
rotation for that set is skipped and that folder is re-used on the
subsequent rsync.
The key here is that the folder rotation script runs separately from
the rsync script(s).
For each entity I want to rsync, I create a named folder to identify
it, and the rsync'd data is held in sub-folders:

daily.[1-7] and monthly.[1-3]
When I rsync, I rsync into daily.0 using daily.1 as the link-dest.
Then the rotation script checks daily.0/rsync.summary - and if it
worked, it removes daily.7 and renames the daily folders.
On the first of the month, the rotation script removes monthly.3,
renames the other two, and makes a complete hard-link copy of daily.1 to
monthly.1.
It's been running now for about 4 years and, in my environment, the 10
copies take about 4 times the space of a single copy.

(We do complete copies of Linux servers - starting from /.)
If there's a good spot to post the scripts, I'd be glad to put them up.


Hi Larry,

that is something I couldn't do with my current scripting skills, but it
sounds very interesting, and I would really like to know how you did it -
if you don't mind showing me your script, of course.

As for my script, this is what I came up with.

#!/bin/sh

# rsync copy script v2 for rsync pull from FreeNAS to BackupNAS

# Set dates
B_DATE=$(date +"%d-%m-%Y-%H%M")
EXPIRED=$(date +"%d-%m-%Y" -d "14 days ago")

# Create the in-progress directory if it doesn't exist already
if ! [ -d /volume1/Backup_Test/in_progress ] ; then
    mkdir -p /volume1/Backup_Test/in_progress
fi

# rsync command: use the last successful snapshot as link-dest if recorded
if [ -f /volume1/rsync/Test/linkdest.txt ] ; then
    rsync -aqzh \
    --delete --stats --exclude-from=/volume1/rsync/Test/exclude.txt \
    --log-file=/volume1/Backup_Test/logs/rsync-$B_DATE.log \
    --link-dest=/volume1/Backup_Test/$(cat /volume1/rsync/Test/linkdest.txt) \
    Test@192.168.2.2::Test /volume1/Backup_Test/in_progress
else
    rsync -aqzh \
    --delete --stats --exclude-from=/volume1/rsync/Test/exclude.txt \
    --log-file=/volume1/Backup_Test/logs/rsync-$B_DATE.log \
    Test@192.168.2.2::Test /volume1/Backup_Test/in_progress
fi
RET=$?

# Check the return value (0 = success, 24 = source files vanished during
# the transfer, which is usually benign) before renaming the snapshot and
# recording it as the next link-dest
if [ $RET -eq 0 ] || [ $RET -eq 24 ] ; then
    mv /volume1/Backup_Test/in_progress /volume1/Backup_Test/$B_DATE
    echo $B_DATE > /volume1/rsync/Test/linkdest.txt
fi

# Delete expired snapshots (2 weeks old); loop so that several snapshots
# from the same day are all removed
for dir in /volume1/Backup_Test/$EXPIRED-* ; do
    if [ -d "$dir" ] ; then
        rm -Rf "$dir"
    fi
done

Keep in mind I am not very good at this, so if something can be improved,
or you see a big flaw in it, I would be grateful if you let me know.
So far it seems to do the trick. I would like to improve it so that the
logfile is mailed to a specific e-mail address after rsync completes
successfully.
Unfortunately the logfiles grow very big when I have lots of data to
back up, and I couldn't figure out how to send only a specific part of
the logfile, or to customize the logfile somehow.
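
One way to keep the mail small, sketched under assumptions (a working
mail(1)/mailx on the NAS; the address is a placeholder): since the script
passes --stats, rsync appends a short transfer summary at the end of the
log, so mailing just the tail of the logfile captures it.

# Hypothetical sketch: mail only the --stats summary from the end of the log
LOG=/volume1/Backup_Test/logs/rsync-$B_DATE.log
tail -n 20 "$LOG" | mail -s "rsync backup $B_DATE finished" admin@example.com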






Re: rsync script for snapshot backups

2016-06-20 Thread Larry Irwin (gmail)
The scripts I use analyze the rsync log after it completes and then
sftp a summary to the root of the just-completed rsync.
If no summary is found, or the summary says the run failed, the folder
rotation for that set is skipped and that folder is re-used on the
subsequent rsync.
The key here is that the folder rotation script runs separately from the
rsync script(s).
For each entity I want to rsync, I create a named folder to identify it,
and the rsync'd data is held in sub-folders:

daily.[1-7] and monthly.[1-3]
When I rsync, I rsync into daily.0 using daily.1 as the link-dest.
Then the rotation script checks daily.0/rsync.summary - and if it
worked, it removes daily.7 and renames the daily folders.
On the first of the month, the rotation script removes monthly.3,
renames the other two, and makes a complete hard-link copy of daily.1 to
monthly.1.
It's been running now for about 4 years and, in my environment, the 10
copies take about 4 times the space of a single copy.

(We do complete copies of Linux servers - starting from /.)
If there's a good spot to post the scripts, I'd be glad to put them up.
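
A minimal sketch of the rotation logic described above, under stated
assumptions (the paths, the set name, and the "SUCCESS" marker in
rsync.summary are hypothetical placeholders, not Larry's actual code;
cp -al makes the hard-link copy):

#!/bin/sh
# Rotate one backup set only if the last rsync into daily.0 succeeded.
SET=/backups/server1
if grep -q SUCCESS "$SET/daily.0/rsync.summary" 2>/dev/null; then
    rm -rf "$SET/daily.7"
    for i in 6 5 4 3 2 1; do
        [ -d "$SET/daily.$i" ] && mv "$SET/daily.$i" "$SET/daily.$((i+1))"
    done
    mv "$SET/daily.0" "$SET/daily.1"
    # On the first of the month, age the monthlies and make a complete
    # hard-link copy of the newest daily.
    if [ "$(date +%d)" = "01" ]; then
        rm -rf "$SET/monthly.3"
        [ -d "$SET/monthly.2" ] && mv "$SET/monthly.2" "$SET/monthly.3"
        [ -d "$SET/monthly.1" ] && mv "$SET/monthly.1" "$SET/monthly.2"
        cp -al "$SET/daily.1" "$SET/monthly.1"
    fi
fi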

--
Larry Irwin
Cell: 864-525-1322
Email: lrir...@alum.wustl.edu
Skype: larry_irwin
About: http://about.me/larry_irwin

On 06/19/2016 01:27 PM, Simon Hobson wrote:

Dennis Steinkamp  wrote:


I tried to create a simple rsync script that should create daily backups from a
ZFS storage and put them into a timestamp folder.
After creating the initial full backup, the following backups should only contain
"new data" and the rest will be referenced via hardlinks (--link-dest).
...
Well, it works, but there is a huge flaw with this approach and I am not able to
solve it on my own, unfortunately.
As long as the backups finish properly, everything is fine, but as soon
as one backup job can't be finished for some reason (like it gets
aborted accidentally or a power cut occurs),
the whole backup chain is messed up and usually the script creates a new full
backup, which fills up my backup storage.

Yes indeed, this is a typical flaw with many systems - you often need to throw
away the partial backup.
One option that comes to mind is this:
Create the new backup in a directory called (for example) "new" or
"in-progress". If, and only if, the backup completes, rename it to a timestamp.
When you start a new backup and the in-progress folder exists, use that, and
it'll be freshened to the current source state.

Also, have you looked at StoreBackup? http://storebackup.org
It does most of this automagically, keeps a definable history (e.g. one/day for 14
days, one/week for x weeks, one/30d for y years), plus it keeps file hashes so it
can detect bit-rot in your backups.







Re: rsync script for snapshot backups

2016-06-19 Thread Joe

Rely on the other answers here as to how to do it right.

I just want to mention a few things in your script.

yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt

`yes` is a program which puts out "y" (or whatever you tell it to)
forever - not what you want here - and plain cp does not read from a
pipe anyway (only an interactive `cp -i` reads stdin, to answer its
overwrite prompts). You can probably just leave off the "yes | " and
have the statement work exactly as it does now.
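
If the "yes |" was there to defeat an interactive cp alias (cp -i), a
hedged alternative is simply to force the overwrite:

cp -f /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt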


It looks like your EXPIRED logic will only find a directory which
*exactly* matches that date.


You might look at using something like a find command to find
directories older than 14 days.


Some find options which might help:

-mtime +14 specifies finding things modified more than 14 days ago
(-ctime +14 would test the last status change instead)
-type d specifies finding only directories
-maxdepth 1 specifies finding things only one level below the path find
starts at
-exec ls -l {} \; specifies running a command on every result which is
returned - in this case, an ls which can't hurt anything. You can
replace ls with something like rm -rf {} when you're *very* sure the
command is finding *exactly* what you want it to.


I didn't put the whole command together because until you understand how
it works, you don't want to try something that might delete a bunch of
things beyond what you actually want deleted.
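
For illustration only, a sketch of the assembled command with the
harmless ls action, assuming directory names in Dennis's dd-mm-yyyy-HHMM
format (verify the output carefully before ever swapping in rm -rf):

find /volume1/Backup_Test -maxdepth 1 -type d -name '??-??-????-*' \
-mtime +14 -exec ls -ld {} \;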


Joe

On 06/19/2016 08:22 AM, Dennis Steinkamp wrote:

Hey guys,

I tried to create a simple rsync script that should create daily
backups from a ZFS storage and put them into a timestamp folder.
After creating the initial full backup, the following backups should
only contain "new data" and the rest will be referenced via hardlinks
(--link-dest).


This was at least a simple enough scenario to achieve with my
pathetic scripting skills. This is what I came up with:


#!/bin/sh

# rsync copy script for rsync pull from FreeNAS to BackupNAS for Buero dataset

# Set variables
EXPIRED=`date +"%d-%m-%Y" -d "14 days ago"`

# Copy previous timefile to timeold.txt if it exists
if [ -f "/volume1/rsync/Buero/timenow.txt" ]
then
yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt
fi

# Create current timefile
echo `date +"%d-%m-%Y-%H%M"` > /volume1/rsync/Buero/timenow.txt

# rsync command
if [ -f "/volume1/rsync/Buero/timeold.txt" ]
then
rsync -aqzh \
--delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
--log-file=/volume1/Backup_Test/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
--link-dest=/volume1/Backup_Test/`cat /volume1/rsync/Buero/timeold.txt` \
Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
else
rsync -aqzh \
--delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
--log-file=/volume1/Backup_Buero/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
fi

# Delete expired snapshots (2 weeks old)
if [ -d /volume1/Backup_Buero/$EXPIRED-* ]
then
rm -Rf /volume1/Backup_Buero/$EXPIRED-*
fi

Well, it works, but there is a huge flaw with this approach and I am
not able to solve it on my own, unfortunately.
As long as the backups finish properly, everything is fine, but
as soon as one backup job can't be finished for some reason (like
it gets aborted accidentally or a power cut occurs),
the whole backup chain is messed up and usually the script creates a
new full backup, which fills up my backup storage.


What I would like to achieve is to improve the script so that a
backup run that wasn't finished properly will be resumed the next
time the script triggers.
Only if that was successful should the next incremental backup be
created, so that the files that didn't change from the previous backup
can be hardlinked properly.


I did a little bit of research, and I am not sure if I am on the right
track here, but apparently this can be done with return codes; I
honestly don't know how to do this.
Thank you in advance for your help, and sorry if this question may seem
foolish to most of you people.


Regards

Dennis


Re: rsync script for snapshot backups

2016-06-19 Thread Dennis Steinkamp

On 19.06.2016 at 19:27, Simon Hobson wrote:

Dennis Steinkamp  wrote:


I tried to create a simple rsync script that should create daily backups from a
ZFS storage and put them into a timestamp folder.
After creating the initial full backup, the following backups should only contain
"new data" and the rest will be referenced via hardlinks (--link-dest).
...
Well, it works, but there is a huge flaw with this approach and I am not able to
solve it on my own, unfortunately.
As long as the backups finish properly, everything is fine, but as soon
as one backup job can't be finished for some reason (like it gets
aborted accidentally or a power cut occurs),
the whole backup chain is messed up and usually the script creates a new full
backup, which fills up my backup storage.

Yes indeed, this is a typical flaw with many systems - you often need to throw
away the partial backup.
One option that comes to mind is this:
Create the new backup in a directory called (for example) "new" or
"in-progress". If, and only if, the backup completes, rename it to a timestamp.
When you start a new backup and the in-progress folder exists, use that, and
it'll be freshened to the current source state.

Also, have you looked at StoreBackup? http://storebackup.org
It does most of this automagically, keeps a definable history (e.g. one/day for 14
days, one/week for x weeks, one/30d for y years), plus it keeps file hashes so it
can detect bit-rot in your backups.



Thank you for taking the time to answer me.
Your suggestion is what I also had in mind, but I wasn't sure if this
would be "best practice".
To build this idea into my script, I probably need to hardcode the
target directory rsync writes to (e.g. new or in-progress) and move the
directory to a timestamped name only after rsync gave a return code of
0, am I correct? (Or return codes 0 and 24?)
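
For reference, a minimal sketch of that check (the "..." stands for the
rest of the rsync options; exit code 24 means "some source files
vanished during transfer", which is usually harmless):

rsync -aqzh ... Test@192.168.2.2::Test /volume1/Backup_Test/in_progress
RET=$?
# 0 = success, 24 = vanished source files (usually benign)
if [ $RET -eq 0 ] || [ $RET -eq 24 ]
then
mv /volume1/Backup_Test/in_progress /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
fi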


As for StoreBackup, it really does sound nice, but I have to do all of
this from the console of a 2-bay Synology NAS, so it's not that easy to
use third-party software that may have dependencies the Synology
system doesn't meet.







Re: rsync script for snapshot backups

2016-06-19 Thread Simon Hobson
Dennis Steinkamp  wrote:

> I tried to create a simple rsync script that should create daily backups from
> a ZFS storage and put them into a timestamp folder.
> After creating the initial full backup, the following backups should only
> contain "new data" and the rest will be referenced via hardlinks (--link-dest).
> ...
> Well, it works, but there is a huge flaw with this approach and I am not able
> to solve it on my own, unfortunately.
> As long as the backups finish properly, everything is fine, but as soon
> as one backup job can't be finished for some reason (like it gets
> aborted accidentally or a power cut occurs),
> the whole backup chain is messed up and usually the script creates a new full
> backup, which fills up my backup storage.

Yes indeed, this is a typical flaw with many systems - you often need to throw
away the partial backup.
One option that comes to mind is this:
Create the new backup in a directory called (for example) "new" or
"in-progress". If, and only if, the backup completes, rename it to a
timestamp. When you start a new backup and the in-progress folder exists,
use that, and it'll be freshened to the current source state.
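
A minimal sketch of the idea (the paths and the "latest" symlink are
illustrative assumptions, not a tested recipe; on the very first run the
missing link-dest target only produces a warning and a full copy):

DEST=/volume1/Backup_Test
STAMP=`date +"%d-%m-%Y-%H%M"`
# rsync always writes into the same in-progress directory; an interrupted
# run leaves it behind and the next run simply freshens it.
rsync -a --delete --link-dest="$DEST/latest" Test@192.168.2.2::Test "$DEST/in-progress"
# Only a completed run is renamed to a timestamp and marked as latest.
if [ $? -eq 0 ]
then
mv "$DEST/in-progress" "$DEST/$STAMP"
ln -sfn "$STAMP" "$DEST/latest"
fi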

Also, have you looked at StoreBackup? http://storebackup.org
It does most of this automagically, keeps a definable history (e.g. one/day for 14
days, one/week for x weeks, one/30d for y years), plus it keeps file hashes so it
can detect bit-rot in your backups.




rsync script for snapshot backups

2016-06-19 Thread Dennis Steinkamp

Hey guys,

I tried to create a simple rsync script that should create daily backups
from a ZFS storage and put them into a timestamp folder.
After creating the initial full backup, the following backups should
only contain "new data" and the rest will be referenced via hardlinks
(--link-dest).


This was at least a simple enough scenario to achieve with my
pathetic scripting skills. This is what I came up with:


#!/bin/sh

# rsync copy script for rsync pull from FreeNAS to BackupNAS for Buero dataset

# Set variables
EXPIRED=`date +"%d-%m-%Y" -d "14 days ago"`

# Copy previous timefile to timeold.txt if it exists
if [ -f "/volume1/rsync/Buero/timenow.txt" ]
then
yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt
fi

# Create current timefile
echo `date +"%d-%m-%Y-%H%M"` > /volume1/rsync/Buero/timenow.txt

# rsync command
if [ -f "/volume1/rsync/Buero/timeold.txt" ]
then
rsync -aqzh \
--delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
--log-file=/volume1/Backup_Test/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
--link-dest=/volume1/Backup_Test/`cat /volume1/rsync/Buero/timeold.txt` \
Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
else
rsync -aqzh \
--delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
--log-file=/volume1/Backup_Buero/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
fi

# Delete expired snapshots (2 weeks old)
if [ -d /volume1/Backup_Buero/$EXPIRED-* ]
then
rm -Rf /volume1/Backup_Buero/$EXPIRED-*
fi

Well, it works, but there is a huge flaw with this approach and I am not
able to solve it on my own, unfortunately.
As long as the backups finish properly, everything is fine, but as
soon as one backup job can't be finished for some reason (like it
gets aborted accidentally or a power cut occurs),
the whole backup chain is messed up and usually the script creates a new
full backup, which fills up my backup storage.


What I would like to achieve is to improve the script so that a backup
run that wasn't finished properly will be resumed the next time the
script triggers.
Only if that was successful should the next incremental backup be
created, so that the files that didn't change from the previous backup
can be hardlinked properly.


I did a little bit of research, and I am not sure if I am on the right
track here, but apparently this can be done with return codes; I
honestly don't know how to do this.
Thank you in advance for your help, and sorry if this question may seem
foolish to most of you people.


Regards

Dennis