Re: Backups w/ rsync

2007-09-30 Thread Michal Soltys

Wolfgang Denk wrote:
> Dear Bill,
>
> in message [EMAIL PROTECTED] you wrote:
>
>> Be aware that rsync is useful for making a *copy* of your files, which
>> isn't always the best backup. If the goal is to preserve data and be
>> able to recover in time of disaster, it's probably not optimal, while if
>> you need frequent access to old or deleted files it's fine.
>
> If you want to do real backups you should use real tools, like bacula
> etc.



I wouldn't agree here. It all depends on how you organize your things, write
scripts, and so on. It isn't any less real a solution than amanda or bacula.
It's a much more DIY solution though, so not everyone will be inclined to use it.

ps.
Sorry for offtopic. Last in this subject from me.

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Backups w/ rsync

2007-09-29 Thread Michael Tokarev
Dean S. Messing wrote:
> Michael Tokarev writes:
[]
> : the procedure is something like this:
> :
> :   cd /backups
> :   rm -rf tmp/
> :   cp -al $yesterday tmp/
> :   rsync -r --delete -t ... /filesystem tmp
> :   mv tmp $today
> :
> : That is, link the previous backup to temp (which takes no space
> : except directories), rsync current files to there (rsync will
> : break links for changed files), and rename temp to $today.
>
> Very nice.  The breaking of the hardlink is the key.  I wondered about
> this when Michal mentioned using rsync yesterday.  I just tested the idea. It
> does indeed work.

Well, others in this thread already presented other, simpler ways,
namely using --link-dest rsync option.  I was just too lazy to read
the man page, but I already knew other tools can do the work ;)

> One question: why do you not use -a instead of -r -t?  It would
> seem that one would want to preserve permissions, and group and user
> ownerships.  Also, is there a reason to _not_ preserve sym-links
> in the backup.  Your script appears to copy the referent.

Note the above -- SOMETHING like this.  I was typing from memory,
it's not an actual script, just to show an idea.  Sure real script
does more than that, including error checking too.
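
A minimal sketch of how the procedure might look with -a in place of -r -t and basic error checking added. This is not the actual script referred to above: the function name snapshot_cp_al and its four arguments are made up for illustration, SRC is assumed to be an absolute path, and cp -al assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper (not the script discussed above): snapshot SRC into
# BACKUPS/TODAY, sharing unchanged files with BACKUPS/YESTERDAY via hard links.
# The body runs in a subshell so the caller's working directory is untouched.
snapshot_cp_al() (
    src=$1 backups=$2 yesterday=$3 today=$4

    cd "$backups" || exit 1
    rm -rf tmp || exit 1

    if [ -d "$yesterday" ]; then
        # Link the previous backup into tmp (GNU cp: -a archive, -l hard links).
        cp -al "$yesterday" tmp || exit 1
    else
        mkdir tmp || exit 1        # first run: nothing to link against
    fi

    # By default rsync writes a changed file under a temporary name and
    # renames it into place, so yesterday's copy keeps its old contents.
    rsync -a --delete "$src"/ tmp/ || exit 1

    mv tmp "$today"
)
```

It would be called as, for example, snapshot_cp_al /filesystem /backups 2007-09-28 2007-09-29. One caveat of the cp -al approach: if only the permissions or ownership of a file change, rsync updates the shared inode in place, so the change also shows up in the older snapshots.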

/mjt


Re: Backups w/ rsync

2007-09-28 Thread Michal Soltys

Goswin von Brederlow wrote:
> Thanks, should have looked at --link-dest before replying. I wonder
> how long rsync had that option. I wrote my own rsync script years
> ago. Maybe it predates this.



According to the NEWS file, since ~2002-09, so for quite a while now.


Re: Backups w/ rsync

2007-09-28 Thread Wolfgang Denk
Dear Bill,

in message [EMAIL PROTECTED] you wrote:

> Be aware that rsync is useful for making a *copy* of your files, which
> isn't always the best backup. If the goal is to preserve data and be
> able to recover in time of disaster, it's probably not optimal, while if
> you need frequent access to old or deleted files it's fine.

If you want to do real backups you should use real tools, like bacula
etc.

> Now you can do an incremental (since last full or incremental) or
> partial (since last full):
>
>    touch bkup_incr_new
>    timestamp=$(date +%Y%m%d-%T)
>    find /home -cnewer bkup_incr | cpio -o -Hcrc |
>      gzip -3 >/mnt/USBbkup/incr-$timestamp &&
>      mv -f bkup_incr_new bkup_incr
>
>    timestamp=$(date +%Y%m%d-%T)
>    find /home -cnewer bkup_full | cpio -o -Hcrc |
>      gzip -3 >/mnt/USBbkup/part-$timestamp

Now have Johnny Loser downloading some stuff, say:

$ wget -N ftp://ftp.kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.tar.gz

Are you aware that this file will never be backed up by your script?

Also, what about permission / owner changes etc.?

A backup tool should never work based on timestamps alone.
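
A small sketch for checking how find's timestamp tests treat a file like the one in the wget example, assuming GNU touch and find. The function name and the date are made up for illustration. -newer compares modification times, while -cnewer compares the file's status-change time, which is updated when the file is created or when its mtime, owner, or permissions are set.

```shell
#!/bin/sh
# Hypothetical demo: a file created after the marker but stamped with an old
# mtime, counted once with -newer (mtime) and once with -cnewer (ctime).
demo_newer_vs_cnewer() {
    dir=$(mktemp -d) || return 1

    touch "$dir/ref"                 # marker left by the "last backup"
    sleep 1                          # make sure the next file is clearly later

    echo data > "$dir/old-mtime"
    # Set an arbitrary mtime in the past, as a tool restoring the
    # server-side timestamp would do. Requires GNU touch for -d.
    touch -d '2005-06-17 12:00' "$dir/old-mtime"

    printf 'find -newer:  %s file(s)\n' \
        "$(find "$dir" -type f -newer "$dir/ref" | wc -l)"
    printf 'find -cnewer: %s file(s)\n' \
        "$(find "$dir" -type f -cnewer "$dir/ref" | wc -l)"

    rm -rf "$dir"
}
```

Calling demo_newer_vs_cnewer on a typical Linux filesystem reports 0 files for -newer and 1 file for -cnewer.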

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk  Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [EMAIL PROTECTED]
All he had was nothing, but that was something, and now it  had  been
taken away. - Terry Pratchett, _Sourcery_


Re: Backups w/ rsync

2007-09-28 Thread Michal Soltys

Goswin von Brederlow wrote:
> I was thinking Michal Soltys meant it this way. You can probably
> replace the cp invocation with an rsync one but that hardly changes
> things.
>
> I don't think you can do this in a single rsync call. Please correct
> me if I'm wrong.

Something along these lines:

rsync <other options> --link-dest /backup/2007-01-01/ \
rsync://[EMAIL PROTECTED]/module /backup/2007-01-02/

It will create a backup of .../module in ...-02, hardlinking to ...-01 (if
possible).


So, no need for cp -l. There's a similar example in the rsync man page. Also,
multiple --link-dest options are supported.
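
A minimal local-directory sketch of the same idea, wrapped in a function so the first run (no previous snapshot) is handled too. The name snapshot_link_dest and its arguments are assumptions for illustration, and plain directories stand in for the rsync:// module used above.

```shell
#!/bin/sh
# Hypothetical helper: make DEST a snapshot of SRC. Files identical to the
# ones in PREV (optional) become hard links instead of new copies.
# PREV should be absolute: a relative --link-dest is taken relative to DEST.
snapshot_link_dest() {
    src=$1 dest=$2 prev=${3:-}

    if [ -n "$prev" ] && [ -d "$prev" ]; then
        rsync -a --link-dest="$prev" "$src"/ "$dest"/
    else
        rsync -a "$src"/ "$dest"/    # first run: plain full copy
    fi
}
```

For example, snapshot_link_dest /home /backup/2007-01-02 /backup/2007-01-01. Files deleted from the source are simply absent from the new snapshot, while the old snapshot still holds them.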




Re: Backups w/ rsync

2007-09-28 Thread Jon Nelson
Please note: I'm having trouble w/gmail's formatting... so please
forgive this if it looks horrible. :-|

On 9/28/07, Bill Davidsen [EMAIL PROTECTED] wrote:
> Dean S. Messing wrote:
>> It has been some time since I read the rsync man page.  I see that
>> there is (among the bazillion and one switches) a --link-dest=DIR
>> switch which I suppose does what you describe.  I'll have to
>> experiment with this and think things through.  Thanks, Michal.
>
> Be aware that rsync is useful for making a *copy* of your files, which
> isn't always the best backup. If the goal is to preserve data and be
> able to recover in time of disaster, it's probably not optimal, while if
> you need frequent access to old or deleted files it's fine.


You are absolutely right when you say it isn't always the best backup. There
IS no 'best' backup.

> For example, full and incremental backup methods such as dump and
> restore are usually faster to take and restore than a copy, and allow
> easy incremental backups.


If copy meant full data copy and not hard link where possible, I'd
agree with you. However...

I use a nightly rsync (with --link-dest) to back up more than 40 GiB to a
drbd-backed drive. I'll explain why I use drbd in just a moment.

Technically, I have a 3 disk raid5 (Linux Software Raid) which is the
primary store for the data. Then I have a second drive (non-raid) that is
used as a drbd backing store, which I rsync *to* from filesystems built off
of the raid. I keep *30 days* of nightly backups on the drbd volume. The
average difference between nightly backups is about 45MB, or a bit less than
10%. The total disk usage is (on average) about 10% more than a single
backup. On an AMD x86-64 dual core (3600 de-clocked to run at 1GHz) the
entire process takes between 1 and 2 minutes, from start to finish.

Using hard links means I can snapshot ~175,000 files, about 40GiB, in under
2 minutes - something I'd have a hard time doing with dump+restore. I could
easily make incremental or differential copies, and maybe even in that time
frame, but I'm not sure I see much advantage in that. Furthermore, as you state,
dump+restore does *not* include the removal of files, which for some
scenarios is a huge deal.

The long and short of it is this: using hard links (via rsync or cp or
whatever) to do snapshot backups can be really, really fast and have
significant advantages but there are, as with all things, some downsides.
Those downsides are fairly easily mitigated, however. In my case, I can lose
1 drive of the raid and I'm OK. If I lose 2, then the other drive (not part
of the raid) has the data I care about. If I lose the entire machine, the
*other* machine (the other end of the drbd, only woken up every other day or
so) has the data. Going back 30 days. And a bare-metal restore is as fast
as your I/O is.  I back my /really/ important stuff up on DLT.

Thanks again to drbd, when the secondary comes up it communicates with the
primary and is able to figure out only which blocks have changed and only
copies those. On a nightly basis that is usually a couple of hundred
megabytes, and at 12MiB/s that doesn't take terribly long to take care of.

-- 
Jon


Re: Backups w/ rsync

2007-09-28 Thread Jon Nelson
On 9/28/07, Bill Davidsen [EMAIL PROTECTED] wrote:
> What I don't understand is how you use hard links... because a hard link
> needs to be in the same filesystem, and because a hard link is just
> another pointer to the inode and doesn't make a physical copy of the
> data to another device or to anywhere, really.

Yes, I know how hard links work. There is (one) physical copy of the
data when it goes from the filesystem on the raid to the filesystem on
the drbd. Subsequent copies of the same file, assuming the file has
not changed, are all hard links on the drbd-backed filesystem. Thus, I
have one *physical* copy of the data and a whole bunch of hard links.
Now, since I'm using drbd I actually have *two* physical copies (for a
total of three if you include the original) because the *other*
machine has a block-for-block copy of the drbd device (or it did, as
of a few days ago).

link-dest basically works like this:

Assuming we are going to copy (using that word loosely here) file
A from /source to /dest/backup.tmp/, and we've told rsync that
/dest/backup.1/A might exist:


If /dest/backup.1/A does not exist: make a physical copy from
/source/A to /dest/backup.tmp/A.
If it does exist, and the two files are considered identical, simply
hardlink /dest/backup.tmp/A to /dest/backup.1/A.
When all files are copied, move every /dest/backup.N (N is a number)
to /dest/backup.N+1
If /dest/backup.31 exists, delete it.
Move /dest/backup.tmp to /dest/backup.1 (which was just renamed /dest/backup.2)

I can do all of this, for 175K files (40G), in under 2 minutes on
modest hardware.
I end up with:
1+1 physical copies of the data (local drbd copy and remote drbd copy)
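
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the script actually in use: the function name, the keep parameter (30 by default, matching the 30 days mentioned), and the requirement that DEST be an absolute path are all choices made for the sketch.

```shell
#!/bin/sh
# Hypothetical helper: snapshot SRC into DEST/backup.1, keeping at most KEEP
# numbered snapshots. DEST must be absolute so --link-dest resolves correctly.
rotate_and_snapshot() {
    src=$1 dest=$2 keep=${3:-30}

    rm -rf "$dest/backup.tmp"
    if [ -d "$dest/backup.1" ]; then
        # Unchanged files are hardlinked to backup.1, changed ones are copied.
        rsync -a --link-dest="$dest/backup.1" "$src"/ "$dest/backup.tmp"/ || return 1
    else
        rsync -a "$src"/ "$dest/backup.tmp"/ || return 1
    fi

    # Shift backup.N to backup.N+1, highest number first so nothing is
    # overwritten.
    n=$keep
    while [ "$n" -ge 1 ]; do
        if [ -d "$dest/backup.$n" ]; then
            mv "$dest/backup.$n" "$dest/backup.$((n + 1))" || return 1
        fi
        n=$((n - 1))
    done

    # Drop the snapshot that fell off the end, then publish the new one.
    rm -rf "$dest/backup.$((keep + 1))"
    mv "$dest/backup.tmp" "$dest/backup.1"
}
```

Building the new snapshot in backup.tmp first means an interrupted run leaves the numbered snapshots as they were.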

There is more but if I may suggest: if you want more details contact
me off-line, I'm pretty sure the linux-raid folks couldn't care less
about rsync and drbd.
-- 
Jon


Re: Backups w/ rsync

2007-09-28 Thread Bill Davidsen

Dean S. Messing wrote:
> It has been some time since I read the rsync man page.  I see that
> there is (among the bazillion and one switches) a --link-dest=DIR
> switch which I suppose does what you describe.  I'll have to
> experiment with this and think things through.  Thanks, Michal.
  


Be aware that rsync is useful for making a *copy* of your files, which 
isn't always the best backup. If the goal is to preserve data and be 
able to recover in time of disaster, it's probably not optimal, while if 
you need frequent access to old or deleted files it's fine.


For example, full and incremental backup methods such as dump and 
restore are usually faster to take and restore than a copy, and allow 
easy incremental backups.


Consider:

   touch bkup_full_new
   timestamp=$(date +%Y%m%d-%T)
   find /home -depth | cpio -o -Hcrc |
     gzip -3 >/mnt/USBbkup/full-$timestamp &&
     mv -f bkup_full_new bkup_full && touch bkup_incr

Now you can do an incremental (since last full or incremental) or 
partial (since last full):


   touch bkup_incr_new
   timestamp=$(date +%Y%m%d-%T)
   find /home -cnewer bkup_incr | cpio -o -Hcrc |
     gzip -3 >/mnt/USBbkup/incr-$timestamp &&
     mv -f bkup_incr_new bkup_incr

   timestamp=$(date +%Y%m%d-%T)
   find /home -cnewer bkup_full | cpio -o -Hcrc |
     gzip -3 >/mnt/USBbkup/part-$timestamp

The advantage of the incr is that files are smaller, the advantage of
partial is that you only restore full+part (two total), and the
advantage of rsync is that deleted files will really be deleted (that's
why I call it a copy, not a backup).


Hope this is useful.

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Backups w/ rsync

2007-09-28 Thread Goswin von Brederlow
Michael Tokarev [EMAIL PROTECTED] writes:

> Dean S. Messing wrote:
>> Michal Soltys writes:
> []
>> :  Rsync is a fantastic tool for incremental backups. Everything that didn't
>> :  change can be hardlinked to the previous entry. And time of performing the
>> :  backup is pretty much negligible. Essentially - you have the equivalent of
>> :  full backups at almost minimal time and space cost possible.
>>
>> It has been some time since I read the rsync man page.  I see that
>> there is (among the bazillion and one switches) a --link-dest=DIR
>> switch which I suppose does what you describe.  I'll have to
>> experiment with this and think things through.  Thanks, Michal.
>
> I haven't actually read the rsync manpage to this detail, but I
> do use rsync for backups this way, but a bit differently - yet
> more understandable without referring to manpages... ;)
>
> the procedure is something like this:
>
>   cd /backups
>   rm -rf tmp/
>   cp -al $yesterday tmp/
>   rsync -r --delete -t ... /filesystem tmp
>   mv tmp $today
>
> That is, link the previous backup to temp (which takes no space
> except directories), rsync current files to there (rsync will
> break links for changed files), and rename temp to $today.

I was thinking Michal Soltys meant it this way. You can probably
replace the cp invocation with an rsync one but that hardly changes
things.

I don't think you can do this in a single rsync call. Please correct
me if I'm wrong.

MfG
Goswin



Re: Backups w/ rsync

2007-09-28 Thread Dean S. Messing

Michael Tokarev writes:
: Dean S. Messing wrote:
:  Michal Soltys writes:
: []
:  :  Rsync is a fantastic tool for incremental backups. Everything that didn't
:  :  change can be hardlinked to the previous entry. And time of performing the
:  :  backup is pretty much negligible. Essentially - you have the equivalent of
:  :  full backups at almost minimal time and space cost possible.
:  
:  It has been some time since I read the rsync man page.  I see that
:  there is (among the bazillion and one switches) a --link-dest=DIR
:  switch which I suppose does what you describe.  I'll have to
:  experiment with this and think things through.  Thanks, Michal.
: 
: I haven't actually read the rsync manpage to this detail, but I
: do use rsync for backups this way, but a bit differently - yet
: more understandable without referring to manpages... ;)
: 
: the procedure is something like this:
: 
:   cd /backups
:   rm -rf tmp/
:   cp -al $yesterday tmp/
:   rsync -r --delete -t ... /filesystem tmp
:   mv tmp $today
: 
: That is, link the previous backup to temp (which takes no space
: except directories), rsync current files to there (rsync will
: break links for changed files), and rename temp to $today.

Very nice.  The breaking of the hardlink is the key.  I wondered about
this when Michal mentioned using rsync yesterday.  I just tested the idea. It
does indeed work.

One question: why do you not use -a instead of -r -t?  It would
seem that one would want to preserve permissions, and group and user
ownerships.  Also, is there a reason to _not_ preserve sym-links
in the backup.  Your script appears to copy the referent.

Dean

P.S.  I think this thread has wandered from the topic of linux-raid.
  I'm happy to cease and desist if this Off Topic discussion
  offends.


Backups w/ rsync (was: Help: very slow software RAID 5.)

2007-09-27 Thread Dean S. Messing

Michal Soltys writes:
:  Dean S. Messing wrote:
:   
:   I don't see how one would do incrementals.  My backup system
:   currently does a monthly full backup, a weekly level 3 (which
:   saves everything that has changed since the last level 3 a week ago) and
:   daily level 5's (which save everything that changed today).
:   
:  
:  Rsync is a fantastic tool for incremental backups. Everything that didn't
:  change can be hardlinked to the previous entry. And time of performing the
:  backup is pretty much negligible. Essentially - you have the equivalent of
:  full backups at almost minimal time and space cost possible.

It has been some time since I read the rsync man page.  I see that
there is (among the bazillion and one switches) a --link-dest=DIR
switch which I suppose does what you describe.  I'll have to
experiment with this and think things through.  Thanks, Michal.

Dean

P.S. I changed the Subject: to reflect the new subject. Not sure if
that starts a new thread or not.