Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-26 Thread Carl Wilhelm Soderstrom
On 02/24 11:09 , Les Mikesell wrote:
   That means there is no meaningful way of deleting an older
  backup, as the parent files may be lost, rendering future links useless?
 
 On unix filesystems, the contents are not removed until the last link is 
 deleted and no process has the file open.  

Looked at another way, a hard link is just another name for the file, exactly
like the first one it was given.  A hard link is a directory entry, in exactly
the same way that the first name for the file was.
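
A quick way to see this for yourself (plain Perl, nothing BackupPC-specific;
the file names are just examples):

    use strict;
    use warnings;

    # Create a file and give it a second name (a hard link).
    open(my $fh, '>', 'original') or die "open: $!";
    print {$fh} "some contents\n";
    close($fh);
    link('original', 'second-name') or die "link: $!";

    # Both names point at the same inode, so the link count is now 2.
    printf "link count: %d\n", (stat 'original')[3];

    # Removing the first name does not remove the data...
    unlink('original') or die "unlink: $!";

    # ...it is still reachable through the remaining name.
    open(my $in, '<', 'second-name') or die "open: $!";
    print scalar <$in>;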

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com



Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-24 Thread John Pettitt
Jason B wrote:
  close to 
 the same way as an incremental, except it's more useful, so to say?

 Incidentally, unrelated, but something that's been bugging me for a while: 
 subsequent full backups hardlink to older ones that have the true copy of the 
 file, correct? That means there is no meaningful way of deleting an older 
 backup, as the parent files may be lost, rendering future links useless?

   
Not quite - if they were symlinks that would be true, but BackupPC uses
hard links. With a hard link the underlying inode (which describes the
file data) persists as long as there is at least one link to it. When
old backups get purged, what really happens is they get unlinked (doing
rm -rf on a numbered backup in the directory for an individual PC has
the same effect). If all the numbered backups that reference a file
get removed, then only a single link (from the pool tree) will remain.
The nightly cleanup code looks for files with one link and removes
them. So you can safely delete older backups knowing that only files
that are unique to that backup will disappear (and also knowing that you
won't get the disk space back until after the nightly cleanup).
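
Conceptually, that nightly pass amounts to something like the sketch below
(illustrative only -- the real BackupPC_nightly also deals with hash chains,
pool statistics and so on, and the pool path is just an example):

    use strict;
    use warnings;
    use File::Find;

    my $cpool = '/var/lib/backuppc/cpool';   # example pool location

    find(sub {
        return unless -f $_;
        # Link count 1 means no backup references this pool file any more.
        my $nlink = (lstat $_)[3];
        unlink $_ if defined $nlink && $nlink == 1;
    }, $cpool);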

John



Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-24 Thread Les Mikesell
Jason B wrote:

 3.) Rsync(d) full backups go to more trouble to determine what has changed,
 meaning they're more expensive in terms of CPU time and disk I/O, but
 they'll catch changes incrementals may have missed. That means they're
 vital every now and then, supposing you want a meaningful backup of your
 data.
 
 In that case, though, what advantage is there to running incrementals vs 
 fulls? 
 The server load? To me, a full backup implies a complete re-transfer of all 
 files, but you are saying a rsync(d) full backup, in effect, functions close 
 to 
 the same way as an incremental, except it's more useful, so to say?

There are two differences.  One is that a full does a full block checksum
comparison of the files at both ends, which may take a lot longer even
though it doesn't use much more bandwidth than the incrementals that
skip files where the timestamp and length match.  The other is that it
completely populates the backup directory, which can then be used as the
basis for subsequent incrementals.
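
The difference in effort boils down to something like this (a rough Perl
illustration of the two kinds of check, not rsync's actual rolling/block
checksum protocol):

    use strict;
    use warnings;
    use Digest::MD5;

    # Incremental-style quick check: same size and modification time.
    sub looks_unchanged {
        my ($local, $remote) = @_;
        my @l = stat $local  or return 0;
        my @r = stat $remote or return 0;
        return $l[7] == $r[7] && $l[9] == $r[9];   # size, mtime
    }

    # Full-style check: actually read and compare the contents.
    sub contents_match {
        my ($local, $remote) = @_;
        my $digest = sub {
            open(my $fh, '<', $_[0]) or return '';
            binmode $fh;
            return Digest::MD5->new->addfile($fh)->hexdigest;
        };
        return $digest->($local) eq $digest->($remote);
    }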

 Incidentally, unrelated, but something that's been bugging me for a while: 
 subsequent full backups hardlink to older ones that have the true copy of the 
 file, correct?

Not exactly.  All of the copies end up linked to a common file in the 
cpool directory.  The fact that they are linked to each other is 
incidental - all identical files will be linked, not just ones that 
match previous backups of the same file.  The filename in the cpool
directory is a hash value used as a quick way to find the matches.
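
The general idea is a content hash used as a pool filename, roughly as in the
sketch below. This is simplified: BackupPC's real pool hash is a partial-file
MD5 that also folds in the file length, and collisions are handled with hash
chains.

    use strict;
    use warnings;
    use Digest::MD5;
    use File::Spec;

    # Map a file's contents to a candidate path in the pool tree.
    sub pool_candidate {
        my ($file, $pool_root) = @_;
        open(my $fh, '<', $file) or die "$file: $!";
        binmode $fh;
        my $hash = Digest::MD5->new->addfile($fh)->hexdigest;
        # Fan the pool out into subdirectories, e.g. cpool/a/b/c/abc123...
        return File::Spec->catfile($pool_root,
                                   (map { substr($hash, $_, 1) } 0 .. 2),
                                   $hash);
    }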

  That means there is no meaningful way of deleting an older
 backup, as the parent files may be lost, rendering future links useless?

On unix filesystems, the contents are not removed until the last link is 
deleted and no process has the file open.  Thus you can remove any 
individual backup without affecting any of the others.  You don't get 
the space back until the nightly cleanup job runs, removing the links in 
the cpool directory where the link count is 1 (meaning no backups still 
use it).

-- 
   Les Mikesell
[EMAIL PROTECTED]




Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-24 Thread Ambrose Li
On 24/02/07, Jason B [EMAIL PROTECTED] wrote:
 Incidentally, unrelated, but something that's been bugging me for a while:
 subsequent full backups hardlink to older ones that have the true copy of the
 file, correct? That means there is no meaningful way of deleting an older
 backup, as the parent files may be lost, rendering future links useless?

Not exactly. BackupPC always keeps the files hardlinked to a file in the cpool
directory, so the parent files should never get lost.

-- 
cheers,
-ambrose

Don't trust everything you read in Wikipedia.



Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-21 Thread Nils Breunese (Lemonbit)

Les Mikesell wrote:

 Apologies for the relatively long email, but I figure it's better to
 give too much information than not enough. I've run into a bit of
 difficulty backing up a large directory tree that has me not being
 able to do a successful backup in over a month now. I'm attempting to
 back up about 70GB over the Internet with a 1 MB/sec connection (the
 time it takes doesn't really bother me, just want to do a full backup
 and then run incrementals all the time). However, the transfer always
 times out with signal=ALRM.

 ALRM should mean the server's $Conf{ClientTimeout} expired.  You may
 need to make it much longer. The time is supposed to mean inactivity but
 some circumstances make it the total time for a transfer to complete.

 signal=PIPE means the connection broke or the client side quit
 unexpectedly.

Although the ALRM and PIPE signals are probably technically correct
it might be clearer to use different terms/explanations in the
interface. I have the feeling not everyone understands these signals.


Nils Breunese.






Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-21 Thread Les Mikesell
Nils Breunese (Lemonbit) wrote:

 Apologies for the relatively long email, but I figure it's better to
 give too much information than not enough. I've run into a bit of
 difficulty backing up a large directory tree that has me not being
 able to do a successful backup in over a month now. I'm attempting to
 back up about 70GB over the Internet with a 1 MB/sec connection (the
 time it takes doesn't really bother me, just want to do a full backup
 and then run incrementals all  the time). However, the transfer always
 times out with signal=ALRM.

 ALRM should mean the server's $Conf{ClientTimeout} expired.  You may
 need to make it much longer. The time is supposed to mean inactivity but
 some circumstances make it the total time for a transfer to complete.

 signal=PIPE means the connection broke or the client side quit 
 unexpectedly.
 
 Although the ALRM and PIPE signals are probably technically correct it 
 might be clearer to use different terms/explanations in the interface. I 
 have the feeling not everyone understands these signals.

man signal
will show all the possibilities.  SIGPIPE isn't very clear because it 
really just means a child process terminated while the parent is still 
trying to communicate with it, but in this case the child is the ssh, 
rsync, or smbclient that is doing the transfer from the remote and the 
likely reasons are either a network problem or that the remote side 
terminated.
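
A toy example of how SIGPIPE shows up (plain Perl, not BackupPC code): the
parent keeps writing to a child that has already gone away.

    use strict;
    use warnings;
    use IO::Handle;

    $SIG{PIPE} = sub { die "caught SIGPIPE: the child is gone\n" };

    # 'false' exits immediately without reading its input, a bit like a
    # transfer child that dies because the network dropped.
    open(my $child, '|-', 'false') or die "cannot start child: $!";
    $child->autoflush(1);
    sleep 1;                           # give the child time to exit
    print {$child} "more data\n";      # this write raises SIGPIPE
    close($child);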

-- 
   Les Mikesell
[EMAIL PROTECTED]




Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-21 Thread Nils Breunese (Lemonbit)

Les Mikesell wrote:

 Although the ALRM and PIPE signals are probably technically correct
 it might be clearer to use different terms/explanations in the
 interface. I have the feeling not everyone understands these signals.

 man signal
 will show all the possibilities.  SIGPIPE isn't very clear because
 it really just means a child process terminated while the parent is
 still trying to communicate with it, but in this case the child is
 the ssh, rsync, or smbclient that is doing the transfer from the
 remote and the likely reasons are either a network problem or that
 the remote side terminated.

I know I can take a look at the man pages, but I still think it would
be better for the web interface to display something a bit clearer than
just the signal name.


Nils Breunese.






Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-21 Thread Les Mikesell
Nils Breunese (Lemonbit) wrote:
 Les Mikesell wrote:
 
 Although the ALRM and PIPE signals are probably technically correct 
 it might be clearer to use different terms/explanations in the 
 interface. I have the feeling not everyone understands these signals.

 man signal
 will show all the possibilities.  SIGPIPE isn't very clear because it 
 really just means a child process terminated while the parent is still 
 trying to communicate with it, but in this case the child is the ssh, 
 rsync, or smbclient that is doing the transfer from the remote and the 
 likely reasons are either a network problem or that the remote side 
 terminated.
 
 I know I can take a look at the man pages, but I still think it would 
 be better for the web interface to display something a bit clearer than 
 just the signal name.
 

I haven't looked at the code, but it probably just picks up the error or 
exit status and its description as returned by the operating system. 
Things like 'no space on device' or 'permission denied' are a little 
more understandable but there are a lot of possibilities for failure.
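
For what it's worth, decoding a child's wait status in Perl looks roughly like
this (illustrative only; not a claim about what the BackupPC code actually
does):

    use strict;
    use warnings;

    system('rsync', '--version');                 # any child command
    if ($? == -1) {
        print "failed to execute: $!\n";
    } elsif ($? & 127) {
        printf "child died with signal %d\n", $? & 127;
    } else {
        printf "child exited with status %d\n", $? >> 8;
    }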

-- 
   Les Mikesell
[EMAIL PROTECTED]



Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-21 Thread Holger Parplies
Hi,

Jason B wrote on 20.02.2007 at 20:28:59 [Re[2]: [BackupPC-users] Backing up 
large directories times out with signal=ALRM or PIPE]:
  [...] $Conf{ClientTimeout} will need to be at least 72000 [...]
 
 I see. I must've been misunderstanding the meaning of that setting -
 my original impression was that it be the time that it would wait, at
 most, if nothing is happening before it times out - I assumed that if
 files are being transferred, that is sufficient activity for it to
 keep re-setting that timer. [...]

that is the way it would ideally be supposed to work. Unfortunately it's
not really easy to implement, as the instance (i.e. process) *watching* the
transfer is not the one *doing* the transfer. Apparently, the tar and smb
transfer methods are a bit better than rsync(d) in that the alarm timer is
reset whenever (informational) output from the tar command is received. That
is not really much of an advantage, though, because you then depend on the
transfer time of the largest file instead of the total backup, and file
sizes probably vary more than total backup sizes.
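
The per-output reset described above amounts to something like this sketch
(the command name is a placeholder and this is not the actual BackupPC_dump
code):

    use strict;
    use warnings;

    my $timeout = 7200;                        # e.g. $Conf{ClientTimeout}
    $SIG{ALRM} = sub { die "no activity for $timeout seconds\n" };

    open(my $xfer, '-|', '/usr/local/bin/transfer-command')   # placeholder
        or die "cannot start transfer: $!";

    alarm($timeout);
    while (my $line = <$xfer>) {
        alarm($timeout);                       # activity seen: re-arm the timer
        # ... process the output line ...
    }
    alarm(0);                                  # transfer finished: disarm
    close($xfer);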

  You don't really want to do that, for various reasons.
 
 Would you suggest, in that case, to lower the frequency of
 incrementals, and raise the frequency of full backups? I was going on
 the idea of doing an incremental once every 2 days or so, and a full
 backup once a month (because of the size of the data and the
 persistent timeouts).

Well, you *wrote* you wanted no full backups at all. Whether one month is a
good interval for full backups or not really depends on your data, the
changes, your bandwidth, and your requirements. If you require an exact
backup that is at most a week old (meaning no missed changes are acceptable),
then you'll need a weekly full. If the same files change every day, your
incrementals won't grow as much as if different files change every day. If
the time a backup takes is unimportant, as long as it finishes within 24
hours, you can probably get away with longer intervals between full backups.
If bandwidth is more expensive than server load, you'll need shorter
intervals. You'll have to work out for yourself which interval best fits
your needs. I was just saying: no fulls and only incrementals won't work.
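
If you do settle on, say, an incremental every two days and a full once a
month, the corresponding config.pl knobs would look roughly like this. The
numbers are only an example; the fractional values follow the config.pl
convention of using slightly less than a whole number of days so backups
start at about the same time each day:

    $Conf{FullPeriod} = 27.97;    # days between full backups (~monthly)
    $Conf{IncrPeriod} = 1.97;     # days between incremental backups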

You can always configure monthly (automatic) full backups and then start one
by hand after a week. See how long that takes. Start the next one after a
further two weeks. See how long an interval you can get away with. Or watch how
long your incrementals are taking. BackupPC provides you with a lot of
flexibility.

Concerning the incremental backups: if you need (or want) a backup every two
days, then you should do one every two days. If that turns out to be too
expensive in terms of network bandwidth, you'll have to change something.
Doing *each backup as a full backup* (using rsync(d)!) will probably minimize
network utilisation at the expense of (much!) server load. Again: there's no
one-size-fits-all answer.

  Jason Hughes explained how to incrementally transfer such a structure using
  $Conf{BackupFilesExclude}. The important thing is that you need successful
  backups to avoid re-transferring data, even if these backups at first
  comprise only part of your target data. [...]
 
 What I currently have is a rsyncd share for about 10 - 12 different
 subdirectories (I drilled down a level with the expectation that
 splitting into separate shares might help with the timeouts; I have
 not considered the possibility of backing up separately, though).
 By that token, I would imagine that I just comment out the shares I
 don't need at present, and re-activate them once the backups are done,
 right? And once I've gone through the entire tree, just enable them
 all and hope for the best?

I'm not sure I understand you correctly.

The important thing seems to be: define your share as you ultimately want it
to be. Exclude parts of it at first (with $Conf{BackupFilesExclude}) to get
a successful backup. Altering $Conf{BackupFilesExclude} will not change your
backup definition, i.e. it will appear as if the share started off with a few
files and quickly grew from backup to backup. You can start a full backup by
hand every hour (after changing $Conf{BackupFilesExclude}) to get your share
populated, no need to wait for your monthly full :). Each successful full
backup (with fewer files excluded) will bring you nearer to your goal.
Each future full backup will be similar to the last of these steps: most of
the files are already on the backup server, and only a few percent need to be
transferred.
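
A hypothetical config.pl sketch of that approach (the share and directory
names are placeholders based on the tree layout described earlier; trim the
list after each successful full):

    $Conf{BackupFilesExclude} = {
        'mydata' => [                 # placeholder rsyncd share name
            '/articles/dir2',
            '/articles/dir3',
            '/images',
            # ...remove entries here as earlier full backups succeed...
        ],
    };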


If, in contrast, you start up with several distinct shares, you'll either
have to keep it that way forever, or re-transfer files, or do some other
magic to move them around the backups and hope everything goes well. It's
certainly possible, but it's not easy. Using $Conf{BackupFilesExclude} is easy,
and you can't do much wrong, as long as you finally end up excluding nothing
that you actually want backed up.

Regards,
Holger


Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-20 Thread Jason Hughes
Jason B wrote:
 However, the transfer always times out with signal=ALRM.
   
[...]
 Somewhat unrelated, but of all these attempts, it hasn't ever kept a
 partial - so it transfers the files, fails, and removes them. I have
 one partial from 3 weeks ago that was miraculously kept, so it keeps
 coming back to it.

 Would anybody have any ideas on what I can do? I've set
 $Conf{ClientTimeout} = 7200; in the config.pl... enabled
 --checksum-seed... disabled compression to rsync... no other ideas.
 Running BackupPC-3.0.0 final. I'm guessing the connection gets broken
 at some point (using rsyncd), but is there any way to make BackupPC
 attempt to reconnect and just continue from where it left off?
   

Not exactly.  It's a gripe that has come up before.  The way BackupPC 
works is by completing a job; anything incomplete is essentially thrown 
away the next time it runs.  You might try bumping up your ClientTimeout 
to a higher number, but chances are you're actually seeing the pipe 
break because the connection is cut, or TCP errors occur that prevent 
routing, or who knows what.  If you think about it, larger transfers are 
much more susceptible to this: there is a small chance that the 
connection is cut at any moment, so the longer it stays open the more 
likely it breaks.  Any unrecoverable transfer will tend toward 
impossible to complete as the transfer time increases.  :-(

 On a final note: interestingly, backups from the SAME physical host
 using a different hostname (to back up another, much smaller,
 virtualhost directory) work perfectly every day, never failed. So I'm
 guessing it's just having a problem with the size / # of files. What
 can I do?


   

I have a machine that has a lot of video (120 GB) across a wifi WDS link 
(half 802.11g speed, at best).  I could never get an initial backup to 
succeed, because it could take 30-50 hours.  What I did was set up 
excludes on tons of directories, so the first backup was very short.  I 
kicked it off manually and waited until it completed.  Then I removed 
one excluded directory and kicked off another.  BackupPC skips files 
that have been entered into the pool due to a completed backup, so it is 
kind of like biting off smaller pieces of a single larger backup.  
Repeat until all your files have made it into the pool.   At that point, 
your total backups will be very short and only include deltas.

Other people have had success with moving your server physically to the 
LAN of the client and doing the backup over a fast, stable connection, 
to populate the pool with files initially.  That may not be an option 
for you.

Good luck,
JH



Re: [BackupPC-users] Backing up large directories times out with signal=ALRM or PIPE

2007-02-20 Thread Holger Parplies
Hi,

Jason B wrote on 20.02.2007 at 21:28:43 [[BackupPC-users] Backing up large 
directories times out with signal=ALRM or PIPE]:
 I've run into a bit of
 difficulty backing up a large directory tree that has me not being
 able to do a successful backup in over a month now. I'm attempting to
 back up about 70GB over the Internet with a 1 MB/sec connection (the

if you really mean 8 MBit/s your backup will need about 20 hours to
complete, meaning $Conf{ClientTimeout} will need to be at least 72000 (if
you meant 128KB/s, it's obviously 8 times as much). Setting it to this value
or more is no problem. It just means, if a backup happens to get somehow
stuck, BackupPC will need that long to recover, possibly blocking other
backups for the time due to $Conf{MaxBackups}. That may or may not be a
problem for you in the long run, so you'll probably want to adjust it once
you've got a feeling for how long your backups take in the worst case.
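
In other words, 70 GB at roughly 1 MByte/s is about 70,000 seconds, close to
20 hours, so the per-host or global config.pl entry would be something like:

    $Conf{ClientTimeout} = 72000;   # seconds; err on the high side at first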

 time it takes doesn't really bother me, just want to do a full backup
 and then run incrementals all  the time).

You don't really want to do that, for various reasons.

1.) An incremental is based on the last full backup (or incremental of lower
level, to be exact). That means, everything changed since the last full
backup will be transfered on each incremental - more data from day to
day.
2.) In contrast to this, an rsync(d) full backup will also only transfer
files changed since the last full backup (i.e. ideally not more than an
incremental), but it will give you a new reference point, meaning future
incrementals transfer less data.
3.) Rsync(d) full backups go to more trouble to determine what has changed,
meaning they're more expensive in terms of CPU time and disk I/O, but
they'll catch changes incrementals may have missed. That means they're
vital every now and then, supposing you want a meaningful backup of your
data.

 The tree is approximately like this:
 
 - top level 1
 - articles
   - dir 1
 - subdirs 1 through 9
   - dir 2
 - subdirs 1 through 9
   etc until dir 9 (same subdir structure)
 - images
   - dir 1
 - subdirs 1 through 9
   - dir 2
 - subdirs 1 through 9
   etc until dir 9 (same subdir structure)
 - top level 4
 
 There are (on average) 5,000 files per directory (about 230,000 files
 in total).

Jason Hughes explained how to incrementally transfer such a structure using
$Conf{BackupFilesExclude}. The important thing is that you need successful
backups to avoid re-transferring data, even if these backups at first
comprise only part of your target data. It might be enough to split the
process into two parts by first excluding half of your toplevel directories
and then removing the excludes for the second run. You might even be able
to transfer everything at once by simply adjusting your $Conf{ClientTimeout}.
If in doubt, set the value way too high rather than slightly too low. You
can always adjust it after your first successful backup.

Regards,
Holger
