[BackupPC-users] RsyncP: BackupPC_dump consumes 100% CPU and progress stalls

2009-12-13 Thread Tim Connors
A few days ago, backuppc spontaneously started getting stuck on one
partition of a machine it had been backing up without trouble for 120
days.

The logfile gives:

2009-12-02 08:46:04 incr backup 122 complete, 38 files, 46425552 bytes, 0 
xferErrs (0 bad files, 0 bad shares, 0 other)
2009-12-03 08:24:08 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /
2009-12-03 08:27:03 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /var
2009-12-03 08:28:12 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /ssb_local
2009-12-03 08:31:26 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /opt
2009-12-03 08:31:47 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /scratch
2009-12-03 08:31:57 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /6dfdata
2009-12-03 08:32:08 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /6dfgs
2009-12-03 08:33:30 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /6dfdata2
2009-12-03 08:44:02 incr backup started back to 2009-11-29 08:18:01 (backup 
#119) for directory /usr/local
2009-12-03 08:44:58 incr backup 123 complete, 46 files, 129608707 bytes, 0 
xferErrs (0 bad files, 0 bad shares, 0 other)
2009-12-04 08:18:49 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /
2009-12-04 08:23:45 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /var
2009-12-04 08:25:01 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /ssb_local
2009-12-04 08:28:09 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /opt
2009-12-04 08:28:32 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /scratch
2009-12-04 08:28:41 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /6dfdata
2009-12-04 08:28:50 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /6dfgs
2009-12-04 08:29:21 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /6dfdata2
2009-12-04 08:40:29 incr backup started back to 2009-12-03 08:24:08 (backup 
#123) for directory /usr/local
2009-12-04 08:41:25 incr backup 124 complete, 37 files, 47108022 bytes, 0 
xferErrs (0 bad files, 0 bad shares, 0 other)
2009-12-05 08:18:02 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /
2009-12-05 08:21:32 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /var
2009-12-05 08:23:00 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /ssb_local
2009-12-05 08:26:45 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /opt
2009-12-06 04:27:03 Aborting backup up after signal ALRM
2009-12-06 04:27:05 Got fatal error during xfer (aborted by signal=ALRM)
2009-12-06 10:03:04 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /
2009-12-06 10:06:13 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /var
2009-12-06 10:07:57 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /ssb_local
2009-12-06 10:11:40 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /opt
2009-12-07 06:12:02 Aborting backup up after signal ALRM
2009-12-07 06:12:04 Got fatal error during xfer (aborted by signal=ALRM)
2009-12-07 08:21:50 incr backup started back to 2009-12-04 08:18:49 (backup 
#124) for directory /

It's a slowaris machine running rsync 2.6.8, and the backuppc server is
debian lenny, running backuppc 3.1.0.

A typical gdb trace on the runaway backuppc_dump process looks like this:

sudo gdb /usr/share/backuppc/bin/BackupPC_dump 20538
bt
#0  0x7f779bdd24c0 in perl_gthr_key_...@plt () from 
/usr/lib/perl5/auto/File/RsyncP/FileList/FileList.so
#1  0x7f779bdd67f2 in XS_File__RsyncP__FileList_get () from 
/usr/lib/perl5/auto/File/RsyncP/FileList/FileList.so
#2  0x7f779e526eb0 in Perl_pp_entersub () from /usr/lib/libperl.so.5.10
#3  0x7f779e525392 in Perl_runops_standard () from /usr/lib/libperl.so.5.10
#4  0x7f779e5205df in perl_run () from /usr/lib/libperl.so.5.10
#5  0x00400d0c in main ()

(The backtrace is always somewhere in RsyncP, unsurprisingly.)

It has no files open according to /proc/pid/fd, other than logfiles etc.

It's CPU bound and making no syscalls, so attaching strace to it yields
nothing (and running it under strace from the start wasn't any more
enlightening).

If I remove the /opt partition from the host's config one day and add it
back the next, it locks up again that next day while trying to do /opt.

If I delete backups 123 and 124 (and the newer copies made without /opt)
and modify the backups file to make 122 the last one, thus forcing it to
compare against backup 119 rather than 123, it still locks up on /opt.
($Conf{IncrLevels} is 1, 2, 3, 4, 5, 2, 3, 4, 5, 3, 4, 5, 4, 5.)
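
(For concreteness, "removing the /opt partition" above just means dropping it
from the per-host share list. An illustrative sketch, with the share names
taken from the log, not the real config file:

$Conf{RsyncShareName} = [
    '/', '/var', '/ssb_local', '/opt', '/scratch',
    '/6dfdata', '/6dfgs', '/6dfdata2', '/usr/local',
];

Removing the partition is just commenting '/opt' out of that list for a day.)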

-- 
TimC

Re: [BackupPC-users] rsync 3.0 incremental support?

2009-12-13 Thread Jeffrey J. Kosowsky
Robin Lee Powell wrote at about 15:07:04 -0800 on Saturday, December 12, 2009:
  
  It seems to me that rsync's memory bloat issues, which have been
  discussed here many times, would be basically fixed by making
  File::RsyncP and backuppc itself support rsync 3.0's incremental
  file transfer stuff.  Is anyone working on that?
  

I believe it already does, at least to some extent, automatically:
memory usage goes down when you upgrade the client to a 3.0-series rsync.
I believe that is separate from the protocol-30 issue, which is not
addressed by the current version of File::RsyncP.



Re: [BackupPC-users] Cannot start backuppc

2009-12-13 Thread Matthias Meyer
Robert J. Phillips wrote:

 My raid drive failed that stores all the data.  I have fixed this
 problem (rebuilt the raid and had to re-install the xfs file system).
 All the data is lost that was on the array.
 
  
 
 I am running the Beta 3.2.0 version of backuppc and I ran sudo perl
 configure.pl to let it rebuild the data folders.  When I try to start
 backuppc I get an error that it cannot create a test hardlink between a
 file in /mnt/backup/pc and /mnt/backup/cpool.
 
  
 
 How do I fix it??
It is a problem with your filesystem. BackupPC must be able to make
hardlinks between these two directories.
Are you sure these two directories are on the same drive/volume?
Try to make a hardlink between these two directories by hand.
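
For example, something quick like this (paths taken from the error message;
a rough check, adjust if your pool lives elsewhere):

perl -e '
    my ($src, $dst) = ("/mnt/backup/cpool/linktest", "/mnt/backup/pc/linktest");
    open my $fh, ">", $src or die "cannot create $src: $!\n";
    close $fh;
    link($src, $dst) ? print "hardlink OK\n" : print "hardlink failed: $!\n";
    unlink $src, $dst;
'

If it prints "hardlink failed: Invalid cross-device link", the two trees are
on different filesystems.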

br
Matthias
-- 
Don't Panic




Re: [BackupPC-users] Cannot start backuppc

2009-12-13 Thread Alan McKay
 How do I fix it??

In addition to the other response, start by showing us the output of

df /mnt/backup/cpool /mnt/backup/pc



-- 
“Don't eat anything you've ever seen advertised on TV”
 - Michael Pollan, author of In Defense of Food



Re: [BackupPC-users] Got fatal error during xfer

2009-12-13 Thread Robin Lee Powell
On Thu, Dec 03, 2009 at 08:13:47PM +, Tyler J. Wagner wrote:
 Are you sure this isn't a ClientTimeout problem?  Try increasing
 it and see if the backup runs for longer.

Just as a general comment (I've been reviewing all the SIGPIPE mails
and people keep saying that), no.  SIGPIPE means the remote rsync
stopped talking for some reason.  SIGALRM is what's generated when the
timeout expires.
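
(For the ALRM case, the relevant knob is $Conf{ClientTimeout}, which is in
seconds; the stock value is 72000 (20 hours), if I remember right. An
illustrative bump, in config.pl or the per-host config:

$Conf{ClientTimeout} = 48 * 3600;   # 48 hours; only affects the ALRM aborts

Raising it does nothing for SIGPIPE, since that's the remote end going away.)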

-Robin

-- 
They say:  The first AIs will be built by the military as weapons.
And I'm  thinking:  Does it even occur to you to try for something
other  than  the default  outcome?  See http://shrunklink.com/cdiz
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/



Re: [BackupPC-users] rsync 3.0 incremental support?

2009-12-13 Thread Robin Lee Powell
On Sun, Dec 13, 2009 at 03:46:50PM -0500, Jeffrey J. Kosowsky wrote:
 Robin Lee Powell wrote at about 15:07:04 -0800 on Saturday,
 December 12, 2009:
   
   It seems to me that rsync's memory bloat issues, which have
   been discussed here many times, would be basically fixed by
   making File::RsyncP and backuppc itself support rsync 3.0's
   incremental file transfer stuff.  Is anyone working on that?
   
 
 I believe it does already at least to some extent automatically
 since memory issues go down when you upgrade to series 3.0+ rsync.
 I believe that is separate from the protocol 30 issue which is not
 addressed by the current version of File::RsyncP

Good to know it's better, but it can't be doing the full incremental
version:

  Beginning with rsync 3.0.0, the recursive algorithm used is now an
  incremental scan that uses much less memory than before and begins the
  transfer after the scanning of the first few directories have been
  completed. This incremental scan only affects our recursion algorithm,
  and does not change a non-recursive transfer. It is also only possible
  when both ends of the transfer are at least version 3.0.0.

(says man rsync)

-Robin

-- 
They say:  The first AIs will be built by the military as weapons.
And I'm  thinking:  Does it even occur to you to try for something
other  than  the default  outcome?  See http://shrunklink.com/cdiz
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/



[BackupPC-users] An idea to fix both SIGPIPE and memory issues with rsync

2009-12-13 Thread Robin Lee Powell

I've only looked at the code briefly, but I believe this *should* be
possible.  I don't know if I'll be implementing it, at least not
right away, but it shouldn't actually be that hard, so I wanted to
throw it out so someone else could run with it if ey wants.

It's an idea I had about rsync resumption:

Keep an array of all the things you haven't backed up yet, starting
with the initial arguments; let's say we're transferring /a and
/b from the remote machine.

Start by putting a/ and b/ in the array.  Then get the directory
listing for a/, and replace a/ in the array with a/d, a/e, ...
for all files and directories in there.  When each file is
transferred, it gets removed.  Directories are replaced with their
contents.

If the transfer breaks, you can resume with that list of
things-what-still-need-transferring/recursing-through without having
to walk the parts of the tree you've already walked.

This should solve the SIGPIPE problem.  In fact, it could even deal
with incrementals from things like laptops: if you have settings for
NumRetries and RetryDelay, you could, say, retry every 60 seconds
for a week if you wanted.

On top of that, you could use the same retry system to
*significantly* limit the memory usage: stop rsyncing every N files
(where N is a config value).  If you only do, say, 1000 files at a
time, the memory usage will be very low indeed.
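
To make that concrete, here's a rough Perl sketch of the queue.  The helpers
remote_dirlist(), transfer_file(), and save_state() are hypothetical
placeholders, not existing BackupPC or File::RsyncP calls; they stand in for
whatever the real transfer plumbing would do:

use strict;
use warnings;

# Hypothetical stubs standing in for the real transfer plumbing.
sub remote_dirlist { my ($dir)   = @_; return (); } # would ask the remote rsync (-d) for $dir's entries
sub transfer_file  { my ($file)  = @_; }            # would transfer one file
sub save_state     { my ($queue) = @_; }            # would checkpoint the queue to disk for resuming

# Queue seeded with the initial shares; entries ending in "/" are directories
# still to be expanded, everything else is a file waiting to be transferred.
my @pending = ('a/', 'b/');

while (@pending) {
    my $item = shift @pending;
    if ($item =~ m{/$}) {
        unshift @pending, remote_dirlist($item);  # replace the dir with its contents
    } else {
        transfer_file($item);                     # finished files just fall off the queue
    }
    save_state(\@pending);   # if the transfer breaks, resume from whatever is still listed
}

The every-N-files idea is then just a counter around transfer_file() that
tears the connection down and lets the same resume path pick it back up.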

As I said, none of this should be *especially* hard to implement;
it's just changes to lib/BackupPC/Xfer/RsyncFileIO.pm and
lib/BackupPC/Xfer/Rsync.pm, and doing it would make BackupPC as
robust as a lot of commercial backup packages, on top of all its
current benefits.

(Note the -d option to rsync; you'd want to pass that to the
remote rsync to get the directory listings, I think.)

-Robin


-- 
They say:  The first AIs will be built by the military as weapons.
And I'm  thinking:  Does it even occur to you to try for something
other  than  the default  outcome?  See http://shrunklink.com/cdiz
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/



Re: [BackupPC-users] An idea to fix both SIGPIPE and memory issues with rsync

2009-12-13 Thread Jeffrey J. Kosowsky
Robin Lee Powell wrote at about 20:18:55 -0800 on Sunday, December 13, 2009:
  
  I've only looked at the code briefly, but I believe this *should* be
  possible.  I don't know if I'll be implementing it, at least not
  right away, but it shouldn't actually be that hard, so I wanted to
  throw it out so someone else could run with it if ey wants.
  
  It's an idea I had about rsync resumption:
  
  Keep an array of all the things you haven't backed up yet, starting
  with the inital arguments; let's say we're transferring /a and
  /b from the remote machine.
  
  Start by putting a/ and b/ in the array.  Then get the directory
  listing for a/, and replace a/ in the array with a/d, a/e, ...
  for all files and directories in there.  When each file is
  transferred, it gets removed.  Directories are replaced with their
  contents.
  
  If the transfer breaks, you can resume with that list of
  things-what-still-need-transferring/recursing-through without having
  to walk the parts of the tree you've already walked.
  
  This should solve the SIGPIPE problem.  In fact, it could even deal
  with incrementals from things like laptops: if you have settings for
  NumRetries and RetryDelay, you could, say, retry every 60 seconds
  for a week if you wanted.
  
  On top of that, you could use the same retry system to
  *significantly* limit the memory usage: stop rsyncing every N files
  (where N is a config value).  If you only do, say, 1000 files at a
  time, the memory usage will be very low indeed.
  

Unfortunately, I don't think it is that simple. If it were, rsync
would have been written that way back in version .001. There is a
reason that rsync memory usage increases as the number of files
increases (even in 3.0), and it is not due to memory leaks or ignorant
programmers. After all, your proposed fix is not exactly obscure.

At least one reason is the need to keep track of inodes so that hard
links can be copied properly. In fact, I believe that without the -H
flag, rsync memory usage scales much better. Obviously, if you break up
backups into smaller chunks or allow resumes without keeping track of
past inodes, then you have no way of tracking hard links across the
filesystem. Maybe you don't care; if so, you could probably do just
about as well by dropping the --hard-links argument from $Conf{RsyncArgs}.
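
For reference, that would be an edit to $Conf{RsyncArgs} in config.pl (or the
per-host file); a sketch based on roughly the stock 3.x argument list, which
may not match yours exactly:

$Conf{RsyncArgs} = [
    '--numeric-ids', '--perms', '--owner', '--group', '-D',
    '--links', '--times', '--block-size=2048', '--recursive',
    # '--hard-links',   # dropped so rsync stops tracking inodes; memory scales better
];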

I don't believe there is any easy way to get something for free here...



Re: [BackupPC-users] An idea to fix both SIGPIPE and memory issues with rsync

2009-12-13 Thread Shawn Perry
You can always run some sort of disk de-duplicator after you copy without -H.

On Sun, Dec 13, 2009 at 9:56 PM, Jeffrey J. Kosowsky
backu...@kosowsky.org wrote:
 Robin Lee Powell wrote at about 20:18:55 -0800 on Sunday, December 13, 2009:
  
   I've only looked at the code briefly, but I believe this *should* be
   possible.  I don't know if I'll be implementing it, at least not
   right away, but it shouldn't actually be that hard, so I wanted to
   throw it out so someone else could run with it if ey wants.
  
   It's an idea I had about rsync resumption:
  
   Keep an array of all the things you haven't backed up yet, starting
   with the inital arguments; let's say we're transferring /a and
   /b from the remote machine.
  
   Start by putting a/ and b/ in the array.  Then get the directory
   listing for a/, and replace a/ in the array with a/d, a/e, ...
   for all files and directories in there.  When each file is
   transferred, it gets removed.  Directories are replaced with their
   contents.
  
   If the transfer breaks, you can resume with that list of
   things-what-still-need-transferring/recursing-through without having
   to walk the parts of the tree you've already walked.
  
   This should solve the SIGPIPE problem.  In fact, it could even deal
   with incrementals from things like laptops: if you have settings for
   NumRetries and RetryDelay, you could, say, retry every 60 seconds
   for a week if you wanted.
  
   On top of that, you could use the same retry system to
   *significantly* limit the memory usage: stop rsyncing every N files
   (where N is a config value).  If you only do, say, 1000 files at a
   time, the memory usage will be very low indeed.
  

 Unfortunately, I don't think it is that simple. If it were, then rsync
 would have been written that way back in version .001. I mean there is
 a reason that rsync memory usage increases as the number of files
 increases (even in 3.0) and it is not due to memory holes or ignorant
 programmers. After all, your proposed fix is not exactly obscure.

 At least one reason is the need to keep track of inodes so that hard
 links can be copied properly. In fact, I believe that without the -H
 flag, rsync memory usage scales much better. Obviously if you break up
 backups into smaller chunks or allow resumes without keeping track of
 past inodes then you have no way of tracking hard links across the
 filesystem. Maybe you don't care but if so, you could probably do just
 about as well by dropping the --hard-links argument from RsyncArgs.

 I don't believe there is any easy way to get something for free here...


