Hi,

On Thu, 2005-11-17 at 13:27 -0500, Ken Long wrote:
> I have been trying to get BackupPC implemented on my network and have
> had reasonable success other than with one server.
> 
> Unfortunately, this is my primary file server!  
> 
> Both the BackupPC machine and the file server are running Debian Linux
> (Sarge) and I am using rsync for the backups.  Filesystems on both sides
> are xfs.
> 
> What happens is that when I try to back up the entire machine, the
> backup runs a while and then hangs.  It will eventually fail when it
> hits the timeout, but looking at the time stamps on the log files, I can
> see what time it quit doing any thing.  There does not seem to be a
> rhyme or reason to how long it runs (anywhere from 2-4 hours) or where
> it is at in the directory tree when it fails. When I look at the
> contents of the latest XferLOG.z, here is what I see at the end of it:
> 
>   pool     670  1090/100        8704
> everyone/V_Reagan/StoreRoom/Mombalan.doc
>   pool     670  1090/100       11264
> everyone/V_Reagan/StoreRoom/Momlocat.doc
>   pool     670  1090/100       13312 everyone/V_Reagan/StoreRoom/Momm
> 
> 
> Notice that it chops the name of the file in the last line and then just
> quits updating anything.

The chopped line could just be log buffering, though I'm not familiar
enough with how BackupPC handles logging to be sure. If it is, the
actual problem may be several files or directories past this one.

Several things come to mind:
- Does the filename that gets chopped contain "weird" characters?
- File size can be an issue, though not for the files shown here. Maybe
a very large one comes a few files after this one?
- Can backuppc read the file?
- Does the backup complete correctly if you temporarily remove that
file?
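
The first and third checks in the list above can be scripted. This is
only a rough sketch, assuming Python is available on the client; the
function names and the definition of "weird" (control characters,
non-ASCII characters, leading or trailing whitespace) are my own
assumptions, not anything BackupPC or rsync defines. Run it as the same
user rsync runs as, otherwise the readability check tells you nothing
(root can read almost everything).

```python
import os


def is_weird_name(name):
    """Return True for names with control characters, non-ASCII
    characters, or leading/trailing whitespace (an assumed definition)."""
    if name != name.strip():
        return True
    return any(ord(ch) < 32 or ord(ch) > 126 for ch in name)


def scan_tree(root):
    """Walk root and return two lists of paths:
    entries with weird names, and regular files the current user cannot read."""
    weird, unreadable = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if is_weird_name(name):
                weird.append(os.path.join(dirpath, name))
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.access(path, os.R_OK):
                unreadable.append(path)
    return weird, unreadable
```

Calling scan_tree("/datadrive/shared") would give you both lists to look
through before you start removing files by hand.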

> rsync is still running on the client, but not producing anything.
> 
> I tried running a strace on rsync and here is what I saw:
> 
> )  = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}

That just means rsync is waiting for data; nothing is happening. You'll
probably see lots and lots of those lines.
What are the last few lines before the timeouts appear? You can use the
-o flag for strace to redirect the output to a file. Something like:
  strace -o output.txt BackupPC_dump -v -f <hostname>
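
Once you have output.txt, the interesting part is whatever comes right
before the long run of timeouts. A small sketch for pulling that out,
assuming each idle line contains both "select(" and "(Timeout)" as in
your paste (the function name and the treatment of an unfinished final
call are my assumptions):

```python
def last_activity(lines, context=5):
    """Return up to `context` lines that precede the trailing run of
    idle select() calls in a list of strace output lines."""

    def is_idle(line):
        # Assumed shape of an idle line: a select() call that timed out,
        # or a final select() that strace reports as unfinished.
        if "select(" not in line:
            return False
        return "(Timeout)" in line or "<unfinished" in line

    end = len(lines)
    # Skip backwards over blank lines and idle select() calls.
    while end > 0 and (not lines[end - 1].strip() or is_idle(lines[end - 1])):
        end -= 1
    return lines[max(0, end - context):end]
```

You would feed it the file contents, for example
last_activity(open("output.txt").read().splitlines()), and post the
result to the list.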

> (I've never used strace before, so don't really know what that means,
> but I saw where someone suggest to run it for someone else.)

Strace is an insanely useful tool to debug all kinds of trouble.

> The partition that it hangs on is my /datadrive partition and here is
> the stats on it:
> 
> /dev/sda3            581506104 336361120 245144984  58% /datadrive
> 
> So, I am using about 336GB out of 581GB available.  
> Underneath /datadrive, I have the following directories that I am
> backing up:
> 
> home
> public
> shared
> www
> 
> If I select /datadrive to backup in my client.pl file, it will hang.
> If I select /datadrive and exclude a little, small directory from
> underneath shared (holds about 3GB), it will succeed.  (note that the
> file in XferLog.z above that it stopped on is NOT in the directory that
> I exclude when I do this)

The mismatch between where the log stops and the directory you exclude
is probably due to buffering. But it looks like somewhere in that small
directory is a file or directory BackupPC doesn't like. That could be a
large file (I've personally noticed some trouble with files larger than
about 500 MB), a file with weird characters in its name, or a
filesystem irregularity (fsck it).
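
To see whether the large-file theory applies, you could list everything
over a size threshold in the excluded directory. A minimal sketch,
again assuming Python on the client; the 500 MB default only mirrors my
rough observation above and is not a documented BackupPC limit, and the
function name is made up:

```python
import os
import stat


def large_files(root, min_bytes=500 * 1024 * 1024):
    """Return (size, path) pairs for regular files under root that are
    at least min_bytes in size, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat so symlinks are not followed
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size >= min_bytes:
                found.append((st.st_size, path))
    return sorted(found, reverse=True)
```

If this turns up one or two candidates, excluding just those files from
the backup is a quicker test than excluding the whole directory.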

> Thinking size issues, I have, of course, increased the timeout, but that
> hasn't helped.  (besides that, it hangs long before it would ever time
> out)
> I also tried splitting things up and defining backups for
> '/datadrive/home', '/datadrive/www', '/datadrive/public', and
> '/datadrive/shared'.  Thinking that would split things up and make the
> file list smaller for the files being copied.  It still hangs
> on /datadrive/shared.
> 
> Does anyone have any ideas what might be happening here?

Your server is probably fine. This smells like something on the client
or in the communication between the client and the server.

> Any help is greatly appreciated!!!
> 
> -Ken

Hth,

-- 
Guus Houtzager                           Email: [EMAIL PROTECTED]
PGP fingerprint = 5E E6 96 35 F0 64 34 14  CC 03 2B 36 71 FB 4B 5D
Early to rise, early to bed, makes a man healthy, wealthy and dead.
        --Rincewind, The Light Fantastic



