Re: [BackupPC-users] Handling machines too large to back themselves up

2021-04-09 Thread Bowie Bailey via BackupPC-users
I don't know where the online documentation is offhand, although if you are
configuring BackupPC from the web interface, most of the setting names are
clickable links into the documentation.  You can click on "DumpPreUserCmd" to
see a general description of the setting along with some variables you can
use as arguments to your command.


I am attaching the script that I use on one of my servers where I have a
couple of separate backups.  I don't remember all of the reasoning behind
using a lock directory instead of a file, but the mkdir command is atomic,
which is the important thing.  If it is called as "DumpPreUserCmd", then it
attempts to create a lock directory.  If it is called with any other cmdType
(such as "DumpPostUserCmd"), it removes the lock directory.  The lock
directory is named for the host (which is set by ClientNameAlias, if it
exists), so only one backup per host will run.
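
The attached script doesn't come through in this archive, but a minimal
sketch along the lines described above might look like the following (the
lock directory location under /tmp and the wait-until-free loop are
assumptions, not necessarily what the real lockfile.sh does):

    #!/bin/sh
    # lockfile.sh -- serialize backups per host using an atomic mkdir lock
    # Called by BackupPC as: lockfile.sh $cmdType $host
    cmdType="$1"
    host="$2"
    lockdir="/tmp/backuppc-lock-$host"   # assumed location for the lock

    if [ "$cmdType" = "DumpPreUserCmd" ]; then
        # mkdir either creates the lock or fails atomically if it already
        # exists, so two backups aliased to the same host can't both start.
        until mkdir "$lockdir" 2>/dev/null; do
            sleep 60   # wait for the other backup of this host to finish
        done
    else
        # Any other cmdType (e.g. DumpPostUserCmd) releases the lock.
        rmdir "$lockdir"
    fi

A variant that exits non-zero instead of waiting would, together with
$Conf{UserCmdCheckStatus}, make BackupPC abort the overlapping backup rather
than queue behind it.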


I call it like this from the BackupPC settings:
    DumpPreUserCmd:  /home/backuppc/lockfile.sh $cmdType $host
    DumpPostUserCmd: /home/backuppc/lockfile.sh $cmdType $host

If you are editing the files directly, it will look like this:
    $Conf{DumpPreUserCmd}  = '/home/backuppc/lockfile.sh $cmdType $host';
    $Conf{DumpPostUserCmd} = '/home/backuppc/lockfile.sh $cmdType $host';



On 4/9/2021 3:01 AM, Dave Sherohman wrote:


Do you know offhand of any online documentation on using pre/postcmd in 
this way?  
Using that and ClientNameAlias could be a solution, although it always makes me 
uneasy to use any backup target other than the root of a filesystem, due to the 
possibility of things falling through the cracks if someone creates a new directory 
outside of any existing targets and forgets to add it to the list.



On 4/8/21 10:33 PM, Bowie Bailey via BackupPC-users wrote:
You can use the DumpPreUserCmd and DumpPostUserCmd settings to manage lockfiles 
and make sure backups from the aliased hosts cannot run at the same time.


You can separate them in the scheduler by manually starting them at different 
times, or by disabling the automatic backups and using cron to start the backups 
at particular times.



On 4/8/2021 9:47 AM, Mike Hughes wrote:

Hi Dave,

You can always break a backup job into multiple backup 'hosts' by using 
the ClientNameAlias setting. I create hosts based on the share or folder for each 
job, then use the ClientNameAlias to point them to the same host.


-
*From:* Dave Sherohman 
*Sent:* Thursday, April 8, 2021 8:22 AM
*To:* General list for user discussion, questions and support 


*Subject:* [BackupPC-users] Handling machines too large to back themselves up

I have a server which I'm not able to back up because, apparently, it's just 
too big.

If you remember me asking about synology's weird rsync a couple weeks ago,
it's that machine again.  We finally solved the rsync issues by ditching the
synology rsync entirely and installing one built from standard rsync source
code and using that instead.  Using that, we were able to get one "full"
backup, but it missed a bunch of files because we forgot to use sudo when we
did it.  (The synology rsync is set up to run suid root and is hardcoded to
not allow root to run it, so we had to take sudo out for that, then forgot to
add it back in when we switched to standard rsync.)


Since then, every attempted backup has failed, either full or incremental, 
because the synology is running out of memory:


This is the rsync child about to exec /usr/libexec/backuppc-rsync/rsync_bpc
Xfer PIDs are now 1228998,1229014
xferPids 1228998,1229014
ERROR: out of memory in receive_sums [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
[sender=3.2.0dev]
Done: 0 errors, 0 filesExist, 0 sizeExist, 0 sizeExistComp, 0 filesTotal, 0 
sizeTotal, 0 filesNew, 0 sizeNew, 0 sizeNewComp, 32863617 inode
rsync_bpc: [generator] write error: Broken pipe (32)

The poor little NAS has only 6G of RAM vs. 9.4 TB of files (configured as two
sharenames, /volume1 (8.5T) and /volume2 (885G)) and doesn't seem up to the
task of updating that much at once via rsync.


Adding insult to injury, even a failed attempt to back it up causes the bpc
server to take 45 minutes to copy the directory structure from the previous
backup before it even attempts to connect, and then 12-14 hours doing
reference counts after it finishes backing up nothing.  Which makes
trial-and-error painfully slow, since we can only try one thing, at most,
each day.


In our last attempt, I tried flipping the order of the RsyncShareNames to do
/volume2 first, thinking it might successfully back up the smaller share
before running out of memory trying to process the larger one.  It did not
run out of memory... but it did sit there for a full 24 hours with one CPU
(out of four) running pegged at 99% handling the rsync process before we
finally put it out of its misery.

Re: [BackupPC-users] Handling machines too large to back themselves up

2021-04-09 Thread Dave Sherohman

On 4/8/21 8:46 PM, Les Mikesell wrote:

On Thu, Apr 8, 2021 at 8:25 AM Dave Sherohman  wrote:

rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
[sender=3.2.0dev]
This is more about the number of files than the size of the drive.  Do
you happen to know if there are directories containing millions of
tiny files that could feasibly be archived as a tar or zip file
instead of stored separately?


I don't know details of the filesystem contents on this machine, but our 
earlier not-quite-full (some files missed because we didn't use sudo) 
backup contained 24,058,239 files according to the bpc host status page, 
so that is a possibility.
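
Something along these lines, run on the NAS itself, would show roughly where
those files are concentrated (this is just a sketch using the two share
paths, not anything from the thread):

    # Count regular files under each top-level directory of the two shares,
    # biggest offenders first.
    for d in /volume1/*/ /volume2/*/; do
        echo "$(find "$d" -xdev -type f 2>/dev/null | wc -l)  $d"
    done | sort -rn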


The synology's admin replied faster than I expected and says there's a 
directory where scanned files are dropped which "contains a lot of 
files", so I'm looking into whether that can be archived (or skipped - 
most of the scans we deal with tend to be temporary files that are 
discarded after we do OCR on them).
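
A rough sketch of the archive-it idea (the /volume1/scans path is purely
hypothetical; the directory would then also need to be excluded from the
rsync share, e.g. via $Conf{BackupFilesExclude}):

    # Roll the scanned-files directory into one compressed tar so rsync sees
    # a single large file instead of millions of tiny ones (paths made up).
    scans=/volume1/scans
    archive=/volume1/scan-archives/scans-$(date +%Y%m%d).tar.gz
    mkdir -p "$(dirname "$archive")"
    tar -czf "$archive" -C "$(dirname "$scans")" "$(basename "$scans")"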



Also, rsync versions newer than 3.x are supposed to handle it better.
  Is your server side extremely old?


The rsync on the synology NAS is version 3.0.9.  The bpc server is a 
brand-new Debian 11 install, with rsync 3.2.3.


The samba FAQ link mentions that the memory optimization "only works 
provided that both sides are 3.0.0 or newer and certain options that 
rsync currently can't handle in this mode are not being used." Any idea 
what those "certain options" might be?  The client-side rsync commands 
look pretty basic, so it's probably not using any of them, but it's a 
possibility.






Re: [BackupPC-users] Handling machines too large to back themselves up

2021-04-09 Thread Dave Sherohman
Do you know offhand of any online documentation on using pre/postcmd in 
this way?  Using that and ClientNameAlias could be a solution, although 
it always makes me uneasy to use any backup target other than the root 
of a filesystem, due to the possibility of things falling through the 
cracks if someone creates a new directory outside of any existing 
targets and forgets to add it to the list.



On 4/8/21 10:33 PM, Bowie Bailey via BackupPC-users wrote:
You can use the DumpPreUserCmd and DumpPostUserCmd settings to manage 
lockfiles and make sure backups from the aliased hosts cannot run at 
the same time.


You can separate them in the scheduler by manually starting them at 
different times, or by disabling the automatic backups and using cron 
to start the backups at particular times.



On 4/8/2021 9:47 AM, Mike Hughes wrote:

Hi Dave,

You can always break a backup job into multiple backup 'hosts' by 
using the ClientNameAlias setting. I create hosts based on the share 
or folder for each job, then use the ClientNameAlias to point them to 
the same host.



*From:* Dave Sherohman 
*Sent:* Thursday, April 8, 2021 8:22 AM
*To:* General list for user discussion, questions and support 

*Subject:* [BackupPC-users] Handling machines too large to back 
themselves up


I have a server which I'm not able to back up because, apparently, 
it's just too big.


If you remember me asking about synology's weird rsync a couple weeks 
ago, it's that machine again.  We finally solved the rsync issues by 
ditching the synology rsync entirely and installing one built from
standard rsync source code and using that instead.  Using that, we 
were able to get one "full" backup, but it missed a bunch of files 
because we forgot to use sudo when we did it.  (The synology rsync is 
set up to run suid root and is hardcoded to not allow root to run it, 
so we had to take sudo out for that, then forgot to add it back in 
when we switched to standard rsync.)


Since then, every attempted backup has failed, either full or 
incremental, because the synology is running out of memory:


This is the rsync child about to exec /usr/libexec/backuppc-rsync/rsync_bpc
Xfer PIDs are now 1228998,1229014
xferPids 1228998,1229014
ERROR: out of memory in receive_sums [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
[sender=3.2.0dev]
Done: 0 errors, 0 filesExist, 0 sizeExist, 0 sizeExistComp, 0 filesTotal, 0 
sizeTotal, 0 filesNew, 0 sizeNew, 0 sizeNewComp, 32863617 inode
rsync_bpc: [generator] write error: Broken pipe (32)

The poor little NAS has only 6G of RAM vs. 9.4 TB of files (configured as
two sharenames, /volume1 (8.5T) and /volume2 (885G)) and doesn't seem up to
the task of updating that much at once via rsync.


Adding insult to injury, even a failed attempt to back it up causes 
the bpc server to take 45 minutes to copy the directory structure 
from the previous backup before it even attempts to connect, and then 
12-14 hours doing reference counts after it finishes backing up 
nothing.  Which makes trial-and-error painfully slow, since we can 
only try one thing, at most, each day.


In our last attempt, I tried flipping the order of the 
RsyncShareNames to do /volume2 first, thinking it might successfully
back up the smaller share before running out of memory
trying to process the larger one.  It did not run out of memory... 
but it did sit there for a full 24 hours with one CPU (out of four) 
running pegged at 99% handling the rsync process before we finally 
put it out of its misery.  The bpc xferlog recorded that the 
connection was closed unexpectedly (which is fair, since we killed 
the other end) after 3182 bytes were received, so the client clearly 
hadn't started sending data yet.  And now, after that attempt, the 
bpc server still lists the status as "refCnt #2" another 24 hours 
after the client-side rsync was killed.


So, aside from adding RAM, is there anything else we can do to try to 
work around this?  Would it be possible to break this one backup down 
into smaller chunks that are still recognized as a single host (so 
they run in sequence and don't get scheduled concurrently), but don't 
require the client to diff large amounts of data in one go, and maybe 
also speed up the reference counting a bit?


An "optimization" (or at least an option) to completely skip the 
reference count updates after a backup fails with zero files received 
(and, therefore, no new/changed references to worry about) might also 
not be a bad idea.









Re: [BackupPC-users] Handling machines too large to back themselves up

2021-04-08 Thread Bowie Bailey via BackupPC-users
You can use the DumpPreUserCmd and DumpPostUserCmd settings to manage lockfiles and 
make sure backups from the aliased hosts cannot run at the same time.


You can separate them in the scheduler by manually starting them at different
times, or by disabling the automatic backups and using cron to start the
backups at particular times.
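
For the cron route, something like the following could work (a sketch only:
the host names are made up, the BackupPC_serverMesg path is the Debian
packaging's, and the exact argument order of the "backup" message should be
checked against your version's documentation; automatic backups for these
hosts would be disabled first, e.g. with $Conf{BackupsDisable} = 1):

    # backuppc user's crontab: kick off the two aliased hosts at fixed,
    # non-overlapping times instead of letting the scheduler pick.
    # "backup" arguments are assumed to be: hostIP host user full(1)/incr(0)
    0 1 * * * /usr/share/backuppc/bin/BackupPC_serverMesg backup nas-vol1 nas-vol1 backuppc 0
    0 9 * * * /usr/share/backuppc/bin/BackupPC_serverMesg backup nas-vol2 nas-vol2 backuppc 0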



On 4/8/2021 9:47 AM, Mike Hughes wrote:

Hi Dave,

You can always break a backup job into multiple backup 'hosts' by using
the ClientNameAlias setting. I create hosts based on the share or folder for
each job, then use the ClientNameAlias to point them to the same host.


-
*From:* Dave Sherohman 
*Sent:* Thursday, April 8, 2021 8:22 AM
*To:* General list for user discussion, questions and support 


*Subject:* [BackupPC-users] Handling machines too large to back themselves up

I have a server which I'm not able to back up because, apparently, it's just
too big.


If you remember me asking about synology's weird rsync a couple weeks ago,
it's that machine again.  We finally solved the rsync issues by ditching the
synology rsync entirely and installing one built from standard rsync source
code and using that instead.  Using that, we were able to get one "full"
backup, but it missed a bunch of files because we forgot to use sudo when we
did it.  (The synology rsync is set up to run suid root and is hardcoded to
not allow root to run it, so we had to take sudo out for that, then forgot to
add it back in when we switched to standard rsync.)


Since then, every attempted backup has failed, either full or incremental, because 
the synology is running out of memory:


This is the rsync child about to exec /usr/libexec/backuppc-rsync/rsync_bpc
Xfer PIDs are now 1228998,1229014
xferPids 1228998,1229014
ERROR: out of memory in receive_sums [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
[sender=3.2.0dev]
Done: 0 errors, 0 filesExist, 0 sizeExist, 0 sizeExistComp, 0 filesTotal, 0 
sizeTotal, 0 filesNew, 0 sizeNew, 0 sizeNewComp, 32863617 inode
rsync_bpc: [generator] write error: Broken pipe (32)

The poor little NAS has only 6G of RAM vs. 9.4 TB of files (configured as two
sharenames, /volume1 (8.5T) and /volume2 (885G)) and doesn't seem up to the
task of updating that much at once via rsync.


Adding insult to injury, even a failed attempt to back it up causes the bpc
server to take 45 minutes to copy the directory structure from the previous
backup before it even attempts to connect, and then 12-14 hours doing
reference counts after it finishes backing up nothing.  Which makes
trial-and-error painfully slow, since we can only try one thing, at most,
each day.


In our last attempt, I tried flipping the order of the RsyncShareNames to do
/volume2 first, thinking it might successfully back up the smaller share
before running out of memory trying to process the larger one.  It did not
run out of memory... but it did sit there for a full 24 hours with one CPU
(out of four) running pegged at 99% handling the rsync process before we
finally put it out of its misery.  The bpc xferlog recorded that the
connection was closed unexpectedly (which is fair, since we killed the other
end) after 3182 bytes were received, so the client clearly hadn't started
sending data yet.  And now, after that attempt, the bpc server still lists
the status as "refCnt #2" another 24 hours after the client-side rsync was
killed.


So, aside from adding RAM, is there anything else we can do to try to work around 
this?  Would it be possible to break this one backup down into smaller chunks that 
are still recognized as a single host (so they run in sequence and don't get 
scheduled concurrently), but don't require the client to diff large amounts of data 
in one go, and maybe also speed up the reference counting a bit?


An "optimization" (or at least an option) to completely skip the reference count 
updates after a backup fails with zero files received (and, therefore, no 
new/changed references to worry about) might also not be a bad idea.






Re: [BackupPC-users] Handling machines too large to back themselves up

2021-04-08 Thread Les Mikesell
On Thu, Apr 8, 2021 at 8:25 AM Dave Sherohman  wrote:

> rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
> [sender=3.2.0dev]

This is more about the number of files than the size of the drive.  Do
you happen to know if there are directories containing millions of
tiny files that could feasibly be archived as a tar or zip file
instead of stored separately?

Also, rsync versions newer than 3.x are supposed to handle it better.
 Is your server side extremely old?
https://rsync.samba.org/FAQ.html#4

-- 
   Les Mikesell
 lesmikes...@gmail.com




Re: [BackupPC-users] Handling machines too large to back themselves up

2021-04-08 Thread Mike Hughes
Hi Dave,

You can always break a backup job into multiple backup 'hosts' by using the 
ClientNameAlias setting. I create hosts based on the share or folder for each 
job, then use the ClientNameAlias to point them to the same host.
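
For example (host names and paths here are made up; per-host config files
live under /etc/backuppc/ with the Debian packaging, or under the pc/ config
directory on a source install):

    # hosts file: one pseudo-host per share
    # host        dhcp  user
    nas-vol1      0     backuppc
    nas-vol2      0     backuppc

    # nas-vol1.pl
    $Conf{ClientNameAlias} = 'synology.example.com';
    $Conf{RsyncShareName}  = ['/volume1'];

    # nas-vol2.pl
    $Conf{ClientNameAlias} = 'synology.example.com';
    $Conf{RsyncShareName}  = ['/volume2'];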


From: Dave Sherohman 
Sent: Thursday, April 8, 2021 8:22 AM
To: General list for user discussion, questions and support 

Subject: [BackupPC-users] Handling machines too large to back themselves up


I have a server which I'm not able to back up because, apparently, it's just 
too big.

If you remember me asking about synology's weird rsync a couple weeks ago, it's 
that machine again.  We finally solved the rsync issues by ditching the 
synology rsync entirely and installing one built from standard rsync source code
and using that instead.  Using that, we were able to get one "full" backup, but 
it missed a bunch of files because we forgot to use sudo when we did it.  (The 
synology rsync is set up to run suid root and is hardcoded to not allow root to 
run it, so we had to take sudo out for that, then forgot to add it back in when 
we switched to standard rsync.)

Since then, every attempted backup has failed, either full or incremental, 
because the synology is running out of memory:

This is the rsync child about to exec /usr/libexec/backuppc-rsync/rsync_bpc
Xfer PIDs are now 1228998,1229014
xferPids 1228998,1229014
ERROR: out of memory in receive_sums [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
[sender=3.2.0dev]
Done: 0 errors, 0 filesExist, 0 sizeExist, 0 sizeExistComp, 0 filesTotal, 0 
sizeTotal, 0 filesNew, 0 sizeNew, 0 sizeNewComp, 32863617 inode
rsync_bpc: [generator] write error: Broken pipe (32)



The poor little NAS has only 6G of RAM vs. 9.4 TB of files (configured as two
sharenames, /volume1 (8.5T) and /volume2 (885G)) and doesn't seem up to the
task of updating that much at once via rsync.

Adding insult to injury, even a failed attempt to back it up causes the bpc 
server to take 45 minutes to copy the directory structure from the previous 
backup before it even attempts to connect, and then 12-14 hours doing reference 
counts after it finishes backing up nothing.  Which makes trial-and-error 
painfully slow, since we can only try one thing, at most, each day.

In our last attempt, I tried flipping the order of the RsyncShareNames to do 
/volume2 first, thinking it might successfully back up the smaller share
before running out of memory trying to process the larger one.  It
did not run out of memory... but it did sit there for a full 24 hours with one 
CPU (out of four) running pegged at 99% handling the rsync process before we 
finally put it out of its misery.  The bpc xferlog recorded that the connection 
was closed unexpectedly (which is fair, since we killed the other end) after 
3182 bytes were received, so the client clearly hadn't started sending data 
yet.  And now, after that attempt, the bpc server still lists the status as 
"refCnt #2" another 24 hours after the client-side rsync was killed.

So, aside from adding RAM, is there anything else we can do to try to work 
around this?  Would it be possible to break this one backup down into smaller 
chunks that are still recognized as a single host (so they run in sequence and 
don't get scheduled concurrently), but don't require the client to diff large 
amounts of data in one go, and maybe also speed up the reference counting a bit?

An "optimization" (or at least an option) to completely skip the reference 
count updates after a backup fails with zero files received (and, therefore, no 
new/changed references to worry about) might also not be a bad idea.


[BackupPC-users] Handling machines too large to back themselves up

2021-04-08 Thread Dave Sherohman
I have a server which I'm not able to back up because, apparently, it's 
just too big.


If you remember me asking about synology's weird rsync a couple weeks 
ago, it's that machine again.  We finally solved the rsync issues by 
ditching the synology rsync entirely and installing one built from
standard rsync source code and using that instead. Using that, we were 
able to get one "full" backup, but it missed a bunch of files because we 
forgot to use sudo when we did it.  (The synology rsync is set up to run 
suid root and is hardcoded to not allow root to run it, so we had to 
take sudo out for that, then forgot to add it back in when we switched 
to standard rsync.)


Since then, every attempted backup has failed, either full or 
incremental, because the synology is running out of memory:


This is the rsync child about to exec /usr/libexec/backuppc-rsync/rsync_bpc
Xfer PIDs are now 1228998,1229014
xferPids 1228998,1229014
ERROR: out of memory in receive_sums [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(118) 
[sender=3.2.0dev]
Done: 0 errors, 0 filesExist, 0 sizeExist, 0 sizeExistComp, 0 filesTotal, 0 
sizeTotal, 0 filesNew, 0 sizeNew, 0 sizeNewComp, 32863617 inode
rsync_bpc: [generator] write error: Broken pipe (32)

The poor little NAS has only 6G of RAM vs. 9.4 TB of files (configured as
two sharenames, /volume1 (8.5T) and /volume2 (885G)) and doesn't seem up to
the task of updating that much at once via rsync.


Adding insult to injury, even a failed attempt to back it up causes the 
bpc server to take 45 minutes to copy the directory structure from the 
previous backup before it even attempts to connect, and then 12-14 hours 
doing reference counts after it finishes backing up nothing.  Which 
makes trial-and-error painfully slow, since we can only try one thing, 
at most, each day.


In our last attempt, I tried flipping the order of the RsyncShareNames 
to do /volume2 first, thinking it might successfully back up the smaller
share before running out of memory trying to process the
larger one.  It did not run out of memory... but it did sit there for a 
full 24 hours with one CPU (out of four) running pegged at 99% handling 
the rsync process before we finally put it out of its misery.  The bpc 
xferlog recorded that the connection was closed unexpectedly (which is 
fair, since we killed the other end) after 3182 bytes were received, so 
the client clearly hadn't started sending data yet. And now, after that 
attempt, the bpc server still lists the status as "refCnt #2" another 24 
hours after the client-side rsync was killed.


So, aside from adding RAM, is there anything else we can do to try to 
work around this?  Would it be possible to break this one backup down 
into smaller chunks that are still recognized as a single host (so they 
run in sequence and don't get scheduled concurrently), but don't require 
the client to diff large amounts of data in one go, and maybe also speed 
up the reference counting a bit?


An "optimization" (or at least an option) to completely skip the 
reference count updates after a backup fails with zero files received 
(and, therefore, no new/changed references to worry about) might also 
not be a bad idea.


___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/