On 2/12/20 10:35, G.W. Haywood via BackupPC-users wrote:
> Hi there,
> On Tue, 1 Dec 2020, backuppc-users-requ...@lists.sourceforge.net wrote:
>> How big can backuppc reasonably scale?
Remember you can scale vertically or horizontally: either get a bigger
machine for your backups, or get more small machines. If you have 3 (or
more) small machines, you can set 2 of them to back up each target,
which gives you some additional redundancy in your backup
infrastructure, as long as your backup windows can support this and the
backups don't add enough load to interfere with your daily operations.

I guess at some point machines that are too small would become more
painful to manage, but there are a lot of options for scaling. Most
people, I think (a vague observation), just scale vertically and add
enough RAM or IO performance to handle the load.
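For the backup-window point, BackupPC lets you tell it when not to
start automatic backups. A minimal sketch of keeping them out of
business hours might look like this (the hours and days below are
examples only, not something from the posts above):

    # e.g. in config.pl -- no automatic backups 07:00-19:30 Mon-Fri
    $Conf{BlackoutPeriods} = [
        {
            hourBegin => 7.0,
            hourEnd   => 19.5,
            weekDays  => [1, 2, 3, 4, 5],
        },
    ];

Note that BackupPC only applies blackout periods to a host once it has
been reliably pingable (see $Conf{BlackoutGoodCnt}).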
>> ... daily backup volume is running around 750 GB per day, with two
>> database servers providing the majority of that volume (400 GB/day
>> from one and 150 GB/day from the other).
> That's the part which bothers me. I'm not sure that BackupPC's ways
> of checking for changed files marry well with database files. In a
> typical relational database server you'll have some *big* files which
> are modified by more or less random accesses. They will *always* be
> changed from the last backup. The backup of virtual machines is not
> dissimilar at the level of the partition image. You need to stop the
> machine to get a consistent backup, or use something like a snapshot.
I just want to second this. My preference is to snapshot the VM (via a
pre-backup script from BackupPC) and then back up the content of the VM
(the actual target I use is the SAN server rather than the VM itself).
For the DB, you should exclude the live DB files and have a script
(either called separately or from a BPC pre-backup command) which
exports/dumps the DB to a separate, consistent file. If possible, this
file should be uncompressed (which allows rsync to better see the
unchanged data) and written to the same filename/path each day (again,
so rsync/BPC sees it as a file with a small amount of changes instead
of a massive new file). If you do that, you might see your daily
"changes" shrink compared to before.
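As a rough illustration, a per-host config along these lines would
trigger a dump before each backup and skip the live database files. The
script path, database paths and share pattern are examples only, not
taken from the original posts:

    # e.g. in the per-host config (such as /etc/backuppc/pc/dbhost.pl)
    # dump_db.sh is a hypothetical helper on the client that might run
    # something like:
    #   pg_dump --format=plain --file=/srv/dumps/mydb.sql mydb
    $Conf{DumpPreUserCmd} = '$sshPath -q -x root@$host /usr/local/bin/dump_db.sh';

    # Don't back up the live database files themselves (example paths)
    $Conf{BackupFilesExclude} = {
        '*' => ['/var/lib/mysql', '/var/lib/postgresql'],
    };

If your version supports it, setting $Conf{UserCmdCheckStatus} = 1;
makes a failed dump abort the backup rather than quietly saving a stale
file.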
>> ... I have no idea what to expect the backup server to need in the
>> way of processing power.
> Modest. I've backed up dozens of Windows workstations and five or six
> servers with just a 1.4GHz Celeron which was kicking around after it
> was retired from the sales office. The biggest CPU hog is likely to
> be data compression, which you can tune. Walking directory trees can
> cause rsync to use quite a lot of memory. You might want to look at
> something like Icinga/Nagios to keep an eye on things.
FYI, I back up 57 hosts; my current BPC pool size is 7TB, 23M files.
Some of my backup clients are external on the Internet, some are
Windows, most are Linux.

My BPC server has 8G RAM and a quad-core CPU:
Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz
As others have said, you are most likely to be IO bound after the first
couple of backups. You would probably be well advised to grab a spare
machine, set up BPC, and run a couple of backups against a couple of
smaller targets. Once you have it working (if all goes smoothly, under
2 hours), target a larger server; you will soon start to see how it
performs in your environment, and where the relevant bottlenecks are.
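A first test target doesn't need more than a minimal per-host config; a
sketch might be as simple as this (the host name, shares and transfer
method are examples only):

    # e.g. /etc/backuppc/pc/testhost.pl -- a small target for a trial run
    $Conf{XferMethod}     = 'rsync';
    $Conf{RsyncShareName} = ['/etc', '/home'];

Once that backs up cleanly, add the bigger server as another host and
compare the backup times and pool growth.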
PS: all you really need to think about is the CPU required to compress
750GB per backup cycle (you only need to compress the changed files),
and the disk IO to write that 750GB (plus a lot of disk IO to do all
the comparisons, which is probably the main load, and is why you also
want plenty of RAM to cache the directory trees).
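If compression or IO does turn out to be the bottleneck, the two
obvious knobs are the pool compression level and the number of
simultaneous backups. The values below are only illustrative starting
points, not recommendations from the posts above:

    # e.g. in config.pl -- illustrative values only
    $Conf{CompressLevel} = 3;   # zlib level 1-9; 0 stores files uncompressed
    $Conf{MaxBackups}    = 2;   # cap concurrent backups to ease disk IO contention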
Regards,
Adam
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/