On 2/12/20 10:35, G.W. Haywood via BackupPC-users wrote:
Hi there,

On Tue, 1 Dec 2020, backuppc-users-requ...@lists.sourceforge.net wrote:

How big can backuppc reasonably scale?

Remember you can scale vertically or horizontally: either get a bigger machine for your backups, or get more small machines. If you had three (or more) small machines, you could set two of them to back up each target, which gives you some extra redundancy in your backup infrastructure, as long as your backup windows can support this and the backups don't add enough load to interfere with your daily operations.

I guess at some point machines that are too small would become more painful to manage than they are worth, but there are a lot of options for scaling. Most people, I think (a vague observation), just scale vertically and add enough RAM or IO performance to handle the load.
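If you did go the horizontal route, the overlap can be as simple as listing the same client on two of the servers. A minimal sketch, assuming a Debian-style install where the host list lives in /etc/backuppc/hosts (the paths, binary location and host name are just placeholders for your own setup):

  # On backup server A: add the client to BackupPC's hosts file.
  # The format is: hostname  dhcp-flag  user  [moreUsers...]
  echo "dbserver1  0  backuppc" >> /etc/backuppc/hosts

  # On backup server B: list the same client again, so it gets
  # backed up by both servers.
  echo "dbserver1  0  backuppc" >> /etc/backuppc/hosts

  # Tell each BackupPC daemon to re-read its host list.
  /usr/share/backuppc/bin/BackupPC_serverMesg server reload

Each server still needs a per-host config (XferMethod, shares, schedule) as usual; stagger the schedules so both servers don't hit the client at the same time.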


... daily backup volume is running around 750 GB per day, with two
database servers providing the majority of that volume (400 GB/day
from one and 150 GB/day from the other).

That's the part which bothers me.  I'm not sure that BackupPC's ways
of checking for changed files marry well with database files.  In a
typical relational database server you'll have some *big* files which
are modified by more or less random accesses.  They will *always* be
changed from the last backup.  The backup of virtual machines is not
dissimilar at the level of the partition image.  You need to stop the
machine to get a consistent backup, or use something like a snapshot.

I just want to second this. My preference is to snapshot the VM (via a pre-backup script from BackupPC) and then back up the contents of the VM; the actual target I use is the SAN server rather than the VM itself. For the DB, you should exclude the actual DB files and have a script (either called separately or from a BPC pre-backup command) which exports/dumps the DB to another, consistent file. If possible, this file should be uncompressed (which allows rsync to better see the unchanged data) and written to the same filename/path each day (again, so that rsync/BPC sees a file with a small amount of changes instead of a massive new file).
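As a rough sketch of the dump side, assuming PostgreSQL and the $Conf{DumpPreUserCmd} hook (the path, database name and the MySQL alternative are placeholders to adapt, not a drop-in script):

  #!/bin/sh
  # Hypothetical pre-backup dump script, run via $Conf{DumpPreUserCmd}
  # in the client's per-host config, or from cron just before the
  # backup window.
  #
  # Dump to the *same* path every day, uncompressed, so rsync and the
  # BackupPC pool see a mostly-unchanged file rather than a brand new one.
  DUMPFILE=/var/backups/db/maindb.sql

  # pg_dump writes a consistent snapshot of the database.
  pg_dump --format=plain --file="$DUMPFILE" maindb

  # (For MySQL/MariaDB the equivalent would be something like:
  #   mysqldump --single-transaction maindb > "$DUMPFILE" )

The live DB data directory would then go into $Conf{BackupFilesExclude} so that only the dump file is picked up.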

If you do that, you might well see your daily "changes" drop compared to before.
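For the snapshot side, a minimal sketch using LVM; the volume group, LV names and sizes are made up, and the same idea applies to ZFS or SAN-level snapshots:

  # Pre-backup (e.g. via $Conf{DumpPreUserCmd}): create and mount a
  # read-only snapshot of the volume holding the VM images, then point
  # the backup at the mount point instead of the live volume.
  lvcreate --snapshot --size 10G --name vmsnap /dev/vg0/vmdata
  mkdir -p /mnt/vmsnap
  mount -o ro /dev/vg0/vmsnap /mnt/vmsnap

  # Post-backup (e.g. via $Conf{DumpPostUserCmd}): tear it down again.
  umount /mnt/vmsnap
  lvremove -f /dev/vg0/vmsnap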

... I have no idea what to expect the backup server to need in the
way of processing power.

Modest.  I've backed up dozens of Windows workstations and five or six
servers with just a 1.4GHz Celeron which was kicking around after it
was retired from the sales office.  The biggest CPU hog is likely to
be data compression, which you can tune.  Walking directory trees can
cause rsync to use quite a lot of memory.  You might want to look at
something like Icinga/Nagios to keep an eye on things.

FYI, I back up 57 hosts; my current BPC pool size is 7TB, 23M files. Some of my backup clients are external on the Internet, some are Windows, most are Linux.

My BPC server has 8GB of RAM and a dual-core (four-thread) CPU:
Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz
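If you want to pull similar numbers out of your own install, the CGI status pages show them, or you can get a rough idea straight from the pool. The path below is the Debian default and will differ on other setups:

  # Approximate size and file count of the compressed pool.
  du -sh /var/lib/backuppc/cpool
  find /var/lib/backuppc/cpool -type f | wc -l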

As others have said, you are most likely to be IO bound after the first couple of backups. You are probably advised to grab a spare machine, set up BPC, and run a couple of backups against a couple of smaller targets. Once you have it working (if all goes smoothly, under 2 hours), target a larger server; you will soon start to see how it performs in your environment and where the relevant bottlenecks are.
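Something along these lines for the trial run, assuming a Debian-ish package install; the package names, paths and host name are placeholders, and the test client still needs a per-host config (XferMethod, shares, ssh keys) before the dump will succeed:

  # Install BackupPC and rsync.
  apt-get install backuppc rsync

  # Add a small test client (format: hostname  dhcp-flag  user).
  echo "testclient  0  backuppc" >> /etc/backuppc/hosts

  # Kick off a full backup by hand, verbosely, as the backuppc user,
  # and watch what it does.
  su -s /bin/sh backuppc -c \
    "/usr/share/backuppc/bin/BackupPC_dump -f -v testclient"

  # In another terminal, watch disk and CPU load while it runs.
  iostat -x 5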

PS: all you really need to think about is the CPU requirement to compress 750GB per backup cycle (you only need to compress the changed files) and the disk IO to write that 750GB, plus a lot of disk IO to do all the comparisons, which is probably the main load and is why you also want plenty of RAM to cache the directory trees.
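If you want rough numbers before committing hardware, a couple of quick back-of-envelope checks; the sample paths are placeholders, and gzip at level 3 is only a rough stand-in for BackupPC's zlib compression at a similar $Conf{CompressLevel}:

  # How fast can this box compress data?  Feed it a representative
  # chunk; dd reports the throughput of the whole pipeline.
  dd if=/some/representative/file bs=1M count=1024 | gzip -3 > /dev/null

  # Sequential write throughput of the backup disk (4GB test file).
  dd if=/dev/zero of=/backuppool/testfile bs=1M count=4096 conv=fdatasync
  rm /backuppool/testfile

  # Disk utilisation and wait times while a test backup runs.
  iostat -x 5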

Regards,
Adam


