Re: [BackupPC-users] Backuppc in large environments
On 2/12/20 10:35, G.W. Haywood via BackupPC-users wrote:
> Hi there,
>
> On Tue, 1 Dec 2020, backuppc-users-requ...@lists.sourceforge.net wrote:
>> How big can backuppc reasonably scale?

Remember you can scale vertically or horizontally: either get a bigger machine for your backups, or get more small machines. If you had three (or more) small machines, you could set two of them to back up each target, which gives you some additional redundancy in your backup infrastructure, as long as your backup windows can support it and the backups don't add enough load to interfere with your daily operations. I guess at some point machines that are too small become more painful to manage, but there are a lot of options for scaling. Most people, I think (vague observation), just scale vertically and add enough RAM or IO performance to handle the load.

>> ... daily backup volume is running around 750 GB per day, with two database servers providing the majority of that volume (400 GB/day from one and 150 GB/day from the other).
>
> That's the part which bothers me. I'm not sure that BackupPC's ways of checking for changed files marry well with database files. In a typical relational database server you'll have some *big* files which are modified by more or less random accesses. They will *always* have changed since the last backup. The backup of virtual machines is not dissimilar at the level of the partition image. You need to stop the machine to get a consistent backup, or use something like a snapshot.

I just want to second this. My preference is to snapshot the VM (via a pre-backup script from BackupPC) and then back up the contents of the VM (the actual target I use is the SAN server rather than the VM itself).

For the DB, you should exclude the actual DB files and have a script (either called separately or from a BPC pre-backup command) which exports/dumps the DB to a consistent file. If possible, this file should be uncompressed (which lets rsync see the unchanged data) and written to the same filename/path each day (again, so rsync/BPC sees a file with a small amount of changes rather than a massive new file). If you do that, you might see your daily "changes" shrink compared to before; a minimal configuration sketch follows below.

>> ... I have no idea what to expect the backup server to need in the way of processing power.
>
> Modest. I've backed up dozens of Windows workstations and five or six servers with just a 1.4GHz Celeron which was kicking around after it was retired from the sales office. The biggest CPU hog is likely to be data compression, which you can tune. Walking directory trees can cause rsync to use quite a lot of memory. You might want to look at something like Icinga/Nagios to keep an eye on things.

FYI, I back up 57 hosts; my current BPC pool size is 7TB, 23M files. Some of my backup clients are external on the Internet, some are Windows, most are Linux. My BPC server has 8G RAM and a quad-core CPU: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz.

As others have said, you are most likely to be IO bound after the first couple of backups. You are probably best advised to grab a spare machine, set up BPC, and run a couple of backups against a couple of smaller targets. Once you have it working (if all goes smoothly, under 2 hours), target a larger server; you will soon start to see how it performs in your environment, and where the relevant bottlenecks are.
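To make that concrete, here is a minimal per-host configuration sketch, assuming PostgreSQL and hypothetical paths (the dump command, ssh user and exclude path would all need adapting to your environment; the $Conf variables themselves are standard BackupPC settings):

    # Hypothetical per-host config, e.g. /etc/backuppc/pc/dbhost.pl.
    # Before each backup, ssh to the client and dump the database to
    # the SAME uncompressed path every day, so rsync can delta against
    # the previous day's copy instead of transferring a brand-new file.
    $Conf{DumpPreUserCmd} = '$sshPath -q -x -l root $host'
                          . ' pg_dump --format=plain --file=/var/backups/dump.sql mydb';

    # Abort the backup if the dump fails, rather than saving a stale file.
    $Conf{UserCmdCheckStatus} = 1;

    # Exclude the live database files; they change constantly and cannot
    # be copied consistently while the server is running.
    $Conf{BackupFilesExclude} = {
        '/' => [
            '/var/lib/postgresql',    # hypothetical live-DB path
        ],
    };

A mysqldump invocation, or a SAN/LVM snapshot command for the VM case, would slot into the same $Conf{DumpPreUserCmd} hook; the key point is a consistent, uncompressed dump at a stable path.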
PS: all you really need to think about is the CPU required to compress 750GB per backup cycle (only the changed files need compressing), the disk IO to write that 750GB, and a lot of additional disk IO to do all the comparisons, which is probably the main load and is why you also want plenty of RAM to cache the directory trees. For scale, 750GB spread over an eight-hour backup window is only about 26 MB/s of sustained compression throughput, comfortably within reach of a single modern core.

Regards,
Adam
Re: [BackupPC-users] Backuppc in large environments
Hi there,

On Tue, 1 Dec 2020, backuppc-users-requ...@lists.sourceforge.net wrote:
> How big can backuppc reasonably scale?

You can scale it yourself as has already been suggested, but I don't think you'd have any problems with a single backup server and the data volumes you've described, if you were sensible about the configuration, which is very flexible. However...

> ... daily backup volume is running around 750 GB per day, with two database servers providing the majority of that volume (400 GB/day from one and 150 GB/day from the other).

That's the part which bothers me. I'm not sure that BackupPC's ways of checking for changed files marry well with database files. In a typical relational database server you'll have some *big* files which are modified by more or less random accesses. They will *always* have changed since the last backup. The backup of virtual machines is not dissimilar at the level of the partition image. You need to stop the machine to get a consistent backup, or use something like a snapshot.

Normally I do some sort of separate database dump for database files, and run that system separately from run-of-the-mill Linux/Windows server/workstation backups. After all, I usually just want a single good backup of any database; having several copies aged one day, one week, two weeks, a month and so on would usually be of no use to me.

> ... I have no idea what to expect the backup server to need in the way of processing power.

Modest. I've backed up dozens of Windows workstations and five or six servers with just a 1.4GHz Celeron which was kicking around after it was retired from the sales office. The biggest CPU hog is likely to be data compression, which you can tune (a sketch follows below). Walking directory trees can cause rsync to use quite a lot of memory. You might want to look at something like Icinga/Nagios to keep an eye on things.

--
73,
Ged.
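The compression knob Ged mentions is a single setting in config.pl; a minimal sketch (3 is commonly the shipped default, though this can vary by version and package; 0 disables compression, and 1 gives most of the saving for the least CPU):

    # zlib compression level for the BackupPC pool.  Only new or
    # changed files are compressed, so the cost scales with daily
    # churn, not with total pool size.
    $Conf{CompressLevel} = 3;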
Re: [BackupPC-users] Backuppc in large environments
On 1 Dec 20, at 16:33, Dave Sherohman dave.sheroh...@ub.lu.se wrote:
> Is this something that backuppc could reliably handle?
>
> If so, what kind of CPU resources would it require? I've already got a decent handle on the network requirements from observing the current TSM backups and can calculate likely disk storage needs, but I have no idea what to expect the backup server to need in the way of processing power.

While not as big as yours, I manage a reasonably big BackupPC server on a single box. It backs up 193 hosts in total; the pool is ~15TB, ~27 million files. The hosts are a mix of a lot of different stuff (mostly VMs, but also a few appliances and physical servers), with various backup frequency and history configs. Most are backed up daily, but some are weekly. It usually represents between 200 and 600GB of new data per day.

I'm running this on a single box with these specs:

* CPU Intel Xeon D-1541 @ 2.10GHz
* 32GB of RAM
* 2x120GB SSD for the OS (CentOS 7)
* 4x12TB SATA in a ZFS pool (~ RAID10)

I'm using the lz4 compression provided by ZFS, so I turned the BackupPC one off (a sketch follows below). While I do see some slowness from time to time, it's working well.

Long story short: don't bother with CPU. Except for the very first backups, where it can be a bottleneck, disk I/O is what will limit overall speed. Spend more on fast disks or SSDs. If using ZFS, NVMe as a SLOG can help (or as a special metadata vdev, although I haven't tested that yet). Get as much RAM as you can, and with what you have left, choose a decent CPU, but don't spend too much on it.

Cheers,
Daniel

--
Daniel Berteaud
FIREWALL-SERVICES SAS, network security and free-software services
Tel: +33.5 56 64 15 32
Matrix: @dani:fws.fr
https://www.firewall-services.com/
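The arrangement Daniel describes comes down to two settings, one on the filesystem and one in BackupPC; a minimal sketch, with a hypothetical dataset name:

    # Compression is handled by ZFS on the storage host, e.g. (shell):
    #   zfs set compression=lz4 tank/backuppc    # hypothetical dataset
    # With the filesystem compressing transparently, compressing again
    # in BackupPC only wastes CPU, so turn the built-in compression off:
    $Conf{CompressLevel} = 0;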
Re: [BackupPC-users] Backuppc in large environments
On Tue, Dec 1, 2020 at 9:50 AM Dave Sherohman wrote:
> Is this something that backuppc could reliably handle?
>
> If so, what kind of CPU resources would it require? I've already got a decent handle on the network requirements from observing the current TSM backups and can calculate likely disk storage needs, but I have no idea what to expect the backup server to need in the way of processing power.

The price is right for the software, and you'll use a lot less disk space than with other methods (with the big plus of instant access to single files or directories), so consider that you could divide the backups into two or more groups handled by different servers if you run into trouble with just one. It will help if the server has a lot of RAM, and if you back up the virtual machines as individual hosts instead of backing up their image files (a sketch follows below). Likewise, you'll probably need to interact with the database servers, using commands BackupPC can send, to get dumps that won't change during the copy.

--
Les Mikesell
lesmikes...@gmail.com
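A sketch of the VM point, with hypothetical host and path names: each guest gets an ordinary per-host config, while the hypervisor excludes the image directory.

    # Hypothetical per-VM config, e.g. pc/vm-web01.pl: treat the guest
    # as a normal rsync client instead of copying its disk image.
    $Conf{XferMethod}     = 'rsync';
    $Conf{RsyncShareName} = ['/etc', '/home', '/var'];

    # And in the hypervisor's own config, skip the images, which would
    # otherwise appear fully changed on every backup:
    $Conf{BackupFilesExclude} = {
        '/' => ['/var/lib/libvirt/images'],    # hypothetical path
    };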
Re: [BackupPC-users] Backuppc in large environments
So, long story short, a lot of it will depend on how fast your data changes and grows, but it doesn't necessarily require a high-end computer. You really just need something beefy enough not to be the bottleneck: if you can make the client I/O the bottleneck, then you're good. Depending on your budget (or what you have lying around), a decent budget AMD Ryzen system would work quite nicely. If you're familiar with Debian then I'm sure it's well documented how to install and set up. I maintain the Fedora EPEL version and run it on CentOS 8 quite nicely.

Thanks,
Richard
Re: [BackupPC-users] Backuppc in large environments
My network is rather smaller but still bigger than most home systems; please keep that in mind.

The backup server is a very elderly "Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz" with 8G RAM. /var/lib/backuppc is a ZFS raidz array of three 4TB disks, giving a useful space of 3.6T, of which 1.1T is now used. The CGI interface reports:

There are 9 hosts that have been backed up, for a total of:

* 109 full backups of total size 15511.81GB (prior to pooling and compression),
* 65 incr backups of total size 235.11GB (prior to pooling and compression).

I like to keep an archive as well as a backup, so storing 15.5TB of files in 1.1TB of space may be misleading: it works because so many files are de-duplicated.

The server is on a single 1-gigabit NIC. It runs up to four backups simultaneously (the scheduler setting for this is sketched after this message), and a full backup of a 0.4TB machine takes around 12 hours; this appears to be disk-IO bound at each end, as incremental backups of other machines proceed at a decent rate while it runs.

TL;DR: a 10-year-old box very easily copes with my load. YMMV. In particular, you may wish to have more than one ethernet NIC and perhaps more RAM.

Paul

On 01/12/2020 15:33, Dave Sherohman wrote:
> How big can backuppc reasonably scale?
> [...]
> If so, what kind of CPU resources would it require? I've already got a decent handle on the network requirements from observing the current TSM backups and can calculate likely disk storage needs, but I have no idea what to expect the backup server to need in the way of processing power.
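Paul's "up to four backups simultaneously" is the scheduler's concurrency setting; a minimal sketch of the relevant knobs (values here are illustrative, not recommendations):

    # Maximum number of scheduled backups BackupPC will run at once.
    $Conf{MaxBackups} = 4;

    # User-requested backups may run in addition to the scheduled ones.
    $Conf{MaxUserBackups} = 2;

    # Hours of the day at which the scheduler wakes up and considers
    # starting new backups.
    $Conf{WakeupSchedule} = [1..23];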
Re: [BackupPC-users] Backuppc in large environments
Not a direct response to your question, but I run mine to back up computers at my home, so at quite a bit smaller scale. However, the 4th-gen i5 SFF PC I bought off eBay, with a 1TB hard drive dedicated to BackupPC and an M.2 SSD for CentOS 8, works quite well for me, so a REAL computer should do fine. I did max out the memory at 8GB.

Thanks,
Richard
[BackupPC-users] Backuppc in large environments
Hey, all!

I've been looking at setting up amanda as a backup solution for a fairly large environment at work and have just stumbled across backuppc. While I love the design and scheduling methods of amanda, I'm also a big fan of incremental-only reverse-delta backup methods such as that used by backuppc, so now I'm wondering...

How big can backuppc reasonably scale?

The environment I'm dealing with includes around 75 various servers (about 2/3 virtual, 1/3 physical), mostly running Debian, with a few machines running other linux distros and maybe a dozen Windows machines. Total data size that we want to maintain backups for is around 70 TB. Our current backup system is using Tivoli Storage Manager, a commercial product that uses an incremental-only strategy similar to backuppc's, and the daily backup volume is running around 750 GB per day, with two database servers providing the majority of that volume (400 GB/day from one and 150 GB/day from the other).

Is this something that backuppc could reliably handle?

If so, what kind of CPU resources would it require? I've already got a decent handle on the network requirements from observing the current TSM backups and can calculate likely disk storage needs, but I have no idea what to expect the backup server to need in the way of processing power.