Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread Adam Goryachev via BackupPC-users


On 2/12/20 10:35, G.W. Haywood via BackupPC-users wrote:

Hi there,

On Tue, 1 Dec 2020, backuppc-users-requ...@lists.sourceforge.net wrote:


How big can backuppc reasonably scale?


Remember you can scale vertically or horizontally: either get a bigger 
machine for your backups, or get more small machines. If you had three (or 
more) small machines, you could set two to back up each target, which gives 
you some additional redundancy in your backup infrastructure, as long as 
your backup windows can support this, or the backups don't add enough 
load to interfere with your daily operations.


I guess at some point machines that are too small would become more painful 
to manage, but there are a lot of options for scaling. From my (vague) 
observations, I think most people just scale vertically and add enough RAM 
or IO performance to handle the load.




... daily backup volume is running around 750 GB per day, with two
database servers providing the majority of that volume (400 GB/day
from one and 150 GB/day from the other).


That's the part which bothers me.  I'm not sure that BackupPC's ways
of checking for changed files marry well with database files.  In a
typical relational database server you'll have some *big* files which
are modified by more or less random accesses.  They will *always* be
changed from the last backup.  The backup of virtual machines is not
dissimilar at the level of the partition image.  You need to stop the
machine to get a consistent backup, or use something like a snapshot.

I just want to second this. My preference is to snapshot the VM (via a 
pre-backup script from BackupPC) and then back up the contents of the VM 
(the actual target I use is the SAN server rather than the VM itself). For 
the DB, you should exclude the actual DB files and have a script (either 
called separately or from a BPC pre-backup command) which will export/dump 
the DB to a separate, consistent file. If possible, this file should be 
uncompressed (which allows rsync to better see the unchanged data) and 
written to the same filename/path each day (again, so rsync/BPC will see it 
as a file with some small amount of changes instead of a massive new file).
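
As a rough sketch of what I mean (the $Conf{DumpPreUserCmd} and 
$Conf{BackupFilesExclude} settings are standard BackupPC options; the 
script name, paths and MySQL specifics are only illustrative):

    #!/bin/sh
    # /usr/local/bin/db-dump.sh -- illustrative pre-backup dump script,
    # run on the DB host. On the BackupPC server you would wire it up in
    # config.pl (or the per-host config) with something like:
    #   $Conf{DumpPreUserCmd} = '$sshPath -q -x root@$host /usr/local/bin/db-dump.sh';
    #   $Conf{BackupFilesExclude} = {'/' => ['/var/lib/mysql']};
    set -e
    # Dump uncompressed, to the same path every day, so rsync sees a
    # mostly-unchanged file rather than a brand-new one.
    mysqldump --all-databases --single-transaction > /var/backups/db-dump.sql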


If you do that, you might see your daily "changes" reduce compared to 
before.



... I have no idea what to expect the backup server to need in the
way of processing power.


Modest.  I've backed up dozens of Windows workstations and five or six
servers with just a 1.4GHz Celeron which was kicking around after it
was retired from the sales office.  The biggest CPU hog is likely to
be data compression, which you can tune.  Walking directory trees can
cause rsync to use quite a lot of memory.  You might want to look at
something like Icinga/Nagios to keep an eye on things.

FYI, I back up 57 hosts; my current BPC pool size is 7TB, 23M files. Some 
of my backup clients are external on the Internet, some are Windows, most 
are Linux.


My BPC server has 8G of RAM and an Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz 
(two cores, four threads).

As others have said, you are most likely to be IO bound after the first 
couple of backups. You would probably be well advised to grab a spare 
machine, set up BPC, and run a couple of backups against a couple of 
smaller targets. Once you have it working (if all goes smoothly, under 2 
hours), target a larger server; you will soon start to see how it performs 
in your environment, and where the relevant bottlenecks are.


PS: all you need to think about is the CPU required to compress 750GB per 
backup cycle (you only need to compress the changed files) and the disk IO 
to write that 750GB, plus a lot of disk IO to do all the comparisons. The 
comparisons are probably the main load, which is why you also want a lot 
of RAM to cache the directory trees.
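
As a rough illustration (assuming gzip-class compression at ~40 MB/s per 
core, which is only a ballpark figure, not a measurement):

    750 GB / 40 MB/s ≈ 18,750 seconds ≈ 5 core-hours per backup cycle

so even a modest multi-core CPU has plenty of headroom for the compression 
side of things.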


Regards,
Adam




Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread G.W. Haywood via BackupPC-users

Hi there,

On Tue, 1 Dec 2020, backuppc-users-requ...@lists.sourceforge.net wrote:


How big can backuppc reasonably scale?


You can scale it yourself as has already been suggested, but I don't
think you'd have any problems with a single backup server and the data
volumes you've described if you were sensible about the configuration,
which is very flexible.  However...


... daily backup volume is running around 750 GB per day, with two
database servers providing the majority of that volume (400 GB/day
from one and 150 GB/day from the other).


That's the part which bothers me.  I'm not sure that BackupPC's ways
of checking for changed files marry well with database files.  In a
typical relational database server you'll have some *big* files which
are modified by more or less random accesses.  They will *always* be
changed from the last backup.  The backup of virtual machines is not
dissimilar at the level of the partition image.  You need to stop the
machine to get a consistent backup, or use something like a snapshot.
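
As an illustration of the snapshot route (assuming LVM; the volume names 
and mount point are made up, and the pre/post hooks are BackupPC's 
$Conf{DumpPreUserCmd} and $Conf{DumpPostUserCmd}):

    #!/bin/sh
    # Pre-backup: snapshot the guest's volume and mount it read-only,
    # so the backup reads a frozen, consistent image rather than the
    # live filesystem.
    set -e
    lvcreate --snapshot --size 10G --name vm-snap /dev/vg0/vm-root
    mount -o ro /dev/vg0/vm-snap /mnt/vm-snap
    # ...BackupPC then backs up /mnt/vm-snap...
    # The post-backup hook would undo it:
    #   umount /mnt/vm-snap && lvremove -f /dev/vg0/vm-snap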

Normally I do some sort of separate database dump for database files, and
run that system separately from run-of-the-mill Linux/Windows
server/workstation backups.  After all, I usually just want a single
good backup of any database.  Having several copies, aged at one day,
one week, two weeks, a month etc. would usually be of no use to me.


... I have no idea what to expect the backup server to need in the
way of processing power.


Modest.  I've backed up dozens of Windows workstations and five or six
servers with just a 1.4GHz Celeron which was kicking around after it
was retired from the sales office.  The biggest CPU hog is likely to
be data compression, which you can tune.  Walking directory trees can
cause rsync to use quite a lot of memory.  You might want to look at
something like Icinga/Nagios to keep an eye on things.

--

73,
Ged.




Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread Daniel Berteaud
- On 1 Dec 20, at 16:33, Dave Sherohman dave.sheroh...@ub.lu.se wrote:

> 
> Is this something that backuppc could reliably handle?
> 
> If so, what kind of CPU resources would it require?  I've already got a
> decent handle on the network requirements from observing the current TSM
> backups and can calculate likely disk storage needs, but I have no idea
> what to expect the backup server to need in the way of processing power.
> 

While not as big as yours, I manage a reasonably big BackupPC server, on a single 
box. It's backing up 193 hosts in total; the pool is ~15TB, ~27 million files. 
The hosts are a mix of a lot of different stuff (mostly VMs, but also a few 
appliances and physical servers), with various backup frequency and history 
configs. Most are backed up daily, but some are weekly. It usually represents 
between 200 and 600GB of new data per day.

I'm running this on a single box with these specs:
  * CPU Intel Xeon D-1541 @ 2.10GHz
  * 32GB of RAM
  * 2x120GB SSD for the OS (CentOS 7)
  * 4x12TB SATA in a ZFS pool (~ RAID10)

I'm using the lz4 compression provided by ZFS, so I turned BackupPC's own compression off.

While I do see some slowness from time to time, it's working well. Long story 
short: don't bother with CPU. Except for the very first backups, where it can 
be a bottleneck, disk I/O is what will limit general speed. Spend more on fast 
disks or SSDs. If using ZFS, an NVMe slog can help (or a special metadata 
vdev, although I haven't tested that yet). And get as much RAM as you can. 
With what you have left, choose a decent CPU, but don't spend too much on it.
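
Roughly what that setup looks like (illustrative device names, not my exact 
commands):

    # Striped mirrors (~RAID10) across the four SATA disks:
    zpool create backup mirror sda sdb mirror sdc sdd
    # lz4 in ZFS instead of BackupPC's own compression
    # (set $Conf{CompressLevel} = 0; in config.pl):
    zfs set compression=lz4 backup
    # Optional NVMe helpers:
    zpool add backup log nvme0n1                       # slog
    zpool add backup special mirror nvme1n1 nvme2n1    # metadata vdev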

Cheers,
Daniel

-- 
[ https://www.firewall-services.com/ ]  
Daniel Berteaud 
FIREWALL-SERVICES SAS, network security 
A free-software services company 
Tel: +33 5 56 64 15 32 
Matrix: @dani:fws.fr 
[ https://www.firewall-services.com/ | https://www.firewall-services.com ]





Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread Les Mikesell
On Tue, Dec 1, 2020 at 9:50 AM Dave Sherohman  wrote:
>
> Is this something that backuppc could reliably handle?
>
> If so, what kind of CPU resources would it require?  I've already got a
> decent handle on the network requirements from observing the current TSM
> backups and can calculate likely disk storage needs, but I have no idea
> what to expect the backup server to need in the way of processing power.
>

The price is right for the software and you'll use a lot less disk
space than other methods (with the big plus of instant access to
single files or directories), so consider that you could divide the
backups into two or more groups handled by different servers if you
run into trouble with just one. It will help if the server has a lot
of RAM, and if you back up the virtual machines as individual hosts
instead of their image files. Likewise you'll probably need to
interact with the database servers using commands backuppc can send,
to get dumps that won't change during the copy.

  -- Les Mikesell
lesmikes...@gmail.com




Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread Richard Shaw
So, long story short, a lot of it will depend on how fast your data
changes/grows, but it doesn't necessarily require a high-end computer. You
really just need something beefy enough not to be the bottleneck. If you
can make the client I/O the bottleneck, then you're good. Depending on your
budget (or what you have lying around), a decent budget AMD Ryzen system
would work quite nicely.

If you're familiar with Debian then I'm sure it's well documented how to
install and set it up. I maintain the Fedora/EPEL package and run it on
CentOS 8 quite nicely.
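
From memory (so treat the package and service names as approximate), it's 
along these lines:

    # Debian/Ubuntu:
    apt install backuppc
    # CentOS/RHEL 8, via EPEL:
    dnf install epel-release
    dnf install BackupPC
    systemctl enable --now backuppc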

Thanks,
Richard


Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread Paul Leyland
My network is rather smaller but still bigger than most home systems.
Please keep that in mind.

The backup server is a very elderly "Intel(R) Core(TM)2 CPU
6600 @ 2.40GHz" with 8G RAM.  /var/lib/backuppc is a ZFS raidz array of
three 4TB disks, giving a usable space of 3.6T, of which 1.1T is now
used. The CGI interface reports:

There are 9 hosts that have been backed up, for a total of:

  * 109 full backups of total size 15511.81GB (prior to pooling and
compression),
  * 65 incr backups of total size 235.11GB (prior to pooling and
compression).

but I like to keep an archive as well as a backup, so storing 15.5TB of
files in 1.1TB of space may be misleading: so many of those files are
de-duplicated.

The server is on a single 1 gigabit NIC. It runs up to four backups
simultaneously, and a full backup of a 0.4TB machine takes around 12
hours; this appears to be disk IO bound at each end, as incremental
backups of other machines proceed at a decent rate.

TL;DR: a 10-year-old box copes very easily with my load.  YMMV.  In
particular, you may wish to have more than one Ethernet NIC and perhaps
more RAM.

Paul

On 01/12/2020 15:33, Dave Sherohman wrote:
> Hey, all!
>
> I've been looking at setting up amanda as a backup solution for a
> fairly large environment at work and have just stumbled across
> backuppc.  While I love the design and scheduling methods of amanda,
> I'm also a big fan of incremental-only reverse-delta backup methods
> such as that used by backuppc, so now I'm wondering...
>
> How big can backuppc reasonably scale?
>
> The environment I'm dealing with includes around 75 various servers
> (about 2/3 virtual, 1/3 physical), mostly running Debian, with a few
> machines running other linux distros and maybe a dozen Windows
> machines.  Total data size that we want to maintain backups for is
> around 70 TB.  Our current backup system is using Tivoli Storage
> Manager, a commercial product that uses an incremental-only strategy
> similar to backuppc's, and the daily backup volume is running around
> 750 GB per day, with two database servers providing the majority of
> that volume (400 GB/day from one and 150 GB/day from the other).
>
> Is this something that backuppc could reliably handle?
>
> If so, what kind of CPU resources would it require?  I've already got
> a decent handle on the network requirements from observing the current
> TSM backups and can calculate likely disk storage needs, but I have no
> idea what to expect the backup server to need in the way of processing
> power.
>




Re: [BackupPC-users] Backuppc in large environments

2020-12-01 Thread Richard Shaw
Not a direct response to your question, but I run mine to back up the
computers at my home, so it's at quite a bit smaller scale. However, the
4th-gen i5 SFF PC I bought off eBay, with a 1TB hard drive dedicated to
BackupPC and an M.2 SSD for CentOS 8, works quite well for me, so any real
computer should do fine. I did max out the memory at 8GB.


Thanks,
Richard


[BackupPC-users] Backuppc in large environments

2020-12-01 Thread Dave Sherohman

Hey, all!

I've been looking at setting up amanda as a backup solution for a fairly 
large environment at work and have just stumbled across backuppc.  While 
I love the design and scheduling methods of amanda, I'm also a big fan 
of incremental-only reverse-delta backup methods such as that used by 
backuppc, so now I'm wondering...


How big can backuppc reasonably scale?

The environment I'm dealing with includes around 75 various servers 
(about 2/3 virtual, 1/3 physical), mostly running Debian, with a few 
machines running other linux distros and maybe a dozen Windows 
machines.  Total data size that we want to maintain backups for is 
around 70 TB.  Our current backup system is using Tivoli Storage 
Manager, a commercial product that uses an incremental-only strategy 
similar to backuppc's, and the daily backup volume is running around 750 
GB per day, with two database servers providing the majority of that 
volume (400 GB/day from one and 150 GB/day from the other).


Is this something that backuppc could reliably handle?

If so, what kind of CPU resources would it require?  I've already got a 
decent handle on the network requirements from observing the current TSM 
backups and can calculate likely disk storage needs, but I have no idea 
what to expect the backup server to need in the way of processing power.




___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/