On Mon, Nov 16, 2020 at 07:25:41AM -0600, Dave Sherohman wrote:
> Hello, again!
> 
> You may recall my earlier question to the list, included below.  I've
> now talked with my other coworkers who work with servers and they've
> agreed to go with amanda for our new backup system.
> 
> Now I'd like to get some hardware recommendations.  I'm mostly unsure
> about what we'll need in terms of capacity, both for processing power
> and for storing the actual backups.  Less interested in specific model
> or part numbers, because it will need to come from one of our approved
> vendors, of course, and most likely by way of a formal tender process -
> but I can say that we almost always end up buying complete Dell
> rackmount systems.
> 
> The basic parameters I'm working with are:
> 
> - Backing up around 75 servers (mostly Debian, with a handful of other
>   linux distros and a handful of windows machines).
> 
> - Total amount of data to back up is currently in the 40 TB range.
> 
> - Everything is connected by fast (10- or 100-gigabit) networks.
> 
> - Backup will be to disk/vtapes.
> 
> - I've been asked to have backups available for the previous 6 months.
> 
> - I'm assuming that the best way to handle backup of windows clients
>   will be to mount the disk on a linux box and back it up from there,
>   although some of them are virtual machines, so doing a kvm snapshot
>   and backing that up instead would also be an option.
> 
> Given all that, how beefy of a box should I be looking at, and how much
> disk space can I expect to need?

I did reply to the original message, but looking back it was addressed
to you rather than the list.  In case it was overlooked, here were my
comments regarding space:

"Just some simple numbers.  Assuming a 7 day dumpcycle and daily runs.
 40TB / 7 days plus some promotion is about 7TB of level 0 (full) dumps
 per day.  Adding a TB for incrementals means about 8TB of backup data / day.

 8TB / 1GB/sec is about 8000 sec of network traffic.  Call it 3-4 hrs
 with overhead, doable on your slower network.

 6 months retention is nominally 200 days X 8TB / day is 1600 TB of
 vtape capacity.  With 5TB disks that's 320 disks.  Compression will
 reduce that some, how much only experience will tell you."
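For reference, the quoted estimate works out like this (a sketch in
Python; the 1 GB/s figure is roughly the 10-gigabit link and ignores
protocol overhead):

```python
# The arithmetic above, spelled out.  "Promotion" is Amanda advancing
# some level 0 dumps, which rounds the daily full-dump load up from
# 40 / 7 ~= 5.7 TB to about 7 TB.
total_data_tb = 40
dumpcycle_days = 7
fulls_per_day_tb = 7                      # 40 / 7 plus some promotion
daily_backup_tb = fulls_per_day_tb + 1    # plus ~1 TB of incrementals

# Network time at ~1 GB/s, i.e. the slower 10-gigabit link
transfer_sec = daily_backup_tb * 1e12 / 1e9   # 8000 s, a bit over 2 hours

# Retention: nominally 200 daily runs in 6 months
vtape_tb = 200 * daily_backup_tb              # 1600 TB before compression
disks = vtape_tb / 5                          # 320 five-TB disks
```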

> 
> Also, as a side note, I'm planning on using VDO (Virtual Data Optimizer)
> to provide on-the-fly data compression and deduplication on the backup
> server, which should reduce disk consumption at the cost of CPU
> overhead.  I'm thinking it would make the most sense to use VDO only for
> the filesystem holding the vtapes, and not for the staging area, but
> feel free to correct me on that.

Did not know what VDO was, so I read a Red Hat description.  It seems
to consist of 3 components, each of which I question the value of for
amanda backup.  Hopefully someone with VDO experience can share it.

Elimination of zero filled blocks:  Compression is likely to greatly
shrink the storage of a string of zeros.

Only one copy of duplicate blocks:  Were your files being backed up
individually, as I do in a separate rsync backup of my home directory,
this could provide a worthwhile savings.  But you will likely be
merging your files into a tarball or a dumpfile.  The original disk
block alignment will be lost and will likely not even match from one
day's tarball to the next.
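The alignment problem can be illustrated with a small sketch.  This is
not VDO itself, just fixed-size block hashing standing in for its dedup
index, and the 512-byte "header" is a hypothetical stand-in for whatever
shifts between one day's tarball and the next:

```python
import hashlib

BLOCK = 4096  # VDO deduplicates fixed 4 KB blocks

# ~1 MB of non-repeating payload standing in for a file's contents
data = b"".join(i.to_bytes(4, "big") for i in range(250000))

def block_hashes(stream):
    """Hashes of each fixed-size block, as a dedup index would see them."""
    return {hashlib.sha256(stream[i:i + BLOCK]).digest()
            for i in range(0, len(stream), BLOCK)}

# Day 1 vs day 2: identical payload, but a preceding header that grew by
# 512 bytes shifts every following byte, so no 4 KB block lines up.
day1 = b"H" * 512 + data
day2 = b"H" * 1024 + data
shifted_shared = block_hashes(day1) & block_hashes(day2)   # empty set

# Only if the offset change is a multiple of 4 KB do the payload blocks
# line up again and deduplicate.
day3 = b"H" * 4096 + data
day4 = b"X" * 4096 + data
aligned_shared = block_hashes(day3) & block_hashes(day4)   # all payload blocks
```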

LZ4 compression on the fly:  I don't know the cpu load for the server
compressing 8TB of data daily.  One thing you would have to deal with
is amanda's view of what has been sent to the backup device versus how
much space the data actually consumes on the device.

There are points where amanda calculates how much space is left on
the device based on its configuration-specified size and how much it
has already sent.  Of course there is actually more space available
because the compression occurs after amanda's involvement.  The
difference may cause amanda to make less than optimal decisions.

Amanda administrators who use tape-drive compression face the same
problem.  I believe most over-specify the size of the storage medium
to allow more complete tape utilization.
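As a sketch of that workaround, the vtape length in amanda.conf could
be declared larger than the physical space, letting downstream
compression absorb the difference.  The tapetype name and the inflated
6 TB figure here are hypothetical; the right margin depends on the
compression ratio you actually observe:

    define tapetype VDO-VTAPE {
        comment "5TB physical, over-specified assuming ~20% VDO savings"
        length 6 tbytes
    }

Over-specify too far and a run can fill the disk before amanda thinks
the vtape is full, so start conservatively.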


As to Windows backup, I hope someone suggests a good solution.  I
currently use the proprietary "Zmanda Windows Client".  It generally
works well but suffers from a lack of development and unexplained
failures to connect.  These are often corrected by restarting the ZWC
services on the Windows system and always corrected by rebooting.

In the distant past I backed up Windows systems by mounting the drives
on a UNIX host, most often via NFS.  I liked that approach except for
one thing.  Windows, at least then, did not like a file to be opened
by multiple processes.  So each backup included several files that
were not backed up because they were already open in another Windows
process, and a few system files were never backed up.

Regarding backing up a KVM snapshot, would that mean that to recover
one file you would have to take a new snapshot, restore the entire
system from the backed-up snapshot, copy the file somewhere else,
restore the new snapshot, and then copy the file to its final location?

Windows remains a tough nut for amanda.

Hope this is of some value.

Jon
-- 
Jon H. LaBadie                 [email protected]
 11226 South Shore Rd.          (703) 787-0688 (H)
 Reston, VA  20190              (703) 935-6720 (C)
