Re: Speccing a large amanda install (was: Re: How's amanda feeling these days?)

2020-12-01 Thread Dave Sherohman
After some off-list discussions, we've set up a meeting for this
Thursday to talk about the six-month retention request.  The IT group
leader has said he's fine with "daily for a week, weekly for a month,
and monthly for two months", but I'm thinking a 14-day dumpcycle and
keep it for a month would be both sufficient and simpler, given how
amanda does scheduling.

14-day instead of 7-day dumpcycle is mostly just to free up network
overhead for fulls - if we go by the current TSM backup times and assume
no parallel dumps, it's looking like about 8 days to do a full.  In
reality it would take less time than that (because of parallelization)
under normal circumstances, but I don't want to tempt fate in the event
of non-normal circumstances which might slow things down temporarily.

Once retention is defined, the (uncompressed) storage requirements are
straightforward to calculate, so that'll be covered.
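
For reference, a minimal sketch of that calculation in Python.  It
assumes the per-day volumes estimated further down in this message
(~3.5T/day for a 14-day dumpcycle, ~6.5T/day for 7 days), treats every
run as average-sized, and ignores compression.  The function name and
the retention periods tried are only illustrative.

```python
def retention_storage_tb(daily_tb, retention_days):
    """Uncompressed vtape space if every retained run is average-sized."""
    return daily_tb * retention_days


# (label, TB per day, days kept); daily figures are the estimates in this message
scenarios = [
    ("14-day dumpcycle, keep 1 month", 3.5, 30),
    ("7-day dumpcycle, keep 1 month", 6.5, 30),
    ("7-day dumpcycle, keep 6 months", 6.5, 180),
]

for label, daily_tb, days in scenarios:
    print(f"{label}: ~{retention_storage_tb(daily_tb, days):.0f} TB")
```

Real usage will differ from this once compression and uneven run
sizes come into play, so it is best read as an upper-bound planning
number.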

I'll probably do some limited testing with VDO to see whether it does me
any good, but I don't expect it to, since tarring everything up (instead
of storing individual files) will greatly reduce the number of identical
disk sectors for it to deduplicate, and VDO doing compression outside of
amanda would complicate dump size estimates.  So I'm assuming no VDO.

That all seem sane?


The remaining question is what kind of CPU horsepower will be needed to
manage everything and compress the resulting volume of data.  (~3.5T/
day, assuming a 14-day dumpcycle, or 6.5T/day at 7 days)  Any thoughts
on what that's likely to require?
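
One way to arrive at figures close to those, and to turn them into a
sustained throughput target, is sketched below.  It assumes ~40T of
total data, ~0.7T/day of incrementals (the TSM number), fulls spread
evenly over the dumpcycle, and an 8-hour nightly window; the window
length is purely an assumption for illustration.

```python
def daily_volume_tb(total_tb, incr_tb_per_day, dumpcycle_days):
    """Average data per run: a 1/dumpcycle share of the fulls plus incrementals."""
    return total_tb / dumpcycle_days + incr_tb_per_day


def required_mb_per_sec(volume_tb, window_hours):
    """Sustained rate needed to process volume_tb within window_hours."""
    return volume_tb * 1_000_000 / (window_hours * 3600)


WINDOW_HOURS = 8  # assumed nightly backup window, not a figure from the thread

for dumpcycle in (14, 7):
    volume = daily_volume_tb(40, 0.7, dumpcycle)
    rate = required_mb_per_sec(volume, WINDOW_HOURS)
    print(f"{dumpcycle:2d}-day dumpcycle: ~{volume:.1f} TB/day, "
          f"~{rate:.0f} MB/s sustained over {WINDOW_HOURS} h")
```

The resulting MB/s figure is what the compressor (spread over however
many parallel dumpers run) would have to keep up with under those
assumptions.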


On Tue, Nov 24, 2020 at 04:14:41AM -0600, Dave Sherohman wrote:
> On Mon, Nov 23, 2020 at 11:28:37PM +0100, Stefan G. Weichinger wrote:
> > On 16.11.20 at 14:25, Dave Sherohman wrote:
> > I am a bit surprised by the fact you haven't yet received any reply on
> > the list so far (maybe per direct/private reply).
> 
> I received one accidentally-off-list reply, as already mentioned.  But,
> aside from that, I interpreted it as just the list acting up - if you
> check the headers on the message you replied to, I sent it on Monday the
> 16th, but it didn't go out to the list until Friday the 20th.  So
> getting on-list replies on the 24th is right in keeping with that
> schedule...
> 
> > Your "project" and the related questions could start a new thread
> > without problems ;-)
> 
> True.  But here's a new subject line, at least.  :)
> 
> > * how dynamic is your data: are the incremental changes big or small ...
> 
> We're currently doing backup via Tivoli Storage Manager.  The daily TSM
> output shows a total of about 700GB per day in "Total number of bytes
> transferred".  Most hosts are only sending some MB or maybe a dozen GB.
> The substantial majority comes from two database servers (400GB and
> 150GB/day).
> 
> I only have access to the output emitted by the TSM client as it runs,
> so I don't know what space is used on the server, but this 700GB/day
> is the raw data size.  ("Objects compressed by: 0%")
> 
> > * what $dumpcycle is targeted?
> 
> Seven days is a nice default, but, given the scale of data here and the
> request for maintaining 6 months of backups, I'm thinking 30 days might
> be more sane.
> 
> Back when I was using amanda 20 years ago, I recall a lot of people
> would run a 7-day tapecycle, then monthly and annual full archival
> backups.  I assume something like that would be possible with vtapes as
> well, so that could be an option for maintaining a seven-day dumpcycle
> without needing an exabyte of storage.
> 
> And, personally, I think the 6 month retention is massive overkill in
> any case.  I've been in this job for just over a decade, and I could
> probably count the number of restores in that time on my fingers, and
> none of them needed data more than a week old.
> 
> > * parallelity: will your new amanda server have multiple NICs etc / plan
> > for a big holding disk (array)
> 
> We tend to default to 4 NICs on new server purchases and have gone
> higher.  But we've only done active/passive bonding so far, which is
> basically just single-NIC throughput.  We tried a higher-capacity mode
> once, but the campus data center and I weren't able to get all the
> pieces to coordinate properly to make it work.  (It was some years ago,
> so I don't recall the details of the problems.)
> 
> Holding disk size is one of the things I'm looking for advice on.  The
> largest DLE is currently a 19T NAS, but the admin responsible for that
> system agrees that it should be split into multiple filesystems, even
> aside from backup-related reasons.  Assuming it doesn't get split, would
> 20T holding disk be sufficient or does it need to be 2x the largest DLE?
> 
> > * fast network is nice, but this results in a bottleneck called
> > *storage* -> fast RAID arrays, maybe SSDs.
> 
> My boss isn't particularly price-sensitive, but I doubt that he could
> swallow the cost of putting all the vtapes on SSD, so hopefully it won't
> come to that.  SSD for the holding disk should be 

Speccing a large amanda install (was: Re: How's amanda feeling these days?)

2020-11-24 Thread Dave Sherohman
On Mon, Nov 23, 2020 at 10:32:18PM -0500, Jon LaBadie wrote:
> I did reply to the original message, but looking back it was addressed
> to you rather than the list.  In case it was overlooked, here were my
> comments regarding space:

I saw it, just absent-minded about replying sometimes...  Good that you
repeated it for the list, in any case, I figure.

> Did not know what VDO was, so I read a Red Hat description.  It seems
> to consist of 3 components, and I question the value of each for amanda backup.
> Hopefully someone with VDO experience can share it.

I first heard about VDO from a friend who (surprise, surprise) works for
Red Hat.  He also runs a hosting company on the side and has been very
happy with what VDO has done for his backup server, but his backup
system is pretty rudimentary (sounds like it's a homebrew rsync-based
solution), so I suppose his results may not be entirely indicative of
what to expect with amanda+VDO.

> Only one copy of duplicate blocks:  Were your files being backed up
> individually, as I do in a separate backup of my Home directory using
> rsync, this could provide a worthwhile savings.  But you will likely
> be merging your files into a tarball or a dumpfile.  The original
> disk block alignment will be lost and likely not even match in one
> day's tarball to the next.

Ah, yes.  I hadn't considered the tarball aspect of it.  I figured it
would be able to make good use of the deduplication if I have amanda
write "uncompressed" and use only VDO's compression, but you've got a
good point about merging the files still messing things up in that
scenario.

> LZ4 compression on the fly:  I don't know the cpu load for the server
> compressing 8TB of data daily.

I assume that the CPU load would be comparable (not identical, of
course, but in the same ballpark) for on-the-fly VDO compression vs.
amanda compressing the data at that level.

> There are points where amanda calculates how much space is left on
> the device based on its configuration-specified size and how much it
> has already sent.  Of course there is actually more space available
> because the compression occurs after amanda's involvement.  The
> difference may cause amanda to make less than optimal decisions.
> 
> Amanda administrators who use tape drive compression face the same
> problem.  I believe most over specify the size of the storage medium
> to allow more complete tape utilization.

Hrm, yes...  Now that you mention it, I do have foggy memories of seeing
this discussed when I was using amanda previously.

But, yeah, if block deduplication isn't going to be a significant
benefit (and it sounds like it probably won't), then it probably would be
better to skip VDO and have amanda handle the compression itself.

> In the distant past I backed up windows systems by mounting the drives
> on a UNIX host.  Most often used NFS.

Any particular reason to prefer NFS over SMB?

> Windows, at least then, does not like a file to be opened by multiple
> processes.  So each backup had several files that were not backed up
> because the file was already opened by another Windows process.  And a
> few system files were never backed up.

I'm pretty certain this is still the case.  One of the most annoying
misfeatures of Windows, IMO.

> Regarding backing up a KVM snapshot, would that mean that to recover
> one file you would have to take a new snapshot, restore the entire system
> from the backed up snap, copy the file to somewhere else, restore the
> new snap, copy the file to final location?

Pretty much, yeah.  That's why I haven't even considered snapshot-based
backups of my linux virts.  But, if it makes the Windows admin happy...

-- 
Dave Sherohman


Speccing a large amanda install (was: Re: How's amanda feeling these days?)

2020-11-24 Thread Dave Sherohman
On Mon, Nov 23, 2020 at 11:28:37PM +0100, Stefan G. Weichinger wrote:
> On 16.11.20 at 14:25, Dave Sherohman wrote:
> I am a bit surprised by the fact you haven't yet received any reply on
> the list so far (maybe per direct/private reply).

I received one accidentally-off-list reply, as already mentioned.  But,
aside from that, I interpreted it as just the list acting up - if you
check the headers on the message you replied to, I sent it on Monday the
16th, but it didn't go out to the list until Friday the 20th.  So
getting on-list replies on the 24th is right in keeping with that
schedule...

> Your "project" and the related questions could start a new thread
> without problems ;-)

True.  But here's a new subject line, at least.  :)

> * how dynamic is your data: are the incremental changes big or small ...

We're currently doing backup via Tivoli Storage Manager.  The daily TSM
output shows a total of about 700GB per day in "Total number of bytes
transferred".  Most hosts are only sending some MB or maybe a dozen GB.
The substantial majority comes from two database servers (400GB and
150GB/day).

I only have access to the output emitted by the TSM client as it runs,
so I don't know what space is used on the server, but this 700GB/day
is the raw data size.  ("Objects compressed by: 0%")

> * what $dumpcycle is targeted?

Seven days is a nice default, but, given the scale of data here and the
request for maintaining 6 months of backups, I'm thinking 30 days might
be more sane.

Back when I was using amanda 20 years ago, I recall a lot of people
would run a 7-day tapecycle, then monthly and annual full archival
backups.  I assume something like that would be possible with vtapes as
well, so that could be an option for maintaining a seven-day dumpcycle
without needing an exabyte of storage.

And, personally, I think the 6 month retention is massive overkill in
any case.  I've been in this job for just over a decade, and I could
probably count the number of restores in that time on my fingers, and
none of them needed data more than a week old.

> * parallelity: will your new amanda server have multiple NICs etc / plan
> for a big holding disk (array)

We tend to default to 4 NICs on new server purchases and have gone
higher.  But we've only done active/passive bonding so far, which is
basically just single-NIC throughput.  We tried a higher-capacity mode
once, but the campus data center and I weren't able to get all the
pieces to coordinate properly to make it work.  (It was some years ago,
so I don't recall the details of the problems.)

Holding disk size is one of the things I'm looking for advice on.  The
largest DLE is currently a 19T NAS, but the admin responsible for that
system agrees that it should be split into multiple filesystems, even
aside from backup-related reasons.  Assuming it doesn't get split, would
20T holding disk be sufficient or does it need to be 2x the largest DLE?

> * fast network is nice, but this results in a bottleneck called
> *storage* -> fast RAID arrays, maybe SSDs.

My boss isn't particularly price-sensitive, but I doubt that he could
swallow the cost of putting all the vtapes on SSD, so hopefully it won't
come to that.  SSD for the holding disk should be doable.

> I'd start with asking: what do your current backups look like?
> 
> What is the current rate of new/changed data generated?

Covered that above, but, to quickly reiterate, we're using Tivoli
Storage Manager, which runs daily incrementals totaling approx. 700GB
(uncompressed) per day, the bulk of which is 400GB from one database
server and 150GB from a second database server.  Both are running
mysql/mariadb, if that matters.

> * how long does it take to copy all the 40TB into my amanda box (*if* I
> did a FULL backup every time)?

The 400GB/day server takes about 8 hours to do its daily run.  If we
assume that data rate and *no* parallelization, it comes out to a bit
over a week for 40T.

However, I assume that's being throttled by the TSM server, because I
get approximately double that rate when copying disk images on my kvm
servers, and those are using remote glusterfs disk mounts, so the data
is crossing the network multiple times.

> * what grade of parallelity is possible?

As much as the network capacity will support, really.  Our current
backups kick off simultaneously for almost all servers (the one
exception is that 400G/day db server, which starts earlier).  About half
finish within a minute or so (only backing up a couple hundred MB or
less) and most are complete within half an hour.  It's pretty much just
db servers (the ones I've mentioned already, plus some postgresql
machines with between 10 and 50G/day) that take longer than an hour to
complete.

-- 
Dave Sherohman


Re: How's amanda feeling these days?

2020-11-23 Thread Jon LaBadie
On Mon, Nov 16, 2020 at 07:25:41AM -0600, Dave Sherohman wrote:
> Hello, again!
> 
> You may recall my earlier question to the list, included below.  I've
> now talked with my other coworkers who work with servers and they've
> agreed to go with amanda for our new backup system.
> 
> Now I'd like to get some hardware recommendations.  I'm mostly unsure
> about what we'll need in terms of capacity, both for processing power
> and for storing the actual backups.  Less interested in specific model
> or part numbers, because it will need to come from one of our approved
> vendors, of course, and most likely by way of a formal tender process -
> but I can say that we almost always end up buying complete Dell
> rackmount systems.
> 
> The basic parameters I'm working with are:
> 
> - Backing up around 75 servers (mostly Debian, with a handful of other
>   linux distros and a handful of windows machines).
> 
> - Total amount of data to back up is currently in the 40 TB range.
> 
> - Everything is connected by fast (10- or 100-gigabit) networks.
> 
> - Backup will be to disk/vtapes.
> 
> - I've been asked to have backups available for the previous 6 months.
> 
> - I'm assuming that the best way to handle backup of windows clients
>   will be to mount the disk on a linux box and back it up from there,
>   although some of them are virtual machines, so doing a kvm snapshot
>   and backing that up instead would also be an option.
> 
> Given all that, how beefy of a box should I be looking at, and how much
> disk space can I expect to need?

I did reply to the original message, but looking back it was addressed
to you rather than the list.  In case it was overlooked, here were my
comments regarding space:

"Just some simple numbers.  Assuming a 7 day dumpcycle and daily runs.
 40TB / 7 day plus some promotion is about 7TB of level 0 (full) dumps
 per day.  Add a TB for incrementals means about 8TB of backup data / day.

 8TB / 1GB/sec is about 8000sec network traffic.  3-4 hrs, doable on your
 slower network.

 6 months retention is nominally 200 days X 8TB / day is 1600 TB of
 vtape capacity.  With 5TB disks that's 320 disks.  Compression will
 reduce that some, how much only experience will tell you."
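
The arithmetic in the quoted estimate can be reproduced with a short
Python sketch.  All inputs are the round numbers from the text above;
the variable names are just for illustration, and "1GB/sec" is taken
as roughly what a 10-gigabit link can carry.

```python
TOTAL_TB = 40          # total data to protect
DUMPCYCLE_DAYS = 7
FULLS_TB_PER_DAY = 7   # 40 / 7 is about 5.7; rounded up to allow for promotion
INCR_TB_PER_DAY = 1
NET_GB_PER_SEC = 1     # roughly what a 10-gigabit link can carry
RETENTION_DAYS = 200   # "nominally" six months, as in the quoted text
DISK_TB = 5

daily_tb = FULLS_TB_PER_DAY + INCR_TB_PER_DAY
transfer_sec = daily_tb * 1000 / NET_GB_PER_SEC
vtape_tb = RETENTION_DAYS * daily_tb
disks = vtape_tb / DISK_TB

print(f"raw fulls/day : {TOTAL_TB / DUMPCYCLE_DAYS:.1f} TB")
print(f"backup data   : {daily_tb} TB/day")
print(f"network time  : {transfer_sec:.0f} s ({transfer_sec / 3600:.1f} h)")
print(f"vtape capacity: {vtape_tb} TB on {disks:.0f} x {DISK_TB} TB disks")
```

The raw wire time works out nearer 2.2 hours; the 3-4 hrs in the
estimate presumably leaves room for overhead and for clients that
cannot fill the link.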

> 
> Also, as a side note, I'm planning on using VDO (Virtual Data Optimizer)
> to provide on-the-fly data compression and deduplication on the backup
> server, which should reduce disk consumption at the cost of CPU
> overhead.  I'm thinking it would make the most sense to use VDO only for
> the filesystem holding the vtapes, and not for the staging area, but
> feel free to correct me on that.

Did not know what VDO was, so I read a Red Hat description.  It seems
to consist of 3 components, and I question the value of each for amanda backup.
Hopefully someone with VDO experience can share it.

Elimination of zero filled blocks:  Compression is likely to greatly
shrink the storage of a string of zeros.

Only one copy of duplicate blocks:  Were your files being backed up
individually, as I do in a separate backup of my Home directory using
rsync, this could provide a worthwhile savings.  But you will likely
be merging your files into a tarball or a dumpfile.  The original
disk block alignment will be lost and likely not even match in one
day's tarball to the next.

LZ4 compression on the fly:  I don't know the cpu load for the server
compressing 8TB of data daily.  One thing you would have to deal with
is amanda's view of what has been sent to the backup device and what
size the data actually consume on the device.

There are points where amanda calculates how much space is left on
the device based on its configuration-specified size and how much it
has already sent.  Of course there is actually more space available
because the compression occurs after amanda's involvement.  The
difference may cause amanda to make less than optimal decisions.

Amanda administrators who use tape drive compression face the same
problem.  I believe most over specify the size of the storage medium
to allow more complete tape utilization.
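
As a rough illustration of that over-specification, here is a small
Python sketch that derives a size to advertise to amanda (for example
as the tapetype length) from the physical size and an expected
compression ratio.  The function, the 1.5:1 ratio and the 10% safety
margin are all assumptions made up for the example, not measured or
recommended values.

```python
def advertised_length_gb(physical_gb, expected_ratio, safety_margin=0.10):
    """Size to tell amanda for a volume that is compressed below amanda.

    expected_ratio: uncompressed bytes / stored bytes (2.0 means data halves).
    safety_margin: fraction held back in case data compresses worse than expected.
    """
    if expected_ratio < 1.0:
        raise ValueError("expected_ratio must be >= 1.0")
    if not 0.0 <= safety_margin < 1.0:
        raise ValueError("safety_margin must be in [0, 1)")
    return physical_gb * expected_ratio * (1.0 - safety_margin)


# Hypothetical numbers: a 1000 GB vtape slot and an assumed 1.5:1 ratio.
print(f"{advertised_length_gb(1000, 1.5):.0f} GB")
```

If the data compresses worse than assumed, the medium fills before
amanda expects it to, which is why some margin is kept back.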


As to Windows backup, I hope someone suggests a good solution.  I
currently use the proprietary "Zmanda Windows Client".  Generally
works well but suffers from a lack of development and unexplained
failures to connect.  It is often corrected by restarting the ZWC
services on the Windows system and always corrected by rebooting.

In the distant past I backed up windows systems by mounting the drives
on a UNIX host.  Most often used NFS.  Liked that approach except for
one thing.  Windows, at least then, does not like a file to be opened
by multiple processes.  So each backup had several files that
were not backed up because the file was already opened by another Windows
process.  And a few system files were never backed up.

Regarding backing up a KVM snapshot, would that mean that to recover
one file you would have to take a new snapshot, restore the entire system
from the backed up snap, copy the file to somewhere else,

Re: How's amanda feeling these days?

2020-11-23 Thread Stefan G. Weichinger
On 16.11.20 at 14:25, Dave Sherohman wrote:
> Hello, again!
> 
> You may recall my earlier question to the list, included below.  I've
> now talked with my other coworkers who work with servers and they've
> agreed to go with amanda for our new backup system.
> 
> Now I'd like to get some hardware recommendations.  I'm mostly unsure
> about what we'll need in terms of capacity, both for processing power
> and for storing the actual backups.  Less interested in specific model
> or part numbers, because it will need to come from one of our approved
> vendors, of course, and most likely by way of a formal tender process -
> but I can say that we almost always end up buying complete Dell
> rackmount systems.
> 
> The basic parameters I'm working with are:
> 
> - Backing up around 75 servers (mostly Debian, with a handful of other
>   linux distros and a handful of windows machines).
> 
> - Total amount of data to back up is currently in the 40 TB range.
> 
> - Everything is connected by fast (10- or 100-gigabit) networks.
> 
> - Backup will be to disk/vtapes.
> 
> - I've been asked to have backups available for the previous 6 months.
> 
> - I'm assuming that the best way to handle backup of windows clients
>   will be to mount the disk on a linux box and back it up from there,
>   although some of them are virtual machines, so doing a kvm snapshot
>   and backing that up instead would also be an option.
> 
> Given all that, how beefy of a box should I be looking at, and how much
> disk space can I expect to need?
> 
> Also, as a side note, I'm planning on using VDO (Virtual Data Optimizer)
> to provide on-the-fly data compression and deduplication on the backup
> server, which should reduce disk consumption at the cost of CPU
> overhead.  I'm thinking it would make the most sense to use VDO only for
> the filesystem holding the vtapes, and not for the staging area, but
> feel free to correct me on that.

I am a bit surprised by the fact you haven't yet received any reply on
the list so far (maybe per direct/private reply).

Your "project" and the related questions could start a new thread
without problems ;-)

In fact this is a rather *big* amanda installation as far as I know and
there are many things to consider:

* how dynamic is your data: are the incremental changes big or small ...

* what $dumpcycle is targeted?

* parallelity: will your new amanda server have multiple NICs etc / plan
for a big holding disk (array)

* fast network is nice, but this results in a bottleneck called
*storage* -> fast RAID arrays, maybe SSDs.

I am absolutely convinced that Amanda is able to back up your servers.

But IMO this will need a rather big box with fast storage and NICs.

And a fast holding disk (array) to provide parallelity.

-

I'd start with asking: what do your current backups look like?

What is the current rate of new/changed data generated?

(maybe I ignore some of your earlier postings right now, sorry)

-

Other amanda-users here run way bigger installations than me, and should
be able to share some tips here.

I think I would do some basic calculations at first:

* how long does it take to copy all the 40TB into my amanda box (*if* I
did a FULL backup every time)?

* what grade of parallelity is possible?

-> which client server hosts X TB, which bandwidth is available to each
server, which server is able to deliver this and that performance
because of its storage hw/setup ...

etc etc

-

Nevertheless a very interesting project, yes ;-)


Re: How's amanda feeling these days?

2020-11-20 Thread Dave Sherohman
Hello, again!

You may recall my earlier question to the list, included below.  I've
now talked with my other coworkers who work with servers and they've
agreed to go with amanda for our new backup system.

Now I'd like to get some hardware recommendations.  I'm mostly unsure
about what we'll need in terms of capacity, both for processing power
and for storing the actual backups.  Less interested in specific model
or part numbers, because it will need to come from one of our approved
vendors, of course, and most likely by way of a formal tender process -
but I can say that we almost always end up buying complete Dell
rackmount systems.

The basic parameters I'm working with are:

- Backing up around 75 servers (mostly Debian, with a handful of other
  linux distros and a handful of windows machines).

- Total amount of data to back up is currently in the 40 TB range.

- Everything is connected by fast (10- or 100-gigabit) networks.

- Backup will be to disk/vtapes.

- I've been asked to have backups available for the previous 6 months.

- I'm assuming that the best way to handle backup of windows clients
  will be to mount the disk on a linux box and back it up from there,
  although some of them are virtual machines, so doing a kvm snapshot
  and backing that up instead would also be an option.

Given all that, how beefy of a box should I be looking at, and how much
disk space can I expect to need?

Also, as a side note, I'm planning on using VDO (Virtual Data Optimizer)
to provide on-the-fly data compression and deduplication on the backup
server, which should reduce disk consumption at the cost of CPU
overhead.  I'm thinking it would make the most sense to use VDO only for
the filesystem holding the vtapes, and not for the staging area, but
feel free to correct me on that.

On Fri, Sep 25, 2020 at 08:19:58AM -0500, Dave Sherohman wrote:
> Howdy, all!
> 
> We've recently had some problems at work with our backup provider, so my
> boss has come to me and requested a recommendation for bringing backups
> in-house.  I've previously adminned a small amanda installation back in
> 2000-2006 and I quite liked the system and how it works, so that was my
> first thought.
> 
> I've done some general web searches and it looks like the situation
> today isn't as good as it was a decade and a half ago - not a lot of
> active development, limited support for Windows clients, etc.  But, on
> the other hand, amanda was already a very mature system back then, so I
> don't know that a lot of ongoing development would still be needed.
> 
> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?
> 
> My use case is that I'll be backing up somewhere in the neighborhood of
> 75ish servers, a mix of physical and (mostly) virtual machines, and a
> mix of mostly Linux with some Windows and one or two FreeBSD.  Total
> disk usage is currently in the 35-40 TB range, growing by maybe 1-2 TB
> per year.  Aside from my own positive experiences with amanda, both I
> and my boss (and most of my coworkers) are very pro-open-source.
> 
> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?
> 
> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?
> 
> -- 
> Dave Sherohman


-- 
Dave Sherohman


Re: How's amanda feeling these days?

2020-09-29 Thread Diego Zuccato
On 30/09/20 05:10, Olivier wrote:

> But if you don't use RAID, you can double the number of backups you are
> keeping, that includes older versions. That is a risk calculation
> between having secure backups and losing the backup data at the same time
> as the primary data.
We are using a 3U enclosure with 16 hot-swap bays. The SSD for the OS is
mounted internally and all the hot swap bays are populated w/ 4TB WD RED
disks configured in RAID6 via mdadm (I don't trust "HW RAID", since in
case of a controller failure your data is usually toast, but you can
always find some way to connect a bunch of drives to a mobo).
In case of a disk failure the only real problem is identifying which is the
failed disk (just run "md5sum /dev/mdX" and see which one isn't working,
unless you know how to handle enclosure LEDs).

> No, but I never dug into that very much. I need to replace a disk or mount
> an older disk about twice a year, so I can reboot at that time.
I always prefer hot-swap bays. The only thing I should have changed is
not to use all the slots: a free slot is useful both for connecting an
archive volume for offline storage and to rebuild the array in a safer
way (in some cases, the 'bad' disk is still mostly readable, so md can
copy data from it instead of having to read all the other disks, which
could expose another failing disk).

> Developers can also be told to leave their workstations on all the time;
> they can understand the reason why, and then you can use
> Amanda. Administrative staff do not always understand these details.
Administratives can be taught (but usually only the hard way) that the
only recoverable documents are the ones they put on the dedicated
network share. ]:)

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786


Re: How's amanda feeling these days?

2020-09-29 Thread Olivier
Dave Sherohman writes:

> On Tue, Sep 29, 2020 at 10:55:51AM +0700, Olivier wrote:
>> I think that independent disks are better than any RAID thing that
>> would either waste some storage, or render the array unusable if any
>> one disk gets faulty.
>
> If we go disk-based/vtape for storage, I'd be looking at RAID 1 for
> protection against disk failure (which we've had trouble with in
> production systems in the past).  Definitely not going to use anything
> that would increase exposure to data loss, of course.

But if you don't use RAID, you can double the number of backups you are
keeping, that includes older versions. That is a risk calculation
between having secure backups and losing the backup data at the same time
as the primary data.
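
To put numbers on that trade-off, here is a minimal Python sketch of
usable capacity for a few simple layouts.  The disk count matches the
six vtape disks mentioned below; the 4 TB disk size and the function
itself are illustrative assumptions, and filesystem overhead is
ignored.

```python
def usable_tb(n_disks, disk_tb, layout):
    """Usable capacity for a few simple layouts (ignores filesystem overhead)."""
    if layout == "independent":   # no redundancy: every disk holds vtapes
        return n_disks * disk_tb
    if layout == "raid1":         # mirrored pairs: half the raw capacity
        return (n_disks // 2) * disk_tb
    if layout == "raid6":         # two disks' worth of parity per array
        return max(n_disks - 2, 0) * disk_tb
    raise ValueError(f"unknown layout: {layout}")


# Example: six vtape disks of 4 TB each (sizes are illustrative only).
for layout in ("independent", "raid1", "raid6"):
    print(f"{layout:12s} {usable_tb(6, 4, layout)} TB usable")
```

Independent disks give twice the space of RAID 1, at the price of
losing whatever vtapes sit on a disk that fails.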

>> I have 8 disk bays, one disk for system, holding disk and a copy of
>> the final state of the accounts about to be deleted. Another disk for
>> some additional holding space and a second copy of the deleted
>> accounts and the 6 other disks for vtapes.
>
> Are those hot-swappable?  I've made a couple half-hearted attempts at
> hot-swap disk in the past, but never managed to make it work myself.

No, but I never dug into that very much. I need to replace a disk or mount
an older disk about twice a year, so I can reboot at that time.

>> I have been looking at some solution that would run on the user's
>> workstation and that would push a backup on some server and use Amanda
>> to keep a backup of that server.
>
> I've got a couple developers who are backing up their workstations via
> rsync to the FreeNAS server, which sounds like basically the same
> concept.  They seem happy with it, although we've never needed to do a
> restore on any of that data.

Developers can also be told to leave their workstations on all the time;
they can understand the reason why, and then you can use
Amanda. Administrative staff do not always understand these details.

Best regards,

Olivier

-- 


Re: How's amanda feeling these days?

2020-09-29 Thread Charles Curley
On Tue, 29 Sep 2020 07:34:08 -0500
Dave Sherohman wrote:

> I've got a couple developers who are backing up their workstations via
> rsync to the FreeNAS server, which sounds like basically the same
> concept.  They seem happy with it, although we've never needed to do a
> restore on any of that data.

Take a look at rsnapshot for this.

-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/


Re: How's amanda feeling these days?

2020-09-29 Thread Dave Sherohman
On Tue, Sep 29, 2020 at 10:55:51AM +0700, Olivier wrote:
> nice thing: part of Amanda is written in Perl and integrates well with
> some admin scripts here and there).

Quite nice indeed!  I wasn't aware of that detail, but our IT group is
very Perl-centric, so good to know we're well-situated to make changes
or extensions to amanda if needed.

> One of my concerns would be the network bandwidth, you'd need at least
> 10Gbps network to push that amount of data through in one day. That is
> where Amanda trying to balance the size of the backup across a cycle may
> come handy.

I think we should be able to manage it.  Available bandwidth is
sufficient for the current TSM solution through the law school.  The two
largest servers typically take 6-8 hours to complete their backup runs,
so a bottleneck at the amanda server itself could be an issue, but I can
probably get campus data services to give us a faster link for it if
needed.

> I think that independent disks are better than any RAID thing that
> would either waste some storage, or render the array unusable if any
> one disk gets faulty.

If we go disk-based/vtape for storage, I'd be looking at RAID 1 for
protection against disk failure (which we've had trouble with in
production systems in the past).  Definitely not going to use anything
that would increase exposure to data loss, of course.

> I have 8 disk bays, one disk for system, holding disk and a copy of
> the final state of the accounts about to be deleted. Another disk for
> some additional holding space and a second copy of the deleted
> accounts and the 6 other disks for vtapes.

Are those hot-swappable?  I've made a couple half-hearted attempts at
hot-swap disk in the past, but never managed to make it work myself.

> Amanda is not very well suited to backup workstations when most of the
> users will turn off their machine after work. I don't know if that is
> part of your plan,

Nope, I'm not backing up any end-user workstations.  Glusterfs storage
cluster, some standalone Windows-based and FreeNAS storage servers, half
a dozen kvm host servers, and a bunch of (mostly-)virtual servers
providing actual end-user services.

> I have been looking at some solution that would run on the user's
> workstation and that would push a backup on some server and use Amanda
> to keep a backup of that server.

I've got a couple developers who are backing up their workstations via
rsync to the FreeNAS server, which sounds like basically the same
concept.  They seem happy with it, although we've never needed to do a
restore on any of that data.

> One thing I implemented a long time ago is a replication of Amanda
> configuration and indexes: at the end of each dump, I push a copy of
> that data with rsync and also send that information to me by email
> (email being replicated automatically on the email server). With the
> disks and the indexes, I can manually extract about any backup from a
> machine that would not even have Amanda running. I choose to use tar
> rather than dump for that compatibility advantage.

Seems like a good tip.  Thanks!

-- 
Dave Sherohman


Re: How's amanda feeling these days?

2020-09-28 Thread Olivier
Dave,

> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?

I cannot speak for a new installation as I have been using Amanda for
over 2 decades. But for backing up servers, a mix of unixes, I find it
very reliable and dependable.

I have not been using the latest bells and whistles; I have developed
some of my own to fit my needs (a nice thing is that part of Amanda is
written in Perl and integrates well with some admin scripts here and there).

I am far from dealing with your number of servers and volume of data,
having about 15 machines and about 10TB. On a 4-core, 4GB RAM machine,
backups take only a few hours every night, with compression done on the
client or on the Amanda backup server, about half and half.

> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?

One of my concerns would be the network bandwidth: you'd need at least a
10Gbps network to push that amount of data through in one day. That is
where Amanda trying to balance the size of the backup across a cycle may
come in handy.
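
As a rough check on the bandwidth concern, the arithmetic can be sketched
like this. The 40 TB total is the upper end of the figure quoted in this
thread; the 8-hour nightly window and the 7-day dumpcycle are assumptions
chosen only for illustration, and the calculation ignores compression,
incrementals and protocol overhead.

```python
# Back-of-the-envelope link speed needed to move a given volume in a window.
# Assumptions (not taken from Amanda): decimal units, no compression,
# no incrementals, no protocol overhead.

def required_gbps(terabytes: float, window_hours: float) -> float:
    """Sustained rate in Gbit/s needed to move `terabytes` in `window_hours`."""
    bits = terabytes * 1e12 * 8
    seconds = window_hours * 3600
    return bits / seconds / 1e9

TOTAL_TB = 40.0        # upper end of the 35-40 TB mentioned in the thread
WINDOW_HOURS = 8.0     # hypothetical nightly backup window
DUMPCYCLE_DAYS = 7     # hypothetical dumpcycle

all_in_one_day = required_gbps(TOTAL_TB, 24)
all_in_one_window = required_gbps(TOTAL_TB, WINDOW_HOURS)
spread_over_cycle = required_gbps(TOTAL_TB / DUMPCYCLE_DAYS, WINDOW_HOURS)

print(f"everything in 24 h:          {all_in_one_day:.2f} Gbit/s")
print(f"everything in one window:    {all_in_one_window:.2f} Gbit/s")
print(f"1/{DUMPCYCLE_DAYS} of the data per window: {spread_over_cycle:.2f} Gbit/s")
```

Spreading the fulls over the cycle is what brings the nightly requirement
down from well above gigabit speeds to something closer to a single
gigabit link, at least under these assumptions.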

I had been using QIC-type tapes for years and changed to vtapes on
disk about 12 years ago. As I was not satisfied that it could only use
one disk at that time (or did I misunderstand something?), I developed
my own tape changer that could work with multiple independent disks. By
choice, I think that independent disks are better than any RAID thing
that would either waste some storage, or render the array unusable if
any one disk gets faulty.

Since then, I have been installing bigger disks in the same server when I
need more space, from 500GB to 1.5TB, 3TB and lately one 6TB (I tried
with only one because I was not sure the motherboard would recognize
it). The old disks are stored so I can retrieve the data from them if
needed; I just have to plug the disk into the Amanda server. I have 8 disk
bays: one disk for system, holding disk and a copy of the final state of
the accounts about to be deleted; another disk for some additional
holding space and a second copy of the deleted accounts; and the 6 other
disks for vtapes. Also by choice, I am using smallish vtapes so they are
almost all full, which limits the fragmentation (most of the disks end up
with less than 1M empty space).

I also use Amanda to back up one Windows machine, through the Samba
method. Amanda is not very well suited to backing up workstations when
most of the users will turn off their machine after work. I don't know if
that is part of your plan, but I have been looking at some solution that
would run on the user's workstation and push a backup to some server,
and then use Amanda to keep a backup of that server.

One thing I implemented a long time ago is a replication of Amanda
configuration and indexes: at the end of each dump, I push a copy of
that data with rsync and also send that information to me by email
(email being replicated automatically on the email server). With the
disks and the indexes, I can manually extract about any backup from a
machine that would not even have Amanda running. I chose to use tar
rather than dump for that compatibility advantage.
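
A minimal sketch of that kind of post-dump replication, assuming rsync is
installed and that the configuration and index live in the directories
named below. All paths and the destination host are hypothetical
placeholders, not Amanda defaults, and this is not the actual script
described above. The demo only prints the command instead of running it.

```python
import subprocess

def build_rsync_command(sources: list[str], destination: str) -> list[str]:
    """Build an rsync invocation that mirrors `sources` into `destination`."""
    # -a preserves permissions and timestamps; --delete mirrors removals.
    return ["rsync", "-a", "--delete", *sources, destination]

def replicate(sources: list[str], destination: str, dry_run: bool = True) -> list[str]:
    """Run (or, with dry_run, only print) the replication command."""
    cmd = build_rsync_command(sources, destination)
    if dry_run:
        print("would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd

# Hypothetical locations, for illustration only.
CONFIG_DIR = "/etc/amanda/daily"
INDEX_DIR = "/var/lib/amanda/daily/index"
DEST = "backupcopy@standby.example.org:/srv/amanda-meta/"

cmd = replicate([CONFIG_DIR, INDEX_DIR], DEST, dry_run=True)
```

Something like this could be called from cron right after the nightly
amdump run finishes.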

Best regards,

Olivier


Re: How's amanda feeling these days?

2020-09-28 Thread Stefan G. Weichinger
On 25.09.20 at 15:19, Dave Sherohman wrote:

> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?

This thread contains numerous valuable and positive replies already; I
am a bit late and have decided to reply to the original posting as well.

It's correct and sad that the development of the open source amanda
project has been de facto dead for years now. I would love to see
movement and development here.

Aside from that, the available software is quite stable and useful
nonetheless.

I have run >10 separate amanda installations at various customers for
years and I am still convinced that it is a solid and reliable backup
solution for environments with linux servers.

Ad windows backups:

don't expect too much

dumping CIFS-shares via smbclient works OK for me. edge cases: maybe

Don't expect specific plugins like MS-SQL, Exchange, etc in the
community edition of amanda. And I don't know about the quality of the
commercial stuff from Zmanda/Betsol.

hardware:

you received some valid pointers already. In general amanda is rather
"compatible": if your linux distribution is able to use some hardware,
it's very likely that amanda is capable as well.

It all comes down to finding your bottleneck: is the overall system
capable of moving all the data (to be dumped to fulfill your dump
schedule) within your backup window?
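
The bottleneck question can be put into numbers with a small sketch. The
stage names and throughput figures below are made-up placeholders (roughly
what a gigabit link might sustain, plus arbitrary values for compression
and disk), so only the method matters: the slowest stage sets the pace.

```python
# The slowest stage of the pipeline decides how long a run takes.
# All figures are hypothetical; measure your own before sizing hardware.

def run_time_hours(data_tb: float, stage_mb_per_s: dict[str, float]) -> tuple[str, float]:
    """Return (name of slowest stage, hours needed to move data_tb through it)."""
    bottleneck = min(stage_mb_per_s, key=stage_mb_per_s.get)
    rate = stage_mb_per_s[bottleneck]
    hours = data_tb * 1e6 / rate / 3600   # 1 TB = 1e6 MB (decimal units)
    return bottleneck, hours

stages = {
    "network": 110.0,       # MB/s, assumed practical rate of a gigabit link
    "compression": 200.0,   # MB/s, assumed aggregate gzip throughput
    "vtape disk": 150.0,    # MB/s, assumed sequential write speed
}
NIGHTLY_TB = 5.7            # hypothetical amount to dump in one run
WINDOW_HOURS = 8.0          # hypothetical backup window

bottleneck, hours = run_time_hours(NIGHTLY_TB, stages)
fits = hours <= WINDOW_HOURS
print(f"bottleneck: {bottleneck}, {hours:.1f} h needed, fits window: {fits}")
```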

Holding disk(s) help to parallelize things and decouple the transfers
via network from the actual writing to the tapes or vtapes.

tldr:

amanda is a powerful tool. Simple and complex at the same time.

It should work for you, and the community here will be helpful in
getting that solved ;-)

feedback welcome :-)


Re: How's amanda feeling these days?

2020-09-27 Thread Dave Sherohman
On Sat, Sep 26, 2020 at 11:10:14AM -0600, Charles Curley wrote:
> So this is a "green field" installation as far as backup gear goes:
> backup servers, tape drives or hard drives for vtapes, network
> enhancements, etc., as needed. I have no doubt this crew can advise you
> well on all of that.

Good, that's exactly what I was hoping for!

The one thing you mentioned that's not in-scope for this is network
enhancements.  Connectivity is strictly the domain of campus data
services, but we've already got gigabit links in place between all
sites and can get a dedicated VLAN set up just for backups if necessary,
so I don't think there should be any problems in that area.

> However, paranoids live longer. Do not terminate your contract with the
> law school until you are getting good backups locally of everything you
> want to back up.

Oh, absolutely.  And, as was suggested earlier, migrating to the new
backup system server-by-server is definitely the way to go, rather than
flipping the switch on everything all at once.

> And a thought that probably does not sit well with the typical library
> budget: If necessary, spend the money to get good kit. It will save you
> trouble (and hence money) in the long run.

This, at least, is not an issue.  While I don't have an actual fixed
budget allocation to work with for this project, both the library as a
whole and the IT group specifically are sufficiently funded that I've
never had a problem with any kind of (justifiable) hardware purchases.
The sky isn't the limit, but I can almost certainly get whatever is
needed to do it right and I'm sure my boss is expecting this to run at
least US$10-15k in hardware.

-- 
Dave Sherohman


Re: How's amanda feeling these days?

2020-09-26 Thread Gene Heskett
On Saturday 26 September 2020 13:10:14 Charles Curley wrote:

> On Sat, 26 Sep 2020 07:54:46 -0500
>
> Dave Sherohman  wrote:
> > But now we want to terminate th[e law school] contract and
> > start handling backups ourselves, within the library's own IT group,
> > which will remove both TSM and the law school from the picture
> > entirely.
>
> So this is a "green field" installation as far as backup gear goes:
> backup servers, tape drives or hard drives for vtapes, network
> enhancements, etc., as needed. I have no doubt this crew can advise
> you well on all of that.
>
> However, paranoids live longer. Do not terminate your contract with
> the law school until you are getting good backups locally of
> everything you want to back up.
>
> And a thought that probably does not sit well with the typical library
> budget: If necessary, spend the money to get good kit. It will save
> you trouble (and hence money) in the long run.

I'll second and third that. Start with an Asus motherboard, something you
can stick in a locked closet with a UPS and forget for 6 months or more at
a time. This machine has an Asus x-370 mobo and a 6 core i5 with 32Gigs of
ram; current uptime is 53 days. I do literally everything on this
machine. And it feels as if it will do 453 days just as easily if some
kernel/security update doesn't need a reboot before then. Good quality
hardware Just Works. I'm hearing Seagate getting bad-mouthed, but that
has not been my experience. And while I'd recommend spinning rust for
the amandatapes drive if you go that way, I'd also recommend an SSD for
the backup machine's boot drive and OS. I am using several in the 120 to
240 gig size range in my cnc machines with zero problems in the last 2
years. Boot time is cut by 2/3rds, and they are noticeably faster than
spinning rust. Spinning rust for /amandatapes use because that's buckets
cheaper than a multiterabyte SSD raid array would be.

My $0.02, but adjust for inflation since 1934. ;-)

Copyright 2019 by Maurice E. Heskett
Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page 


Re: How's amanda feeling these days?

2020-09-26 Thread Charles Curley
On Sat, 26 Sep 2020 07:54:46 -0500
Dave Sherohman  wrote:

> But now we want to terminate th[e law school] contract and
> start handling backups ourselves, within the library's own IT group,
> which will remove both TSM and the law school from the picture
> entirely.

So this is a "green field" installation as far as backup gear goes:
backup servers, tape drives or hard drives for vtapes, network
enhancements, etc., as needed. I have no doubt this crew can advise you
well on all of that.

However, paranoids live longer. Do not terminate your contract with the
law school until you are getting good backups locally of everything you
want to back up.

And a thought that probably does not sit well with the typical library
budget: If necessary, spend the money to get good kit. It will save you
trouble (and hence money) in the long run.

-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/


Re: How's amanda feeling these days?

2020-09-26 Thread Gene Heskett
On Saturday 26 September 2020 08:54:46 Dave Sherohman wrote:

> On Sat, Sep 26, 2020 at 01:35:51PM +0100, J Martin Rushton wrote:
> > If your law school is using TSM, then the only use for Amanda would
> > be to back up the small machines onto the central server, and then
> > TSM can migrate (HSM) or backup to tape/slow disk.
>
> Guess my overview of the history and organizations involved was a bit
> too brief...
>
> My department is the central university library.  We're not officially
> affiliated with the law school, they just offered backups as a service
> when they set up their own backup system, and we contracted with them
> to purchase that service.  But now we want to terminate that contract
> and start handling backups ourselves, within the library's own IT
> group, which will remove both TSM and the law school from the picture
> entirely.

Which amanda can do. The problem we see most often is how amanda goes
about scheduling those backups, as it does not use a fixed schedule. It
will do a full of a given Disk List Entry, called a DLE, within the time
period it's told to, but within that time period, it will shuffle the
order around to attempt to use about the same amount of media each run.
This was valuable when smaller tapes and less than reliable drives were
the norm, but not so much today with the advent of huge hard drives that
are hundreds of times more dependable, with the tapes turning into
directories per run, and files within those directories. Amanda tracks
all that so you don't have to. And amrecover can also restore a previous
version of a file if needed.
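
To make the balancing idea concrete, here is a toy sketch. It is not
Amanda's planner; it is a plain greedy heuristic (largest DLE first, always
into the currently lightest day), and the DLE names and sizes are invented
for the example.

```python
# Toy illustration of spreading full dumps across a dumpcycle so that each
# run writes roughly the same amount. NOT Amanda's real scheduling logic.

def spread_fulls(dle_sizes_gb: dict[str, float], dumpcycle_days: int) -> list[list[str]]:
    """Assign each DLE's full dump to a day, keeping daily totals balanced."""
    days: list[list[str]] = [[] for _ in range(dumpcycle_days)]
    totals = [0.0] * dumpcycle_days
    # Place the biggest DLEs first; each goes to the day with the least data.
    for name, size in sorted(dle_sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        lightest = totals.index(min(totals))
        days[lightest].append(name)
        totals[lightest] += size
    return days

# Hypothetical DLEs and sizes in GB.
dles = {
    "fileserver:/home": 900,
    "fileserver:/srv": 600,
    "db1:/var/lib": 500,
    "mail:/var/mail": 400,
    "web1:/var/www": 300,
    "kvm1:/etc": 100,
}

schedule = spread_fulls(dles, 3)
day_totals = [sum(dles[name] for name in day) for day in schedule]
for number, (day, total) in enumerate(zip(schedule, day_totals), start=1):
    print(f"day {number}: {total:5d} GB  {', '.join(day)}")
```

The real planner also weighs incrementals, estimates and priorities, which
is why the order of fulls can look shuffled from one cycle to the next.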

Commercial time, but no money involved.

I've written a wrapper, GenesAmandaHelper (you may have to edit it a bit
to suit your needs), which allows amanda to do a bare metal restore to the
state the system was in at the completion of the just-completed run,
whereas without my wrapper you lose a run's data because the recorded
index could be from yesterday's run. My scripts attach both the index
data, including that generated by the just-completed run, and the amanda
configuration that made that backup to the end of each run.  This is a
considerable amount of data that would be missing from amanda's database
in the event the drive holding it went toes up. That extra file from last
night's run is currently 594,288,640 bytes of data here for the 5 machine
backup from last night and is the index data from 60 vtapes on a 2TB
drive.  All of it.  The backed up configuration OTOH is only about 12k,
but if the drive it lives on expires, you've still got a working config on
the /amandatapes drive in the Dailys/data directory.
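
For readers who want the general idea without the wrapper itself, here is a
minimal sketch of bundling a configuration directory and an index directory
into one archive after a run. It is not GenesAmandaHelper; the directory
and file names are placeholders created in a temporary folder just for the
demo.

```python
import tarfile
import tempfile
from pathlib import Path

def bundle_metadata(paths: list[Path], out_file: Path) -> list[str]:
    """Write the given directories into one gzip-compressed tar archive
    and return the member names as a quick sanity check."""
    with tarfile.open(out_file, "w:gz") as tar:
        for path in paths:
            tar.add(path, arcname=path.name)
    with tarfile.open(out_file, "r:gz") as tar:
        return tar.getnames()

# Demo on throwaway directories standing in for the real config and index.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    config = root / "config"
    index = root / "index"
    config.mkdir()
    index.mkdir()
    (config / "amanda.conf").write_text("# placeholder config\n")
    (index / "host1._home.gz").write_bytes(b"placeholder index")
    members = bundle_metadata([config, index], root / "amanda-meta.tar.gz")
    print(members)
```

Storing such an archive next to the vtapes means the metadata survives the
loss of the drive that normally holds it.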

And I sleep better.

Copyright 2020 by Maurice E. Heskett
Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page 


Re: How's amanda feeling these days?

2020-09-26 Thread Dave Sherohman
On Sat, Sep 26, 2020 at 01:35:51PM +0100, J Martin Rushton wrote:
> If your law school is using TSM, then the only use for Amanda would be to
> back up the small machines onto the central server, and then TSM can migrate
> (HSM) or backup to tape/slow disk.

Guess my overview of the history and organizations involved was a bit
too brief...

My department is the central university library.  We're not officially
affiliated with the law school, they just offered backups as a service
when they set up their own backup system, and we contracted with them to
purchase that service.  But now we want to terminate that contract and
start handling backups ourselves, within the library's own IT group,
which will remove both TSM and the law school from the picture entirely.

-- 
Dave Sherohman


Re: How's amanda feeling these days?

2020-09-26 Thread J Martin Rushton

On 26/09/2020 12:58, Dave Sherohman wrote:

> But I don't really know anything about the hardware or configuration
> that the law school is using, only that their chosen software is Tivoli
> Storage Manager.

TSM (now called Spectrum Protect) is IBM's proprietary storage handling
system.  Typically it is integrated with the GPFS filesystem (now called
Spectrum Scale).  TSM uses a database to keep track of where data has
been moved to.  On top of TSM you can have a hierarchical storage
solution (HSM), a backup and archiving solution (BAclient) and IIRC you
can even write your own application if you want to (masochists only).

The BAclient is fairly expensive, so what some places do is use an
in-house or third-party application to back up remote machines to a
central location, and then allow BAclient to backup the central dumps.

If your law school is using TSM, then the only use for Amanda would be
to back up the small machines onto the central server, and then TSM can
migrate (HSM) or backup to tape/slow disk.


--
J Martin Rushton MBCS


Re: How's amanda feeling these days?

2020-09-26 Thread Dave Sherohman
On Fri, Sep 25, 2020 at 08:37:52AM -0600, Charles Curley wrote:
> I think so. You have a lot more machines to herd than I do, but I don't
> think that's an impediment.

Agreed.  I have absolutely no doubt that amanda can handle extremely
large installations, provided that the network can deliver the data to
the amanda server within a reasonable backup window.

> What are you backing up to now?

I work at a university library and, in the past, we backed up to the
central campus data center's backup service, but they charged an arm and
a leg for it (and frequently "forgot" to do things like stop charging us
for backups of decommissioned machines), so we switched to doing backups
via the law school four or five years ago.  But the law school's
sysadmins have never been terribly responsive and they've had a few
problems recently with failures (for their other customers) which we
only heard about some time later, so we don't really trust them any more
and have (finally) decided to do it for ourselves.

But I don't really know anything about the hardware or configuration
that the law school is using, only that their chosen software is Tivoli
Storage Manager.


-- 
Dave Sherohman


Re: How's amanda feeling these days?

2020-09-25 Thread Gene Heskett
On Friday 25 September 2020 14:25:16 Rüdiger Kessel wrote:

> Dear Gene,
>
> you mentioned that you use a dumpcycle of 58 days with incremental
> backups in between.
>
> I use a bit longer dumpcycle and I have problems when I use the
> amrecover tool.
>
> The problem is connected with the fact that amrecover consumes a lot
> of ports and will eventually run out of some internal buffer space.
> This only happens if the dumpcycle is larger than about 40.
>
> I can restore the disks from backup reliably by using amfetchdump, but
> with amrecover it depends.
>
> Have you seen this problem before?
>
> Best regards
>
> Rüdiger Kessel
>
No, I haven't. OTOH, I have not had occasion to restore since I had a
motherboard fire, destroying one of the usb ports, and put a new Asus
X370 mobo, a 6 core 3.7GHz I5, and 32 GB of dram behind that same set of
drives and rebooted.  My thoughts are that with that much memory to play
with, it probably will not have a problem. My backups are running, doing
5 machines with the compressible stuff being fed to gzip --best, in
usually under an hour for nominally 20 to 30GB written to a 2TB
drive.  Are you able to monitor memory usage while amrecover is running,
with something like htop?  Might be educational for both of us.
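
If htop is not convenient, resident memory can also be read from /proc on
Linux. This is only a sketch: the sample text below is invented, and you
would substitute the real process ID of amrecover (for example from pgrep)
when calling rss_kib.

```python
import os
from pathlib import Path
from typing import Optional

def parse_vmrss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def rss_kib(pid: int) -> Optional[int]:
    """Resident set size of a process in KiB, or None if it cannot be read."""
    path = Path(f"/proc/{pid}/status")
    if not path.exists():
        return None
    return parse_vmrss_kib(path.read_text())

# Invented sample in the format the Linux kernel uses.
sample = "Name:\tamrecover\nVmPeak:\t  204800 kB\nVmRSS:\t  102400 kB\n"
print(parse_vmrss_kib(sample))

# On Linux this reports the memory of the current Python process.
print(rss_kib(os.getpid()))
```

Polling rss_kib in a loop every few seconds during a restore would show
whether memory keeps climbing.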

> On Fri, 25 Sept 2020 at 16:24, Gene Heskett <[email protected]> wrote:
> > On Friday 25 September 2020 09:19:58 Dave Sherohman wrote:
> > > Howdy, all!
> > >
> > > We've recently had some problems at work with our backup provider,
> > > so my boss has come to me and requested a recommendation for
> > > bringing backups in-house.  I've previously adminned a small
> > > amanda installation back in 2000-2006 and I quite liked the system
> > > and how it works, so that was my first thought.
> > >
> > > I've done some general web searches and it looks like the
> > > situation today isn't as good as it was a decade and a half ago -
> > > not a lot of active development, limited support for Windows
> > > clients, etc.  But, on the other hand, amanda was already a very
> > > mature system back then, so I don't know that a lot of ongoing
> > > development would still be needed.
> > >
> > > So let's see what the current users have to say.  Is a new amanda
> > > installation still a sane choice in 2020?
> > >
> > > My use case is that I'll be backing up somewhere in the
> > > neighborhood of 75ish servers, a mix of physical and (mostly)
> > > virtual machines, and a mix of mostly Linux with some Windows and
> > > one or two FreeBSD.  Total disk usage is currently in the 35-40 TB
> > > range, growing by maybe 1-2 TB per year.  Aside from my own
> > > positive experiences with amanda, both I and my boss (and most of
> > > my coworkers) are very pro-open-source.
> > >
> > > If amanda isn't a reasonable choice for that scenario, what would
> > > be a better option?
> > >
> > > And what kind of hardware specs should I be looking at?  Is tape
> > > still king, or is everyone backing up to hard drives now?
> >
> > I've been using hard drives with amanda for over a decade now, I'd
> > estimate they are 100x more dependable than tape, as I've had zero
> > hd failures, and dozens of tape failures and tape drives that
> > absolutely had to spend the holidays in Oklahoma City being rebuilt.
> > And some of the HD's had nearly 100k spinning hours on them when
> > they got too small. Currently doing 5 machines nightly, with about
> > 58 days for vtapes recycle.  Whats not to like?  Amanda has Just
> > Worked here since the later 90's.
> >
> > Cheers, Gene Heskett
> > --
> > "There are four boxes to be used in defense of liberty:
> >  soap, ballot, jury, and ammo. Please use in that order."
> > -Ed Howdershelt (Author)
> > If we desire respect for the law, we must first make the law
> > respectable. - Louis D. Brandeis
> > Genes Web page 


Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page 



RE: How's amanda feeling these days?

2020-09-25 Thread Cuttler, Brian R (HEALTH)


I came into an existing Amanda environment when I started this job 22 years ago.
Amanda has been upgraded numerous times on many platforms, and while we have 
migrated away from SGI clients we continue to have Solaris clients and several 
flavors of linux. Backup servers vary between Solaris and linux, we have 
upgraded tape drives and jukeboxes and use VTape on several of our Amanda 
servers.

Amanda has never been problematic and has saved us on numerous occasions.

Admittedly we are back rev and have not used some of the newer features, but 
have made good use of tape flush parameters, jukebox and vtape control and have 
never had difficulty with cross platform issues (taking into account 
OS/filesystem specific native tools).

Brian Cuttler
Wadsworth Center/NYS Department of Health

-Original Message-
From: [email protected]  On Behalf 
Of Debra S Baddorf
Sent: Friday, September 25, 2020 2:09 PM
To: Dave Sherohman 
Cc: Debra S Baddorf ; [email protected]
Subject: Re: How's amanda feeling these days?


I've had an amanda “world” running for 15 or 20 years.  I have 33 unix nodes,
of varying flavors of unix.  They play together nicely, though you can’t unpack
a file on a different flavor of unix. I think you can with TAR rather than DUMP.
And, since many of my disks are big, I’m using a lot of TAR, to split them into
smaller chunks, which have grown in size as my tape drive has been upgraded
over time.

We still use physical tape.   I keep 70 days of backups  (inc every day, full
once a week) and a separate config for archival fulls once a month.  10-12 times
your amount of data (fulls once a week) might be rather a lot of disk space, but
I haven’t compared.  Someone else recently discussed the large cost of using
cloud space.

I’m not doing any Windows backups;  we have another group that does those.

I’m happy with amanda’s current capabilities.  In fact, I’m not running the most
recent versions of amanda, not wanting the effort of changing for features I 
won’t use.

My data size is a bit over 30 TB, since my archival fulls now require a second
LTO5 tape,  one of which claims 30 TB capacity.

Another data point -
Deb Baddorf
Fermilab


> On Sep 25, 2020, at 8:19 AM, Dave Sherohman  wrote:
>
> Howdy, all!
>
> We've recently had some problems at work with our backup provider, so my
> boss has come to me and requested a recommendation for bringing backups
> in-house.  I've previously adminned a small amanda installation back in
> 2000-2006 and I quite liked the system and how it works, so that was my
> first thought.
>
> I've done some general web searches and it looks like the situation
> today isn't as good as it was a decade and a half ago - not a lot of
> active development, limited support for Windows clients, etc.  But, on
> the other hand, amanda was already a very mature system back then, so I
> don't know that a lot of ongoing development would still be needed.
>
> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?
>
> My use case is that I'll be backing up somewhere in the neighborhood of
> 75ish servers, a mix of physical and (mostly) virtual machines, and a
> mix of mostly Linux with some Windows and one or two FreeBSD.  Total
> disk usage is currently in the 35-40 TB range, growing by maybe 1-2 TB
> per year.  Aside from my own positive experiences with amanda, both I
> and my boss (and most of my coworkers) are very pro-open-source.
>
> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?
>
> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?
>
> --
> Dave Sherohman





Re: How's amanda feeling these days?

2020-09-25 Thread Chris Hoogendyk
I'm running Amanda 3.5.1 on 3 backup servers in 3 different departments. One of them is on the small 
side and still running LTO6 in an Overland tape library with 1 drive. The other 2 are sizeable in 
terms of data, but small in terms of servers. Roughly half a dozen Linux Ubuntu 16.04 servers each, 
but with around 100TB of storage capacity more than half filled. One of the two has a fair bit of 
highly compressible data, while the other has a fair bit of highly uncompressible data. They are 
both running with dual LTO7 tape drives in Overland tape libraries. Backups every night of the week 
with fulls at least once a week. Lots of manual splitting of DLEs. The one with the compressible 
data keeps up fairly well. The one with uncompressible data ends up not being able to do everything 
periodically, and I have to deal with it manually. I let the LTO7 do the compression. It is 
important with my setups. The gzip processes were eating up my servers, and things run much faster 
with the LTO7 compression.
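
The difference between compressible and uncompressible data is easy to see
with a small experiment. This sketch uses Python's zlib module, which
implements the same DEFLATE algorithm gzip uses; the sample data is
synthetic (a repeated log-like line versus random bytes), so real backup
streams will fall somewhere in between.

```python
import os
import time
import zlib

def compress_stats(data: bytes, level: int = 6) -> tuple[float, float]:
    """Return (compression ratio, seconds spent) for zlib at the given level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / len(compressed), elapsed

SIZE = 4 * 1024 * 1024  # 4 MiB of each kind of data

# Highly repetitive text stands in for compressible data.
line = b"2020-09-25 amanda dump ok\n"
compressible = (line * (SIZE // len(line) + 1))[:SIZE]
# Random bytes stand in for already-compressed or encrypted data.
incompressible = os.urandom(SIZE)

ratio_text, secs_text = compress_stats(compressible)
ratio_rand, secs_rand = compress_stats(incompressible)

print(f"repetitive text: ratio {ratio_text:8.1f}, {secs_text:.3f} s")
print(f"random bytes:    ratio {ratio_rand:8.3f}, {secs_rand:.3f} s")
```

On data like the random sample, the CPU time spent in software compression
buys nothing, which is one reason to push compression to the tape drive or
to switch it off for such DLEs.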



On 9/25/20 9:19 AM, Dave Sherohman wrote:

> Howdy, all!
>
> We've recently had some problems at work with our backup provider, so my
> boss has come to me and requested a recommendation for bringing backups
> in-house.  I've previously adminned a small amanda installation back in
> 2000-2006 and I quite liked the system and how it works, so that was my
> first thought.
>
> I've done some general web searches and it looks like the situation
> today isn't as good as it was a decade and a half ago - not a lot of
> active development, limited support for Windows clients, etc.  But, on
> the other hand, amanda was already a very mature system back then, so I
> don't know that a lot of ongoing development would still be needed.
>
> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?
>
> My use case is that I'll be backing up somewhere in the neighborhood of
> 75ish servers, a mix of physical and (mostly) virtual machines, and a
> mix of mostly Linux with some Windows and one or two FreeBSD.  Total
> disk usage is currently in the 35-40 TB range, growing by maybe 1-2 TB
> per year.  Aside from my own positive experiences with amanda, both I
> and my boss (and most of my coworkers) are very pro-open-source.
>
> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?
>
> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?


--
---

Chris Hoogendyk

-
   O__   Systems Administrator, Retired
  c/ /'_ --- Biology & Geosciences Departments
 (*) \(*) -- 315 Morrill Science Center III
~~ - University of Massachusetts, Amherst



---

Erdös 4



Re: How's amanda feeling these days?

2020-09-25 Thread Debra S Baddorf
I've had an amanda “world” running for 15 or 20 years.  I have 33 unix nodes,
of varying flavors of unix.  They play together nicely, though you can’t unpack
a file on a different flavor of unix. I think you can with TAR rather than DUMP.
And, since many of my disks are big, I’m using a lot of TAR, to split them into
smaller chunks, which have grown in size as my tape drive has been upgraded
over time.

We still use physical tape.   I keep 70 days of backups  (inc every day, full
once a week) and a separate config for archival fulls once a month.  10-12 times
your amount of data (fulls once a week) might be rather a lot of disk space, but
I haven’t compared.  Someone else recently discussed the large cost of using
cloud space.
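
The "10-12 times your amount of data" figure follows from simple
arithmetic, sketched below. The 40 TB data size is the upper end of the
range quoted in the thread; the 1% daily change rate for incrementals is
purely an assumption for illustration, and compression is ignored.

```python
# Rough storage needed to keep a given retention with weekly fulls.
# The daily change fraction is a guess; measure it for your own data.

def retained_fulls(retention_days: int, days_between_fulls: int) -> int:
    """Number of full dumps that fall inside the retention period."""
    return retention_days // days_between_fulls

def storage_estimate_tb(data_tb: float, retention_days: int,
                        days_between_fulls: int,
                        daily_change_fraction: float) -> float:
    """Fulls plus incrementals kept for the whole retention period, in TB."""
    fulls = retained_fulls(retention_days, days_between_fulls)
    incrementals = retention_days - fulls
    return fulls * data_tb + incrementals * data_tb * daily_change_fraction

DATA_TB = 40.0          # upper end of the 35-40 TB mentioned in the thread
RETENTION_DAYS = 70     # as in the setup described above
FULL_EVERY_DAYS = 7     # full once a week
DAILY_CHANGE = 0.01     # hypothetical: 1% of the data changes per day

fulls = retained_fulls(RETENTION_DAYS, FULL_EVERY_DAYS)
total_tb = storage_estimate_tb(DATA_TB, RETENTION_DAYS, FULL_EVERY_DAYS, DAILY_CHANGE)
print(f"{fulls} fulls retained, roughly {total_tb:.0f} TB before compression")
```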

I’m not doing any Windows backups;  we have another group that does those.

I’m happy with amanda’s current capabilities.  In fact, I’m not running the most
recent versions of amanda, not wanting the effort of changing for features I 
won’t use.

My data size is a bit over 30 TB, since my archival fulls now require a second
LTO5 tape,  one of which claims 30 TB capacity.

Another data point -
Deb Baddorf
Fermilab


> On Sep 25, 2020, at 8:19 AM, Dave Sherohman  wrote:
> 
> Howdy, all!
> 
> We've recently had some problems at work with our backup provider, so my
> boss has come to me and requested a recommendation for bringing backups
> in-house.  I've previously adminned a small amanda installation back in
> 2000-2006 and I quite liked the system and how it works, so that was my
> first thought.
> 
> I've done some general web searches and it looks like the situation
> today isn't as good as it was a decade and a half ago - not a lot of
> active development, limited support for Windows clients, etc.  But, on
> the other hand, amanda was already a very mature system back then, so I
> don't know that a lot of ongoing development would still be needed.
> 
> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?
> 
> My use case is that I'll be backing up somewhere in the neighborhood of
> 75ish servers, a mix of physical and (mostly) virtual machines, and a
> mix of mostly Linux with some Windows and one or two FreeBSD.  Total
> disk usage is currently in the 35-40 TB range, growing by maybe 1-2 TB
> per year.  Aside from my own positive experiences with amanda, both I
> and my boss (and most of my coworkers) are very pro-open-source.
> 
> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?
> 
> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?
> 
> -- 
> Dave Sherohman




Re: How's amanda feeling these days?

2020-09-25 Thread wls
I suggest amtape to vtl and amvault fulls periodically to tape for offsite 
storage.  Less physical wear and tear.

On September 25, 2020 10:37:52 AM EDT, Charles Curley 
 wrote:
>On Fri, 25 Sep 2020 08:19:58 -0500
>Dave Sherohman  wrote:
>
>> So let's see what the current users have to say.  Is a new amanda
>> installation still a sane choice in 2020?
>
>I think so. You have a lot more machines to herd than I do, but I don't
>think that's an impediment. Add a few machines at a time and watch your
>tape usage grow.
>
>Amanda appears to be rock solid. But you'll have to talk to others
>about Windows backups.
>
>> And what kind of hardware specs should I be looking at?  Is tape still
>> king, or is everyone backing up to hard drives now?
>
>I have used virtual tapes (vtapes, in Amanda parlance) for over a
>decade. I believe the practice is common. However you have a lot more
>disk space to back up than I do. What are you backing up to now?
>
>Consider multiple configurations, some of which back up to tape, and
>others to vtape.
>
>-- 
>Does anybody read signatures any more?
>
>https://charlescurley.com
>https://charlescurley.com/blog/

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: How's amanda feeling these days?

2020-09-25 Thread Charles Curley
On Fri, 25 Sep 2020 08:19:58 -0500
Dave Sherohman  wrote:

> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?

I think so. You have a lot more machines to herd than I do, but I don't
think that's an impediment. Add a few machines at a time and watch your
tape usage grow.

Amanda appears to be rock solid. But you'll have to talk to others
about Windows backups.

> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?

I have used virtual tapes (vtapes, in Amanda parlance) for over a
decade. I believe the practice is common. However you have a lot more
disk space to back up than I do. What are you backing up to now?

Consider multiple configurations, some of which back up to tape, and
others to vtape.

-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/


Re: How's amanda feeling these days?

2020-09-25 Thread Robert Heller
At Fri, 25 Sep 2020 08:19:58 -0500 Dave Sherohman  wrote:

> 
> Howdy, all!
> 
> We've recently had some problems at work with our backup provider, so my
> boss has come to me and requested a recommendation for bringing backups
> in-house.  I've previously adminned a small amanda installation back in
> 2000-2006 and I quite liked the system and how it works, so that was my
> first thought.
> 
> I've done some general web searches and it looks like the situation
> today isn't as good as it was a decade and a half ago - not a lot of
> active development, limited support for Windows clients, etc.  But, on
> the other hand, amanda was already a very mature system back then, so I
> don't know that a lot of ongoing development would still be needed.

Yes, it is quite mature.

> 
> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?

Yes, at least for a Linux network. I have very minimal experience with 
MS-Windows.  The only MS-Windows system I have experience with is a VM on a 
Linux machine using LVM, so I just create an LVM snapshot and then do a raw 
"disk" backup, but only every few months -- the MS-Windows system is only used 
twice a year to interface with an HVAC system.

> 
> My use case is that I'll be backing up somewhere in the neighborhood of
> 75ish servers, a mix of physical and (mostly) virtual machines, and a
> mix of mostly Linux with some Windows and one or two FreeBSD.  Total
> disk usage is currently in the 35-40 TB range, growing by maybe 1-2 TB
> per year.  Aside from my own positive experiences with amanda, both I
> and my boss (and most of my coworkers) are very pro-open-source.
> 
> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?
> 
> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?
> 

-- 
Robert Heller -- Cell: 413-658-7953 GV: 978-633-5364
Deepwoods Software-- Custom Software Services
http://www.deepsoft.com/  -- Linux Administration Services
[email protected]   -- Webhosting Services
   


Re: How's amanda feeling these days?

2020-09-25 Thread Gene Heskett
On Friday 25 September 2020 09:19:58 Dave Sherohman wrote:

> Howdy, all!
>
> We've recently had some problems at work with our backup provider, so
> my boss has come to me and requested a recommendation for bringing
> backups in-house.  I've previously adminned a small amanda
> installation back in 2000-2006 and I quite liked the system and how it
> works, so that was my first thought.
>
> I've done some general web searches and it looks like the situation
> today isn't as good as it was a decade and a half ago - not a lot of
> active development, limited support for Windows clients, etc.  But, on
> the other hand, amanda was already a very mature system back then, so
> I don't know that a lot of ongoing development would still be needed.
>
> So let's see what the current users have to say.  Is a new amanda
> installation still a sane choice in 2020?
>
> My use case is that I'll be backing up somewhere in the neighborhood
> of 75ish servers, a mix of physical and (mostly) virtual machines, and
> a mix of mostly Linux with some Windows and one or two FreeBSD.  Total
> disk usage is currently in the 35-40 TB range, growing by maybe 1-2 TB
> per year.  Aside from my own positive experiences with amanda, both I
> and my boss (and most of my coworkers) are very pro-open-source.
>
> If amanda isn't a reasonable choice for that scenario, what would be a
> better option?
>
> And what kind of hardware specs should I be looking at?  Is tape still
> king, or is everyone backing up to hard drives now?

I've been using hard drives with amanda for over a decade now, and I'd 
estimate they are 100x more dependable than tape, as I've had zero hd 
failures, and dozens of tape failures and tape drives that absolutely 
had to spend the holidays in Oklahoma City being rebuilt. And some of 
the HD's had nearly 100k spinning hours on them when they got too small. 
Currently doing 5 machines nightly, with about 58 days for vtapes 
recycle.  What's not to like?  Amanda has Just Worked here since the 
late 90's.

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page