RE: [gentoo-user] Admin system documentation

2005-04-18 Thread Dave Nebinger
  I just can't get my head around why you would want or need to do a total
  rebuild.
 I myself can't imagine _wanting_ to do a rebuild, but needing to, yes.
 Flood, tornado, theft, etc.
 
 Users have work to do. Let's see, what's my SAMBA configuration? And
 those SQL databases looked like what? How quickly could you bring a
 system with many users back online _and_ functional?
 
 If you are in a position of being responsible for those entities, you
 had better have your ass covered. That's what I'm trying to do.

If your only method of getting a production system back online is by doing a
full rebuild, then you are royally screwed.  Most Linux systems (especially
Gentoo-based ones) are in a constant state of flux.  Users get added and
removed, emerge updates packages and config files, databases have rows
added/removed, etc.

Your primary disaster recovery tool is the system backup.  That will be the
only way to get your system back to the state it was in before the failure
occurred.

Planning to rebuild from scratch rather than restore from backup ensures
that a) you're going to need a huge window of time to get the system back to
a functional state, b) you're going to need another huge window of time to
get all of your configuration back in line with the current setup, and c)
your data from databases will be basically lost (recreating tables is just
the start of trying to bring it back online).

So forget the 'rebuild' idea; it won't work and will be prone to failure.

Instead focus on the appropriate tool to build system backups.  Test your
backup procedures and restoration procedures earnestly to ensure that what
you plan to do for disaster recovery really will work.

If you can get to a static system state (i.e. you stop emerging
packages/updates), you could get away with a full system backup performed
once, followed by incremental backups of /etc, /var, and /home.
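
The full-then-incremental scheme above could be sketched with GNU tar's
--listed-incremental option. This is a minimal sketch, not a tested backup
procedure: it assumes GNU tar, and the function names and archive names are
made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar. Function and file names are hypothetical.

# backup_level0 DEST_DIR TAR_ARGS...
# Start a fresh snapshot file and write a full (level-0) archive.
backup_level0() {
    dest=$1; shift
    mkdir -p "$dest"
    rm -f "$dest/snapshot.snar"          # no snapshot => tar dumps everything
    tar --create --gzip \
        --listed-incremental="$dest/snapshot.snar" \
        --file="$dest/full.tar.gz" "$@"
}

# backup_incremental DEST_DIR TAR_ARGS...
# Archive only what changed since the snapshot file was last updated.
# tar updates the snapshot in place, so each run is relative to the previous one.
backup_incremental() {
    dest=$1; shift
    stamp=$(date +%Y%m%d-%H%M%S)
    tar --create --gzip \
        --listed-incremental="$dest/snapshot.snar" \
        --file="$dest/incr-$stamp.tar.gz" "$@"
}
```

Usage might look like `backup_level0 /mnt/backup /etc /var /home` once, then
`backup_incremental /mnt/backup /etc /var /home` on a schedule (/mnt/backup is
a placeholder). To restore, extract full.tar.gz first and then each
incremental in order, passing --listed-incremental=/dev/null to tar.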

And don't forget to take copies of the backup media offsite; that way if the
building goes under (with your system in it) you'll be able to get another
system online from the offsite media.

Dave



-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] Admin system documentation

2005-04-18 Thread kashani
John J. Foster wrote:
Good afternoon,
I had intended to start my conversion from Suse 9.1 to Gentoo over
the weekend, but the weather turned out to be way too nice to remain
indoors.
But in my planning stages I realized I have a bit of a longer learning
curve than I initially anticipated. So, I'm going to remedy this by
starting off with:
Question #1
What do the professional (and amateur) admins among you consider to be
essential system documentation in the event of a disaster? I am
fairly well versed in the requirements of an M$-based system and network,
but have only been dabbling in Linux for a couple of years. Backups I know,
but what is considered a fairly necessary _paper_ trail in the event
that the unexpected happens, and a total rebuild is necessary in the
shortest time possible?

	Much of it depends on the size of your network and what you're doing. I 
run mostly clusters of the same box. So I don't back up the OS because I 
can lose one without causing trouble and can easily add a new server in 
the case of a prolonged outage. I do tar.gz the data and relevant config 
files. You'll want to make a judgment call on how much infrastructure 
you think you need to dedicate to backups.
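
One way to make "the data and relevant config files" explicit is to keep the
list of paths in a manifest file, which then doubles as documentation. The
sketch below assumes GNU tar; the function name and the manifest layout (one
path per line, # for comments) are invented for this example.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar. Names and manifest format are hypothetical.

# archive_from_manifest MANIFEST OUTPUT
# MANIFEST lists one path per line; blank lines and lines starting with #
# are ignored so the file can carry notes on why each path matters.
archive_from_manifest() {
    grep -v -e '^#' -e '^[[:space:]]*$' "$1" |
        tar --create --gzip --file="$2" --files-from=-
}
```

A manifest might list entries such as /etc/samba/smb.conf with a comment line
above each one explaining which service depends on it.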

	However, in the event that I need to add a new server I do have a few
build docs and scripts. It's really the little stuff that is a pain in
the ass. I keep track of it in a wiki (wikis seem less annoying than
they were a few years ago), but use whatever works for you. Mine has stuff
like this:

New Server build
/etc/resolv.conf
	Use the linked file. Never change the domain search order or old broken
stuff that the web dev idiots won't grep out will break. The rotate
option distributes the lookups so we don't overload ns1. The timeouts
cause lookups to time out faster and move on to the next server in case we
lose a name server.

/etc/rc.conf
	Change default editor to vim

Name Servers only
	Add symlinks to deal with Gentoo bind/named nonsense
	ln -sf /etc/bind/named.conf /etc/named.conf
	Make the Red Hat admins happy
	ln -sf /var/bind /var/named
	And then ask, why does Gentoo use named in /var/run/ instead of bind?

and so on.
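
The resolv.conf note above might turn into a build-script step like the one
below. This is only a sketch: the search domain, the name-server addresses
(taken from the documentation range), and the timeout and attempts values are
placeholders, not the contents of the actual linked file. rotate, timeout:n,
and attempts:n are standard resolver options.

```shell
#!/bin/sh
# Sketch only: every value in the generated file is a placeholder.

# write_resolv_conf TARGET
# Write a resolv.conf with a fixed search order, rotation, and short timeouts.
write_resolv_conf() {
    cat > "$1" <<'EOF'
# Do not change the search order; older code depends on it.
search example.com
nameserver 192.0.2.1
nameserver 192.0.2.2
# rotate: spread queries across the listed name servers
# timeout/attempts: give up on a dead server quickly and try the next one
options rotate timeout:1 attempts:2
EOF
}
```

Writing to a scratch path first (for example `write_resolv_conf /tmp/resolv.conf`)
makes it easy to review the result before copying it into /etc.
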
	I would however seriously look at the Catalyst tool for building stage3
or what people are calling stage4 builds. I haven't gotten it working
exactly right, but the idea would be to include most of the little
nonsense in a stage3 and then use that as my base to generate stage4
builds for particular kinds of server. So I'd be able to lay down a new
name server, web server, mail server, db server, etc. in under an hour.
Then import or rsync any data over and you're set.

	A few people I know are using Feather Linux to try to ghost partitions
and then lay them down on new machines. I haven't heard how well that's
working, but there's no reason why it should be an issue.

	The roundabout point of all this is that a Linux environment can get away
from you if you let it. Use the same USE flags on everything and document
why you use them. And don't forget to update the docs and
machines if that ever changes or you'll be fighting for three hours on a
Sunday when something isn't quite kosher.
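
A quick check that two machines really do share the same USE flags could look
roughly like this. It is a sketch that relies on make.conf using shell
variable syntax and sources the file in a subshell; the function names are
invented, and the file arguments should be given as paths containing a slash
so that the shell does not search PATH for them.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# use_flags FILE
# Print the USE flags set in a make.conf-style file, one per line, sorted.
use_flags() {
    # set -f keeps flags such as -* from being expanded as file globs.
    ( set -f; USE=; . "$1" >/dev/null 2>&1; printf '%s\n' $USE | sort -u )
}

# same_use_flags REFERENCE OTHER
# Succeed only if both files set the same flags, ignoring order.
same_use_flags() {
    [ "$(use_flags "$1")" = "$(use_flags "$2")" ]
}
```

Copying each server's make.conf to one place and running same_use_flags
against a reference copy would flag a box that has drifted.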

kashani