Interesting topic. My solution for centralized management (I finished
implementing it several months ago) goes something like this:

Definitions:

Gentoo snapshot: an image of a Gentoo installation that, deployed on a
server, allows booting into a running system.
Blade root: the server designated at any given time as the boot server
for the others.

Objectives:

1. Low maintenance costs: maintain and apply patches to a single build
(the Gentoo snapshot).
2. Low scalability overhead: scalability should be part of the design;
scaling up should take no more than 10 minutes per server.
3. Redundancy: permanent hardware failure of N-1 out of N nodes, or
temporary failure (power loss) of all nodes, should still allow fast
(10-minute) recovery of every node in the cluster.

Restrictions:

1. Single CPU architecture: I consider the cost of maintaining several
architectures to be greater than the cost of standardizing on a single
one.

2. Unified package tree: I consider the cost of maintaining several
Gentoo snapshots, just so each server carries only the minimum packages
for its assigned application (mail server, web server, etc.), to be
greater than the cost of a common build containing all packages, where
only the required services are started (i.e. every deployed server has
both an MTA and Apache installed; the web servers run Apache, and the
mail servers keep it stopped and run the MTA instead; see the sketch
after this list).

3. An application that can run as a cluster with transparent failover
(web servers behind a load balancer with health checking, multiple MX
records, etc.)

4. Remote storage for persistent data (like logs) helps (you will see
why); alternatively, you can modify the partitioning or hard disk
configuration to keep a stable filesystem on the individual servers.
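
As promised above, here is a minimal sketch of the per-role service
toggling from restriction 2, using Gentoo's rc-update; 'postfix' is
just a stand-in for whatever MTA you deploy:

    # On a web blade: Apache at boot, MTA installed but stopped
    rc-update add apache2 default
    rc-update del postfix default

    # On a mail blade, the reverse
    rc-update add postfix default
    rc-update del apache2 default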

Hardware:

1. AMD Opteron blades: 2x Opteron, 4-12 GB RAM, 1 SATA HDD. Reasons for
choosing:
   - Opteron is cheaper and faster than Xeon, and we can replace single
     cores with dual cores at any point without generating too much
     heat.

   - 4-12 GB RAM & SATA: I prefer the OS to cache a lot in RAM to speed
     things up, as opposed to buying little RAM and expensive SCSI. Hard
     disks have too many moving parts and generate a lot of heat, which
     is a problem in a dense CPU environment such as blades (and you
     can't trust your datacenter to really cool things). And RAM is
     cheap nowadays anyway.

2. Gigabit network cards with PXE

Software:

One initial server (the Blade root) is installed with Gentoo. On top of
that, another Gentoo is installed in a directory (the Gentoo snapshot);
it will be replicated to the individual servers as described below, and
all maintenance of the snapshot is done in a chroot.
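
For illustration, snapshot maintenance looks roughly like this (the
/data/snapshot path is made up; use wherever you keep it):

    mount -t proc none /data/snapshot/proc
    chroot /data/snapshot /bin/bash
    # inside the chroot:
    emerge --sync
    emerge -u world
    exit
    umount /data/snapshot/proc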

The Blade root runs DHCP and tftp and is able to answer PXE dhcp/tftp
requests (for network boot) and serve an initial bootloader (grub 0.95
with the diskless and diskless-undi patches, to allow detection of
Broadcom NICs), along with an initial initrd filesystem.
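
The DHCP side is a few lines of ISC dhcpd config; the addresses and
the boot image name below are made up for the example:

    subnet 10.0.0.0 netmask 255.255.255.0 {
        range 10.0.0.100 10.0.0.200;
        next-server 10.0.0.1;    # the Blade root's tftp server
        filename "pxegrub";      # the patched grub 0.95 network image
    }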

The Gentoo snapshot contains all the packages required for all
applications (roughly 2 GB on our systems), along with dhcp/tftp and
their configs, so that any blade can act as a Blade root.

In addition, the Blade root contains an individual configuration for
every deployed server (or rather, only the changes from the standard
Gentoo config: per-blade IPs, custom application configs, different
sets of services to start at boot, etc.).
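
Purely as an illustration, these per-blade deltas can live in a small
overlay tree per server (the layout below is hypothetical):

    blades/10.0.0.101/etc/conf.d/net          # per-blade static IP
    blades/10.0.0.101/etc/conf.d/hostname
    blades/10.0.0.101/etc/runlevels/default/  # services to start at boot
    blades/10.0.0.102/...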

The Gentoo snapshot is compressed with tar.gz into two archives: one
with the 'running code' (i.e. /usr, /bin, etc.) and another with the
things we don't really need on every server (the portage tree,
/usr/src, man pages, etc.). The collection of scripts for compressing
everything, the initrd, all the individual blade configurations, and
misc scripts go into a third archive.
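
The packaging boils down to a couple of tar invocations; the paths and
the exact split below are illustrative, not our actual script:

    #!/bin/sh
    SNAP=/data/snapshot    # hypothetical snapshot location
    OUT=/data/boot         # hypothetical archive area

    # 'Running code' archive: the snapshot minus the heavy extras
    tar -czpf $OUT/snapshot-core.tar.gz -C $SNAP \
        --exclude='./usr/portage' --exclude='./usr/src' \
        --exclude='./usr/share/man' .

    # The extras, for blades that will act as a Blade root
    tar -czpf $OUT/snapshot-extra.tar.gz -C $SNAP \
        usr/portage usr/src usr/share/man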

Booting takes place like this:

The Blade root is powered on and ready to answer PXE.

A blade boots and uses PXE to get an IP via DHCP and an initial grub
image via tftp; grub then downloads its configuration file, also via
tftp.

The boot menu is displayed, with a default entry and a timeout. Grub
downloads the Linux kernel via tftp, along with the initrd image. The
kernel boots, mounts the initrd, and executes the /linuxrc script.
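
The network-served grub config is an ordinary menu.lst; the kernel name
and parameters below are illustrative:

    default 0
    timeout 10

    title Gentoo snapshot (network boot)
    root (nd)
    kernel /bzImage root=/dev/ram0 init=/linuxrc
    initrd /initrd.gz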

The initrd contains busybox, an rsync client, and fdisk/tar/gzip. When
it starts, it downloads the Gentoo snapshot from the Blade root via
rsync, along with the blade configuration and the scripts archive. It
then uses fdisk to recreate the partition table, creates a filesystem,
uncompresses the snapshot and the blade configuration onto the target
/ partition, changes the root, and execs init, thus booting Gentoo. At
the end of the Gentoo bootup, grub is run locally to install a
bootloader (to allow booting even when the Blade root server is
unavailable), and the services required for the particular application
the blade is intended for are started.
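
To make that flow concrete, a heavily simplified /linuxrc might look
like this (the server IP, device and archive names are all made up,
and the real script does far more error checking):

    #!/bin/sh
    mount -t proc none /proc

    # Recreate the partition table and filesystem on the local disk
    printf 'o\nn\np\n1\n\n\nw\n' | fdisk /dev/sda
    mke2fs -j /dev/sda1    # assumes mke2fs is also on the initrd
    mount /dev/sda1 /mnt

    # Fetch the snapshot and this blade's config from the Blade root
    rsync rsync://10.0.0.1/boot/snapshot-core.tar.gz /tmp/
    rsync rsync://10.0.0.1/boot/blade-config.tar.gz /tmp/
    tar -xzpf /tmp/snapshot-core.tar.gz -C /mnt
    tar -xzpf /tmp/blade-config.tar.gz -C /mnt   # per-blade bits last

    # Change the root and hand over to init
    cd /mnt
    pivot_root . initrd    # needs an empty /mnt/initrd directory
    exec chroot . /sbin/init </dev/console >/dev/console 2>&1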

The result is a server that has been booted remotely and is an exact
image of the unique blade source. It also contains everything needed
to boot by itself, and everything needed to boot further servers.
Thanks to the individual per-server configuration, only the services
meant for that machine are started, and the bootup takes only 3-4
minutes longer than usual.

After booting one blade, using it as the source to re-boot the initial
Blade root ensures that all servers involved share the same setup: in
effect, a self-replicating system.

Maintenance costs come down to updating the Blade root's snapshot.
After an emerge -u world in the chroot, rebooting the other blades one
at a time distributes the changes. On top of this you can use whatever
synchronization method you want for the particular application you're
deploying.
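
As a sketch of the update cycle (hostnames and the packaging script
are hypothetical):

    # On the Blade root, inside the snapshot chroot:
    chroot /data/snapshot emerge --sync
    chroot /data/snapshot emerge -u world

    # Re-pack the archives, then roll out one blade at a time:
    /data/boot/make-archives.sh
    for blade in blade01 blade02 blade03; do
        ssh $blade reboot
        # wait for it to come back up before the next one
    done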

I hope this helps.

Cheers,
Dan.


----- Original Message ----- 
From: "Ian P. Christian" <[EMAIL PROTECTED]>
To: <[email protected]>
Cc: "theboywho" <[EMAIL PROTECTED]>
Sent: Thursday, October 20, 2005 12:22 AM
Subject: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to
multiple architectures

