Hi Dan,
On 20 Oct, 2005, at 2:41 AM, Dan Podeanu wrote:
Interesting topic.
Indeed, I'm moving to a different employer and was considering a
similar setup...
I'm curious about a number of things:
What's the scale of the cluster you're using this setup on?
Would you be willing/able to share some of the work?
I'd be very interested to look at your setup before I start my own.
Any comments on the hardware stability of the nodes you're using?
Which make of blades are you using?
I was also wondering whether you are familiar with the work at
http://www.infrastructures.org/
Your setup has many of its characteristics.
Objectives:
1. Low maintenance costs: maintaining and applying patches to a
single build (Gentoo snapshots).
2. Low scalability overhead: scalability should be part of the
design; it should not take more than 10 minutes per server to
scale up.
3. Redundancy: permanent hardware failure of N-1 out of N nodes, or
temporary failure (power off) of all nodes, should allow fast
(10 minutes) recovery of all nodes in a cluster.
I read below that all nodes include configs for dhcp/tftp in order
to be able to take over from the golden (blade root) server. How do
you handle that? In case of downtime of the main blade root server,
which of the nodes gets to take over? Is that an automatic or a
manual process?
Additionally, did you test an all-node failure, and how did the
master blade root cope with the strain of all nodes booting at
once? What hardware are you using for the blade root server?
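For what it's worth, the manual takeover I had in mind for my own
setup would look roughly like this on a surviving node (a sketch
only; the config path and service names are my assumptions, not
from your setup):

    # Promote a standby node to blade root (sketch, untested):
    cp /etc/dhcp/dhcpd.conf.blade-root /etc/dhcp/dhcpd.conf  # pre-staged config
    rc-update add dhcpd default && /etc/init.d/dhcpd start
    rc-update add in.tftpd default && /etc/init.d/in.tftpd start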
Restrictions:
1. Single CPU architecture: I consider the cost of maintaining
several architectures to be greater than the cost of purchasing a
single architecture.
Are you running a full 64-bit setup or 32-bit compatibility mode?
What are your experiences with stability in the 64-bit case? I'm
especially curious about PHP and its diverse set of external libs.
I do agree, though. Any thoughts on the inevitable upgrade that
will come when your current hardware platform is no longer
available?
2. Unified packages tree: I consider the cost of maintaining
several Gentoo snapshots, just to deploy the minimum of packages
per server assigned to a specific application (mail server, web
server, etc.), to be greater than having a common build with all
packages and just starting the required services (i.e. all deployed
servers have both an MTA and Apache installed; web servers have
Apache started, and mail servers have it stopped and the MTA
running instead).
Agreed, it doesn't pay off to have separate base sets for the
different types of nodes, and it's good for redundancy: if needed,
a former web server can stand in as a database server, etc.
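To make that concrete, this is how I picture the role switch
working on Gentoo (a sketch; the service names are my assumptions):

    # Turn a node into a web server (sketch):
    rc-update add apache2 default
    rc-update del postfix default
    # ...or into a mail server:
    rc-update add postfix default
    rc-update del apache2 default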
3. An application that can act as a cluster with transparent
failover (web
with balancer and health checking, multiple MX servers, etc.)
I don't quite understand this restriction; could you elaborate on
why it's a requirement?
4. Remote storage for persistent data (like logs) helps (you will
see why); you can modify the partitioning or hard disk
configuration to maintain a stable filesystem on individual
servers.
<snipped>
Software:
One initial server (blade root) is installed with Gentoo. On top of
that, in a directory, another Gentoo is installed (the Gentoo
snapshot) that will be replicated onto the individual servers as
described further on; all maintenance on the snapshot is done in a
chroot.
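Just to check that I understand the workflow: I imagine maintenance
on the snapshot looks roughly like this (my own sketch; /snapshot
is a made-up path):

    # Enter the golden snapshot and update it in place (sketch):
    mount -t proc proc /snapshot/proc
    chroot /snapshot /bin/bash
    emerge --sync && emerge -uDN world   # refresh tree, update packages
    exit
    umount /snapshot/proc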
The Blade root runs DHCP and tftp and is able to answer PXE dhcp/tftp
requests (for network boot) and serve an initial bootloader (grub
0.95 with diskless and diskless-undi patches to allow detection of
Broadcom NICs), along with an initial initrd filesystem.
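For reference, the PXE side would presumably need a dhcpd.conf
fragment along these lines (a sketch; the subnet, addresses, and
boot image filename are assumptions on my part):

    # dhcpd.conf fragment for PXE network boot (sketch):
    subnet 10.0.0.0 netmask 255.255.255.0 {
        range 10.0.0.100 10.0.0.200;
        next-server 10.0.0.1;    # blade root, serving tftp
        filename "pxegrub";      # the patched grub 0.95 image
    }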
The Gentoo snapshot contains all the packages required for all
applications (roughly 2 GB on our systems), along with dhcp/tftp
and their configs, to allow it to act as blade root.
See my question above: is the switchover manual?
In addition, the Blade root contains individual configurations for
every deployed server (or, rather, only the changes relative to the
standard Gentoo config: per-blade IPs, custom application configs,
different sets of services to start at boot, etc.)
Do you use classes here (e.g. webserver, databaseserver,
mailserver, cachingserver, etc.)? Or do you maintain individual
setups for each server? And what scripting language did you choose
for the config scripts and related tooling, and why that one?
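For comparison, the per-server overlay I had been sketching myself
looks like this (a hypothetical layout and deploy step, not a
description of your setup):

    # One overlay directory per blade, holding only files that
    # differ from the snapshot (sketch):
    overlays/
        blade01/etc/conf.d/net          # per-blade IP
        blade01/etc/runlevels/default/  # services to start at boot
        blade02/etc/conf.d/net
    # Applied on top of a copy of the snapshot at deploy time:
    rsync -a overlays/blade01/ /snapshot-copy/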
<booting process snipped>
I'm also curious as to what QA procedures you have in place to
prevent accidental mistakes on the blade root server. I assume you
test beforehand? On all server classes? Modifications to the third
archive with the per-server configs seem rather difficult to test.
I hope this helps.
Oh, it sure did. It confirmed some ideas I was already thinking
about and gave me a real-world example showing that it can be done
:-)
Thanks,
Ramon
--
Change what you're saying,
Don't change what you said
The Eels
--
[email protected] mailing list