My experience with NFS-mounted roots is that they can bombard your network with packets. Even a simple little script can generate enough traffic to slow down other services. Add spoolers and the like and you end up burning a lot of network bandwidth and memory. The advantage of a local install is that you cut network traffic drastically. Granted, if your applications are all embarrassingly parallel and don't do much disk I/O, then an NFS root works great. Many of the applications we use here would utterly destroy the network if run from an NFS-mounted root. The advantage of rebuilding is consistency, without the disadvantages of NFS roots.

On Nov 21, 2005, at 9:45 AM, Eric Thibodeau wrote:

On 21 November 2005 at 11:10, Robin H. Johnson wrote:
On Sun, Nov 20, 2005 at 08:51:13PM -0500, Stéphane Lacasse wrote:
[snip discussion about installing]

I've done a cluster system (128 nodes + 1 master) in a similar fashion
to what you are after.
1. PXE-boot install environment for performing installs of both the
master and all of the nodes.
PXE-boot even for the master? So where do the images reside, and how do you manage the slightly varying config items such as hostname? This approach still seems a little time-consuming, since all the nodes remain individual entities (not NFS roots pointing at a single maintained image). Granted, with the nodes all identical, emerge -K should in theory be a breeze... but that doesn't help with keeping all the config files consistent.

2. The install environment uses the Gentoo Installer, with the CLI
frontend I wrote for the GLI project, and performs complete installs of
nodes in under 20 minutes (depending on network traffic).
So switching a machine's purpose/profile requires a complete re-install of the node? You state 20 minutes for re-installing; is it a _real_ install or the dump of a "reference" root? (Pardon my ignorance of the CLI installer you are referring to... I'll read the http link you'll send me ;) )

By using GLI, reconfiguring the cluster is a simple matter of altering the install profiles, and changing a node's purpose (presently we have an MPI mode and a MOSIX mode) means wiping and reinstalling it. Some of the cluster users need assurances that none of their data remains on the cluster after they are done, hence the value of being able to reinstall easily.
[...]
Also, make use of your cluster tools to administer the cluster. OpenPBS
allows running a job on all nodes, so use it to emerge -K [package].
(Not -k, as binpkgs don't currently have any locking in $PKGDIR and can
get corrupted if two emerge processes try to create a binpkg at the
same time.)
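The OpenPBS suggestion above could be sketched as a batch job like the following. This is only a sketch: the job name, node count, and package atom are assumptions, and it relies on pbsdsh to fan the command out to the allocated nodes.

```shell
#!/bin/sh
#PBS -N emerge-binpkg
#PBS -l nodes=128        # one slot per node; adjust to your cluster size

# pbsdsh -u runs the command once per unique allocated node.
# emerge -K installs strictly from prebuilt binary packages and fails
# rather than compiling if no binpkg is available, so nodes never
# build anything themselves.
pbsdsh -u /usr/bin/emerge -K app-misc/screen
```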

Actually, I would have thought you'd use _one_ node to compile the packages (with distcc, going by your description) and _then_ propagate the packages onto the other nodes with -K. Still, I would think maintaining an NFS-mounted root
would be less cumbersome...
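The build-once-then-propagate idea could look something like this sketch. The host names and package atom are hypothetical, and it assumes the default $PKGDIR of /usr/portage/packages; an NFS-exported $PKGDIR would make the rsync step unnecessary.

```shell
#!/bin/sh
# On the master: compile once and produce a binary package.
# -B (--buildpkgonly) builds the binpkg without installing it here.
emerge -B app-misc/screen

for n in node001 node002 node003; do      # hypothetical node names
    # Push the binpkg tree to the node.
    rsync -a /usr/portage/packages/ "$n":/usr/portage/packages/
    # Install strictly from the prebuilt package; -K never compiles.
    ssh "$n" emerge -K app-misc/screen
done
```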

--
Eric Thibodeau

--
[email protected] mailing list


