My experience with NFS-mounted roots is that they can bombard your
network with packets. A simple little script can generate enough
traffic to actually slow down other services, just by issuing tons of
small requests. Plus, if you have spoolers and such, you end up
generating a ton of traffic and memory pressure. The advantage of a
local install is that you cut network traffic drastically. Granted, if
your applications are all embarrassingly parallel and don't do a ton
of disk I/O, then an NFS root works great. Many of the applications we
use here would utterly destroy the network if run from an NFS-mounted
root. The advantage of rebuilding is consistency without the
disadvantages of NFS roots.
On Nov 21, 2005, at 9:45 AM, Eric Thibodeau wrote:
On 21 November 2005 at 11:10, Robin H. Johnson wrote:
On Sun, Nov 20, 2005 at 08:51:13PM -0500, Stéphane Lacasse wrote:
[snip discussion about installing]
I've done the cluster system (128 nodes + 1 master) in a similar
fashion to what you are after.
1. PXE-boot install environment for performing installs of both the
master and all of the nodes.
PXE-boot even for the master? So where do the images reside, and how
do you manage the slightly varying config items such as hostname? This
approach still seems a little time-consuming, since all nodes are
still individual entities (not NFS roots to a single maintained
image). Granted, with the nodes all being identical, emerge -K should
in theory be a breeze... but that doesn't keep all the config files
consistent.
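For concreteness, per-node variation is often handled at the PXE layer
itself: pxelinux looks for a config file named after the booting NIC's
MAC address (then hex-encoded IP prefixes) before falling back to
"default", so each node can be handed its own kernel arguments. A
minimal sketch, with all file names and paths hypothetical:

```
# /tftpboot/pxelinux.cfg/default -- fallback for all nodes
DEFAULT installer
LABEL installer
  KERNEL gentoo-install/kernel
  APPEND initrd=gentoo-install/initrd.igz root=/dev/ram0 init=/linuxrc

# /tftpboot/pxelinux.cfg/01-aa-bb-cc-dd-ee-ff -- one specific node's
# override (file named after its MAC address), e.g. to set a hostname
# via a kernel parameter that the install scripts pick up:
DEFAULT installer
LABEL installer
  KERNEL gentoo-install/kernel
  APPEND initrd=gentoo-install/initrd.igz root=/dev/ram0 init=/linuxrc hostname=node17
```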
2. The install environment uses the Gentoo Installer, with the CLI
frontend I wrote for the GLI project, and performs complete installs
of nodes in under 20 minutes (depending on network traffic).
So switching a machine's purpose/profile requires a complete
re-install on the node? You state 20 minutes for re-installing; is it
a _real_ install or the dump of a "reference" root? (Pardon my
ignorance of the CLI installer you are referring to... I'll read the
http link you'll send me ;) )
By using GLI, it's a simple matter of altering the install profiles
to reconfigure the cluster, and of wiping the nodes to change their
purpose (presently we have an MPI mode and a MOSIX mode). Some of the
cluster users need assurances that none of their data remains on the
cluster after they are done, hence being able to reinstall easily
matters.
[...]
Also, make use of your cluster tools to administer the cluster.
OpenPBS allows running a job on all nodes, so use it to emerge -K
[package]. (Not -k, as binpkgs don't currently have any locking in
$PKGDIR and can get corrupted if two emerge processes try to create a
binpkg at the same time.)
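As a sketch of what that looks like in practice: pbsdsh is the
OpenPBS/Torque utility that runs a command once per node allocated to
a job, so a cluster-wide binary install can be submitted as a single
job. The script name, node count, and package atom below are all
hypothetical placeholders:

```shell
#!/bin/sh
# update_nodes.sh -- hypothetical sketch: submit one PBS job that runs
# 'emerge -K' (install from existing binary package only) on every
# allocated node. Assumes each node can already see a populated $PKGDIR.
qsub -l nodes=128 <<'EOF'
#PBS -N emerge-update
# pbsdsh executes the given command once on each node of the allocation
pbsdsh /usr/bin/emerge -K --quiet sys-apps/some-package
EOF
```

Because -K never compiles and never writes new binpkgs, this sidesteps
the $PKGDIR corruption caveat above even when many nodes install at
once.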
Actually, I would have thought you would use _one_ node to compile
the packages (using distcc, per your description) and _then_ propagate
the packages onto the other nodes with -K... Still, I would think that
maintaining an NFS-mounted root would be less cumbersome...
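That build-once, install-everywhere flow can be sketched with stock
Portage features: FEATURES="buildpkg" makes emerge emit a binary
package as a side effect of compiling, and -K installs from binpkgs
only. The host name and package atom are hypothetical placeholders:

```shell
# Hypothetical sketch: compile once on the master, install binaries
# on the nodes.

# On the build host: compile normally, but also write a binpkg to $PKGDIR.
FEATURES="buildpkg" emerge sys-apps/some-package

# Propagate $PKGDIR to a node (rsync shown; an NFS-shared $PKGDIR also
# works, subject to the binpkg locking caveat mentioned earlier):
rsync -a /usr/portage/packages/ node01:/usr/portage/packages/

# On each node: install from the binary package only. -K fails rather
# than falling back to compiling if no binpkg exists.
emerge -K sys-apps/some-package
```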
--
Eric Thibodeau
--
[email protected] mailing list