Brian R. Smith wrote:
> Hey list,
>
> 1. Proprietary parallel storage systems (like Panasas, etc.): It
> provides the per-node bandwidth, aggregate bandwidth, caching
> mechanisms, fault-tolerance, and redundancy that we require (plus
> having a vendor offering 24x7x365 support and 24-hour turnaround is
> quite a breath of fresh air for us). The price point is a little high
> for the amount of storage we will get, though: little more than double
> our current overall capacity. As far as I can tell, I can use this
> device as a permanent data store (like /home) and also as the user's
> scratch space, so that there is only a single point for all data needs
> across the cluster. It does, however, require the installation of
> vendor kernel modules, which add overhead to system administration
> (they need to be compiled, linked, and tested before every kernel
> update).
If you like Panasas, go with them.
The kernel module thing isn't all that big a deal - they are quite
willing to 'cook' the modules for you.
But YMMV.
> Our final problem is a relatively simple one, though I am definitely
> a newbie to the H.A. world. Under this consolidation plan, we will
> have only one point of entry to this cluster and hence a single point
> of failure. Have any beowulfers had experience with deploying
> clusters with redundant head nodes in a pseudo-H.A. fashion
> (heartbeat monitoring, fail-over, etc.), and what experiences have
> you had in adapting your resource manager to this task? Would it
> simply be more feasible to move the resource manager to another
> machine at this point (and have both head nodes act as submit and
> administrative clients)? My current plan is unfortunately light on
> the details of handling SGE in such an environment. It includes
> purchasing two identical 1U boxes (with good support contracts). They
> will monitor each other for availability, and the goal is to have the
> spare take over if the master fails. While the spare is not in use, I
> was planning on dispatching jobs to it.
I have constructed several clusters using HA.
I believe Joe Landman has also - as you are in the States, why not give
some thought to contacting Scalable and getting them to do some more
detailed designs for you?
For HA clusters, I have implemented several using Linux-HA and
heartbeat. This is an active/passive setup, with a primary and a backup
head node. On failover, the backup head node starts up the cluster
services.
Failing over SGE is (relatively) easy - the main part is making sure
that the cluster spool directory is on shared storage.
And mounting that shared storage on one machine or the other :-)
The harder part is failing over NFS - again, I've done it.
I gather there is a wrinkle or two with NFS v4 on Linux-HA type systems.
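To give you a flavour, with heartbeat's v1-style configuration it boils
down to two small files on each head node. This is only a sketch - the
host names, NIC, device, and mount point below are invented, and the
exact service script names depend on your distro and your SGE install:

  # /etc/ha.d/ha.cf (same on both head nodes)
  keepalive 2
  deadtime 30
  bcast eth1                # dedicated heartbeat NIC
  serial /dev/ttyS0         # second heartbeat path over a null-modem cable
  auto_failback on
  node head1
  node head2

  # /etc/ha.d/haresources (identical on both nodes)
  # Preferred owner, floating service IP, shared SGE spool filesystem,
  # then the init scripts heartbeat should start/stop on failover
  # (assumes the SGE startup script is installed as /etc/init.d/sgemaster;
  # the NFS script is nfs-kernel-server on Debian, nfs on Red Hat):
  head1 IPaddr::10.0.0.10/24/eth0 Filesystem::/dev/sdb1::/opt/sge/default/spool::ext3 sgemaster nfs-kernel-server

On the NFS side, the clients need to mount from the floating IP rather
than either node's own address, and putting /var/lib/nfs on the shared
disk keeps lock state across a failover - that covers most of the
"harder part" above.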
The second way to do this would be to look at using shared storage
and the Grid Engine qmaster failover mechanism. This is a different
approach, in that you have two machines running, using either a
NAS-type storage server or Panasas/Lustre. The SGE spool directory
lives on this shared storage, and the SGE qmaster will be started on
the second machine if the first fails to answer its heartbeat.
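That mechanism is the shadow master daemon, sge_shadowd. Roughly - with
an assumed cell name of "default" and made-up host names - the setup
looks like this:

  # $SGE_ROOT/default/common/shadow_masters
  # (primary qmaster host first, shadow host(s) after it)
  head1
  head2

  # On head2, run the shadow daemon; it watches the heartbeat file in
  # the shared qmaster spool and starts its own qmaster (rewriting
  # act_qmaster) if that file stops being updated:
  $SGE_ROOT/bin/<arch>/sge_shadowd      # <arch> e.g. lx24-amd64

Takeover timing is tunable through environment variables
(SGE_CHECK_INTERVAL, SGE_GET_ACTIVE_INTERVAL, SGE_DELAY_TIME) - see the
sge_shadowd man page. The one hard requirement is that the qmaster
spool and common directories sit on storage both machines can see,
which is exactly what the NAS/Panasas/Lustre box gives you.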
ps. 1U boxes? Think something a bit bigger - with hot-swap PSUs.
You also might have to fit a second network card for your HA heartbeat
link (links, plural - you need two) plus a SCSI card, so think slightly
bigger boxes for the two head nodes.
You can spec 1U nodes for interactive login/compile/job submission
duties. Maybe you could run a DNS round-robin type load balancer for
redundancy on these boxes - they should all be similar, and if one
stops working, then ho-hum.
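DNS round robin is just several A records for one name (the host names
and addresses below are invented):

  ; zone file fragment - login.cluster.example.com rotates across the boxes
  login   IN  A   192.168.1.21    ; login01
  login   IN  A   192.168.1.22    ; login02
  login   IN  A   192.168.1.23    ; login03

Users just ssh to "login"; if one box dies you drop its record (or park
its IP on a surviving node) and carry on.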
pps. "While the spare is not in use, I was planning on dispatching jobs
to it."
Actually, we also do a cold failover setup which is just like that, and
the backup node is used for running jobs while it is idle.
--
John Hearns
Senior HPC Engineer
Streamline Computing,
The Innovation Centre, Warwick Technology Park,
Gallows Hill, Warwick CV34 6UW
Office: 01926 623130 Mobile: 07841 231235