On 03/10/13 00:00, Dave Love wrote:
Lionel SPINELLI <[email protected]> writes:

Hello all,

I have a question that is not directly about SGE but is related to
it. What tools do administrators use to install, manage and configure
a large number of grid nodes and keep them consistent? I mean,
if I have 10 nodes in my grid and need to be sure that all of them
have the right software/configuration, I don't want to manually
configure each machine.

It seems to be a religious topic...  My requirements for managing node
images are:

1. free software
2. stateless image (NFS root + local /tmp; modify the shared root
    more-or-less directly)
3. support for heterogeneous systems with different images for multiple
    OSes and customizing for different node groups with a single image
4. decoupled from the OS (not living somewhat in its own world, like
    Rocks) so you do normal package management

When I had to pick one swiftly, the only one that would clearly do
3. properly was oneSIS <http://www.onesis.org>, though others probably
can too.  I've run a 250-node horrible mess of hardware as a
shared-everything cluster with oneSIS off a single NFS server.  I recently
replaced a vendor's useless imaging scheme with it for the second time.

Do you know a simple tool that could do the job? My research led me
to "Puppet Master", but I would like to get advice from experts...

I'm not convinced that's appropriate for an HPC cluster, but people with
more HPC experience disagree.

You need tools apart from image management, of course.

In lots of ways (as with so many things), the first thing you need to do is define your requirements... they are certainly not the same for all clusters / setups. Most management solutions will get you there in the end, but you can save yourself a lot of effort by matching them to your requirements.

For example, Dave started his description with "My requirements for managing node images are" - in some ways, that's already very different from what I need to do (or how I look at it), as I don't actually have any requirement to manage node images.

Our base requirement is for our whole server/workstation estate (including cluster) to have a known and defined configuration/software environment. Everything's PXE booted off the same installer image. Everything uses the same kickstart file. So to us, it all boils down to configuration management - hence CFEngine.
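
To illustrate what "a known and defined configuration" amounts to, here is a minimal Python sketch of the general idea behind configuration management: declare the desired state once, report where each node differs, and apply a correction that is safe to repeat. This is not CFEngine syntax and not how CFEngine is implemented; the setting names, values, host names and the functions `drift` and `converge` are all hypothetical and chosen only for illustration.

```python
# Hypothetical desired state shared by every node (illustrative values only).
DESIRED = {
    "ntp_server": "ntp.example.org",
    "home_mount": "nfs.example.org:/home",
}


def drift(desired, actual):
    """Return {setting: (wanted, found)} for every setting that differs."""
    return {
        key: (wanted, actual.get(key))
        for key, wanted in desired.items()
        if actual.get(key) != wanted
    }


def converge(desired, actual):
    """Return a corrected copy of `actual`.

    Settings not mentioned in `desired` are left alone, and running
    the function a second time changes nothing (it is idempotent).
    """
    fixed = dict(actual)
    fixed.update(desired)
    return fixed


# Hypothetical inventory: node02 has a wrong NTP server and no home mount.
NODES = {
    "node01": {"ntp_server": "ntp.example.org",
               "home_mount": "nfs.example.org:/home"},
    "node02": {"ntp_server": "pool.example.net", "scratch": "/tmp"},
}

if __name__ == "__main__":
    for name, state in sorted(NODES.items()):
        problems = drift(DESIRED, state)
        print(name, "OK" if not problems else problems)
```

The point of the sketch is that the same definition is applied to every machine, workstation or cluster node, so nothing has to be configured by hand per host.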

I'd second Dave's requirement 4 - I wouldn't really go for anything that's coupled to the OS.

Tina

--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
