Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

Bogdan Costescu Wed, 01 Oct 2008 06:04:43 -0700

On Tue, 30 Sep 2008, Donald Becker wrote:

Ahhh, your first flawed assumption.


You believe that the OS needs to be statically provisioned to the nodes.
That is incorrect.

Well, you also make the flawed assumption that the best technical solutions are always preferred. From my position I have seen many cases where political or administrative reasons have very much restricted the choice of technical solutions that could be used. Other reasons are related to the lack of flexibility from ISVs which provide applications in binary form only and make certain assumptions about the way the target cluster works. Yet another reason is the fact that a solution like Scyld's limits the whole cluster to running one distribution (please correct me if I'm wrong), while a solution with node "images" allows mixing Linux distributions at will.

The only times that it is asked to do something new (boot, accept a new process) it's communicating with a fully installed, up-to-date master node. It has, at least temporarily, complete access to a reference install.

I think that this is another assumption that holds true for the Scyld system, but there are situations where this is not true. Some years ago I developed a rudimentary batch system in which the master node only contacted the first node allocated/desired for the job; this node was then responsible for contacting the other nodes allocated/desired and starting the rest of the job. This was very much modelled after the way the naive rsh/ssh based launchers for MPI jobs work: once mpirun is running, there is no connection to the master node, only between the node where mpirun is running and the rest of the nodes specified in the hosts file. I think that Torque also has a similar design (Mother Superior being in control of the job), but I haven't looked closely at the details so I might be wrong.
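
To make the launch pattern concrete, here is a minimal sketch in Python. It is only an in-memory simulation, not the batch system described above and not how Torque or mpirun are implemented; the names Node, lead_job and master_submit are hypothetical and chosen for illustration. The point it shows is that the master contacts only the first allocated node, and that node contacts the rest.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical compute node; 'log' records who asked it to start work."""
    name: str
    log: list = field(default_factory=list)

    def start_task(self, job_id, sender):
        # Remember which party contacted this node for the job.
        self.log.append((job_id, sender))

    def lead_job(self, job_id, peers, sender):
        # The first node starts its own part of the job, then contacts
        # the remaining nodes itself; the master is not involved here.
        self.start_task(job_id, sender)
        for peer in peers:
            peer.start_task(job_id, self.name)


def master_submit(job_id, allocated):
    """The master talks only to the first allocated node."""
    first, rest = allocated[0], allocated[1:]
    first.lead_job(job_id, rest, sender="master")
    return first.name


if __name__ == "__main__":
    nodes = [Node("n1"), Node("n2"), Node("n3")]
    master_submit("job42", nodes)
    for node in nodes:
        print(node.name, node.log)
```

In this sketch only n1 ever records "master" as the sender; n2 and n3 see only n1, so they never need access to the master node while the job is being started.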

If you design a cluster system that installs on a local disk, it's very difficult to adapt it to diskless blades. If you design a system that is as efficient without disks, it's trivial to optionally mount disks for caching, temporary files or application I/O.

If you design a system that is flexible enough to allow you to use either diskless or diskful installs, what do you have to lose?

The same node "image" can be used in several ways:

- copied to the local disk and booted from there (where the copying could be done as a separate operation followed by a reboot, or it can be done from the initrd)
- used over NFS-root
- used as a ramdisk, provided that the node "image" is small enough
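
As a rough illustration of how one image could be deployed in any of these three ways, here is a small Python sketch that picks a mode per node. The function name choose_boot_mode, the preference order (ramdisk, then local disk, then NFS-root) and the 25% RAM threshold are all assumptions made for the example; none of them comes from an existing tool.

```python
def choose_boot_mode(image_mb, ram_mb, has_local_disk, max_ram_fraction=0.25):
    """Pick how a node should use the node "image".

    Hypothetical policy: use a ramdisk only if the image is small
    relative to the node's RAM, otherwise copy it to the local disk
    if there is one, otherwise fall back to NFS-root.
    """
    if image_mb <= ram_mb * max_ram_fraction:
        return "ramdisk"
    if has_local_disk:
        return "local-disk"
    return "nfs-root"


if __name__ == "__main__":
    # A small image on a diskless node with 4 GB of RAM.
    print(choose_boot_mode(200, 4096, has_local_disk=False))
    # A large image on a node with a disk.
    print(choose_boot_mode(2000, 4096, has_local_disk=True))
    # A large image on a diskless node.
    print(choose_boot_mode(2000, 4096, has_local_disk=False))
```

A real site would of course decide this by policy rather than by a formula; the sketch only shows that the choice can be made per node without changing the image itself.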

Note: I have used "image" in this and previous e-mails to signify the collection of files that the node needs for booting; most likely this is not a FS image (like an ISO one), but it could also be one. Various documents call this a "virtual node FS", "chroot-ed FS", etc.


--
Bogdan Costescu

IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
E-mail: [EMAIL PROTECTED]
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
