On Thursday 17 November 2005 14.59, Eric Thibodeau wrote:
> I would have scanned the mailing list for this but never found the
> search engine for it...
>
> As the title says, I would love to see a Gentoo clustering solution
> based on the www.clustermatic.org/www.ltsp.org approach (basically,
> PXE booted OS). In fact, work on this would open up Gentoo to both
> clustering and LTSP usage. Most of the work that needs to be done,
> if I am not mistaken, is to create a Gentoo environment which is
> NFS bootable as done with LTSP. We could then "easily" manage the
> tools and utilities available on the slave nodes by using portage.
> Obviously, there is also some kernel tweaking involved but we can
> start from some previous work done in that area in all cases.
>
> I am actually asked to build a Gentoo based cluster next semester
> and I would definitely like to build it to be as flexible and
> scalable as possible. I believe that the above approach fits these
> requirements quite well. So I'm open to any suggestions for this
> project. Obviously, I'll document this and attempt to make the
> procedure as accessible to anyone as possible.
>
> Future work would probably be to seamlessly integrate OpenPBS into
> such a PXE-able environment to enable it to reboot/configure nodes
> as required for given tasks/profiles.
>
> Thanks,
>
> Eric Thibodeau



Hello folks

This is actually quite doable with Gentoo, and not too hard, I might
add. I'm certainly no Gentoo/Linux wizard, and yet I have banged
together something like this. I have built several different clusters
with this setup since 2002, and it seems to work OK.

SSI: single system image. All servers and nodes run off the same shared
root image, so there are no local installs to maintain and no
duplicated binaries.

Diskless boot: all nodes can boot diskless over PXE => DHCP => TFTP =>
NFS.

Root over NFS: just one file system tree, nothing stored on diskless
nodes. Nodes that have disks use them for swap and local tmp storage.
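
To make the boot chain above a bit more concrete, here is a minimal
sketch in Python that renders a boot entry for an NFS-rooted node. It
assumes PXELINUX is the loader fetched over TFTP, which this mail does
not actually say; the server address, export path and kernel file name
are hypothetical placeholders. The kernel parameters (root=/dev/nfs,
nfsroot=, ip=dhcp) are the standard Linux ones for mounting root over
NFS.

```python
def pxelinux_entry(label, kernel, nfs_server, nfs_root):
    """Render a PXELINUX config entry that boots a kernel with its root on NFS.

    All argument values are supplied by the caller; nothing here is taken
    from the setup described in the mail.
    """
    # Standard Linux kernel parameters for an NFS root:
    #   root=/dev/nfs         -> mount the root file system over NFS
    #   nfsroot=<srv>:<path>  -> which export to mount
    #   ip=dhcp               -> configure the NIC via DHCP before mounting
    append = f"root=/dev/nfs nfsroot={nfs_server}:{nfs_root} ip=dhcp"
    lines = [
        f"DEFAULT {label}",
        f"LABEL {label}",
        f"  KERNEL {kernel}",
        f"  APPEND {append}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(pxelinux_entry("gentoo-node", "bzImage-node",
                         "192.168.0.1", "/export/gentoo"), end="")
```

The output would typically be saved as pxelinux.cfg/default under the
TFTP root, so that every node picks up the same entry.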

OpenMosix, MPI, PVM, Grid Engine, custom batch queues etc. for the
clustering. The apps I write usually use fork-and-forget over
OpenMosix. It's just soooo simple.
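
As a rough illustration of the fork-and-forget style, here is a small
Python sketch. It is not one of my actual apps: the function names and
the sample workload are made up for the example. The program just
forks one ordinary process per task; on an OpenMosix kernel such
processes can be migrated to other nodes without the program doing
anything special.

```python
import os


def fork_workers(tasks, work):
    """Fork one child process per task and return the child PIDs.

    The parent does not wait here; call reap() later, or not at all
    if the results do not matter to the parent.
    """
    pids = []
    for task in tasks:
        pid = os.fork()
        if pid == 0:
            # Child: run the work, then leave immediately via os._exit()
            # so the child never falls back into the parent's code path.
            status = 0
            try:
                work(task)
            except BaseException:
                status = 1
            os._exit(status)
        pids.append(pid)
    return pids


def reap(pids):
    """Wait for each child and return its exit code (optional cleanup)."""
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]


def busy(n):
    # Hypothetical CPU-bound workload standing in for a real job.
    sum(i * i for i in range(n * 10000))


if __name__ == "__main__":
    children = fork_workers([1, 2, 3], busy)
    print(reap(children))
```

Reaping is only there to avoid zombie processes and to check exit
codes; the point is that the program contains no cluster-specific code
at all.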

And since OpenMosix helps with load balancing, you can use the nodes as
regular workstations as well. I wouldn't recommend using the servers
as workstations, for stability reasons, even if it is possible. My own
constant pre-alpha test cluster runs on whatever I happen to have
available, and is home-based on my workstation. But my production
clusters have dedicated servers.

A nice side effect is that most if not all of the system is somewhat
hot-swappable. I can even swap the servers on a running cluster, with
some restrictions. Yes, I know it sounds crazy, but I've done it
successfully several times.

No duplicate system trees or binaries; per-node differences are
handled with selective /etc/init.d scripts and such.

A few init.d scripts have to be changed slightly from the Gentoo
originals, along with /sbin/rc and /sbin/functions.sh.
The nodes can have different config files, but that is a very minor
overhead.


I have posted on this topic before, but there hasn't been much 
interest. If you want to try it then contact me and I'll see if I can 
whip up a quick terse description. Then we can flesh it out as we go. 
But I don't have time to actually write something nice for quite a 
while, unfortunately.

Take care, folks
Jimmy
-- 
[email protected] mailing list