Hello Bruce,

I don't think that openMOSIX will do you any good. In fact, IIRC, if
the computer from which the MOSIX processes were started dies, then
the processes die too. It is not a solution for high availability
but mainly for load balancing.

HA-OSCAR is meant to address exactly the kind of problem you have. As
such, you should try it. It is meant to be used with "regular" OSCAR
and it provides HA for the master node only. You will certainly need a
couple of additional pieces of hardware if you want real, complete
redundancy.

Please have a look at the HA-OSCAR web page:
http://xcr.cenit.latech.edu/ha-oscar/

Ben

On Thu, 14 Oct 2004 13:55:57 +0200 (SAST), Bruce Becker
<[EMAIL PROTECTED]> wrote:
> Hello OSCAR friends
> 
> I am in the sad situation of having a very sick head node, which
> tends to die on me at the most inconvenient times. I don't know whether
> it's a hardware or software problem, but the machine is nearing the end
> of her life, so it's not impossible that the motherboard is going wonky or
> something.
> 
> My question to you guys is this :
> How to eliminate that single point of failure ?
> 
> At the moment, this machine is not only our head node, but also the
> ssh-gateway, website, firewall, and many other monitoring services
> like gmond, gmetad, clumon run on that machine. I would like to move most
> of the services off of the head node, and leave only those necessary for
> computing, like lamd, etc. My idea then is to have a separate machine act
> as a router to and from our cluster, which is also a remote logger and
> polls the monitoring feeds from ganglia, etc on the cluster head node. The
> cluster head node can then be virtualised... I was thinking a couple of
> boxes running MOSIX kernel on the same switch... We need to eliminate the
> point of failure and have some sort of fail-over mechanism. Note that my
> head node doesn't die because of an overloaded system, the failures are
> pretty much random at this stage...
> 
> Can I do something like this with HA-OSCAR ? Can I mix HA-OSCAR and
> FAT-OSCAR ? Does anyone else have this
> head-node-dies-and-life-turns-to-hell problem ? If so, how did you
> get around it ?
> 
> Comments welcome !
> Thanks
> Bruce
> 
> --
> --
> Bruce Becker
> UCT-CERN Research Center - University of Cape Town
> Private Bag RONDEBOSCH 7700
> tel :(w) +27 21 650 3356 | (m) +27 82 537 9425 | (f) +27 21 650 3342
> IM  : AIM/Jabber - brucellino | Yahoo - uctbruce
> WEB :http://hep.phy.uct.ac.za/~becker
> "Viel hilft viel"
> 
> -------------------------------------------------------
> This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
> Use IT products in your business? Tell us what you think of them. Give us
> Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more
> http://productguide.itmanagersjournal.com/guidepromo.tmpl
> _______________________________________________
> Oscar-users mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/oscar-users
> 


-- 
Benoit des Ligneris Ph. D.
President de Revolution Linux     http://www.revolutionlinux.com/
OSCAR Chair                    http://oscar.openclustergroup.org/
Chef de projet EduLinux                  http://www.edulinux.org/


