I'm trying to build out a web farm cluster using Corosync/Pacemaker. I started with the stock versions in Ubuntu 12.04 but had little success. I removed the corosync (1.x) and pacemaker packages and built Corosync 2.3 and Pacemaker 1.1.9 from source. It generally seems to run better, but I am having big issues with Corosync.

I have 14 completely identical nodes. They differ only in IP address and host name. Periodically, Corosync will fail to start on boot. The failure is intermittent and I can't find a pattern to it. What's even worse, once the cluster is up, Corosync will occasionally die for no apparent reason. No errors are logged. Nothing. The process simply disappears, taking the node offline.
My cluster has zero stability thanks to this Corosync issue. Anyone got any ideas?
Thanks.
- Rob P.
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org