Hi All,

A while ago (April, from memory), there was an ABI change in clplumbing in cluster-glue. Presumably this went mostly unnoticed in general usage. However, I have twice seen systems where the cluster could not run because of a missing (or incorrect) libglue2 package. One was my development system, with a dodgy build; the other was mentioned on #linux-ha yesterday, and was the result of ignoring a conflict error when installing the pacemaker RPM on openSUSE.

So, let me be clear: this is not something anyone should need to worry about... But I thought I'd mention it here, because the error messages you get are, IMO, not very obvious.

Symptoms of a mismatched pacemaker/libglue build are errors like:

  lrmd: [3004]: ERROR: main: can not create wait connection for command.
  lrmd: [3004]: ERROR: Startup aborted (can't create comm channel). Shutting down.
  ...
  pengine: [4011]: ERROR: init_client_ipc_comms_nodispatch: Could not access channel on: /var/run/crm/pengine
  corosync[4000]: [pcmk ] ERROR: pcmk_wait_dispatch: Child process pengine exited (pid=4011, rc=1)
  corosync[4000]: [pcmk ] notice: pcmk_wait_dispatch: Respawning failed child process: pengine

If your cluster won't start and you see this in /var/log/messages, make sure libglue2 is up to date.

And now that I've mentioned this here and it's made it to the mailing list archive, Google will know, and nobody else will ever have this problem again.

This has been a public service announcement. Thank you for reading.

Tim

-- 
Tim Serong <tser...@novell.com>
Senior Clustering Engineer, OPS Engineering, Novell Inc.
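
For anyone who wants to check a log for the symptoms quoted in this post, here is a minimal sketch in Python. It only looks for the message fragments shown above. The names SYMPTOMS, find_symptoms and scan_log are made up for illustration and are not part of pacemaker or cluster-glue. A match is only a hint to go and check libglue2; other problems may well produce the same messages.

```python
# Message fragments copied from the errors quoted in the post.
# Assumption: these substrings are enough to spot the problem in a log;
# they are not guaranteed to be unique to a libglue2 mismatch.
SYMPTOMS = (
    "can not create wait connection for command",
    "Startup aborted (can't create comm channel)",
    "init_client_ipc_comms_nodispatch: Could not access channel on",
)


def find_symptoms(lines):
    """Return the lines that contain any of the known symptom fragments."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(fragment in line for fragment in SYMPTOMS)
    ]


def scan_log(path="/var/log/messages"):
    """Read a log file and return the matching lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as log:
        return find_symptoms(log)
```

Calling scan_log() with no arguments reads /var/log/messages (which usually needs root); if it returns any lines, that is the cue to make sure libglue2 is up to date.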