On Tue, Oct 30, 2012 at 4:35 PM, Vladimir Elisseev <vo...@vovan.nl> wrote:
> Thanks for trying to help! Currently I can't provide crm_report from the
> failed node, as I've decided to restore the complete node from backup.
> The versions I use are corosync-1.3.0 and pacemaker-1.0.10. Actually the
> problem occurred after updating quite a few system packages, but all the
> cluster related software was untouched. I've found exactly the same
> issue described in the mailing list earlier:
> http://www.gossamer-threads.com/lists/linuxha/pacemaker/77881?do=post_view_threaded#77881
> At least the symptoms are exactly the same, as are the pasted log files. I've
> tried enabling debug logging as well and saw that crm tries to connect to
> the cib sockets (/var/run/crm_*) too early (IMO) and fails because cib
> wasn't started yet.
> I'm planning to repeat the update of this system again, but I'll do this
> more carefully in order to understand which particular package leads to
> this behavior. BTW, how can I create crm_report? I can't find this
> binary anywhere on the system.

It's included in subsequent 1.0.x releases.
You should have hb_report available though.

> Let me know what kind of input you'll
> need if I'm able to reproduce this problem.
>
> Regards,
> Vlad.
>
>
> On Tue, 2012-10-30 at 16:00 +1100, Andrew Beekhof wrote:
>> On Sun, Oct 28, 2012 at 9:05 PM, Vladimir Elisseev <vo...@vovan.nl> wrote:
>> > Hello,
>> >
>> > I'm having a problem: after a reboot, one cluster node can't join the
>> > cluster anymore. From the log file I can't understand what is actually
>> > going on. I can only see that cib and crm are both respawned frequently.
>> > I'd appreciate any help. Below is the relevant part of the log file:
>>
>> I appreciate that you're trying to keep it brief, but problems often
>> originate much earlier than people suspect.
>> Can you instead attach a crm_report tarball? That will have everything
>> (from both nodes) that we need to be able to help.
>>
>> What version is this btw?
>>
>> >
>> > Oct 28 10:52:22 srv2 cib: [10646]: info: cib_server_process_diff: Requesting re-sync from peer
>> > Oct 28 10:52:22 srv2 cib: [10646]: WARN: cib_diff_notify: Local-only Change (client:crmd, call: 4770): -1.-1.-1 (Application of an update diff failed, requesting a full refresh)
>> > Oct 28 10:52:22 srv2 cib: [10653]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.qJTUAV (digest: /var/lib/heartbeat/crm/cib.XwOKXQ)
>> > Oct 28 10:52:22 srv2 cib: [10646]: WARN: cib_server_process_diff: Not applying diff 0.1298.5 -> 0.1299.1 (sync in progress)
>> > Oct 28 10:52:22 srv2 cib: [10646]: info: cib_replace_notify: Local-only Replace: -1.-1.-1 from srv1
>> > Oct 28 10:52:22 corosync [pcmk]: ] info: pcmk_ipc_exit: Client cib (conn=0x1837340, async-conn=0x1837340) left
>> > Oct 28 10:52:22 corosync [pcmk]: ] ERROR: pcmk_wait_dispatch: Child process cib terminated with signal 6 (pid=10646, core=true)
>> > Oct 28 10:52:22 corosync [pcmk]: ] notice: pcmk_wait_dispatch: Respawning failed child process: cib
>> > Oct 28 10:52:22 corosync [pcmk]: ] info: spawn_child: Forked child 10656 for process cib
>> > Oct 28 10:52:22 srv2 cib: [10656]: info: Invoked: /usr/lib64/heartbeat/cib
>> >
>> >
>> > Regards,
>> > Vlad.
>> >
>> >
>> > _______________________________________________
>> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >
>> > Project Home: http://www.clusterlabs.org
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> > Bugs: http://bugs.clusterlabs.org