Yes, hb_report is there, thanks!

On Thu, 2012-11-01 at 11:40 +1100, Andrew Beekhof wrote:
> On Tue, Oct 30, 2012 at 4:35 PM, Vladimir Elisseev <vo...@vovan.nl> wrote:
> > Thanks for trying to help! Currently I can't provide crm_report from the
> > failed node, as I've decided to restore the complete node from backup.
> > The versions I use are corosync-1.3.0 and pacemaker-1.0.10. Actually the
> > problem occurred after updating quite a few system packages, but all the
> > cluster related software was untouched. I've found exactly the same
> > issue described in the mailing list earlier:
> > http://www.gossamer-threads.com/lists/linuxha/pacemaker/77881?do=post_view_threaded#77881
> > At least the symptoms are exactly the same, as are the pasted log files. I've
> > tried enabling debug logging as well and saw that crm tries to connect to
> > cib sockets (/var/run/crm_*) too early (IMO) and fails because cib
> > wasn't started yet.
> > I'm planning to repeat the update of these systems again, but I'll do this
> > more carefully in order to understand which particular package leads to
> > this behavior. BTW, how can I create crm_report? I can't find this
> > binary anywhere on the system.
>
> It's included in subsequent 1.0.x releases.
> You should have hb_report available though.
>
> > Let me know what kind of input you'll
> > need if I'm able to reproduce this problem.
> >
> > Regards,
> > Vlad.
> >
> >
> > On Tue, 2012-10-30 at 16:00 +1100, Andrew Beekhof wrote:
> >> On Sun, Oct 28, 2012 at 9:05 PM, Vladimir Elisseev <vo...@vovan.nl> wrote:
> >> > Hello,
> >> >
> >> > I'm having a problem: after a reboot, one cluster node can't join the cluster
> >> > anymore. From the log file I can't understand what actually is going on.
> >> > I can only see that cib and crm are both respawned frequently. I'd
> >> > appreciate any help.
> >> > Below is the relevant part of the log file:
> >>
> >> I appreciate that you're trying to keep it brief, but problems often
> >> originate much earlier than people suspect.
> >> Can you instead attach a crm_report tarball, that will have everything
> >> (from both nodes) that we need to be able to help.
> >>
> >> What version is this btw?
> >>
> >> >
> >> > Oct 28 10:52:22 srv2 cib: [10646]: info: cib_server_process_diff:
> >> > Requesting re-sync from peer
> >> > Oct 28 10:52:22 srv2 cib: [10646]: WARN: cib_diff_notify: Local-only
> >> > Change (client:crmd, call: 4770): -1.-1.-1 (Application of an update
> >> > diff failed, requesting a full refresh)
> >> > Oct 28 10:52:22 srv2 cib: [10653]: info: retrieveCib: Reading cluster
> >> > configuration from: /var/lib/heartbeat/crm/cib.qJTUAV (digest:
> >> > /var/lib/heartbeat/crm/cib.XwOKXQ)
> >> > Oct 28 10:52:22 srv2 cib: [10646]: WARN: cib_server_process_diff: Not
> >> > applying diff 0.1298.5 -> 0.1299.1 (sync in progress)
> >> > Oct 28 10:52:22 srv2 cib: [10646]: info: cib_replace_notify: Local-only
> >> > Replace: -1.-1.-1 from srv1
> >> > Oct 28 10:52:22 corosync [pcmk]: ] info: pcmk_ipc_exit: Client cib
> >> > (conn=0x1837340, async-conn=0x1837340) left
> >> > Oct 28 10:52:22 corosync [pcmk]: ] ERROR: pcmk_wait_dispatch: Child
> >> > process cib terminated with signal 6 (pid=10646, core=true)
> >> > Oct 28 10:52:22 corosync [pcmk]: ] notice: pcmk_wait_dispatch:
> >> > Respawning failed child process: cib
> >> > Oct 28 10:52:22 corosync [pcmk]: ] info: spawn_child: Forked child
> >> > 10656 for process cib
> >> > Oct 28 10:52:22 srv2 cib: [10656]: info: Invoked:
> >> > /usr/lib64/heartbeat/cib
> >> >
> >> >
> >> > Regards,
> >> > Vlad.
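
Since the excerpt shows cib being killed and respawned, it may help to tally how often each child process dies across the whole log rather than reading it by eye. The following is only a minimal sketch: it assumes the log has been unwrapped to one entry per line, and the function name count_child_crashes and the regular expression are illustrative choices based on the message format quoted above, not anything shipped with Pacemaker.

```python
import re
from collections import Counter

# Matches the pcmk_wait_dispatch termination message as it appears in the
# quoted excerpt. The exact wording may differ between Pacemaker versions,
# so treat this pattern as an assumption.
CRASH_RE = re.compile(
    r"Child process (?P<name>\S+) terminated with signal (?P<signal>\d+) "
    r"\(pid=(?P<pid>\d+), core=(?P<core>\w+)\)"
)


def count_child_crashes(lines):
    """Return a Counter keyed by (process name, signal number)."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            key = (match.group("name"), int(match.group("signal")))
            counts[key] += 1
    return counts


# Two entries from the excerpt above, unwrapped to one line each.
SAMPLE = [
    "Oct 28 10:52:22 corosync [pcmk]: ] ERROR: pcmk_wait_dispatch: Child "
    "process cib terminated with signal 6 (pid=10646, core=true)",
    "Oct 28 10:52:22 corosync [pcmk]: ] notice: pcmk_wait_dispatch: "
    "Respawning failed child process: cib",
]

if __name__ == "__main__":
    for (name, signal), total in sorted(count_child_crashes(SAMPLE).items()):
        print(f"{name}: {total} termination(s) with signal {signal}")
```

To run it against a real log, pass an open file object instead of SAMPLE, for example count_child_crashes(open(path)), where path is wherever your syslog writes the corosync messages.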
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org