On Fri, Aug 28, 2009 at 2:44 AM, fengyandong<[email protected]> wrote:
> Thanks for your help! Could you tell me what's wrong?

No idea sorry.
The reason we put out new versions is that they have fixes for things like this.

> If I still want to
> use the version 2.1.4, how can I solve the problem?

Scan through a few thousand patches to find the right one?

> 2009/8/27 Andrew Beekhof <[email protected]>
>
>> On Thu, Aug 27, 2009 at 12:42 PM, fengyandong<[email protected]> wrote:
>> > The attachment is /var/log/messages from passive node. The heartbeat version
>> > is 2.1.4
>>
>> Far too old. Sorry.
>> Almost certainly any problem you have was fixed a long time ago.
>> Head over to http://www.clusterlabs.org/wiki/Install where you'll find
>> instructions on installing something recent.
>>
>> >
>> > 2009/8/27 Andrew Beekhof <[email protected]>
>> >
>> >> Please send logs as attachments (they're impossible to read otherwise)
>> >> and provide version details.
>> >>
>> >> On Thu, Aug 27, 2009 at 10:56 AM, fengyandong<[email protected]> wrote:
>> >> > I configured a Heartbeat cluster with two nodes in an active/passive
>> >> > configuration.
>> >> > When the active node is poweroff immediately, the passive node does not
>> >> > take over the resource as expected.
>> >> >
>> >> > The /var/log/ha-log from passive node is here:
>> >> >
>> >> > Aug 26 13:52:33 paul kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
>> >> > Aug 26 13:52:35 paul heartbeat: [12113]: WARN: node peter: is dead
>> >> > Aug 26 13:52:35 paul heartbeat: [12113]: info: Send event about the given node dead: [bwevent_send --level error --module heartbeat --code 30100026 --desc "paul can not connect to peter."]
>> >> > Aug 26 13:52:35 paul crmd: [12132]: notice: crmd_ha_status_callback: Status update: Node peter now has status [dead]
>> >> > Aug 26 13:52:35 paul heartbeat: [12113]: info: Link peter:ib0 dead.
>> >> > Aug 26 13:52:35 paul heartbeat: [12113]: info: Send event about the given link status: [bwevent_send --level error --code 30100025 --module heartbeat --desc "Heartbeat on network interface (ib0) fault."]
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: do_dc_join_finalize: join-2: We are still in a transition. Delaying until the TE completes.
>> >> > Aug 26 13:52:35 paul tengine: [29501]: WARN: match_down_event: No match for shutdown action on 4914271e-4f59-4d45-a5ef-437b0c884629
>> >> > Aug 26 13:52:35 paul tengine: [29501]: info: extract_event: Stonith/shutdown of 4914271e-4f59-4d45-a5ef-437b0c884629 not matched
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: do_dc_join_finalize: join-2: We are still in a transition. Delaying until the TE completes.
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: mem_handle_event: Got an event OC_EV_MS_NOT_PRIMARY from ccm
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: mem_handle_event: Got an event OC_EV_MS_NOT_PRIMARY from ccm
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: mem_handle_event: instance=11, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: mem_handle_event: instance=11, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: crmd_ccm_msg_callback: Quorum lost after event=NOT PRIMARY (id=11)
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: do_dc_join_finalize: join-2: We are still in a transition. Delaying until the TE completes.
>> >> > Aug 26 13:52:35 paul ccm: [12127]: info: ccm_state_sent_memlistreq: directly callccm_compute_and_send_final_memlist()
>> >> > Aug 26 13:52:35 paul ccm: [12127]: info: Break tie for 2 nodes cluster
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: mem_handle_event: no mbr_track info
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: mem_handle_event: no mbr_track info
>> >> > Aug 26 13:52:35 paul ccm: [12127]: info: Break tie for 2 nodes cluster
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: mem_handle_event: instance=12, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: mem_handle_event: instance=12, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: cib_ccm_msg_callback: LOST: peter
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=12)
>> >> > Aug 26 13:52:35 paul cib: [12128]: info: cib_ccm_msg_callback: PEER: paul
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: erase_node_from_join: Removed dead node peter from join calculations: welcomed=0 itegrated=0 finalized=0 confirmed=0
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: check_dead_member: Our DC node (peter) left the cluster
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: ccm_event_detail: NEW MEMBERSHIP: trans=12, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=3
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: ccm_event_detail: CURRENT: paul [nodeid=0, born=12]
>> >> > Aug 26 13:52:35 paul crmd: [12132]: info: ccm_event_detail: LOST: peter [nodeid=1, born=1]
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: do_dc_join_finalize: join-2: We are still in a transition. Delaying until the TE completes.
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: register_fsa_input_adv: do_dc_join_finalize stalled the FSA with pending inputs
>> >> > Aug 26 13:52:35 paul crmd: [12132]: WARN: do_dc_join_finalize: join-2: We are still in a transition. Delaying until the TE completes.
>> >> > Aug 26 13:52:36 paul crmd: [12132]: WARN: do_dc_join_finalize: join-2: We are still in a transition. Delaying until the TE completes.
>> >> > Aug 26 13:52:42 paul OpenSM[7074]: SM port is down
>> >> > Aug 26 13:52:42 paul OpenSM[7074]: Entering DISCOVERING state
>> >> > Aug 26 13:52:44 paul : error initializing: File contains no section headers. file: file://///etc/yum.repos.d/CentoOS-Base.repo, line: 1 'gpgcheck=1\n'
>> >> > Aug 26 13:52:52 paul OpenSM[7074]: SM port is down
>> >> > Aug 26 13:52:52 paul kernel: enfs: server 192.168.1.3 not responding, still trying
>> >> > Aug 26 13:53:02 paul OpenSM[7074]: SM port is down
>> >> > Aug 26 13:53:42 paul last message repeated 4 times
>> >> > Aug 26 13:53:49 paul kernel: nfs: server 192.168.1.3 not responding, still trying
>> >> > Aug 26 13:53:52 paul OpenSM[7074]: SM port is down
>> >> > Aug 26 13:54:02 paul OpenSM[7074]: SM port is down
>> >> > Aug 26 13:54:08 paul cib: [12128]: info: cib_stats: Processed 11 operations (909.00us average, 0% utilization) in the last 10min
>> >> > Aug 26 13:54:12 paul OpenSM[7074]: SM port is down
>> >> > Aug 26 13:54:32 paul last message repeated 2 times
>> >> > Aug 26 13:54:39 paul crm_mon: [11765]: info: G_main_add_SignalHandler: Added signal handler for signal 15
>> >> > Aug 26 13:54:39 paul crm_mon: [11765]: info: G_main_add_SignalHandler: Added signal handler for signal 2
>> >> >
>> >> >
>> >> > Any help,
>> >> > Thanks!
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
