Hi,

On Sat, Mar 19, 2011 at 09:45:09PM +0100, Christoph Bartoschek wrote:
> On 19.03.2011 16:53, Bart Coninckx wrote:
> > Don't your logfiles reveal anything?
> >
>
> I see the following messages in the logfile of the first node. There is
> a pause from 21:36:25 to 21:37:21. It says that an Election Trigger
> popped after 60 seconds. I wonder why a timeout is triggered here. Both
> nodes work without problems.

That's normal. It takes 60 seconds on startup to allow nodes to join
the cluster. See the dc-deadtime cluster property.

Thanks,

Dejan

> Mar 19 21:36:22 laplace cib: [23840]: info: ais_dispatch_message:
> Membership 6820: quorum still lost
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_new_peer: Node <null>
> now has id: 33663168
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_update_peer: Node
> (null): id=33663168 state=member (new) addr=r(0) ip(192.168.1.2)
> votes=0 born=0 seen=6820 proc=00000000000000000000000000000000
> Mar 19 21:36:22 laplace cib: [23840]: notice: ais_dispatch_message:
> Membership 6820: quorum acquired
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_get_peer: Node 33663168
> is now known as ries
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_update_peer: Node ries:
> id=33663168 state=member addr=r(0) ip(192.168.1.2) votes=1 (new)
> born=6820 seen=6820 proc=00000000000000000000000000151312 (new)
> Mar 19 21:36:22 laplace corosync[23832]: [CPG ] chosen downlist from
> node r(0) ip(192.168.1.1)
> Mar 19 21:36:22 laplace corosync[23832]: [MAIN ] Completed service
> synchronization, ready to provide service.
> Mar 19 21:36:22 laplace crmd: [23845]: info: te_connect_stonith: Connected
> Mar 19 21:36:22 laplace crmd: [23845]: info: ais_dispatch_message:
> Membership 6820: quorum still lost
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_new_peer: Node <null>
> now has id: 33663168
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_update_peer: Node
> (null): id=33663168 state=member (new) addr=r(0) ip(192.168.1.2)
> votes=0 born=0 seen=6820 proc=00000000000000000000000000000000
> Mar 19 21:36:22 laplace crmd: [23845]: notice: ais_dispatch_message:
> Membership 6820: quorum acquired
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_get_peer: Node 33663168
> is now known as ries
> Mar 19 21:36:22 laplace crmd: [23845]: info: ais_status_callback:
> status: ries is now member
> Mar 19 21:36:22 laplace crmd: [23845]: notice: crmd_peer_update: Status
> update: Client ries/crmd now has status [online] (DC=<null>)
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_update_peer: Node ries:
> id=33663168 state=member addr=r(0) ip(192.168.1.2) votes=1 (new)
> born=6820 seen=6820 proc=00000000000000000000000000151312 (new)
> Mar 19 21:36:25 laplace attrd: [23842]: info: cib_connect: Connected to
> the CIB after 1 signon attempts
> Mar 19 21:36:25 laplace attrd: [23842]: info: cib_connect: Sending full
> refresh
> Mar 19 21:37:21 laplace crmd: [23845]: info: crm_timer_popped: Election
> Trigger (I_DC_TIMEOUT) just popped! (60000ms)
> Mar 19 21:37:21 laplace crmd: [23845]: WARN: do_log: FSA: Input
> I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_state_transition: State
> transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT
> cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_state_transition: State
> transition S_ELECTION -> S_PENDING [ input=I_PENDING
> cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_dc_release: DC role released
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_te_control: Transitioner
> is now inactive
> Mar 19 21:37:21 laplace crmd: [23845]: info: update_dc: Set DC to ries
> (3.0.5)
>
>
> The second node basically says:
>
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_update_peer: Node ries:
> id=33663168 state=member (new) addr=r(0) ip(192.168.1.2) (new) votes=1
> (new) born=6820 seen=6820 proc=00000
> 000000000000000000000151312 (new)
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_new_peer: Node laplace now
> has id: 16885952
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_new_peer: Node 16885952 is
> now known as laplace
> Mar 19 21:36:18 ries crmd: [19625]: info: ais_status_callback: status:
> laplace is now unknown
> Mar 19 21:36:18 ries crmd: [19625]: info: ais_status_callback: status:
> laplace is now member (was unknown)
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_update_peer: Node laplace:
> id=16885952 state=member (new) addr=r(0) ip(192.168.1.1) votes=1
> born=6820 seen=6820 proc=00000000000000
> 000000000000151312
> Mar 19 21:36:18 ries crmd: [19625]: info: do_started: Delaying start,
> Config not read (0000000000000040)
> Mar 19 21:36:18 ries crmd: Last message '[19625]: info: do_st' repeated
> 1 times, supressed by syslog-ng on ries.site
> Mar 19 21:36:18 ries crmd: [19625]: info: config_query_callback:
> Shutdown escalation occurs after: 1200000ms
> Mar 19 21:36:18 ries crmd: [19625]: info: config_query_callback:
> Checking for expired actions every 900000ms
> Mar 19 21:36:18 ries crmd: [19625]: info: config_query_callback: Sending
> expected-votes=2 to corosync
> Mar 19 21:36:18 ries crmd: [19625]: info: do_started: The local CRM is
> operational
> Mar 19 21:36:18 ries crmd: [19625]: info: do_state_transition: State
> transition S_STARTING -> S_PENDING [ input=I_PENDING
> cause=C_FSA_INTERNAL origin=do_started ]
> Mar 19 21:36:19 ries crmd: [19625]: info: ais_dispatch_message:
> Membership 6820: quorum retained
> Mar 19 21:36:19 ries crmd: [19625]: info: te_connect_stonith: Attempting
> connection to fencing daemon...
> Mar 19 21:36:20 ries crmd: [19625]: info: te_connect_stonith: Connected
> Mar 19 21:36:22 ries attrd: [19623]: info: cib_connect: Connected to the
> CIB after 1 signon attempts
> Mar 19 21:36:22 ries attrd: [19623]: info: cib_connect: Sending full refresh
> Mar 19 21:36:22 ries dhclient: XMT: Solicit on eth0, interval 108990ms.
> Mar 19 21:37:17 ries crmd: [19625]: info: do_election_count_vote:
> Election 2 (owner: laplace) pass: vote from laplace (Uptime)
> Mar 19 21:37:17 ries crmd: [19625]: info: do_state_transition: State
> transition S_PENDING -> S_ELECTION [ input=I_ELECTION
> cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Mar 19 21:37:17 ries cib: [19621]: info: cib_process_readwrite: We are
> now in R/W mode
> Mar 19 21:37:17 ries attrd: [19623]: info: find_hash_entry: Creating
> hash entry for terminate
> Mar 19 21:37:17 ries pengine: [19624]: notice: unpack_config: On loss of
> CCM Quorum: Ignore
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
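
For anyone who wants to check that a quiet stretch like the one in the laplace log matches the startup timer rather than a hang, here is a minimal sketch. It reads the timer length from the crm_timer_popped message and compares it with the time since an earlier crmd line. The two sample lines are copied from the log in this thread with the mail wrapping removed. The helper names (syslog_time, timer_info) and the regular expression are my own assumptions and not part of Pacemaker. The year is also an assumption, since syslog stamps do not carry one.

```python
import re
from datetime import datetime

# Two crmd lines from the laplace log in this thread (mail wrapping removed).
EARLIER_LINE = ("Mar 19 21:36:22 laplace crmd: [23845]: info: "
                "te_connect_stonith: Connected")
TIMER_POPPED = ("Mar 19 21:37:21 laplace crmd: [23845]: info: "
                "crm_timer_popped: Election Trigger (I_DC_TIMEOUT) "
                "just popped! (60000ms)")

# Hypothetical pattern, written only against the message format seen above.
TIMER_RE = re.compile(
    r"crm_timer_popped: (.+?) \((I_\w+)\) just popped! \((\d+)ms\)")


def syslog_time(line, year=2011):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is assumed."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def timer_info(line):
    """Return (timer name, FSA input, timeout in seconds), or None."""
    match = TIMER_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2), int(match.group(3)) / 1000.0


name, fsa_input, timeout_s = timer_info(TIMER_POPPED)
gap_s = (syslog_time(TIMER_POPPED) - syslog_time(EARLIER_LINE)).total_seconds()
print(f"{name} ({fsa_input}): timer {timeout_s:.0f}s, observed gap {gap_s:.0f}s")
```

With these two lines the observed gap is one second shorter than the 60000ms timer, which fits the explanation that crmd was simply waiting for the timer to expire.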
