Hi,

On Sat, Mar 19, 2011 at 09:45:09PM +0100, Christoph Bartoschek wrote:
> On 19.03.2011 16:53, Bart Coninckx wrote:
> > Don't your logfiles reveal anything?
> >
> 
> I see the following messages in the logfile of the first node. There is 
> a pause from 21:36:25 to 21:37:21.  It says that an Election Trigger 
> popped after 60 seconds.  I wonder why a timeout is triggered here. Both 
> nodes work without problems.

That's normal. On startup, crmd waits 60 seconds to allow nodes to
join the cluster before it triggers the DC election; that is the
I_DC_TIMEOUT in your log. See the dc-deadtime cluster property.
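
If you want to check this against your own log, a minimal Python
sketch along these lines works backwards from the timer message. The
function name timer_window and the year argument are my own
assumptions (syslog lines carry no year); the log line is copied from
your quoted output, joined onto one line.

```python
import re
from datetime import datetime, timedelta

# Log line copied from the quoted output (joined onto one line).
LINE = ("Mar 19 21:37:21 laplace crmd: [23845]: info: crm_timer_popped: "
        "Election Trigger (I_DC_TIMEOUT) just popped! (60000ms)")


def timer_window(line, year=2011):
    """Return (armed_at, popped_at, timeout) for a crm_timer_popped line.

    The year is an assumption: classic syslog timestamps omit it.
    """
    m = re.match(r"(\w{3} +\d+ \d\d:\d\d:\d\d) .*just popped! \((\d+)ms\)", line)
    if m is None:
        raise ValueError("not a crm_timer_popped line")
    popped = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    timeout = timedelta(milliseconds=int(m.group(2)))
    # Subtract the timeout to estimate when the timer was armed.
    return popped - timeout, popped, timeout


armed, popped, timeout = timer_window(LINE)
print(armed.time(), popped.time(), timeout.total_seconds())
```

For your line this gives roughly 21:36:21 as the time the timer was
armed, which is about when the first crmd messages show up on laplace,
so the pause up to 21:37:21 is just the timeout running out. Syslog
timestamps only have one-second resolution, so treat it as an estimate.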

Thanks,

Dejan

> Mar 19 21:36:22 laplace cib: [23840]: info: ais_dispatch_message: 
> Membership 6820: quorum still lost
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_new_peer: Node <null> 
> now has id: 33663168
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_update_peer: Node 
> (null): id=33663168 state=member (new) addr=r(0) ip(192.168.1.2) 
> votes=0 born=0 seen=6820 proc=00000000000000000000000000000000
> Mar 19 21:36:22 laplace cib: [23840]: notice: ais_dispatch_message: 
> Membership 6820: quorum acquired
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_get_peer: Node 33663168 
> is now known as ries
> Mar 19 21:36:22 laplace cib: [23840]: info: crm_update_peer: Node ries: 
> id=33663168 state=member addr=r(0) ip(192.168.1.2)  votes=1 (new) 
> born=6820 seen=6820 proc=00000000000000000000000000151312 (new)
> Mar 19 21:36:22 laplace corosync[23832]:  [CPG   ] chosen downlist from 
> node r(0) ip(192.168.1.1)
> Mar 19 21:36:22 laplace corosync[23832]:  [MAIN  ] Completed service 
> synchronization, ready to provide service.
> Mar 19 21:36:22 laplace crmd: [23845]: info: te_connect_stonith: Connected
> Mar 19 21:36:22 laplace crmd: [23845]: info: ais_dispatch_message: 
> Membership 6820: quorum still lost
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_new_peer: Node <null> 
> now has id: 33663168
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_update_peer: Node 
> (null): id=33663168 state=member (new) addr=r(0) ip(192.168.1.2) 
> votes=0 born=0 seen=6820 proc=00000000000000000000000000000000
> Mar 19 21:36:22 laplace crmd: [23845]: notice: ais_dispatch_message: 
> Membership 6820: quorum acquired
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_get_peer: Node 33663168 
> is now known as ries
> Mar 19 21:36:22 laplace crmd: [23845]: info: ais_status_callback: 
> status: ries is now member
> Mar 19 21:36:22 laplace crmd: [23845]: notice: crmd_peer_update: Status 
> update: Client ries/crmd now has status [online] (DC=<null>)
> Mar 19 21:36:22 laplace crmd: [23845]: info: crm_update_peer: Node ries: 
> id=33663168 state=member addr=r(0) ip(192.168.1.2)  votes=1 (new) 
> born=6820 seen=6820 proc=00000000000000000000000000151312 (new)
> Mar 19 21:36:25 laplace attrd: [23842]: info: cib_connect: Connected to 
> the CIB after 1 signon attempts
> Mar 19 21:36:25 laplace attrd: [23842]: info: cib_connect: Sending full 
> refresh
> Mar 19 21:37:21 laplace crmd: [23845]: info: crm_timer_popped: Election 
> Trigger (I_DC_TIMEOUT) just popped! (60000ms)
> Mar 19 21:37:21 laplace crmd: [23845]: WARN: do_log: FSA: Input 
> I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_state_transition: State 
> transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT 
> cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_state_transition: State 
> transition S_ELECTION -> S_PENDING [ input=I_PENDING 
> cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_dc_release: DC role released
> Mar 19 21:37:21 laplace crmd: [23845]: info: do_te_control: Transitioner 
> is now inactive
> Mar 19 21:37:21 laplace crmd: [23845]: info: update_dc: Set DC to ries 
> (3.0.5)
> 
> 
> The second node basically says:
> 
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_update_peer: Node ries: 
> id=33663168 state=member (new) addr=r(0) ip(192.168.1.2)  (new) votes=1 
> (new) born=6820 seen=6820 proc=00000000000000000000000000151312 (new)
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_new_peer: Node laplace now 
> has id: 16885952
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_new_peer: Node 16885952 is 
> now known as laplace
> Mar 19 21:36:18 ries crmd: [19625]: info: ais_status_callback: status: 
> laplace is now unknown
> Mar 19 21:36:18 ries crmd: [19625]: info: ais_status_callback: status: 
> laplace is now member (was unknown)
> Mar 19 21:36:18 ries crmd: [19625]: info: crm_update_peer: Node laplace: 
> id=16885952 state=member (new) addr=r(0) ip(192.168.1.1)  votes=1 
> born=6820 seen=6820 proc=00000000000000000000000000151312
> Mar 19 21:36:18 ries crmd: [19625]: info: do_started: Delaying start, 
> Config not read (0000000000000040)
> Mar 19 21:36:18 ries crmd: Last message '[19625]: info: do_st' repeated 
> 1 times, supressed by syslog-ng on ries.site
> Mar 19 21:36:18 ries crmd: [19625]: info: config_query_callback: 
> Shutdown escalation occurs after: 1200000ms
> Mar 19 21:36:18 ries crmd: [19625]: info: config_query_callback: 
> Checking for expired actions every 900000ms
> Mar 19 21:36:18 ries crmd: [19625]: info: config_query_callback: Sending 
> expected-votes=2 to corosync
> Mar 19 21:36:18 ries crmd: [19625]: info: do_started: The local CRM is 
> operational
> Mar 19 21:36:18 ries crmd: [19625]: info: do_state_transition: State 
> transition S_STARTING -> S_PENDING [ input=I_PENDING 
> cause=C_FSA_INTERNAL origin=do_started ]
> Mar 19 21:36:19 ries crmd: [19625]: info: ais_dispatch_message: 
> Membership 6820: quorum retained
> Mar 19 21:36:19 ries crmd: [19625]: info: te_connect_stonith: Attempting 
> connection to fencing daemon...
> Mar 19 21:36:20 ries crmd: [19625]: info: te_connect_stonith: Connected
> Mar 19 21:36:22 ries attrd: [19623]: info: cib_connect: Connected to the 
> CIB after 1 signon attempts
> Mar 19 21:36:22 ries attrd: [19623]: info: cib_connect: Sending full refresh
> Mar 19 21:36:22 ries dhclient: XMT: Solicit on eth0, interval 108990ms.
> Mar 19 21:37:17 ries crmd: [19625]: info: do_election_count_vote: 
> Election 2 (owner: laplace) pass: vote from laplace (Uptime)
> Mar 19 21:37:17 ries crmd: [19625]: info: do_state_transition: State 
> transition S_PENDING -> S_ELECTION [ input=I_ELECTION 
> cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Mar 19 21:37:17 ries cib: [19621]: info: cib_process_readwrite: We are 
> now in R/W mode
> Mar 19 21:37:17 ries attrd: [19623]: info: find_hash_entry: Creating 
> hash entry for terminate
> Mar 19 21:37:17 ries pengine: [19624]: notice: unpack_config: On loss of 
> CCM Quorum: Ignore
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
