Hi ,
I'm facing a pb at first start on my two-nodes ha cluster:  server2 and 
server3 (both with fedora11)
The start on server2 seems to work fine, and on hb_gui I can see server2 
OnLine and
server3 Offline
But the start on server3 seems to not connect to the cluster, and at the 
end crm_mon displays
only server3, no mention of server2 (in the log, seems that it fails to 
calculate server2 uuid)...
A ping between both heartbeat interfaces works fine ... so I have no 
idea why it does not
work ...
Any idea ? (Log is below)
Thanks for help.
Alain


cib[3137]: 2009/08/12_13:14:17 info: register_heartbeat_conn: Hostname: 
server3
cib[3137]: 2009/08/12_13:14:17 info: register_heartbeat_conn: UUID: 
7c6eb9f6-c938-4012-8ba0-4d5c21dd1315
cib[3137]: 2009/08/12_13:14:17 info: ccm_connect: Registering with CCM...
cib[3137]: 2009/08/12_13:14:17 WARN: ccm_connect: CCM Activation failed
cib[3137]: 2009/08/12_13:14:17 WARN: ccm_connect: CCM Connection failed 
1 times (30 max)
stonithd[3139]: 2009/08/12_13:14:17 notice: 
/usr/lib64/heartbeat/stonithd start up successfully.
stonithd[3139]: 2009/08/12_13:14:17 info: G_main_add_SignalHandler: 
Added signal handler for signal 17
mgmtd[3135]: 2009/08/12_13:14:18 info: init_crm: live
mgmtd[3135]: 2009/08/12_13:14:18 info: login to cib live: 0, ret:-10
crmd[3142]: 2009/08/12_13:14:18 info: do_cib_control: Could not connect 
to the CIB service: connection failed
crmd[3142]: 2009/08/12_13:14:18 WARN: do_cib_control: Couldn't complete 
CIB registration 1 times... pause and retry
crmd[3142]: 2009/08/12_13:14:18 info: crmd_init: Starting crmd's mainloop
mgmtd[3135]: 2009/08/12_13:14:19 info: login to cib live: 1, ret:-10
mgmtd[3135]: 2009/08/12_13:14:20 info: login to cib live: 2, ret:-10
crmd[3142]: 2009/08/12_13:14:20 info: crm_timer_popped: Wait Timer 
(I_NULL) just popped!
cib[3137]: 2009/08/12_13:14:20 info: ccm_connect: Registering with CCM...
cib[3137]: 2009/08/12_13:14:20 WARN: ccm_connect: CCM Activation failed
cib[3137]: 2009/08/12_13:14:20 WARN: ccm_connect: CCM Connection failed 
2 times (30 max)
mgmtd[3135]: 2009/08/12_13:14:21 info: login to cib live: 3, ret:-10
crmd[3142]: 2009/08/12_13:14:21 info: do_cib_control: Could not connect 
to the CIB service: connection failed
crmd[3142]: 2009/08/12_13:14:21 WARN: do_cib_control: Couldn't complete 
CIB registration 2 times... pause and retry
ccm[3136]: 2009/08/12_13:14:21 debug: quorum plugin: majority
ccm[3136]: 2009/08/12_13:14:21 debug: cluster:linux-ha, member_count=1, 
member_quorum_votes=100
ccm[3136]: 2009/08/12_13:14:21 debug: total_node_count=2, 
total_quorum_votes=200
ccm[3136]: 2009/08/12_13:14:21 debug: quorum plugin: twonodes
ccm[3136]: 2009/08/12_13:14:21 debug: cluster:linux-ha, member_count=1, 
member_quorum_votes=100
ccm[3136]: 2009/08/12_13:14:21 debug: total_node_count=2, 
total_quorum_votes=200
ccm[3136]: 2009/08/12_13:14:21 info: Break tie for 2 nodes cluster
ccm[3136]: 2009/08/12_13:14:21 info: G_main_add_SignalHandler: Added 
signal handler for signal 15
mgmtd[3135]: 2009/08/12_13:14:22 info: login to cib live: 4, ret:-10
mgmtd[3135]: 2009/08/12_13:14:23 info: login to cib failed: live
mgmtd[3135]: 2009/08/12_13:14:23 ERROR: Can't initialize management 
library.Shutting down.(-1)
heartbeat[3127]: 2009/08/12_13:14:23 WARN: Managed 
/usr/lib64/heartbeat/mgmtd -v -t process 3135 exited with return code 1.
heartbeat[3127]: 2009/08/12_13:14:23 ERROR: Respawning client 
"/usr/lib64/heartbeat/mgmtd -v -t":
heartbeat[3127]: 2009/08/12_13:14:23 info: Starting child client 
"/usr/lib64/heartbeat/mgmtd -v -t" (0,0)
crmd[3142]: 2009/08/12_13:14:23 info: crm_timer_popped: Wait Timer 
(I_NULL) just popped!
cib[3137]: 2009/08/12_13:14:23 info: ccm_connect: Registering with CCM...
cib[3137]: 2009/08/12_13:14:23 info: cib_init: Requesting the list of 
configured nodes
cib[3137]: 2009/08/12_13:14:23 info: cib_init: Starting cib mainloop
cib[3137]: 2009/08/12_13:14:23 info: cib_client_status_callback: Status 
update: Client server3/cib now has status [join]
cib[3137]: 2009/08/12_13:14:23 info: crm_new_peer: Node 0 is now known 
as server3
cib[3137]: 2009/08/12_13:14:23 info: crm_update_peer_proc: server3.cib 
is now online
cib[3137]: 2009/08/12_13:14:23 info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
cib[3137]: 2009/08/12_13:14:23 info: mem_handle_event: instance=1, 
nodes=1, new=1, lost=0, n_idx=0, new_idx=0, old_idx=3
cib[3137]: 2009/08/12_13:14:23 info: cib_ccm_msg_callback: Processing 
CCM event=NEW MEMBERSHIP (id=1)
cib[3137]: 2009/08/12_13:14:23 info: crm_get_peer: Node server3 now has 
id: 1
cib[3137]: 2009/08/12_13:14:23 info: crm_update_peer: Node server3: id=1 
state=member (new) addr=(null) votes=-1 born=1 seen=1 
proc=00000000000000000000000000000100
cib[3137]: 2009/08/12_13:14:23 info: crm_update_peer_proc: server3.ais 
is now online
cib[3137]: 2009/08/12_13:14:23 info: crm_update_peer_proc: server3.crmd 
is now online
cib[3144]: 2009/08/12_13:14:23 info: write_cib_contents: Archived 
previous version as /var/lib/heartbeat/crm/cib-9.raw
cib[3144]: 2009/08/12_13:14:23 info: write_cib_contents: Wrote version 
0.7.0 of the CIB to disk (digest: c41e9acb00ec2c131a3375b2b6b0faec)
cib[3144]: 2009/08/12_13:14:23 info: retrieveCib: Reading cluster 
configuration from: /var/lib/heartbeat/crm/cib.Mx3kVH (digest: 
/var/lib/heartbeat/crm/cib.hewZvu)
cib[3137]: 2009/08/12_13:14:24 info: cib_client_status_callback: Status 
update: Client server3/cib now has status [online]
heartbeat[3143]: 2009/08/12_13:14:24 info: Starting 
"/usr/lib64/heartbeat/mgmtd -v -t" as uid 0  gid 0 (pid 3143)
mgmtd[3143]: 2009/08/12_13:14:24 info: Pacemaker-mgmt Hg Version: 
1f7d1be39d34c38d64d971ac79a21339917fba4a
mgmtd[3143]: 2009/08/12_13:14:24 info: G_main_add_SignalHandler: Added 
signal handler for signal 15
mgmtd[3143]: 2009/08/12_13:14:24 debug: Enabling coredumps
mgmtd[3143]: 2009/08/12_13:14:24 info: G_main_add_SignalHandler: Added 
signal handler for signal 10
mgmtd[3143]: 2009/08/12_13:14:24 info: G_main_add_SignalHandler: Added 
signal handler for signal 12
mgmtd[3143]: 2009/08/12_13:14:24 info: init_crm: live
attrd[3141]: 2009/08/12_13:14:24 info: cib_connect: Connected to the CIB 
after 7 signon attempts
attrd[3141]: 2009/08/12_13:14:24 info: cib_connect: Sending full refresh
crmd[3142]: 2009/08/12_13:14:24 info: do_cib_control: CIB connection 
established
crmd[3142]: 2009/08/12_13:14:24 info: crm_cluster_connect: Connecting to 
Heartbeat
crmd[3142]: 2009/08/12_13:14:24 info: register_heartbeat_conn: Hostname: 
server3
crmd[3142]: 2009/08/12_13:14:24 info: register_heartbeat_conn: UUID: 
7c6eb9f6-c938-4012-8ba0-4d5c21dd1315
mgmtd[3143]: 2009/08/12_13:14:24 debug: main: run the loop...
mgmtd[3143]: 2009/08/12_13:14:24 info: Started.
crmd[3142]: 2009/08/12_13:14:25 info: do_ha_control: Connected to the 
cluster
crmd[3142]: 2009/08/12_13:14:25 info: do_ccm_control: CCM connection 
established... waiting for first callback
crmd[3142]: 2009/08/12_13:14:25 info: do_started: Delaying start, CCM 
(0000000000100000) not connected
crmd[3142]: 2009/08/12_13:14:25 info: config_query_callback: Checking 
for expired actions every 900000ms
crmd[3142]: 2009/08/12_13:14:25 notice: crmd_client_status_callback: 
Status update: Client server3/crmd now has status [online] (DC=false)
crmd[3142]: 2009/08/12_13:14:25 info: crm_new_peer: Node 0 is now known 
as server3
crmd[3142]: 2009/08/12_13:14:25 info: crm_update_peer_proc: server3.crmd 
is now online
crmd[3142]: 2009/08/12_13:14:25 info: crmd_client_status_callback: Not 
the DC
crmd[3142]: 2009/08/12_13:14:25 notice: crmd_client_status_callback: 
Status update: Client server3/crmd now has status [online] (DC=false)
crmd[3142]: 2009/08/12_13:14:25 info: crmd_client_status_callback: Not 
the DC
crmd[3142]: 2009/08/12_13:14:25 info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
crmd[3142]: 2009/08/12_13:14:25 info: mem_handle_event: instance=1, 
nodes=1, new=1, lost=0, n_idx=0, new_idx=0, old_idx=3
crmd[3142]: 2009/08/12_13:14:25 info: crmd_ccm_msg_callback: Quorum 
(re)attained after event=NEW MEMBERSHIP (id=1)
crmd[3142]: 2009/08/12_13:14:25 info: ccm_event_detail: NEW MEMBERSHIP: 
trans=1, nodes=1, new=1, lost=0 n_idx=0, new_idx=0, old_idx=3
crmd[3142]: 2009/08/12_13:14:25 info: ccm_event_detail:         CURRENT: 
server3 [nodeid=1, born=1]
crmd[3142]: 2009/08/12_13:14:25 info: ccm_event_detail:         NEW:     
server3 [nodeid=1, born=1]
crmd[3142]: 2009/08/12_13:14:25 info: crm_get_peer: Node server3 now has 
id: 1
crmd[3142]: 2009/08/12_13:14:25 info: crm_update_peer: Node server3: 
id=1 state=member (new) addr=(null) votes=-1 born=1 seen=1 
proc=00000000000000000000000000000200
crmd[3142]: 2009/08/12_13:14:25 info: crm_update_peer_proc: server3.ais 
is now online
crmd[3142]: 2009/08/12_13:14:25 info: do_started: The local CRM is 
operational
crmd[3142]: 2009/08/12_13:14:25 info: do_state_transition: State 
transition S_STARTING -> S_PENDING [ input=I_PENDING 
cause=C_FSA_INTERNAL origin=do_started ]
crmd[3142]: 2009/08/12_13:14:36 info: crm_timer_popped: Election Trigger 
(I_DC_TIMEOUT) just popped!
crmd[3142]: 2009/08/12_13:14:36 WARN: do_log: FSA: Input I_DC_TIMEOUT 
from crm_timer_popped() received in state S_PENDING
crmd[3142]: 2009/08/12_13:14:36 info: do_state_transition: State 
transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT 
cause=C_TIMER_POPPED origin=crm_timer_popped
crmd[3142]: 2009/08/12_13:14:36 info: do_state_transition: State 
transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC 
cause=C_FSA_INTERNAL origin=do_election_check ]
crmd[3142]: 2009/08/12_13:14:36 info: do_te_control: Registering TE 
UUID: ecf08a7c-6fb8-43e0-a8b6-011ab83ad5f7
crmd[3142]: 2009/08/12_13:14:40 info: update_dc: Set DC to server3 (3.0.1)
crmd[3142]: 2009/08/12_13:14:41 info: do_state_transition: State 
transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED 
cause=C_FSA_INTERNAL origin=check_join_state ]
crmd[3142]: 2009/08/12_13:14:41 info: do_state_transition: All 1 cluster 
nodes responded to the join offer.
crmd[3142]: 2009/08/12_13:14:41 info: do_dc_join_finalize: join-1: 
Syncing the CIB from server3 to the rest of the cluster
cib[3137]: 2009/08/12_13:14:41 info: cib_process_request: Operation 
complete: op cib_sync for section 'all' (origin=local/crmd/14, 
version=0.7.0): ok (rc=0)
cib[3137]: 2009/08/12_13:14:41 info: cib_process_request: Operation 
complete: op cib_modify for section nodes (origin=local/crmd/15, 
version=0.7.0): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:41 info: update_attrd: Connecting to attrd...
crmd[3142]: 2009/08/12_13:14:41 info: attrd_update: Sent update: 
terminate=(null) for server3
crmd[3142]: 2009/08/12_13:14:41 info: attrd_update: Sent update: 
shutdown=(null) for server3
attrd[3141]: 2009/08/12_13:14:41 info: find_hash_entry: Creating hash 
entry for terminate
attrd[3141]: 2009/08/12_13:14:41 info: find_hash_entry: Creating hash 
entry for shutdown
crmd[3142]: 2009/08/12_13:14:41 info: do_dc_join_ack: join-1: Updating 
node state to member for server3
cib[3137]: 2009/08/12_13:14:41 info: cib_process_request: Operation 
complete: op cib_delete for section 
//node_sta...@uname='server3']/transient_attributes 
(origin=local/crmd/16, version=0.7.0): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:41 info: erase_xpath_callback: Deletion of 
"//node_sta...@uname='server3']/transient_attributes": ok (rc=0)
cib[3137]: 2009/08/12_13:14:41 info: cib_process_request: Operation 
complete: op cib_delete for section //node_sta...@uname='server3']/lrm 
(origin=local/crmd/17, version=0.7.0): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:41 info: erase_xpath_callback: Deletion of 
"//node_sta...@uname='server3']/lrm": ok (rc=0)
cib[3137]: 2009/08/12_13:14:41 info: cib_process_request: Operation 
complete: op cib_delete for section //node_sta...@uname='server3']/lrm 
(origin=local/crmd/18, version=0.7.0): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:41 info: erase_xpath_callback: Deletion of 
"//node_sta...@uname='server3']/lrm": ok (rc=0)
crmd[3142]: 2009/08/12_13:14:41 info: do_state_transition: State 
transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED 
cause=C_FSA_INTERNAL origin=check_join_state ]
crmd[3142]: 2009/08/12_13:14:41 info: populate_cib_nodes_ha: Requesting 
the list of configured nodes
crmd[3142]: 2009/08/12_13:14:43 WARN: get_uuid: Could not calculate UUID 
for server2
crmd[3142]: 2009/08/12_13:14:43 WARN: populate_cib_nodes_ha: Node 
server2: no uuid found
crmd[3142]: 2009/08/12_13:14:43 info: do_state_transition: All 1 cluster 
nodes are eligible to run resources.
crmd[3142]: 2009/08/12_13:14:43 info: do_dc_join_final: Ensuring DC, 
quorum and node attributes are up-to-date
crmd[3142]: 2009/08/12_13:14:43 info: attrd_update: Sent update: 
(null)=(null) for localhost
attrd[3141]: 2009/08/12_13:14:43 info: attrd_local_callback: Sending 
full refresh (origin=crmd)
crmd[3142]: 2009/08/12_13:14:43 info: crm_update_quorum: Updating quorum 
status to true (call=22)
attrd[3141]: 2009/08/12_13:14:43 info: attrd_trigger_update: Sending 
flush op to all hosts for: shutdown (<null>)
crmd[3142]: 2009/08/12_13:14:43 info: abort_transition_graph: 
do_te_invoke:190 - Triggered transition abort (complete=1) : Peer Cancelled
crmd[3142]: 2009/08/12_13:14:43 info: do_pe_invoke: Query 23: Requesting 
the current CIB: S_POLICY_ENGINE
attrd[3141]: 2009/08/12_13:14:43 info: attrd_trigger_update: Sending 
flush op to all hosts for: terminate (<null>)
cib[3137]: 2009/08/12_13:14:43 info: cib_process_request: Operation 
complete: op cib_modify for section nodes (origin=local/crmd/20, 
version=0.7.1): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:43 info: abort_transition_graph: 
need_abort:59 - Triggered transition abort (complete=1) : Non-status change
crmd[3142]: 2009/08/12_13:14:43 info: need_abort: Aborting on change to 
admin_epoch
crmd[3142]: 2009/08/12_13:14:43 info: do_pe_invoke: Query 24: Requesting 
the current CIB: S_POLICY_ENGINE
cib[3137]: 2009/08/12_13:14:43 info: log_data_element: cib:diff: - <cib 
admin_epoch="0" epoch="7" num_updates="1" />
cib[3137]: 2009/08/12_13:14:43 info: log_data_element: cib:diff: + <cib 
dc-uuid="7c6eb9f6-c938-4012-8ba0-4d5c21dd1315" admin_epoch="0" epoch="8" 
num_updates="1" />
cib[3137]: 2009/08/12_13:14:43 info: cib_process_request: Operation 
complete: op cib_modify for section cib (origin=local/crmd/22, 
version=0.8.1): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:43 info: do_pe_invoke_callback: Invoking 
the PE: ref=pe_calc-dc-1250075683-7, seq=1, quorate=1
pengine[3145]: 2009/08/12_13:14:43 info: unpack_config: Node scores: 
'red' = -INFINITY, 'yellow' = 0, 'green' = 0
pengine[3145]: 2009/08/12_13:14:43 ERROR: unpack_resources: No STONITH 
resources have been defined
pengine[3145]: 2009/08/12_13:14:43 ERROR: unpack_resources: Either 
configure some or disable STONITH with the stonith-enabled option
pengine[3145]: 2009/08/12_13:14:43 ERROR: unpack_resources: NOTE: 
Clusters with shared data need STONITH to ensure data integrity
pengine[3145]: 2009/08/12_13:14:43 info: determine_online_status: Node 
server3 is online
pengine[3145]: 2009/08/12_13:14:43 info: stage6: Delaying fencing 
operations until there are resources to manage
cib[3146]: 2009/08/12_13:14:43 info: write_cib_contents: Archived 
previous version as /var/lib/heartbeat/crm/cib-10.raw
cib[3146]: 2009/08/12_13:14:43 info: write_cib_contents: Wrote version 
0.8.0 of the CIB to disk (digest: a037845ebd38e720357a1dbf3e1faf68)

crmd[3142]: 2009/08/12_13:14:36 WARN: cib_client_add_notify_callback: 
Callback already present
crmd[3142]: 2009/08/12_13:14:36 info: set_graph_functions: Setting 
custom graph functions
crmd[3142]: 2009/08/12_13:14:36 info: unpack_graph: Unpacked transition 
-1: 0 actions in 0 synapses
crmd[3142]: 2009/08/12_13:14:36 info: start_subsystem: Starting 
sub-system "pengine"
pengine[3145]: 2009/08/12_13:14:36 info: main: Starting pengine
crmd[3142]: 2009/08/12_13:14:39 info: do_dc_takeover: Taking over DC 
status for this partition
cib[3137]: 2009/08/12_13:14:39 info: cib_process_readwrite: We are now 
in R/W mode
cib[3137]: 2009/08/12_13:14:39 info: cib_process_request: Operation 
complete: op cib_master for section 'all' (origin=local/crmd/6, 
version=0.7.0): ok (rc=0)
cib[3137]: 2009/08/12_13:14:39 info: cib_process_request: Operation 
complete: op cib_modify for section cib (origin=local/crmd/7, 
version=0.7.0): ok (rc=0)
cib[3137]: 2009/08/12_13:14:39 info: cib_process_request: Operation 
complete: op cib_modify for section crm_config (origin=local/crmd/9, 
version=0.7.0): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:39 info: join_make_offer: Making join 
offers based on membership 1
crmd[3142]: 2009/08/12_13:14:39 info: do_dc_join_offer_all: join-1: 
Waiting on 1 outstanding join acks
crmd[3142]: 2009/08/12_13:14:39 info: te_connect_stonith: Attempting 
connection to fencing daemon...
cib[3137]: 2009/08/12_13:14:39 info: cib_process_request: Operation 
complete: op cib_modify for section crm_config (origin=local/crmd/11, 
version=0.7.0): ok (rc=0)
crmd[3142]: 2009/08/12_13:14:40 info: te_connect_stonith: Connected
crmd[3142]: 2009/08/12_13:14:40 info: config_query_callback: Checking 
for expired actions every 900000ms

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Reply via email to