Hello list,

I have a three-node cluster (nodes A, B, and C) running heartbeat 2.0.5 with CRM enabled. In the CIB, resource stickiness is set to INFINITY, there is a single IPaddr resource, and no constraints are defined. My problem is that the resource fails back to node A when node A comes back up, which should not happen with this configuration. I found some errors in the logs, so I am attaching the relevant parts of the log files from nodes A and B, where I think the problem is; if I am wrong about that, please point me in the right direction.

If anyone has a solution to this problem, please help.
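For reference, this is roughly how I set the stickiness (a sketch from memory; the exact attribute name follows the underscore convention that heartbeat 2.0.x uses for the other cluster preferences in the log below, so treat it as an approximation of my actual CIB):

```shell
# Set cluster-wide default stickiness so a resource stays where it is
# after a failed node rejoins (attribute name is the 2.0.x-style guess):
crm_attribute -t crm_config -n default_resource_stickiness -v INFINITY

# Check the CIB for configuration errors, as the pengine log suggests:
crm_verify -L
```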
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c STONITH will
reboot nodes
pengine[6343]: 2010/11/30_09:58:47 WARN: unpack_config:unpack.c No value
specified for cluster preference: symmetric_cluster
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c Cluster is
symmetric - resources can run anywhere by default
pengine[6343]: 2010/11/30_09:58:47 WARN: unpack_config:unpack.c No value
specified for cluster preference: no_quorum_policy
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c On loss of CCM
Quorum: Stop ALL resources
pengine[6343]: 2010/11/30_09:58:47 WARN: unpack_config:unpack.c No value
specified for cluster preference: stop_orphan_resources
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c Orphan
resources are stopped
pengine[6343]: 2010/11/30_09:58:47 WARN: unpack_config:unpack.c No value
specified for cluster preference: stop_orphan_actions
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c Orphan resource
actions are stopped
pengine[6343]: 2010/11/30_09:58:47 WARN: unpack_config:unpack.c No value
specified for cluster preference: remove_after_stop
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c Stopped
resources are removed from the status section: false
pengine[6343]: 2010/11/30_09:58:47 WARN: unpack_config:unpack.c No value
specified for cluster preference: is_managed_default
pengine[6343]: 2010/11/30_09:58:47 info: unpack_config:unpack.c By default
resources are managed
pengine[6343]: 2010/11/30_09:58:47 info: determine_online_status:unpack.c Node
ofi246 is online
pengine[6343]: 2010/11/30_09:58:47 info: determine_online_status:unpack.c Node
ofi249 is online
pengine[6343]: 2010/11/30_09:58:47 ERROR: native_add_running:native.c Resource
ocf::IPaddr:ip_1 appears to be active on 2 nodes.
pengine[6343]: 2010/11/30_09:58:47 ERROR: See
http://linux-ha.org/v2/faq/resource_too_active for more information.
pengine[6343]: 2010/11/30_09:58:47 info: ip_1 (heartbeat::ocf:IPaddr)
pengine[6343]: 2010/11/30_09:58:47 info: 0 : ofi246
pengine[6343]: 2010/11/30_09:58:47 info: 1 : ofi249
pengine[6343]: 2010/11/30_09:58:47 ERROR: native_create_actions:native.c
Attempting recovery of resource ip_1
pengine[6343]: 2010/11/30_09:58:47 notice: StopRsc:native.c ofi246 Stop
ip_1
pengine[6343]: 2010/11/30_09:58:47 notice: StopRsc:native.c ofi249 Stop
ip_1
pengine[6343]: 2010/11/30_09:58:47 notice: StartRsc:native.c ofi249 Start
ip_1
pengine[6343]: 2010/11/30_09:58:47 notice: Recurring:native.c ofi249
ip_1_monitor_5000
pengine[6343]: 2010/11/30_09:58:47 notice: stage8:stages.c Created transition
graph 9.
pengine[6343]: 2010/11/30_09:58:47 WARN: process_pe_message:pengine.c No value
specified for cluster preference: pe-error-series-max
pengine[6343]: 2010/11/30_09:58:47 ERROR: process_pe_message:pengine.c
Transition 9: ERRORs found during PE processing. PEngine Input stored in:
/var/lib/heartbeat/pengine/pe-error-22.bz2
pengine[6343]: 2010/11/30_09:58:47 info: process_pe_message:pengine.c
Configuration ERRORs found during PE processing. Please run "crm_verify -L" to
identify issues.
crmd[4080]: 2010/11/30_09:58:47 info: do_state_transition:fsa.c ofi246: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=do_msg_route ]
tengine[6342]: 2010/11/30_09:58:47 info: unpack_graph:unpack.c Unpacked
transition 9: 5 actions in 5 synapses
tengine[6342]: 2010/11/30_09:58:47 info: send_rsc_command:actions.c Initiating
action 5: ip_1_stop_0 on ofi246
tengine[6342]: 2010/11/30_09:58:47 info: send_rsc_command:actions.c Initiating
action 6: ip_1_stop_0 on ofi249
tengine[6342]: 2010/11/30_09:58:47 info: send_rsc_command:actions.c Initiating
action 3: probe_complete on ofi249
attrd[4254]: 2010/11/30_12:23:44 info: main:attrd.c Exiting...
crmd[4255]: 2010/11/30_12:23:44 info: crm_shutdown:control.c Requesting shutdown
crmd[4255]: 2010/11/30_12:23:44 ERROR: cib_native_msgready:cib_native.c Message
pending on command channel [4251]
crmd[4255]: 2010/11/30_12:23:44 ERROR: #========= cib:cmd message start
==========#
crmd[4255]: 2010/11/30_12:23:44 ERROR: MSG: No message to dump
crmd[4255]: 2010/11/30_12:23:44 info: cib_native_msgready:cib_native.c Lost
connection to the CIB service [4251].
crmd[4255]: 2010/11/30_12:23:44 CRIT: cib_native_dispatch:cib_native.c Lost
connection to the CIB service [4251/callback].
crmd[4255]: 2010/11/30_12:23:44 ERROR: crmd_cib_connection_destroy:callbacks.c
Connection to the CIB terminated...
crmd[4255]: 2010/11/30_12:23:44 info: do_shutdown_req:control.c Sending
shutdown request to DC: ofi246
crmd[4255]: 2010/11/30_12:23:44 ERROR: do_log:misc.c [[FSA]] Input I_ERROR from
crmd_cib_connection_destroy() received in state (S_NOT_DC)
crmd[4255]: 2010/11/30_12:23:44 info: do_state_transition:fsa.c ofi249: State
transition S_NOT_DC -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=crmd_cib_connection_destroy ]
crmd[4255]: 2010/11/30_12:23:44 ERROR: do_recover:control.c Action A_RECOVER
(0000000001000000) not supported
crmd[4255]: 2010/11/30_12:23:44 ERROR: do_log:misc.c [[FSA]] Input I_STOP from
do_recover() received in state (S_RECOVERY)
crmd[4255]: 2010/11/30_12:23:44 info: do_state_transition:fsa.c ofi249: State
transition S_RECOVERY -> S_STOPPING [ input=I_STOP cause=C_FSA_INTERNAL
origin=do_recover ]
cib[4251]: 2010/11/30_12:23:44 info: mem_handle_func:IPC broken, ccm is dead
before the client!
crmd[4255]: 2010/11/30_12:23:44 info: do_dc_release:election.c DC role released
crmd[4255]: 2010/11/30_12:23:44 WARN: do_log:misc.c [[FSA]] Input
I_RELEASE_SUCCESS from do_dc_release() received in state (S_STOPPING)
crmd[4255]: 2010/11/30_12:23:44 info: do_state_transition:fsa.c ofi249: State
transition S_STOPPING -> S_TERMINATE [ input=I_TERMINATE cause=C_FSA_INTERNAL
origin=do_shutdown ]
cib[4251]: 2010/11/30_12:23:44 ERROR: cib_ccm_dispatch:callbacks.c CCM
connection appears to have failed: rc=-1.
mgmtd[4256]: 2010/11/30_12:23:44 info: mgmtd is shutting down
mgmtd[4256]: 2010/11/30_12:23:44 ERROR: cib_native_msgready:cib_native.c
Message pending on command channel [4251]
mgmtd[4256]: 2010/11/30_12:23:44 ERROR: #========= cib:cmd message start
==========#
mgmtd[4256]: 2010/11/30_12:23:44 ERROR: MSG: No message to dump
mgmtd[4256]: 2010/11/30_12:23:44 CRIT: cib_native_dispatch:cib_native.c Lost
connection to the CIB service [4251/callback].
mgmtd[4256]: 2010/11/30_12:23:44 ERROR: Connection to the CIB terminated...
exiting
crmd[4255]: 2010/11/30_12:23:44 info: stop_all_resources:lrm.c Making sure all
active resources are stopped before exit
heartbeat[4207]: 2010/11/30_12:23:44 info: killing /usr/lib64/heartbeat/mgmtd
-v process group 4256 with signal 15
heartbeat[4207]: 2010/11/30_12:23:44 info: killing /usr/lib64/heartbeat/crmd
process group 4255 with signal 15
lrmd[4252]: 2010/11/30_12:23:44 info: lrmd is shutting down
crmd[4255]: 2010/11/30_12:23:44 ERROR: lrm_get_all_rscs(596): failed to receive
a reply message of getall.
crmd[4255]: 2010/11/30_12:23:44 info: do_lrm_control:lrm.c Disconnected from
the LRM
crmd[4255]: 2010/11/30_12:23:44 info: do_ha_control:control.c Disconnected from
Heartbeat
crmd[4255]: 2010/11/30_12:23:44 info: do_cib_control:cib.c Disconnecting CIB
crmd[4255]: 2010/11/30_12:23:44 ERROR: crm_send_ipc_message:ipc.c IPC Channel
to 4251 is not connected
crmd[4255]: 2010/11/30_12:23:44 WARN: #========= IPC[outbound] message start
==========#
crmd[4255]: 2010/11/30_12:23:44 WARN: MSG: Dumping message with 4 fields
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems