Thank you Marian. I removed the file as you suggested, but unfortunately it has made no difference. The IP address is simply not being released when I stop the heartbeat process.
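For reference, this is roughly how I'm checking the interface state on the main node after the failed stop. The `ip addr del` line at the end is only a guess at a manual workaround (the /24 netmask is an assumption), not something I've verified on s390:

```shell
# After "service heartbeat stop" on the main node, the alias is still present:
ifconfig eth0:0
ip addr show dev eth0   # should still list 172.28.185.49 as a secondary

# Untested guess at removing the stale alias by hand
# (netmask assumed, not taken from the cluster config):
# ip addr del 172.28.185.49/24 dev eth0 label eth0:0
```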
Does anyone have any ideas about where I could start looking? The only way I can get the IP address released is to reboot the node. Thanks.

Marian Marinov wrote:
> On Saturday 20 March 2010 03:56:27 mike wrote:
>
>> Hi guys,
>>
>> I have a simple 2 node cluster with a VIP running on RHEL 5.3 on s390.
>> Nothing else configured yet.
>>
>> When I start up the cluster, all is well. The VIP starts up on the home
>> node and crm_mon shows the resource and nodes as online. No errors in
>> the logs.
>>
>> If I issue service heartbeat stop on the main node, the IP fails over to
>> the backup node and crm_mon shows what I would expect, i.e. the IP
>> address is on the backup node and the other node is offline. However,
>> if I do an ifconfig on the main node I see that the eth0:0 entry is
>> still there, so in effect the VIP is now on both servers.
>>
>> If both nodes were up and running and I rebooted the main node, then the
>> failover works perfectly.
>>
>> Would anyone know why the nodes seem unable to release the VIP unless
>> rebooted?
>>
>> ha-log:
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: stage6: Scheduling Node dbsuat1a.intranet.mydomain.com for shutdown
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: LogActions: Move resource IPaddr_172_28_185_49 (Started dbsuat1a.intranet.mydomain.com -> dbsuat1b.intranet.mydomain.com)
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Transition 5: PEngine Input stored in: /usr/var/lib/pengine/pe-input-337.bz2
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Configuration WARNINGs found during PE processing.
>> Please run "crm_verify -L" to identify issues.
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: unpack_graph: Unpacked transition 5: 5 actions in 5 synapses
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_invoke: Processing graph 5 (ref=pe_calc-dc-1269021312-26) derived from /usr/var/lib/pengine/pe-input-337.bz2
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_rsc_command: Initiating action 6: stop IPaddr_172_28_185_49_stop_0 on dbsuat1a.intranet.mydomain.com (local)
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_lrm_rsc_op: Performing key=6:5:0:888fa84e-3267-409e-966b-2ab01e579c0f op=IPaddr_172_28_185_49_stop_0 )
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: info: rsc:IPaddr_172_28_185_49:5: stop
>> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: process_lrm_event: LRM operation IPaddr_172_28_185_49_monitor_5000 (call=4, status=1, cib-update=0, confirmed=true) Cancelled
>> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: IPaddr_172_28_185_49:stop process (PID 5474) timed out (try 1). Killing with signal SIGTERM (15).
>> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: Managed IPaddr_172_28_185_49:stop process 5474 killed by signal 15 [SIGTERM - Termination (ANSI)].
>> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: operation stop[5] on ocf::IPaddr::IPaddr_172_28_185_49 for client 4531, its parameters: ip=[172.28.185.49] CRM_meta_timeout=[20000] crm_feature_set=[3.0.1] : pid [5474] timed out
>> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com crmd: [4531]: ERROR: process_lrm_event: LRM operation IPaddr_172_28_185_49_stop_0 (5) Timed Out (timeout=20000ms)
>> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: status_from_rc: Action 6 (IPaddr_172_28_185_49_stop_0) on dbsuat1a.intranet.mydomain.com failed (target: 0 vs. rc: -2): Error
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: update_failcount: Updating failcount for IPaddr_172_28_185_49 on dbsuat1a.intranet.mydomain.com after failed stop: rc=-2 (update=INFINITY, time=1269021333)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: abort_transition_graph: match_graph_event:272 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=IPaddr_172_28_185_49_stop_0, magic=2:-2;6:5:0:888fa84e-3267-409e-966b-2ab01e579c0f, cib=0.23.16) : Event failed
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: update_abort_priority: Abort priority upgraded from 0 to 1
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: update_abort_priority: Abort action done superceeded by restart
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: match_graph_event: Action IPaddr_172_28_185_49_stop_0 (6) confirmed on dbsuat1a.intranet.mydomain.com (rc=4)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: run_graph: ====================================================
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: notice: run_graph: Transition 5 (Complete=1, Pending=0, Fired=0, Skipped=4, Incomplete=0, Source=/usr/var/lib/pengine/pe-input-337.bz2): Stopped
>> Mar 19 13:55:33
>> DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_graph_trigger: Transition 5 is now complete
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: find_hash_entry: Creating hash entry for fail-count-IPaddr_172_28_185_49
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-IPaddr_172_28_185_49 (INFINITY)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_perform_update: Sent update 24: fail-count-IPaddr_172_28_185_49=INFINITY
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke: Query 53: Requesting the current CIB: S_POLICY_ENGINE
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: find_hash_entry: Creating hash entry for last-failure-IPaddr_172_28_185_49
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=1, tag=transient_attributes, id=db80324b-c9de-4995-a66a-eedf93abb42c, magic=NA, cib=0.23.17) : Transient attribute: update
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-IPaddr_172_28_185_49 (1269021333)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_perform_update: Sent update 27: last-failure-IPaddr_172_28_185_49=1269021333
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1269021333-28,
>> seq=2, quorate=1
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=1, tag=transient_attributes, id=db80324b-c9de-4995-a66a-eedf93abb42c, magic=NA, cib=0.23.18) : Transient attribute: update
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgrading transitional-0.6-style configuration to pacemaker-1.0 with /usr/share/pacemaker/upgrade06.xsl
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: update_validation: Transformation /usr/share/pacemaker/upgrade06.xsl successful
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgraded from transitional-0.6 to pacemaker-1.0 validation
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: cli_config_update: Your configuration was internally updated to the latest version (pacemaker-1.0)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke: Query 54: Requesting the current CIB: S_POLICY_ENGINE
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke: Query 55: Requesting the current CIB: S_POLICY_ENGINE
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1269021333-29, seq=2, quorate=1
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1a.intranet.mydomain.com is shutting down
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: unpack_rsc_op: Processing failed op IPaddr_172_28_185_49_stop_0 on dbsuat1a.intranet.mydomain.com: unknown exec error (-2)
>> Mar 19 13:55:33
>> DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_add_running: resource IPaddr_172_28_185_49 isnt managed
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1b.intranet.mydomain.com is online
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: native_print: IPaddr_172_28_185_49 (ocf::heartbeat:IPaddr): Started dbsuat1a.intranet.mydomain.com (unmanaged) FAILED
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: get_failcount: IPaddr_172_28_185_49 has failed 1000000 times on dbsuat1a.intranet.mydomain.com
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: common_apply_stickiness: Forcing IPaddr_172_28_185_49 away from dbsuat1a.intranet.mydomain.com after 1000000 failures (max=1000000)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_color: Unmanaged resource IPaddr_172_28_185_49 allocated to 'nowhere': failed
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: stage6: Scheduling Node dbsuat1a.intranet.mydomain.com for shutdown
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: LogActions: Leave resource IPaddr_172_28_185_49 (Started unmanaged)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: handle_response: pe_calc calculation pe_calc-dc-1269021333-28 is obsolete
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Transition 6: PEngine Input stored in: /usr/var/lib/pengine/pe-input-338.bz2
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgrading transitional-0.6-style configuration to pacemaker-1.0 with /usr/share/pacemaker/upgrade06.xsl
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: update_validation: Transformation /usr/share/pacemaker/upgrade06.xsl successful
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgraded from transitional-0.6 to pacemaker-1.0 validation
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: cli_config_update: Your configuration was internally updated to the latest version (pacemaker-1.0)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1a.intranet.mydomain.com is shutting down
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: unpack_rsc_op: Processing failed op IPaddr_172_28_185_49_stop_0 on dbsuat1a.intranet.mydomain.com: unknown exec error (-2)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_add_running: resource IPaddr_172_28_185_49 isnt managed
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1b.intranet.mydomain.com is online
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: native_print: IPaddr_172_28_185_49 (ocf::heartbeat:IPaddr): Started dbsuat1a.intranet.mydomain.com (unmanaged) FAILED
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: get_failcount: IPaddr_172_28_185_49 has failed 1000000 times on dbsuat1a.intranet.mydomain.com
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: common_apply_stickiness: Forcing IPaddr_172_28_185_49
>> away from dbsuat1a.intranet.mydomain.com after 1000000 failures (max=1000000)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_color: Unmanaged resource IPaddr_172_28_185_49 allocated to 'nowhere': failed
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: stage6: Scheduling Node dbsuat1a.intranet.mydomain.com for shutdown
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: LogActions: Leave resource IPaddr_172_28_185_49 (Started unmanaged)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Transition 7: PEngine Input stored in: /usr/var/lib/pengine/pe-input-339.bz2
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: unpack_graph: Unpacked transition 7: 1 actions in 1 synapses
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_invoke: Processing graph 7 (ref=pe_calc-dc-1269021333-29) derived from /usr/var/lib/pengine/pe-input-339.bz2
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_crm_command: Executing crm-event (10): do_shutdown on dbsuat1a.intranet.mydomain.com
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_crm_command: crm-event (10) is a local shutdown
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: run_graph: ====================================================
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: notice: run_graph: Transition 7 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/usr/var/lib/pengine/pe-input-339.bz2): Complete
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_graph_trigger: Transition 7 is now complete
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_STOPPING [ input=I_STOP cause=C_FSA_INTERNAL origin=notify_crmd ]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_dc_release: DC role released
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: stop_subsystem: Sent -TERM to pengine: [4714]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_control: Transitioner is now inactive
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_control: Disconnecting STONITH...
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: tengine_stonith_connection_destroy: Fencing daemon disconnected
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: notice: Not currently connected.
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Terminating the pengine
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: stop_subsystem: Sent -TERM to pengine: [4714]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Waiting for subsystems to exit
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: register_fsa_input_adv: do_shutdown stalled the FSA with pending inputs
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: All subsystems stopped, continuing
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: do_log: FSA: Input I_RELEASE_SUCCESS from do_dc_release() received in state S_STOPPING
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Terminating the pengine
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: stop_subsystem: Sent -TERM to pengine: [4714]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Waiting for subsystems to exit
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: All subsystems stopped, continuing
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: crmdManagedChildDied: Process pengine:[4714] exited (signal=0, exitcode=0)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: pe_msg_dispatch: Received HUP from pengine:[4714]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: pe_connection_destroy: Connection to the Policy Engine released
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: All subsystems stopped,
>> continuing
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: ERROR: verify_stopped: Resource IPaddr_172_28_185_49 was active at shutdown. You may ignore this error if it is unmanaged.
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_lrm_control: Disconnected from the LRM
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com ccm: [4526]: info: client (pid=4531) removed from ccm
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_ha_control: Disconnected from Heartbeat
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_cib_control: Disconnecting CIB
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_readwrite: We are now in R/O mode
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: free_mem: Dropping I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_exit: [crmd] stopped (0)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/attrd process group 4530 with signal 15
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_shutdown: Exiting
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: main: Exiting...
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_cib_connection_destroy: Connection to the CIB terminated...
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/stonithd process group 4529 with signal 15
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com stonithd: [4529]: notice: /usr/lib64/heartbeat/stonithd normally quit.
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/lrmd -r process group 4528 with signal 15
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: info: lrmd is shutting down
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: resource IPaddr_172_28_185_49 is left in UNKNOWN status.(last op stop finished without LRM_OP_DONE status.)
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/cib process group 4527 with signal 15
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_shutdown: Disconnected 0 clients
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_disconnect: All clients disconnected...
>> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: initiate_exit: Sending disconnect notification to 2 peers...
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_shutdown_req: Shutdown ACK from dbsuat1b.intranet.mydomain.com
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: terminate_cib: cib_process_shutdown_req: Disconnecting heartbeat
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: terminate_cib: Exiting...
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_request: Operation complete: op cib_shutdown_req for section 'all' (origin=dbsuat1b.intranet.mydomain.com/dbsuat1b.intranet.mydomain.com/(null), version=0.0.0): ok (rc=0)
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: ha_msg_dispatch: Lost connection to heartbeat service.
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: main: Done
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com ccm: [4526]: info: client (pid=4527) removed from ccm
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/ccm process group 4526 with signal 15
>> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com ccm: [4526]: info: received SIGTERM, going to shut down
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing HBFIFO process 4522 with signal 15
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing HBWRITE process 4523 with signal 15
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing HBREAD process 4524 with signal 15
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: Core process 4524 exited. 3 remaining
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: Core process 4523 exited. 2 remaining
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: Core process 4522 exited. 1 remaining
>> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: dbsuat1a.intranet.mydomain.com Heartbeat shutdown complete.
>>
>> my ha.cf:
>>
>> # Logging
>> debug 1
>> debugfile /var/log/ha-debug
>> logfile /var/log/ha-log
>> logfacility local0
>> #use_logd true
>> #logfacility daemon
>>
>> # Misc Options
>> traditional_compression off
>> compression bz2
>> coredumps true
>>
>> # Communications
>> udpport 691
>> bcast eth0
>> ##autojoin any
>> autojoin none
>>
>> # Thresholds (in seconds)
>> keepalive 1
>> warntime 6
>> deadtime 10
>> initdead 15
>>
>> node dbsuat1a.intranet.mydomain.com
>> node dbsuat1b.intranet.mydomain.com
>> #enable pacemaker
>> crm yes
>> #enable STONITH
>> #crm respawn
>>
>> my haresources:
>> DBSUAT1A.intranet.mydomain.com 172.28.185.49
>>
>
> I don't have very good advice, but you shouldn't use haresources anymore.
> You should use Pacemaker to configure the cluster.
>
> You have said that you wish to use Pacemaker (crm) with this line of your
> config: crm yes
>
> Remove the haresources file, restart heartbeat on both nodes and redo the
> tests.
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
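P.S. For the archives, a rough crm-shell equivalent of the old haresources line, per Marian's suggestion to configure the VIP through Pacemaker instead. This is only a sketch: the monitor interval and the raised stop timeout are my assumptions (the log shows the stop timing out at the default 20000ms), not values from the original configuration:

```shell
# Hypothetical Pacemaker definition of the VIP via the crm shell.
# Uses the same agent (ocf:heartbeat:IPaddr) seen in ha-log; the
# timeout/interval values below are assumed, not taken from the cluster.
crm configure primitive IPaddr_172_28_185_49 ocf:heartbeat:IPaddr \
    params ip=172.28.185.49 \
    op monitor interval=5s timeout=20s \
    op stop interval=0 timeout=60s
```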
