Can you please give us your CRM configuration?

Marian
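Something along these lines should capture it (a sketch — exact tool names depend on which pacemaker build you have; `cibadmin`, `crm_mon` and `crm_verify` ship with both the 0.6-transitional and 1.0 series):

    # dump the raw CIB XML so we can see the resource and its operation timeouts
    cibadmin -Q > /tmp/cib.xml
    # one-shot status snapshot
    crm_mon -1
    # also worth posting: the warnings the PE keeps mentioning in your log
    crm_verify -L -V

These run against the live cluster, so paste their output from the node that is failing to release the address.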
On Sunday 21 March 2010 23:30:46 mike wrote:
> Thank you Marian. I removed the file as you suggested, but unfortunately
> it has made no difference. The IP address is simply not being released
> when I stop the heartbeat process.
>
> Does anyone have any ideas where I could start to look? The only way I
> can get the IP address released is to reboot the node.
>
> thanks
>
> Marian Marinov wrote:
> > On Saturday 20 March 2010 03:56:27 mike wrote:
> >> Hi guys,
> >>
> >> I have a simple 2-node cluster with a VIP, running on RHEL 5.3 on s390.
> >> Nothing else is configured yet.
> >>
> >> When I start up the cluster, all is well. The VIP starts up on the home
> >> node and crm_mon shows the resource and nodes as online. There are no
> >> errors in the logs.
> >>
> >> If I issue "service heartbeat stop" on the main node, the IP fails over
> >> to the backup node and crm_mon shows what I would expect, i.e. the IP
> >> address is on the backup node and the other node is offline. However,
> >> if I run ifconfig on the main node I see that the eth0:0 entry is still
> >> there, so in effect the VIP is now on both servers.
> >>
> >> If both nodes are up and running and I reboot the main node, the
> >> failover works perfectly.
> >>
> >> Would anyone know why the node seems unable to release the VIP unless
> >> rebooted?
> >>
> >> ha-log:
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: stage6: Scheduling Node dbsuat1a.intranet.mydomain.com for shutdown
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: LogActions: Move resource IPaddr_172_28_185_49 (Started dbsuat1a.intranet.mydomain.com -> dbsuat1b.intranet.mydomain.com)
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Transition 5: PEngine Input stored in: /usr/var/lib/pengine/pe-input-337.bz2
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: unpack_graph: Unpacked transition 5: 5 actions in 5 synapses
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_invoke: Processing graph 5 (ref=pe_calc-dc-1269021312-26) derived from /usr/var/lib/pengine/pe-input-337.bz2
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_rsc_command: Initiating action 6: stop IPaddr_172_28_185_49_stop_0 on dbsuat1a.intranet.mydomain.com (local)
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_lrm_rsc_op: Performing key=6:5:0:888fa84e-3267-409e-966b-2ab01e579c0f op=IPaddr_172_28_185_49_stop_0 )
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: info: rsc:IPaddr_172_28_185_49:5: stop
> >> Mar 19 13:55:12 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: process_lrm_event: LRM operation IPaddr_172_28_185_49_monitor_5000 (call=4, status=1, cib-update=0, confirmed=true) Cancelled
> >> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: IPaddr_172_28_185_49:stop process (PID 5474) timed out (try 1). Killing with signal SIGTERM (15).
> >> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: Managed IPaddr_172_28_185_49:stop process 5474 killed by signal 15 [SIGTERM - Termination (ANSI)].
> >> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: operation stop[5] on ocf::IPaddr::IPaddr_172_28_185_49 for client 4531, its parameters: ip=[172.28.185.49] CRM_meta_timeout=[20000] crm_feature_set=[3.0.1] : pid [5474] timed out
> >> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com crmd: [4531]: ERROR: process_lrm_event: LRM operation IPaddr_172_28_185_49_stop_0 (5) Timed Out (timeout=20000ms)
> >> Mar 19 13:55:32 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: status_from_rc: Action 6 (IPaddr_172_28_185_49_stop_0) on dbsuat1a.intranet.mydomain.com failed (target: 0 vs. rc: -2): Error
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: update_failcount: Updating failcount for IPaddr_172_28_185_49 on dbsuat1a.intranet.mydomain.com after failed stop: rc=-2 (update=INFINITY, time=1269021333)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: abort_transition_graph: match_graph_event:272 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=IPaddr_172_28_185_49_stop_0, magic=2:-2;6:5:0:888fa84e-3267-409e-966b-2ab01e579c0f, cib=0.23.16) : Event failed
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: update_abort_priority: Abort priority upgraded from 0 to 1
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: update_abort_priority: Abort action done superceeded by restart
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: match_graph_event: Action IPaddr_172_28_185_49_stop_0 (6) confirmed on dbsuat1a.intranet.mydomain.com (rc=4)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: run_graph: ====================================================
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: notice: run_graph: Transition 5 (Complete=1, Pending=0, Fired=0, Skipped=4, Incomplete=0, Source=/usr/var/lib/pengine/pe-input-337.bz2): Stopped
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_graph_trigger: Transition 5 is now complete
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: find_hash_entry: Creating hash entry for fail-count-IPaddr_172_28_185_49
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-IPaddr_172_28_185_49 (INFINITY)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_perform_update: Sent update 24: fail-count-IPaddr_172_28_185_49=INFINITY
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke: Query 53: Requesting the current CIB: S_POLICY_ENGINE
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: find_hash_entry: Creating hash entry for last-failure-IPaddr_172_28_185_49
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=1, tag=transient_attributes, id=db80324b-c9de-4995-a66a-eedf93abb42c, magic=NA, cib=0.23.17) : Transient attribute: update
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-IPaddr_172_28_185_49 (1269021333)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_perform_update: Sent update 27: last-failure-IPaddr_172_28_185_49=1269021333
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1269021333-28, seq=2, quorate=1
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=1, tag=transient_attributes, id=db80324b-c9de-4995-a66a-eedf93abb42c, magic=NA, cib=0.23.18) : Transient attribute: update
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgrading transitional-0.6-style configuration to pacemaker-1.0 with /usr/share/pacemaker/upgrade06.xsl
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: update_validation: Transformation /usr/share/pacemaker/upgrade06.xsl successful
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgraded from transitional-0.6 to pacemaker-1.0 validation
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: cli_config_update: Your configuration was internally updated to the latest version (pacemaker-1.0)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke: Query 54: Requesting the current CIB: S_POLICY_ENGINE
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke: Query 55: Requesting the current CIB: S_POLICY_ENGINE
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1269021333-29, seq=2, quorate=1
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1a.intranet.mydomain.com is shutting down
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: unpack_rsc_op: Processing failed op IPaddr_172_28_185_49_stop_0 on dbsuat1a.intranet.mydomain.com: unknown exec error (-2)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_add_running: resource IPaddr_172_28_185_49 isnt managed
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1b.intranet.mydomain.com is online
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: native_print: IPaddr_172_28_185_49 (ocf::heartbeat:IPaddr): Started dbsuat1a.intranet.mydomain.com (unmanaged) FAILED
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: get_failcount: IPaddr_172_28_185_49 has failed 1000000 times on dbsuat1a.intranet.mydomain.com
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: common_apply_stickiness: Forcing IPaddr_172_28_185_49 away from dbsuat1a.intranet.mydomain.com after 1000000 failures (max=1000000)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_color: Unmanaged resource IPaddr_172_28_185_49 allocated to 'nowhere': failed
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: stage6: Scheduling Node dbsuat1a.intranet.mydomain.com for shutdown
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: LogActions: Leave resource IPaddr_172_28_185_49 (Started unmanaged)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: handle_response: pe_calc calculation pe_calc-dc-1269021333-28 is obsolete
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Transition 6: PEngine Input stored in: /usr/var/lib/pengine/pe-input-338.bz2
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgrading transitional-0.6-style configuration to pacemaker-1.0 with /usr/share/pacemaker/upgrade06.xsl
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: update_validation: Transformation /usr/share/pacemaker/upgrade06.xsl successful
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: update_validation: Upgraded from transitional-0.6 to pacemaker-1.0 validation
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: cli_config_update: Your configuration was internally updated to the latest version (pacemaker-1.0)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1a.intranet.mydomain.com is shutting down
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: unpack_rsc_op: Processing failed op IPaddr_172_28_185_49_stop_0 on dbsuat1a.intranet.mydomain.com: unknown exec error (-2)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_add_running: resource IPaddr_172_28_185_49 isnt managed
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: determine_online_status: Node dbsuat1b.intranet.mydomain.com is online
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: native_print: IPaddr_172_28_185_49 (ocf::heartbeat:IPaddr): Started dbsuat1a.intranet.mydomain.com (unmanaged) FAILED
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: get_failcount: IPaddr_172_28_185_49 has failed 1000000 times on dbsuat1a.intranet.mydomain.com
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: WARN: common_apply_stickiness: Forcing IPaddr_172_28_185_49 away from dbsuat1a.intranet.mydomain.com after 1000000 failures (max=1000000)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: native_color: Unmanaged resource IPaddr_172_28_185_49 allocated to 'nowhere': failed
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: stage6: Scheduling Node dbsuat1a.intranet.mydomain.com for shutdown
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: notice: LogActions: Leave resource IPaddr_172_28_185_49 (Started unmanaged)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Transition 7: PEngine Input stored in: /usr/var/lib/pengine/pe-input-339.bz2
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: unpack_graph: Unpacked transition 7: 1 actions in 1 synapses
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_invoke: Processing graph 7 (ref=pe_calc-dc-1269021333-29) derived from /usr/var/lib/pengine/pe-input-339.bz2
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_crm_command: Executing crm-event (10): do_shutdown on dbsuat1a.intranet.mydomain.com
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_crm_command: crm-event (10) is a local shutdown
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: run_graph: ====================================================
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: notice: run_graph: Transition 7 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/usr/var/lib/pengine/pe-input-339.bz2): Complete
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: te_graph_trigger: Transition 7 is now complete
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_STOPPING [ input=I_STOP cause=C_FSA_INTERNAL origin=notify_crmd ]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_dc_release: DC role released
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: stop_subsystem: Sent -TERM to pengine: [4714]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_control: Transitioner is now inactive
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com pengine: [4714]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_te_control: Disconnecting STONITH...
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: tengine_stonith_connection_destroy: Fencing daemon disconnected
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: notice: Not currently connected.
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Terminating the pengine
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: stop_subsystem: Sent -TERM to pengine: [4714]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Waiting for subsystems to exit
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: register_fsa_input_adv: do_shutdown stalled the FSA with pending inputs
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: All subsystems stopped, continuing
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: WARN: do_log: FSA: Input I_RELEASE_SUCCESS from do_dc_release() received in state S_STOPPING
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Terminating the pengine
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: stop_subsystem: Sent -TERM to pengine: [4714]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: Waiting for subsystems to exit
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: All subsystems stopped, continuing
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: crmdManagedChildDied: Process pengine:[4714] exited (signal=0, exitcode=0)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: pe_msg_dispatch: Received HUP from pengine:[4714]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: pe_connection_destroy: Connection to the Policy Engine released
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_shutdown: All subsystems stopped, continuing
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: ERROR: verify_stopped: Resource IPaddr_172_28_185_49 was active at shutdown. You may ignore this error if it is unmanaged.
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_lrm_control: Disconnected from the LRM
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com ccm: [4526]: info: client (pid=4531) removed from ccm
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_ha_control: Disconnected from Heartbeat
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_cib_control: Disconnecting CIB
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_readwrite: We are now in R/O mode
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: free_mem: Dropping I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com crmd: [4531]: info: do_exit: [crmd] stopped (0)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/attrd process group 4530 with signal 15
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_shutdown: Exiting
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: main: Exiting...
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com attrd: [4530]: info: attrd_cib_connection_destroy: Connection to the CIB terminated...
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/stonithd process group 4529 with signal 15
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com stonithd: [4529]: notice: /usr/lib64/heartbeat/stonithd normally quit.
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/lrmd -r process group 4528 with signal 15
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: info: lrmd is shutting down
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com lrmd: [4528]: WARN: resource IPaddr_172_28_185_49 is left in UNKNOWN status.(last op stop finished without LRM_OP_DONE status.)
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/cib process group 4527 with signal 15
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_shutdown: Disconnected 0 clients
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_disconnect: All clients disconnected...
> >> Mar 19 13:55:33 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: initiate_exit: Sending disconnect notification to 2 peers...
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_shutdown_req: Shutdown ACK from dbsuat1b.intranet.mydomain.com
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: terminate_cib: cib_process_shutdown_req: Disconnecting heartbeat
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: terminate_cib: Exiting...
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: cib_process_request: Operation complete: op cib_shutdown_req for section 'all' (origin=dbsuat1b.intranet.mydomain.com/dbsuat1b.intranet.mydomain.com/(null), version=0.0.0): ok (rc=0)
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: ha_msg_dispatch: Lost connection to heartbeat service.
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com cib: [4527]: info: main: Done
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com ccm: [4526]: info: client (pid=4527) removed from ccm
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing /usr/lib64/heartbeat/ccm process group 4526 with signal 15
> >> Mar 19 13:55:34 DBSUAT1A.intranet.mydomain.com ccm: [4526]: info: received SIGTERM, going to shut down
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing HBFIFO process 4522 with signal 15
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing HBWRITE process 4523 with signal 15
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: killing HBREAD process 4524 with signal 15
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: Core process 4524 exited. 3 remaining
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: Core process 4523 exited. 2 remaining
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: Core process 4522 exited. 1 remaining
> >> Mar 19 13:55:35 DBSUAT1A.intranet.mydomain.com heartbeat: [4519]: info: dbsuat1a.intranet.mydomain.com Heartbeat shutdown complete.
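Side note: the decisive events are buried in that excerpt (the stop operation on the IPaddr resource timed out at 20s, so the eth0:0 alias was never removed). A log that size can be narrowed down before posting; a sketch, using a hypothetical sample file standing in for /var/log/ha-log (point the grep at the real file on the node):

```shell
# Write a small stand-in for /var/log/ha-log with the pattern we care about.
cat > /tmp/ha-log.sample <<'EOF'
Mar 19 13:55:12 DBSUAT1A lrmd: [4528]: info: rsc:IPaddr_172_28_185_49:5: stop
Mar 19 13:55:32 DBSUAT1A lrmd: [4528]: WARN: IPaddr_172_28_185_49:stop process (PID 5474) timed out (try 1). Killing with signal SIGTERM (15).
Mar 19 13:55:32 DBSUAT1A crmd: [4531]: ERROR: process_lrm_event: LRM operation IPaddr_172_28_185_49_stop_0 (5) Timed Out (timeout=20000ms)
EOF

# Count stop operations that timed out (covers both the lrmd and crmd spellings).
grep -c 'stop.*[Tt]imed [Oo]ut' /tmp/ha-log.sample   # prints 2
```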
> >>
> >> my ha.cf:
> >> # Logging
> >> debug 1
> >> debugfile /var/log/ha-debug
> >> logfile /var/log/ha-log
> >> logfacility local0
> >> #use_logd true
> >> #logfacility daemon
> >>
> >> # Misc Options
> >> traditional_compression off
> >> compression bz2
> >> coredumps true
> >>
> >> # Communications
> >> udpport 691
> >> bcast eth0
> >> ##autojoin any
> >> autojoin none
> >>
> >> # Thresholds (in seconds)
> >> keepalive 1
> >> warntime 6
> >> deadtime 10
> >> initdead 15
> >>
> >> node dbsuat1a.intranet.mydomain.com
> >> node dbsuat1b.intranet.mydomain.com
> >> #enable pacemaker
> >> crm yes
> >> #enable STONITH
> >> #crm respawn
> >>
> >> my haresources:
> >> DBSUAT1A.intranet.mydomain.com 172.28.185.49
> >
> > I don't have very good advice, but you shouldn't use haresources
> > anymore. You should use pacemaker for configuring the cluster.
> >
> > You have said that you wish to use pacemaker (crm) with this line of
> > your config: crm yes
> >
> > Remove the haresources file, restart heartbeat on both nodes and redo
> > the tests.
> >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems

-- 
Best regards,
Marian Marinov
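For reference, with haresources gone the VIP has to be defined in the CIB instead. If your pacemaker build ships the crm shell, the definition would look something like this — a sketch only, with the resource name and ip taken from the log above and a stop timeout I picked as an illustration (the log shows the default 20s stop timing out, so a larger value plus fixing whatever hangs the agent are both needed):

    # Sketch: define the VIP as a pacemaker resource instead of via haresources.
    crm configure primitive IPaddr_172_28_185_49 ocf:heartbeat:IPaddr \
        params ip=172.28.185.49 \
        op monitor interval=5s \
        op stop timeout=40s
    # verify what landed in the CIB
    crm configure show

The same resource can be created with cibadmin XML snippets if the crm shell isn't available on your 0.6-transitional install.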
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
