Hi,

On 26-Nov-10, at 1:41 AM, Andrew Beekhof wrote:

>> The problem here is that these spurious node failures cause Pacemaker
>> to initiate unnecessary resource migrations. Is it normal for the
>> cluster to become confused for a while when the network connection to
>> a node is suddenly restored?
>
> It's normal for the CCM (part of Heartbeat), and it used to be normal
> for corosync. These days I think corosync does a better job in these
> scenarios.

Does Pacemaker have an option to hold off on running the resource
reassignments until the node membership, as reported by Heartbeat /
Corosync, has stabilized? Having a rejoining node cause the cluster to
incorrectly drop good nodes seems like a big problem for installations
that don't use STONITH to keep a once-dead node down.
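For what it's worth, the closest knobs I can find are the ones the
pengine complains about in the logs below, plus Heartbeat's deadtime.
This is only a sketch of what I mean, not a confirmed answer; the
property names are real, but the values are guesses for a STONITH-less
setup like ours:

    # crm shell syntax:
    crm configure property stonith-enabled=false    # we have no STONITH devices
    crm configure property no-quorum-policy=freeze  # don't recover resources while quorum is lost

    # and in /etc/ha.d/ha.cf, a larger deadtime might ride out brief
    # membership flaps (the value 30, in seconds, is arbitrary):
    deadtime 30

As far as I can tell, none of these actually delays the membership
decision itself; they only limit what the cluster does once it has
(wrongly) declared a node dead.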
>> Or is this happening because using iptables is not a fair test of
>> how the system will respond during a network split?
>
> Unless you've got _really_ quick hands, you're also creating an
> asymmetric network split.
> i.e. A can see B but B can't see A.
>
> This would be causing additional confusion at the messaging level.

I retested using ifdown / ifup to be sure I was getting a consistent,
symmetric split of the network. Unfortunately, the same problem still
occurs.
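For the record, here's roughly what the two tests looked like on the
node being isolated. The interface name eth0 matches the "Link
test2:eth0 up" line in the logs below; the peer address is just an
example, not our real one:

    # Original test: two separate iptables commands, which is what
    # creates the asymmetric window Andrew describes (traffic gets
    # blocked in one direction before the other):
    iptables -A INPUT  -s 192.168.122.12 -j DROP
    iptables -A OUTPUT -d 192.168.122.12 -j DROP

    # Retest: ifdown severs both directions at once, giving a clean,
    # symmetric split. The ~2 minute outage roughly matches the
    # "interval 125220 ms" line in the logs below:
    ifdown eth0
    sleep 120
    ifup eth0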
Just in case it's helpful, here's a snippet from the logs showing node
"test2" returning after a network outage. You'll see that node "test1"
is incorrectly marked as failed for a while, even though test1, test3,
and the test1 <-> test3 network connection were up at all times. These
logs are from an Ubuntu 10.04 system running the default versions of
Pacemaker (1.0.8+hg15494-2ubuntu2) and Heartbeat (3.0.3-1ubuntu1).

Dec 15 02:42:16 test3 heartbeat: [1109]: CRIT: Cluster node test2 returning after partition.
Dec 15 02:42:16 test3 heartbeat: [1109]: info: For information on cluster partitions, See URL: http://linux-ha.org/wiki/Split_Brain
Dec 15 02:42:16 test3 heartbeat: [1109]: WARN: Deadtime value may be too small.
Dec 15 02:42:16 test3 heartbeat: [1109]: info: See FAQ for information on tuning deadtime.
Dec 15 02:42:16 test3 heartbeat: [1109]: info: URL: http://linux-ha.org/wiki/FAQ#Heavy_Load
Dec 15 02:42:16 test3 heartbeat: [1109]: info: Link test2:eth0 up.
Dec 15 02:42:16 test3 heartbeat: [1109]: WARN: Late heartbeat: Node test2: interval 125220 ms
Dec 15 02:42:16 test3 heartbeat: [1109]: info: Status update for node test2: status active
Dec 15 02:42:16 test3 cib: [1124]: WARN: cib_peer_callback: Discarding cib_apply_diff message (6ba) from test2: not in our membership
Dec 15 02:42:16 test3 crmd: [1128]: notice: crmd_ha_status_callback: Status update: Node test2 now has status [active] (DC=true)
Dec 15 02:42:16 test3 crmd: [1128]: info: crm_update_peer_proc: test2.ais is now online
Dec 15 02:42:17 test3 ccm: [1123]: debug: quorum plugin: majority
Dec 15 02:42:17 test3 ccm: [1123]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Dec 15 02:42:17 test3 ccm: [1123]: debug: total_node_count=3, total_quorum_votes=300
Dec 15 02:42:17 test3 cib: [1124]: WARN: cib_peer_callback: Discarding cib_apply_diff message (6bc) from test2: not in our membership
Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: no mbr_track info
Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: instance=33, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=4
Dec 15 02:42:17 test3 cib: [1124]: info: cib_ccm_msg_callback: Processing CCM event=INVALID (id=33)
Dec 15 02:42:17 test3 cib: [1124]: info: crm_update_peer: Node test1: id=0 state=lost (new) addr=(null) votes=-1 born=30 seen=32 proc=00000000000000000000000000000302
Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: no mbr_track info
Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: instance=33, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=4
Dec 15 02:42:17 test3 crmd: [1128]: info: crmd_ccm_msg_callback: Quorum lost after event=INVALID (id=33)
Dec 15 02:42:17 test3 crmd: [1128]: info: crm_update_quorum: Updating quorum status to false (call=353)
Dec 15 02:42:17 test3 crmd: [1128]: info: ccm_event_detail: INVALID: trans=33, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=4
Dec 15 02:42:17 test3 crmd: [1128]: info: ccm_event_detail: #011CURRENT: test3 [nodeid=2, born=33]
Dec 15 02:42:17 test3 crmd: [1128]: info: ccm_event_detail: #011LOST: test1 [nodeid=0, born=30]
Dec 15 02:42:17 test3 crmd: [1128]: info: ais_status_callback: status: test1 is now lost (was member)   ---> Notice here that test1 is now marked as down @ 2:42:17
Dec 15 02:42:17 test3 crmd: [1128]: info: crm_update_peer: Node test1: id=0 state=lost (new) addr=(null) votes=-1 born=30 seen=32 proc=00000000000000000000000000000202
Dec 15 02:42:17 test3 crmd: [1128]: info: erase_node_from_join: Removed node test1 from join calculations: welcomed=0 itegrated=0 finalized=0 confirmed=1
Dec 15 02:42:17 test3 crmd: [1128]: info: populate_cib_nodes_ha: Requesting the list of configured nodes
Dec 15 02:42:17 test3 cib: [1124]: info: log_data_element: cib:diff: - <cib have-quorum="1" admin_epoch="1" epoch="279" num_updates="16" />
Dec 15 02:42:17 test3 cib: [1124]: info: log_data_element: cib:diff: + <cib have-quorum="0" admin_epoch="1" epoch="280" num_updates="1" />
Dec 15 02:42:17 test3 cib: [1124]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/353, version=1.280.1): ok (rc=0)
Dec 15 02:42:17 test3 cib: [1239]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-71.raw
Dec 15 02:42:17 test3 cib: [1239]: info: write_cib_contents: Wrote version 1.280.0 of the CIB to disk (digest: 4ceccd803c6229ad6dd3f1b939e3cb06)
Dec 15 02:42:17 test3 cib: [1239]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.MU18pF (digest: /var/lib/heartbeat/crm/cib.4gj2D3)
Dec 15 02:42:19 test3 crmd: [1128]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Dec 15 02:42:19 test3 crmd: [1128]: info: need_abort: Aborting on change to have-quorum
Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
Dec 15 02:42:19 test3 crmd: [1128]: info: do_pe_invoke: Query 356: Requesting the current CIB: S_POLICY_ENGINE
Dec 15 02:42:19 test3 cib: [1124]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/354, version=1.280.1): ok (rc=0)
Dec 15 02:42:19 test3 crmd: [1128]: WARN: match_down_event: No match for shutdown action on b3a4cc48-7707-4e47-aa10-3cd34230cebc
Dec 15 02:42:19 test3 crmd: [1128]: info: te_update_diff: Stonith/shutdown of b3a4cc48-7707-4e47-aa10-3cd34230cebc not matched
Dec 15 02:42:19 test3 crmd: [1128]: info: abort_transition_graph: te_update_diff:191 - Triggered transition abort (complete=1, tag=node_state, id=b3a4cc48-7707-4e47-aa10-3cd34230cebc, magic=NA, cib=1.280.2) : Node failure
Dec 15 02:42:19 test3 crmd: [1128]: info: do_pe_invoke: Query 357: Requesting the current CIB: S_POLICY_ENGINE
Dec 15 02:42:19 test3 crmd: [1128]: info: do_pe_invoke_callback: Invoking the PE: query=357, ref=pe_calc-dc-1292380939-203, seq=33, quorate=0
Dec 15 02:42:19 test3 pengine: [1137]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Dec 15 02:42:19 test3 pengine: [1137]: WARN: cluster_status: We do not have quorum - fencing and resource management disabled
Dec 15 02:42:19 test3 pengine: [1137]: ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
Dec 15 02:42:19 test3 pengine: [1137]: ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
Dec 15 02:42:19 test3 pengine: [1137]: ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Dec 15 02:42:19 test3 pengine: [1137]: info: determine_online_status: Node test3 is online
Dec 15 02:42:19 test3 pengine: [1137]: WARN: pe_fence_node: Node test2 will be fenced because it is un-expectedly down
Dec 15 02:42:19 test3 pengine: [1137]: info: determine_online_status_fencing: #011ha_state=active, ccm_state=false, crm_state=online, join_state=pending, expected=member
Dec 15 02:42:19 test3 pengine: [1137]: WARN: determine_online_status: Node test2 is unclean
Dec 15 02:42:19 test3 pengine: [1137]: WARN: pe_fence_node: Node test1 will be fenced because it is un-expectedly down
Dec 15 02:42:19 test3 pengine: [1137]: info: determine_online_status_fencing: #011ha_state=active, ccm_state=false, crm_state=online, join_state=member, expected=member
Dec 15 02:42:19 test3 pengine: [1137]: WARN: determine_online_status: Node test1 is unclean
Dec 15 02:42:19 test3 pengine: [1137]: WARN: stage6: Node test2 is unclean!
Dec 15 02:42:19 test3 pengine: [1137]: WARN: stage6: Node test1 is unclean!
Dec 15 02:42:19 test3 pengine: [1137]: notice: stage6: Cannot fence unclean nodes until quorum is attained (or no-quorum-policy is set to ignore)
Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Dec 15 02:42:19 test3 crmd: [1128]: info: unpack_graph: Unpacked transition 22: 0 actions in 0 synapses
Dec 15 02:42:19 test3 crmd: [1128]: info: do_te_invoke: Processing graph 22 (ref=pe_calc-dc-1292380939-203) derived from /var/lib/pengine/pe-warn-499.bz2
Dec 15 02:42:19 test3 crmd: [1128]: info: run_graph: ====================================================
Dec 15 02:42:19 test3 crmd: [1128]: notice: run_graph: Transition 22 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-warn-499.bz2): Complete
Dec 15 02:42:19 test3 crmd: [1128]: info: te_graph_trigger: Transition 22 is now complete
Dec 15 02:42:19 test3 crmd: [1128]: info: notify_crmd: Transition 22 status: done - <null>
Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: Starting PEngine Recheck Timer
Dec 15 02:42:20 test3 pengine: [1137]: WARN: process_pe_message: Transition 22: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-499.bz2
Dec 15 02:42:20 test3 pengine: [1137]: info: process_pe_message: Configuration ERRORs found during PE processing. Please run "crm_verify -L" to identify issues.
Dec 15 02:42:20 test3 ccm: [1123]: debug: quorum plugin: majority
Dec 15 02:42:20 test3 ccm: [1123]: debug: cluster:linux-ha, member_count=2, member_quorum_votes=200
Dec 15 02:42:20 test3 ccm: [1123]: debug: total_node_count=3, total_quorum_votes=300
Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: no mbr_track info
Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: instance=34, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=5
Dec 15 02:42:20 test3 crmd: [1128]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=34)
Dec 15 02:42:20 test3 crmd: [1128]: info: crm_update_quorum: Updating quorum status to true (call=358)
Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail: NEW MEMBERSHIP: trans=34, nodes=2, new=1, lost=0 n_idx=0, new_idx=2, old_idx=5
Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail: #011CURRENT: test3 [nodeid=2, born=1]
Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail: #011CURRENT: test1 [nodeid=0, born=34]
Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail: #011NEW: test1 [nodeid=0, born=34]
Dec 15 02:42:20 test3 crmd: [1128]: info: ais_status_callback: status: test1 is now member (was lost)   --> test1 is now back at 2:42:20

--
Andrew

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
