On Wed, Dec 15, 2010 at 9:11 PM, Andrew Miklas <[email protected]> wrote:
> Hi,
>
> On 26-Nov-10, at 1:41 AM, Andrew Beekhof wrote:
>
>>> The problem here is that these spurious node failures cause Pacemaker
>>> to initiate unnecessary resource migrations.  Is it normal for the
>>> cluster to become confused for a while when the network connection to
>>> a node is suddenly restored?
>>
>> It's normal for the CCM (part of heartbeat) and used to be normal for
>> corosync.
>> These days I think corosync does a better job in these scenarios.
>
> Does Pacemaker have an option to hold off on running the resource
> reassignments until the node membership as reported by Heartbeat /
> Corosync has stabilized?

Define "stable"?
Basically this is something you configure in corosync/heartbeat - by
the time they report a node as gone, it really should be gone.

If they're reporting other nodes disappearing when a node rejoins,
then that's a bug in their software.
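
With heartbeat the relevant knobs live in ha.cf.  Something along
these lines - the values are only an illustration, tune them for your
network:

    # /etc/ha.d/ha.cf (illustrative values)
    keepalive 2      # send a heartbeat every 2 seconds
    warntime 10      # log a warning when a heartbeat is 10s late
    deadtime 30      # only declare a node dead after 30s of silence
    initdead 120     # allow extra time while nodes are booting

The larger deadtime is relative to keepalive, the less likely a slow
or lossy link is to produce spurious membership changes.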

> Having a rejoining node cause the cluster to
> incorrectly drop good nodes seems like it's a big problem for
> installations that don't use STONITH to keep a once-dead node down.

There is a reason Red Hat and SUSE don't support installations that
don't use STONITH ;-)
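
If you do have fencing hardware, turning it on is one fencing resource
plus one property.  A sketch using the external/ipmi plugin - the
address and credentials below are placeholders:

    crm configure primitive fence-test1 stonith:external/ipmi \
        params hostname=test1 ipaddr=192.168.1.10 \
               userid=admin passwd=secret interface=lan
    crm configure property stonith-enabled=true

With that in place, a node that drops out of membership gets
power-cycled instead of lingering "unclean" the way test1 and test2 do
in your logs below.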

>
>
>>>  Or is this happening because using
>>> iptables is not a fair test of how the system will respond during a
>>> network split?
>>
>> Unless you've got _really_ quick hands, you're also creating an
>> asymmetric network split.
>> ie. A can see B but B can't see A.
>>
>> This would be causing additional confusion at the messaging level.
>
> I retested using ifdown / ifup to be sure I was getting a consistent
> split of the network.  Unfortunately, it still looks like it's a
> problem.
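
ifdown is the better test.  For the record, if you do want a symmetric
split with iptables you have to drop both directions at once, e.g. on
test3 (the address stands in for test2):

    iptables -I INPUT  -s 192.168.1.12 -j DROP
    iptables -I OUTPUT -d 192.168.1.12 -j DROP

Blocking only INPUT leaves exactly the A-can-see-B-but-not-vice-versa
situation described above.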
>
> Just in case it's helpful,

Not really, I'm afraid.
It's a well-known bug in the CCM - the problem is that no one knows
how it works.

Honestly, try corosync instead.
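
For reference, a minimal corosync.conf for running Pacemaker on
corosync 1.x looks roughly like this - the network values are
placeholders and the timeouts (in milliseconds) only illustrative:

    totem {
        version: 2
        token: 5000          # suspect a node after 5s without a token
        consensus: 6000      # must be larger than token
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.1.0
            mcastaddr: 226.94.1.1
            mcastport: 5405
        }
    }
    service {
        name: pacemaker      # load the Pacemaker plugin
        ver: 0
    }

Its membership layer is actively maintained, which is more than can be
said for the CCM.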

> here's a snippet from the logs showing node
> "test2" returning after a network outage.  You'll see that the node
> "test1" is incorrectly marked as failed for a while, even though the
> nodes test1, test3, and the test1 <-> test3 network connection was up
> at all times.  These logs are from an Ubuntu 10.04 system running the
> default versions of Pacemaker (1.0.8+hg15494-2ubuntu2) and Heartbeat
> (3.0.3-1ubuntu1).
>
> Dec 15 02:42:16 test3 heartbeat: [1109]: CRIT: Cluster node test2
> returning after partition.
> Dec 15 02:42:16 test3 heartbeat: [1109]: info: For information on
> cluster partitions, See URL: http://linux-ha.org/wiki/Split_Brain
> Dec 15 02:42:16 test3 heartbeat: [1109]: WARN: Deadtime value may be
> too small.
> Dec 15 02:42:16 test3 heartbeat: [1109]: info: See FAQ for information
> on tuning deadtime.
> Dec 15 02:42:16 test3 heartbeat: [1109]: info: URL: 
> http://linux-ha.org/wiki/FAQ#Heavy_Load
> Dec 15 02:42:16 test3 heartbeat: [1109]: info: Link test2:eth0 up.
> Dec 15 02:42:16 test3 heartbeat: [1109]: WARN: Late heartbeat: Node
> test2: interval 125220 ms
> Dec 15 02:42:16 test3 heartbeat: [1109]: info: Status update for node
> test2: status active
> Dec 15 02:42:16 test3 cib: [1124]: WARN: cib_peer_callback: Discarding
> cib_apply_diff message (6ba) from test2: not in our membership
> Dec 15 02:42:16 test3 crmd: [1128]: notice: crmd_ha_status_callback:
> Status update: Node test2 now has status [active] (DC=true)
> Dec 15 02:42:16 test3 crmd: [1128]: info: crm_update_peer_proc:
> test2.ais is now online
> Dec 15 02:42:17 test3 ccm: [1123]: debug: quorum plugin: majority
> Dec 15 02:42:17 test3 ccm: [1123]: debug: cluster:linux-ha,
> member_count=1, member_quorum_votes=100
> Dec 15 02:42:17 test3 ccm: [1123]: debug: total_node_count=3,
> total_quorum_votes=300
> Dec 15 02:42:17 test3 cib: [1124]: WARN: cib_peer_callback: Discarding
> cib_apply_diff message (6bc) from test2: not in our membership
> Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: Got an
> event OC_EV_MS_INVALID from ccm
> Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: no
> mbr_track info
> Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event: Got an
> event OC_EV_MS_INVALID from ccm
> Dec 15 02:42:17 test3 cib: [1124]: info: mem_handle_event:
> instance=33, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=4
> Dec 15 02:42:17 test3 cib: [1124]: info: cib_ccm_msg_callback:
> Processing CCM event=INVALID (id=33)
> Dec 15 02:42:17 test3 cib: [1124]: info: crm_update_peer: Node test1:
> id=0 state=lost (new) addr=(null) votes=-1 born=30 seen=32
> proc=00000000000000000000000000000302
> Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: Got an
> event OC_EV_MS_INVALID from ccm
> Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: no
> mbr_track info
> Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event: Got an
> event OC_EV_MS_INVALID from ccm
> Dec 15 02:42:17 test3 crmd: [1128]: info: mem_handle_event:
> instance=33, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=4
> Dec 15 02:42:17 test3 crmd: [1128]: info: crmd_ccm_msg_callback:
> Quorum lost after event=INVALID (id=33)
> Dec 15 02:42:17 test3 crmd: [1128]: info: crm_update_quorum: Updating
> quorum status to false (call=353)
> Dec 15 02:42:17 test3 crmd: [1128]: info: ccm_event_detail: INVALID:
> trans=33, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=4
> Dec 15 02:42:17 test3 crmd: [1128]: info: ccm_event_detail:
> #011CURRENT: test3 [nodeid=2, born=33]
> Dec 15 02:42:17 test3 crmd: [1128]: info: ccm_event_detail:
> #011LOST:    test1 [nodeid=0, born=30]
> Dec 15 02:42:17 test3 crmd: [1128]: info: ais_status_callback: status:
> test1 is now lost (was member)
>
> --> Notice here that test1 is now marked as down at 2:42:17
>
> Dec 15 02:42:17 test3 crmd: [1128]: info: crm_update_peer: Node test1:
> id=0 state=lost (new) addr=(null) votes=-1 born=30 seen=32
> proc=00000000000000000000000000000202
> Dec 15 02:42:17 test3 crmd: [1128]: info: erase_node_from_join:
> Removed node test1 from join calculations: welcomed=0 itegrated=0
> finalized=0 confirmed=1
> Dec 15 02:42:17 test3 crmd: [1128]: info: populate_cib_nodes_ha:
> Requesting the list of configured nodes
> Dec 15 02:42:17 test3 cib: [1124]: info: log_data_element: cib:diff: -
> <cib have-quorum="1" admin_epoch="1" epoch="279" num_updates="16" />
> Dec 15 02:42:17 test3 cib: [1124]: info: log_data_element: cib:diff: +
> <cib have-quorum="0" admin_epoch="1" epoch="280" num_updates="1" />
> Dec 15 02:42:17 test3 cib: [1124]: info: cib_process_request:
> Operation complete: op cib_modify for section cib (origin=local/crmd/
> 353, version=1.280.1): ok (rc=0)
> Dec 15 02:42:17 test3 cib: [1239]: info: write_cib_contents: Archived
> previous version as /var/lib/heartbeat/crm/cib-71.raw
> Dec 15 02:42:17 test3 cib: [1239]: info: write_cib_contents: Wrote
> version 1.280.0 of the CIB to disk (digest:
> 4ceccd803c6229ad6dd3f1b939e3cb06)
> Dec 15 02:42:17 test3 cib: [1239]: info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.MU18pF (digest: /var/
> lib/heartbeat/crm/cib.4gj2D3)
> Dec 15 02:42:19 test3 crmd: [1128]: info: abort_transition_graph:
> need_abort:59 - Triggered transition abort (complete=1) : Non-status
> change
> Dec 15 02:42:19 test3 crmd: [1128]: info: need_abort: Aborting on
> change to have-quorum
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: State
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
> cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: All 1
> cluster nodes are eligible to run resources.
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_pe_invoke: Query 356:
> Requesting the current CIB: S_POLICY_ENGINE
> Dec 15 02:42:19 test3 cib: [1124]: info: cib_process_request:
> Operation complete: op cib_modify for section nodes (origin=local/crmd/
> 354, version=1.280.1): ok (rc=0)
> Dec 15 02:42:19 test3 crmd: [1128]: WARN: match_down_event: No match
> for shutdown action on b3a4cc48-7707-4e47-aa10-3cd34230cebc
> Dec 15 02:42:19 test3 crmd: [1128]: info: te_update_diff: Stonith/
> shutdown of b3a4cc48-7707-4e47-aa10-3cd34230cebc not matched
> Dec 15 02:42:19 test3 crmd: [1128]: info: abort_transition_graph:
> te_update_diff:191 - Triggered transition abort (complete=1,
> tag=node_state, id=b3a4cc48-7707-4e47-aa10-3cd34230cebc, magic=NA,
> cib=1.280.2) : Node failure
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_pe_invoke: Query 357:
> Requesting the current CIB: S_POLICY_ENGINE
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_pe_invoke_callback:
> Invoking the PE: query=357, ref=pe_calc-dc-1292380939-203, seq=33,
> quorate=0
> Dec 15 02:42:19 test3 pengine: [1137]: info: unpack_config: Node
> scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: cluster_status: We do not
> have quorum - fencing and resource management disabled
> Dec 15 02:42:19 test3 pengine: [1137]: ERROR: unpack_resources:
> Resource start-up disabled since no STONITH resources have been defined
> Dec 15 02:42:19 test3 pengine: [1137]: ERROR: unpack_resources: Either
> configure some or disable STONITH with the stonith-enabled option
> Dec 15 02:42:19 test3 pengine: [1137]: ERROR: unpack_resources: NOTE:
> Clusters with shared data need STONITH to ensure data integrity
> Dec 15 02:42:19 test3 pengine: [1137]: info: determine_online_status:
> Node test3 is online
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: pe_fence_node: Node test2
> will be fenced because it is un-expectedly down
> Dec 15 02:42:19 test3 pengine: [1137]: info:
> determine_online_status_fencing: #011ha_state=active, ccm_state=false,
> crm_state=online, join_state=pending, expected=member
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: determine_online_status:
> Node test2 is unclean
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: pe_fence_node: Node test1
> will be fenced because it is un-expectedly down
> Dec 15 02:42:19 test3 pengine: [1137]: info:
> determine_online_status_fencing: #011ha_state=active, ccm_state=false,
> crm_state=online, join_state=member, expected=member
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: determine_online_status:
> Node test1 is unclean
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: stage6: Node test2 is
> unclean!
> Dec 15 02:42:19 test3 pengine: [1137]: WARN: stage6: Node test1 is
> unclean!
> Dec 15 02:42:19 test3 pengine: [1137]: notice: stage6: Cannot fence
> unclean nodes until quorum is attained (or no-quorum-policy is set to
> ignore)
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: State
> transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
> cause=C_IPC_MESSAGE origin=handle_response ]
> Dec 15 02:42:19 test3 crmd: [1128]: info: unpack_graph: Unpacked
> transition 22: 0 actions in 0 synapses
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_te_invoke: Processing
> graph 22 (ref=pe_calc-dc-1292380939-203) derived from /var/lib/pengine/
> pe-warn-499.bz2
> Dec 15 02:42:19 test3 crmd: [1128]: info: run_graph:
> ====================================================
> Dec 15 02:42:19 test3 crmd: [1128]: notice: run_graph: Transition 22
> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/
> lib/pengine/pe-warn-499.bz2): Complete
> Dec 15 02:42:19 test3 crmd: [1128]: info: te_graph_trigger: Transition
> 22 is now complete
> Dec 15 02:42:19 test3 crmd: [1128]: info: notify_crmd: Transition 22
> status: done - <null>
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition: State
> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
> cause=C_FSA_INTERNAL origin=notify_crmd ]
> Dec 15 02:42:19 test3 crmd: [1128]: info: do_state_transition:
> Starting PEngine Recheck Timer
> Dec 15 02:42:20 test3 pengine: [1137]: WARN: process_pe_message:
> Transition 22: WARNINGs found during PE processing. PEngine Input
> stored in: /var/lib/pengine/pe-warn-499.bz2
> Dec 15 02:42:20 test3 pengine: [1137]: info: process_pe_message:
> Configuration ERRORs found during PE processing.  Please run
> "crm_verify -L" to identify issues.
> Dec 15 02:42:20 test3 ccm: [1123]: debug: quorum plugin: majority
> Dec 15 02:42:20 test3 ccm: [1123]: debug: cluster:linux-ha,
> member_count=2, member_quorum_votes=200
> Dec 15 02:42:20 test3 ccm: [1123]: debug: total_node_count=3,
> total_quorum_votes=300
> Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: Got an
> event OC_EV_MS_INVALID from ccm
> Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: no
> mbr_track info
> Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event: Got an
> event OC_EV_MS_NEW_MEMBERSHIP from ccm
> Dec 15 02:42:20 test3 crmd: [1128]: info: mem_handle_event:
> instance=34, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=5
> Dec 15 02:42:20 test3 crmd: [1128]: info: crmd_ccm_msg_callback:
> Quorum (re)attained after event=NEW MEMBERSHIP (id=34)
> Dec 15 02:42:20 test3 crmd: [1128]: info: crm_update_quorum: Updating
> quorum status to true (call=358)
> Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail: NEW
> MEMBERSHIP: trans=34, nodes=2, new=1, lost=0 n_idx=0, new_idx=2,
> old_idx=5
> Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail:
> #011CURRENT: test3 [nodeid=2, born=1]
> Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail:
> #011CURRENT: test1 [nodeid=0, born=34]
> Dec 15 02:42:20 test3 crmd: [1128]: info: ccm_event_detail:
> #011NEW:     test1 [nodeid=0, born=34]
> Dec 15 02:42:20 test3 crmd: [1128]: info: ais_status_callback: status:
> test1 is now member (was lost)
>
> --> test1 is now back at 2:42:20
>
>
>
> -- Andrew
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
