Bart,
All these stickiness settings and methods boil down to manipulating
one thing: the "score" assigned to each node.
To achieve any particular failover objective, you have to tweak the
settings so that each node's score produces your desired outcome. If
you want services to stick once they fail over and not fail back to
the primary node, your objective is to make the scores equal, or to
make the secondary node's score remain highest even after the primary
node comes back online.
The scoring calculation is described here:
http://www.linux-ha.org/ScoreCalculation
The part of it that is particularly germane, for a resource group:

  score = constraint_score
        + (num_group_resources * resource_stickiness)
        + (failcount * resource_failure_stickiness)

Without a group:

  score = constraint_score
        + resource_stickiness
        + (failcount * resource_failure_stickiness)
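A minimal sketch of that arithmetic, assuming stickiness is only added on the node where the group currently runs (all node names, constraint scores, and stickiness values below are made-up illustrations, not taken from Bart's cluster):

```python
# Sketch of the Heartbeat 2 per-node score formula for a resource group.
# Hypothetical numbers for illustration only.

def group_score(constraint_score, num_group_resources,
                resource_stickiness, failcount=0,
                resource_failure_stickiness=0):
    """score = constraint_score
             + num_group_resources * resource_stickiness
             + failcount * resource_failure_stickiness"""
    return (constraint_score
            + num_group_resources * resource_stickiness
            + failcount * resource_failure_stickiness)

# A two-resource group currently on node2 with stickiness 100 per
# resource; node1 is preferred by a location constraint of score 50.
# Stickiness is credited only to the node the group is running on.
on_node2 = group_score(constraint_score=0, num_group_resources=2,
                       resource_stickiness=100)   # 0 + 2*100 = 200
on_node1 = group_score(constraint_score=50, num_group_resources=2,
                       resource_stickiness=0)     # 50 + 0 = 50

# The group stays where it is because its current node scores higher,
# so no failback occurs when node1 comes back online.
print(on_node2 > on_node1)
```

With a large enough stickiness (INFINITY being the extreme case) the current node's score can never be beaten by the primary's constraint score alone, which is exactly the "stay put after failover" behaviour described above.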
And you can see the values currently assigned to all resources using:
ptest -Ls
while the CRM on a given node is online.
Cheers,
-- Alex
Bart Coninckx wrote:
Hi all,
I have a two-node Heartbeat cluster (version 2.0.8). I'd like to have the
resources fail over to the other node when rebooting a node or setting it
to standby. For this I've set "Resource Stickiness" to "INFINITY" in
hb_gui. At first, this seemed to do the trick, but when I reboot a node,
resources seem to fail back to the rebooted node instead of remaining on
the other node.
This is my cib.xml :
<cib admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="2"
cib_feature_revision="1.3" generated="true" ccm_transition="4"
dc_uuid="df9c672b-3644-4514-bc5a-31fc152d2dd4" epoch="58"
num_updates="5001" cib-last-written="Fri Nov 28 17:47:11 2008">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair name="default-resource-failure-stickiness"
id="id-default-resource-failure-stickiness" value="0"/>
<nvpair name="default-resource-stickiness"
id="cib-bootstrap-options-default-resource-stickiness" value="INFINITY"/>
<nvpair name="last-lrm-refresh"
id="cib-bootstrap-options-last-lrm-refresh" value="1227867724"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node uname="node2" type="normal"
id="df9c672b-3644-4514-bc5a-31fc152d2dd4">
<instance_attributes id="nodes-df9c672b-3644-4514-bc5a-31fc152d2dd4">
<attributes>
<nvpair name="standby"
id="standby-df9c672b-3644-4514-bc5a-31fc152d2dd4" value="off"/>
</attributes>
</instance_attributes>
</node>
<node uname="node1" type="normal"
id="72913bca-5fdd-4b03-a96b-a56ad634bfea">
<instance_attributes id="nodes-72913bca-5fdd-4b03-a96b-a56ad634bfea">
<attributes>
<nvpair name="standby"
id="standby-72913bca-5fdd-4b03-a96b-a56ad634bfea" value="off"/>
</attributes>
</instance_attributes>
</node>
</nodes>
<resources>
<master_slave id="drbd0">
<instance_attributes id="drbd0_instance_attrs">
<attributes>
<nvpair id="drbd0-clone-max" name="clone_max" value="2"/>
<nvpair id="drbd0-clone-node_max" name="clone_node_max"
value="1"/>
<nvpair id="drbd0-master-max" name="master_max" value="1"/>
<nvpair id="drbd0-master-node-max" name="master_node_max"
value="1"/>
<nvpair id="drbd0-notify" name="notify" value="yes"/>
<nvpair id="drbd0_target_role" name="target_role"
value="started"/>
</attributes>
</instance_attributes>
<primitive class="ocf" type="drbd" provider="heartbeat" id="drbd_r0">
<instance_attributes id="drbd_r0-instance-attrs">
<attributes>
<nvpair name="target_role" id="drbd_r0-target-role"
value="started"/>
<nvpair id="drbd_r0-resource" name="drbd_resource" value="r0"/>
</attributes>
</instance_attributes>
<operations>
<op id="drbd_r0-monitor" name="monitor" interval="5s"
timeout="20s" start_delay="20s" disabled="false" role="Slave"
prereq="nothing" on_fail="restart"/>
</operations>
</primitive>
</master_slave>
<group ordered="true" collocated="true" id="group0">
<primitive class="ocf" type="Filesystem" provider="heartbeat"
id="filesystem0">
<instance_attributes id="filesystem0_instance_attrs">
<attributes>
<nvpair id="filesystem0-fstype" name="fstype" value="ext3"/>
<nvpair id="filesystem0-device" name="device"
value="/dev/drbd0"/>
<nvpair id="filesystem0-directory" name="directory"
value="/data"/>
<nvpair id="filesystem0-target-role" name="target_role"
value="started"/>
</attributes>
</instance_attributes>
<operations>
<op name="monitor" interval="20" timeout="40" on_fail="restart"
disabled="false" role="Started" id="filesystem0-monitor" start_delay="10"/>
</operations>
</primitive>
<instance_attributes id="group0_instance_attrs">
<attributes>
<nvpair id="group0-target-role" name="target_role"
value="started"/>
</attributes>
</instance_attributes>
<primitive id="ipaddress0" class="ocf" type="IPaddr"
provider="heartbeat">
<instance_attributes id="ipaddress0_instance_attrs">
<attributes>
<nvpair id="ipaddress0_target_role" name="target_role"
value="started"/>
<nvpair id="e54ebd50-cb03-4738-b133-15852d5bee5f" name="ip"
value="192.168.1.240"/>
<nvpair id="8de20ccb-90c2-4a10-a3ea-1c03c9363d4f" name="nic"
value="eth0"/>
<nvpair id="6ce93a20-e476-4f73-8713-0a577a7faca2"
name="cidr_netmask" value="255.255.255.0"/>
</attributes>
</instance_attributes>
</primitive>
</group>
</resources>
<constraints>
<rsc_colocation id="colocation0" from="group0" to="drbd0"
to_role="master" score="INFINITY"/>
<rsc_order id="order0" from="group0" type="after" to="drbd0"
action="start" to_action="promote"/>
</constraints>
</configuration>
</cib>
What am I doing wrong?
Thank you !!!!
B.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
--
Alex Balashov
Evariste Systems
Web : http://www.evaristesys.com/
Tel : (+1) (678) 954-0670
Direct : (+1) (678) 954-0671
Mobile : (+1) (706) 338-8599