On Friday 28 November 2008 18:34, Alex Balashov wrote:

Alex,

thanks for this very quick and elaborate reply. Bless you  ;-)

> Bart,
>
> All these stickiness settings and methods boil down to and manipulate
> one thing - the "score" assigned to each node.
>
> To achieve any particular failover objective, what you have to do is
> tweak the settings so that the score of the node matches your desired
> outcome.  If you want services to stick once they fail over and not fail
> back to the primary node, your objective should be to make the scores
> equal or to make the secondary node's score remain highest even after
> the primary node comes back online.
>
> The scoring calculation is described here:
>
> http://www.linux-ha.org/ScoreCalculation

Phew. Rocket science. I'm kinda starting to miss the old version 1 days ;-)

> The essence of it that is particularly germane is:
>
> score = (constraint-score) + (num_group_resources * resource_stickiness)
> + (failcount * (resource_failure_stickiness) )
>
> (for a resource group).
>
> Without a group:
>
> score = (constraint-score) + (resource_stickiness) + (failcount *
> (resource_failure_stickiness) )

I didn't set any resource_stickiness on the resources, so it should fall back to 
the default resource stickiness. Which it doesn't, actually, but that probably 
has to do with the drbd resource I have (the master-slave thing). I'll try to 
manipulate the scores, but in order to do so I first need to understand the 
scoring theory, and that seems like a hell of a job. It's putting me off 
already, as a matter of fact...
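
Just to get a feel for it, here's a quick sketch of the scoring arithmetic from 
the formula Alex quoted. The numbers and node names are made up, purely for 
illustration (and as I understand it, stickiness only counts on the node the 
group is currently running on):

```python
# Sketch of the score formula from http://www.linux-ha.org/ScoreCalculation:
#   score = constraint_score + num_group_resources * resource_stickiness
#           + failcount * resource_failure_stickiness
# Stickiness is only earned on the node where the group currently runs.

def group_score(constraint_score, num_group_resources=0,
                resource_stickiness=0, resource_failure_stickiness=0,
                failcount=0):
    return (constraint_score
            + num_group_resources * resource_stickiness
            + failcount * resource_failure_stickiness)

# Hypothetical numbers: a 2-resource group with stickiness 100 is running
# on node2 after a failover; node1 comes back online and is preferred by
# a +50 location constraint, but earns no stickiness (nothing runs there).
node1 = group_score(50)         # 50 (constraint score only)
node2 = group_score(0, 2, 100)  # 0 + 2*100 = 200 (stickiness wins)

print(node1, node2)  # node2's score is higher, so the group stays put
```

So with a high enough (or INFINITY) stickiness, the running node's score should 
dominate any location preference for the rebooted primary.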

> And you can see the values currently assigned to all resources using:
>
> ptest -Ls

The "s" parameter is not supported by my ptest (SLES 10 SP1), but ptest -LV 
gives:

node1:/usr/lib64/heartbeat # ./ptest -LV
 <transition_graph cluster-delay="60s" transition_id="0"/>


Thx again!


B.

> while the CRM on a given node is online.
>
> Cheers,
>
> -- Alex
>
> Bart Coninckx wrote:
> > Hi all,
> >
> > I have a two-node Heartbeat cluster (version 2.0.8). I'd like to have the
> > resources fail over to the other node when rebooting or going into
> > standby. For this I've set "Resource Stickiness" to "INFINITY" in hb_gui.
> > At first, this seemed to do the trick, but when I reboot a node,
> > resources seem to fail back to the rebooted node instead of remaining on
> > the other node.
> >
> > This is my cib.xml :
> >
> >  <cib admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="2"
> > cib_feature_revision="1.3" generated="true" ccm_transition="4"
> > dc_uuid="df9c672b-3644-4514-bc5a-31fc152d2dd4" epoch="58" num_updates="50
> > 01" cib-last-written="Fri Nov 28 17:47:11 2008">
> >    <configuration>
> >      <crm_config>
> >        <cluster_property_set id="cib-bootstrap-options">
> >          <attributes>
> >            <nvpair name="default-resource-failure-stickiness"
> > id="id-default-resource-failure-stickiness" value="0"/>
> >            <nvpair name="default-resource-stickiness"
> > id="cib-bootstrap-options-default-resource-stickiness" value="INFINITY"/>
> >            <nvpair name="last-lrm-refresh"
> > id="cib-bootstrap-options-last-lrm-refresh" value="1227867724"/>
> >          </attributes>
> >        </cluster_property_set>
> >      </crm_config>
> >      <nodes>
> >        <node uname="node2" type="normal"
> > id="df9c672b-3644-4514-bc5a-31fc152d2dd4">
> >          <instance_attributes
> > id="nodes-df9c672b-3644-4514-bc5a-31fc152d2dd4"> <attributes>
> >              <nvpair name="standby"
> > id="standby-df9c672b-3644-4514-bc5a-31fc152d2dd4" value="off"/>
> >            </attributes>
> >          </instance_attributes>
> >        </node>
> >        <node uname="node1" type="normal"
> > id="72913bca-5fdd-4b03-a96b-a56ad634bfea">
> >          <instance_attributes
> > id="nodes-72913bca-5fdd-4b03-a96b-a56ad634bfea"> <attributes>
> >              <nvpair name="standby"
> > id="standby-72913bca-5fdd-4b03-a96b-a56ad634bfea" value="off"/>
> >            </attributes>
> >          </instance_attributes>
> >        </node>
> >      </nodes>
> >      <resources>
> >        <master_slave id="drbd0">
> >          <instance_attributes id="drbd0_instance_attrs">
> >            <attributes>
> >              <nvpair id="drbd0-clone-max" name="clone_max" value="2"/>
> >              <nvpair id="drbd0-clone-node_max" name="clone_node_max"
> > value="1"/>
> >              <nvpair id="drbd0-master-max" name="master_max" value="1"/>
> >              <nvpair id="drbd0-master-node-max" name="master_node_max"
> > value="1"/>
> >              <nvpair id="drbd0-notify" name="notify" value="yes"/>
> >              <nvpair id="drbd0_target_role" name="target_role"
> > value="started"/>
> >            </attributes>
> >          </instance_attributes>
> >          <primitive class="ocf" type="drbd" provider="heartbeat"
> > id="drbd_r0"> <instance_attributes id="drbd_r0-instance-attrs">
> >              <attributes>
> >                <nvpair name="target_role" id="drbd_r0-target-role"
> > value="started"/>
> >                <nvpair id="drbd_r0-resource" name="drbd_resource"
> > value="r0"/> </attributes>
> >            </instance_attributes>
> >            <operations>
> >              <op id="drbd_r0-monitor" name="monitor" interval="5s"
> > timeout="20s" start_delay="20s" disabled="false" role="Slave"
> > prereq="nothing" on_fail="restart"/>
> >            </operations>
> >          </primitive>
> >        </master_slave>
> >        <group ordered="true" collocated="true" id="group0">
> >          <primitive class="ocf" type="Filesystem" provider="heartbeat"
> > id="filesystem0">
> >            <instance_attributes id="filesystem0_instance_attrs">
> >              <attributes>
> >                <nvpair id="filesystem0-fstype" name="fstype"
> > value="ext3"/> <nvpair id="filesystem0-device" name="device"
> > value="/dev/drbd0"/>
> >                <nvpair id="filesystem0-directory" name="directory"
> > value="/data"/>
> >                <nvpair id="filesystem0-target-role" name="target_role"
> > value="started"/>
> >              </attributes>
> >            </instance_attributes>
> >            <operations>
> >              <op name="monitor" interval="20" timeout="40"
> > on_fail="restart" disabled="false" role="Started"
> > id="filesystem0-monitor" start_delay="10"/> </operations>
> >          </primitive>
> >          <instance_attributes id="group0_instance_attrs">
> >            <attributes>
> >              <nvpair id="group0-target-role" name="target_role"
> > value="started"/>
> >            </attributes>
> >          </instance_attributes>
> >          <primitive id="ipaddress0" class="ocf" type="IPaddr"
> > provider="heartbeat">
> >            <instance_attributes id="ipaddress0_instance_attrs">
> >              <attributes>
> >                <nvpair id="ipaddress0_target_role" name="target_role"
> > value="started"/>
> >                <nvpair id="e54ebd50-cb03-4738-b133-15852d5bee5f"
> > name="ip" value="192.168.1.240"/>
> >                <nvpair id="8de20ccb-90c2-4a10-a3ea-1c03c9363d4f"
> > name="nic" value="eth0"/>
> >                <nvpair id="6ce93a20-e476-4f73-8713-0a577a7faca2"
> > name="cidr_netmask" value="255.255.255.0"/>
> >              </attributes>
> >            </instance_attributes>
> >          </primitive>
> >        </group>
> >      </resources>
> >      <constraints>
> >        <rsc_colocation id="colocation0" from="group0" to="drbd0"
> > to_role="master" score="INFINITY"/>
> >        <rsc_order id="order0" from="group0" type="after" to="drbd0"
> > action="start" to_action="promote"/>
> >      </constraints>
> >    </configuration>
> >  </cib>
> >
> >
> > What am I doing wrong?
> >
> >
> >
> > Thank you !!!!
> >
> >
> > B.
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems

-- 
Bits 'n Tricks
http://www.bitsandtricks.com
Free To Evolve