Hi all,

We are setting up a MySQL HA cluster with Heartbeat/Pacemaker and DRBD. Heartbeat is configured as follows:
- DRBD as a master/slave resource
- MySQL as a resource group with the following primitives:
  * a virtual IP address (IPaddr2)
  * a file system
  * MySQL
- 2 constraints between DRBD and the resource group (rsc_order and rsc_colocation)
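For reference, the two constraints are defined in the attached CIB as:

```xml
<!-- Promote DRBD before starting the resource group -->
<rsc_order id="promote-node-before-mount-myserver01-fs" from="rg-myserver01"
           action="start" type="after" to="ms-myserver01-drbd" to_action="promote"/>
<!-- Run the resource group only on the node where DRBD is master -->
<rsc_colocation id="start-rg-myserver01-only-on-drdb-master" to="ms-myserver01-drbd"
                to_role="master" from="rg-myserver01" score="INFINITY"/>
```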
We are having trouble figuring out how scores are computed. We have set the following resource_stickiness / resource_failure_stickiness values:
- MySQL: 210 / -100
- File system: 60 / -25
- VIP: 25 / -10

When the cluster is started, the showscores output is:

Resource                    Score      Node      Stickiness  #Fail  Fail-Stickiness
myserver01-drbd:0           76         uamwpdb2  100         0      -100
myserver01-drbd:0           -INFINITY  uamwpdb1  100         0      -100
myserver01-drbd:0 (master)  75         uamwpdb2  100         0      -100
myserver01-drbd:1           0          uamwpdb2  100         0      -100
myserver01-drbd:1           76         uamwpdb1  100         0      -100
myserver01-drbd:1 (master)  665        uamwpdb1  100         0      -100
myserver01-fs               270        uamwpdb1  60          0      -25
myserver01-fs               -INFINITY  uamwpdb2  60          0      -25
myserver01-mysql            210        uamwpdb1  210         0      -100
myserver01-mysql            -INFINITY  uamwpdb2  210         0      -100
myserver01-vip              371        uamwpdb1  25          0      -10
myserver01-vip              -INFINITY  uamwpdb2  25          0      -10

From the Heartbeat documentation, the resource-group score should be 210 + 60 + 25 = 295. Here, the myserver01-vip score is 371, which is 295 + 76. Since there is a constraint between DRBD and the resource group, I understand that the scores are summed.

My questions are:
- Where do the normal score (76) and the master score of the slave (75) come from?
- The master score of the master is 665, which seems to be the group score (295) * 2 plus the normal score (75). Is that right? Why is the resource-group score counted twice?

With this configuration, after 3 failures of the database, all the resources are migrated to the slave node, because the master score of DRBD on the master node drops below 75, which makes Heartbeat fail DRBD over. With other stickiness values, it is possible to end up in a situation where the MySQL database has a negative score on the master node, but the master score of DRBD is still greater than on the slave node. In that case Heartbeat refuses to restart the database, but won't migrate the resources either.

Is it possible to configure Heartbeat so that, when a resource of a resource group can no longer run on its node (score < 0), the resource group and the master/slave resource are failed over?

My cib.xml is attached.
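In other words, is there a supported setting that gives the effect of the following sketch? This is only an illustration of the behaviour we are after; we have not verified that -INFINITY is accepted as a stickiness value here.

```xml
<!-- Hypothetical: a single failure of this member would drive its score to  -->
<!-- -INFINITY, pushing the whole resource group, and with it the DRBD       -->
<!-- master role, over to the other node.                                    -->
<nvpair id="ma-myserver01-mysql-rfs" name="resource_failure_stickiness" value="-INFINITY"/>
```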
Software releases (Heartbeat packages taken from http://download.opensuse.org/repositories/server:/ha-clustering):
- heartbeat-resources-2.1.3-22.1
- heartbeat-common-2.1.3-22.1
- pacemaker-heartbeat-0.6.4-7.1
- heartbeat-2.1.3-22.1
- drbd-8.0.12-3
- drbd-km-2.6.18_8.el5-8.0.12-3

Thanks in advance for all answers.

Geoff.
<cib generated="true" admin_epoch="0" epoch="0" num_updates="0" have_quorum="true">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="100"/>
          <nvpair id="cib-bootstrap-options-default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="-100"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
    <nodes/>
    <resources>
      <master_slave id="ms-myserver01-drbd">
        <meta_attributes id="ma-ms-myserver01-drbd">
          <attributes>
            <nvpair id="ma-ms-myserver01-drbd-1" name="clone_max" value="2"/>
            <nvpair id="ma-ms-myserver01-drbd-2" name="clone_node_max" value="1"/>
            <nvpair id="ma-ms-myserver01-drbd-3" name="master_max" value="1"/>
            <nvpair id="ma-ms-myserver01-drbd-4" name="master_node_max" value="1"/>
            <nvpair id="ma-ms-myserver01-drbd-5" name="notify" value="yes"/>
            <nvpair id="ma-ms-myserver01-drbd-6" name="globally_unique" value="false"/>
            <nvpair id="ma-ms-myserver01-drbd-7" name="target_role" value="#default"/>
          </attributes>
        </meta_attributes>
        <primitive id="myserver01-drbd" class="ocf" provider="heartbeat" type="drbd">
          <instance_attributes id="ia-myserver01-drbd">
            <attributes>
              <nvpair id="ia-myserver01-drbd-1" name="drbd_resource" value="myserver01"/>
            </attributes>
          </instance_attributes>
          <operations>
            <op id="op-myserver01-drbd-1" name="monitor" interval="59s" timeout="30s" role="Master"/>
            <op id="op-myserver01-drbd-2" name="monitor" interval="60s" timeout="30s" role="Slave"/>
          </operations>
        </primitive>
      </master_slave>
      <group id="rg-myserver01">
        <meta_attributes id="ma-rg-myserver01">
          <attributes>
            <nvpair id="ma-rg-myserver01-1" name="target_role" value="stopped"/>
            <nvpair id="ma-rg-myserver01-2" name="ordered" value="true"/>
            <nvpair id="ma-rg-myserver01-3" name="collocated" value="true"/>
          </attributes>
        </meta_attributes>
        <primitive id="myserver01-vip" class="ocf" type="IPaddr2" provider="heartbeat">
          <meta_attributes id="ma-myserver01-vip">
            <attributes>
              <nvpair id="ma-myserver01-vip-rs" name="resource_stickiness" value="25"/>
              <nvpair id="ma-myserver01-vip-rfs" name="resource_failure_stickiness" value="-10"/>
            </attributes>
          </meta_attributes>
          <instance_attributes id="ia-myserver01-vip">
            <attributes>
              <nvpair id="ia-myserver01-vip-ip" name="ip" value="10.67.106.26"/>
              <nvpair id="ia-myserver01-vip-nic" name="nic" value="bond0.252"/>
              <nvpair id="ia-myserver01-vip-netmask" name="cidr_netmask" value="255.255.255.192"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="myserver01-fs" class="ocf" type="Filesystem" provider="heartbeat">
          <meta_attributes id="ma-myserver01-fs">
            <attributes>
              <nvpair id="ma-myserver01-fs-rs" name="resource_stickiness" value="60"/>
              <nvpair id="ma-myserver01-fs-rfs" name="resource_failure_stickiness" value="-25"/>
            </attributes>
          </meta_attributes>
          <instance_attributes id="ia-myserver01-fs">
            <attributes>
              <nvpair id="ia-myserver01-fs-device" name="device" value="/dev/drbd0"/>
              <nvpair id="ia-myserver01-fs-directory" name="directory" value="/data/myq/myserver01"/>
              <nvpair id="ia-myserver01-fs-fstype" name="fstype" value="ext3"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="myserver01-mysql" class="ocf" type="mysql" provider="heartbeat">
          <meta_attributes id="ma-myserver01-mysql">
            <attributes>
              <nvpair id="ma-myserver01-mysql-rs" name="resource_stickiness" value="210"/>
              <nvpair id="ma-myserver01-mysql-rfs" name="resource_failure_stickiness" value="-100"/>
            </attributes>
          </meta_attributes>
          <instance_attributes id="myserver01-mysql-attributes">
            <attributes>
              <nvpair id="myserver01-mysql-attribute-01" name="binary" value="/exec/products/mysql/v5.0.45/sbin/mysqld"/>
              <nvpair id="myserver01-mysql-attribute-02" name="config" value="/data/myq/myserver01/data/my.cnf"/>
              <nvpair id="myserver01-mysql-attribute-03" name="datadir" value="/data/myq/myserver01/data"/>
              <nvpair id="myserver01-mysql-attribute-04" name="user" value="mysql"/>
              <nvpair id="myserver01-mysql-attribute-05" name="group" value="mysql"/>
              <nvpair id="myserver01-mysql-attribute-06" name="log" value="/data/myq/myserver01/log/log/myserver01_log.log"/>
              <nvpair id="myserver01-mysql-attribute-07" name="pid" value="/data/myq/myserver01/data/myserver01.pid"/>
              <nvpair id="myserver01-mysql-attribute-08" name="socket" value="/data/myq/myserver01/data/myserver01.sock"/>
              <nvpair id="myserver01-mysql-attribute-09" name="test_user" value="system"/>
              <nvpair id="myserver01-mysql-attribute-10" name="test_table" value="mysql.user"/>
              <nvpair id="myserver01-mysql-attribute-11" name="test_passwd" value=""/>
              <nvpair id="myserver01-mysql-attribute-12" name="additional_parameters" value="--character-sets-dir=/exec/products/mysql/v5.0.45/share/mysql/charsets --language=/exec/products/mysql/v5.0.45/share/mysql/english --skip-external-locking --skip-symbolic-links --local-infile"/>
            </attributes>
          </instance_attributes>
          <operations>
            <op id="myserver01-mysql-monitor" interval="120s" name="monitor" timeout="60s"/>
          </operations>
        </primitive>
      </group>
    </resources>
    <constraints>
      <!-- Promote node before mount file system -->
      <rsc_order id="promote-node-before-mount-myserver01-fs" from="rg-myserver01" action="start" type="after" to="ms-myserver01-drbd" to_action="promote"/>
      <!-- Mount file system only on node which is master -->
      <rsc_colocation id="start-rg-myserver01-only-on-drdb-master" to="ms-myserver01-drbd" to_role="master" from="rg-myserver01" score="INFINITY"/>
    </constraints>
  </configuration>
  <status/>
</cib>
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
