Stephen Nelson-Smith wrote:
> Hi,
>
> I am running Heartbeat 2.3 on CentOS 5.2. I have 2 nodes - both
> apache servers. All I want to achieve is a simple failover:
>
> In the case where one of the two nodes is running httpd, if the
> running node experiences a failure - httpd is stopped, or the machine
> stops responding (i.e. the network has been lost or the machine has
> gone down hard), fail over to the second node.
>
> I seem to have achieved this when starting with a fresh install. I
> have defined two resources:
>
> <resources>
>   <primitive class="ocf" id="IPaddr_10_0_0_53" provider="heartbeat" type="IPaddr">
>     <operations>
>       <op id="IPaddr_10_0_0_53_mon" interval="5s" name="monitor" timeout="5s"/>
>     </operations>
>     <instance_attributes id="IPaddr_10_0_0_53_inst_attr">
>       <attributes>
>         <nvpair id="IPaddr_10_0_0_53_attr_0" name="ip" value="10.0.0.53"/>
>       </attributes>
>     </instance_attributes>
>   </primitive>
>   <primitive class="lsb" id="httpd_2" provider="heartbeat" type="httpd">
>     <operations>
>       <op id="httpd_2_mon" interval="20s" name="monitor" timeout="10s"/>
>     </operations>
>   </primitive>
> </resources>
>
> As I understand it, the IP, primitive type="IPaddr", has a monitor set
> to fire every 5 seconds and time out after 5 seconds, and it has one
> attribute, the IP address itself.
>
> The httpd, primitive type="httpd", really just refers to the
> /etc/init.d/httpd script, since it is of class="lsb". It only has a
> single operation and no attributes - the operation is a monitor which
> fires every 20 seconds, and will time out after 10 seconds. For an
> init script, the monitor just consists of running the script as
> "/etc/init.d/httpd status" and looking for "running" in the response.
>
> I've defined one constraint:
>
> <constraints>
>   <rsc_colocation id="web_same" from="IPaddr_10_0_0_53" to="httpd_2" score="INFINITY"/>
> </constraints>
>
> The IP address and the httpd are preferred to run on the same
> machine, with INFINITE priority - in other words, they MUST run on the
> same machine.
>
> This should have the effect of forcing the migration of both resources
> together.
>
> I've modified default-resource-stickiness and
> default-resource-failure-stickiness:
>
> <nvpair id="cib-bootstrap-options-default-resource-stickiness"
>         name="default-resource-stickiness" value="1000"/>
> <nvpair id="cib-bootstrap-options-default-resource-failure-stickiness"
>         name="default-resource-failure-stickiness" value="-6001"/>
>
> AIUI, these two options define how the CRM and the LRM handle failures
> and failovers.
>
> The default-resource-stickiness is the score given to each active
> resource on the active node, leading to a default score of 2000 for
> the active node and 0 for the inactive node.

Not exactly. With the cib snippets you have given, the cluster will pick
a node, and _after_ successful resource start, it will _add_ 1000 to each
resource's score for the particular node.

You should be able to look at the scores with ptest -Ls. In earlier
versions, you also need -V.

Regards
Dominik
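
To make the arithmetic concrete, here is a rough sketch in Python of how the per-resource scores could be tallied with the values from the CIB snippets. It is only an illustration, not the policy engine's actual algorithm: the node names node1 and node2, the function node_scores, and the rule that each recorded failure adds default-resource-failure-stickiness to that resource's score on that node are all assumptions made for the example. The numbers ptest -Ls reports on a real cluster are the authoritative ones.

```python
# Illustrative sketch only; this is not how the policy engine is implemented.
# Values are taken from the CIB snippets in the quoted message.
RESOURCE_STICKINESS = 1000    # default-resource-stickiness
FAILURE_STICKINESS = -6001    # default-resource-failure-stickiness


def node_scores(resources, nodes, active_node, failcounts=None):
    """Return {node: {resource: score}} after a successful start on active_node.

    failcounts maps (resource, node) to a failure count. Treating each
    failure as one addition of FAILURE_STICKINESS is an assumption.
    """
    failcounts = failcounts or {}
    scores = {}
    for node in nodes:
        scores[node] = {}
        for rsc in resources:
            score = 0
            if node == active_node:
                # Stickiness is added per resource, after it has started.
                score += RESOURCE_STICKINESS
            score += failcounts.get((rsc, node), 0) * FAILURE_STICKINESS
            scores[node][rsc] = score
    return scores


resources = ["IPaddr_10_0_0_53", "httpd_2"]
nodes = ["node1", "node2"]    # hypothetical node names

healthy = node_scores(resources, nodes, active_node="node1")
one_failure = node_scores(resources, nodes, active_node="node1",
                          failcounts={("httpd_2", "node1"): 1})

print(healthy["node1"])               # each resource scores 1000 on node1
print(one_failure["node1"]["httpd_2"])  # 1000 - 6001 = -5001
```

The point is that the 1000 is tracked per resource rather than as a single 2000 for the node, and under the assumed failure rule a single httpd failure already pushes httpd_2 below zero on the node where it was running.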
