Jonas Andradas wrote:
Hello Steve,
On Wed, Jan 30, 2008 at 7:55 PM, Steve Wray <[EMAIL PROTECTED]> wrote:
Jonas Andradas wrote:
Hello,
if my memory doesn't fail me, the cib.xml file should be located in:
/var/lib/heartbeat/crm/cib.xml
Thanks for that.
The "base" cib.xml contains just the configuration. During operation,
values are added and modified on the fly, as you say, with the score of
each node, the pingd score, and so on.
So the XML code given on the pingd documentation page does need to be
*manually* inserted into cib.xml?
Yes, that pingd code has to be inserted into the cib.xml. During execution,
a section of the XML (I cannot remember right now which tags it sits
between) is updated on the fly with execution data, such as (as stated
previously) node score, pingd score, and so on. The node with the highest
score is the 'winner' node, the one resources would prefer (though it might
*not* be the one they actually run on: depending on how
resource_stickiness is set, resources may stay on a lower-scored node
unless they are forced to switch).
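For reference, stickiness can also be set cluster-wide in the crm_config
section of cib.xml. A rough sketch from memory (the id values here are
illustrative, and the attribute name may differ between heartbeat
versions, so check the docs for yours):

```xml
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <attributes>
      <!-- keep resources where they are unless forced to move -->
      <nvpair id="opt-stickiness" name="default_resource_stickiness"
              value="INFINITY"/>
    </attributes>
  </cluster_property_set>
</crm_config>
```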
Ok I now have cib.xml working but the behavior of the cluster is still
strange.
I took the code from the pingd documentation and inserted it into the
cib.xml as follows:
<constraints>
  <rsc_location id="rsc_location_group_1" rsc="group_1">
    <rule id="prefered_location_group_1" score="100">
      <expression attribute="#uname" id="prefered_location_group_1_expr"
                  operation="eq" value="drbd-test-1"/>
    </rule>
  </rsc_location>
  <rsc_location id="my_resource:connected" rsc="my_resource">
    <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
      <expression id="my_resource:connected:expr:undefined"
                  attribute="pingd" operation="not_defined"/>
      <expression id="my_resource:connected:expr:zero"
                  attribute="pingd" operation="lte" value="0"/>
    </rule>
  </rsc_location>
</constraints>
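For context, my understanding is that the constraints element sits
alongside resources inside the configuration section of the cib, roughly
like this (other sections elided, structure from memory):

```xml
<cib>
  <configuration>
    <crm_config/>
    <nodes/>
    <resources/>
    <constraints>
      <!-- rsc_location rules go here -->
    </constraints>
  </configuration>
  <status/>
</cib>
```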
The documentation is not clear on this, but is that the correct place to
insert the code fragment?
Note that the ha.cf files now look like this:
crm yes
logfacility local0
keepalive 100ms
deadping 5
deadtime 30
warntime 10
ucast eth0 10.10.2.26
ucast eth0 10.10.2.27
node drbd-test-1
node drbd-test-2
auto_failback off
ping 10.10.10.1
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd
The observed behavior is now that if the passive node loses network
connectivity but the active node can still contact its ping node, the
active node tries to become passive... but fails, as it cannot unmount its
NFS filesystem or stop drbd. It does relinquish the floating IP address,
though, and effectively fails. Kind of the opposite of what I am after...
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems