On Jan 31, 2008, at 1:53 AM, Steve Wray wrote:
Jonas Andradas wrote:
Hello Steve,
On Wed, Jan 30, 2008 at 7:55 PM, Steve Wray <[EMAIL PROTECTED]>
wrote:
Jonas Andradas wrote:
Hello,
if my memory doesn't fail, the cib.xml file should be located in:
/var/lib/heartbeat/crm/cib.xml
Thanks for that.
The "base" cib.xml contains just the configuration. During
operation,
values are added and modified on the fly, as you say, with the
score of
each
node, the pingd score, and so.
So the xml code which was given on the pingd documentation page does
need to be *manually* inserted into the cib.xml code?
Yes, that pingd code has to be inserted into the cib.xml. During execution, a section of the XML (I cannot remember right now between which tags it can be found) is updated on-the-fly with execution data, such as (as stated previously) node score, pingd score, and so on. The node with the highest score is the 'winner node', the one resources would prefer (though it might *not* be the one they actually run on. Depending on how resource_stickiness is set, resources might stay on a lower-scored node unless they are forced to switch).
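For what it's worth, in heartbeat 2.x a cluster-wide stickiness can be set in the crm_config section of cib.xml; a minimal sketch along these lines (the nvpair id and the value of 100 are illustrative assumptions, not taken from your configuration):

```xml
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <attributes>
      <!-- illustrative value: a positive score makes resources prefer to
           stay where they are; INFINITY pins them until the node fails -->
      <nvpair id="bootstrap-stickiness" name="default_resource_stickiness" value="100"/>
    </attributes>
  </cluster_property_set>
</crm_config>
```

With a stickiness of 100 and a location preference of 100, as in your constraint below, the scores tie and resources stay put rather than failing back.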
Ok, I now have cib.xml working but the behavior of the cluster is still strange. I took the code from the pingd documentation and inserted it into the cib.xml as follows:
<constraints>
  <rsc_location id="rsc_location_group_1" rsc="group_1">
    <rule id="prefered_location_group_1" score="100">
      <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="drbd-test-1"/>
    </rule>
  </rsc_location>
  <rsc_location id="my_resource:connected" rsc="my_resource">
    <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
      <expression id="my_resource:connected:expr:undefined" attribute="pingd" operation="not_defined"/>
      <expression id="my_resource:connected:expr:zero" attribute="pingd" operation="lte" value="0"/>
    </rule>
  </rsc_location>
</constraints>
The documentation is not clear on this, but is that the correct place to insert the code fragment?
Yes.
Note that the ha.cf files now look like this:
crm yes
logfacility local0
keepalive 100ms
deadping 5
deadtime 30
warntime 10
ucast eth0 10.10.2.26
ucast eth0 10.10.2.27
node drbd-test-1
node drbd-test-2
auto_failback off
ping 10.10.10.1
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd
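As an aside, with that respawn line pingd publishes its attribute into the status section of the live CIB (the part that is updated on the fly and should not be edited by hand). Roughly, and with the exact nesting varying by version, such an entry looks something like this sketch (the uname is taken from your node list; the rest is illustrative):

```xml
<status>
  <node_state uname="drbd-test-1">
    <transient_attributes>
      <instance_attributes>
        <attributes>
          <!-- value = multiplier (-m 100) times the number of reachable ping nodes -->
          <nvpair name="pingd" value="100"/>
        </attributes>
      </instance_attributes>
    </transient_attributes>
  </node_state>
</status>
```

Your -INFINITY rule above then fires on any node where this attribute is undefined or drops to 0, i.e. a node that can no longer reach 10.10.10.1.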
The observed behavior is now that if the passive node loses network connectivity
you didn't simulate this by unplugging eth0 did you?
but the active node can contact its ping node, then the active node tries to become passive... but fails, as it can't unmount its NFS filesystem or stop drbd. It relinquishes the floating IP address though, and effectively fails. Kind of the opposite of what I am after...
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems