Hi,

On Tue, Oct 02, 2007 at 10:55:03PM +0100, Peter Farrell wrote:
> On 02/10/2007, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > On Tue, Oct 02, 2007 at 05:17:38PM +0100, Peter Farrell wrote:
> > > Can someone verify my CIB please?
> > >
> > > It's not working as intended and the more I read the less I understand...
> > > I've stared at the config for the past 2 days hoping to be struck by
> > > sudden understanding... hasn't happened yet.
> >
> > Don't worry, the learning curve is extremely steep. We all need
> > quite some patience.
> >
> > > I don't understand how you make a rule, and then call that rule as a
> > > result of an action. I used the bit from the pingd FAQ page:
> > > http://www.linux-ha.org/v2/faq/pingd
> > > "Quickstart - Only Run my_resource on Nodes with Access to at Least
> > > One Ping Node"
> > >
> > > So - for my pingd clone, the operation is 'monitor' and 'on_fail=fence'
> > > <op id="pingd-child-monitor" name="monitor" interval="20s"
> > >     timeout="40s" prereq="nothing" on_fail="fence"/>
> > >
> > > I assume that this literally means:
> > > "ask the LRM to see if pingd is running every 20s, if after 40s pingd
> > > is not running, call it 'failed', and as it's 'failed' - fence it off,
> > > which forces the resource to migrate to another node and marks this
> > > one as 'degraded' and will not allow resource to run until it's been
> > > cleaned up"
> > >
> > > Is that right? If so, then this bit I'm OK with.
> >
> > No, not exactly. The monitor operation may fail (i.e. the
> > resource agent says that the resource isn't running) or timeout
> > (that's what you described). Of course, both are considered to be
> > failures by CRM. on_fail=fence means that if this operation
> > fails, the node will be fenced, i.e. rebooted if you have an
> > operational stonith device. Perhaps a tad harsh for a monitor
> > failure.
>
> 1. The approach for me is (this is a test cluster - but I want to use
> it to replace a production one) - if either of the load balancers
> can't ping one or two routers in my DMZ, then this must mean they're
> dead. I figured if they can't see the router - how the hell can they
> see the apache servers they're meant to be managing?
> Is this 'correct political thought' or a sloppy foundation to begin with?

It's just that the resources _are_ going to move. No need to kill
the cooperating node.

> 2. I didn't know that fence meant 'rebooted'. I thought it was sort of
> 'fenced off' and left in a degraded state should someone want to poke
> around a bit.
> RE: Perhaps a tad harsh for a monitor failure - I agree. But what's a
> girl to do?
> Am I on the right track here? Do I want it rebooting? Do I just want
> Heartbeat to restart? Does it matter? If it comes up and the link is
> still dead - will it cycle forever w/ reboots?

Not sure, but could be. Whenever a node comes up all resources
are probed, i.e. one monitor operation is fired.

> 3. the real bit I'm missing: Let's say I want it rebooted after
> fencing.

Fencing _is_ rebooting.

> What 'commands it' to do so? Just the flag 'on_fail=fence'?

Yes. Some other things too. For example, one node has a quorum
and it cannot establish the state of another node. Then, to make
sure, it kills the other node.

> Does that automatically look for a started stonith device or resource
> and if it finds one, it just uses it?

Yes.

> I mean - how does the stonith
> suicide (which doesn't work - but suppose for a minute it did) - how
> is it connected to another operational directive?

Not sure why this confuses you. Once a decision has been reached
that a node should be fenced (rebooted), the cluster will try to
find a means to do that. That means is a stonith resource.

> > > But - the 'dampen and multiplier' - I don't get.
> > > <nvpair id="pingd-dampen" name="dampen" value="5s"/>
> > > Does this mean: Wait 5 seconds before saying "yep - pingd says there's
> > > nothing out there, once pingd says 'there's nothing out there;?" Now
> > > write it out to the CIB and let any actions take place?
> >
> > Yes, the cluster sort of stands back a bit until everything
> > settles.
> >
> > > <nvpair id="pingd-multiplier" name="multiplier" value="100"/>
> > > This is a weighted score thing right? It's adding 100 to each node
> > > that 'can' ping?
> >
> > Right.
>
> Can you control the frequency of the pings themselves? What
> constitutes a timeout in this case? (n) packets lost? latency?

I don't know. But it's supposed to do the "right thing".

> > > So if one can't ping, then the score gets knocked down and the
> > > resource wants to move to a "higher scoring" node?? I completely don't
> > > understand this... What if you already have a constraint set for a
> > > node preference, does this override it? Conflict with it?
> >
> > The node with the highest score is chosen to run the service. If
> > there's more than one with the same score, then one's chosen at
> > (pseudo)random. If no score is non-negative then the resource
> > can't run anywhere.
> >
> > > In any case - now that my node has no ping, and is fenced, I saw
> > > another bit of code called 'DoFencing' which I modified thinking it
> > > would now cause the node to commit suicide since it had no
> > > connectivity. But I've no idea about how it's meant to work... It's
> > > saying "your clone DoFencing is stonith via suicide" right?
> >
> > I don't know until I see the code you're talking about. Typically
> > though stonith resources are configured to reboot other nodes and
> > not commit suicides. There's a special stonith agent called
> > suicide for this purpose.
> >
> > > What do the clone_max and clone_node_max mean?
> > > Is clone_max = 2, mean that there are a maximum of 2 nodes that use
> > > it? 2 stonith daemons that run on each node? What? Ditto for
> > > clone_node_max?
> >
> > clone_max: The maximum number of instances of this clone in the cluster.
>
> What is the guidance for this? Should you have one per machine? One in
> general?

There's no guidance. Clones are just useful if you want to have
more than one instance of a resource. Typically this is set to
the number of nodes.

> > clone_node_max: The maximum number of instances of this clone at one node.
>
> Ditto above: Should I have a stonith clone per resource / per node? Or
> just one?

One typical example for clones is an NFS filesystem. If one wants
it mounted on all nodes, a cloned Filesystem resource suffices.

> > > As for the operations on the DoFencing clone - what are they
> > > triggering? The timeouts are for what? the stonith daemon itself? Am I
> > > calling the stonith daemon itself to commit suicide? If so - why would
> > > I have a monitor or start operation?
> >
> > This is admittedly a bit confusing. The start operation doesn't
> > do anything with the device, just makes it available. The stop
> > operation is the opposite. In other words, in order for the
> > stonith device to be used it must first be started.
>
> So for any stonith resource, (using suicide / ssh methods) I'll always
> want to have
> monitor, start & stop?

Just start and monitor. Normally, you don't have to use stop.

> Monitor for the cluster to use it, start to see it and stop is
> effectively the 'reboot' bit?

No, the stop bit is to stop the stonith resource.

> > The monitor operation is essential because the cluster wants to
> > make sure that the stonith device is operational. Typically, it
> > consists of logging into the device and requesting some kind of
> > status.
> >
> > The timeouts are for the operations on which they are defined.
> > The start operation implies a monitor.
> >
> > > Do you need a constraint with a rule to 'start' this resource? ie.
> > > kill myself? Does it just 'know' to do this? I'm really not getting
> > > it.
> >
> > Under some circumstances it is necessary to ensure that a node
> > has relinquished resources. A typical example is a failed stop
> > operation. In that case the CRM will issue a RESET or POWEROFF
> > request to the eligible stonith device.
>
> So - the previous 'on_fail=fence' for the pingd clone - where would
> that go ideally?

It's really simple: on_fail tells the cluster what to do if the
operation it is set on fails.
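
To make the placement concrete: on_fail is an attribute of a single <op> element, so it only covers failures of that one operation. Below is a minimal sketch that reuses the pingd monitor op quoted earlier in this thread. It assumes that "restart" is an accepted on_fail value in the CIB version in use, which is worth checking against the installed crm.dtd before relying on it.

```xml
<!-- Sketch only: the pingd primitive from the CIB in this thread, with the
     monitor op's on_fail changed from "fence" to "restart". The value
     "restart" is an assumption; verify it against your crm.dtd. -->
<primitive id="pingd-child" provider="heartbeat" class="ocf" type="pingd">
  <operations>
    <!-- on_fail sits on the op it applies to, here the recurring monitor -->
    <op id="pingd-child-monitor" name="monitor" interval="20s"
        timeout="40s" prereq="nothing" on_fail="restart"/>
  </operations>
</primitive>
```

With a setting along these lines, a failed or timed-out pingd monitor would presumably lead to pingd being restarted on that node instead of the node being rebooted, and fencing would be left to cases such as a failed stop.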

> (I mean - on which operation?)
> Ping needs a monitor and needs a start. Does it need a stop?

No.

> > > <clone id="DoFencing">
> > >   <instance_attributes>
> > >     <attributes>
> > >       <nvpair name="clone_max" value="2"/>
> > >       <nvpair name="clone_node_max" value="1"/>
> > >     </attributes>
> > >   </instance_attributes>
> > >   <primitive class="stonith" id="child_DoFencing" type="suicide"
> > >       provider="heartbeat">
> > >     <operations>
> > >       <op name="monitor" interval="5s" timeout="20s" prereq="nothing"/>
> > >       <op name="start" timeout="20s" prereq="nothing"/>
> > >     </operations>
> > >   </primitive>
> > > </clone>
> >
> > The suicide stonith device is not exactly the best approach.
> > Ultimately it is not reliable, so it should not be used on the
> > production clusters. If you can afford it, get a real (hardware)
> > stonith device.
>
> Can't. No budget. Advice taken - I'll have to kill these via SSH or suicide.

Note that in case the cluster wants to stonith (reset) a node it
will try to do that forever. Hence, if at that time your stonith
device is not operational, the cluster will basically block.
That's also why using ssh as a stonith device is dangerous. For
example, if the power supply fails, the living node will never
take over the resources.

> I set up ssh keys for every user, root, haclient, hacluster - they
> always fail authentication.
> How can you tell which user / method it's using?

ssh uses the root user. You should check for yourself whether it
works without a password.

> Can you set which
> interface they use (in order to force it (ssh) down the crossover
> cables?)

No. It's as if you run ssh on the command line.
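
For a test cluster only, the DoFencing clone quoted above could be pointed at the ssh plugin instead of suicide. The following is a rough sketch under several assumptions: that the ssh stonith plugin takes its list of nodes in a parameter named hostlist, that root can already ssh between the nodes without a password, and that every id attribute and the node names node1 and node2 are hypothetical placeholders, not values from the original configuration.

```xml
<!-- Sketch only, for test clusters: ssh-based stonith as a clone.
     All ids and the node names in hostlist are hypothetical. -->
<clone id="DoFencing">
  <instance_attributes id="DoFencing_inst_attr">
    <attributes>
      <!-- one instance per node in a two-node cluster -->
      <nvpair id="DoFencing_clone_max" name="clone_max" value="2"/>
      <nvpair id="DoFencing_clone_node_max" name="clone_node_max" value="1"/>
    </attributes>
  </instance_attributes>
  <primitive id="child_DoFencing" class="stonith" type="ssh" provider="heartbeat">
    <operations>
      <!-- start makes the device available; monitor checks that it is usable -->
      <op id="child_DoFencing_monitor" name="monitor" interval="5s"
          timeout="20s" prereq="nothing"/>
      <op id="child_DoFencing_start" name="start" timeout="20s"
          prereq="nothing"/>
    </operations>
    <instance_attributes id="child_DoFencing_inst_attr">
      <attributes>
        <!-- assumed parameter name: nodes this device is able to reset -->
        <nvpair id="child_DoFencing_hostlist" name="hostlist" value="node1 node2"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>
```

The caveat from above still applies: if the target node has lost power or its network, an ssh-based reset cannot succeed and the cluster will keep waiting.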
> >
> > > Intended actions:
> > > > node1 loses ping, (which in my world means that it's dead)
> > > > resources migrate to node2
> > > > node1 reboots (what I really want is for the fenced resource to be
> > > > 'cleaned up' so that it can run again on this node - I'm not fussy
> > > > about how I achieve that)
> > > > resource migrates back to node1 once ping (connectivity is restored).
> >
> > Rebooting a node should imply a resource cleanup. In the next
> > release the cluster will also be able to "forget" after some time
> > about the failure.
> >
> > > actual actions:
> > > > node1 loses ping,
> > > > resource migrates to node2.
> >
> > And the node1 is not rebooted? Then there's a problem with the
> > stonith setup. Any errors in logs?
>
> It's never called. I've cocked up the config by experimenting via
> 'cut-n-paste' rather than taking the time to understand the thing
> properly. Having said that I've read the docs, watching Alan down
> under, trawled the lists and re-arranged others configs, but it's
> still pretty random. Plus it's been 2 weeks and I'm an instant
> gratification kind of guy, so I'm out of my comfort zone and getting a
> little pissed (w/ myself) now :-)
>
> I just don't get how it's (stonith) is called in relation to another
> resource failing. The mechanism, the relationships. It's not just
> stonith, for example if the ping failed and I wanted to start apache
> on a cluster node to take over all IP addresses and serve up a 'temp.
> out of service' page I wouldn't have the foggiest.

Well, it definitely takes some time to get used to it.

Thanks,

Dejan

> > > > node2 loses ping but 'resource cannot run anywhere' ensues and both
> > > > nodes are 'active' but no resources are being ran.
> > >
> > > I think fundamentally my approach is wrong and that I should leave it
> > > to fail and have human intervention to clean it up rather than hope it
> > > will flip flop between nodes.
> >
> > That depends on your needs of course. At any rate, it should be
> > possible to configure the cluster to fit those needs.
> >
> > There is also the meatware stonith device which will prompt a
> > human to clean up/reboot.
> >
> > > But - I'd like to have a better grasp of
> > > how V2 works in general before making the choice to fall back to a
> > > simpler config.
> >
> > HTH.
>
> It has. Thanks a lot.
>
> -Peter
>
> > Thanks,
> >
> > Dejan
> >
> > > -Peter
> > >
> > >
> > > Active / Passive set up.
> > > 2 nodes, one resource (ldirectord) balancing traffic for IP addresses
> > > on 2 web servers.
> > > 2 nics [eth0: dmz facing - eth1: crossover cable, on 10.0.0.1/2]
> > >
> > > This relates to the previous post:
> > > "How can you clean up a degraded node w/out killing it (and not
> > > manually)?"
> > >
> > > Versions:
> > > heartbeat-stonith-2.1.2-3.el4.centos
> > > heartbeat-pils-2.1.2-3.el4.centos
> > > heartbeat-ldirectord-2.1.2-3.el4.centos
> > > heartbeat-2.1.2-3.el4.centos
> >
> > > <resources>
> > >   <group id="group_1">
> > >     <primitive class="ocf" id="IPaddr_212_140_130_37"
> > >         provider="heartbeat" type="IPaddr">
> > >       <operations>
> > >         <op id="IPaddr_212_140_130_37_mon"
> > >             interval="5s" name="monitor" timeout="5s"/>
> > >       </operations>
> > >       <instance_attributes
> > >           id="IPaddr_212_140_130_37_inst_attr">
> > >         <attributes>
> > >           <nvpair
> > >               id="IPaddr_212_140_130_37_attr_0" name="ip" value="212.140.130.37"/>
> > >         </attributes>
> > >       </instance_attributes>
> > >     </primitive>
> > >     <primitive class="ocf" id="IPaddr_212_140_130_38"
> > >         provider="heartbeat" type="IPaddr">
> > >       <operations>
> > >         <op id="IPaddr_212_140_130_38_mon"
> > >             interval="5s" name="monitor" timeout="5s"/>
> > >       </operations>
> > >       <instance_attributes
> > >           id="IPaddr_212_140_130_38_inst_attr">
> > >         <attributes>
> > >           <nvpair
> > >               id="IPaddr_212_140_130_38_attr_0" name="ip" value="212.140.130.38"/>
> > >         </attributes>
> > >       </instance_attributes>
> > >     </primitive>
> > >     <primitive class="ocf" id="ldirectord_3"
> > >         provider="heartbeat" type="ldirectord">
> > >       <operations>
> > >         <op id="ldirectord_3_mon" interval="120s"
> > >             name="monitor" timeout="60s"/>
> > >       </operations>
> > >       <instance_attributes id="ldirectord_3_inst_attr">
> > >         <attributes>
> > >           <nvpair id="ldirectord_3_attr_1"
> > >               name="1" value="ldirectord.cf"/>
> > >         </attributes>
> > >       </instance_attributes>
> > >     </primitive>
> > >   </group>
> > >   <clone id="pingd">
> > >     <instance_attributes id="pingd">
> > >       <attributes>
> > >         <nvpair id="pingd-clone_node_max"
> > >             name="clone_node_max" value="1"/>
> > >       </attributes>
> > >     </instance_attributes>
> > >     <primitive id="pingd-child" provider="heartbeat"
> > >         class="ocf" type="pingd">
> > >       <operations>
> > >         <op id="pingd-child-monitor" name="monitor"
> > >             interval="20s" timeout="40s" prereq="nothing" on_fail="fence"/>
> > >       </operations>
> > >       <instance_attributes id="pingd_inst_attr">
> > >         <attributes>
> > >           <nvpair id="pingd-dampen"
> > >               name="dampen" value="5s"/>
> > >           <nvpair id="pingd-multiplier"
> > >               name="multiplier" value="100"/>
> > >         </attributes>
> > >       </instance_attributes>
> > >     </primitive>
> > >   </clone>
> > >   <clone id="DoFencing">
> > >     <instance_attributes>
> > >       <attributes>
> > >         <nvpair name="clone_max" value="2"/>
> > >         <nvpair name="clone_node_max" value="1"/>
> > >       </attributes>
> > >     </instance_attributes>
> > >     <primitive id="child_DoFencing" class="stonith"
> > >         type="suicide" provider="heartbeat">
> > >       <operations>
> > >         <op name="monitor" interval="5s"
> > >             timeout="20s" prereq="nothing"/>
> > >         <op name="start" timeout="20s"
> > >             prereq="nothing"/>
> > >       </operations>
> > >     </primitive>
> > >   </clone>
> > > </resources>
> > > <constraints>
> > >   <rsc_location rsc="group_1" id="rsc_location_group_1">
> > >     <rule id="prefered_location_group_1" score="200">
> > >       <expression attribute="#uname"
> > >           id="prefered_location_group_1_expr" operation="eq"
> > >           value="dmz1.scarceskills.com"/>
> > >     </rule>
> > >     <rule id="group_1:connected:rule" score="-INFINITY"
> > >         boolean_op="and">
> > >       <expression id="my_resource:connected:expr:zero"
> > >           attribute="pingd" operation="lte" value="0"/>
> > >     </rule>
> > >   </rsc_location>
> > >   <rsc_location id="cli-prefer-group_1" rsc="group_1">
> > >     <rule id="cli-prefer-rule-group_1" score="INFINITY">
> > >       <expression id="cli-prefer-expr-group_1"
> > >           attribute="#uname" operation="eq" value="dmz1.scarceskills.com"
> > >           type="string"/>
> > >     </rule>
> > >   </rsc_location>
> > > </constraints>
> > > _______________________________________________
> > > Linux-HA mailing list
> > > [email protected]
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > See also: http://linux-ha.org/ReportingProblems
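
As an illustration of how a rule ties the pingd attribute to resource placement, here is a hedged variant of the group_1:connected:rule from the CIB quoted above. It is a sketch, not a verified fix: the rsc_location id, the extra not_defined expression, its id, the switch to boolean_op="or", and the type="number" attribute are all assumptions added for illustration, on the reasoning that a node where pingd has never written the attribute should be treated like a node that reaches no ping node.

```xml
<!-- Sketch only: keep group_1 off any node without connectivity.
     The rsc_location id and the first expression are illustrative additions. -->
<rsc_location id="group_1:connected" rsc="group_1">
  <rule id="group_1:connected:rule" score="-INFINITY" boolean_op="or">
    <!-- assumed addition: the pingd attribute has not been set on this node -->
    <expression id="group_1:connected:expr:undefined"
                attribute="pingd" operation="not_defined"/>
    <!-- as in the quoted rule: no ping node is reachable (value 0 or less);
         type="number" is an assumed addition so the comparison is numeric -->
    <expression id="group_1:connected:expr:zero"
                attribute="pingd" operation="lte" value="0" type="number"/>
  </rule>
</rsc_location>
```

Read this way, the rule is never "called" by the monitor operation. pingd updates the attribute (multiplier times the number of reachable ping nodes, so 100 per node with the settings above), and the cluster re-evaluates the rule whenever that value changes.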
