Jerome Yanga wrote:
> Dominik,
>
> Here is the status of the two concerns I needed help on.
>
> 01) When a node comes back up after a restart of heartbeat, resources get
> bounced when it rejoins the cluster.
> STATUS: The resources still get bounced when a node joins the cluster even
> if I had deleted all the constraints.
Well, your configuration lacks resource-stickiness ;) I think I already
mentioned this in an earlier email.

> 02) Stopping one resource in a group does not fail over the group to the
> other node.
> STATUS: migration-threshold works like a charm. :) Thanks.
>
> If I may, I have another concern that popped up.
>
> 03) I cannot seem to get MailTo to work. I am trying to add this resource
> under the Directory_Server group so that every time a failover is
> experienced, it will notify me that it did.

The configuration of the agent is - as far as I can see - okay. You'd have
to look at the logs and see what it was doing/trying to do but failed.

Also: look up your $MAILCMD in
/usr/lib/ocf/resource.d/heartbeat/.ocf-binaries and then try something
like:

echo "some text for the test email" | $MAILCMD -s "failover occurred" [email protected]

If that works (i.e. you receive the email), the agent should work, too.

Regards
Dominik

> Below is the current cib.xml file I have.
>
> <cib admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0"
>      have-quorum="1" dc-uuid="27f54ec3-b626-4b4f-b8a6-4ed0b768513c" epoch="99"
>      num_updates="0" cib-last-written="Tue Jan 27 12:59:21 2009">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>                 value="1.0.1-node: 6fc5ce8302abf145a02891ec41e5a492efbe8efe"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e" uname="nomen.esri.com"
>             type="normal">
>         <instance_attributes id="nodes-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e">
>           <nvpair id="standby-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e"
>                   name="standby" value="off"/>
>         </instance_attributes>
>       </node>
>       <node id="27f54ec3-b626-4b4f-b8a6-4ed0b768513c" uname="rubric.esri.com"
>             type="normal">
>         <instance_attributes id="nodes-27f54ec3-b626-4b4f-b8a6-4ed0b768513c">
>           <nvpair id="standby-27f54ec3-b626-4b4f-b8a6-4ed0b768513c"
>                   name="standby" value="off"/>
>         </instance_attributes>
>       </node>
>     </nodes>
>     <resources>
>       <group id="Directory_Server">
>         <meta_attributes id="Directory_Server-meta_attributes">
>           <nvpair id="Directory_Server-meta_attributes-collocated"
>                   name="collocated" value="true"/>
>           <nvpair id="Directory_Server-meta_attributes-ordered"
>                   name="ordered" value="true"/>
>           <nvpair id="Directory_Server-meta_attributes-migration-threshold"
>                   name="migration-threshold" value="1"/>
>           <nvpair id="Directory_Server-meta_attributes-failure-timeout"
>                   name="failure-timeout" value="10s"/>
>         </meta_attributes>
>         <primitive class="ocf" id="VIP" provider="heartbeat" type="IPaddr">
>           <instance_attributes id="VIP-instance_attributes">
>             <nvpair id="VIP-instance_attributes-ip" name="ip"
>                     value="10.50.26.250"/>
>           </instance_attributes>
>           <operations id="VIP-ops">
>             <op id="VIP-monitor-5s" interval="5s" name="monitor"
>                 timeout="5s"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="ECAS" provider="esri" type="ecas">
>           <operations id="ECAS-ops">
>             <op id="ECAS-monitor-3s" interval="3s" name="monitor"
>                 timeout="3s"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="FDS_Admin" provider="esri" type="fdsadm">
>           <operations id="FDS_Admin-ops">
>             <op id="FDS_Admin-monitor-3s" interval="3s" name="monitor"
>                 timeout="3s"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" provider="heartbeat" type="MailTo"
>                    id="Emergency_Contact">
>           <instance_attributes id="Emergency_Contact-instance_attributes">
>             <nvpair id="Emergency_Contact-instance_attributes-email"
>                     name="email" value="[email protected]"/>
>             <nvpair id="Emergency_Contact-instance_attributes-subject"
>                     name="subject" value="Failover Occured"/>
>           </instance_attributes>
>           <operations id="Emergency_Contact-ops">
>             <op interval="3s" name="monitor" timeout="3s"
>                 id="Emergency_Contact-monitor-3s"/>
>           </operations>
>         </primitive>
>       </group>
>     </resources>
>     <constraints/>
>   </configuration>
> </cib>
>
> Help.
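[Editor's note] To make the resource-stickiness suggestion concrete, here is a minimal sketch of an nvpair that could be added inside the group's meta_attributes in the cib.xml above. The value 100 is purely illustrative (any positive score makes resources prefer to stay where they are, not a value recommended in the thread), and the hyphenated spelling resource-stickiness is the one commonly used with Pacemaker 1.0:

```xml
<!-- Sketch only: one possible way to add stickiness to the group.
     The score 100 is an illustrative assumption. -->
<nvpair id="Directory_Server-meta_attributes-resource-stickiness"
        name="resource-stickiness" value="100"/>
```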
>
> Regards,
> jerome
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Dominik Klein
> Sent: Monday, January 26, 2009 10:52 PM
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Failover not working as I expected
>
> Jerome Yanga wrote:
>> Andrew,
>>
>> I apologize for sending my previous email abruptly.
>>
>> I have followed your recommendation and installed Pacemaker.
>>
>> Here is my config.
>>
>> Packages Installed:
>> heartbeat-2.99.2-6.1
>> heartbeat-common-2.99.2-6.1
>> heartbeat-debug-2.99.2-6.1
>> heartbeat-ldirectord-2.99.2-6.1
>> heartbeat-resources-2.99.2-6.1
>> libheartbeat2-2.99.2-6.1
>> libpacemaker3-1.0.1-3.1
>> pacemaker-1.0.1-3.1
>> pacemaker-debug-1.0.1-3.1
>> pacemaker-pygui-1.4-11.9
>> pacemaker-pygui-debug-1.4-11.9
>>
>> ha.cf:
>> # Logging
>> debug 1
>> use_logd false
>> logfacility daemon
>>
>> # Misc Options
>> traditional_compression off
>> compression bz2
>> coredumps true
>>
>> # Communications
>> udpport 691
>> bcast eth1 eth0
>> autojoin any
>>
>> # Thresholds (in seconds)
>> keepalive 1
>> warntime 6
>> deadtime 10
>> initdead 15
>>
>> ping 10.50.254.254
>> crm respawn
>> apiauth mgmtd uid=root
>> respawn root /usr/lib/heartbeat/mgmtd -v
>>
>> cib.xml:
>> <cib admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0"
>>      have-quorum="1" epoch="57" dc-uuid="5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e"
>>      num_updates="0" cib-last-written="Mon Jan 26 13:57:32 2009">
>>   <configuration>
>>     <crm_config>
>>       <cluster_property_set id="cib-bootstrap-options">
>>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>>                 value="1.0.1-node: 6fc5ce8302abf145a02891ec41e5a492efbe8efe"/>
>>       </cluster_property_set>
>>     </crm_config>
>>     <nodes>
>>       <node id="5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e" uname="nomen.esri.com"
>>             type="normal">
>>         <instance_attributes id="nodes-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e">
>>           <nvpair id="standby-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e"
>>                   name="standby" value="off"/>
>>         </instance_attributes>
>>       </node>
>>       <node id="27f54ec3-b626-4b4f-b8a6-4ed0b768513c"
>>             uname="rubric.esri.com" type="normal">
>>         <instance_attributes id="nodes-27f54ec3-b626-4b4f-b8a6-4ed0b768513c">
>>           <nvpair id="standby-27f54ec3-b626-4b4f-b8a6-4ed0b768513c"
>>                   name="standby" value="off"/>
>>         </instance_attributes>
>>       </node>
>>     </nodes>
>>     <resources>
>>       <group id="Directory_Server">
>>         <meta_attributes id="Directory_Server-meta_attributes">
>>           <nvpair id="Directory_Server-meta_attributes-collocated"
>>                   name="collocated" value="true"/>
>>           <nvpair id="Directory_Server-meta_attributes-ordered"
>>                   name="ordered" value="true"/>
>>           <nvpair id="Directory_Server-meta_attributes-resource_stickiness"
>>                   name="resource_stickiness" value="100"/>
>>         </meta_attributes>
>>         <primitive class="ocf" id="VIP" provider="heartbeat" type="IPaddr">
>>           <instance_attributes id="VIP-instance_attributes">
>>             <nvpair id="VIP-instance_attributes-ip" name="ip"
>>                     value="10.50.26.250"/>
>>           </instance_attributes>
>>           <operations id="VIP-ops">
>>             <op id="VIP-monitor-5s" interval="5s" name="monitor"
>>                 timeout="5s"/>
>>           </operations>
>>         </primitive>
>>         <primitive class="ocf" id="ECAS" provider="esri" type="ecas">
>>           <operations id="ECAS-ops">
>>             <op id="ECAS-monitor-3s" interval="3s" name="monitor"
>>                 timeout="3s"/>
>>           </operations>
>>           <meta_attributes id="ECAS-meta_attributes">
>>             <nvpair id="ECAS-meta_attributes-target-role" name="target-role"
>>                     value="Started"/>
>>           </meta_attributes>
>>         </primitive>
>>         <primitive class="ocf" id="FDS_Admin" provider="esri" type="fdsadm">
>>           <operations id="FDS_Admin-ops">
>>             <op id="FDS_Admin-monitor-3s" interval="3s" name="monitor"
>>                 timeout="3s"/>
>>           </operations>
>>         </primitive>
>>       </group>
>>     </resources>
>>     <constraints>
>>       <rsc_location id="cli-prefer-Directory_Server" rsc="Directory_Server">
>>         <rule id="cli-prefer-rule-Directory_Server" score="INFINITY"
>>               boolean-op="and">
>>           <expression id="cli-prefer-expr-Directory_Server"
>>                       attribute="#uname" operation="eq"
>>                       value="rubric.esri.com" type="string"/>
>>         </rule>
>>       </rsc_location>
>>       <rsc_location id="cli-prefer-FDS_Admin" rsc="FDS_Admin">
>>         <rule id="cli-prefer-rule-FDS_Admin" score="INFINITY"
>>               boolean-op="and">
>>           <expression id="cli-prefer-expr-FDS_Admin" attribute="#uname"
>>                       operation="eq" value="nomen.esri.com" type="string"/>
>>         </rule>
>>       </rsc_location>
>>     </constraints>
>>   </configuration>
>> </cib>
>>
>> I still have the same issues I had when I only had heartbeat 2.1.3-1.
>> My concerns are still as follows:
>>
>> 01) When a node comes back up after a restart of heartbeat, resources get
>> bounced when it rejoins the cluster.
>
> Well, you have defined rsc_location constraints with a score of
> INFINITY, so that is expected.
>
>> 02) Stopping one resource in a group does not fail over the group to the
>> other node.
>
> Look up migration-threshold.
>
> Regards
> Dominik
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
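[Editor's note] The $MAILCMD check Dominik describes can be scripted. A minimal sketch, assuming heartbeat's .ocf-binaries file defines MAILCMD (the fallback to the common `mail` binary is this sketch's assumption, not something stated in the thread):

```shell
#!/bin/sh
# Sketch of the MAILCMD lookup described above. If heartbeat's
# .ocf-binaries file is readable, source it; otherwise assume "mail".
BINARIES=/usr/lib/ocf/resource.d/heartbeat/.ocf-binaries
[ -r "$BINARIES" ] && . "$BINARIES"
MAILCMD=${MAILCMD:-mail}
echo "MAILCMD resolves to: $MAILCMD"

# Uncomment to actually send the test mail (address taken from the thread):
# echo "some text for the test email" | $MAILCMD -s "failover occurred" [email protected]
```

If the printed mailer sends mail successfully by hand, the MailTo agent should be able to as well; if not, the agent's failure is explained by the missing or broken mailer rather than the cib configuration.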
