Dominik Klein wrote:
> With a failure stickiness of -30, you allow your groups resources to
> fail (400/30)=14 times. Is that what you want?

Although the default failure stickiness is -30, the group itself has a failure stickiness of -100. I would like it to fail over after 3 or 4 failures; my test with 15 stop commands was just to be sure.
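If I follow your arithmetic, the threshold is roughly the total positive score divided by the absolute failure stickiness, so with the group's -100 it should land near my target (the 400 total below is assumed from your (400/30) calculation, not measured from my cluster):

```shell
# Rough failover threshold, following the (400/30)=14 style of arithmetic:
#   failures tolerated ~= total positive score / |failure stickiness|
# TOTAL_SCORE=400 is an assumption carried over from that calculation.
TOTAL_SCORE=400
FAIL_STICKINESS=100   # |-100| for the group
echo $((TOTAL_SCORE / FAIL_STICKINESS))   # prints 4
```

which would give failover after about 4 failures, matching what I want.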

> You don't have any monitor operations for the ipaddr and jboss
> resources. Failures on them are not detected. Configure monitor
> operations and try again.

I actually do have monitor operations on both; I accidentally sent out an old cib.xml, so the updated file is attached. The previously posted showscores.sh output is correct for this cib.xml. It behaves as described in my last e-mail even with monitor operations on IPaddr2 and jboss.

> Also make sure you use a recent version. Otherwise you may also hit the
> bug of not increasing failcount in 2.1.3's crm. This is fixed in
> pacemaker (0.6.x)

Uh oh. I definitely have 2.1.3 with the crm_failcount bug, but I didn't think it would affect score calculation. I didn't install a pacemaker package; I used the CentOS4 extras RPMs. I hope CentOS4 / RHEL4 packages can be released. I could not rebuild the RHEL5 packages from the openSUSE ha-clustering repository due to this:

configure:3065: gcc -c  -O2 -g  conftest.c >&5
conftest.c:2: error: syntax error before "me"
configure:3071: $? = 1
configure: failed program was:
| #ifndef __cplusplus
|    choke me
| #endif

Should I seek an alternative to these CentOS 4 extras RPMs?

> pps. where did you get the jboss RA? I'd be interested in it.

http://rgm.nu/jbossocf

I hacked it together and it's ugly. I hope it's useful to you, although it's tailored to a very old JBoss release (3.0.8), with some customizations to support multiple instances of JBoss on different ports. It relies on ps, awk, egrep, and curl, and has only been tested on RHEL4. You'll want to change the HTTPCODE check to use a URL for your servlet (or modify it to use the jmx-console).
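The core of the status check is just curl's `%{http_code}`; a minimal sketch of that idea (the function name and URL are illustrative, not the actual script):

```shell
# Minimal sketch of an HTTP-status liveness probe, in the spirit of the
# RA's HTTPCODE check. check_jboss and the example URL are illustrative.
check_jboss() {
    # Fetch only the HTTP status code; treat 200 as "running".
    code=$(curl -s -o /dev/null -w '%{http_code}' "$1")
    if [ "$code" = "200" ]; then
        echo "running"
    else
        echo "failed ($code)"
    fi
}

# Example: check_jboss "http://localhost:8080/myservlet"
```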


Regards,

Roland


 <cib admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="0" 
cib_feature_revision="1.3" generated="false" num_updates="1" epoch="50" 
cib-last-written="Tue Mar 25 13:48:27 2008" ccm_transition="1">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="id-no-quorum-policy" name="no-quorum-policy" 
value="ignore"/>
           <nvpair id="cib-bootstrap-options-default-resource-stickiness" 
name="default-resource-stickiness" value="100"/>
           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" 
value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
           <nvpair id="cib-bootstrap-options-last-lrm-refresh" 
name="last-lrm-refresh" value="1206118390"/>
           <nvpair 
id="cib-bootstrap-options-default-resource-failure-stickiness" 
name="default-resource-failure-stickiness" value="-30"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node uname="slinkfail" type="normal" 
id="9b8c9849-b713-401b-86f9-c7a0402a4658">
         <instance_attributes id="nodes-9b8c9849-b713-401b-86f9-c7a0402a4658">
           <attributes>
             <nvpair name="standby" 
id="standby-9b8c9849-b713-401b-86f9-c7a0402a4658" value="off"/>
           </attributes>
         </instance_attributes>
       </node>
       <node id="cb25eedb-6f51-4c75-b137-ec375e253890" uname="slinkmaster" 
type="normal">
         <instance_attributes id="nodes-cb25eedb-6f51-4c75-b137-ec375e253890">
           <attributes>
             <nvpair id="standby-cb25eedb-6f51-4c75-b137-ec375e253890" 
name="standby" value="off"/>
           </attributes>
         </instance_attributes>
       </node>
     </nodes>
     <resources>
       <group id="MyGroup" collocated="true" ordered="true">
         <primitive id="slink_ipaddr2" class="ocf" type="IPaddr2" 
provider="heartbeat">
           <instance_attributes id="slink_ipaddr2_instance_attrs">
             <attributes>
               <nvpair id="74461f56-ba60-47f2-a767-ffd114562363" name="ip" 
value="192.168.1.222"/>
             </attributes>
           </instance_attributes>
           <operations>
             <op id="46cb04c6-a824-4e67-b514-a9bf8fce4525" name="monitor" 
interval="30s" timeout="20s" start_delay="5s"/>
           </operations>
         </primitive>
         <primitive id="slink_db" class="ocf" type="pgsql" provider="heartbeat">
           <meta_attributes id="slink_db_meta_attrs">
             <attributes/>
           </meta_attributes>
           <operations>
             <op id="45d14088-f223-4262-b309-713b0c850e77" name="monitor" 
interval="30" timeout="30" start_delay="10" disabled="false" role="Started"/>
           </operations>
         </primitive>
         <primitive id="slink_jboss" class="ocf" type="jbossocf" 
provider="enexity">
           <instance_attributes id="slink_jboss_instance_attrs">
             <attributes/>
           </instance_attributes>
           <meta_attributes id="slink_jboss_meta_attrs">
             <attributes/>
           </meta_attributes>
           <operations>
             <op id="7c63f880-8b00-4453-87a4-bf722a1bb95f" name="monitor" 
interval="30" timeout="20" start_delay="1m"/>
           </operations>
         </primitive>
         <meta_attributes id="MyGroup_meta_attrs">
           <attributes>
             <nvpair id="MyGroup_metaattr_resource_failure_stickiness" 
name="resource_failure_stickiness" value="-100"/>
           </attributes>
         </meta_attributes>
       </group>
     </resources>
     <constraints>
       <rsc_location id="run_MyGroup_group" rsc="MyGroup">
         <rule id="pref_run_MyGroup_group" score="100">
           <expression id="keep_group_on_master" attribute="#uname" 
operation="eq" value="slinkmaster"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
 </cib>
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems