[Linux-HA] Problem with LSB init script when monitoring

Darren Mansell Wed, 03 Dec 2008 09:36:12 -0800

Hello everyone.

I am trying to run a 2-node cluster with 1 shared IP for Tomcat. This
works fine until I add a monitor operation to the Tomcat resource, at
which point the CRM keeps trying to restart Tomcat over and over.


Without the monitor operation in the CIB, the CRM does not keep trying to
restart Tomcat, but if I stop Tomcat manually it is not started again
automatically.

I tried the tomcat OCF RA, but it has lots of incorrect values hard-coded
in it, so I edited an init script into what I thought was an
LSB-compatible form. This is the init script:



#!/bin/sh
# description: Start or stop the Tomcat server
#
### BEGIN INIT INFO
# Provides: tomcat
# Required-Start: $network $syslog
# Required-Stop: $network
# Default-Start: 3
# Default-Stop: 0
# Description: Start or stop the Tomcat server
### END INIT INFO

RETVAL=$?
NAME=tomcat
export JRE_HOME=/opt/java
export CATALINA_HOME=/opt/$NAME
export CATALINA_BASE=/opt/$NAME
export JAVA_HOME=/opt/java

check_running() {
        NAME=$1
        LINES=`ps -ef | grep java | grep opt | grep $NAME | grep -v grep | wc -l`
        [ $LINES -gt 0 ] && echo "yes"
}

case "$1" in
'start')
        RUNNING=`check_running $NAME`
        [ "$RUNNING" ] && exit 0
        if [ -f $CATALINA_HOME/bin/startup.sh ];
                then
                        echo $"Starting Tomcat"
                        $CATALINA_HOME/bin/startup.sh
        fi
        ;;
'stop')
        RUNNING=`check_running $NAME`
        [ ! "$RUNNING" ] && exit 0
        if [ -f $CATALINA_HOME/bin/shutdown.sh ];
                then
                        echo $"Stopping Tomcat"
                        $CATALINA_HOME/bin/shutdown.sh
        fi
        ;;
'restart')
        $0 stop
        sleep 15
        $0 start
        ;;
'status')
        RUNNING=`check_running $NAME`
        [ "$RUNNING" ] && exit 0 || exit 1;;
*)
        echo
        echo $"Usage: $0 {start|stop}"
        echo
        exit 1;;

esac
exit $RETVAL
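
For comparison, the LSB specification defines distinct exit codes for the status action: 0 means the program is running and 3 means it is not running, whereas the status branch above exits with 1 when Tomcat is down. A minimal sketch of that mapping follows, assuming check_running keeps printing "yes" or nothing; the helper name lsb_status_code is hypothetical, and whether this exit code is what the cluster is tripping over here is not confirmed.

```shell
#!/bin/sh
# Hypothetical helper: map the output of check_running ("yes" or empty)
# to the exit code the LSB spec defines for the status action.
lsb_status_code() {
    if [ -n "$1" ]; then
        return 0   # LSB: program is running
    else
        return 3   # LSB: program is not running
    fi
}

# Possible use inside the status branch of the script above:
#   'status')
#           lsb_status_code "`check_running $NAME`"
#           exit $?
#           ;;
```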





This is my cib.xml





 <cib generated="true" admin_epoch="0" have_quorum="true" ignore_dtd="false"
 num_peers="2" cib_feature_revision="1.3" crm_feature_set="2.0" epoch="125"
 num_updates="82" cib-last-written="Wed Dec  3 16:45:56 2008" ccm_transition="2"
 dc_uuid="ae4489bf-2c5d-4cfd-bf81-5e25b11932eb">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.3-node: a3184d5240c6e7032aef9cce6e5b7752ded544b3"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="7e9a5233-d24c-441f-9f14-03352172f08b" uname="hs-node2" type="normal"/>
       <node id="ae4489bf-2c5d-4cfd-bf81-5e25b11932eb" uname="hs-node1" type="normal"/>
     </nodes>
     <resources>
       <clone id="tomcat">
         <instance_attributes id="5908d3eb-7d48-4c7d-bcca-9020f8eadc87">
           <attributes>
             <nvpair name="clone_max" value="2" id="19a0d76d-9697-4d19-8990-0f098d299a4f"/>
             <nvpair name="clone_node_max" value="1" id="de765b64-ece4-4c19-9659-13e20b60d9bb"/>
           </attributes>
         </instance_attributes>
         <group id="tomcat_group">
           <primitive id="ip_1" class="ocf" type="IPaddr" provider="heartbeat">
             <instance_attributes id="e79760a4-c715-477a-a4b7-85eab9bf9ae9">
               <attributes>
                 <nvpair name="ip" value="2.21.2.5" id="07540941-f4f8-4bd0-ac78-7d62f212145a"/>
               </attributes>
             </instance_attributes>
           </primitive>
           <primitive id="tomcat_1" class="lsb" type="tomcat" provider="heartbeat">
             <operations>
               <op id="monitor_tomcat" interval="120s" name="monitor" timeout="60s"/>
             </operations>
           </primitive>
         </group>
       </clone>
     </resources>
     <constraints/>
   </configuration>




This is the ha.cf:




udpport 694
autojoin none
crm true
ucast eth0 2.21.2.4
ucast eth0 2.21.2.3
node hs-node1
node hs-node2
respawn root /sbin/evmsd
apiauth evms uid=hacluster,root





This is what crm_mon says:




============
Last updated: Wed Dec  3 17:26:47 2008
Current DC: hs-node1 (ae4489bf-2c5d-4cfd-bf81-5e25b11932eb)
2 Nodes configured.
1 Resources configured.
============

Node: hs-node2 (7e9a5233-d24c-441f-9f14-03352172f08b): online
Node: hs-node1 (ae4489bf-2c5d-4cfd-bf81-5e25b11932eb): online

Clone Set: tomcat
    Resource Group: tomcat_group:0
        ip_1:0  (ocf::heartbeat:IPaddr):        Started hs-node2
        tomcat_1:0      (lsb:tomcat):   Started hs-node2 FAILED
    Resource Group: tomcat_group:1
        ip_1:1  (ocf::heartbeat:IPaddr):        Started hs-node1
        tomcat_1:1      (lsb:tomcat):   Stopped

Failed actions:
    tomcat_1:0_monitor_120000 (node=hs-node2, call=809, rc=7): complete
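
In the failed action above, rc=7 is the OCF code for "not running", so the monitor saw Tomcat as stopped on hs-node2. One way to see what the init script itself reports is to run its actions by hand and print each exit code. The sketch below assumes the script is a plain /bin/sh script; the helper name show_rc and the path /etc/init.d/tomcat are hypothetical. Under LSB, status would be expected to return 0 after a start and 3 after a stop.

```shell
#!/bin/sh
# Hypothetical helper: run one init-script action and print its exit code.
# Usage: show_rc /path/to/script action
show_rc() {
    script=$1
    action=$2
    rc=0
    # Discard the script's own output; only the exit code is of interest.
    sh "$script" "$action" >/dev/null 2>&1 || rc=$?
    echo "$action rc=$rc"
}

# Example sequence (the path is an assumption, adjust as needed):
#   for a in start status stop status; do
#           show_rc /etc/init.d/tomcat "$a"
#   done
```

Because startup.sh and shutdown.sh return before the JVM has fully started or exited, it may be worth pausing a few seconds between actions when checking by hand.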






It was working but suddenly stopped and I have no idea why. If anyone could 
provide any pointers that would be great. I'm using:

SLES 10 SP2
Heartbeat 2.1.3

Thanks

Darren Mansell

