Check out the "Is This init Script LSB Compatible?" appendix in
http://clusterlabs.org/mw/Image:Configuration_Explained.pdf

Until those tests pass, there is no point using it in the cluster - it
will only blow up.
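
For reference, the main checks in that appendix boil down to a short
sequence of calls and the exit code each one should give. Here is a rough
sketch, assuming a POSIX shell. lsb_check and expect_rc are made-up helper
names for illustration, not part of Heartbeat or Pacemaker. It really does
start and stop the service, so run it while the cluster is not managing
the resource.

```shell
#!/bin/sh
# Sketch only: lsb_check and expect_rc are hypothetical helper names.

# expect_rc WANT LABEL CMD...: run CMD and compare its exit code with WANT.
expect_rc() {
    want=$1; label=$2; shift 2
    "$@" >/dev/null 2>&1
    got=$?
    if [ "$got" -eq "$want" ]; then
        echo "PASS: $label (rc=$got)"
        return 0
    fi
    echo "FAIL: $label (rc=$got, expected $want)"
    return 1
}

# lsb_check SCRIPT: run the start/status/stop sequence against SCRIPT.
# Returns 0 only if every step gave the expected exit code.
lsb_check() {
    script=$1
    failures=0
    expect_rc 0 "start"                "$script" start  || failures=$((failures + 1))
    expect_rc 0 "status while running" "$script" status || failures=$((failures + 1))
    expect_rc 0 "start while running"  "$script" start  || failures=$((failures + 1))
    expect_rc 0 "stop"                 "$script" stop   || failures=$((failures + 1))
    expect_rc 3 "status while stopped" "$script" status || failures=$((failures + 1))
    expect_rc 0 "stop while stopped"   "$script" stop   || failures=$((failures + 1))
    [ "$failures" -eq 0 ]
}
```

Usage would be something like "lsb_check /etc/init.d/tomcat" (the path is
an assumption). Pay attention to the fifth check: LSB wants status to exit
3 for a service that is cleanly stopped, so a status branch that exits 1
when nothing is running would fail there.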

On Wed, Dec 3, 2008 at 18:09, Darren Mansell
<[EMAIL PROTECTED]> wrote:
> Hello everyone.
>
> I am trying to run a 2 node cluster with 1 shared IP for Tomcat. This
> works fine until I set the monitor operation inside the Tomcat resource
> where the CRM keeps trying to restart Tomcat over and over infinitely.
>
> Without the monitor operation in the CIB it won't keep trying to restart
> Tomcat but if I stop it manually it doesn't automatically get started
> again.
>
> I tried the tomcat OCF RA but there are lots of incorrect values hard
> coded in so I edited up an init script to what I thought was LSB
> compatible. This is the init script:
>
> #!/bin/sh
> # description: Start or stop the Tomcat server
> #
> ### BEGIN INIT INFO
> # Provides: tomcat
> # Required-Start: $network $syslog
> # Required-Stop: $network
> # Default-Start: 3
> # Default-Stop: 0
> # Description: Start or stop the Tomcat server
> ### END INIT INFO
>
> RETVAL=$?
> NAME=tomcat
> export JRE_HOME=/opt/java
> export CATALINA_HOME=/opt/$NAME
> export CATALINA_BASE=/opt/$NAME
> export JAVA_HOME=/opt/java
>
> check_running() {
>        NAME=$1
>        LINES=`ps -ef | grep java | grep opt | grep $NAME | grep -v grep | wc -l`
>        [ $LINES -gt 0 ] && echo "yes"
> }
>
> case "$1" in
> 'start')
>        RUNNING=`check_running $NAME`
>        [ "$RUNNING" ] && exit 0
>        if [ -f $CATALINA_HOME/bin/startup.sh ];
>                then
>                        echo $"Starting Tomcat"
>                        $CATALINA_HOME/bin/startup.sh
>        fi
>        ;;
> 'stop')
>        RUNNING=`check_running $NAME`
>        [ ! "$RUNNING" ] && exit 0
>        if [ -f $CATALINA_HOME/bin/shutdown.sh ];
>                then
>                        echo $"Stopping Tomcat"
>                        $CATALINA_HOME/bin/shutdown.sh
>        fi
>        ;;
> 'restart')
>        $0 stop
>        sleep 15
>        $0 start
>        ;;
> 'status')
>        RUNNING=`check_running $NAME`
>        [ "$RUNNING" ] && exit 0 || exit 1;;
> *)
>        echo
>        echo $"Usage: $0 {start|stop}"
>        echo
>        exit 1;;
>
> esac
> exit $RETVAL
>
> This is my cib.xml
>
>  <cib generated="true" admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="2" cib_feature_revision="1.3" crm_feature_set="2.0" epoch="125" num_updates="82" cib-last-written="Wed Dec  3 16:45:56 2008" ccm_transition="2" dc_uuid="ae4489bf-2c5d-4cfd-bf81-5e25b11932eb">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <attributes>
>           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.3-node: a3184d5240c6e7032aef9cce6e5b7752ded544b3"/>
>         </attributes>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="7e9a5233-d24c-441f-9f14-03352172f08b" uname="hs-node2" type="normal"/>
>       <node id="ae4489bf-2c5d-4cfd-bf81-5e25b11932eb" uname="hs-node1" type="normal"/>
>     </nodes>
>     <resources>
>       <clone id="tomcat">
>         <instance_attributes id="5908d3eb-7d48-4c7d-bcca-9020f8eadc87">
>           <attributes>
>             <nvpair name="clone_max" value="2" id="19a0d76d-9697-4d19-8990-0f098d299a4f"/>
>             <nvpair name="clone_node_max" value="1" id="de765b64-ece4-4c19-9659-13e20b60d9bb"/>
>           </attributes>
>         </instance_attributes>
>         <group id="tomcat_group">
>           <primitive id="ip_1" class="ocf" type="IPaddr" provider="heartbeat">
>             <instance_attributes id="e79760a4-c715-477a-a4b7-85eab9bf9ae9">
>               <attributes>
>                 <nvpair name="ip" value="2.21.2.5" id="07540941-f4f8-4bd0-ac78-7d62f212145a"/>
>               </attributes>
>             </instance_attributes>
>           </primitive>
>           <primitive id="tomcat_1" class="lsb" type="tomcat" provider="heartbeat">
>             <operations>
>               <op id="monitor_tomcat" interval="120s" name="monitor" timeout="60s"/>
>             </operations>
>           </primitive>
>         </group>
>       </clone>
>     </resources>
>     <constraints/>
>   </configuration>
>
> This is the ha.cf:
>
> udpport 694
> autojoin none
> crm true
> ucast eth0 2.21.2.4
> ucast eth0 2.21.2.3
> node hs-node1
> node hs-node2
> respawn root /sbin/evmsd
> apiauth evms uid=hacluster,root
>
> This is what crm_mon says:
>
> ============
> Last updated: Wed Dec  3 17:26:47 2008
> Current DC: hs-node1 (ae4489bf-2c5d-4cfd-bf81-5e25b11932eb)
> 2 Nodes configured.
> 1 Resources configured.
> ============
>
> Node: hs-node2 (7e9a5233-d24c-441f-9f14-03352172f08b): online
> Node: hs-node1 (ae4489bf-2c5d-4cfd-bf81-5e25b11932eb): online
>
> Clone Set: tomcat
>    Resource Group: tomcat_group:0
>        ip_1:0  (ocf::heartbeat:IPaddr):        Started hs-node2
>        tomcat_1:0      (lsb:tomcat):   Started hs-node2 FAILED
>    Resource Group: tomcat_group:1
>        ip_1:1  (ocf::heartbeat:IPaddr):        Started hs-node1
>        tomcat_1:1      (lsb:tomcat):   Stopped
>
> Failed actions:
>    tomcat_1:0_monitor_120000 (node=hs-node2, call=809, rc=7): complete
>
> It was working but suddenly stopped and I have no idea why. If anyone could 
> provide any pointers that would be great. I'm using:
>
> SLES 10 SP2
> Heartbeat 2.1.3
>
> Thanks
>
> Darren Mansell
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>