Hi,

You can try several approaches (I'll list 2 that I'm aware of):

1) Automatic restarts on OutOfMemory errors:
Add the following to CATALINA_OPTS:

-XX:OnOutOfMemoryError=/usr/sbin/restart_tcserver

Then write your own restart_tcserver script (it can also send an e-mail
notification, etc.).
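
A minimal sketch of what such a restart_tcserver could look like. The init
script path, the mail recipient and the DRY_RUN switch are assumptions for
illustration and are not part of the setup described here; adjust them to
your installation.

```shell
#!/bin/bash
# Sketch of a hypothetical /usr/sbin/restart_tcserver.
# Assumed values (not from the original post): the init script path,
# the mail recipient and the DRY_RUN switch.
TOMCAT_INIT=${TOMCAT_INIT:-/etc/init.d/tomcat}
ADMIN_MAIL=${ADMIN_MAIL:-root}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Run a command, or just print it when DRY_RUN=1
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Mail a one-line notification to the administrator
notify() {
    echo "$1" | /bin/mailx -s "$1" "$ADMIN_MAIL"
}

# Notify first, then stop and start Tomcat through its init script
restart_tcserver() {
    run notify "Tomcat on $(uname -n) ran out of memory, restarting"
    run "$TOMCAT_INIT" stop
    run "$TOMCAT_INIT" start
}

restart_tcserver
```

With the default DRY_RUN=1 the script only prints what it would do, which is
handy while testing; change the default to 0 once the commands look right. A
JVM that has run out of memory may not stop cleanly, so a real script may
need to kill the process if the stop step hangs.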

2) This is what I do (please criticise/suggest improvements to this
approach):

I've got 2 servers running Tomcat + Apache httpd, with heartbeat between them.
I run this little script every 15 min. via cron:
--------------------
# cat /srv/scripts/test_live.sh
#!/bin/bash
# Count the running processes of each service (0 means "not running")
SERVICE_HTTPD=$(ps -ef | grep -v grep | grep -c httpd)
SERVICE_TOMCAT=$(ps -ef | grep -v grep | grep -c tomcat)
SERVICE_HEARTBEAT=$(ps -ef | grep -v grep | grep -c heartbeat)
# Request a page from the application itself
SERVICE_STATUS=$(/srv/scripts/check_http.pl -H confluence-server.myorg.com \
        -u /blank.html)

# While testing, please uncomment the following echo statements
if [ "$SERVICE_HTTPD" -ne 0 ] && [ "$SERVICE_TOMCAT" -ne 0 ] &&
   [ "$SERVICE_STATUS" = "Status: OK" ]
        then
#               echo "SERVICE_HTTPD and SERVICE_TOMCAT and SERVICE_STATUS are OK, everything is fine"
                exit
        elif [ "$SERVICE_HEARTBEAT" -ne 0 ]
        then
                # Something is wrong and heartbeat still runs here: fail over
                echo "The following output triggered failover: \
SERVICE_HTTPD=$SERVICE_HTTPD , SERVICE_TOMCAT=$SERVICE_TOMCAT , \
SERVICE_STATUS=$SERVICE_STATUS , failing over to spare server"
                echo "The following output triggered failover: \
SERVICE_HTTPD=$SERVICE_HTTPD , SERVICE_TOMCAT=$SERVICE_TOMCAT , \
SERVICE_STATUS=$SERVICE_STATUS , failing over to a spare server at $(date)" |
/bin/mailx -s "Server $(uname -n) encountered a problem, failing over to a \
spare server at $(date)" lkolchin at gmail dot com
                /etc/init.d/heartbeat stop
        else
#               echo "This server probably failed over to the spare one, nothing to do"
                exit
fi
---------------------

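If Tomcat and Apache are running and the application is responsive
($SERVICE_STATUS), do nothing. If at least one of those conditions is not
true and heartbeat is still running on this server, stop heartbeat so that
the service fails over to a spare server.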
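
The three-way decision above can be sketched as a small function that takes
the counts and the status string as arguments, which makes it easy to try
out without touching the real services. The function name and the labels it
prints (ok, failover, idle) are made up for this illustration.

```shell
#!/bin/bash
# decide HTTPD_COUNT TOMCAT_COUNT HEARTBEAT_COUNT STATUS
# Prints one of: ok, failover, idle (hypothetical labels).
decide() {
    local httpd=$1 tomcat=$2 heartbeat=$3 status=$4
    if [ "$httpd" -ne 0 ] && [ "$tomcat" -ne 0 ] &&
       [ "$status" = "Status: OK" ]; then
        echo ok          # everything is fine, nothing to do
    elif [ "$heartbeat" -ne 0 ]; then
        echo failover    # something is wrong and heartbeat still runs here
    else
        echo idle        # heartbeat is already stopped, nothing to do
    fi
}

decide 3 1 1 "Status: OK"     # prints: ok
decide 3 0 1 "Status: OK"     # prints: failover
decide 3 1 1 ""               # prints: failover
decide 0 0 0 ""               # prints: idle
```

Keeping the decision separate from the process counting and the mail step
lets each case be checked by hand before the script goes into cron.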

check_http.pl is a Perl script (from the Nagios plugins, I believe):
## check_http.pl
## Copyright (c) 2008, Oliver Wittenburg  <oli...@wiburg.de>
##
## This program is free software: you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
.......
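
If check_http.pl is not at hand, a comparable check can be sketched with
curl. This is a stand-in under assumptions, not the Perl script's logic:
only the "Status: OK" text is taken from the test script above, while the
function names and the failure text are hypothetical.

```shell
#!/bin/bash
# Map an HTTP status code to a one-line result.
status_line() {
    if [ "$1" = "200" ]; then
        echo "Status: OK"
    else
        echo "Status: FAILED (HTTP $1)"   # hypothetical failure text
    fi
}

# check_http HOST PATH: request http://HOST/PATH and print the result line.
# curl reports the code 000 when it cannot connect or times out.
check_http() {
    local code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "http://$1$2")
    status_line "$code"
}

# Example use in the test script:
# SERVICE_STATUS=$(check_http confluence-server.myorg.com /blank.html)
```

The 10-second timeout is an arbitrary choice; it only has to be short enough
that a hung application is reported as a failure well within the cron
interval.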


Cheers,
Leon Kolchinsky



On Thu, Sep 23, 2010 at 04:30, Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Shashank,
>
> On 9/22/2010 8:30 AM, Mendiratta, Shashank wrote:
> > Thanks. About that: outbound port 80 is blocked here, so we cannot
> > wget. Moreover, this won't solve the problem of why the services
> > are getting hung.
>
> Hmm. Can you monitor from the server itself? That's not unusual to do.
> Also, connections to localhost:80 usually work even when software-based
> firewalls are in place, since the local host is usually considered trusted.
>
> > Well, I had an idea; please critique it. Why not monitor the server.log
> > file? If we get some kind of error, we send an alert and then restart the
> > service. Before that, we have to build a repository of the types of error
> > that can occur.
>
> We have one particularly poorly-written webapp that has a habit of
> running out of memory. We have segregated it into its own Tomcat
> instance and actually do scan the log file for errors in the way you
> describe.
>
> The script is essentially this:
>
> grep -m 1 OutOfMemoryError "${LOGFILE}" > /dev/null
>
> if [ "$?" = "0" ] ; then
>
>   : # notify an administrator
>
> fi
>
> It's not particularly elegant, but it gets the job done.
>
> - -chris