Re: [monit] stop on failed service

Martin Pala Mon, 21 Jan 2008 19:35:01 -0800

When the monitored service type is process, monit checks whether the process is running. If it is not running, monit skips the stop action, since the current logic considers it unnecessary.

In your case you can either fix the startup script (to clean up before start) or do it inline, something like:

start program = "/bin/bash -c '/bin/rm -f /var/run/myprocess.pid && /usr/bin/python myprocess.py'"
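
For the other option, fixing the startup script, a minimal sketch of a wrapper might look like the following. The function name remove_stale_pidfile and the "start" argument are made up for illustration; the pid file path and the python command are the ones from the configuration in this thread. It only deletes the pid file when no live process owns the recorded pid, which is a bit safer than an unconditional rm.

```shell
#!/bin/sh
# Hypothetical wrapper start script (a sketch, not part of monit).
# The pid file path is taken from the config in this thread; override via PIDFILE.
PIDFILE="${PIDFILE:-/var/run/myprocess.pid}"

# Remove the pid file only when it is stale, i.e. the recorded pid
# does not belong to a process we can signal.
# Assumption: the script runs with enough privileges (monit usually
# runs as root) for `kill -0` to see the process.
remove_stale_pidfile() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 0          # nothing to clean up
    pid=$(cat "$pidfile" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running, leaving $pidfile in place" >&2
        return 1
    fi
    rm -f "$pidfile"
}

# Clean up and start only when called with "start", so the function
# above can also be sourced on its own.
if [ "${1:-}" = "start" ]; then
    remove_stale_pidfile "$PIDFILE" && exec /usr/bin/python myprocess.py
fi
```

Saved under some path of your choosing (say /usr/local/bin/start-myprocess.sh, a hypothetical name), it would be called from monit as start program = "/usr/local/bin/start-myprocess.sh start".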



Martin


On Jan 21, 2008, at 6:11 AM, Peter Baumgartner wrote:

Monit had its first opportunity to fix a failed service for me last
night. The problem was it never ran my stop script to remove the stale
pid file causing it to fail to start. Is there something I can change
to force it to run the stop script even when it finds the process not
running? Here's what my syslog captured:

Jan 21 06:30:59 server monit[5653]: 'myprocess' process is not running
Jan 21 06:30:59 server monit[5653]: 'myprocess' trying to restart
Jan 21 06:30:59 server monit[5653]: 'myprocess' start: /usr/bin/python
Jan 21 06:30:59 server monit[5653]: 'myprocess' failed to start

And here is my configuration:

check process myprocess with pidfile /var/run/myprocess.pid
    start program = "/usr/bin/python myprocess.py"
    stop program = "/usr/local/bin/killserver.sh myprocess"
    if totalmem is greater than 60.0 MB for 5 cycles then restart  # eating up memory?
    if cpu is greater than 50% for 2 cycles then alert             # send an email to admin
    if cpu is greater than 80% for 3 cycles then restart           # hung process?
    if loadavg(5min) greater than 10 for 8 cycles then restart     # bad, bad, bad
    if 3 restarts within 5 cycles then timeout                     # something is wrong, call the sys-admin
    if failed port 9060 protocol http                              # check for response
        with timeout 10 seconds
        then restart
    group mygroup


--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general



