As I mentioned in an earlier e-mail, I'm trying to get monit to watch a
group of processes so that it can start and stop ospfd for an anycast
high-availability application. However, I'm seeing some odd behaviour
that doesn't match what I expect. Is this a bug?

In the scenario below, why does monit ever try to start ospfd? If apache
is down, shouldn't ospfd stay down until apache comes back up, or until
apache is monitored again after being unmonitored? Monit does end up in
the correct state, but only after starting and stopping ospfd twice in
the meantime.

As an example, I have the following configured: 

check process apache with pidfile /var/run/httpd.pid
  start program = "/etc/init.d/httpd start"
  stop program = "/etc/init.d/httpd stop"
  if failed host localhost port 80 protocol http
    and request "/" then restart
  if 2 restarts within 2 cycles then stop

check process ospfd with pidfile /var/run/quagga/ospfd.pid
  start program = "/etc/init.d/ospfd start"
  stop program = "/etc/init.d/ospfd stop"
  depends on apache
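
To be concrete about what I expected from "depends on apache", here is a
minimal Python sketch of the rule as I understand it. This is only my
mental model, not monit's actual implementation; the names DEPENDS_ON
and may_start are made up for illustration.

```python
# Hypothetical model of the dependency rule I expected from the config
# above. This is NOT monit's code; all names here are illustrative.
DEPENDS_ON = {"ospfd": ["apache"]}  # mirrors "depends on apache"


def may_start(service, running):
    """Return True only if every service this one depends on is running.

    `running` maps a service name to True/False; unknown services are
    treated as not running.
    """
    return all(running.get(dep, False) for dep in DEPENDS_ON.get(service, []))


if __name__ == "__main__":
    # With apache down, I would expect ospfd never to be started.
    print(may_start("ospfd", {"apache": False}))
    # Once apache is back, starting ospfd would be fine.
    print(may_start("ospfd", {"apache": True}))
```

Under that model, ospfd should never be started while apache is down.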

If I make it so that apache cannot run (by removing
execute permissions on /usr/sbin/httpd) and then kill it, I see the
following in the monit logs: 

Dec 6 08:47:39 tecate monit[9988]: 'apache' process is not running
Dec 6 08:47:39 tecate monit[9988]: 'apache' trying to restart
Dec 6 08:47:39 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
Dec 6 08:47:39 tecate monit[9988]: 'apache' start: /etc/init.d/httpd
Dec 6 08:47:40 tecate monit[9988]: 'ospfd' unmonitor on user request
Dec 6 08:47:40 tecate monit[9988]: monit daemon at 9988 awakened
Dec 6 08:48:09 tecate monit[9988]: 'apache' failed to start
Dec 6 08:48:09 tecate monit[9988]: 'ospfd' start: /etc/init.d/ospfd
Dec 6 08:48:09 tecate monit[9988]: 'ospfd' unmonitor action done
Dec 6 08:48:09 tecate monit[9988]: Awakened by User defined signal 1
Dec 6 08:48:09 tecate monit[9988]: 'apache' process is not running
Dec 6 08:48:09 tecate monit[9988]: 'apache' trying to restart
Dec 6 08:48:09 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
Dec 6 08:48:09 tecate monit[9988]: 'apache' start: /etc/init.d/httpd
Dec 6 08:48:09 tecate monit[9988]: 'ospfd' unmonitor on user request
Dec 6 08:48:09 tecate monit[9988]: monit daemon at 9988 awakened
Dec 6 08:48:39 tecate monit[9988]: 'apache' failed to start
Dec 6 08:48:39 tecate monit[9988]: 'ospfd' start: /etc/init.d/ospfd
Dec 6 08:48:39 tecate monit[9988]: 'ospfd' unmonitor action done
Dec 6 08:48:39 tecate monit[9988]: Awakened by User defined signal 1
Dec 6 08:48:39 tecate monit[9988]: 'apache' service restarted 2 times within 2 cycles(s) - stop
Dec 6 08:48:39 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
Dec 6 08:48:39 tecate monit[9988]: 'ospfd' unmonitor on user request
Dec 6 08:48:39 tecate monit[9988]: monit daemon at 9988 awakened
Dec 6 08:48:39 tecate monit[9988]: Awakened by User defined signal 1
Dec 6 08:48:39 tecate monit[9988]: 'ospfd' unmonitor action done
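
In case it helps anyone reproduce this, here is a rough Python sketch for
tallying the start/stop actions per service in a log excerpt like the one
above. The regular expression is my own guess based on the lines shown
here, not a documented monit log format, and count_actions is just a
helper name I made up.

```python
import re
from collections import Counter

# Matches action lines such as:
#   ... monit[9988]: 'ospfd' start: /etc/init.d/ospfd
# The pattern is inferred from the excerpt above (an assumption, not a
# documented format). \s+ tolerates a line wrap between name and action.
ACTION = re.compile(r"'(?P<service>[\w.-]+)'\s+(?P<action>start|stop):\s")


def count_actions(log_text):
    """Count (service, action) pairs for start/stop lines in log text."""
    return Counter(
        (m.group("service"), m.group("action"))
        for m in ACTION.finditer(log_text)
    )


if __name__ == "__main__":
    sample = (
        "Dec 6 08:47:39 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd\n"
        "Dec 6 08:48:09 tecate monit[9988]: 'apache' failed to start\n"
        "Dec 6 08:48:09 tecate monit[9988]: 'ospfd' start: /etc/init.d/ospfd\n"
    )
    for (service, action), n in sorted(count_actions(sample).items()):
        print(service, action, n)
```

Lines such as "failed to start" or "trying to restart" are deliberately
not counted, since only the "start:" and "stop:" lines show monit
actually running an init script.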

-- 

Dan Rich
http://www.employees.org/~drich/
 "Step up to red alert!"
 "Are you sure, sir? It means changing the bulb in the sign..."
     - Red Dwarf (BBC)