As I mentioned in an earlier e-mail, I'm trying to get monit to watch a group of processes so it can start/stop ospfd for an anycast high availability application. However, in doing this I'm seeing some odd behaviour that doesn't match what I expect -- is this a bug?
In the scenario below, why is it ever trying to start ospfd? If apache is down, shouldn't ospfd stay down until apache comes back up or is monitored again after being unmonitored? It does end up in the correct state at the end, but not without restarting and stopping ospfd twice in the meantime.

As an example, I have the following configured:

    check process apache with pidfile /var/run/httpd.pid
        start program = "/etc/init.d/httpd start"
        stop program = "/etc/init.d/httpd stop"
        if failed host localhost port 80 protocol http
            and request "/" then restart
        if 2 restarts within 2 cycles then stop

    check process ospfd with pidfile /var/run/quagga/ospfd.pid
        start program = "/etc/init.d/ospfd start"
        stop program = "/etc/init.d/ospfd stop"
        depends on apache

If I make it so that apache cannot run (by removing execute permissions on /usr/sbin/httpd) and then kill it, I see the following in the monit logs:

    Dec 6 08:47:39 tecate monit[9988]: 'apache' process is not running
    Dec 6 08:47:39 tecate monit[9988]: 'apache' trying to restart
    Dec 6 08:47:39 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
    Dec 6 08:47:39 tecate monit[9988]: 'apache' start: /etc/init.d/httpd
    Dec 6 08:47:40 tecate monit[9988]: 'ospfd' unmonitor on user request
    Dec 6 08:47:40 tecate monit[9988]: monit daemon at 9988 awakened
    Dec 6 08:48:09 tecate monit[9988]: 'apache' failed to start
    Dec 6 08:48:09 tecate monit[9988]: 'ospfd' start: /etc/init.d/ospfd
    Dec 6 08:48:09 tecate monit[9988]: 'ospfd' unmonitor action done
    Dec 6 08:48:09 tecate monit[9988]: Awakened by User defined signal 1
    Dec 6 08:48:09 tecate monit[9988]: 'apache' process is not running
    Dec 6 08:48:09 tecate monit[9988]: 'apache' trying to restart
    Dec 6 08:48:09 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
    Dec 6 08:48:09 tecate monit[9988]: 'apache' start: /etc/init.d/httpd
    Dec 6 08:48:09 tecate monit[9988]: 'ospfd' unmonitor on user request
    Dec 6 08:48:09 tecate monit[9988]: monit daemon at 9988 awakened
    Dec 6 08:48:39 tecate monit[9988]: 'apache' failed to start
    Dec 6 08:48:39 tecate monit[9988]: 'ospfd' start: /etc/init.d/ospfd
    Dec 6 08:48:39 tecate monit[9988]: 'ospfd' unmonitor action done
    Dec 6 08:48:39 tecate monit[9988]: Awakened by User defined signal 1
    Dec 6 08:48:39 tecate monit[9988]: 'apache' service restarted 2 times within 2 cycles(s) - stop
    Dec 6 08:48:39 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
    Dec 6 08:48:39 tecate monit[9988]: 'ospfd' unmonitor on user request
    Dec 6 08:48:39 tecate monit[9988]: monit daemon at 9988 awakened
    Dec 6 08:48:39 tecate monit[9988]: Awakened by User defined signal 1
    Dec 6 08:48:39 tecate monit[9988]: 'ospfd' unmonitor action done

-- 
Dan Rich
http://www.employees.org/~drich/
"Step up to red alert!" "Are you sure, sir? It means changing the bulb in the sign..." - Red Dwarf (BBC)
