Hello All,

OK, I have read some of the documentation and am still unclear on the best way to handle this. I will use my DNS monitoring as my sample:

#-----< Bind >-----
check process named with pidfile /var/named/chroot/var/run/named/named.pid
    alert [email protected] only on { timeout, nonexist }
    start program = "/etc/rc.d/init.d/named start"
    stop program = "/etc/rc.d/init.d/named stop"
    if failed host 127.0.0.1 port 53 type tcp protocol dns then alert
    alert [email protected] only on { timeout, nonexist }
    if failed host 127.0.0.1 port 53 type udp protocol dns then alert
    alert [email protected] only on { timeout, nonexist }
    if 5 restarts within 5 cycles then timeout

I'm not sure of the best/right way to make this work, so here is what I'm trying to do:

1. I want to test whether bind is running, of course. Since bind may be restarted by scripts when we update information, I really don't need to know when the PID changes (way too many notifications that way), so I added "only on { timeout, nonexist }". I am testing 3 things: the process is running, and the 2 ports are responding. That seems good, but am I missing any other test I should be doing?

2. I want it to confirm that the service has actually been down for a minute, or some other length of time that I can control. What is the best way to do this? If it fails for 1 minute, it's really down; if it fails for 20 seconds, it may just be busy, so don't go trying to restart processes and such.

3. In this example, will it only restart on a process issue? What if there is a port issue? In the above example, do start and stop execute no matter what the failure is?

I have the above configured, but if I restart DNS I still get notifications:

    Date: Tue, 15 Dec 2009 01:54:25 -0500
    Action: Alert
    Host: myserver.net
    Description: process PID changed to 15258
    PID changed Service named

Once I can figure out the logic and formatting of this language per se, I'm hoping to clean up all my notifications and utilize monit a little better.

Thanks!
Jack
