Hi everyone. I've about reached the end of my road here trying to get Monit to run, and at this point I'm simply crying 'uncle' and posting for help. I have Googled, read the documentation, and studied examples, all to no avail so far. The daemon runs for the 60-second start delay specified in my monitrc, then goes away. No matter what I've tried, I get the exact same result.
Let me begin by saying I followed this guide here:

  http://www.howtoforge.com/server-monitoring-with-munin-and-monit-on-centos-5.2-p2

I went through the setup for a 64 bit box with CentOS 5 Final. Every step matched what was documented to the 'T'. After doing the SSL certs, the website said "finally, we can start Monit: /etc/init.d/monit start", which I did. It complained my mysqld wasn't in the right path, nor my postfix. I just commented those entries out to come back to them later, and restarted the daemon. It seemed to grab, as a ps aux | grep monit showed it running, and /etc/init.d/monit status confirmed it.

I opened a browser and pointed it to my box with the proper port, but got nothing. Went back to the running processes and found Monit dead. Going through the monit.log, I saw there was an id error, because the folder expected to hold the id wasn't there. I created it, re-ran the daemon, and this time it reported that it wrote a unique id file to the directory I created, and it was once again running. 60 seconds later, it was dead again.

The monit.log revealed nothing out of the ordinary. Here is what a cycle of start -> dead looks like in the log:

  [EST Dec 15 14:11:26] info : monit: generated unique Monit id 99655fc9cc168e531b8d9734cab746b9 and stored to '/var/monit/id'
  [EST Dec 15 14:11:26] info : Starting monit daemon with http interface at [*:2812]
  [EST Dec 15 14:11:26] info : Monit start delay set -- pause for 60s
  [EST Dec 15 14:12:26] info : Starting monit HTTP server at [*:2812]

I then started running the daemon in the foreground with noise, and frankly, if the problem is revealed in there, I don't see it. Here's that:

  $ /usr/bin/monit -d 10 -c /etc/monit.d/monitrc -v -l /var/log/monit.log
  monit: Debug: Adding net allow '{my_home_ip_here}'.
  monit: Debug: Adding credentials for user 'admin'.
  Runtime constants:
   Control file       = /etc/monit.d/monitrc
   Log file           = /var/log/monit.log
   Pid file           = /var/run/monit.pid
   Debug              = True
   Log                = True
   Use syslog         = False
   Is Daemon          = True
   Use process engine = True
   Poll time          = 10 seconds with start delay 0 seconds
   Expect buffer      = 256 bytes
   Mail from          = (not defined)
   Mail subject       = (not defined)
   Mail message       = (not defined)
   Start monit httpd  = True
   httpd bind address = Any/All
   httpd portnumber   = 2812
   httpd signature    = True
   Use ssl encryption = True
   PEM key/cert file  = /var/certs/monit.pem
   Client cert file   = None
   Allow self certs   = False
   httpd auth. style  = Basic Authentication and Host/Net allow list

  The service list contains the following entries:

  Process Name     = proftpd
   Pid file         = /var/run/proftpd.pid
   Monitoring mode  = active
   Start program    = '/etc/init.d/proftpd start' timeout 30 second(s)
   Stop program     = '/etc/init.d/proftpd stop' timeout 30 second(s)
   Existence        = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   Pid              = if changed 1 times within 1 cycle(s) then alert
   Ppid             = if changed 1 times within 1 cycle(s) then alert
   Port             = if failed localhost:21 [FTP via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   Timeout          = If restarted 5 times within 5 cycle(s) then unmonitor

  Process Name     = sshd
   Pid file         = /var/run/sshd.pid
   Monitoring mode  = active
   Start program    = '/etc/init.d/sshd start' timeout 30 second(s)
   Stop program     = '/etc/init.d/sshd stop' timeout 30 second(s)
   Existence        = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   Pid              = if changed 1 times within 1 cycle(s) then alert
   Ppid             = if changed 1 times within 1 cycle(s) then alert
   Port             = if failed localhost:22 [SSH via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   Timeout          = If restarted 5 times within 5 cycle(s) then unmonitor

  Process Name     = apache
   Group            = www
   Pid file         = /var/run/httpd.pid
   Monitoring mode  = active
   Start program    = '/etc/init.d/httpd start' timeout 30 second(s)
   Stop program     = '/etc/init.d/httpd stop' timeout 30 second(s)
   Existence        = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   Pid              = if changed 1 times within 1 cycle(s) then alert
   Ppid             = if changed 1 times within 1 cycle(s) then alert
   Port             = if failed www.ezcommunities.com:80/monit/token [HTTP via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   Load avg. (5min) = if greater than 10.0 8 times within 8 cycle(s) then stop else if succeeded 1 times within 1 cycle(s) then alert
   Children         = if greater than 250 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   CPU usage limit  = if greater than 80.0% 5 times within 5 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
   CPU usage limit  = if greater than 60.0% 2 times within 2 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
   Timeout          = If restarted 3 times within 5 cycle(s) then unmonitor

  System Name      = system_{myexample.site.com}
   Monitoring mode  = active

  -------------------------------------------------------------------------------
  Starting monit daemon with http interface at [*:2812]

monit.log says:

  [EST Dec 16 02:27:04] info : Starting monit daemon with http interface at [*:2812]
  [EST Dec 16 02:27:04] info : Starting monit HTTP server at [*:2812]
  [EST Dec 16 02:27:04] info : monit HTTP server started
  [EST Dec 16 02:27:04] info : 'system_{myexample.site.com}' Monit started

/etc/init.d/monit status says:

  monit dead but pid file exists

For completeness, here is monitrc:

  set daemon 60 with start delay 60
  set logfile /var/log/monit.log
  # set mailserver localhost
  # set mail-format { from: r...@{myexample.site.com} }
  # set alert [email protected]
  set httpd port 2812 and
    SSL ENABLE
    PEMFILE /var/certs/monit.pem
    allow {my_home_ip_here}
    allow admin:test

  check process proftpd with pidfile /var/run/proftpd.pid
    start program = "/etc/init.d/proftpd start"
    stop program = "/etc/init.d/proftpd stop"
    if failed port 21 protocol ftp then restart
    if 5 restarts within 5 cycles then timeout

  check process sshd with pidfile /var/run/sshd.pid
    start program "/etc/init.d/sshd start"
    stop program "/etc/init.d/sshd stop"
    if failed port 22 protocol ssh then restart
    if 5 restarts within 5 cycles then timeout

  # check process mysql with pidfile /var/run/mysqld/mysqld.pid
  #   group database
  #   start program = "/usr/sbin/mysqld start"
  #   stop program = "/usr/sbin/mysqld stop"
  #   if failed host 127.0.0.1 port 3306 then restart
  #   if 5 restarts within 5 cycles then timeout

  check process apache with pidfile /var/run/httpd.pid
    group www
    start program = "/etc/init.d/httpd start"
    stop program = "/etc/init.d/httpd stop"
    if failed host {myexample.site.com} port 80 protocol http
      and request "/monit/token" then restart
    if cpu is greater than 60% for 2 cycles then alert
    if cpu > 80% for 5 cycles then restart
    # if totalmem > 500 MB for 5 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 8 cycles then stop
    if 3 restarts within 5 cycles then timeout

  # check process postfix with pidfile /var/spool/postfix/pid/master.pid
  #   group mail
  #   start program = "/etc/init.d/postfix start"
  #   stop program = "/etc/init.d/postfix stop"
  #   if failed port 25 protocol smtp then restart
  #   if 5 restarts within 5 cycles then timeout

As stated, I'm at a dead-end. I have no idea what to try next, as I've tried everything that I could see from a variety of other trouble posts, but always end up with a dead service after 60 seconds. Help appreciated. = )

- Keith
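
P.S. As a sanity check on the timing, here is a minimal Python sketch (not part of my Monit setup) that measures the gap between the first and last entries of the Dec 15 log excerpt. The helper names stamps and gap_seconds are made up for illustration, and the year is a placeholder assumption because monit.log doesn't record one. It only confirms that the pause matches the "start delay 60" in monitrc, with the last entry being the HTTP server start; it doesn't explain why the daemon dies afterwards.

```python
import re
from datetime import datetime

# Log lines copied verbatim from the Dec 15 monit.log excerpt.
LOG = """\
[EST Dec 15 14:11:26] info : monit: generated unique Monit id 99655fc9cc168e531b8d9734cab746b9 and stored to '/var/monit/id'
[EST Dec 15 14:11:26] info : Starting monit daemon with http interface at [*:2812]
[EST Dec 15 14:11:26] info : Monit start delay set -- pause for 60s
[EST Dec 15 14:12:26] info : Starting monit HTTP server at [*:2812]
"""

# Matches the "[TZ Mon DD HH:MM:SS]" prefix and captures everything after the zone.
STAMP = re.compile(r"^\[\w+ (\w{3} +\d+ \d{2}:\d{2}:\d{2})\]")


def stamps(text):
    """Return one datetime per log line that starts with a timestamp."""
    found = []
    for line in text.splitlines():
        match = STAMP.match(line)
        if match:
            # Assumption: the year is a placeholder, since the log has none.
            found.append(
                datetime.strptime("2000 " + match.group(1), "%Y %b %d %H:%M:%S")
            )
    return found


def gap_seconds(text):
    """Seconds between the first and last timestamped lines."""
    times = stamps(text)
    return (times[-1] - times[0]).total_seconds()


if __name__ == "__main__":
    print(gap_seconds(LOG))  # prints 60.0
```

So the last thing Monit logs in that cycle is the HTTP server start, exactly 60 seconds after launch.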
