monit 4.10.1 just died with an assertion failure:

...
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110 aborting..

System details:

* CentOS 4.5
  Linux localhost.localdomain 2.6.9-55.0.2.plus.c4 #1 Fri Jul 6 05:04:29 EDT 2007 i686 i686 i386 GNU/Linux
* monit 4.10.1 built as an RPM, within a chroot environment (mach) on
  another host. Spec file taken from
  http://dag.wieers.com/rpm/packages/monit/monit.spec
  (just changed 4.9 to 4.10.1)

What I was doing: I had set up a dependency between a config file
(/etc/pen.d/testapp.conf) and a process, then I modified the config file by
adding a blank line, to see if monit would restart the process. It appears
that it started to do so, then died :-(

My full configs are attached below - in particular see
/etc/monit.d/testapp.monitrc

I'm not sure that what I was doing was valid (having a 'restart' action
within a file check, and then a process check dependent on the file check).
So it's possible this is a case of operator error. However I still wouldn't
have expected monit to die.

In case it's relevant, I should add that the checks testapp_mongrel_1 and
testapp_mongrel_2 are intentionally failing, because the processes which
they are trying to start have not yet been installed on the target box.

Here is a fuller log extract:

...
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' process is not running
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' trying to restart
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' start: /usr/bin/mongrel_rails
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' process is not running
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' trying to restart
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' start: /usr/bin/mongrel_rails
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_2' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110 aborting..

The bug appears to be repeatable - I tried restarting monit and changing
that config file, and I get the same crash.

Regards,

Brian Candler.

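For anyone trying to reproduce this, the two checks involved can probably be cut down to the fragment below. This is only a sketch reduced from /etc/monit.d/testapp.monitrc as attached; the resource limits and group lines are dropped on the assumption that they are unrelated, and it has not been verified that the reduced form on its own still triggers the abort.

```
# Process check that depends on the file check below
check process testapp_pen with pidfile /var/run/pen/testapp.pid
  start program = "/usr/bin/pen -F /etc/pen.d/testapp.conf -u nobody -p /var/run/pen/testapp.pid -C 127.0.0.1:9999 127.0.0.1:10000"
  stop program = "/bin/bash -c 'kill -s SIGTERM `cat /var/run/pen/testapp.pid`'"
  depends on testapp_pen.conf

# File check with a 'restart' action (the part that may not be valid)
check file testapp_pen.conf with path /etc/pen.d/testapp.conf
  if changed checksum then restart
```

With these checks loaded, the trigger is the one described above: append a blank line to /etc/pen.d/testapp.conf and wait for the next poll cycle (30 seconds with "set daemon 30").
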
# cat /etc/monit.conf
set daemon 30
set logfile syslog facility log_daemon
set mailserver localhost
set mail-format {from:[EMAIL PROTECTED]
set alert [EMAIL PROTECTED] only on { timeout, nonexist }
set httpd port 2812
  allow localhost
  allow X.X.X.0/255.255.252.0
include /etc/monit.d/*

# head -100 /etc/monit.d/*
==> /etc/monit.d/apache.monitrc <==
check process apache with pidfile "/var/run/httpd.pid"
  start program = "/etc/init.d/httpd start"
  stop program = "/etc/init.d/httpd stop"
  if 2 restarts within 3 cycles then timeout
  if totalmem > 100 Mb then alert
  if children > 255 for 5 cycles then stop
  if cpu usage > 95% for 3 cycles then restart
  #if failed port 80 protocol http then restart
  group server
  depends on httpd.conf, httpd.conf.d

check file httpd.conf with path /etc/httpd/conf/httpd.conf
  # Reload apache if the httpd.conf file was changed
  if changed checksum then exec "/etc/init.d/httpd graceful"

check directory httpd.conf.d with path /etc/httpd/conf.d
  if changed timestamp then exec "/etc/init.d/httpd graceful"

==> /etc/monit.d/memcached.monitrc <==
check process memcached with pidfile /var/run/memcached/memcached.pid
  start program = "/etc/init.d/memcached start"
  stop program = "/etc/init.d/memcached stop"
  if cpu is greater than 80% for 4 cycles then restart

==> /etc/monit.d/testapp.monitrc <==
check process testapp_pen with pidfile /var/run/pen/testapp.pid
  start program = "/usr/bin/pen -F /etc/pen.d/testapp.conf -u nobody -p /var/run/pen/testapp.pid -C 127.0.0.1:9999 127.0.0.1:10000"
  stop program = "/bin/bash -c 'kill -s SIGTERM `cat /var/run/pen/testapp.pid`'"
  if totalmem is greater than 10.0 MB for 2 cycles then restart
  if cpu is greater than 50% for 2 cycles then restart
  if 2 restarts within 3 cycles then timeout
  depends on testapp_pen.conf
  group testapp

check file testapp_pen.conf with path /etc/pen.d/testapp.conf
  if changed checksum then restart

check process testapp_mongrel_1 with pidfile /u/apps/testapp/shared/tmp/pids/mongrel.10001.pid
  start program = "/usr/bin/mongrel_rails cluster::start --clean -C /u/apps/testapp/current/config/mongrel_cluster.yml --only 10001"
  stop program = "/usr/bin/mongrel_rails cluster::stop -C /u/apps/testapp/current/config/mongrel_cluster.yml --only 10001"
  if totalmem is greater than 110.0 MB for 4 cycles then restart
  if cpu is greater than 80% for 4 cycles then restart
  if 10 restarts within 10 cycles then timeout
  group testapp

check process testapp_mongrel_2 with pidfile /u/apps/testapp/shared/tmp/pids/mongrel.10002.pid
  start program = "/usr/bin/mongrel_rails cluster::start --clean -C /u/apps/testapp/current/config/mongrel_cluster.yml --only 10002"
  stop program = "/usr/bin/mongrel_rails cluster::stop -C /u/apps/testapp/current/config/mongrel_cluster.yml --only 10002"
  if totalmem is greater than 110.0 MB for 4 cycles then restart
  if cpu is greater than 80% for 4 cycles then restart
  if 10 restarts within 10 cycles then timeout
  group testapp

_______________________________________________
monit-dev mailing list
monit-dev@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monit-dev