Hi,

the "if failed host ..." test is skipped because the first check for a process service is its existence. If the process does not exist, Monit assumes it makes no sense to test the process' port and immediately tries to restart it using the start/stop/restart programs.
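To illustrate, here is a minimal sketch based on the configuration quoted below. An explicit "if does not exist" rule replaces the implicit restart action, so the exec also runs when the process is gone. The start/stop paths in the comments are assumptions about a FreeBSD rc script, not something taken from your setup.

```
check process haproxy with pidfile /var/run/haproxy.pid
    # Explicit existence rule: overrides the implicit
    # "if does not exist then restart" shown in the service list.
    if does not exist
        then exec "/bin/sh -c '/bin/echo `/bin/date` >> /tmp/monit.test'"
    # The port test is only evaluated while the process exists.
    if failed host 192.168.40.72 port 9090 type tcp
        then exec "/bin/sh -c '/bin/echo `/bin/date` >> /tmp/monit.test'"
    # Alternative: keep the default restart action and define the methods.
    # (hypothetical paths, adjust to the local rc script)
    # start program = "/usr/local/etc/rc.d/haproxy start"
    # stop program  = "/usr/local/etc/rc.d/haproxy stop"
```

Running "monit -t" should confirm the control file syntax before you reload.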
You need to add the start/stop programs or use the "if does not exist then ..." statement to override the default restart action.

Regards,
Martin

> On 11 Dec 2015, at 10:19, Gerald Weber <[email protected]> wrote:
>
> Hi,
>
> using Monit 5.15 on FreeBSD 10.2:
>
> <config>
> set daemon 5
> set logfile syslog
> set pidfile /var/run/monit.pid
> set idfile /var/.monit.id
> set statefile /var/.monit.state
> set alert [email protected]
> set mailserver localhost
> set httpd port 2812 and
>     use address 192.168.40.72
>     allow 192.168.20.0/24
>     allow admin:monit
>
> check process haproxy with pidfile /var/run/haproxy.pid
>     if failed host 192.168.40.72 port 9090 type tcp
>     then exec "/bin/sh -c '/bin/echo `/bin/date` >> /tmp/monit.test'"
> </config>
>
> When i run monit with -vI and i kill haproxy, i have the following output:
>
> <log>
> Adding net allow '192.168.20.0/24'
> Adding credentials for user 'admin'
> Runtime constants:
> Control file = /usr/local/etc/monitrc
> Log file = syslog
> Pid file = /var/run/monit.pid
> Id file = /var/.monit.id
> State file = /var/.monit.state
> Debug = True
> Log = True
> Use syslog = True
> Is Daemon = True
> Use process engine = True
> Poll time = 5 seconds with start delay 0 seconds
> Expect buffer = 256 bytes
> Mail server(s) = localhost:25 with timeout 30 seconds
> Mail from = (not defined)
> Mail subject = (not defined)
> Mail message = (not defined)
> Start monit httpd = True
> httpd bind address = 192.168.40.72
> httpd portnumber = 2812
> httpd ssl = Disabled
> httpd signature = Enabled
> httpd auth. style = Basic Authentication and Host/Net allow list
> Alert mail to = root@localhost
> Alert on = All events
>
> The service list contains the following entries:
>
> Process Name = haproxy
> Pid file = /var/run/haproxy.pid
> Monitoring mode = active
> Existence = if does not exist then restart
> Port = if failed [192.168.40.72]:9090 type TCP/IP protocol DEFAULT with timeout 5 seconds then exec '/bin/sh -c /bin/echo `/bin/date` >> /tmp/monit.test'
>
> System Name = appsrv01
> Monitoring mode = active
>
> -------------------------------------------------------------------------------
> pidfile '/var/run/monit.pid' does not exist
> Starting Monit 5.15 daemon with http interface at [192.168.40.72]:2812
> Starting Monit HTTP server at [192.168.40.72]:2812
> Monit HTTP server started
> 'appsrv01' Monit 5.15 started
> Sending Monit instance changed notification to root@localhost
> 'haproxy' process is running with pid 42999
> 'haproxy' zombie check succeeded
> 'haproxy' succeeded testing protocol [DEFAULT] at [192.168.40.72]:9090 [TCP/IP]
> 'haproxy' connection succeeded to [192.168.40.72]:9090 [TCP/IP]
> 'haproxy' process is running with pid 42999
> 'haproxy' zombie check succeeded
> 'haproxy' succeeded testing protocol [DEFAULT] at [192.168.40.72]:9090 [TCP/IP]
> 'haproxy' connection succeeded to [192.168.40.72]:9090 [TCP/IP]
> 'haproxy' process is running with pid 42999
> 'haproxy' zombie check succeeded
> 'haproxy' succeeded testing protocol [DEFAULT] at [192.168.40.72]:9090 [TCP/IP]
> 'haproxy' connection succeeded to [192.168.40.72]:9090 [TCP/IP]
> 'haproxy' process test failed [pid=42999] -- No such process
> 'haproxy' process is not running
> Sending Does not exist notification to root@localhost
> 'haproxy' trying to restart
> 'haproxy' stop skipped -- method not defined
> 'haproxy' start method not defined
> 'haproxy' monitoring enabled
> 'haproxy' process test failed [pid=42999] -- No such process
> 'haproxy' process is not running
> 'haproxy' trying to restart
> 'haproxy' stop skipped -- method not defined
> 'haproxy' start method not defined
> 'haproxy' monitoring enabled
> ^CShutting down Monit HTTP server
> Monit HTTP server stopped
> Monit daemon with pid [48685] stopped
> 'appsrv01' Monit 5.15 stopped
> Sending Monit instance changed notification to root@localhost
> </log>
>
> The EXEC Line never gets executed, i dont see any new lines in /tmp/monit.test
> If i change the checked Port from 9090 to some invalid port, lets say 9190
> and start monit (haproxy is running !), i see:
>
> <log>
> Starting Monit 5.15 daemon with http interface at [192.168.40.72]:2812
> Starting Monit HTTP server at [192.168.40.72]:2812
> Monit HTTP server started
> 'appsrv01' Monit 5.15 started
> Sending Monit instance changed notification to root@localhost
> 'haproxy' process is running with pid 50703
> 'haproxy' zombie check succeeded
> Socket test failed for [192.168.40.72]:9190 -- Connection refused
> 'haproxy' failed protocol test [DEFAULT] at [192.168.40.72]:9190 [TCP/IP] -- Connection refused
> Sending Connection failed notification to root@localhost
> 'haproxy' exec: /bin/sh
> 'haproxy' process is running with pid 50703
> 'haproxy' zombie check succeeded
> Socket test failed for [192.168.40.72]:9190 -- Connection refused
> 'haproxy' failed protocol test [DEFAULT] at [192.168.40.72]:9190 [TCP/IP] -- Connection refused
> 'haproxy' exec: /bin/sh
> </log>
>
> Why does the EXEC Line works here but not when i kill -9 haproxy ?
> What i'm trying to do is get monit to run the exec in case of a haproxy
> failure.
> The exec line will then contain a command to switch the CARP IP to another
> host.
> haproxy itself is monitored using zabbix, so the NOC can investigate the
> cause of the failure later.
>
> Thanks & regards
> gerald
>
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general
