Thanks for the data. It seems that the start command is different from the one in the original configuration snippet:
--8<--
Process Name = opsworks-agent
…
Start program = '/usr/bin/env service opsworks-agent start' timeout 30 second(s)
--8<--

vs.

--8<--
check process opsworks-agent with pidfile "/var/lib/aws/opsworks/pid/opsworks-agent.pid"
  start program = "/etc/init.d/opsworks-agent start"
--8<--

Since monit 5.8 the environment variables are no longer purged => the wrapping with "/usr/bin/env" is not necessary (but should still work).

Please try to change the configuration like this:

check process opsworks-agent with pidfile "/var/lib/aws/opsworks/pid/opsworks-agent.pid"
  start program = "/usr/sbin/service opsworks-agent start"
  stop program = "/usr/sbin/service opsworks-agent stop"
  depends on opsworks-agent-master-running
  depends on opsworks-agent-statistic-daemons-log
  depends on opsworks-agent-process-command-daemons-log
  depends on opsworks-agent-keep-alive-daemons-log
  group opsworks

Regards,
Martin

> On 13 May 2015, at 08:14, Shrinath M <[email protected]> wrote:
>
> OK, done -
>
> Not sure if attaching files is allowed; and not much to show here either - so here goes -
>
> Last few lines of log once I restarted in debug mode -
>
> [UTC May 13 06:09:05] info : Starting Monit 5.13 daemon with http interface at [*]:2812
> [UTC May 13 06:09:05] info : Monit start delay set -- pause for 5s
> [UTC May 13 06:09:10] info : Starting Monit HTTP server at [*]:2812
> [UTC May 13 06:09:10] info : Monit HTTP server started
> [UTC May 13 06:09:10] info : 'crumble.localdomain' Monit started
> [UTC May 13 06:09:10] info : M/Monit heartbeat started
> [UTC May 13 06:09:10] error : 'opsworks-agent-master-running' process is not running
> [UTC May 13 06:09:10] error : 'opsworks-agent' process is not running
> [UTC May 13 06:09:10] info : 'opsworks-agent' trying to restart
> [UTC May 13 06:09:10] info : 'opsworks-agent' start: /usr/bin/env
> [UTC May 13 06:09:42] error : 'opsworks-agent-master-running' process is not running
> [UTC May 13 06:09:42] info : 'opsworks-agent-master-running' trying to restart
> [UTC May 13 06:09:42] info : 'opsworks-agent' start: /usr/bin/env
> [UTC May 13 06:10:14] error : 'opsworks-agent-master-running' process is not running
> [UTC May 13 06:10:14] info : 'opsworks-agent-master-running' trying to restart
> [UTC May 13 06:10:14] info : 'opsworks-agent' start: /usr/bin/env
>
>
> The debug produced this -
>
> Starting monit: Adding credentials for user 'admin'
> Runtime constants:
> Control file = /etc/monit/monitrc
> Log file = /var/log/monit.log
> Pid file = /var/run/monit.pid
> Id file = /var/lib/monit.id
> State file = /var/run/monit.state
> Debug = True
> Log = True
> Use syslog = False
> Is Daemon = True
> Use process engine = True
> Poll time = 30 seconds with start delay 5 seconds
> Expect buffer = 256 bytes
> Event queue = base directory /var/monit with 100 slots
> M/Monit(s) = http://[FILTERED_IP]:80/collector with timeout 5 seconds using credentials
> Mail from = [email protected]
> Mail subject = $SERVICE $EVENT at $DATE
> Mail message = Monit $ACTION $SERVI..(truncated)
> Start monit httpd = True
> httpd bind address = Any/All
> httpd portnumber = 2812
> httpd ssl = Disabled
> httpd signature = Enabled
> httpd auth. style = Basic Authentication
>
> The service list contains the following entries:
>
> Process Name = opsworks-agent-master-running
> Group = opsworks
> Match = opsworks-agent: master
> Monitoring mode = active
> Existence = if does not exist for 2 cycles then restart
>
> Process Name = opsworks-agent
> Group = opsworks
> Pid file = /var/lib/aws/opsworks/pid/opsworks-agent.pid
> Monitoring mode = active
> Start program = '/usr/bin/env service opsworks-agent start' timeout 30 second(s)
> Stop program = '/usr/bin/env service opsworks-agent stop' timeout 30 second(s)
> Existence = if does not exist then restart
> Depends on Service = opsworks-agent-keep-alive-daemons-log
> Depends on Service = opsworks-agent-process-command-daemons-log
> Depends on Service = opsworks-agent-statistic-daemons-log
> Depends on Service = opsworks-agent-master-running
>
> File Name = opsworks-agent-statistic-daemons-log
> Group = opsworks
> Path = /var/log/aws/opsworks/opsworks-agent.statistics.log
> Monitoring mode = active
> Existence = if does not exist for 3 cycles then restart
> Timestamp = if greater than 120 second(s) for 3 cycles then restart
>
> File Name = opsworks-agent-process-command-daemons-log
> Group = opsworks
> Path = /var/log/aws/opsworks/opsworks-agent.process_command.log
> Monitoring mode = active
> Existence = if does not exist for 3 cycles then restart
> Timestamp = if greater than 120 second(s) for 3 cycles then restart
>
> File Name = opsworks-agent-keep-alive-daemons-log
> Group = opsworks
> Path = /var/log/aws/opsworks/opsworks-agent.keep_alive.log
> Monitoring mode = active
> Existence = if does not exist for 3 cycles then restart
> Timestamp = if greater than 120 second(s) for 3 cycles then restart
>
> System Name = crumble.localdomain
> Monitoring mode = active
>
> -------------------------------------------------------------------------------
> Monit daemon with PID 26769 awakened
>
>
> On Wed, May 13, 2015 at 11:37 AM Martin Pala <[email protected]> wrote:
> Please make sure monit logging is enabled (the "set logfile" statement) + run Monit in debug mode (-v option), try to reproduce the problem and send logs.
>
> Regards,
> Martin
>
> > On 13 May 2015, at 07:15, Shrinath M <[email protected]> wrote:
> >
> > I am using AWS Opsworks and AWS uses an old version of monit (5.3.2) to monitor their agent. Obviously, when their opsworks-agent dies, monit restarts it.
> > Recently, I wanted to monitor few processes of my own and required newer versions of monit to use the explicit "restart" command support. I upgraded monit to 5.13.
> > Now, monit does not restart opsworks agent if it dies!
> >
> > I tried looking for changelog of monit to see if something was changed between versions, but could not find them for all versions beyond 5.7.
> > Can someone please take a look at opsworks config below and see what might be breaking?
> >
> > opsworks-config follows -
> > check process opsworks-agent with pidfile "/var/lib/aws/opsworks/pid/opsworks-agent.pid"
> >   start program = "/etc/init.d/opsworks-agent start"
> >   stop program = "/etc/init.d/opsworks-agent stop"
> >   depends on opsworks-agent-master-running
> >   depends on opsworks-agent-statistic-daemons-log
> >   depends on opsworks-agent-process-command-daemons-log
> >   depends on opsworks-agent-keep-alive-daemons-log
> >   group opsworks
> >
> > check process opsworks-agent-master-running matching "opsworks-agent:\smaster"
> >   if not exist for 2 cycles then restart
> >   group opsworks
> >
> > # check run of statistic daemon
> > check file opsworks-agent-statistic-daemons-log with path "/var/log/aws/opsworks/opsworks-agent.statistics.log"
> >   if timestamp > 2 minutes for 3 cycles then restart
> >   if does not exist for 3 cycles then restart
> >   group opsworks
> >
> > # check run of process command daemon
> > check file opsworks-agent-process-command-daemons-log with path "/var/log/aws/opsworks/opsworks-agent.process_command.log"
> >   if timestamp > 2 minutes for 3 cycles then restart
> >   if does not exist for 3 cycles then restart
> >   group opsworks
> >
> > # check run of keep alive deamon
> > check file opsworks-agent-keep-alive-daemons-log with path "/var/log/aws/opsworks/opsworks-agent.keep_alive.log"
> >   if timestamp > 2 minutes for 3 cycles then restart
> >   if does not exist for 3 cycles then restart
> >   group opsworks
> >
> > - end of file
> >
> > Monit logs say restart done, but opsworks doesn't run. If I downgrade to 5.3.2, it does magically run!
> > --
> > To unsubscribe:
> > https://lists.nongnu.org/mailman/listinfo/monit-general
