Hi,

Thanks for the information, this is exactly what I needed. This setting
will make monit work for me again :-)

best regards,
--Jan

> On 2019-06-03, at 08:37, [email protected] wrote:
>
> Hi,
>
> since monit 5.16.0, the exec action is executed only on a state change.
> In your case the service didn't transition to the "succeeded" state, so
> the exec action wasn't repeated.
>
> If you want to retry the exec action if the service remains in failure
> state, you can use the "repeat" option.
>
> Snip from monit 5.16.0 changelog which provides more details:
>
> --8<--
> New: The exec action is now executed only once, on state change, same
> way as the alert action. The new "repeat" option allows to repeat the
> exec action after given number of cycles if the error persists. Syntax:
>     if <test> then exec <script> repeat every <x> cycles
> If you want to get the old behaviour, use "repeat every 1 cycle". Example:
>     if failed port 1234 then exec "/usr/bin/myscript.sh" repeat every 5 cycles
> --8<--
>
> Best regards,
> Martin
>
>
>> On 31 May 2019, at 19:14, Jan Rychter <[email protected]> wrote:
>>
>> Hi,
>>
>> I'm looking for help, because I can't figure out what I'm doing wrong.
>> I have a simple monit setup, which is supposed to monitor a web server
>> and restart it if anything seems wrong.
>>
>> This seems to work but not always. Monit does restart the service, but
>> on subsequent failures it just notices that the service isn't working
>> and doesn't act anymore.
>>
>> Example from the log, where the service was restarted, but went down
>> again, and monit didn't do anything:
>>
>> [CEST May 31 06:44:11] info : 'triac.mysite.com' Monit 5.16 started
>> [CEST May 31 09:36:29] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:37:39] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:37:39] info : 'mysite.com' exec: /usr/bin/supervisorctl
>> [CEST May 31 09:38:49] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:39:59] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:41:09] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:42:19] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:43:29] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:44:39] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:45:50] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:47:00] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>> [CEST May 31 09:48:10] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
>>
>> The net result is that the service doesn't work and monit just sits
>> there, knowing that the service failed the protocol test, but doing
>> nothing about it.
>>
>> I suspect this is because monit does not notice that the service was OK
>> after restarting for a moment, so it does not notice another transition
>> from OK to failed.
>>
>> Here is the relevant part of the configuration (nearly all of it):
>>
>> set daemon 60
>> check host mysite.com with address mysite.com
>>     if failed
>>         port 443
>>         protocol https
>>         with ssl options {verify: enable}
>>         for 2 cycles
>>     then exec "/usr/bin/supervisorctl restart mysite"
>>     if 20 restarts within 60 cycles then unmonitor
>>
>> Is there a way to achieve unconditional actions? E.g. "even though I
>> haven't noticed the service to transition from failed to working,
>> restart it anyway after 60 seconds if it is still in the failed state"
>>
>> Any help would be much appreciated.
>>
>> --J.
>>
>>
>> --
>> To unsubscribe:
>> https://lists.nongnu.org/mailman/listinfo/monit-general
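
For reference, a minimal sketch of the configuration from the original
question with the "repeat" option from the reply applied. This was not
posted in the thread; it only combines the quoted config with the syntax
given in the 5.16.0 changelog snippet. The choice of "repeat every 1 cycle"
is an assumption based on the request to retry after 60 seconds (one cycle
with "set daemon 60"); everything else is copied unchanged.

```
set daemon 60                       # one cycle = 60 seconds

check host mysite.com with address mysite.com
    if failed
        port 443
        protocol https
        with ssl options {verify: enable}
        for 2 cycles
    # "repeat every 1 cycle" re-runs the exec on every cycle while the
    # failure persists (the pre-5.16.0 behaviour, per the changelog)
    then exec "/usr/bin/supervisorctl restart mysite" repeat every 1 cycle
    if 20 restarts within 60 cycles then unmonitor
```

A larger interval, such as the "repeat every 5 cycles" from the changelog
example, would give the service more time to come back up between restart
attempts.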
