Alternatively, if no such mode is possible, are there any issues with just leaving monit running in an endless loop trying to recover the service?
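
To make the question concrete, here is a rough sketch of what I mean. It is the check from my original mail with the restart-limit line dropped, so monit should keep retrying each cycle instead of unmonitoring the service. The "set statefile" line follows Martin's tmpfs suggestion; the /dev/shm path is only my assumption of a tmpfs location, and I have not tested this on 5.x.

```
# Assumed tmpfs location (hypothetical path); the state would then be lost on reboot
set statefile /dev/shm/monit.state

check process XXXX with pidfile /XXXX/pid/XXXXX.pid
  start program "/etc/init.d/xxxxx restart"
  stop program "/etc/init.d/xxxxxx stop"
  if mem usage > 85% then restart
  # "if 5 restarts within 5 cycles then timeout" removed on purpose:
  # without it there is no restart limit, so the check is never unmonitored
```

If the service has a hard error, this would mean a restart attempt on every cycle, which is the part I am unsure about.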

On Mon, Dec 21, 2015 at 1:08 PM, Stas Oskin <[email protected]> wrote:

> Hi,
>
> Thanks for the clarification, it seems this was exactly the change between
> 4 and 5 that caused us the confusion.
>
> Is there a way to have a previous mode of operation, where the monit will
> reset the state by restarting the monit itself (and not the server, as per
> your suggestion)?
>
> Thanks.
>
> On Wed, Dec 16, 2015 at 6:11 PM, Martin Pala <[email protected]>
> wrote:
>
>> Hi,
>>
>> Monit will disable the process monitoring on excessive restart failures
>> due to "if 5 restart within 5 cycles then timeout" statement in your
>> configuration (the "timeout" action an alias for "unmonitor" and we
>> switched the documentation in the past to "unmonitor" as it is more clear:
>> https://mmonit.com/monit/documentation/monit.html#SERVICE-RESTART-LIMIT)
>>
>> The monitoring state is persistent - the "timed out" service has usually
>> some hard error which requires manual intervention (timeout statement
>> prevents endless restart loop). When the problem is resolved, the
>> monitoring needs to be enabled manually.
>>
>> If you want to drop the state for example after reboot, place the
>> statefile to tmpfs filesystem (you can use "set statefile <path>" statement
>> to customize state file placement).
>>
>> Regards,
>> Martin
>>
>>
>> On 16 Dec 2015, at 14:20, Stas Oskin <[email protected]> wrote:
>>
>> Hi,
>>
>> After some more digging, it occurred to me that it might that monit just
>> stops monitoring the process after it unable to restart it.
>>
>> So on monit 4.x it appears this state was cleared when just restarting
>> monit, while on 5.x it seems you need actually to mark the check as active
>> manually via the monit command.
>>
>> Is this correct?
>>
>> On Fri, Nov 20, 2015 at 5:37 PM, Martin Pala <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> please run monit in debug mode and send output:
>>>
>>> monit -vI
>>>
>>> Regards,
>>> Martin
>>>
>>>
>>> On 19 Nov 2015, at 20:39, Stas Oskin <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> The monit log shows only the general start-up information.
>>>
>>> There is no messages about the processes going offline, it's like monit
>>> does not use the pid file to find the process anymore.
>>>
>>> When I use HTTP port probing though it works just fine. Any idea what
>>> could it be?
>>>
>>> Regards.
>>>
>>> On Sun, Nov 15, 2015 at 9:16 PM, Martin Pala <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> please can you provide more details about the problem? (error messages
>>>> and/or monit log).
>>>>
>>>> Note that monit 5.9 includes fix for program execution for
>>>> CentOS6/RHEL6, we recommend upgrade to latest monit version (5.15), you can
>>>> get it here: https://mmonit.com/monit/#download. You can build rpm
>>>> directly from the source code release: rpmbuild -tb monit-5.15.tar.gz. I
>>>> think RHEL uses custom configuration file, official monit looks for
>>>> /etc/monitrc, so you may need to rename the configuration file or create a
>>>> link after upgrade.
>>>>
>>>> Regards,
>>>> Martin
>>>>
>>>>
>>>> > On 14 Nov 2015, at 16:51, Stas Oskin <[email protected]> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > Monit has reliably served us through the years, and we are very
>>>> > happy of it.
>>>> >
>>>> > Unfortunately during scheduled migration to CentOS 6 due CentOS 5
>>>> > EOL, we discovered it stopped monitoring the services pid files.
>>>> > HTTP monitoring works fine.
>>>> >
>>>> > The CentOS 6 version is:
>>>> > monit-5.1.1-4.el6.x86_64
>>>> >
>>>> > CentOS 5 version is:
>>>> > monit-4.10.1-8.el5
>>>> >
>>>> > An example config that not working anymore (but accepted by monit
>>>> > when starting):
>>>> > check process XXXX with pidfile /XXXX/pid/XXXXX.pid
>>>> > start program "/etc/init.d/xxxxx restart"
>>>> > stop program "/etc/init.d/xxxxxx stop"
>>>> > if mem usage > 85% then restart
>>>> > if 5 restarts within 5 cycles then timeout
>>>> >
>>>> > I guess something changed in configuration jump from 4 to 5, will
>>>> > appreciate any advice.
>>>> >
>>>> > Thanks!
>>>> > --
>>>> > To unsubscribe:
>>>> > https://lists.nongnu.org/mailman/listinfo/monit-general
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general
