Alternatively, if no such mode is possible, are there any issues with just
letting monit run in an endless loop trying to recover the service?
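
A sketch of what that would look like, assuming the check quoted further
down with its restart-limit line dropped (the names are the same redacted
placeholders, not real paths):

    check process XXXX with pidfile /XXXX/pid/XXXXX.pid
        start program  "/etc/init.d/xxxxx restart"
        stop program  "/etc/init.d/xxxxxx stop"
        if mem usage > 85% then restart
        # no "if 5 restarts within 5 cycles then timeout" rule here,
        # so the check should never be unmonitored after failed restarts

With no timeout rule, the expectation is that monit keeps retrying the
restart on every cycle for as long as the service stays down.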

On Mon, Dec 21, 2015 at 1:08 PM, Stas Oskin <[email protected]> wrote:

> Hi,
>
> Thanks for the clarification, it seems this was exactly the change between
> 4 and 5 that caused us the confusion.
>
> Is there a way to get the previous mode of operation, where monit
> resets the state when monit itself is restarted (and not the server, as per
> your suggestion)?
>
> Thanks.
>
> On Wed, Dec 16, 2015 at 6:11 PM, Martin Pala <[email protected]>
> wrote:
>
>> Hi,
>>
>> Monit will disable the process monitoring on excessive restart failures
>> due to the "if 5 restarts within 5 cycles then timeout" statement in your
>> configuration (the "timeout" action is an alias for "unmonitor", and we
>> switched the documentation to "unmonitor" in the past as it is clearer:
>> https://mmonit.com/monit/documentation/monit.html#SERVICE-RESTART-LIMIT).
>>
>> The monitoring state is persistent - a "timed out" service usually has
>> some hard error which requires manual intervention (the timeout statement
>> prevents an endless restart loop). When the problem is resolved, the
>> monitoring needs to be enabled manually.
>>
>> If you want to drop the state, for example after a reboot, place the
>> statefile on a tmpfs filesystem (you can use the "set statefile <path>"
>> statement to customize state file placement).
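>>
>> A minimal sketch of that, assuming /dev/shm is a tmpfs mount on your
>> system (the path and file name are only an illustration, adjust them
>> to your setup):
>>
>>     # keep monit's state on tmpfs so that it is lost on reboot
>>     set statefile /dev/shm/monit.state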
>>
>> Regards,
>> Martin
>>
>>
>>
>> On 16 Dec 2015, at 14:20, Stas Oskin <[email protected]> wrote:
>>
>> Hi,
>>
>> After some more digging, it occurred to me that monit might just
>> stop monitoring the process after it is unable to restart it.
>>
>> So on monit 4.x it appears this state was cleared by just restarting
>> monit, while on 5.x it seems you actually need to mark the check as active
>> manually via the monit command.
>>
>> Is this correct?
>>
>> On Fri, Nov 20, 2015 at 5:37 PM, Martin Pala <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> Please run monit in debug mode and send the output:
>>>
>>> monit -vI
>>>
>>> Regards,
>>> Martin
>>>
>>>
>>>
>>> On 19 Nov 2015, at 20:39, Stas Oskin <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> The monit log shows only the general start-up information.
>>>
>>> There are no messages about the processes going offline; it's as if monit
>>> does not use the pid file to find the process anymore.
>>>
>>> When I use HTTP port probing, though, it works just fine. Any idea what
>>> it could be?
>>>
>>> Regards.
>>>
>>> On Sun, Nov 15, 2015 at 9:16 PM, Martin Pala <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> please can you provide more details about the problem? (error messages
>>>> and/or monit log).
>>>>
>>>> Note that monit 5.9 includes a fix for program execution on
>>>> CentOS6/RHEL6. We recommend upgrading to the latest monit version (5.15); you
>>>> can get it here: https://mmonit.com/monit/#download. You can build an rpm
>>>> directly from the source code release: rpmbuild -tb monit-5.15.tar.gz. I
>>>> think RHEL uses a custom configuration file, while official monit looks for
>>>> /etc/monitrc, so you may need to rename the configuration file or create a
>>>> link after the upgrade.
>>>>
>>>> Regards,
>>>> Martin
>>>>
>>>>
>>>> > On 14 Nov 2015, at 16:51, Stas Oskin <[email protected]> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > Monit has reliably served us through the years, and we are very happy
>>>> > with it.
>>>> >
>>>> > Unfortunately, during a scheduled migration to CentOS 6 due to the
>>>> > CentOS 5 EOL, we discovered it stopped monitoring the services' pid
>>>> > files. HTTP monitoring works fine.
>>>> >
>>>> > The CentOS 6 version is:
>>>> > monit-5.1.1-4.el6.x86_64
>>>> >
>>>> > CentOS 5 version is:
>>>> > monit-4.10.1-8.el5
>>>> >
>>>> > An example config that is not working anymore (but is accepted by
>>>> > monit when starting):
>>>> > check process XXXX with pidfile /XXXX/pid/XXXXX.pid
>>>> >     start program  "/etc/init.d/xxxxx restart"
>>>> >     stop program  "/etc/init.d/xxxxxx stop"
>>>> >     if mem usage > 85% then restart
>>>> >     if 5 restarts within 5 cycles then timeout
>>>> >
>>>> > I guess something changed in the configuration in the jump from 4 to 5;
>>>> > I would appreciate any advice.
>>>> >
>>>> > Thanks!
>>>> > --
>>>> > To unsubscribe:
>>>> > https://lists.nongnu.org/mailman/listinfo/monit-general
