Strange, since on CentOS 5 it was enough to just restart monit.

Anyhow, thanks for helping to finally pinpoint the issue.
On Dec 21, 2015 4:46 PM, "Martin Pala" <[email protected]> wrote:

> Monitoring state persistence has been part of monit for a very long time - I
> think most probably even monit <= 3.x worked like this.
>
> As mentioned in the previous email, you can remove the timeout statement.
> If the service cannot be recovered, the restart action will then be called
> every cycle with no limit on the number of restarts, but you will be
> notified and can fix it manually (which is required in such a case anyway).
>
> Regards,
> Martin
>
>
> On 21 Dec 2015, at 12:09, Stas Oskin <[email protected]> wrote:
>
> Alternatively, if no such mode is possible, is there any problem with just
> letting monit run in an endless loop trying to recover the service?
>
> On Mon, Dec 21, 2015 at 1:08 PM, Stas Oskin <[email protected]> wrote:
>
>> Hi,
>>
>> Thanks for the clarification. It seems this was exactly the change
>> between 4 and 5 that caused our confusion.
>>
>> Is there a way to get the previous mode of operation, where monit resets
>> the state when monit itself is restarted (and not the server, as per
>> your suggestion)?
>>
>> Thanks.
>>
>> On Wed, Dec 16, 2015 at 6:11 PM, Martin Pala <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> Monit will disable the process monitoring on excessive restart failures
>>> due to the "if 5 restarts within 5 cycles then timeout" statement in your
>>> configuration (the "timeout" action is an alias for "unmonitor", and we
>>> switched the documentation to "unmonitor" in the past as it is clearer:
>>> https://mmonit.com/monit/documentation/monit.html#SERVICE-RESTART-LIMIT).
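>>>
>>> A minimal sketch of the same check written with "unmonitor", reusing the
>>> placeholder names from the original report (the XXXX values are elided
>>> there and stay elided here; only the last statement differs):
>>>
>>> ```
>>> check process XXXX with pidfile /XXXX/pid/XXXXX.pid
>>>     start program  "/etc/init.d/xxxxx restart"
>>>     stop program  "/etc/init.d/xxxxxx stop"
>>>     if mem usage > 85% then restart
>>>     # "unmonitor" is the documented name; "timeout" is an alias for it.
>>>     # After the limit trips, the check stays disabled until it is
>>>     # re-enabled from the shell, for example with: monit monitor XXXX
>>>     if 5 restarts within 5 cycles then unmonitor
>>> ```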
>>>
>>> The monitoring state is persistent - a "timed out" service usually has
>>> some hard error which requires manual intervention (the timeout statement
>>> prevents an endless restart loop). When the problem is resolved,
>>> monitoring needs to be enabled manually.
>>>
>>> If you want to drop the state, for example after a reboot, place the
>>> state file on a tmpfs filesystem (you can use the "set statefile <path>"
>>> statement to customize the state file placement).
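>>>
>>> As a rough illustration, assuming /dev/shm is a tmpfs mount on your
>>> system (the path and file name below are hypothetical, not something
>>> monit mandates):
>>>
>>> ```
>>> # Keep the state file on tmpfs so the saved monitoring state is
>>> # dropped on reboot.
>>> set statefile /dev/shm/monit.state
>>> ```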
>>>
>>> Regards,
>>> Martin
>>>
>>>
>>>
>>> On 16 Dec 2015, at 14:20, Stas Oskin <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> After some more digging, it occurred to me that monit might just stop
>>> monitoring the process after it is unable to restart it.
>>>
>>> So on monit 4.x it appears this state was cleared just by restarting
>>> monit, while on 5.x it seems you actually need to mark the check as active
>>> manually via the monit command.
>>>
>>> Is this correct?
>>>
>>> On Fri, Nov 20, 2015 at 5:37 PM, Martin Pala <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Please run monit in debug mode and send the output:
>>>>
>>>> monit -vI
>>>>
>>>> Regards,
>>>> Martin
>>>>
>>>>
>>>>
>>>> On 19 Nov 2015, at 20:39, Stas Oskin <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> The monit log shows only the general start-up information.
>>>>
>>>> There are no messages about the processes going offline; it's as if monit
>>>> does not use the pid file to find the process anymore.
>>>>
>>>> When I use HTTP port probing, though, it works just fine. Any idea what
>>>> it could be?
>>>>
>>>> Regards.
>>>>
>>>> On Sun, Nov 15, 2015 at 9:16 PM, Martin Pala <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Could you please provide more details about the problem (error messages
>>>>> and/or the monit log)?
>>>>>
>>>>> Note that monit 5.9 includes a fix for program execution for
>>>>> CentOS6/RHEL6. We recommend upgrading to the latest monit version (5.15);
>>>>> you can get it here: https://mmonit.com/monit/#download. You can build an
>>>>> rpm directly from the source code release:
>>>>> rpmbuild -tb monit-5.15.tar.gz. I think RHEL uses a custom configuration
>>>>> file, while official monit looks for /etc/monitrc, so you may need to
>>>>> rename the configuration file or create a link after the upgrade.
>>>>>
>>>>> Regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> > On 14 Nov 2015, at 16:51, Stas Oskin <[email protected]> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > Monit has served us reliably through the years, and we are very
>>>>> > happy with it.
>>>>> >
>>>>> > Unfortunately, during our scheduled migration to CentOS 6 due to the
>>>>> > CentOS 5 EOL, we discovered that it stopped monitoring the services'
>>>>> > pid files. HTTP monitoring works fine.
>>>>> >
>>>>> > The CentOS 6 version is:
>>>>> > monit-5.1.1-4.el6.x86_64
>>>>> >
>>>>> > CentOS 5 version is:
>>>>> > monit-4.10.1-8.el5
>>>>> >
>>>>> > An example config that is not working anymore (but is accepted by
>>>>> > monit when starting):
>>>>> > check process XXXX with pidfile /XXXX/pid/XXXXX.pid
>>>>> >     start program  "/etc/init.d/xxxxx restart"
>>>>> >     stop program  "/etc/init.d/xxxxxx stop"
>>>>> >     if mem usage > 85% then restart
>>>>> >     if 5 restarts within 5 cycles then timeout
>>>>> >
>>>>> > I guess something changed in the configuration in the jump from 4 to
>>>>> > 5; I would appreciate any advice.
>>>>> >
>>>>> > Thanks!
>>>>> > --
>>>>> > To unsubscribe:
>>>>> > https://lists.nongnu.org/mailman/listinfo/monit-general