Hello TJ, thank you for the data.

The problem is that Monit updates the state file after each test cycle with a 
blocking write. When the filesystem freezes, Monit is blocked by the state 
file update and cannot continue monitoring until the filesystem becomes 
writable again. I have created an issue to track this problem, and we'll fix it: 
https://bitbucket.org/tildeslash/monit/issues/493/state-file-refactor-to-non-blocking-read
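
To illustrate the failure mode in general terms: the following is only a 
minimal Python sketch, not Monit's code and not the planned fix. The names 
save_state_with_timeout, slow_write and fast_write are hypothetical. It hands 
the state write to a helper thread and waits for it with a timeout, so the 
caller can carry on with the next test cycle if the write stalls:

```python
import threading
import time


def save_state_with_timeout(write_state, timeout_s):
    """Run write_state() in a helper thread.

    Returns True if the write finished within timeout_s seconds,
    False if it is still pending (for example, on a frozen filesystem).
    """
    done = threading.Event()

    def worker():
        write_state()
        done.set()

    # Daemon thread: a stuck write will not keep the process alive on exit.
    threading.Thread(target=worker, daemon=True).start()
    return done.wait(timeout_s)


def slow_write():
    # Hypothetical stand-in for a write that stalls on a frozen filesystem.
    time.sleep(0.5)


def fast_write():
    # Hypothetical stand-in for a write that completes immediately.
    pass
```

A write called directly in the monitoring loop would hold up the loop for as 
long as it stalls, which is what happened here. Note that in this sketch a 
helper thread stuck in a frozen write stays stuck until the filesystem thaws; 
only the caller is kept free.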

The workaround is to place the state file on a filesystem which won't freeze 
(tmpfs should work). You can set the location using the "set statefile" 
statement:

        set statefile /run/monit.state
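
If you are unsure whether the chosen path really is on tmpfs, a small check 
along these lines may help. It is a sketch only: the function name fstype_for 
and the SAMPLE text are hypothetical, and it assumes the Linux /proc/mounts 
layout (device, mount point, filesystem type, ...) with no escaped spaces in 
the mount points:

```python
def fstype_for(path, mounts_text):
    """Return the filesystem type of the mount point that contains path.

    mounts_text is expected to look like the content of /proc/mounts.
    The longest matching mount point wins; on a tie the later line wins,
    since a later mount shadows an earlier one on the same directory.
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Hypothetical sample in /proc/mounts format, for illustration only.
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "tmpfs /run tmpfs rw,nosuid,nodev 0 0\n"
)
```

On a real system you would pass the content of /proc/mounts instead of SAMPLE, 
for example fstype_for("/run/monit.state", open("/proc/mounts").read()), and 
expect "tmpfs" if the state file location is suitable.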

Best regards,
Martin



> On 27 Oct 2016, at 22:10, TJ Stroker <[email protected]> wrote:
> 
> Hello Martin,
> 
> Just emailed to you.
> 
> Thank you!
> 
> On Thu, Oct 27, 2016 at 7:09 AM, Martin Pala <[email protected]> wrote:
> Hello,
> 
> please can you send your Monit configuration and Monit log to 
> [email protected]?
> 
> The status messages were most probably sent to M/Monit, otherwise the chart 
> would have a gap, for example:
> 
> <PastedGraphic-1.png>
> 
> 
> 
> Best regards,
> Martin
> 
> 
>> On 26 Oct 2016, at 22:04, TJ Stroker <[email protected]> wrote:
>> 
>> Hello
>> 
>> I wanted to ask a question, or point out an issue... whichever fits.
>> 
>> Yesterday afternoon I noticed an odd issue with a server, which just 
>> happened to be running monit 5.19. The issue had actually been in effect for 
>> a couple of days. I use m/monit, but never received any alerts on this 
>> issue. The issue is highlighted in this RedHat TID:
>> 
>> Message "audit: backlog limit exceeded" reported and possibly hung system 
>> due to a frozen filesystem
>> 
>> https://access.redhat.com/solutions/473223
>> 
>> 
>> What I found when I ssh'd to my server was that I had a system load of 299. 
>> However monit and m/monit both showed a load of almost 0. I will attach an 
>> m/monit weekly load graph. 
>> 
>> This server is not used for anything but internal, so it didn't create any 
>> real problems for us. But it could have been something more important.
>> 
>> At this point (as I'm still digging into the auditd issue) I can only think 
>> that somehow, due to the freeze, monit was unable to queue messages out. And 
>> because of this I had no error condition on m/monit. 
>> 
>> So I wanted to point it out, but also ask if there might be some insight on 
>> how to catch this type of issue in the future.
>> 
>> 
>> Jim
>> <Screen Shot 2016-10-26 at 11.21.59 AM.png>
>> --
>> To unsubscribe:
>> https://lists.nongnu.org/mailman/listinfo/monit-general
> 

