Just to report that this also happens when monit is turning monitoring back
on (the 'monitor' action), for example:

[EDT Sep 24 15:18:34] info     : monit daemon with PID 17391 awakened
[EDT Sep 24 15:18:34] info     : 'server1' monitor action done
[EDT Sep 24 15:18:34] info     : Awakened by User defined signal 1
[EDT Sep 24 15:18:34] info     : 'server2' monitor on user request
[EDT Sep 24 15:18:34] info     : monit daemon with PID 17391 awakened
[EDT Sep 24 15:18:34] info     : 'server2' monitor action done
[EDT Sep 24 15:18:34] info     : 'server3' monitor on user request
[EDT Sep 24 15:18:34] info     : monit daemon with PID 17391 awakened
[EDT Sep 24 15:18:34] error    : 'server1' connection failed, INET[server1:80] via TCP is not ready for i|o -- Interrupted system call
[EDT Sep 24 15:18:34] info     : 'server6' monitor on user request
[EDT Sep 24 15:18:34] info     : monit daemon with PID 17391 awakened
[EDT Sep 24 15:18:34] info     : 'server4' monitor on user request
[EDT Sep 24 15:18:34] info     : monit daemon with PID 17391 awakened
[EDT Sep 24 15:18:34] info     : 'server5' monitor on user request
[EDT Sep 24 15:18:34] info     : monit daemon with PID 17391 awakened
[EDT Sep 24 15:18:35] info     : 'server3' monitor action done
[EDT Sep 24 15:18:35] info     : 'server6' monitor action done
[EDT Sep 24 15:18:35] info     : 'server4' monitor action done
[EDT Sep 24 15:18:35] info     : 'server5' monitor action done
[EDT Sep 24 15:18:35] info     : Awakened by User defined signal 1
[EDT Sep 24 15:18:35] info     : 'server1' connection succeeded to INET[server1:80] via TCP
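
A note on the error text: "Interrupted system call" is the standard message for errno EINTR, meaning a blocking call returned early because a signal was delivered. The log shows "Awakened by User defined signal 1" in the same second, so one plausible reading is that the wake-up signal interrupts the in-flight connection test and the interruption is then reported as a failure. That is a guess from the log alone, not something confirmed against the monit source. The usual remedy is to retry on EINTR instead of failing. A minimal Python sketch of that pattern follows; `retry_on_eintr` and `make_flaky_wait` are hypothetical names for illustration, not monit functions:

```python
import errno


def retry_on_eintr(call, *args, max_retries=5):
    """Run call(*args), retrying when a signal interrupts it (EINTR)."""
    for _ in range(max_retries):
        try:
            return call(*args)
        except InterruptedError:
            # EINTR: a signal arrived mid-call. The peer is not known to be
            # down, so try again instead of reporting a failure.
            continue
    raise TimeoutError("still interrupted after %d retries" % max_retries)


def make_flaky_wait(interruptions):
    """Hypothetical stand-in for a readiness wait interrupted N times."""
    state = {"left": interruptions}

    def flaky_wait(host, port):
        if state["left"] > 0:
            state["left"] -= 1
            raise InterruptedError(errno.EINTR, "Interrupted system call")
        return "%s:%d ready" % (host, port)

    return flaky_wait
```

For example, `retry_on_eintr(make_flaky_wait(2), "server1", 80)` succeeds on the third attempt. Python 3.5 and later already retry most interrupted system calls internally (PEP 475), so the sketch only illustrates the pattern a C program has to implement by hand.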



On Mon, Sep 24, 2012 at 10:14 AM, Nestor Urquiza wrote:

> Hi guys,
>
> Not sure if this is a problem on other OSs as well, but I believe I have
> found a bug in monit 5.5 which, at least on Solaris 10, fails to
> synchronize unmonitor actions with ongoing checks. Here is how to reproduce
> it (tested on two different physical Solaris boxes (Intel)):
>
> 1. Configure monit to check every minute. Create several entries like
> the one below, checking several external ports and servers:
>
> check host myhost with address myhost
>    if failed port myport type tcp with timeout 15 seconds
>    then alert
>
> 2. Issue the command below just as monit starts its check cycle (when the
> clock shows hh:mm:59):
>
> monit unmonitor all
>
> 3. Randomly, you get an alert for at least one of the host/port
> combinations even though the host/port is actually available. As an example:
>
> Action: alert, Description: connection failed, INET[mssql:1433] via TCP is
> not ready for i|o -- Interrupted system call, Service: ptrsvr, Tested From
> Host: myhost
>
> 4. After issuing 'monit monitor all', no alert about the service being back
> up is sent, but 'monit status' does show the service is up.
>
>
> IMO the bug is that monit does not synchronize unmonitor requests with the
> checks in progress. When monit receives "unmonitor all" it should either
> wait for all current checks to finish, or cancel them and ignore any alert
> messages they would send.
>
>
> Makes sense?
>
>
> Thanks!
>
> -Nestor
>
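
To make the proposal quoted above concrete, here is a rough Python sketch of the behaviour it asks for: an unmonitor request waits for the in-flight check to finish, and once a service is unmonitored no alert is raised for it. An interrupted probe is also treated as "unknown" and not as an outage. All names (`Monitor`, `run_check`, `unmonitor_all`) are made up for illustration; this is a toy model under those assumptions, not how monit is implemented.

```python
import threading


class Monitor:
    """Toy model: checks and unmonitor requests share one lock."""

    def __init__(self, services):
        self._lock = threading.Lock()
        self._monitored = set(services)
        self.alerts = []  # names of services that raised an alert

    def run_check(self, service, probe):
        """Run probe() for a service; alert only if it is still monitored."""
        with self._lock:  # unmonitor_all() blocks until this check ends
            if service not in self._monitored:
                return  # unmonitored: skip the check and send nothing
            try:
                ok = probe()
            except InterruptedError:
                # Interrupted by a signal, not a real failure: no alert.
                return
            if not ok:
                self.alerts.append(service)

    def unmonitor_all(self):
        """Wait for any check in progress, then disable all services."""
        with self._lock:
            self._monitored.clear()
```

Holding a single lock for the duration of a probe serializes the checks, which keeps the sketch short. A real daemon would more likely track in-flight checks per service, but the ordering guarantee is the same: no alert can be produced after the unmonitor request has returned.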