Re: monit not catching failed ping test

Fant, Andrew (NIH/NIDA) [E] Fri, 08 Mar 2019 13:01:21 -0800

In the monitrc file, I have:

set daemon   120


As for the monit -vI output, it has 22 remote host checks in total.  A 
shortened, anonymized copy of it is:

Adding 'allow localhost' -- host resolved to [::ffff:127.0.0.1]
Adding credentials for user 'admin'
Runtime constants:
 Control file       = /etc/monitrc
 Log file           = syslog
 Pid file           = /etc/monit/monit.pid
 Id file            = /etc/monit/monit.id
 State file         = /etc/monit/monit.state
 Debug              = True
 Log                = True
 Use syslog         = True
 Is Daemon          = True
 Use process engine = True
 Limits             = {
                    =   programOutput:     512 B
                    =   sendExpectBuffer:  256 B
                    =   fileContentBuffer: 512 B
                    =   httpContentBuffer: 1 MB
                    =   networkTimeout:    5 s
                    =   programTimeout:    5 m
                    =   stopTimeout:       30 s
                    =   startTimeout:      30 s
                    =   restartTimeout:    30 s
                    = }
 On reboot          = start
 Poll time          = 120 seconds with start delay 0 seconds
 Event queue        = base directory /var/monitor with 1000 slots
 M/Monit(s)         = http://[host1.local]:8080/collector with timeout 5 s with credentials
 Start monit httpd  = True
 httpd bind address = localhost
 httpd portnumber   = 2812
 httpd signature    = Enabled
 httpd auth. style  = Basic Authentication and Host/Net allow list

The service list contains the following entries:

System Name           = host1
 Monitoring mode      = active
 On reboot            = start

Remote Host Name      = host2_ping
 Address              = 192.168.1.2
 Monitoring mode      = active
 On reboot            = start
 Ping                 = if failed [count 3 size 64 with timeout 5 s] then alert

-------------------------------------------------------------------------------

Hopefully this will be of some use.
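
For reference, the Ping line in the output above is what a bare "IF FAILED
ping" test expands to.  A minimal sketch of the same check with those options
written out, assuming the bracketed values shown (count 3, size 64, timeout
5 s) are simply monit's defaults and that the explicit form behaves the same
way:

set daemon 120

CHECK HOST host2_ping with ADDRESS 192.168.1.2
        IF FAILED ping count 3 size 64 with timeout 5 seconds THEN ALERT

If each of the 3 attempts waits the full 5 seconds, one check of an
unreachable host would take roughly 15 seconds, which still fits inside the
120-second poll cycle.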


--
Andrew Fant                      |            Systems Administrator
[email protected]       |      Lei Shi Lab , NIH/NIDA/IRP
(443)740-2849                   |

From: "[email protected]" <[email protected]>
Reply-To: This is the general mailing list for monit <[email protected]>
Date: Friday, March 8, 2019 at 3:26 PM
To: This is the general mailing list for monit <[email protected]>
Subject: Re: monit not catching failed ping test

Hello,

Monit checks each service at the interval given by the "set daemon <x>" 
setting. If that interval is long, or a check is blocked by a service 
timeout or action, the time between checks can be longer than expected.

Could you please check the "set daemon" setting and run monit in debug mode:

1.) stop monit
2.) monit -vI

Best regards,
Martin



On 8 Mar 2019, at 16:49, Fant, Andrew (NIH/NIDA) [E] 
<[email protected]> wrote:

Good morning.
     I have a small monitoring setup with m/monit 3.7.2, using monit 5.25.2 as 
the agent.  There are a couple of systems that I cannot install monit on but 
whose downtime I still need to know about, so I have added them as ping checks 
in the monitrc on the host where I installed m/monit.  Yesterday, one of those 
remote systems went down, but monit and m/monit didn’t report an alert for it 
and still show its status as OK.  Using anonymized information, the entry in 
the monitrc on host1 is:

CHECK HOST host2_ping with ADDRESS 192.168.1.2
        IF FAILED ping THEN ALERT

And from the command line on host1:

host1% monit status host2_ping
Monit 5.25.2 uptime: 48d 19h 8m

Remote Host 'host2_ping'
  status                       OK
  monitoring status            Monitored
  monitoring mode              active
  on reboot                    start
  ping response time           -
  data collected               Fri, 08 Mar 2019 10:41:33

But:

host1% ping host2
PING host2.example.org (192.168.1.2) 56(84) bytes of data.
From host1.example.org (192.168.1.1) icmp_seq=1 Destination Host Unreachable
From host1.example.org (192.168.1.1) icmp_seq=2 Destination Host Unreachable
From host1.example.org (192.168.1.1) icmp_seq=3 Destination Host Unreachable

Clearly there is a disconnect between the OS-provided ping utility and what 
monit is seeing.   I’m sure that it’s probably a simple error in configuration, 
but I am not seeing what I did wrong.   Can someone please set me on the 
correct path?

Thank you

--
Andrew Fant                      |            Systems Administrator
[email protected]       |      Lei Shi Lab , NIH/NIDA/IRP
(443)740-2849                   |
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general

