In the monitrc file, I have:
set daemon 120
As for the monit -vI output, the configuration has 22 remote host checks in
total. A shortened, anonymized copy of the output is:
Adding 'allow localhost' -- host resolved to [::ffff:127.0.0.1]
Adding credentials for user 'admin'
Runtime constants:
Control file = /etc/monitrc
Log file = syslog
Pid file = /etc/monit/monit.pid
Id file = /etc/monit/monit.id
State file = /etc/monit/monit.state
Debug = True
Log = True
Use syslog = True
Is Daemon = True
Use process engine = True
Limits = {
= programOutput: 512 B
= sendExpectBuffer: 256 B
= fileContentBuffer: 512 B
= httpContentBuffer: 1 MB
= networkTimeout: 5 s
= programTimeout: 5 m
= stopTimeout: 30 s
= startTimeout: 30 s
= restartTimeout: 30 s
= }
On reboot = start
Poll time = 120 seconds with start delay 0 seconds
Event queue = base directory /var/monitor with 1000 slots
M/Monit(s) = http://[host1.local]:8080/collector with timeout 5 s with credentials
Start monit httpd = True
httpd bind address = localhost
httpd portnumber = 2812
httpd signature = Enabled
httpd auth. style = Basic Authentication and Host/Net allow list
The service list contains the following entries:
System Name = host1
Monitoring mode = active
On reboot = start
Remote Host Name = host2_ping
Address = 192.168.1.2
Monitoring mode = active
On reboot = start
Ping = if failed [count 3 size 64 with timeout 5 s] then alert
-------------------------------------------------------------------------------
Hopefully this will be of some use.
--
Andrew Fant | Systems Administrator
[email protected] | Lei Shi Lab , NIH/NIDA/IRP
(443)740-2849 |
From: "[email protected]" <[email protected]>
Reply-To: This is the general mailing list for monit <[email protected]>
Date: Friday, March 8, 2019 at 3:26 PM
To: This is the general mailing list for monit <[email protected]>
Subject: Re: monit not catching failed ping test
Hello,
monit checks each service at the interval given by the "set daemon <x>" setting.
If that interval is long, or a check is blocked by some service timeout or
action, the effective time between checks can be longer.
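
As a rough illustration of the timing described above, here is a minimal Python
sketch (not monit code) that estimates how long a host could be down before a
ping check reports it. It assumes a simplified model: one check per poll cycle,
every ping attempt waiting out its full timeout, and nothing else delaying the
cycle. The function name and the model are assumptions made for illustration;
the example numbers (120 s poll time, 3 pings, 5 s timeout) are the values
shown in the monit output in this thread.

```python
def worst_case_detection_seconds(poll_interval_s, ping_count, ping_timeout_s):
    """Rough upper bound on the time between a host going down and the
    ping check reporting the failure, under the simplified model above."""
    # The host may fail just after a check ran, so a full poll interval
    # can pass before the next check even starts.
    wait_for_next_cycle = poll_interval_s
    # Assumption: each ping attempt waits for its full timeout.
    ping_duration = ping_count * ping_timeout_s
    return wait_for_next_cycle + ping_duration


if __name__ == "__main__":
    # Values taken from the output in this thread: set daemon 120,
    # ping count 3, timeout 5 s.
    print(worst_case_detection_seconds(120, 3, 5))  # prints 135
```

If other slow checks run in the same cycle, the real delay could be larger than
this estimate.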
Could you please check the "set daemon" setting and run monit in debug mode:
1.) stop monit
2.) monit -vI
Best regards,
Martin
On 8 Mar 2019, at 16:49, Fant, Andrew (NIH/NIDA) [E] <[email protected]> wrote:
Good morning.
I have a small monitoring setup with m/monit 3.7.2, using monit 5.25.2 as
the agent. There are a couple of systems that I cannot install monit on but
whose downtime I still need to be aware of, so I have added them as ping checks
in the monitrc on the host where I installed m/monit. Yesterday, one of those
remote systems went down, but monit and m/monit didn’t report an alert for it
and still have its status as OK. Using anonymized information, the entry in
the monitrc on host1 is:
CHECK HOST host2_ping with ADDRESS 192.168.1.2
IF FAILED ping THEN ALERT
And from the command line on host1:
host1% monit status host2_ping
Monit 5.25.2 uptime: 48d 19h 8m
Remote Host 'host2_ping'
status OK
monitoring status Monitored
monitoring mode active
on reboot start
ping response time -
data collected Fri, 08 Mar 2019 10:41:33
But:
host1% ping host2
PING host2.example.org (192.168.1.2) 56(84) bytes of data.
From host1.example.org (192.168.1.1) icmp_seq=1 Destination Host Unreachable
From host1.example.org (192.168.1.1) icmp_seq=2 Destination Host Unreachable
From host1.example.org (192.168.1.1) icmp_seq=3 Destination Host Unreachable
Clearly there is a disconnect between the OS-provided ping utility and what
monit is seeing. It’s probably a simple configuration error, but I am not
seeing what I did wrong. Can someone please set me on the correct path?
Thank you
--
Andrew Fant | Systems Administrator
[email protected] | Lei Shi Lab , NIH/NIDA/IRP
(443)740-2849 |
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general