On 7/22/14 2:32 AM, Vincent WATREMEZ wrote:
Hi Paul,
By any chance, does the user running monit have the right privileges
to run `bucardo status`?
Also, you might debug the status command by displaying its output to
STDERR.
Regards
Vincent
Thanks Vincent - I figured out part of it. It wasn't privileges, it was
paths - monit maintains a very strict and limited path. When I ran the
script myself as root on the command line, it worked fine. But within
monit, it actually wanted my test script to be this, with all paths
spelled out:
#!/bin/sh
/usr/local/bin/bucardo.pl status |/bin/grep trumgr|/usr/bin/cut -d"|" -f4|/bin/grep ".m" >/dev/null 2>&1
exit $?
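An alternative to spelling out every binary is to set PATH once at the top of the script. The following is only a sketch under assumptions: the directory list is a guess based on the paths used above, and require_tools is a hypothetical helper, not something monit or bucardo provides. It reports a missing tool on STDERR, which also helps with the debugging Vincent suggested.

```shell
#!/bin/sh
# Sketch only: give the script its own PATH rather than relying on the
# limited one monit provides. The directory list is an assumption; adjust
# it to wherever bucardo, grep and cut live on your host.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

# require_tools (hypothetical helper): check that each named command can be
# resolved through PATH. On the first miss, print a message to stderr and
# return 127, the conventional "command not found" status.
require_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found in PATH ($PATH): $tool" >&2
            return 127
        fi
    done
    return 0
}
```

In the real check script this would be followed by something like `require_tools bucardo.pl grep cut || exit $?` and then the pipeline itself, written with bare command names.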
Now monit correctly understands when the test fails:
Program 'bucardo.monitor'
status Status failed
monitoring status Monitored
last started Wed, 23 Jul 2014 13:01:36
last exit value 0
data collected Wed, 23 Jul 2014 13:01:36
Which is all great - except that it never generates an alert! I've
confirmed that my other checks generate alerts - only this one fails to
do so. I have of course tried reversing the status checks, etc - no
joy. So I'm still stuck.
2014-07-21 23:44 GMT+02:00 Paul Theodoropoulos <[email protected]>:
I have a daemon which I want to monitor specific status. I've
created the following script called 'bucardo.monitor':
#!/bin/sh
bucardo status |grep mydb|cut -d"|" -f4| grep ".m" >/dev/null 2>&1
exit $?
In short, if the string "(one char)m" exists, I wish to get an
alert. When I run the script from the command line, and the string
I'm looking for exists, I get the following expected output:
me# bucardo.monitor;echo $?
0
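The exit status the script hands back is that of the last command in the pipeline, here the final grep: 0 when a match is found, 1 when there is none. A minimal sketch of that behaviour follows. The sample lines are made up for illustration and are NOT real bucardo output, and has_minutes is a hypothetical helper that mirrors only the cut/grep tail of the script.

```shell
#!/bin/sh
# has_minutes (hypothetical helper): read pipe-delimited text on stdin and
# succeed (exit 0) only if field 4 contains some character followed by "m".
# The function's status is the status of the final grep in the pipeline.
has_minutes() {
    cut -d"|" -f4 | grep ".m" >/dev/null 2>&1
}

# Made-up sample lines, not real bucardo output.
printf '%s\n' 'a|b|c| 5m 3s |e' | has_minutes; echo "match: $?"
printf '%s\n' 'a|b|c| 12s |e'   | has_minutes; echo "no match: $?"
```

The first line prints "match: 0" and the second "no match: 1", so a test of the form "if status = 0 then alert" treats a found match as the failure condition.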
I created a monit conf file thus:
alert [email protected] with reminder on 5 cycle
alert [email protected] with reminder on 5 cycle
check program bucardo-monitor with path /usr/local/bin/bucardo.monitor
    with timeout 3 seconds
    if status = 0 then alert
The manual states that the operator should be "==", however the
last example under status only uses a single equals sign - and
I've tried both, no difference. I've also used just "if status 0
then alert" as suggested in the manual, also no difference.
The problem is that monit always shows a last exit status of "1" -
except for a few moments after issuing 'monit reload' to deploy
changes to the script:
Program 'bucardo-monitor'
status Status ok
monitoring status Monitored
last started Mon, 21 Jul 2014 14:40:47
last exit value 1
data collected Mon, 21 Jul 2014 14:40:47
I've forced the test to be highly sensitive so that it will
change from an exit of 0 to 1 every few minutes, well within my
monitoring window - but again, I never get a status other than 1
in monit status, and thus never get an alert.
Am I doing something wrong? Misunderstanding?
--
Paul Theodoropoulos
www.anastrophe.com
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general
--
Paul Theodoropoulos
www.anastrophe.com