Are the alerts on your system managed on the Monit side or in M/Monit?

Best regards,
Martin


> On 12 Mar 2016, at 01:01, Paul Theodoropoulos <[email protected]> wrote:
> 
> I'm stumped. I have an ugly little script to alert me if today's backup of a 
> database is smaller than the one from yesterday (and the day before). The 
> script works properly, and I have a simple monit rule in place to alert me if 
> it fails. When monit checks, it reports a failure; that is pushed up to my 
> m/monit server, which also logs the failure. From there, all alerts go to 
> PagerDuty. But I never get alerts from this check. 
> 
> (Hopefully) all relevant output is below. Some strings have been obfuscated. 
> Note that I have the rule modified to falsely report a failure, for testing.
> 
> root@db1-primary: /etc/monit/conf.d # cat /etc/debian_version
> 7.9
> 
> root@db1-primary: /etc/monit/conf.d # monit --version
> This is Monit version 5.17
> Built with ssl, without pam and with large files
> Copyright (C) 2001-2016 Tildeslash Ltd. All Rights Reserved.
> 
> root@db1-primary: /etc/monit/conf.d # cat backups
> check program backup_failure with path /usr/local/bin/check_backup with timeout 15 seconds
> not every "* 14 * * *"
> #if status != 0 then alert
> if status != 1 then alert
> 
> root@db1-primary: /etc/monit/conf.d # cat /usr/local/bin/check_backup
> #!/bin/bash
> BACKUP_DIR=/var/backups
> cd ${BACKUP_DIR}
> BUFILE=`date +%Y_%m_%d`_"group".sql.gz
> YDAY_BUFILE=`date --date "1 days ago" +%Y_%m_%d`_"group".sql.gz
> DAYBEFORE_YDAY_BUFILE=`date --date "2 days ago" +%Y_%m_%d`_"group".sql.gz
> if [ -e "${BUFILE}" ];then
>     TDAYSIZE=`du ${BUFILE}|cut -f1`
>     YDAYSIZE=`du ${YDAY_BUFILE}|cut -f1`
>     DBDAYSIZE=`du ${DAYBEFORE_YDAY_BUFILE}|cut -f1`
>     if [ $YDAYSIZE -gt $DBDAYSIZE ];then
>         if [ $TDAYSIZE -gt $YDAYSIZE ];then
>             exit 0
>         fi
>     else
>         exit 1
>     fi
> fi
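
A side note on the script above, separate from the alerting question: as posted, it appears to exit 0 when today's file is missing, and also when yesterday's backup grew but today's did not, since neither path reaches an explicit exit. Below is a minimal sketch of a stricter check, assuming success should mean "all three files exist and each backup is larger than the one from the day before". The function name check_backup_sizes and its directory argument are hypothetical, and comparing byte counts from wc -c instead of du block counts is a substitution, not something taken from the original script.

```shell
#!/bin/bash
# Sketch only: check_backup_sizes is a hypothetical helper, not part of the
# original script. It returns 0 only when all three daily backups exist and
# each one is strictly larger than the one from the day before; every other
# case returns 1. Requires GNU date for the --date option.
check_backup_sizes() {
    local dir=$1
    local today yday dbday f tsize ysize dsize

    today="${dir}/$(date +%Y_%m_%d)_group.sql.gz"
    yday="${dir}/$(date --date '1 day ago' +%Y_%m_%d)_group.sql.gz"
    dbday="${dir}/$(date --date '2 days ago' +%Y_%m_%d)_group.sql.gz"

    # A missing file is treated as a failure rather than silently passing.
    for f in "$today" "$yday" "$dbday"; do
        [ -e "$f" ] || return 1
    done

    # Assumption: compare sizes in bytes (wc -c) rather than disk blocks (du).
    tsize=$(wc -c < "$today")
    ysize=$(wc -c < "$yday")
    dsize=$(wc -c < "$dbday")

    # The status of this test is the function's return value.
    [ "$tsize" -gt "$ysize" ] && [ "$ysize" -gt "$dsize" ]
}
```

In the real script this could be called as `check_backup_sizes /var/backups || exit 1`, so that every failure path ends in a non-zero exit status for monit to test.
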
> 
> root@db1-primary: /etc/monit/conf.d # tail -1 /var/log/daemon.log
> Mar 11 15:25:04 localhost monit[10562]: 'backup_failure' '/usr/local/bin/check_backup' failed with exit status (0) -- no output
> 
> root@db1-primary: ~ # monit status|tail -7
> Program 'backup_failure'
>   status                            Status failed
>   monitoring status                 Monitored
>   last started                      Fri, 11 Mar 2016 15:42:36
>   last exit value                   0
>   data collected                    Fri, 11 Mar 2016 15:42:36
> 
> What am I missing?
>  -- 
> Paul Theodoropoulos
> www.anastrophe.com <http://www.anastrophe.com/>
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general
