Dzahn has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/361023 )
Change subject: icinga/role:mail::mx: add monitoring of exim queue size
......................................................................


icinga/role:mail::mx: add monitoring of exim queue size

Adds a new Icinga plugin check_exim_queue using bash and exipick.

Adds NRPE monitoring service in MX role class assuming it
is preferred that I put it there vs the module.

Using 1000 for WARN and 3000 for CRIT. I observed values
around 300 on mx1001 the other day when testing it.

Bug: T133110
Change-Id: I70bdef87eed6902ad27c92f2fa0e19b3d2274d7d
---
A modules/icinga/files/check_exim_queue.sh
M modules/role/manifests/mail/mx.pp
2 files changed, 92 insertions(+), 0 deletions(-)

Approvals:
  Dzahn: Verified; Looks good to me, approved



diff --git a/modules/icinga/files/check_exim_queue.sh b/modules/icinga/files/check_exim_queue.sh
new file mode 100755
index 0000000..6446fc5
--- /dev/null
+++ b/modules/icinga/files/check_exim_queue.sh
@@ -0,0 +1,70 @@
+#!/bin/bash
+# Nagios/Icinga plugin to check for oversized exim4 queues.
+#
+# Daniel Zahn - Wikimedia Foundation Inc.
+#
+# https://phabricator.wikimedia.org/T133110
+#
+# ./check_exim_queue -w <warn> -c <crit>
+#
+# <warn> = number of mails in queue that trigger a WARN (int)
+# <crit> = number of mails in queue that trigger a CRIT (int)
+#
+# dependencies: exipick, sudo
+
+set -eu
+
+usage() { echo "Usage: $0 -w <warn> -c <crit>" 1>&2; exit 1; }
+
+declare -i WARN_LIMIT=0
+declare -i CRIT_LIMIT=0
+
+# count only messages older than MIN_AGE
+MIN_AGE="10m"
+
+while getopts "w:c:" o; do
+    case "${o}" in
+        w)
+            WARN_LIMIT=${OPTARG}
+            ;;
+        c)
+            CRIT_LIMIT=${OPTARG}
+            ;;
+        *)
+            usage
+            ;;
+    esac
+done
+
+if [ $WARN_LIMIT == 0 ] || [ $CRIT_LIMIT == 0 ]; then
+    usage
+fi
+
+declare -i QSIZE=0
+
+SUDO="/usr/bin/sudo"
+EXIPICK="/usr/sbin/exipick"
+
+# number of messages in queue older than $MIN_AGE
+QSIZE="$(${SUDO} ${EXIPICK} -bpc -o ${MIN_AGE})"
+
+# echo "QSIZE: ${QSIZE} WARN: ${WARN_LIMIT} CRIT: ${CRIT_LIMIT}"
+
+if [ "$QSIZE" -ge "$CRIT_LIMIT" ] ; then
+    echo "CRITICAL: ${QSIZE} mails in exim queue."
+    exit 2
+fi
+
+if [ "$QSIZE" -ge "$WARN_LIMIT" ] ; then
+    echo "WARNING: ${QSIZE} mails in exim queue."
+    exit 1
+fi
+
+if [ "$QSIZE" -lt "$WARN_LIMIT" ] && [ "$QSIZE" -lt "$CRIT_LIMIT" ] ; then
+    echo "OK: Less than ${WARN_LIMIT} mails in exim queue."
+    exit 0
+fi
+
+echo "UNKNOWN: something went wrong. check plugin ($0)."
+exit 3
+
diff --git a/modules/role/manifests/mail/mx.pp b/modules/role/manifests/mail/mx.pp
index 905b309..d860227 100644
--- a/modules/role/manifests/mail/mx.pp
+++ b/modules/role/manifests/mail/mx.pp
@@ -106,4 +106,26 @@
         ensure => 'present',
         source => 'puppet:///modules/role/exim/logrotate/exim4-base.mx',
     }
+
+    # monitor mail queue size (T133110)
+    file { '/usr/local/lib/nagios/plugins/check_exim_queue':
+        ensure => present,
+        owner  => 'root',
+        group  => 'root',
+        mode   => '0555',
+        source => 'puppet:///modules/icinga/check_exim_queue.sh',
+    }
+
+    ::sudo::user { 'nagios_exim_queue':
+        user       => 'nagios',
+        privileges => ['ALL = NOPASSWD: /usr/sbin/exipick -bpc -o [[\:digit\:]][[\:digit\:]][mh]'],
+    }
+
+    nrpe::monitor_service { 'check_exim_queue':
+        description    => 'exim queue',
+        nrpe_command   => '/usr/local/lib/nagios/plugins/check_exim_queue -w 1000 -c 3000',
+        check_interval => 30,
+        retry_interval => 10,
+        timeout        => 20,
+    }
 }

-- 
To view, visit https://gerrit.wikimedia.org/r/361023
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I70bdef87eed6902ad27c92f2fa0e19b3d2274d7d
Gerrit-PatchSet: 12
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Dzahn <dz...@wikimedia.org>
Gerrit-Reviewer: Alexandros Kosiaris <akosia...@wikimedia.org>
Gerrit-Reviewer: Dzahn <dz...@wikimedia.org>
Gerrit-Reviewer: Filippo Giunchedi <fgiunch...@wikimedia.org>
Gerrit-Reviewer: Herron <kher...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>
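
Note on the threshold logic in check_exim_queue.sh above: the sketch below isolates the WARN/CRIT comparisons so they can be exercised without exim or sudo installed. It is only an illustration, not part of the merged change. classify_queue and age_allowed are hypothetical helper names. age_allowed mirrors the glob in the sudo rule, on the assumption that sudoers matches command arguments with fnmatch-style globbing, so only two-digit ages ending in m or h (such as the plugin's "10m") would be permitted.

```shell
#!/bin/bash
# Hypothetical helper (not in the merged plugin): map a queue size to a
# Nagios-style status line and return code, using the same -ge
# comparisons as check_exim_queue.sh.
classify_queue() {
    local -i qsize=$1 warn=$2 crit=$3
    if [ "$qsize" -ge "$crit" ]; then
        echo "CRITICAL: ${qsize} mails in exim queue."
        return 2
    fi
    if [ "$qsize" -ge "$warn" ]; then
        echo "WARNING: ${qsize} mails in exim queue."
        return 1
    fi
    echo "OK: Less than ${warn} mails in exim queue."
    return 0
}

# Hypothetical helper: test an age string against the same glob that the
# sudo rule uses for the exipick -o argument (assumed fnmatch semantics).
age_allowed() {
    [[ $1 == [[:digit:]][[:digit:]][mh] ]]
}

# Try a few queue sizes against the thresholds from the NRPE command.
for size in 300 1500 4000; do
    rc=0
    classify_queue "$size" 1000 3000 || rc=$?
    echo "  exit code: ${rc}"
done

# Check which MIN_AGE values the sudo rule pattern would accept.
for age in 10m 5m 100m 12h; do
    if age_allowed "$age"; then
        echo "age ${age}: matches sudo rule pattern"
    else
        echo "age ${age}: does not match sudo rule pattern"
    fi
done
```

If MIN_AGE in the plugin were ever changed to a value such as "5m" or "100m", the sudo rule pattern would presumably need to change with it, since the glob accepts exactly two digits.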