Ori.livneh has uploaded a new change for review.
https://gerrit.wikimedia.org/r/81121
Change subject: Add Icinga plug-in & NRPE check for EventLogging jobs
......................................................................
Add Icinga plug-in & NRPE check for EventLogging jobs
This patch adds a shell script that acts as an Icinga plug-in for EventLogging.
It walks the job instance definition tree (/etc/eventlogging.d) and checks that
each defined instance is running. If /etc/eventlogging.d is missing, it exits
with status UNKNOWN. If one or more defined instances are stopped, it emits a
CRITICAL status with a message that enumerates the stopped services. If all
defined instances are running, it exits with status OK.
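
The decision logic described above can be sketched as a standalone POSIX sh
function. This is only an illustration, not the plug-in itself: check_jobs and
its two parameters (the config directory and a status command invoked as
"cmd ROLE NAME") are hypothetical names introduced here so the logic can be
exercised on a machine without Upstart.

```shell
#!/bin/sh
# Illustrative sketch only; check_jobs, conf_dir and status_cmd are
# hypothetical names that do not appear in the patch.

check_jobs() {
    conf_dir="$1"     # stands in for /etc/eventlogging.d
    status_cmd="$2"   # stands in for /sbin/status -q; called as: cmd ROLE NAME

    # No config tree at all: we cannot tell what should be running.
    if [ ! -d "$conf_dir" ]; then
        echo "UNKNOWN: Can't find EventLogging job config dir $conf_dir"
        return 3
    fi

    stopped=""
    for role in forwarder processor multiplexer consumer; do
        for config in "$conf_dir/${role}s"/*; do
            # An unmatched glob stays literal; skip it.
            [ -e "$config" ] || continue
            name="$(basename "$config")"
            # Collect every instance whose status command fails.
            if ! "$status_cmd" "$role" "$name"; then
                stopped="${stopped}${stopped:+ }${role}/${name}"
            fi
        done
    done

    if [ -n "$stopped" ]; then
        echo "CRITICAL: Stopped EventLogging jobs: ${stopped}"
        return 2
    fi

    echo "OK: All defined EventLogging jobs are running."
    return 0
}
```

The return values 0, 2 and 3 correspond to the OK, CRITICAL and UNKNOWN
statuses named above. To approximate the real check, status_cmd could be a
small wrapper function (again hypothetical) that runs
/sbin/status -q "eventlogging/$1" NAME="$2".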
Change-Id: I88af621c470f1cc7cc557bcdbbc2dee5806d7c01
---
M manifests/role/eventlogging.pp
A modules/eventlogging/files/check_eventlogging_jobs
M modules/eventlogging/manifests/monitor.pp
3 files changed, 49 insertions(+), 0 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/21/81121/1
diff --git a/manifests/role/eventlogging.pp b/manifests/role/eventlogging.pp
index 0947baf..b29bc76 100644
--- a/manifests/role/eventlogging.pp
+++ b/manifests/role/eventlogging.pp
@@ -154,4 +154,15 @@
hosts_allow => $backup_destinations,
}
}
+
+
+ ## Monitoring
+
+ nrpe::monitor_service { 'eventlogging':
+ ensure => 'present',
+ description => 'Check status of defined EventLogging jobs',
+ nrpe_command => '/usr/lib/nagios/plugins/check_eventlogging_jobs',
+ require => File['/usr/lib/nagios/plugins/check_eventlogging_jobs'],
+ contact_group => 'admins,analytics',
+ }
}
diff --git a/modules/eventlogging/files/check_eventlogging_jobs b/modules/eventlogging/files/check_eventlogging_jobs
new file mode 100755
index 0000000..1e2a4ac
--- /dev/null
+++ b/modules/eventlogging/files/check_eventlogging_jobs
@@ -0,0 +1,33 @@
+#!/bin/sh
+# check_eventlogging_jobs
+#
+# EventLogging plug-in for Nagios/Icinga. Iterates through job instance
+# definition files in /etc/eventlogging.d and ensures that they are running.
+
+if [ ! -d "/etc/eventlogging.d" ]; then
+ echo "UNKNOWN: Can't find EventLogging job config dir /etc/eventlogging.d"
+ exit 3
+fi
+
+roles="forwarder processor multiplexer consumer"
+set -- $roles
+stopped=""
+
+for role in "$@"; do
+ for config in /etc/eventlogging.d/${role}s/*; do
+ [ -e "$config" ] || continue
+ name="$(basename "$config")"
+ /sbin/status -q "eventlogging/${role}" NAME="${name}"
+ if [ $? -ne 0 ]; then
+ stopped="${role}/${name} ${stopped}"
+ fi
+ done
+done
+
+if [ ! -z "$stopped" ]; then
+ echo "CRITICAL: Stopped EventLogging jobs: ${stopped}"
+ exit 2
+fi
+
+echo "OK: All defined EventLogging jobs are running."
+exit 0
diff --git a/modules/eventlogging/manifests/monitor.pp b/modules/eventlogging/manifests/monitor.pp
index e0cc6d6..ebd822b 100644
--- a/modules/eventlogging/manifests/monitor.pp
+++ b/modules/eventlogging/manifests/monitor.pp
@@ -18,4 +18,9 @@
require => File['/usr/lib/ganglia/python_modules/eventlogging_mon.py'],
notify => Service['gmond'],
}
+
+ file { '/usr/lib/nagios/plugins/check_eventlogging_jobs':
+ source => 'puppet:///modules/eventlogging/check_eventlogging_jobs',
+ mode => '0755',
+ }
}
--
To view, visit https://gerrit.wikimedia.org/r/81121
Gerrit-MessageType: newchange
Gerrit-Change-Id: I88af621c470f1cc7cc557bcdbbc2dee5806d7c01
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>