Ori.livneh has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/81121


Change subject: Add Icinga plug-in & NRPE check for EventLogging jobs
......................................................................

Add Icinga plug-in & NRPE check for EventLogging jobs

This patch adds a shell script that acts as an Icinga plug-in for EventLogging,
along with an NRPE check that runs it. The script walks the job instance
definition tree (/etc/eventlogging.d) and checks that each defined instance is
running. If /etc/eventlogging.d is missing, it exits with status UNKNOWN (3).
If one or more defined instances are stopped, it exits with status CRITICAL (2)
and a message that enumerates the stopped jobs. If all defined instances are
running, it exits with status OK (0).
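
The behaviour described above can be tried without Upstart or a real
/etc/eventlogging.d by using a sketch along the following lines. It is not the
script from this patch: the function name check_jobs and its two arguments (a
config directory and a status command) are hypothetical, introduced only so the
logic can be exercised against a scratch directory. The patch itself hard-codes
/etc/eventlogging.d and calls /sbin/status.

```shell
#!/bin/sh
# Sketch only. check_jobs, confdir and status_cmd are illustrative names,
# not part of the patch.
check_jobs() {
    confdir="$1"      # root of the job definition tree
    status_cmd="$2"   # called as: status_cmd ROLE NAME; returns 0 if running

    if [ ! -d "$confdir" ]; then
        echo "UNKNOWN: Can't find EventLogging job config dir $confdir"
        return 3
    fi

    stopped=""
    for role in forwarder processor multiplexer consumer; do
        for config in "$confdir/${role}s"/*; do
            # An unmatched glob stays literal; skip it.
            [ -e "$config" ] || continue
            name="$(basename "$config")"
            if ! "$status_cmd" "$role" "$name"; then
                stopped="${role}/${name} ${stopped}"
            fi
        done
    done

    if [ -n "$stopped" ]; then
        echo "CRITICAL: Stopped EventLogging jobs: ${stopped}"
        return 2
    fi

    echo "OK: All defined EventLogging jobs are running."
    return 0
}
```

Passing the status command as an argument is purely a testing convenience; a
stub function that returns 0 or 1 stands in for the real job status query.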

Change-Id: I88af621c470f1cc7cc557bcdbbc2dee5806d7c01
---
M manifests/role/eventlogging.pp
A modules/eventlogging/files/check_eventlogging_jobs
M modules/eventlogging/manifests/monitor.pp
3 files changed, 49 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/21/81121/1

diff --git a/manifests/role/eventlogging.pp b/manifests/role/eventlogging.pp
index 0947baf..b29bc76 100644
--- a/manifests/role/eventlogging.pp
+++ b/manifests/role/eventlogging.pp
@@ -154,4 +154,15 @@
             hosts_allow => $backup_destinations,
         }
     }
+
+
+    ## Monitoring
+
+    nrpe::monitor_service { 'eventlogging':
+        ensure        => 'present',
+        description   => 'Check status of defined EventLogging jobs',
+        nrpe_command  => '/usr/lib/nagios/plugins/check_eventlogging_jobs',
+        require       => File['/usr/lib/nagios/plugins/check_eventlogging_jobs'],
+        contact_group => 'admins,analytics',
+    }
 }
diff --git a/modules/eventlogging/files/check_eventlogging_jobs b/modules/eventlogging/files/check_eventlogging_jobs
new file mode 100755
index 0000000..1e2a4ac
--- /dev/null
+++ b/modules/eventlogging/files/check_eventlogging_jobs
@@ -0,0 +1,33 @@
+#!/bin/sh
+# check_eventlogging_jobs
+#
+# EventLogging plug-in for Nagios/Icinga. Iterates through job instance
+# definition files in /etc/eventlogging.d and ensures that they are running.
+
+if [ ! -d "/etc/eventlogging.d" ]; then
+    echo "UNKNOWN: Can't find EventLogging job config dir /etc/eventlogging.d"
+    exit 3
+fi
+
+roles="forwarder processor multiplexer consumer"
+set -- $roles
+stopped=""
+
+for role in "$@"; do
+    for config in /etc/eventlogging.d/${role}s/*; do
+        [ -e "$config" ] || break
+        name="$(basename "$config")"
+        /sbin/status -q "eventlogging/${role}" NAME="${name}"
+        if [ $? -ne 0 ]; then
+            stopped="${role}/${name} ${stopped}"
+        fi
+    done
+done
+
+if [ ! -z "$stopped" ]; then
+    echo "CRITICAL: Stopped EventLogging jobs: ${stopped}"
+    exit 2
+fi
+
+echo "OK: All defined EventLogging jobs are running."
+exit 0
diff --git a/modules/eventlogging/manifests/monitor.pp b/modules/eventlogging/manifests/monitor.pp
index e0cc6d6..ebd822b 100644
--- a/modules/eventlogging/manifests/monitor.pp
+++ b/modules/eventlogging/manifests/monitor.pp
@@ -18,4 +18,9 @@
         require => File['/usr/lib/ganglia/python_modules/eventlogging_mon.py'],
         notify  => Service['gmond'],
     }
+
+    file { '/usr/lib/nagios/plugins/check_eventlogging_jobs':
+        source => 'puppet:///modules/eventlogging/check_eventlogging_jobs',
+        mode   => '0755',
+    }
 }

-- 
To view, visit https://gerrit.wikimedia.org/r/81121
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I88af621c470f1cc7cc557bcdbbc2dee5806d7c01
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
