Volans has submitted this change and it was merged.

Change subject: Monitoring: avoid NRPE limit for RAID get status
......................................................................


Monitoring: avoid NRPE limit for RAID get status

NRPE has an output limit of 1024 bytes, so the output of the get RAID
status scripts was truncated. Compress the output to reduce the
probability of reaching the limit, and encode the NULL bytes, which NRPE
doesn't handle. Given the specific domain, there is no need for a full
yEnc encoding or similar.

Bug: T142085
Change-Id: I585f1cc8d4eaff408c8ebc43252e09770d10f3cd
---
M modules/raid/files/get-raid-status-hpssacli.sh
M modules/raid/files/get-raid-status-megacli.py
M modules/raid/manifests/init.pp
3 files changed, 94 insertions(+), 20 deletions(-)

Approvals:
  Faidon Liambotis: Looks good to me, but someone else must approve
  Volans: Looks good to me, approved
  jenkins-bot: Verified
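
The commit message describes compressing the summary with zlib and
replacing NULL bytes with a '###NULL###' placeholder. The sketch below
illustrates that round trip in Python 3. The decode() function, the
constant names, and the sample text are assumptions made for
illustration; the receiving side is not part of this change, and the
merged scripts use a Python 2 style call on a str instead of bytes.

```python
import zlib

# Output limit in bytes, as stated in the commit message.
NRPE_LIMIT = 1024
# Placeholder that the scripts in this change substitute for NULL bytes.
NULL_MARKER = b'###NULL###'


def encode(summary):
    """Compress the summary and replace NULL bytes, mirroring the scripts."""
    compressed = zlib.compress(summary.encode('utf-8'))
    return compressed.replace(b'\x00', NULL_MARKER)


def decode(payload):
    """Hypothetical receiving side: restore NULL bytes, then decompress."""
    compressed = payload.replace(NULL_MARKER, b'\x00')
    return zlib.decompress(compressed).decode('utf-8')


if __name__ == '__main__':
    # Made-up sample that only resembles a repetitive RAID summary.
    sample = '\n'.join(
        'Physical Drive {}: Firmware state: Online'.format(i)
        for i in range(40))
    payload = encode(sample)
    print('raw bytes:', len(sample.encode('utf-8')))
    print('encoded bytes:', len(payload))
    print('fits NRPE limit:', len(payload) <= NRPE_LIMIT)
    assert decode(payload) == sample
```

The placeholder is not escaped, so a compressed stream that happened to
contain the literal bytes '###NULL###' would not decode correctly. The
commit message accepts that risk for this domain instead of using a
full yEnc-style encoding.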



diff --git a/modules/raid/files/get-raid-status-hpssacli.sh b/modules/raid/files/get-raid-status-hpssacli.sh
index 839232d..fd0ae96 100644
--- a/modules/raid/files/get-raid-status-hpssacli.sh
+++ b/modules/raid/files/get-raid-status-hpssacli.sh
@@ -2,7 +2,47 @@
 
 set -e
 
+function usage() {
+    cat <<EOF
+usage: ${0} [-c]
+
+Print a summarized status of all detected HP Raid controllers
+
+optional arguments:
+  -c          compress with zlib the summary to overcome NRPE output limits.
+  -h          show this help message and exit
+EOF
+
+    exit "${1}"
+}
+
+COMPRESS=0
+if [[ -n "${1}" ]]; then
+    case "${1}" in
+        "-c") COMPRESS=1;;
+        "-h") usage 0;;
+        *)
+            echo "Invalid parameter '${1}'"
+            usage 1
+            ;;
+    esac
+fi
+
+OUTPUT=""
 while read -r CONTROLLER; do
-    /usr/bin/sudo /usr/sbin/hpssacli controller slot="${CONTROLLER}" ld all show detail
-    echo
+    OUTPUT="${OUTPUT}$(/usr/bin/sudo /usr/sbin/hpssacli controller slot="${CONTROLLER}" ld all show detail)\n"
 done < <(/usr/bin/sudo /usr/sbin/hpssacli controller all show | egrep -o 'Slot [0-9] ' | cut -d' ' -f2)
+
+PYTHON_SCRIPT="
+import sys
+import zlib
+
+# NRPE doesn't handle NULL bytes, encoding them.
+# Given the specific domain there is no need of a full yEnc encoding
+print(zlib.compress(sys.stdin.read()).replace('\x00', '###NULL###'))"
+
+if [[ "${COMPRESS}" -eq "1" ]]; then
+    echo -e "${OUTPUT}" | python -c "${PYTHON_SCRIPT}"
+else
+    echo -e "${OUTPUT}"
+fi
diff --git a/modules/raid/files/get-raid-status-megacli.py b/modules/raid/files/get-raid-status-megacli.py
index 6c67e76..7164f26 100644
--- a/modules/raid/files/get-raid-status-megacli.py
+++ b/modules/raid/files/get-raid-status-megacli.py
@@ -3,10 +3,12 @@
 Get the status of a MegaRAID RAID
 
 Execute and parse megacli commands in order to print a summary of the RAID
-status. Only components in non-optimal status are shown.
+status. By default only components in non-optimal status are shown.
 """
 
+import argparse
 import subprocess
+import zlib
 
 ADAPTER_LINE_STARTSWITH = 'Adapter #'
 EXIT_LINE_STARTSWITH = 'Exit Code:'
@@ -176,30 +178,33 @@
             if context == final_context:
                 break
 
-    def print_status(self, optimal=False):
-        """ Print to stdout the summarized RAID status
+    def get_status(self, get_all=False):
+        """ Return a string with the summarized RAID status
 
             Keyword arguments:
-            optimal -- if False print only hierarchical chains where there is
+            get_all -- if False print only hierarchical chains where there is
                        at least one non-optimal block, all blocks otherwise
         """
 
+        status = []
         message = 'does not include components in optimal state'
-        if optimal:
+        if get_all:
             message = 'includes all components'
 
-        print('=== RaidStatus ({})'.format(message))
+        status.append('=== RaidStatus ({})'.format(message))
 
         for adapter in self.adapters:
-            if not optimal and adapter['optimal']:
+            if not get_all and adapter['optimal']:
                 continue
 
-            self._print_block(adapter, optimal=optimal)
+            status += self._get_block_status(adapter, get_all=get_all)
 
-        print('=== RaidStatus completed')
+        status.append('=== RaidStatus completed')
 
-    def _print_block(self, block, prefix='', optimal=False):
-        """ Print to stdout a summary of the given block
+        return '\n'.join(status)
+
+    def _get_block_status(self, block, prefix='', get_all=False):
+        """ Return a list of strings with the summary of the given block
 
             Arguments:
             block   -- the block to be printed
@@ -210,22 +215,42 @@
                        at least one non-optimal block, all blocks otherwise
         """
 
+        status = []
         for key in CONTEXTS[block['context']]['print_keys']:
             try:
-                print('{}{}: {}'.format(prefix, key, block[key]))
+                status.append('{}{}: {}'.format(prefix, key, block[key]))
             except:
                 pass  # Explicitely ignore missing keys
 
-        print('')  # Separate each block
+        status.append('')  # Separate each block
 
         for child in block['childs']:
             # Skip the child if not needed
-            if not optimal and child['optimal']:
+            if not get_all and child['optimal']:
                 if (block['optimal'] or
                         not CONTEXTS[block['context']]['include_childs']):
                     continue
 
-            self._print_block(child, prefix=prefix + '\t', optimal=optimal)
+            status += self._get_block_status(
+                child, prefix=prefix + '\t', get_all=get_all)
+
+        return status
+
+
+def parse_args():
+    """Parse command line arguments"""
+
+    parser = argparse.ArgumentParser(
+        description=('Print a summarized status of all non-optimal components '
+                     'of all detected MegaRAID controllers'))
+    parser.add_argument(
+        '-c', dest='compress', action='store_true',
+        help='Compress with zlib the summary to overcome NRPE output limits.')
+    parser.add_argument(
+        '-a', dest='all', action='store_true',
+        help='Include all components in the summary.')
+
+    return parser.parse_args()
 
 
 def parse_megacli_status(status):
@@ -276,6 +301,15 @@
 
 
 if __name__ == '__main__':
+    args = parse_args()
+
     status = RaidStatus()
     parse_megacli_status(status)
-    status.print_status()
+    summary = status.get_status(get_all=args.all)
+
+    if args.compress:
+        # NRPE doesn't handle NULL bytes, encoding them.
+        # Given the specific domain there is no need of a full yEnc encoding
+        print(zlib.compress(summary).replace('\x00', '###NULL###'))
+    else:
+        print(summary)
diff --git a/modules/raid/manifests/init.pp b/modules/raid/manifests/init.pp
index d99e052..a9a4f1f 100644
--- a/modules/raid/manifests/init.pp
+++ b/modules/raid/manifests/init.pp
@@ -41,7 +41,7 @@
         }
 
         nrpe::check { 'get_raid_status_megacli':
-            command => "/usr/bin/sudo ${get_raid_status_megacli}",
+            command => "/usr/bin/sudo ${get_raid_status_megacli} -c",
         }
 
         nrpe::monitor_service { 'raid_megaraid':
@@ -105,7 +105,7 @@
         }
 
         nrpe::check { 'get_raid_status_hpssacli':
-            command => $get_raid_status_hpssacli,
+            command => "${get_raid_status_hpssacli} -c",
         }
     }
 

-- 
To view, visit https://gerrit.wikimedia.org/r/303992
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I585f1cc8d4eaff408c8ebc43252e09770d10f3cd
Gerrit-PatchSet: 3
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Volans <rcocci...@wikimedia.org>
Gerrit-Reviewer: Faidon Liambotis <fai...@wikimedia.org>
Gerrit-Reviewer: Volans <rcocci...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
