Package: mon-contrib
Version: 1.0+dfsg-3
Severity: normal
Tags: patch

The attached patch adds two options to remote.monitor: --alerts_sent, the
minimum number of alerts to be sent, and --failure_duration, the minimum
failure duration, before a failure on a remote system is considered to be a
local failure.

If you have multiple servers that are all capable of independently notifying
the sysadmin, then remote.monitor is a good way to make sure that they are all
operating correctly.  In the normal case the remote server notifies you; if
that doesn't work, the master server notifies you some time later.
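
As an illustration only, a mon.cf stanza on the master server using the new
options might look like the sketch below.  The hostgroup name, host names,
interval, thresholds and alert recipient are all made-up values, and the
duration is assumed to be in seconds (as the patch's debug message suggests).

```
# Hypothetical example: names and thresholds are placeholders.
hostgroup monservers mon1.example.com mon2.example.com

watch monservers
    service remote
        interval 5m
        # Only treat a remote failure as local once the remote server has
        # sent at least 2 alerts and the failure has lasted 600 seconds.
        monitor remote.monitor --alerts_sent 2 --failure_duration 600
        period wd {Sun-Sat}
            alert mail.alert root
```

With values like these the remote server gets the first chance to notify
the sysadmin, and the master server only alerts if the failure persists.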

--- /usr/lib/mon/mon.d/remote.monitor   2014-07-03 11:31:07.000000000 +1000
+++ remote.monitor      2016-05-13 21:25:13.117297577 +1000
@@ -12,6 +12,11 @@
 #                    return for each failed mon server the list of the
 #                    failed. Like : host1([g1:s1|s3][g4:s5]) ... 
 #
+# --alerts_sent           : the number of alerts that should be sent before we consider it a
+#                   problem
+#
+# --failure_duration : the minimum duration of a recorded problem before we alert
+#
 # --bigsummary     : flag to extend the summary of this monitor
 #                    return for each failed mon server the list of the
 #                    failed. Like : host1([g1:s1{sum}|s3{sum}][g4:s5{sum}]) ... 
@@ -47,6 +52,8 @@
                "timeout|t:i"  => \$timeout,
                "summary"      => \$summary,
               "bigsummary"   => \$bigsummary,
+              "failure_duration:i" => \$min_failure_duration,
+              "alerts_sent:i"  => \$min_alerts_sent,
                "debug|d"      => \$debug,
                "help|h"       => \$help,              
                "restrict|r:s"  => \$restrict,
@@ -61,6 +68,8 @@
 $port    = ($port)    ? $port   : "2583";
 $timeout = ($timeout) ? $timeout : "10";
 $summary = ($summary) ? $summary : $bigsummary;
+$min_failure_duration = ($min_failure_duration) ? $min_failure_duration : 0;
+$min_alerts_sent = ($min_alerts_sent) ? $min_alerts_sent : 1;
 ($restrict) and ($only_watch,$only_service) = split( /:/, ($restrict) );
 
 @failures = ();
@@ -177,6 +186,11 @@
                     my($opstatus);
 
                     next if ( ($only_service) && !( $service eq ($only_service) ));
+
+                    my $alerts_sent = $s{$watch}{$service}{alerts_sent};
+                    my $failure_duration = $s{$watch}{$service}{failure_duration};
+                    next if ($alerts_sent < $min_alerts_sent);
+                    next if ($failure_duration < $min_failure_duration);
                     # state service recuperation
                     $opstatus = $s{$watch}{$service}{opstatus};
                     ($debug) and print "$watch $service opstatus=$opstatus\n";
@@ -193,7 +207,7 @@
                     # service failed and not disabled
                     $hosterr++;
                     $watcherr++;
-                    ($debug) and print "Watch $watch service $service failed\n";
+                    ($debug) and print "Watch $watch service $service failed with $alerts_sent alerts for $failure_duration seconds\n";
                     push (@failures, ${host}) unless (defined($failuresDetails{${host}}));
                     $failuresDetails{${host}} .=
                          "Watch $watch, service $service, failed ".
-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.5.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages mon-contrib depends on:
ii  mon  1.2.0-9

mon-contrib recommends no packages.

mon-contrib suggests no packages.

-- no debconf information