Re: fping.monitor output problems
On 2005-07-07 13:00, Kevin Ivory wrote:
> the fping.monitor included with mon-1.0.0pre5 doesn't seem to parse the
> output of fping correctly.
> ...
> # ./fping.monitor 192.168.140.3
> 192.168.140.3 ICMP ICMP ICMP ICMP

Some more information: the problematic code must have gone in between pre3 and pre4; pre3's output looks fine.

Kevin
--
Kevin Ivory | Tel: +49-551-370
Service Network GmbH | Fax: +49-551-379
Bahnhofsallee 1b | mailto:[EMAIL PROTECTED]
37081 Goettingen | http://www.SerNet.de/
___
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
fping.monitor output problems
The fping.monitor included with mon-1.0.0pre5 doesn't seem to parse the output of fping correctly. (Extra info: depending on the fping version, the combination of default options used in fping.monitor runs in 1.5s with the old version and 16s with the recent version.) This happens both on SUSE with fping-2.2b1-270 and on Debian with fping 2.4b2-to-ipv6-

# fping -e -r 3 -t 2000 -b 56 192.168.140.3
ICMP Host Unreachable from 192.168.140.2 for ICMP Echo sent to 192.168.140.3
ICMP Host Unreachable from 192.168.140.2 for ICMP Echo sent to 192.168.140.3
ICMP Host Unreachable from 192.168.140.2 for ICMP Echo sent to 192.168.140.3
ICMP Host Unreachable from 192.168.140.2 for ICMP Echo sent to 192.168.140.3
192.168.140.3 is unreachable
# ./fping.monitor 192.168.140.3
192.168.140.3 ICMP ICMP ICMP ICMP

start time: Thu Jul 7 12:10:51 2005
end time  : Thu Jul 7 12:11:07 2005
duration  : 16 seconds
fping args: fping -e -r 3 -t 2000 -b 56

-- unreachable hosts --
ICMP ICMP ICMP ICMP 192.168.140.3
#

Kevin
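The transcript above suggests that the "ICMP Host Unreachable ..." diagnostics are being treated as host names. A minimal sketch of one way to keep them out of the host list, assuming the output format shown in the message; `unreachable_hosts` and `sample_output` are hypothetical names and this is not the code in fping.monitor:

```shell
#!/bin/bash
# Hypothetical helper, not part of mon: read fping output on stdin and
# print only the hosts that fping itself reports as unreachable.
unreachable_hosts() {
    # Only lines of the form "<host> is unreachable" name a failed host.
    # Diagnostics such as "ICMP Host Unreachable from ..." do not match,
    # because their second word is "Host", not "is".
    awk '$2 == "is" && $3 == "unreachable" { print $1 }'
}

# Sample input copied from the fping run quoted in the message.
sample_output() {
    cat <<'EOF'
ICMP Host Unreachable from 192.168.140.2 for ICMP Echo sent to 192.168.140.3
ICMP Host Unreachable from 192.168.140.2 for ICMP Echo sent to 192.168.140.3
192.168.140.3 is unreachable
EOF
}

sample_output | unreachable_hosts
```

With the sample input this prints only 192.168.140.3, which is what the summary line of the monitor should contain.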
Re: Monitoring software raid?
Hi Simon,

On 2005-04-05 12:49, Simon Detheridge wrote:
> Has anyone come up with a monitor to keep an eye on linux software-raid
> status? How about remote boxes? Presumably this would require some kind
> of snmp extension... Does anyone have a working solution running for this?

The local check can be done by an extremely simple shell script (the user executing it only needs to write a reference file once):

#!/bin/bash
# softraid.monitor
# Linux Software RAID check with mon compatible output/return values
# Call without arguments.
# The reference file $md_ref must exist. To generate it:
#   softraid.monitor learn
#   [ cat /proc/mdstat > /path/to/dir/mdstat.reference ]
# no administrative permissions needed for this script.
# Return values: 3  /proc/mdstat missing, no Software RAID?
#                2  reference file missing
#                1  RAID not okay
#                0  all okay

mdstat="/proc/mdstat"
# THIS NEEDS TO BE SOMEWHERE THE CHECKING USER CAN WRITE
md_ref="/var/something/mdstat.reference"

if [ ! -r "$mdstat" ]; then
    echo -e "$HOSTNAME:$0 Missing RAID status file: $mdstat\nPerhaps no software RAID?"
    exit 3
fi

# "learn" stores the current state as the reference for later comparisons
if [ "$1" = "learn" ]; then
    cat "$mdstat" > "$md_ref"
fi

if [ ! -r "$md_ref" ]; then
    echo -e "$HOSTNAME:$0 Missing RAID reference file: $md_ref\nGenerate with: $0 learn"
    exit 2
fi

md_out="Complete contents of $mdstat:\n\n$(cat "$mdstat")"
# any difference from the reference counts as a failure
diff=$(diff -U 0 "$md_ref" "$mdstat")
stat=$?

if [ $stat -eq 0 ]; then
    echo -e "$HOSTNAME\nSoftware RAID ok:\n$md_out"
else
    echo -e "$HOSTNAME\nSoftware RAID not ok:\n$diff\n\n$md_out"
    exit 1
fi
# end of softraid.monitor

We implement remote checks with ssh private/public keys, forced commands, and a separate shell script that loops over a whole bunch of remote hosts. (The biggest problem with this is that the ssh command does not time out.)

Kevin
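The ssh timeout problem mentioned above can be worked around in several ways. The following is a rough sketch of one of them, not the script used at SerNet: it assumes GNU coreutils `timeout` is installed, relies on the OpenSSH options `BatchMode` and `ConnectTimeout`, and uses the hypothetical names `check_hosts`, `REMOTE_CMD` and `LIMIT`.

```shell
#!/bin/bash
# Sketch only: the 20-second limit and the variable names are placeholders.
# With forced commands on the remote side, the key alone decides what runs,
# so no remote command is passed to ssh here.
REMOTE_CMD=${REMOTE_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}
LIMIT=${LIMIT:-20}

check_hosts() {
    local host failed=""
    for host in "$@"; do
        # timeout(1) kills the command after $LIMIT seconds and exits
        # with status 124, so a hung connection counts as a failure.
        # $REMOTE_CMD is left unquoted on purpose so its options split.
        if ! timeout "$LIMIT" $REMOTE_CMD "$host" </dev/null >/dev/null 2>&1; then
            failed="$failed $host"
        fi
    done
    if [ -n "$failed" ]; then
        # first output line: summary of failed hosts, as mon alerts expect
        echo "${failed# }"
        return 1
    fi
    return 0
}
```

Called as `check_hosts hostA hostB` from a monitor script, the function returns nonzero and lists the hosts whose check failed or ran longer than the limit.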
Re: DNS Monitor
Casey Bralla wrote:
> Has anybody gotten DNS monitoring to work? I get a
> ": dns.monitor: The zone master server must be specified" error,
> even on DNS servers I know to be fully functional.

You need to use the -zone option. Here is a line from one of our watches:

    monitor dns.monitor -zone SerNet.de -master 193.159.217.2

Kevin
Re: How to Set eMail "From" name in Alerts?
Casey Bralla wrote:
> I've just set up mon to eMail me alerts when one of my services goes down.
> However, the From line lists the mail as coming from "[EMAIL PROTECTED]".
> How can I customize this to say something meaningful?

I once received a much enhanced mail.alert version from someone on this list (see the header of the attached version). Even in the enhanced version there was no variable to set the From/Reply-To fields, so I hard-coded them into the script. Have a look at the source and you will easily find the lines. With a little Perl knowledge it should be an easy exercise to make the fields optional and settable by script options.

Kevin

#!/usr/bin/perl
#
# mail2.alert - Improved Mail alert for mon
#
# The first line from STDIN is summary information, adequate to send
# to a pager or email subject line.
#
# Mark Lawrence, [EMAIL PROTECTED] - based on the original from:
# Jim Trocki, [EMAIL PROTECTED]
#
# $Id: mail.alert 1.1 Sat, 26 Aug 2000 15:22:34 -0400 trockij $
#
#    Copyright (C) 1998, Jim Trocki
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation; either version 2 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program; if not, write to the Free Software
#    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
$RCSID='$Id: mail.alert 1.1 Sat, 26 Aug 2000 15:22:34 -0400 trockij $';

### Extra Perl packages ###
use Getopt::Std;
use Text::Wrap;
use Text::Tabs;

### A function to pretty-print statements of the form "Field: value" ###
$Text::Wrap::columns = 72;
$colwidth = 18;
$fillspec = "\%-${colwidth}s\ ";
$fill = sprintf($fillspec, " ");

sub wprint {
    my($description, $string) = @_;
    $description = $description . ":";
    return expand wrap ("", $fill, sprintf("${fillspec}\%s", $description, $string)), "\n";
}

### Evaluate command-line options and environment values ###
getopts ("S:s:g:h:t:l:u");
$summary=<STDIN>;
chomp $summary;
$summary = $opt_S if (defined $opt_S);
$mailaddrs = join (',', @ARGV);
$ALERT = $ENV{"MON_ALERTTYPE"};
$firstfailt = $ENV{"MON_FIRST_FAILURE"};
$firstfail = localtime($firstfailt);

# time since the first failure, split into days/hours/mins/secs
my $sec = $opt_t - $firstfailt;
my $days = int($sec/(24*60*60));
$days = $days > 0 ? ($days . " day" . ($days > 1 ? "s " : " ")) : "";
my $hours = ($sec/(60*60))%24;
$hours = $hours > 0 ? ($hours . " hour" . ($hours > 1 ? "s " : " ")) : "";
my $mins = ($sec/60)%60;
$mins = $mins > 0 ? ($mins . " min" . ($mins > 1 ? "s " : " ")) : "";
my $secs = $sec%60;
$secs = $secs > 0 ? ($secs . " sec" . ($secs > 1 ? "s " : " ")) : "";
my $length = $days . $hours . $mins . $secs;
if ($sec == 0) { $length = "0"; }

$t = localtime($opt_t);
($wday,$mon,$day,$tm) = split (/\s+/, $t);
$subject = uc($ALERT) . ": $opt_g $opt_s: $summary";

### Create the email ###
open (MAIL, "| /usr/sbin/sendmail -oi -t") || die "could not open pipe to mail: $!\n";
print MAIL <<[...]
while (<STDIN>) { print MAIL; }
close (MAIL);
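Making the From/Reply-To fields optional, as suggested above, could look roughly like the following. It is a sketch in shell rather than a patch to the Perl script: `build_mail`, `ALERT_FROM` and `ALERT_REPLY_TO` are hypothetical names, and only the `/usr/sbin/sendmail -oi -t` invocation is taken from the attached script.

```shell
#!/bin/bash
# Sketch: print a mail message on stdout; From and Reply-To headers are
# emitted only when the (hypothetical) environment variables are set.
build_mail() {
    local to=$1 subject=$2
    printf 'To: %s\n' "$to"
    if [ -n "$ALERT_FROM" ]; then
        printf 'From: %s\n' "$ALERT_FROM"
    fi
    if [ -n "$ALERT_REPLY_TO" ]; then
        printf 'Reply-To: %s\n' "$ALERT_REPLY_TO"
    fi
    # blank line ends the header block
    printf 'Subject: %s\n\n' "$subject"
    # the rest of stdin (the monitor detail) becomes the body
    cat
}
```

An alert script would then end with something like `build_mail "$recipients" "$subject" | /usr/sbin/sendmail -oi -t`, where `$recipients` and `$subject` are whatever the alert already computes.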
Re: [Patch] phttp.monitor - cure EINPROGRESS false alarms
Jim Trocki wrote:
> i've added this to mon-0-99-3.38.

That was released on Nov 28, wasn't it?

Kevin
Re: mail.alert
Mark Lawrence wrote:
> Attached is (according to my tastes) a slightly clearer version of the
> mail alert.

I think it is much clearer as well. Perhaps it can make it into the official mon, either as a supplementary alternative or as the official mail.alert?

Two things I noticed:
1. I hate the time stamp in the subject, but that is in the original as well, so it's my personal preference.
2. The wording "Affected Members" is incorrect: the list given is the complete list of all hostgroup members and does not say which one is affected. ("Hostgroup Members" might be better.)

Kevin
mail.alert
A few ideas for the mail.alert:
1. The mon description field is often more helpful than the Group/Service information. It should be included in the mail body (depending on the environment I would often prefer to have it as the subject).
2. I have never seen the "Secs until next alert" field filled (or am I doing something wrong?). Can't it go away completely?
3. It would be nice to have a configuration option (mon.cf) to set the "From:" and "Reply-To:" headers to some helpful value for administrative e-mail communication.

ad 1: (just to make it easy)

# diff -u mail.alert mail.alert-patched
--- mail.alert Sat Aug 26 21:22:34 2000
+++ mail.alert-patched Wed Jun 4 18:01:50 2003
@@ -58,6 +58,7 @@
 Group : $opt_g
 Service : $opt_s
+Service description : $ENV{"MON_DESCRIPTION"}
 Time noticed : $t
 Secs until next alert : $opt_l
 EOF

Kevin
Matching alerts and upalerts
If several hosts of a watch/service combination fail, an alert is invoked. When one of the hosts comes back up, a new alert is invoked for the ones still unavailable. We would prefer to get an upalert for the ones that are available again. This is especially important if SMS messages are sent. Is there an option to get the upalert behavior?

Best regards
Kevin
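To illustrate the behavior being asked for: given the failure list from the previous alert and the current one, the hosts that deserve an upalert are those that appear only in the previous list. The sketch below shows just that comparison; `recovered_hosts` is a hypothetical helper and says nothing about how mon handles this internally.

```shell
#!/bin/bash
# Illustration only: print every host that was in the previous failure
# list ($1, space-separated) but is no longer in the current one ($2).
recovered_hosts() {
    local host
    for host in $1; do
        case " $2 " in
            *" $host "*) ;;          # still failing, no upalert
            *) echo "$host" ;;       # dropped out of the failure list
        esac
    done
}

recovered_hosts "hostA hostB hostC" "hostB"
```

Here hostA and hostC are reported as recovered, while hostB stays in the failed set.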
Re: Alerts during exclude_period
"Shchuka, Konstantin" wrote:
> I've had to patch mon to make it process exclude_period configuration
> properly. Here is the diff:

Thank you. That looks good. I will give it a try.

Best regards,
Kevin
Alerts during exclude_period
We are getting too many false alarms, especially from services with excluded times. The clock on the mon server is definitely okay. Below are two configuration examples for services where we still get alarms at 5:00am and 6:00am respectively (the checked services are restarted at those times). Have we got the configuration wrong? (And how can this be debugged properly?)

service squid
    description xxx proxy
    depend xxx-proxy:ping
    interval 2m
    monitor tcp.monitor -p 3128
    period wd {Sun-Sat}
    # exclude_period hr {4:55am-5:10am}
    exclude_period hr {4am-6am}
    alertafter 2
    alertevery 12h
    alert mail.alert monwatch
    upalert mail.alert -u monwatch

service xxxping
    description xxx ping
    depend xyz-xxx:ping
    interval 10m
    monitor fping.monitor -r 3 -t 6000 server ;;
    period wd {Sun-Sat}
    exclude_period hr {5am-7am}
    alertafter 2
    alertevery 12h
    alert mail.alert monwatch
    upalert mail.alert -u monwatch

Best regards
Kevin
max alerts for time period
We sometimes have monitored services that go up and down several times a day. For each transition we correctly get alert mails and upalert mails. If the rate gets too high, we would prefer to have a rate limit on the mails (something like no more than ten mails per service per day). Is this configurable with mon?

Kevin
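Independent of whether mon offers such a setting, a limit like the one described can be approximated outside mon with a small wrapper around the alert. The following sketch keeps a per-day counter file for each group/service pair; `allow_alert`, `STATE_DIR` and `MAX_PER_DAY` are hypothetical names, the default path is a placeholder, and the counter is not protected against concurrent alerts.

```shell
#!/bin/bash
# Sketch of a per-service daily mail limit. Not a mon feature.
STATE_DIR=${STATE_DIR:-/var/tmp/mon-alert-count}
MAX_PER_DAY=${MAX_PER_DAY:-10}

# allow_alert GROUP SERVICE
# Returns 0 and counts the mail if the limit for today is not yet
# reached, returns 1 otherwise.
allow_alert() {
    local key file count
    # build a file-name-safe key from group and service
    key=$(printf '%s_%s' "$1" "$2" | tr -c 'A-Za-z0-9_.-' '_')
    file="$STATE_DIR/$key.$(date +%Y%m%d)"
    mkdir -p "$STATE_DIR" || return 1
    count=$(cat "$file" 2>/dev/null || echo 0)
    count=${count:-0}
    if [ "$count" -ge "$MAX_PER_DAY" ]; then
        return 1
    fi
    echo $((count + 1)) > "$file"
}
```

A wrapper alert would call `allow_alert "$group" "$service"` with the values it receives via `-g` and `-s` and run mail.alert only when the function succeeds. Counter files from earlier days are left behind and need to be cleaned up separately.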
monerrfile missing in mon.8
The documentation of the new monerrfile global variable is missing from the mon(8) man page of mon 0.99.

Kevin