Hi,
  First let me say thanks to Jim Trocki (and everyone who has helped) for
mon.  After using another package for the last couple of years, I decided
to look around for something that would be easier to maintain/extend.

  After configuring mon, I decided I wanted 2 new features:

1)  The ability to have an "only_hosts" definition in addition to
    the "exclude_hosts" definition (Note: I'm not attached to the
    specific name "only_hosts"; it's just what I came up with at the
    time).  I wanted this so that I could group all of my webservers
    in one watch group, even though some of them run on a special
    port.  For example:

    [...]
    define(_NORMAL_WWW_,  `sales marketing')dnl
    define(_SECURE_WWW_,  `sales')dnl
    define(_OTHER_WWW_,   `tests')dnl
    define(_WEBSERVERS_,   _NORMAL_WWW_ _SECURE_WWW_ _OTHER_WWW_)dnl

    hostgroup webservers _WEBSERVERS_

    watch webservers
        service ping
            monitor fping.monitor
        service telnet
            monitor telnet.monitor
        service freespace
            monitor snmpdiskspace.monitor --community fubar
        service http
            monitor http.monitor
            only_hosts _NORMAL_WWW_
        service https
            monitor tcp.monitor -p 443
            only_hosts _SECURE_WWW_
        service testhttp
            monitor tcp.monitor -p 8001
            only_hosts _OTHER_WWW_
    [...]

    I realize this can be accomplished using the "exclude_hosts"
    definition, but it would require one to remember all of the
    m4 host definitions that don't belong in that particular service.
    So if you add a new m4 group, you would have to find the watch
    group and make sure to add it to all of the service definitions
    for which it didn't apply.
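
    For comparison, here is a rough sketch of the same three services
    written with "exclude_hosts" only.  I've spelled out the host names
    from the example above rather than the m4 definitions, since
    "sales" appears in both _NORMAL_WWW_ and _SECURE_WWW_:

    [...]
        service http
            monitor http.monitor
            exclude_hosts tests
        service https
            monitor tcp.monitor -p 443
            exclude_hosts marketing tests
        service testhttp
            monitor tcp.monitor -p 8001
            exclude_hosts sales marketing
    [...]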

    BTW, this patch also includes a patch to remove duplicate hosts
    from the hostgroup (since it's quite easy to have duplicate hosts
    using this method) -- no need to duplicate checks.  :-)
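
    To illustrate with the example above: after m4 expansion the
    hostgroup line contains "sales" twice,

        hostgroup webservers sales marketing sales tests

    and with the patch a service without "only_hosts" (such as ping)
    should hand its monitor each host only once, in first-seen order:
    sales marketing tests.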


2)  The ability to have "hostname" entries in addition to "hostgroup"
    entries.  For hosts that are in multiple hostgroups, I'd rather
    define them and list all of the groups they're in as opposed to
    finding the individual hostgroups and adding them in.  This makes
    it much easier for me to temporarily remove a host from mon.  For
    example:

    [...]
    hostgroup webservers sales tests
    hostgroup nfsservers nfs1 nfs2

    hostname oddserver webservers nfsservers
    [...]

    Now if oddserver is going to be down for a while, I can just comment
    out the one line from my config file.  Again, I realize this can be
    accomplished in other ways (specifically using the m4 approach), but
    I don't feel this is nearly as clean as what I've implemented.
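
    As a sketch of the intended effect, the example above should behave
    as if the config file had been written like this:

    [...]
    hostgroup webservers sales tests oddserver
    hostgroup nfsservers nfs1 nfs2 oddserver
    [...]

    The list of groups can also continue onto the following non-blank
    lines, and the hostname entry has to come after the hostgroup
    entries it refers to.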


I've appended my patches (against mon-0.99.2) to this message; they
include patches to the documentation and example files.  Either one
can be installed separately, or you can install both.

Comments/Suggestions/Flames are appreciated.  :-)

...dave alden
--- mon_ORIG    Tue Oct 16 09:28:41 2001
+++ mon Tue Oct 16 09:33:09 2001
@@ -1166,6 +1166,7 @@
                $sref->{"dep_behavior"} = $DEP_BEHAVIOR;
                $sref->{"exclude_period"} = "";
                $sref->{"exclude_hosts"} = {};
+               $sref->{"only_hosts"} = {};
                $sref->{"_op_status"} = $STAT_UNTESTED;
                $sref->{"_last_op_status"} = $STAT_UNTESTED;
                $sref->{"_ack"} = 0;
@@ -1458,6 +1459,16 @@
                    $args = $ex;
                }
 
+               elsif ($var eq "only_hosts")
+               {
+                   my $on = {};
+                   foreach my $h (split (/\s+/, $args))
+                   {
+                       $on->{$h} = 1;
+                   }
+                   $args = $on;
+               }
+
                elsif ($var eq "exclude_period" && inPeriod (time, $args) == -1)
                {
                    close (CFG);
@@ -2620,6 +2631,10 @@
            join (" ", keys %{$sref->{exclude_hosts}}) . "'"
        if (keys %{$sref->{"exclude_hosts"}});
 
+    $buf .= " only_hosts='" .
+           join (" ", keys %{$sref->{only_hosts}}) . "'"
+       if (keys %{$sref->{"only_hosts"}});
+
     $buf .= " randskew=$sref->{randskew}"
        if ($sref->{"randskew"});
 
@@ -2978,7 +2993,8 @@
 #
 sub run_monitor {
     my ($group, $service) = @_;
-    my (@args, @groupargs, $pid, @ghosts, $monitor, $monitorargs);
+    my (@args, @groupargs, $pid, @ghosts, $monitor, $monitorargs,
+       @thosts, %seen, $on);
 
     my $sref = \%{$watch{$group}->{$service}};
 
@@ -3008,23 +3024,44 @@
     # exclude disabled hosts
     #
     } else {
-       @ghosts = grep (!/^\*/, @{$groups{$group}});
 
-       #
-       # per-service excludes
-       #
-       if (keys %{$sref->{"exclude_hosts"}})
+       if (keys %{$sref->{"only_hosts"}})
        {
            my @g = ();
 
-           for (my $i=0; $i<@ghosts; $i++)
+           foreach $on (keys %{$sref->{"only_hosts"}})
            {
-               push (@g, $ghosts[$i])
-                   if !$sref->{"exclude_hosts"}->{$ghosts[$i]};
+               push (@g, $on);
            }
 
-           @ghosts = @g;
+           @thosts = @g;
+
+       } else {
+
+           @thosts = grep (!/^\*/, @{$groups{$group}});
+
+           #
+           # per-service excludes
+           #
+           if (keys %{$sref->{"exclude_hosts"}})
+           {
+               my @g = ();
+
+               for (my $i=0; $i<@thosts; $i++)
+               {
+                   push (@g, $thosts[$i])
+                       if !$sref->{"exclude_hosts"}->{$thosts[$i]};
+               }
+
+               @thosts = @g;
+           }
        }
+
+       #
+       # get rid of duplicate hosts
+       #
+       %seen = ();
+       @ghosts = grep { ! $seen{$_}++ } @thosts;
 
        @args = (quotewords ('\s+', 0, $monitor), @ghosts);
     }
--- doc/mon.8_ORIG      Tue Oct 16 09:28:41 2001
+++ doc/mon.8   Tue Oct 16 09:41:08 2001
@@ -991,6 +991,12 @@
 will be excluded from the service check.
 
 .TP
+.BI only_hosts " host [host...]"
+Only hosts listed after
+.B only_hosts
+will be included in the service check.
+
+.TP
 .BI exclude_period " periodspec"
 Do not run a scheduled monitor during the time
 identified by
--- etc/example.m4_ORIG Tue Oct 16 09:28:41 2001
+++ etc/example.m4      Tue Oct 16 09:40:05 2001
@@ -38,6 +38,14 @@
 define(_RAS_EMAIL_,       `bob')dnl           # bob is the remote access admin
 dnl
 dnl #
+dnl # Webserver definitions
+dnl #
+dnl
+define(_NORMAL_WEBSERVERS_,    `fubar.com sales.com')dnl
+define(_SECURE_WEBSERVERS_,    `sales.com')dnl
+define(_WEBSERVERS_,           _SECURE_WEBSERVERS_ _NORMAL_WEBSERVERS_)dnl
+dnl
+dnl #
 dnl # -------------------------actual config begins here-------------------------
 dnl #
 #
@@ -87,7 +95,7 @@
 
 hostgroup netapps f330 f540
 
-hostgroup wwwservers www
+hostgroup wwwservers _WEBSERVERS_
 
 hostgroup printers hp5si hp5c hp750c
 
@@ -200,9 +208,19 @@
        interval 4m
        monitor http.monitor
        allow_empty_group
+       only_hosts _NORMAL_WEBSERVERS_
        period _ANYTIME_
            alert qpage.alert _MIS_PAGER_
            upalert mail.alert -S "web server is back up" _MIS_EMAIL_
+           alertevery 45m
+    service https
+       interval 4m
+       monitor tcp.monitor -p 443
+       allow_empty_group
+       only_hosts _SECURE_WEBSERVERS_
+       period _ANYTIME_
+           alert qpage.alert _MIS_PAGER_
+           upalert mail.alert -S "secure web server is back up" _MIS_EMAIL_
            alertevery 45m
     service telnet
        monitor telnet.monitor
--- mon_ORIG    Tue Oct 16 09:01:53 2001
+++ mon Tue Oct 16 09:27:06 2001
@@ -754,8 +754,8 @@
 #
 sub read_cf {
     my ($CF, $commit) = @_;
-    my ($var, $watchgroup, $ingroup, $curgroup, $inwatch,
-       $args, $hosts, %disabled, $h, $i,
+    my ($var, $watchgroup, $ingroup, $inhost, $curgroup, $curhost, $inwatch,
+       $args, $hosts, $groups, %disabled, $h, $g, $i,
        $inalias, $curalias);
     my ($sref, $pref);
     my ($service, $period);
@@ -1002,11 +1002,13 @@
        if ($l eq "")
        {
            $ingroup    = 0;
+           $inhost     = 0;
            $inalias    = 0;
            $inwatch    = 0;
            $period     = 0;
 
            $curgroup   = "";
+           $curhost    = "";
            $curalias   = "";
            $watchgroup = "";
 
@@ -1015,6 +1017,46 @@
        }
 
        #
+       # hostname record
+       #
+
+       if ($l =~ /^hostname\s+([a-zA-Z0-9_.-]+)\s*(.*)/)
+       {
+           $curhost = $1;
+
+           $inhost  = 1;
+           $inalias = 0;
+           $ingroup = 0;
+           $inwatch = 0;
+           $period  = 0;
+
+           $groups = $2;
+
+           foreach $g (split(/\s+/, $groups))
+           {
+               if (! grep(/^\*?$curhost$/, @{$groups{$g}}))
+               {
+                   push(@{$new_groups{$g}}, $curhost);
+               }
+           }
+
+           next;
+       }
+
+       if ($inhost)
+       {
+           foreach $g (split(/\s+/, $l))
+           {
+               if (! grep(/^\*?$curhost$/, @{$groups{$g}}))
+               {
+                   push(@{$new_groups{$g}}, $curhost);
+               }
+           }
+
+           next;
+       }
+
+       #
        # hostgroup record
        #
        if ($l =~ /^hostgroup\s+([a-zA-Z0-9_.-]+)\s*(.*)/)
@@ -1023,6 +1065,7 @@
 
            $ingroup = 1;
            $inalias = 0;
+           $inhost  = 0;
            $inwatch = 0;
            $period  = 0;
 
@@ -1073,6 +1116,7 @@
        {
            $inalias = 1;
            $ingroup = 0;
+           $inhost  = 0;
            $inwatch = 0;
            $period  = 0;
 
@@ -1098,6 +1142,7 @@
            $inwatch = 1;
            $inalias = 0;
            $ingroup = 0;
+           $inhost  = 0;
            $period  = 0;
 
            if (!defined ($new_groups{$watchgroup}))
@@ -1115,6 +1160,7 @@
            }
 
            $curgroup   = "";
+           $curhost    = "";
            $service = "";
 
            next;
--- doc/mon.8_ORIG      Tue Oct 16 09:01:53 2001
+++ doc/mon.8   Tue Oct 16 10:16:09 2001
@@ -507,6 +507,7 @@
 
 .SH CONFIGURATION FILE
 The configuration file consists of zero or more hostgroup definitions,
+zero or more hostname definitions,
 and one or more watch definitions. Each watch definition may have one
 or more service definitions. A line beginning with optional
 leading whitespace and a pound ("#") is
@@ -860,6 +861,30 @@
        nfsserver httpserver smbserver
 
 hostgroup router_group cisco7000 agsplus
+.fi
+.RE
+
+.SS "Hostname Entries"
+
+Hostname entries begin with the keyword
+.BR hostname ,
+and are followed by a hostname (or IP address) and one or more hostgroups,
+separated by whitespace. The hostgroups must
+be composed of alphanumeric
+characters, a dash ("-"), a period ("."),
+or an underscore ("_"). Non-blank lines following
+the first hostname line are interpreted as more hostgroups.
+The hostname definition ends with a blank line. NOTE:
+.BR hostname
+entries MUST follow the
+.BR hostgroup
+entries, otherwise they will be lost.  For example:
+
+.RS
+.nf
+hostname powerfulclient server client
+
+hostname wimpyclient client
 .fi
 .RE
 
--- etc/example.m4_ORIG Tue Oct 16 09:01:53 2001
+++ etc/example.m4      Tue Oct 16 09:25:48 2001
@@ -68,8 +68,10 @@
 authtype = getpwnam
 
 #
-# NB:  hostgroup and watch entries are terminated with a blank line (or
-# end of file).  Don't forget the blank lines between them or you lose.
+# NB:  hostgroup, hostname and watch entries are terminated with a blank line
+# (or end of file).  Don't forget the blank lines between them or you lose.
+# Also note that hostname entries MUST FOLLOW the hostgroup definitions,
+# placing them before will cause them to get lost.
 #
 
 #
@@ -83,7 +85,7 @@
 
 hostgroup hubs cisco316t hp800t ssii10
 
-hostgroup workstations blue yellow red green cornflower violet
+hostgroup workstations blue yellow red cornflower violet
 
 hostgroup netapps f330 f540
 
@@ -94,6 +96,11 @@
 hostgroup new nntp
 
 hostgroup ftp ftp
+
+#
+# hostname definitions (hostnames or IP addresses)
+#
+hostname green workstations wwwservers
 
 #
 # For the servers in building 1, monitor ping and telnet
--- etc/example.cf_ORIG Tue Oct 16 09:01:53 2001
+++ etc/example.cf      Tue Oct 16 09:26:15 2001
@@ -27,8 +27,10 @@
 authtype = getpwnam
 
 #
-# NB:  hostgroup and watch entries are terminated with a blank line (or
-# end of file).  Don't forget the blank lines between them or you lose.
+# NB:  hostgroup, hostname and watch entries are terminated with a blank line
+# (or end of file).  Don't forget the blank lines between them or you lose.
+# Also note that hostname entries MUST FOLLOW the hostgroup definitions,
+# placing them before will cause them to get lost.
 #
 
 #
@@ -42,7 +44,7 @@
 
 hostgroup hubs cisco316t hp800t ssii10
 
-hostgroup workstations blue yellow red green cornflower violet
+hostgroup workstations blue yellow red cornflower violet
 
 hostgroup netapps f330 f540
 
@@ -53,6 +55,11 @@
 hostgroup new nntp
 
 hostgroup ftp ftp
+
+#
+# hostname definitions (hostnames or IP addresses)
+#
+hostname green workstations wwwservers
 
 #
 # For the servers in building 1, monitor ping and telnet
