Hello all, I was wondering if anybody here could test this NetApp script
for me.  Basically you need a (test) NetApp that you can afford to yank
a drive out of, bringing the array to a degraded state.  Then you can
put the drive back in and let the array start rebuilding.  If you
configure an alert and an upalert in mon, you should get a notice
(whatever type you use) when it detects a degraded array and again when
it detects that reconstruction has finished.  However, it won't give a
notice when the status changes from "degraded" to "reconstruct", because
the script reports both as the same failure.

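To illustrate why the degraded-to-reconstruct transition stays silent,
here is a minimal sketch of the status-to-exit-code mapping.  The helper
name raid_exit_code and the sample status strings are made up for
illustration; only the /degraded|reconstruct/ match comes from the
script itself.

```perl
use strict;
use warnings;

# Hypothetical helper: map a list of volStatus strings to an exit code.
# 0 = everything fine, 1 = at least one array is degraded or rebuilding.
sub raid_exit_code {
    my (@statuses) = @_;
    my $bad = grep { /degraded|reconstruct/ } @statuses;
    return $bad ? 1 : 0;
}

# The status strings below are illustrative, not captured from a filer.
for my $status ("raid4", "raid4, degraded", "raid4, reconstruct") {
    printf "%-20s -> exit %d\n", $status, raid_exit_code($status);
}
```

Since mon only sees the exit code, both failure states look identical to
it, and it only reacts when the code goes from 0 to non-zero (alert) or
back to 0 (upalert).
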
This is basically the netappfree.monitor script converted to use
volTable items instead of dfTable items.  I also added 'use strict' to
the new script and a --version option (my NetApps want SNMP version 1,
but my net-snmp libs default to version 3).
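
As a rough sketch of how the --version option feeds into the SNMP
session, the snippet below builds the parameter list that would be
handed to SNMP::Session, without actually opening a session.  The helper
name session_args is hypothetical; the defaults mirror the ones in the
script.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: turn command-line options into the named
# parameters for an SNMP session.  It does not talk to the network.
sub session_args {
    my ($host, @argv) = @_;
    my %opt;
    GetOptionsFromArray(\@argv, \%opt,
        "community=s", "timeout=i", "retries=i", "version=s")
        or die "bad options\n";
    return (
        DestHost  => $host,
        Community => $opt{community} || "public",
        Timeout   => ($opt{timeout} || 2) * 1_000_000,  # microseconds
        Retries   => $opt{retries} || 5,
        Version   => $opt{version} || 1,                # force v1 by default
    );
}

my %args = session_args("filer1", "--version", "2c");
print "$_ => $args{$_}\n" for sort keys %args;
```

Defaulting Version to 1 in the script means the net-snmp library default
never comes into play unless you pass --version yourself.
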

Here's a sample output:
admin51 alert.d # perl ../mon.d/netappraidstat.monitor --list filer1 filer2
filer            ONTAP       Volume Name  Vol State            Vol Status
-------------------------------------------------------------------------
filer1           6.5         vol0            online                 raid4
filer2           6.5         vol0            online                 raid4

It is seemingly working fine (indicates all ok) on our production system
here, but I don't have a test system that I can yank a drive out of to
test.  Can anybody here do so?  I would be most appreciative if someone
could verify that it detects "degraded" and "reconstruct" status and
triggers the defined alert.

I've considered adding a feature to check that a (any) drive has failed
and that it has used the spare.  But during the reconstruct phase, you
should get an alert anyway, so that seems to be overkill.  It also seems
like it should check for raid_dp conditions, but we don't have raid_dp
set on any machines, so can't check that yet either (we plan to do that
upgrade soon.)

The netappraidstat.monitor is quoted inline in this email instead of
attached via MIME (I recall that this ML does mime stripping by
default).




#!/usr/bin/perl
#
# Use SNMP to get raid status from a Network Appliance
# exits with value of 1 if an array has status "degraded" or 
# "reconstruct", or exits with the value of 2 if there is a 
# "soft" error (SNMP library error, or could not get a
# response from the server).
#
# This requires the UCD SNMP library and G.S. Marzot's Perl SNMP
# module.
#
# Borrowed heavily from framework of netappfree.monitor.
# Originally by Jim Trocki.  Modified by Theo Van Dinter
# ([EMAIL PROTECTED], [EMAIL PROTECTED]) to add verbose error output,
# more error checking, etc.  Can be used in conjunction with
# snapdelete.alert to auto-remove snapshots if needed.
#
# $Id:$
#
#    Copyright (C) 1998, Jim Trocki
#    Copyright (C) 1999-2001, Theo Van Dinter
#    Copyright (C) 2005, Todd Lyons
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation; either version 2 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program; if not, write to the Free Software
#    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#
use strict;
use SNMP;
use Getopt::Long;

sub list;
sub readcf;

$ENV{"MIBS"} = 'RFC1213-MIB:NETWORK-APPLIANCE-MIB';

my %opt;
GetOptions (\%opt, "community=s", "timeout=i", "retries=i", "config=s", 
                   "version=s", "list");

die "no host arguments\n" if (@ARGV == 0);

my $RET = 0;
my @ERRS = ();
my %HOSTS = ();
my %ARRAY = ();       # disk array names
my ($s, $v);          # handles to snmp objects
my ($listhost, $ver); # hostname and version of ONTAP

my $COMM = $opt{"community"} || "public";
my $TIMEOUT = $opt{"timeout"} * 1000 * 1000 || 2000000;
my $RETRIES = $opt{"retries"} || 5;
# Reading the config file is very liberal: it reads the first field on
# each line and ignores the rest.  This allows you to symlink to an
# existing netappfree.cf (no need to keep separate config files).
my $CONFIG = $opt{"config"} || (-d "/etc/mon" ? "/etc/mon" : "/usr/lib/mon/etc")
             . "/netappraidstat.cf";
my $VERSION = $opt{"version"} || 1;

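# Column positions within the VarList.  Assigned before list() runs so
# that --list mode does not rely on undef being treated as index 0.
my ($volIndex, $volName, $volState, $volStatus) = (0..3);

list (@ARGV) if ($opt{"list"});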

readcf ($CONFIG) || die "could not read config: $!\n";

foreach my $host (@ARGV) {
    next if (!defined $ARRAY{$host});

    if (!defined($s = new SNMP::Session (DestHost => $host,
                Timeout => $TIMEOUT, Community => $COMM,
                Retries => $RETRIES, Version => $VERSION))) {
        $RET = ($RET == 1) ? 1 : 2;
        $HOSTS{$host} ++;
        push (@ERRS, "could not create session to $host: " .
                     $SNMP::Session::ErrorStr);
        next;
    }

    $v = new SNMP::VarList (
            ['volIndex'],
            ['volName'],
            ['volState'],
            ['volStatus'],
    );

    if ( $v->[$volIndex]->tag !~ /^vol/ ) {
        push (@ERRS, "OIDs not mapping correctly!  Check that NetApp MIB is " .
                     "available!");
        $RET = 1;
        last;
    }

    while (defined $s->getnext($v)) {

        last if ($v->[$volIndex]->tag !~ /volIndex/);

        if ($v->[$volStatus]->val =~ /degraded|reconstruct/ ) {
             $HOSTS{$host}++;
             push (@ERRS, sprintf ("%s is %s, status: '%s'",
                   $host, $v->[$volState]->val, $v->[$volStatus]->val)
                  );
             $RET = 1;
        }
    }

    if ($s->{ErrorNum}) {
        $HOSTS{$host} ++;
        push (@ERRS, "could not get volIndex for $host: " . $s->{ErrorStr});
        $RET = ($RET == 1) ? 1 : 2;
    }
}


if ($RET) {
    print join(" ", sort keys %HOSTS), "\n\n", join("\n", @ERRS), "\n";
}

exit $RET;


#
# read configuration file
#
sub readcf {
    my ($f) = @_;
    my ($l, $host, $dummy);

    open (CF, $f) || return undef;
    while (<CF>) {
        next if (/^\s*#/ || /^\s*$/);
        chomp;
        ($host, $dummy) = split;
        if (!defined ($ARRAY{$host} = $host)) {
            die "error, cannot extract hostname, config $f, line $.\n";
        }
    }
    close (CF);
}

#
# Don't use config, instead just dump all data returned from netapp
#
sub list {
    my (@hosts) = @_;

    foreach my $host (@hosts) {
        if (!defined($s = new SNMP::Session (DestHost => $host,
                    Timeout => $TIMEOUT,
                    Community => $COMM,
                    Retries => $RETRIES,
                    Version => $VERSION))) {
            print STDERR "could not create session to $host: " . 
                  $SNMP::Session::ErrorStr, "\n";
            next;
        }

        $listhost = $host;         # Handles global scope in --list mode
        $ver = $s->get(['sysDescr', 0]);
        $ver =~ s/^netapp.*release\s*([^:]+):.*$/$1/i;

        $v = new SNMP::VarList (
                ['volIndex'],
                ['volName'],
                ['volState'],
                ['volStatus'],
        );

        while (defined $s->getnext($v)) {
            last if ($v->[$volIndex]->tag !~ /volIndex/);
            write;
        }
    }
    exit 0;
}
format STDOUT_TOP =
filer            ONTAP       Volume Name  Vol State            Vol Status
-------------------------------------------------------------------------
.

format STDOUT =
@<<<<<<<<<<<<<<  @<<<<<<<<<  @<<<<<<<<  @>>>>>>>>>>  @>>>>>>>>>>>>>>>>>>>
$listhost, $ver, $v->[1]->[2], $v->[2]->[2], $v->[3]->[2]
.


-- 
Regards...              Todd
OS X: We've been fighting the "It's a mac" syndrome with upper management
for  years  now.  Lately  we've  taken  to  just  referring  to  new  mac 
installations  as  "Unix"  installations  when  presenting proposals  and 
updates.  For some reason, they have no problem with that.          -- /.
Linux kernel 2.6.12-12mdksmp   2 users,  load average: 0.02, 0.05, 0.07

_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
