Andy.
If there is a problem, the output will look like this. The notification only lists the devices that are in a down state.

***** Nagios *****

Notification Type: PROBLEM

Service: DiskDrives
Host: the.name.of.host
Address: the.name.of.host
State: CRITICAL

Date/Time: Thu Aug 17 09:55:02 PDT 2006

Documentation: https://where.the.docs.be

Additional Info:

DOWN=(/dev/sdg)

I believe this plugin can only detect when a drive is already down and won't do much to predict when a failure is about to happen.

Hope this helps.
Deet.

> Hi Deet,
>
> Thanks very much for this script, had to do a minor touch of hacking,
> but it also proves your script will work on SATA drives as well (at
> least those SATA drives that Linux emulates as SCSI.)
>
> All I've touched is:
> my $scsi_disks = `/usr/bin/sudo /sbin/sfdisk -s |/bin/grep -i sd[a-z] |/bin/cut -f1 -d:`;
>
> /usr/bin/grep and /usr/bin/cut are in /bin/grep and /bin/cut on my
> system (Fedora 5.)
>
> $val = `/usr/bin/sudo /usr/sbin/smartctl -d ata -s on $drive &> /dev/null || /bin/echo MISSING`;
>
> In the above line I had to add the "-d ata" argument to smartctl to
> read the SATA drives as ATA drives, not SCSIs.
>
> The script outputs "UP=(/dev/sda /dev/sdb)".
>
> Can I just ask what the criteria are for the script to class a drive as
> failed/failing according to SMART?
>
> Many thanks again for sharing, it's extremely helpful!
>
> Regards
>
> Andy.
>
> PS. I couldn't reply to the list as I've got a problem with my DNS
> server, and Sourceforge's server is bouncing any mail I send :( If
> you could post what I've done to get SATA drives working, it may come
> in handy for somebody too.
>
> ---
>
> Derek Olsen wrote:
>>
>> Andy.
>> I've attached the check_smart we use. I think it's a barely modified
>> version of the one that comes with the nagios plugins. In the
>> script we use the output of /sbin/sfdisk -s to find out which scsi
>> disks are on the local box because we ran into problems using the
>> output of scsiinfo. So our sudoers file is configured to allow the
>> nagios user to run /sbin/sfdisk -s and /usr/sbin/smartctl.
>>
>> This works for us. Hope it helps.
>> Deet.
>>> Has anyone got a check plugin working for monitoring SMART hard disk
>>> status thresholds?
>>>
>>> The only one I found on nagiosexchange (check_smartmon) needs to be
>>> run as root to get permission to read the drive stats, and also
>>> doesn't work - it causes the below Python trace-back:
>>>
>>> Traceback (most recent call last):
>>>   File "./check_smartmon", line 254, in ?
>>>     (healthStatus, temperature) = parseOutput(healthStatusOutput, temperatureOutput)
>>>   File "./check_smartmon", line 163, in parseOutput
>>>     healthStatus = parts[-1]
>>> IndexError: list index out of range
>>>
>>> I've just run smartctl and it appears you do need to be root, so if
>>> I can find a working plugin I can just sudo the nagios user.
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>> Andy.
>>>
>>> _______________________________________________
>>> Nagios-users mailing list
>>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>>> ::: Please include Nagios version, plugin version (-v) and OS when
>>> reporting any issue. ::: Messages without supporting info will risk
>>> being sent to /dev/null
>>>
>> ------------------------------------------------------------------------
>>
>> #!/usr/bin/perl -w
>>
>> #
>> # This script checks the hard drives on a system for S.M.A.R.T. health
>> # indicators. Only supports SCSI right now.
>> #
>> #
>> use strict;
>>
>> my $debug = 0;
>> my @disk_up;
>> my @disk_down;
>> my @disks;
>> my $scsi_disks = `/usr/bin/sudo /sbin/sfdisk -s |/usr/bin/grep -i sd[a-z] |/usr/bin/cut -f1 -d:`;
>>
>> push @disks, split(' ', $scsi_disks);
>>
>> unless ( scalar @disks ) {
>>     print "0 No disks to monitor\n";
>>     exit 0;
>> }
>>
>> print "Monitoring: @disks\n" if $debug;
>>
>> for ( @disks ) {
>>     my $drive = $_;
>>     if($drive =~ /\/dev\/sd/) {
>>         my $val;
>>
>>         $val = `/usr/bin/sudo /usr/sbin/smartctl -s on $drive &> /dev/null || /bin/echo MISSING`;
>>         if ( $val eq "MISSING\n" ) {
>>             push @disk_down, $drive;
>>             next;
>>         }
>>
>>         $val = `/usr/bin/sudo /usr/sbin/smartctl -H $drive`;
>>         if ( $val =~ /SMART Health Status\: OK/g ) {
>>             print "$_ is OK\n" if $debug;
>>             push @disk_up, $drive;
>>         } else {
>>             print "$_ is BAD\n" if $debug;
>>             push @disk_down, $drive;
>>         }
>>     }
>> }
>>
>> my $ret = 0; # OK
>> if ( scalar @disk_down ) {
>>     print "DOWN=(@disk_down)\n";
>>     exit 2;
>> }
>> print "UP=(@disk_up) " if ( scalar @disk_up );
>> print "DOWN=(@disk_down) " if ( scalar @disk_down );
>> print "\n";
>>
>> exit 0;
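
On the question of what counts as failed: going by the quoted script, a drive ends up in DOWN either when "smartctl -s on" fails for it, or when the "smartctl -H" output does not contain the line "SMART Health Status: OK". Anything else is UP. Below is a rough sketch of that check pulled out into a function so it can be tried without touching a real disk. The function name classify_health and the sample strings are made up for illustration. The second pattern is an assumption on my part: as far as I know, smartctl words its health line differently for drives addressed with "-d ata", so check it against what your smartctl actually prints.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify the text printed by `smartctl -H` for a single drive.
# Returns 'UP' if a known "healthy" line is found, 'DOWN' otherwise.
# (classify_health is a hypothetical helper, not part of the original plugin.)
sub classify_health {
    my ($output) = @_;
    return 'DOWN' unless defined $output;

    # SCSI-style report: the same string the original script matches.
    return 'UP' if $output =~ /SMART Health Status:\s*OK/;

    # ATA-style report. ASSUMPTION: this wording is what smartctl prints
    # for "-d ata" devices; verify against your own smartctl version.
    return 'UP'
        if $output =~ /overall-health self-assessment test result:\s*PASSED/;

    return 'DOWN';
}

# Hypothetical sample outputs, used only to exercise the function.
my %samples = (
    '/dev/sda' => "SMART Health Status: OK\n",
    '/dev/sdb' => "SMART overall-health self-assessment test result: PASSED\n",
    '/dev/sdg' => "SMART Health Status: FAILURE (hypothetical sample)\n",
);

my ( @up, @down );
for my $drive ( sort keys %samples ) {
    if ( classify_health( $samples{$drive} ) eq 'UP' ) {
        push @up, $drive;
    }
    else {
        push @down, $drive;
    }
}

# Same summary format as the plugin's own output.
print "UP=(@up) DOWN=(@down)\n";
```

Note that this only reflects the drive's own overall health verdict, which is why the plugin reports a drive once it is already bad rather than warning ahead of time.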
