On Sat, November 13, 2010 5:52 pm, David Balnaves wrote:

> I'm not really sure what the best indicators of a failing hard drive are.
> I've used SMART on a lot of hard drives; I've seen undocumented SMART
> values, and even hard drives that function fine for a number of years
> while SMART reports they are "FAILING NOW". I've also seen some drives
> enter a state where they won't allow further SMART tests (on/offline) to
> be run or aborted. This has led me to believe that SMART as an indicator
> needs to be considered on a per-model basis and run carefully within the
> capabilities of the drive. The whole process has given me more questions
> than answers.
>
> I try to detect a failure by monitoring huge changes in the SMART
> attributes. I've configured munin to monitor the SMART attributes; it
> wouldn't be too hard to change the plugin to monitor these values on your
> NAS (I imagine you can ssh/telnet to it). You will notice some variance
> in things like temperature and ECC, but unless they start behaving
> erratically I wouldn't worry.
>
> Hope this helps in 'detecting and notifying' potential failures.
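
The idea of watching for large jumps in SMART attributes could be sketched
roughly as below. This is only an illustration, not the munin plugin
mentioned above: the function names, the threshold, and the two sample
tables are made up for the example, and the parsing assumes the usual
ten-column attribute table printed by `smartctl -A`.

```python
def parse_attributes(smartctl_output):
    """Map attribute name -> raw value from `smartctl -A` style text."""
    attrs = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            attrs[fields[1]] = int(fields[9])
        except ValueError:
            continue  # raw value is not a plain integer; ignored in this sketch
    return attrs


def large_changes(previous, current, threshold):
    """Return {name: (old, new)} for raw values that moved by more than threshold."""
    changed = {}
    for name, value in current.items():
        if name in previous and abs(value - previous[name]) > threshold:
            changed[name] = (previous[name], value)
    return changed


# Illustrative sample data only; these are NOT readings from the NAS in this thread.
HEADER = "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
SAMPLE_BEFORE = HEADER + (
    "  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0\n"
    "194 Temperature_Celsius     0x0022   117   100   000    Old_age   Always       -       33\n"
)
SAMPLE_AFTER = HEADER + (
    "  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       48\n"
    "194 Temperature_Celsius     0x0022   115   100   000    Old_age   Always       -       35\n"
)

before = parse_attributes(SAMPLE_BEFORE)
after = parse_attributes(SAMPLE_AFTER)
print(large_changes(before, after, threshold=10))
```

A single threshold for every attribute is a simplification; given the
per-model caveat above, a real check would probably want per-attribute
limits.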

David, thanks

yes, I can ssh to it

I'm not very familiar with the RAID utilities (beyond knowing what the
acronym stands for...)

but I get:

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Sat Jun 19 04:35:02 2010
     Raid Level : raid0
     Array Size : 3900774400 (3720.07 GiB 3994.39 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Jun 19 04:35:02 2010
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 64K

           UUID : 79e23cd2:b3f9618d:58a8936b:5e0d814b
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3
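
For the 'detecting and notifying' side, the `mdadm --detail` output above
could be checked by a small script along these lines. This is a sketch
under assumptions: the function names and the warning wording are invented
here, the sample is an abridged copy of the output above, and treating any
state other than plain "clean" or "active" as suspicious is a guess at a
sensible rule rather than anything mdadm documents.

```python
# Abridged copy of the `mdadm --detail /dev/md0` output shown above.
SAMPLE = """\
/dev/md0:
        Version : 00.90.03
     Raid Level : raid0
   Raid Devices : 4
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
"""


def parse_mdadm_detail(text):
    """Collect the 'Key : value' lines of `mdadm --detail` into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first " : " only, so values containing colons survive.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def array_warnings(fields):
    """Return human-readable warnings (hypothetical wording) for an array."""
    warnings = []
    state = fields.get("State", "unknown")
    if state not in ("clean", "active"):
        warnings.append("array state is '%s'" % state)
    if int(fields.get("Failed Devices", "0")) > 0:
        warnings.append("%s failed device(s)" % fields["Failed Devices"])
    if fields.get("Raid Level") == "raid0":
        # RAID 0 stripes without parity or mirroring.
        warnings.append("raid0 has no redundancy")
    return warnings


print(array_warnings(parse_mdadm_detail(SAMPLE)))
```

Note that a check like this only sees what the md layer reports; a "clean"
state here does not by itself say anything about the filesystem on top.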


# mount
/proc on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
sysfs on /sys type sysfs (rw)
tmpfs on /tmp type tmpfs (rw,size=32M)
none on /proc/bus/usb type usbfs (rw)
/dev/sda4 on /mnt/ext type ext3 (rw)
/dev/md9 on /mnt/HDA_ROOT type ext3 (rw)
/dev/md0 on /share/MD0_DATA type ext4 (rw,usrjquota=aquota.user,jqfmt=vfsv0,user_xattr,data=ordered,nodelalloc)

# ls  /share/MD0_DATA
ls: /share/MD0_DATA/Web: Input/output error
ls: /share/MD0_DATA/Network Recycle Bin: Input/output error
ls: /share/MD0_DATA/lost+found: Input/output error
ls: /share/MD0_DATA/Download: Input/output error
ls: /share/MD0_DATA/aquota.user: Input/output error
ls: /share/MD0_DATA/Multimedia: Input/output error
ls: /share/MD0_DATA/Usb: Input/output error
ls: /share/MD0_DATA/Recordings: Input/output error
ls: /share/MD0_DATA/Public: Input/output error
cameras/
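
The per-entry errors from `ls` above could also be picked up
automatically. A minimal sketch, assuming Python is available on the box
(it may not be on a NAS): `probe_entries` is a made-up helper name, and the
`lstat` parameter exists only so the check can be exercised without a
failing disk.

```python
import errno
import os


def probe_entries(directory, lstat=os.lstat):
    """Try to stat every entry in `directory`; return (path, errno name) for failures."""
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            lstat(path)
        except OSError as exc:
            # errno.EIO is what the shell prints as "Input/output error".
            code = errno.errorcode.get(exc.errno, str(exc.errno))
            failures.append((path, code))
    return failures


def always_eio(path):
    """Stand-in for os.lstat that simulates a failing disk (illustration only)."""
    raise OSError(errno.EIO, "Input/output error", path)


# Simulated run against the current directory; no real disk errors are involved.
for path, code in probe_entries(".", lstat=always_eio):
    print("%s: %s" % (path, code))
```

On the NAS itself the call would be something like
`probe_entries("/share/MD0_DATA")`, with any non-empty result treated as a
reason to send a notification.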



-- 
Voytek

-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
