Hi,
I'm not sure about NAS boxes, but HP RAID controllers store the array
configuration on the disks themselves, so you can take the 4 disks of an
array out, put them in another server with a different RAID controller,
and the array will come up OK.
So... if you have a backup of the data, have you tried taking the disks
out and putting them back in the same NAS box in different slots?
Perhaps the connector is faulty. See whether the problem follows the
disk or stays with the slot the disk was in.
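
The disk-versus-slot test boils down to a small piece of logic. A minimal
sketch in Python, assuming you note the slot number and the drive's serial
number for the misbehaving drive before and after moving the disks (the
function name and the example serials are made up for illustration):

```python
def fault_follows(before, after):
    """Say whether a fault moved with the disk or stayed with the slot.

    before/after are (slot, disk_serial) pairs identifying where the
    failing drive was seen before and after the disks were swapped.
    Both values are assumptions you record yourself, e.g. from the NAS
    admin page or the drive label.
    """
    slot_before, serial_before = before
    slot_after, serial_after = after
    same_slot = slot_before == slot_after
    same_disk = serial_before == serial_after
    if same_disk and not same_slot:
        return "disk"          # same drive fails in a new slot
    if same_slot and not same_disk:
        return "slot"          # a different drive fails in the old slot
    return "inconclusive"      # nothing moved, or everything changed


if __name__ == "__main__":
    # Hypothetical example: the drive with serial "SER-A" was failing in
    # slot 2, and after the swap the failure shows up in slot 4.
    print(fault_follows((2, "SER-A"), (4, "SER-A")))
```

If the answer is "slot", the connector or backplane is the suspect rather
than the drive.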
Thanks,
Ben Donohue
donoh...@icafe.com.au
On 14/11/2010 12:57 PM, Voytek Eymont wrote:
On Sat, November 13, 2010 5:52 pm, David Balnaves wrote:
I'm not really sure what the best indicators of a failing hard drive are.
I've used SMART on a lot of hard drives; I've seen undocumented SMART
values, and even hard drives that function fine for a number of years when
SMART reports they are "FAILING NOW". I've also seen some drives enter a
state where they won't allow further SMART tests (online/offline) to be run
or aborted. This has led me to believe that SMART as an indicator needs to
be considered on a per-model basis and run carefully within the
capabilities of the drive. The whole process has given me more questions
than answers.
I try to detect a failure by monitoring large changes in the SMART
attributes. I've configured munin to monitor the SMART attributes; it
wouldn't be too hard to change the plugin to monitor these values on your
NAS (I imagine you can ssh/telnet to it). You will notice some variance
in things like temperature and ECC, but unless they start behaving
erratically I wouldn't worry.
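
As a rough sketch of that kind of check (this is not the munin plugin, just
an illustration of the idea): the Python below parses two snapshots of
`smartctl -A` style attribute tables and reports attributes whose raw value
jumped. The threshold `max_delta`, the function names and the sample values
are all invented for the example, and a sensible threshold will differ per
attribute and per drive model.

```python
def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # this also skips the "ID# ATTRIBUTE_NAME ..." header row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            attrs[fields[1]] = int(fields[9])
        except ValueError:
            continue  # skip raw values that are not plain integers
    return attrs


def large_changes(previous, current, max_delta=10):
    """List (name, old, new) where the raw value moved by more than max_delta.

    max_delta is an arbitrary illustrative threshold, not a recommendation.
    """
    changed = []
    for name, new in sorted(current.items()):
        old = previous.get(name)
        if old is not None and abs(new - old) > max_delta:
            changed.append((name, old, new))
    return changed


# Invented sample snapshots, laid out like `smartctl -A` output.
HEADER = "ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE\n"
YESTERDAY = HEADER + (
    "  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0\n"
    "194 Temperature_Celsius   0x0022 110 104 000 Old_age  Always - 37\n"
)
TODAY = HEADER + (
    "  5 Reallocated_Sector_Ct 0x0033 199 199 140 Pre-fail Always - 48\n"
    "194 Temperature_Celsius   0x0022 108 104 000 Old_age  Always - 39\n"
)

if __name__ == "__main__":
    before = parse_smart_attributes(YESTERDAY)
    after = parse_smart_attributes(TODAY)
    for name, old, new in large_changes(before, after):
        print(f"{name}: {old} -> {new}")
```

On the NAS you would feed it the saved output of `smartctl -A` for each
disk from two different days instead of the sample strings.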
Hope this helps in 'detecting and notifying' potential failures.
David, thanks
yes, I can ssh to it
I'm not very familiar with the RAID utilities (beyond knowing what the
acronym stands for...)
but I get:
# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Sat Jun 19 04:35:02 2010
Raid Level : raid0
Array Size : 3900774400 (3720.07 GiB 3994.39 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Jun 19 04:35:02 2010
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Chunk Size : 64K
UUID : 79e23cd2:b3f9618d:58a8936b:5e0d814b
Events : 0.1
    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3
# mount
/proc on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
sysfs on /sys type sysfs (rw)
tmpfs on /tmp type tmpfs (rw,size=32M)
none on /proc/bus/usb type usbfs (rw)
/dev/sda4 on /mnt/ext type ext3 (rw)
/dev/md9 on /mnt/HDA_ROOT type ext3 (rw)
/dev/md0 on /share/MD0_DATA type ext4 (rw,usrjquota=aquota.user,jqfmt=vfsv0,user_xattr,data=ordered,nodelalloc)
# ls /share/MD0_DATA
ls: /share/MD0_DATA/Web: Input/output error
ls: /share/MD0_DATA/Network Recycle Bin: Input/output error
ls: /share/MD0_DATA/lost+found: Input/output error
ls: /share/MD0_DATA/Download: Input/output error
ls: /share/MD0_DATA/aquota.user: Input/output error
ls: /share/MD0_DATA/Multimedia: Input/output error
ls: /share/MD0_DATA/Usb: Input/output error
ls: /share/MD0_DATA/Recordings: Input/output error
ls: /share/MD0_DATA/Public: Input/output error
cameras/
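
For anyone reading the `mdadm --detail` output above without knowing the
tool, here is a small Python sketch that pulls out the "Key : Value" lines
and flags the obvious things to look at. The function names and warning
texts are made up for the example, and the sample is an excerpt of the
output pasted in this message. Note that raid0 stripes data with no
redundancy, so mdadm reporting "clean" with 0 failed devices does not rule
out a member disk returning read errors.

```python
def parse_mdadm_detail(text):
    """Return a dict of the 'Key : Value' lines from `mdadm --detail` output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # lines without ' : ' (device rows, the /dev/md0: title) are skipped
            fields[key.strip()] = value.strip()
    return fields


def array_warnings(fields):
    """Return a list of human-readable warnings (wording is illustrative only)."""
    warnings = []
    if "degraded" in fields.get("State", ""):
        warnings.append("array is degraded")
    if int(fields.get("Failed Devices", "0")) > 0:
        warnings.append("array reports failed devices")
    if fields.get("Raid Level") == "raid0":
        warnings.append("raid0 has no redundancy")
    return warnings


# Excerpt of the output pasted in this message.
SAMPLE = """/dev/md0:
Raid Level : raid0
Raid Devices : 4
State : clean
Active Devices : 4
Failed Devices : 0
UUID : 79e23cd2:b3f9618d:58a8936b:5e0d814b
0 8 3 0 active sync /dev/sda3
"""

if __name__ == "__main__":
    for warning in array_warnings(parse_mdadm_detail(SAMPLE)):
        print("warning:", warning)
```

With this array the md layer looks healthy, so the Input/output errors are
more likely to show up in the kernel log or in the per-disk SMART data than
in mdadm.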
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html