An interesting situation here. It seems that when 2 hard drives go
bad in a RAID 5 array, the array is still marked as active.
Setup:
Nine 9 GB disks in an IBM EXP10.
Kernel 2.2.13ac2
Latest raidtools. (raidtools-19990824-0.90)
While running 'mke2fs -R stride=1 -b4096 -s1 -c /dev/md0',
2 disks were shown as being bad, but the RAID 5 array was still shown
as active. Is this working as designed?
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdl1[8](F) sdk1[7] sdj1[6] sdi1[5] sdh1[4] sdg1[3]
sdf1[2](F) sde1[1] sdd1[0] 71070720 blocks level 5, 4k chunk,
algorithm 2 [9/7] [UU_UUUUU_]
unused devices: <none>
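For anyone who wants to check this kind of output mechanically, here is a minimal sketch that parses the "[9/7] [UU_UUUUU_]" tail of the md0 entry. The function names are made up for illustration, and the rule that RAID 5 tolerates exactly one failed member is general RAID background, not something read from the md driver.

```python
import re

def summarize_md_status(status_line):
    """Parse the '[total/working] [U_...]' tail of an mdstat entry."""
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status_line)
    if match is None:
        raise ValueError("no '[n/m] [U_...]' status found")
    total = int(match.group(1))
    working = int(match.group(2))
    flags = match.group(3)
    # '_' marks a slot with no working member; 'U' marks a slot that is up.
    failed_slots = [i for i, flag in enumerate(flags) if flag == "_"]
    return {"total": total, "working": working, "failed_slots": failed_slots}

def raid5_data_intact(summary):
    """Assumption: RAID 5 can lose at most one member without losing data."""
    return summary["total"] - summary["working"] <= 1

# Status tail copied from the md0 entry above.
status = "algorithm 2 [9/7] [UU_UUUUU_]"
summary = summarize_md_status(status)
print(summary)
print("RAID 5 data intact:", raid5_data_intact(summary))
```

In this output the two '_' positions line up with raid-disk 2 (sdf1) and raid-disk 8 (sdl1), the two members flagged (F).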
Here are the relevant messages from syslog:
Oct 28 13:46:44 redhat kernel: raid5: Disk failure on sdl1, disabling
device. Operation continuing on 8 devices
Oct 28 13:46:44 redhat kernel: md: recovery thread got woken up ...
Oct 28 13:46:44 redhat kernel: md0: no spare disk to reconstruct
array! -- continuing in degraded mode
Oct 28 13:46:44 redhat kernel: md: recovery thread finished ...
Oct 28 13:46:44 redhat kernel: md: updating md0 RAID superblock on
device
Oct 28 13:46:44 redhat kernel: (skipping faulty sdl1 )
and
Oct 28 13:52:22 redhat kernel: raid5: Disk failure on sdf1, disabling
device. Operation continuing on 7 devices
Oct 28 13:52:22 redhat kernel: raid5: md0: unrecoverable I/O error for
block 68154012
The last of those is the alarming one, as I thought RAID 5 could
survive only one disk failure and stay on-line, not two.
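To illustrate why a second failure should be fatal, here is a small sketch of XOR parity on a single stripe. It is only a toy model with made-up chunk contents, not the raid5 driver's code: with one chunk missing, XORing the survivors gives the missing chunk back; with two missing, it only gives the XOR of the two lost chunks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe of a 9-member array: 8 data chunks plus 1 parity chunk.
# The chunk contents are arbitrary illustration values.
data = [bytes([i] * 4) for i in range(1, 9)]
parity = xor_blocks(data)
stripe = data + [parity]

# One member lost: XOR of the 8 survivors reproduces the missing chunk.
lost = 2
survivors = [chunk for i, chunk in enumerate(stripe) if i != lost]
rebuilt = xor_blocks(survivors)
print("single failure recovered:", rebuilt == stripe[lost])

# Two members lost: XOR of the 7 survivors is only the XOR of the two
# missing chunks, so neither chunk can be recovered individually.
lost_pair = (2, 8)
survivors = [chunk for i, chunk in enumerate(stripe) if i not in lost_pair]
combined = xor_blocks(survivors)
print("equals XOR of both lost chunks:",
      combined == xor_blocks([stripe[2], stripe[8]]))
```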
The above 'cat /proc/mdstat' was run while the mke2fs was still
running (and still making progress).
So if the raid code only marks the array as bad on a subsequent
read/write after the second failure, that should already have
happened, since the mke2fs was still issuing I/O.
If you need any further information please let me know.
Shane Owenby
IBM Linux Technology Center
[EMAIL PROTECTED]
Sorry for the lengthy post, just trying to provide enough info to be
useful. :)
PS: Here is my /etc/raidtab, in case you need it:
raiddev /dev/md0
# General parameters
raid-level 5
nr-raid-disks 9
nr-spare-disks 0
chunk-size 4
parity-algorithm left-symmetric
# RAID disks
device /dev/sdd1
raid-disk 0
device /dev/sde1
raid-disk 1
device /dev/sdf1
raid-disk 2
device /dev/sdg1
raid-disk 3
device /dev/sdh1
raid-disk 4
device /dev/sdi1
raid-disk 5
device /dev/sdj1
raid-disk 6
device /dev/sdk1
raid-disk 7
device /dev/sdl1
raid-disk 8