I have a 4-disk raid5 (sda3, sdb3, hda1, hdc1). sda and sdb share a
Silicon Image SATA card. sdb died completely; 20 minutes later,
the sata_sil driver became fatally confused and the machine locked up.
I shut down the machine and waited until I had a replacement for sdb.
I've got a replacement for sdb now, but I can't get the array to start
so that I can add it and resync. When I try to assemble the degraded
array, I get this:
# mdadm -Af /dev/md2 /dev/sda3 /dev/hda1 /dev/hdc1
mdadm: failed to RUN_ARRAY /dev/md2: Input/output error
# dmesg | tail -n 15
md: bind
md: bind
md: bind
md: md2: raid array is not clean -- starting background reconstruction
raid5: device sda3 operational as raid disk 0
raid5: device hdc1 operational as raid disk 3
raid5: device hda1 operational as raid disk 2
raid5: cannot start dirty degraded array for md2
RAID5 conf printout:
--- rd:4 wd:3 fd:1
disk 0, o:1, dev:sda3
disk 2, o:1, dev:hda1
disk 3, o:1, dev:hdc1
raid5: failed to run raid set md2
md: pers->run() failed ...
How do I convince the array to start? I can add the new disk to the
array, but it simply becomes a spare and the raid5 remains inactive.
The superblock on one of the three drives (sda3) is a little different
from the other two:
# mdadm -E /dev/hda1 > sb-hda1
# mdadm -E /dev/hdc1 > sb-hdc1
# mdadm -E /dev/sda3 > sb-sda3
# diff -u sb-hda1 sb-hdc1
--- sb-hda1 2006-07-01 17:17:36.0 -0400
+++ sb-hdc1 2006-07-01 17:17:41.0 -0400
@@ -1,4 +1,4 @@
-/dev/hda1:
+/dev/hdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 6b8b4567:327b23c6:643c9869:66334873
@@ -16,14 +16,14 @@
Working Devices : 3
Failed Devices : 2
Spare Devices : 0
- Checksum : a2163da6 - correct
+ Checksum : a2163dbb - correct
Events : 0.47575379
Layout : left-symmetric
Chunk Size : 64K
       Number   Major   Minor   RaidDevice State
-this     2       3        1        2      active sync   /dev/hda1
+this     3      22        1        3      active sync   /dev/hdc1
    0     0       8        3        0      active sync   /dev/sda3
    1     1       0        0        1      faulty removed
# diff -u sb-hda1 sb-sda3
--- sb-hda1 2006-07-01 17:17:36.0 -0400
+++ sb-sda3 2006-07-01 17:17:43.0 -0400
@@ -1,4 +1,4 @@
-/dev/hda1:
+/dev/sda3:
Magic : a92b4efc
Version : 00.90.00
UUID : 6b8b4567:327b23c6:643c9869:66334873
@@ -10,22 +10,22 @@
Total Devices : 4
Preferred Minor : 2
-Update Time : Mon Jun 26 22:51:12 2006
- State : active
+Update Time : Mon Jun 26 22:51:06 2006
+ State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 2
Spare Devices : 0
- Checksum : a2163da6 - correct
- Events : 0.47575379
+ Checksum : a4ec2eec - correct
+ Events : 0.47575378
Layout : left-symmetric
Chunk Size : 64K
       Number   Major   Minor   RaidDevice State
-this     2       3        1        2      active sync   /dev/hda1
+this     0       8        3        0      active sync   /dev/sda3
    0     0       8        3        0      active sync   /dev/sda3
-   1     1       0        0        1      faulty removed
+   1     1       0        0        1      spare
    2     2       3        1        2      active sync   /dev/hda1
    3     3      22        1        3      active sync   /dev/hdc1
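
The fields that matter in those diffs are State and Events. As a rough
sketch (sb_summary is a made-up helper name, and it assumes the saved
files use the same "State :" and "Events :" labels shown above),
something like this would print just those two values for each
superblock dump:

```shell
# sb_summary: hypothetical helper, not part of mdadm.
# For each saved `mdadm -E` dump given as an argument, print the file
# name together with its State and Events fields.
sb_summary() {
    for f in "$@"; do
        awk -v name="$f" '
            # "State : clean" splits into State, ":", clean
            /^ *State :/  { state = $3 }
            # "Events : 0.47575379" splits the same way
            /^ *Events :/ { events = $3 }
            END { printf "%s state=%s events=%s\n", name, state, events }
        ' "$f"
    done
}
```

Run as `sb_summary sb-hda1 sb-hdc1 sb-sda3`. Going by the diffs above,
hda1 and hdc1 should report active with events 0.47575379, and sda3
clean with events 0.47575378.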
How do I get this array going again? Am I doing something wrong?
The list archives suggest that there could be bugs in this
area, or that I may need to recreate the array with -C (though that
seems heavy-handed to me).
thanks,
Jason