Hi,
I have a few questions about a RAID array I have here, running on a Red Hat
6.0 distribution with a 2.2.5-22 kernel and raidtools 0.90.
The array has 4 SCSI disks, one of which has failed:
# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sda1[0](F) sdd1[3] sdc1[2] sde1[1] 26627328 blocks level 5, 128k chunk, algorithm 2 [4/3] [_UUU]
unused devices: <none>
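For anyone reading the status line above, here is a minimal Python sketch (not part of raidtools) that pulls out the member list, the [4/3] counter and the [_UUU] slot map. The function name parse_md_line is made up for illustration, and the regular expressions assume the 0.90-style /proc/mdstat layout shown here.

```python
import re

# The md0 status line from the /proc/mdstat output above, joined onto one line.
MDSTAT_LINE = (
    "md0 : active raid5 sda1[0](F) sdd1[3] sdc1[2] sde1[1] "
    "26627328 blocks level 5, 128k chunk, algorithm 2 [4/3] [_UUU]"
)


def parse_md_line(line):
    """Return (members, expected, working, slots) for one md status line.

    members maps a device name to (raid-disk number, failed flag).
    """
    members = {}
    # Each member looks like "sda1[0]", optionally followed by "(F)" if failed.
    for name, slot, failed in re.findall(r"(\w+)\[(\d+)\](\(F\))?", line):
        members[name] = (int(slot), bool(failed))
    # "[4/3]" means 4 disks expected, 3 currently working.
    expected, working = map(int, re.search(r"\[(\d+)/(\d+)\]", line).groups())
    # "[_UUU]" has one character per raid-disk slot: U = up, _ = missing.
    slots = re.search(r"\[([U_]+)\]", line).group(1)
    return members, expected, working, slots


if __name__ == "__main__":
    members, expected, working, slots = parse_md_line(MDSTAT_LINE)
    failed = [name for name, (_, is_failed) in members.items() if is_failed]
    print("failed members:", failed)
    print("working %d of %d, slot map %s" % (working, expected, slots))
```

Run against the line above, this reports sda1 (raid-disk 0) as the failed member, with 3 of 4 disks working.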
I tried to replace sda1 (the drive with SCSI ID 0) with another physical
drive I had outside the machine (which was once part of this RAID
array). The drive that went in had ID 0 too. I played around trying to get
the array working, but for some reason it would not work (sorry, I do not
have any output from that attempt). I suspect that the old RAID information
on that drive may have caused the problem, since putting the faulty drive
back in, time inconsistencies and all, allowed the array to be started up
again.
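As a rough way to look at the "old information" question, the sketch below takes the event counters from the first autorun pass in the dmesg output further down and splits the members into fresh and stale. The rule used (anything below the highest counter is stale) is an assumption that merely reproduces what this log shows; it is not taken from the kernel source, and the name split_fresh is made up for illustration.

```python
# Event counters (hex) as printed in the first autorun pass of the dmesg log.
EVENTS = {
    "sde1": "0000003e",
    "sdd1": "0000003e",
    "sdc1": "0000003e",
    "sdb1": "00000021",
    "sda1": "0000003c",
}


def split_fresh(events):
    """Split devices into (fresh, stale) lists by superblock event counter.

    Assumption: a member counts as stale if its counter is lower than the
    highest counter seen. This is a simplification for illustration only.
    """
    counts = {dev: int(value, 16) for dev, value in events.items()}
    newest = max(counts.values())
    fresh = sorted(dev for dev, count in counts.items() if count == newest)
    stale = sorted(dev for dev, count in counts.items() if count < newest)
    return fresh, stale


if __name__ == "__main__":
    fresh, stale = split_fresh(EVENTS)
    print("fresh:", fresh)
    print("stale:", stale)
```

With the values above, sdc1, sdd1 and sde1 end up in the fresh group and sda1 and sdb1 in the stale group, which matches the two "kicking non-fresh" messages in the log.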
I have a spare disk in this machine which has been added to the array with
raidhotadd.
I wanted this spare disk to be added automatically as a hot-spare drive, but
I have been unable to get this working (the relevant lines are now commented
out in the raidtab file below).
Can anyone give me some insight into what is going on here? Should I
format the partition on the drive that did not work so that the superblock
and related information is no longer present? Should I seriously consider
compiling a 2.2.14 kernel with the latest raidtools patch? Should I pull my
last hair out of my head?
Kind regards, Stuart.
# raidstart --version
raidstart v0.3d compiled for md raidtools-0.90
# cat /etc/raidtab
raiddev /dev/md0
raid-level 5
nr-raid-disks 4
chunk-size 128
persistent-superblock 1
parity-algorithm left-symmetric
# Spare disks for hot reconstruction
#nr-spare-disks 1
device /dev/sda1
raid-disk 0
device /dev/sde1
raid-disk 1
device /dev/sdc1
raid-disk 2
device /dev/sdd1
raid-disk 3
#device /dev/sdb1
#spare-disk 0
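To rule out a simple mismatch when the spare lines are uncommented, here is a small Python sketch that parses a raidtab like the one above and checks that nr-raid-disks and nr-spare-disks agree with the number of raid-disk and spare-disk entries. It only understands the keywords that appear in this file; the function names are made up for illustration, and this is not how raidtools itself validates the file.

```python
# The raidtab exactly as posted above, with the spare lines still commented out.
RAIDTAB_AS_POSTED = """\
raiddev /dev/md0
raid-level 5
nr-raid-disks 4
chunk-size 128
persistent-superblock 1
parity-algorithm left-symmetric
# Spare disks for hot reconstruction
#nr-spare-disks 1
device /dev/sda1
raid-disk 0
device /dev/sde1
raid-disk 1
device /dev/sdc1
raid-disk 2
device /dev/sdd1
raid-disk 3
#device /dev/sdb1
#spare-disk 0
"""


def parse_raidtab(text):
    """Return (settings, raid_disks, spare_disks) from raidtab text."""
    settings, raid_disks, spare_disks = {}, {}, {}
    current = None  # the most recent "device" line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "device":
            current = value
        elif key == "raid-disk":
            raid_disks[int(value)] = current
        elif key == "spare-disk":
            spare_disks[int(value)] = current
        else:
            settings[key] = value
    return settings, raid_disks, spare_disks


def counts_consistent(settings, raid_disks, spare_disks):
    """True if the declared disk counts match the listed devices."""
    return (int(settings.get("nr-raid-disks", 0)) == len(raid_disks)
            and int(settings.get("nr-spare-disks", 0)) == len(spare_disks))


if __name__ == "__main__":
    # Uncomment the three spare-related lines, as intended for the hot spare.
    with_spare = (RAIDTAB_AS_POSTED
                  .replace("#nr-spare-disks", "nr-spare-disks")
                  .replace("#device", "device")
                  .replace("#spare-disk", "spare-disk"))
    settings, raid_disks, spare_disks = parse_raidtab(with_spare)
    print("spares:", spare_disks)
    print("consistent:", counts_consistent(settings, raid_disks, spare_disks))
```

The check passes both for the file as posted and for the variant with all three spare lines uncommented; it fails if only nr-spare-disks is uncommented and the device lines are not.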
# cat /var/log/dmesg
wansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
Initializing RT netlink socket
Starting kswapd v 1.5
Detected PS/2 Mouse Port.
Serial driver version 4.27 with MANY_PORTS MULTIPORT SHARE_IRQ enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.9)
Real Time Clock Driver v1.09
RAM disk driver initialized: 16 RAM disks of 4096K size
PIIX: IDE controller on PCI bus 00 dev 38
PIIX: not 100% native mode: will probe irqs later
PIIX: neither IDE port enabled (BIOS)
PIIX: IDE controller on PCI bus 00 dev 39
PIIX: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xe800-0xe807, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xe808-0xe80f, BIOS settings: hdc:pio, hdd:pio
hda: ST31720A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ST31720A, 1626MB w/0kB Cache, CHS=826/64/63
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
raid5: measuring checksumming speed
8regs : 169.545 MB/sec
32regs : 149.352 MB/sec
using fastest function: 8regs (169.545 MB/sec)
scsi : 0 hosts.
scsi : detected total.
md.c: sizeof(mdp_super_t) = 4096
Partition check:
hda: hda1 hda2 hda3
RAMDISK: Compressed image found at block 0
autodetecting RAID arrays
autorun ...
... autorun DONE.
VFS: Mounted root (ext2 filesystem).
(scsi0) <Adaptec AHA-294X SCSI host adapter> found at PCI 10/0
(scsi0) Narrow Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Downloading sequencer code... 406 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.16/3.2.4
<Adaptec AHA-294X SCSI host adapter>
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0023
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:1:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0024
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
(scsi0:0:2:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0023
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdc at scsi0, channel 0, id 2, lun 0
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST39140N Rev: 1498
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdd at scsi0, channel 0, id 3, lun 0
(scsi0:0:4:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0024
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sde at scsi0, channel 0, id 4, lun 0
SCSI device sda: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sda: sda1
SCSI device sdb: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sdb: sdb1
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sdc: sdc1
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB]
sdd: sdd1
SCSI device sde: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sde: sde1
autodetecting RAID arrays
(read) sda1's sb offset: 8883840 [events: 0000003c]
(read) sdb1's sb offset: 8883840 [events: 00000021]
(read) sdc1's sb offset: 8875776 [events: 0000003e]
(read) sdd1's sb offset: 8883840 [events: 0000003e]
(read) sde1's sb offset: 8883840 [events: 0000003e]
autorun ...
considering sde1 ...
adding sde1 ...
adding sdd1 ...
adding sdc1 ...
adding sdb1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sdb1,2>
bind<sdc1,3>
bind<sdd1,4>
bind<sde1,5>
running: <sde1><sdd1><sdc1><sdb1><sda1>
now!
sde1's event counter: 0000003e
sdd1's event counter: 0000003e
sdc1's event counter: 0000003e
sdb1's event counter: 00000021
sda1's event counter: 0000003c
md: superblock update time inconsistency -- using the most recent one
freshest: sde1
md: kicking non-fresh sdb1 from array!
unbind<sdb1,4>
export_rdev(sdb1)
md: kicking non-fresh sda1 from array!
unbind<sda1,3>
export_rdev(sda1)
md0: removing former faulty sda1!
kmod: failed to exec /sbin/modprobe -s -k md-personality-4, errno = 2
do_md_run() returned -22
unbind<sde1,2>
export_rdev(sde1)
unbind<sdd1,1>
export_rdev(sdd1)
unbind<sdc1,0>
export_rdev(sdc1)
md0 stopped.
... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
change_root: old root has d_count=1
Trying to unmount old root ... okay
Freeing unused kernel memory: 60k freed
Adding Swap: 128988k swap-space (priority -1)
(read) sda1's sb offset: 8883840 [events: 0000003c]
(read) sde1's sb offset: 8883840 [events: 0000003e]
(read) sdc1's sb offset: 8875776 [events: 0000003e]
(read) sdd1's sb offset: 8883840 [events: 0000003e]
autorun ...
considering sdd1 ...
adding sdd1 ...
adding sdc1 ...
adding sde1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sde1,2>
bind<sdc1,3>
bind<sdd1,4>
running: <sdd1><sdc1><sde1><sda1>
now!
sdd1's event counter: 0000003e
sdc1's event counter: 0000003e
sde1's event counter: 0000003e
sda1's event counter: 0000003c
md: superblock update time inconsistency -- using the most recent one
freshest: sdd1
md: kicking non-fresh sda1 from array!
unbind<sda1,3>
export_rdev(sda1)
md0: removing former faulty sda1!
raid5 personality registered
md0: max total readahead window set to 1536k
md0: 3 data-disks, max readahead per data-disk: 512k
raid5: device sdd1 operational as raid disk 3
raid5: device sdc1 operational as raid disk 2
raid5: device sde1 operational as raid disk 1
raid5: md0, not all disks are operational -- trying to recover array
raid5: allocated 4248kB for md0
raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:3 fd:1
disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde1
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdd1
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
RAID5 conf printout:
--- rd:4 wd:3 fd:1
disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde1
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdd1
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
md: updating md0 RAID superblock on device
sdd1 [events: 0000003f](write) sdd1's sb offset: 8883840
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
sdc1 [events: 0000003f](write) sdc1's sb offset: 8875776
sde1 [events: 0000003f](write) sde1's sb offset: 8883840