Dear RAID experts,
I have just joined this ML today, and have a problem with a RAID5 system
which I am installing now.
About ten hours after running "mkraid /dev/md0", HDD access stopped and
no further disk operation (such as "mke2fs /dev/md0") worked. I tried
once again and got exactly the same errors, starting from the same block
number (1073743253, according to /var/log/messages). The block numbers
then repeated cyclically.
I suspect that some malfunction occurs when the block number exceeds
1024^3 (= 1073741824)...
My system configuration, /etc/raidtab, source modification,
/var/log/messages (excerpt) and /proc/mdstat are attached below.
Suggestions or pointers are highly appreciated.
Best regards,
Seishi Takamura
Seishi Takamura, Dr.Eng.
NTT Cyber Space Laboratories
Y922A 1-1 Hikarino-Oka, Yokosuka, Kanagawa, 239-0847 Japan
Tel: +81-468-59-2371, Fax: +81-468-59-2829
E-mail: [EMAIL PROTECTED]
(system configuration)
RedHat 6.1 (Japanese version)
kernel 2.2.14 + RAID patch(raid0145-19990824-2.2.11)
raidtools 19990824-0.90
CPU Pentium III 600MHz + 512MB memory
6 GB EIDE HDD (root and /boot), CD-ROM drive
3 SCSI Cards (Adaptec AHA2940U2W)
24 SCSI HDD Drives (Seagate ST150176LW Barracuda 50.1GB)
Each SCSI card has eight HDDs connected (properly terminated, of
course).
(/etc/raidtab)
raiddev /dev/md0
raid-level 5
nr-raid-disks 24
nr-spare-disks 0
chunk-size 32
persistent-superblock 1
parity-algorithm left-symmetric
device /dev/sda1
raid-disk 0
...
device /dev/sdx1
raid-disk 23
(Modification)
In raidtools-0.90/md-int.h and /usr/src/linux/include/linux/raid/md_p.h,
I changed from
#define MD_SB_DISKS_WORDS 384
to
#define MD_SB_DISKS_WORDS 800
to enable up to 25 disks.
(initial /proc/mdstat immediately after invoking mkraid)
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdx1[23] sdw1[22] sdv1[21] sdu1[20] sdt1[19] sds1[18] sdr1[17]
sdq1[16] sdp1[15] sdo1[14] sdn1[13] sdm1[12] sdl1[11] sdk1[10] sdj1[9] sdi1[8] sdh1[7]
sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0] 1123474560 blocks level 5, 32k
chunk, algorithm 2 [24/24] [UUUUUUUUUUUUUUUUUUUUUUUU] resync=0% finish=735.7min
unused devices: <none>
(/var/log/messages)
Jan 18 00:01:37 localhost kernel: compute_blocknr: map not correct
Jan 18 00:01:37 localhost last message repeated 112 times
Jan 18 00:01:37 localhost kernel: compute_blocknr: map not correct
Jan 18 00:01:37 localhost last message repeated 454 times
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743253
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743285
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743317
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743349
...
(/proc/mdstat after the error)
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdx1[23](F) sdw1[22](F) sdv1[21](F) sdu1[20](F) sdt1[19](F) sds1[18](F)
sdr1[17](F) sdq1[16](F) sdp1[15](F) sdo1[14](F) sdn1[13](F) sdm1[12](F) sdl1[11](F)
sdk1[10](F) sdj1[9](F) sdi1[8] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2](F)
sdb1[1](F) sda1[0](F) 1123474560 blocks level 5, 32k chunk, algorithm 2
[24/6] [___UUUUUU_______________]
unused devices: <none>