Dear RAID experts,

I just joined this mailing list today, and I have a problem with a
RAID5 system that I am installing now.

About ten hours after running "mkraid /dev/md0", HDD access stopped and
no further disk operation (such as "mke2fs /dev/md0") worked.  I tried
once again and got exactly the same error, starting from the same block
number (1073743253, according to /var/log/messages).  The block numbers
then repeated cyclically.

I suspect that some malfunction occurs once the block number exceeds
1024^3 (= 1073741824)...
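
(To illustrate the suspicion: the sketch below shows one way such a
boundary could matter, assuming the driver somewhere converts a 1 KB
block number into 512-byte sectors in a signed 32-bit int.  I have not
verified that this is what the RAID code actually does.)

  #include <stdio.h>

  int main(void)
  {
      /* first failing block, from /var/log/messages */
      long long first_bad = 1073743253LL;

      /* hypothetical failure mode: multiplying by 2 (to get 512-byte
         sectors) in a signed 32-bit int wraps for any block past 2^30 */
      int sectors = (int)(first_bad * 2);   /* 2147486506 > INT_MAX */

      printf("2^30 blocks         = %lld\n", 1LL << 30);
      printf("first failing block = %lld\n", first_bad);
      printf("block * 2 as int32  = %d (wrapped)\n", sectors);
      return 0;
  }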

My system configuration, /etc/raidtab, source modification,
/var/log/messages (in part), and /proc/mdstat are attached below.

Suggestions or pointers are highly appreciated.

Best regards,
     Seishi Takamura

Seishi Takamura, Dr.Eng.
NTT Cyber Space Laboratories
Y922A 1-1 Hikarino-Oka, Yokosuka, Kanagawa, 239-0847 Japan
Tel: +81-468-59-2371, Fax: +81-468-59-2829
E-mail: [EMAIL PROTECTED]


(system configuration)
  RedHat 6.1 (Japanese version)
  kernel 2.2.14 + RAID patch (raid0145-19990824-2.2.11)
  raidtools 19990824-0.90
  CPU Pentium III 600MHz + 512MB memory
  6 GB EIDE HDD (root and /boot), CD-ROM drive
  3 SCSI Cards (Adaptec AHA2940U2W)
  24 SCSI HDDs (Seagate ST150176LW Barracuda, 50.1 GB each)

  Each SCSI card has eight HDDs connected (properly terminated, of
  course).

(/etc/raidtab)
  raiddev /dev/md0
        raid-level      5
        nr-raid-disks   24
        nr-spare-disks  0
        chunk-size      32
        persistent-superblock 1
        parity-algorithm        left-symmetric
        device          /dev/sda1
        raid-disk       0
        ...
        device          /dev/sdx1
        raid-disk       23

(Modification)
  In raidtools-0.90/md-int.h and /usr/src/linux/include/linux/raid/md_p.h,
  I changed from
#define MD_SB_DISKS_WORDS              384
  to
#define MD_SB_DISKS_WORDS              800
  to enable up to 25 disks.
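
  (The arithmetic behind the 25-disk figure, assuming, as in my reading
  of md_p.h, that each disk descriptor occupies MD_SB_DESCRIPTOR_WORDS =
  32 words of the superblock descriptor area:)

  #include <stdio.h>

  #define MD_SB_DESCRIPTOR_WORDS 32

  int main(void)
  {
      /* descriptor area size / words per descriptor = max member disks */
      printf("default: %d disks\n", 384 / MD_SB_DESCRIPTOR_WORDS);  /* 12 */
      printf("patched: %d disks\n", 800 / MD_SB_DESCRIPTOR_WORDS);  /* 25 */
      return 0;
  }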


(initial /proc/mdstat immediately after invoking mkraid)
Personalities : [linear] [raid0] [raid1] [raid5] 
read_ahead 1024 sectors
md0 : active raid5 sdx1[23] sdw1[22] sdv1[21] sdu1[20] sdt1[19] sds1[18] sdr1[17]
      sdq1[16] sdp1[15] sdo1[14] sdn1[13] sdm1[12] sdl1[11] sdk1[10] sdj1[9]
      sdi1[8] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1123474560 blocks level 5, 32k chunk, algorithm 2 [24/24]
      [UUUUUUUUUUUUUUUUUUUUUUUU] resync=0% finish=735.7min
unused devices: <none>
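
(A quick sanity check on the numbers above, assuming the resync walks
the array linearly: the 2^30-block boundary sits at about 95.6% of the
1123474560-block array, i.e. roughly 703 minutes into the estimated
735.7-minute resync, which is consistent with the failure appearing
after about ten hours.)

  #include <stdio.h>

  int main(void)
  {
      long long total    = 1123474560LL;   /* array size from /proc/mdstat */
      long long boundary = 1LL << 30;      /* 1073741824                   */
      double    eta_min  = 735.7;          /* resync estimate              */

      double frac = (double)boundary / (double)total;
      printf("boundary at %.1f%% of array\n", frac * 100.0);    /* 95.6  */
      printf("reached after ~%.0f minutes\n", frac * eta_min);  /* ~703  */
      return 0;
  }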

(/var/log/messages)
Jan 18 00:01:37 localhost kernel: compute_blocknr: map not correct 
Jan 18 00:01:37 localhost kernel: compute_blocknr: map not correct 
Jan 18 00:01:37 localhost last message repeated 112 times 
Jan 18 00:01:37 localhost kernel: compute_blocknr: map not correct 
Jan 18 00:01:37 localhost kernel: compute_blocknr: map not correct 
Jan 18 00:01:37 localhost last message repeated 454 times 
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743253 
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743285 
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743317 
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 1073743349 
Jan 18 00:01:37 localhost kernel: raid5: md0: unrecoverable I/O error for block 
...

(/proc/mdstat after the error)
Personalities : [linear] [raid0] [raid1] [raid5] 
read_ahead 1024 sectors
md0 : active raid5 sdx1[23](F) sdw1[22](F) sdv1[21](F) sdu1[20](F) sdt1[19](F)
      sds1[18](F) sdr1[17](F) sdq1[16](F) sdp1[15](F) sdo1[14](F) sdn1[13](F)
      sdm1[12](F) sdl1[11](F) sdk1[10](F) sdj1[9](F) sdi1[8] sdh1[7] sdg1[6]
      sdf1[5] sde1[4] sdd1[3] sdc1[2](F) sdb1[1](F) sda1[0](F)
      1123474560 blocks level 5, 32k chunk, algorithm 2 [24/6]
      [___UUUUUU_______________]
unused devices: <none>
