James Manning <[EMAIL PROTECTED]>:   

   First start with some added information.

OK. I use a Debian potato system. The kernel is 2.2.13 (compiled by
myself) without any patches. The raid tools are 0.42 (old-style raid).

The conf files for the raid5 arrays are appended below. I use three SCSI
disks with an almost identical layout (one of them is slightly bigger).
On top of them I have a raid0 array and two raid5 arrays. I have a swap
file on the striped array (/var on /dev/md0). Here is the raid status:

================== /proc/mdstat ===========================
Personalities : [1 linear] [2 raid0] [3 raid1] [4 raid5]
read_ahead 128 sectors
md0 : inactive
md1 : active raid5 sda3 sdb3 sdc3 1542016 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU]
md2 : active raid5 sda5 sdb5 sdc5 4706816 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU]
md3 : active raid0 sda6 sdb6 sdc6 3405664 blocks 32k chunks
===========================================================

Now the problem: I occasionally get things like this:

Jan 12 11:14:39 pot kernel: raid5: bug: stripe->bh_new[2], sector 2622708 exists
Jan 12 11:14:39 pot kernel: raid5: bh ccc82440, bh_new c69107e0

Jan 24 11:21:49 pot kernel: raid5: bug: stripe->bh_new[0], sector 2622732 exists
Jan 24 11:21:49 pot kernel: raid5: bh cad26860, bh_new c1ec7e40

If I run ckraid on the raid5 devices, I get about ten messages along the
lines of: "array xxx corrupted, cannot reconstruct as all devices are
working". When I run ckraid again on the same devices, the xxx changes,
so these errors are not reproducible in the same place.

This made me suspect a hardware failure, but if I dd the disk partitions
to /dev/null I get no errors, nor do I get any errors using badblocks,
so the only thing left that I can think of is a software failure.

I also thought about switching to new-style raid arrays, but:
- is the new style more reliable than the old one?  I think not; it's
  just that it can reconstruct in the background, right?
- is it possible to convert an old-style raid5 array to new style in
  place?  I think not, but I may be wrong.

Thank you for your help. 

I also had a complete raid failure like this:

================================================================
Jan 20 04:10:25 pot kernel: RAID5: Disk failure on 08:23, disabling device.Operation continuing on 2 devices
Jan 20 04:10:25 pot kernel: raid5: restarting stripe 3912183324
Jan 20 04:10:25 pot kernel: attempt to access beyond end of device
Jan 20 04:10:25 pot kernel: 08:03: rw=0, want=1956091663, limit=771120
Jan 20 04:10:25 pot kernel: dev 09:01 blksize=1024 blocknr=1956091662 sector=-382783972 size=1024 count=1
Jan 20 04:10:25 pot kernel: RAID5: Disk failure on 08:03, disabling device.Operation continuing on 1 devices
Jan 20 04:10:25 pot kernel: attempt to access beyond end of device
Jan 20 04:10:25 pot kernel: 08:13: rw=0, want=1956091663, limit=771120
Jan 20 04:10:25 pot kernel: dev 09:01 blksize=1024 blocknr=1956091662 sector=-382783972 size=1024 count=1
Jan 20 04:10:25 pot kernel: RAID5: Disk failure on 08:13, disabling device.Operation continuing on 0 devices
Jan 20 04:10:25 pot kernel: raid5: restarting stripe 3912183324
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1764699694
Jan 20 04:10:25 pot kernel: md: updating raid superblock on device 08:03, sb_offset == 771008
Jan 20 04:10:25 pot kernel: md: updating raid superblock on device 08:13, sb_offset == 771008
Jan 20 04:10:25 pot kernel: md: updating raid superblock on device 08:23, sb_offset == 771008
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 761869
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=189745, block=761869
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 974896
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=243068, block=974896
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 925716
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=230602, block=925716
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 589854
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=147048, block=589854
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 630796
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=157101, block=630796
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 950292
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=236722, block=950292
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 327715
Jan 20 04:10:25 pot kernel: EXT2-fs error (device md(9,1)): ext2_read_inode: unable to read inode block - inode=81803, block=327715
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 265
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 265
Jan 20 04:10:25 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1
Jan 20 04:10:26 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1
Jan 20 04:10:27 pot kernel: raid5: 09:01: unrecoverable I/O error for block 265
Jan 20 04:10:27 pot kernel: EXT2-fs error (device md(9,1)): ext2_readdir: directory #2 contains a hole at offset 0
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 952424
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1410074
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1361517
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 747692
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 749844
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 752094
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 746753
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 753634
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 753638
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 753652
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1452430
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1464764
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1450667
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1461026
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1463258
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1471771
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1459792
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1476464
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1452430
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1464764
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1450667
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1461026
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1463258
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1471771
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1459792
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1476464
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1452430
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1464764
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1450667
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1461026
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1463258
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1471771
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1459792
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 1476464
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 265
Jan 20 04:10:28 pot kernel: EXT2-fs error (device md(9,1)): ext2_readdir: directory #2 contains a hole at offset 0
Jan 20 04:10:28 pot kernel: raid5: 09:01: unrecoverable I/O error for block 265
... 
  (the last two lines repeated forever until the system completely froze)
================================================================
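
A side note on the numbers in the log above, as a rough sketch rather than
a diagnosis. It assumes that the kernel prints device numbers as
hexadecimal major:minor pairs and that SCSI disks use major 8 with 16
minors per disk; the helper names (as_signed32, scsi_partition_name) are
made up for this illustration. Under those assumptions the three failed
devices are the members of md1, and the stripe, sector and block numbers
all appear to be the same out-of-range value seen through different units.

```python
# Values copied from the kernel log above.
stripe = 3912183324     # "restarting stripe 3912183324"
sector = -382783972     # "sector=-382783972"
blocknr = 1956091662    # "blocknr=1956091662" (blksize=1024)
want = 1956091663       # "want=1956091663"
limit = 771120          # "limit=771120", the size of sda3 in 1K blocks


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def scsi_partition_name(dev):
    """Map a hex 'major:minor' string to a name like 'sdb3'.

    Assumption: major 8 is the SCSI disk driver and every disk owns
    16 consecutive minors (minor 0 of each group is the whole disk).
    """
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != 8:
        raise ValueError("not a SCSI disk major: %d" % major)
    disk, part = divmod(minor, 16)
    return "sd" + chr(ord("a") + disk) + (str(part) if part else "")


# The negative sector is the stripe number read as a signed 32-bit value.
print(as_signed32(stripe) == sector)
# With 1024-byte blocks (two 512-byte sectors), the block is stripe / 2.
print(stripe // 2 == blocknr)
# The request runs one block past blocknr, far beyond the partition size.
print(want == blocknr + 1, want > limit)
# The three devices that were disabled, in log order.
print([scsi_partition_name(d) for d in ("08:23", "08:03", "08:13")])
```

This only shows that the numbers are related to each other; it says
nothing about where the bad stripe number came from.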

======================= scsi layout =======================
Disk /dev/sda: 255 heads, 63 sectors, 527 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1           103       527   3413812+   5  Extended
/dev/sda2   *         1         6     48163+  83  Linux
/dev/sda3             7       102    771120   83  Linux
/dev/sda5           103       395   2353491   83  Linux
/dev/sda6           396       527   1060258+  83  Linux

Disk /dev/sdb: 255 heads, 63 sectors, 527 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1           103       527   3413812+   5  Extended
/dev/sdb2   *         1         6     48163+  83  Linux
/dev/sdb3             7       102    771120   83  Linux
/dev/sdb5           103       395   2353491   83  Linux
/dev/sdb6           396       527   1060258+  83  Linux

Disk /dev/sdc: 255 heads, 63 sectors, 555 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdc1           103       555   3638722+   5  Extended
/dev/sdc2             1         6     48163+  82  Linux swap
/dev/sdc3             7       102    771120   83  Linux
/dev/sdc5           103       395   2353491   83  Linux
/dev/sdc6           396       555   1285168+  83  Linux
============================================================
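
The partition sizes above can be cross-checked against /proc/mdstat with
a little arithmetic. This is a sketch under one assumption that is not
stated anywhere in this mail: that each raid5 member is rounded down to a
multiple of 64K and then loses another 64K at its end for the raid
superblock. The "sb_offset == 771008" lines in the log are consistent
with that for the 771120-block partitions. The function names are
hypothetical.

```python
RESERVED_KB = 64  # assumption: 64K superblock area, 64K alignment


def usable_kb(partition_kb):
    """Usable size of one member: round down to 64K, then drop 64K."""
    return (partition_kb // RESERVED_KB) * RESERVED_KB - RESERVED_KB


def raid5_kb(member_kbs):
    """raid5 capacity: (n - 1) times the smallest usable member."""
    smallest = min(usable_kb(kb) for kb in member_kbs)
    return (len(member_kbs) - 1) * smallest


# sda3 / sdb3 / sdc3 from the fdisk listing (1K blocks)
md1_members = [771120, 771120, 771120]
# sda5 / sdb5 / sdc5 from the fdisk listing (1K blocks)
md2_members = [2353491, 2353491, 2353491]

print(usable_kb(771120))      # compare with sb_offset in the log
print(raid5_kb(md1_members))  # compare with md1 in /proc/mdstat
print(raid5_kb(md2_members))  # compare with md2 in /proc/mdstat
```

If the assumption holds, the results match the 1542016 and 4706816 blocks
reported for md1 and md2, so the array sizes themselves look sane.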

===File /etc/raid/home.conf=================================
# raid-5 for /home on /dev/md2

raiddev                 /dev/md2
raid-level              5
nr-raid-disks           3
nr-spare-disks          0
chunk-size              32
parity-algorithm        left-symmetric

device                  /dev/scsi/id2p5
raid-disk               0

device                  /dev/scsi/id3p5
raid-disk               1

device                  /dev/scsi/id9p5
raid-disk               2
============================================================

===File /etc/raid/usr.conf==================================
# raid-5 for /usr on /dev/md1

raiddev                 /dev/md1
raid-level              5
nr-raid-disks           3
nr-spare-disks          0
chunk-size              32
parity-algorithm        left-symmetric

device                  /dev/scsi/id2p3
raid-disk               0

device                  /dev/scsi/id3p3
raid-disk               1

device                  /dev/scsi/id9p3
raid-disk               2
============================================================
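
For reference, here is how the "left-symmetric" parity algorithm from the
two conf files is usually described for a three-disk array. This is a
minimal sketch of the layout as I understand it, not code taken from the
raid driver; the function name and the way stripes are numbered are
assumptions made for the illustration. The returned indices are raid-disk
numbers as used in the conf files.

```python
def left_symmetric(stripe, raid_disks):
    """Return (parity_disk, data_disks) for one stripe.

    Parity starts on the last disk and moves one disk to the left on
    each following stripe; the data chunks start on the disk right
    after the parity disk and wrap around.
    """
    data_count = raid_disks - 1
    parity = data_count - (stripe % raid_disks)
    data = [(parity + 1 + i) % raid_disks for i in range(data_count)]
    return parity, data


# Print the first few stripes of a three-disk array.
for stripe in range(4):
    parity, data = left_symmetric(stripe, 3)
    row = ["P" if disk == parity else "D" for disk in range(3)]
    print(stripe, " ".join(row), "data order:", data)
```

With three disks the pattern repeats every three stripes, so each disk
holds parity for one stripe in three.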
