I have been running software RAID for several years without trouble.
Recently, one of my hard disks in my home server died. As the others
were getting pretty noisy too (they were 4 X 17Gb Maxtors, which I have
since learned have a bad reputation), I decided to play it safe and
replace them all. I
also decided to make the most of the opportunity to go for larger
capacity disks. I purchased 4 X 40Gb Fujitsu MPG3409AT disks, which are
wonderfully quiet compared to the Maxtors.

Unfortunately, the BIOS on my old motherboard was stuck
on the 33.8Gb limit, and I could not persuade the system to boot. So I
decided to upgrade the motherboard too. My existing setup had the first
2 disks on the 2 IDE channels on the motherboard, and the other 2 disks
on a Promise ATA-33 controller. As the Fujitsu disks are capable of
ATA-100, I decided to look for a system that would allow me to run them
at a faster speed (although I appreciate that this probably would not
make much difference with 1 disk per channel). I went for a Gigabyte
GA-7ZXR, which has a Promise controller allowing RAID (under Windows) or
ATA-100 on 2 extra IDE channels, as well as the standard IDE channels.

With a new motherboard and disks, I effectively had a new system, so I
decided to install from scratch rather than try to transfer the old
system across. I went for Mandrake 7.2, because that distro has had the
most success detecting my hardware in the past. Unfortunately, it did
not successfully detect the Promise controller, so only the first 2
disks showed up. I tried the advice for older Promise controllers in the
Ultra-DMA mini-HOWTO, passing parameters for ide2 and ide3 to the kernel
at bootup. This seemed to work, as all disks were then detected.
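
For reference, the parameters go into lilo.conf as an append line of
the form below. The I/O addresses shown are placeholders only; the real
values have to be read from the card's BIOS banner or the PCI
configuration, as the mini-HOWTO describes:

    append="ide2=0xb800,0xb402 ide3=0xb408,0xb402"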

Initially, I thought I would take the easy option and use Diskdrake to
install straight to RAID. However, this resulted in frustration, because
of Mandrake's dependence on the brain-dead Red Hat scheme, where
software RAIDs are started from rc.sysinit and stopped in halt. This
does not work for root RAID, and why anyone would choose to do this with
scripts when autodetect works so well is beyond me. The result was a
failure to correctly stop the root RAID at shutdown, and therefore a
resync every time I rebooted.

So I decided to do things manually. I installed the system to the first
disk, and defined that as a failed-disk in my raidtab when creating the
4-disk RAID-5. The raid device was created fine. I created a filesystem
on it with "mke2fs -b 4096 -R stride=16 /dev/md0". No complaints.
Mounted it to /mnt/disk. Copied my root filesystem onto it with "cd /;
find . -xdev | cpio -pm /mnt/disk". No problems. "ls /mnt/disk" seemed
happy. But when I rebooted, the raid started fine, yet e2fsck found so
many errors on /dev/md0 that the boot sequence failed. I could force the
same effect without rebooting: create the raid, mount it, copy the
files, unmount, run e2fsck (which reports clean), then mount it again.
Even though e2fsck had reported the filesystem clean immediately
beforehand, the mount would produce an endless stream of errors.
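
To be explicit, the full sequence that triggers the corruption is below
(shown as a dry run via run=echo, since mkraid and mke2fs are
destructive; clear run and execute as root to do it for real):

```shell
run=echo   # dry run: commands are printed, not executed
$run mkraid /dev/md0                        # create the degraded 4-disk RAID-5
$run mke2fs -b 4096 -R stride=16 /dev/md0   # 64k chunk / 4k block = stride 16
$run mount /dev/md0 /mnt/disk
$run sh -c 'cd / && find . -xdev | cpio -pm /mnt/disk'
$run umount /mnt/disk
$run e2fsck -f /dev/md0                     # reports clean
$run mount /dev/md0 /mnt/disk               # endless EXT2-fs errors follow
```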

If I copy only a small filesystem (say /root) to the device, I do not
get these problems. I have not identified the threshold filesystem size
that provokes the corruption.
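
One way I could narrow the threshold down is to write a file of known
size, copy it across, remount, and compare. A sketch of the check
(using /tmp here as a stand-in for /mnt/disk, so the fragment is safe
to run anywhere):

```shell
size_mb=16                             # candidate size to test
dd if=/dev/zero of=/tmp/rtest.src bs=1M count=$size_mb 2>/dev/null
# on the real array: copy to /mnt/disk, umount, remount, then compare
cp /tmp/rtest.src /tmp/rtest.dst
cmp -s /tmp/rtest.src /tmp/rtest.dst && echo "intact at ${size_mb}Mb" \
                                     || echo "corrupt at ${size_mb}Mb"
```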

I thought the problems might be due to the onboard Promise controller,
so I disabled it and reinstalled my old Promise ATA-33 card, which had
worked perfectly for 2 years with the 17Gb disks. This seemed to work
fine: the card was detected without any kernel parameters, and although
its BIOS thinks the disks are only 8Gb, that has no effect on linux's
ability to identify their full capacity. However, I experience the same
corruption, in the same circumstances, with this card as with the
onboard controller.

The error messages are endless, but a few examples are:


When mounting:

EXT2-fs error (device md(9,0)): ext2_check_blocks_bitmap: Wrong free
blocks count for group20, stored=32217, counted=32220

EXT2-fs error (device md(9,0)): ext2_check_blocks_bitmap: Wrong free
blocks count in super block, stored=29189883, counted=29191024

EXT2-fs error (device md(9,0)): ext2_check_inodes_bitmap: Wrong free
inodes count in group20 stored=16359, counted=16362


When running e2fsck:

/dev/md0 contains a file system with errors, check forced.
Pass1: Checking inodes, blocks, and sizes
Inode 901168 has illegal block(s). Clear <y>?
       [I press y]
Illegal block #12 (3067540912) in inode 901168. CLEARED
       [repeated many times for different blocks]
Too many illegal blocks in inode 901168
Clear inode <y>?

Inode 950308 is in use, but has dtime set. Fix <y>? 

Special [device/socket/fifo] inode 950332 has non-zero size. Fix <y>?

Inode 950336 has imagic flag set. Clear <y>?

Inode 950384, i_blocks is 8519744, should be 144. Fix <y>?

And many other similar messages. I have never had the patience to wade
through them all to the end.


A few other details:

Mandrake 7.2 uses kernel 2.2.17 with the Mandrake patch set. This
includes the software-RAID patch. I have recompiled the kernel to build
in RAID-1 and -5 support, which are modular in the default installation.
Mandrake also comes with the new raidtools (v0.90). I don't think there
is any issue of the raid support being out of date or incompatible with
the raidtools. I have commented out the mad raid sections in rc.sysinit
and halt.
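
(A quick way to confirm that the recompiled kernel really does have the
raid personalities built in is the first line of /proc/mdstat, which
should read something like "Personalities : [raid1] [raid5]":

```shell
# Lists the RAID personalities the running kernel knows about.
cat /proc/mdstat 2>/dev/null || echo "/proc/mdstat not available"
```
)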

My raidtab for md0 is:

raiddev /dev/md0
    raid-level                5
    nr-raid-disks             4
    nr-spare-disks            0
    persistent-superblock     1
    parity-algorithm          left-symmetric
    chunk-size                64

    device                    /dev/hdc3
    raid-disk                 0
    device                    /dev/hde2
    raid-disk                 1
    device                    /dev/hdg2
    raid-disk                 2
    device                    /dev/hda3
    failed-disk               3


The output of fdisk -l is:

Disk /dev/hda: 255 heads, 63 sectors, 4983 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1         2     16033+  83  Linux
/dev/hda2             3        25    184747+  82  Linux swap
/dev/hda3            26      4983  39825135   85  Linux extended
/dev/hda5            26      4983  39825103+  83  Linux

Disk /dev/hdc: 255 heads, 63 sectors, 4983 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hdc1   *         1         2     16033+  83  Linux
/dev/hdc2             3        25    184747+  82  Linux swap
/dev/hdc3            26      4983  39825135   83  Linux

Disk /dev/hde: 16 heads, 63 sectors, 79428 cylinders
Units = cylinders of 1008 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hde1             1       410    206608+  83  Linux
/dev/hde2           411     79428  39825072   83  Linux

Disk /dev/hdg: 16 heads, 63 sectors, 79428 cylinders
Units = cylinders of 1008 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hdg1             1       410    206608+  83  Linux
/dev/hdg2           411     79428  39825072   83  Linux

(I will set the partition type to fd for autostart eventually, but for
the sake of testing I was starting and stopping the raid manually, so
that a failure at boot time does not hang the system)


Output of df:

Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda5             39199884  32040384   5168248  86% /
/dev/hdc1                15522         1     14720   0% /bakboot
/dev/hda1                15522      2489     12232  17% /boot


The reason I want to combine most of each disk into one large (~120Gb)
RAID-5 is that I mostly use my home server for storing audio, and
potentially in future, video files, and I would like them all to exist
within one filesystem structure. I don't want to make arbitrary
judgements about how much space I will need in different areas of my
filesystem in the future.

Does anyone have any ideas what could be causing this corruption? My
first instinct was to blame the controller, but that theory seems to be
out of the window, as I have the same problem with a controller that
worked fine for 2 years. Does linux have problems with filesystems of
this size? Could it be related to the raid running in degraded mode
(because hda is defined as a failed-disk)? Or could it be anything to do
with linux assigning different drive geometries to the first two and the
last two disks, or with the partition sizes not being exactly equal (I
don't think this should matter)?
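
On that last point, my understanding is that md sizes every member down
to the smallest partition, so the few hundred kilobytes of difference
should just be wasted, not harmful. The expected array size works out
as:

```shell
# usable RAID-5 size = (nr-raid-disks - 1) * smallest member,
# in the 1k blocks that fdisk reports
smallest=39825072                 # /dev/hde2 and /dev/hdg2
echo $(( (4 - 1) * smallest ))    # 119475216 blocks, i.e. the ~120Gb target
```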

Sorry for the long message, but I thought the background might provide
useful clues.

Cheers,

Bruno Prior
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]