I had the same problem.
It was only happening on the root partition that I had mirrored. Once I
unmirrored the root partition, all the I/O errors stopped. When I pulled the
power to one drive, the other RAID-1 partitions went straight into degraded
mode without problems.

I'm not sure whether this is a bug in the current version of Linux RAID, but it
looks like Linux is still some way off being able to mirror the root device.

The RAID docs really need updating. I could not find out how to remirror a
drive once the array went into degraded mode! I only found out in the end by
experimenting with the programs that come with Linux RAID, e.g. raidadd, which
allowed me to remake the mirror by specifying the "bad" partition. A reboot
didn't force a mirror rebuild.
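For anyone stuck in the same place, a minimal sketch of the rebuild procedure, assuming the raidtools 0.90 userland (raidhotremove/raidhotadd) and hypothetical device names (/dev/md0 for the array, /dev/hdc1 for the partition that dropped out):

```shell
# Sketch only -- device names are examples, adjust for your layout.
# Remove the failed partition from the array, then re-add it;
# the kernel starts a resync onto the re-added disk.
raidhotremove /dev/md0 /dev/hdc1
raidhotadd /dev/md0 /dev/hdc1

# Watch the resync progress.
cat /proc/mdstat
```

As noted above, simply rebooting does not trigger this; the rebuild only starts when the partition is explicitly re-added to the array.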

I would suggest that everyone who sets up RAID test it by powering down one of
the drives.

I currently have a cron job that does a dd of the root partition to another
drive with exactly the same layout. This appears to work fine, and I can boot
off the copy. If the root drive fails, I only lose a few password changes,
etc., made since the last time the cron job ran.
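For reference, such a cron entry could look like the following. This is a sketch with hypothetical device names (/dev/hda1 as the root partition, /dev/hdc1 as the identically sized spare); the schedule is an example, not what I actually run:

```shell
# /etc/crontab fragment -- hypothetical device names, example schedule.
# Copy the root partition block-for-block to the spare drive at 02:00 daily.
# The target partition must be at least as large as the source.
0 2 * * * root dd if=/dev/hda1 of=/dev/hdc1 bs=1024k
```

Note that dd copies the filesystem while it is mounted, so the copy is only crash-consistent; running it at a quiet hour keeps the risk of a torn copy low.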

"Neulinger, Nathan R." wrote:

> We've been doing some initial testing - looking at using RAID-1 mirroring
> with md, but have not had much luck so far.
>
> We've set up a raid1 device with two separate IDE drives on separate
> controllers (on-board).
>
> To simulate a drive failure, we've cut the power to one drive while the raid
> set is being used. After a long timeout, the kernel sees the failure on hdd
> and then md says that the drive has failed and will continue in degraded
> mode.
>
> The problem is - it doesn't continue, it sits there and keeps trying to
> access that drive again and again every few seconds. The I/O operation that
> was taking place against /dev/md0 never resumes (it stopped as soon as the
> power was pulled on that one drive.)
>
> If we put the power back on to that drive, it breaks loose and starts
> running in degraded mode.
>
> This is with the 990128-2.2.0 patch applied to the 2.2.2 kernel w/ fixes.
>
> Is this a functionality issue (i.e. does md raid1 not support continuing to
> run after a drive failure if there are no spares?), or is something wrong?
>
> I can possibly upgrade to a new kernel release if absolutely necessary.
>
> -- Nathan
>
> ------------------------------------------------------------
> Nathan Neulinger                       EMail:  [EMAIL PROTECTED]
> University of Missouri - Rolla         Phone: (573) 341-4841
> Computing Services                       Fax: (573) 341-4216
