Re: Problems with raid1 - Best way to rebuild

David Robinson Sun, 6 Jun 1999 14:57:53 -0700
It's best not to use mkraid --force, as it may resync from the wrong drive. This is
not a problem when testing or playing around, but in production, if one drive fails
and you put in a new one and then run mkraid --force, there is a possibility that
mkraid will use the replacement drive as the source and overwrite all your data on
the RAID with garbage, or with whatever was on the new drive.

I think the right command to do a resync is /sbin/raidhotadd, or possibly
/sbin/raidaddspare. I can't remember which right now, but one of those commands
lets you specify the md device and then the dead partition. E.g. given
/dev/md0
    drive 0 /dev/sda0
    drive 1 /dev/sdb0
if /dev/sdb0 dies and needs to be rebuilt:
/sbin/raidhotadd /dev/md0 /dev/sdb0
Done!

cat /proc/mdstat should show the mirror being rebuilt.

This should rebuild off the device /dev/sda0.
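
To watch for a degraded mirror from a script, something like the following might do.
It is only a rough sketch: the function name degraded_arrays is made up for this
example, and it assumes the status line in /proc/mdstat carries a "[total/active]"
counter such as "[2/1]", which may not hold for every kernel or RAID patch version.

```shell
#!/bin/sh
# Print the name of every md array whose "[total/active]" counter shows
# fewer active devices than configured devices.
# Assumption: the mdstat layout looks roughly like
#   md0 : active raid1 sdb1[1] sda1[0]
#         1048512 blocks [2/1] [U_]
# The optional first argument is the file to read (default /proc/mdstat).
degraded_arrays() {
    mdstat_file="${1:-/proc/mdstat}"
    awk '
        # Remember the array name from lines such as "md0 : active ...".
        /^md[0-9]+ *:/ { name = $1; sub(/:$/, "", name) }
        # Find a counter such as "[2/1]" and compare its two numbers.
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name
        }
    ' "$mdstat_file"
}
```

Calling degraded_arrays with no argument reads /proc/mdstat. Under the assumption
above, any array name it prints is running with fewer active devices than configured.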

Are you sure that nothing else is using the two spare drives you put in, e.g. swap,
which might be causing all those errors?
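
One way to check that might be a small script along these lines. It is a sketch
only: the function name device_in_use is made up, and it assumes the kernel
provides /proc/swaps (with a one-line header) and /proc/mounts in the usual
whitespace-separated layout.

```shell
#!/bin/sh
# Report whether a device appears as active swap or as a mounted filesystem.
# Arguments: the device name, then optionally the swaps file and the mounts
# file to read (defaults: /proc/swaps and /proc/mounts).
device_in_use() {
    dev="$1"
    swaps_file="${2:-/proc/swaps}"
    mounts_file="${3:-/proc/mounts}"
    # /proc/swaps starts with a header line, so skip line 1.
    awk -v dev="$dev" 'NR > 1 && $1 == dev { print "swap: " $1 }' "$swaps_file"
    # /proc/mounts has no header; field 1 is the device, field 2 the mount point.
    awk -v dev="$dev" '$1 == dev { print "mounted: " $2 }' "$mounts_file"
}
```

For example, device_in_use /dev/sdb0 prints nothing if the partition is neither
swap nor mounted according to those two files.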



Nathan Neulinger wrote:

> For me, it wasn't the root drive. It was just a couple of extra drives I
> slapped in as slaves for testing.
>
> I also had the same problem with it never rebuilding when it went down.
> I believe I was able to recover by adjusting /etc/raidtab and running
> mkraid with the force option.
>
> What I've been doing to test is just doing a big tar -cvf to the
> /dev/md0 device, and then pulling the power on one of the drives. I'd
> expect it to choke when I pull the power and then recover and start
> writing again.
>
> -- Nathan
>
> David Robinson wrote:
> >
> > I had the same problem.
> > It was only happening on the root partition that I had mirrored. Once I
> > unmirrored the root partition all the io errors stopped. When I pulled the
> > power to one drive the other RAID-1 partitions went straight into degraded mode
> > without problems.
> >
> > I'm not sure if this is a bug in the current version of Linux RAID but it looks
> > like Linux is a little way off from being able to mirror the root device.
> >
> > The RAID docs really need updating. I could not find out how to remirror a
> > drive once it went into degraded mode! I only happened to find it in the end by
> > playing with the programs that come with Linux RAID e.g. raidadd, which allowed
> > me to remake the mirror by specifying the "bad" partition. A reboot didn't
> > force a mirror rebuild.
> >
> > I would suggest that everyone who sets up RAID do a test by powering down one
> > of the drives.
> >
> > I have currently got a cron job that does a dd on the root partition to another
> > drive with exactly the same setup. This appears to work fine and I can boot off
> > it. At least if the root drive fails I only lose a few password changes, etc.,
> > from the last time the cronjob ran.
> >
> > "Neulinger, Nathan R." wrote:
> >
> > > We've been doing some initial testing - looking at using RAID-1 mirroring
> > > with md, but have not had much luck so far.
> > >
> > > We've set up a raid1 device with two separate IDE drives on separate
> > > controllers (on-board).
> > >
> > > To simulate a drive failure, we've cut the power to one drive while the raid
> > > set is being used. After a long timeout, the kernel sees the failure on hdd
> > > and then md says that the drive has failed and will continue in degraded
> > > mode.
> > >
> > > The problem is - it doesn't continue, it sits there and keeps trying to
> > > access that drive again and again every few seconds. The I/O operation that
> > > was taking place against /dev/md0 never resumes (it stopped as soon as the
> > > power was pulled on that one drive.)
> > >
> > > If we put the power back on to that drive, it breaks loose and starts
> > > running in degraded mode.
> > >
> > > This is with the 990128-2.2.0 patch applied to the 2.2.2 kernel w/ fixes.
> > >
> > > Is this a functionality issue (i.e. does md raid1 not support continuing to
> > > run after a drive failure if there are no spares?), or is something wrong?
> > >
> > > I can possibly upgrade to a new kernel release if absolutely necessary.
> > >
> > > -- Nathan
> > >
> > > ------------------------------------------------------------
> > > Nathan Neulinger                       EMail:  [EMAIL PROTECTED]
> > > University of Missouri - Rolla         Phone: (573) 341-4841
> > > Computing Services                       Fax: (573) 341-4216