David Robinson wrote:
>
> It's best not to use mkraid --force, as it may resync from the dead drive. That's not
> a problem in testing/playing, but in production, if one drive fails and you put in a
> new one and then do a mkraid --force, there is the possibility that mkraid will use the
> replacement drive as the source and overwrite all the data on the raid with garbage, or
> whatever was on the new drive.
>
> I think the right command to do a resync is /sbin/raidhotadd ?? or
> /sbin/raidaddspare . Can't remember now, but one of those commands will let you
> specify the md device and then the dead partition, e.g.
> /dev/md0
> drive 0 /dev/sda0
> drive 1 /dev/sdb0
> if /dev/sdb0 dies and needs to be rebuilt
> /sbin/raidhotadd /dev/md0 /dev/sdb0
> Done!
>
> cat /proc/mdstat should show the mirror being rebuilt.
Ah. Ok. I hadn't actually run the install yet, so those commands weren't
linked in.
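Once the tools are linked in, I gather the rebuild sequence from your description would be roughly the following (raidtools commands as you named them; the exact names are from memory on both our parts, so worth verifying against the installed binaries):

```shell
# Detach the failed mirror half, if the kernel hasn't already dropped it
raidhotremove /dev/md0 /dev/sdb0
# Re-add the replaced partition; the kernel should start the resync from /dev/sda0
raidhotadd /dev/md0 /dev/sdb0
# Watch the rebuild progress
cat /proc/mdstat
```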
> This should rebuild off the device /dev/sda0.
>
> Are you sure there is nothing else using the two spare drives you put in that
> might cause all those errors? e.g. swap.
Definitely not. In one of my tests there weren't even any partitions on
the drives. (I stopped doing that after a while and was testing with
100M partitions, since 4GB takes too long to resync for testing
purposes.)
The O/S is completely on hda; hdb and hdd are both empty and are being
used for testing the raid stuff.
Here is what it looks like in syslog:
-------------------------
Jun 3 15:51:30 triton kernel: hdb: hdb1
Jun 3 15:51:33 triton kernel: hdb: hdb1
Jun 3 15:52:02 triton kernel: hdd: hdd1
Jun 3 15:52:04 triton kernel: hdd: hdd1
Jun 3 15:52:28 triton kernel: bind<hdb1,1>
Jun 3 15:52:28 triton kernel: bind<hdd1,2>
Jun 3 15:52:28 triton kernel: hdd1's event counter: 00000000
Jun 3 15:52:28 triton kernel: hdb1's event counter: 00000000
Jun 3 15:52:28 triton kernel: md: md0: raid array is not clean -- starting background reconstruction
Jun 3 15:52:28 triton kernel: md0: max total readahead window set to 128k
Jun 3 15:52:28 triton kernel: md0: 1 data-disks, max readahead per data-disk: 128k
Jun 3 15:52:28 triton kernel: raid1: device hdd1 operational as mirror 1
Jun 3 15:52:28 triton kernel: raid1: device hdb1 operational as mirror 0
Jun 3 15:52:28 triton kernel: raid1: raid set md0 not clean; reconstructing mirrors
Jun 3 15:52:28 triton kernel: raid1: raid set md0 active with 2 out of 2 mirrors
Jun 3 15:52:28 triton kernel: md: updating md0 RAID superblock on device
Jun 3 15:52:28 triton kernel: hdd1 [events: 00000001](write) hdd1's sb offset: 102400
Jun 3 15:52:28 triton kernel: md: syncing RAID array md0
Jun 3 15:52:28 triton kernel: md: minimum _guaranteed_ reconstruction speed: 100 KB/sec.
Jun 3 15:52:28 triton kernel: md: using maximum available idle IO bandwith for reconstruction.
Jun 3 15:52:28 triton kernel: md: using 128k window.
Jun 3 15:52:28 triton kernel: hdb1 [events: 00000001](write) hdb1's sb offset: 104320
Jun 3 15:52:28 triton kernel: .
Jun 3 15:53:07 triton kernel: md: md0: sync done.
Jun 3 15:54:52 triton login: nneul login on tty2
Jun 3 15:55:00 triton ksu[850]: 'ksu troot' authenticated [EMAIL PROTECTED] for nneul on /dev/tty2
Jun 3 15:55:00 triton ksu[850]: Account troot: authorization for [EMAIL PROTECTED] successful
Jun 3 15:55:18 triton kernel: hdd: status error: status=0xff { Busy }
Jun 3 15:55:18 triton kernel: hdd: DMA disabled
Jun 3 15:55:18 triton kernel: hdd: drive not ready for command
Jun 3 15:55:48 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:55:48 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:55:48 triton kernel: hdd: drive not ready for command
Jun 3 15:56:18 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:56:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:56:18 triton kernel: end_request: I/O error, dev 16:41 (hdd), sector 82846
Jun 3 15:56:18 triton kernel: raid1: Disk failure on hdd1, disabling device.
Jun 3 15:56:18 triton kernel: Operation continuing on 1 devices
Jun 3 15:56:18 triton kernel: hdd: drive not ready for command
Jun 3 15:56:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:56:18 triton kernel: hdd: drive not ready for command
Jun 3 15:56:18 triton kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 3 15:56:18 triton kernel: md: recovery thread finished ...
Jun 3 15:56:48 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:56:48 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:56:48 triton kernel: hdd: drive not ready for command
Jun 3 15:57:18 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:57:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:57:18 triton kernel: end_request: I/O error, dev 16:41 (hdd), sector 82848
Jun 3 15:57:18 triton kernel: hdd: drive not ready for command
Jun 3 15:57:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:57:18 triton kernel: hdd: drive not ready for command
Jun 3 15:57:18 triton kernel: md: recovery thread got woken up ...
Jun 3 15:57:18 triton kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 3 15:57:18 triton kernel: md: recovery thread finished ...
<just continues to repeat the same thing over and over again>
-------------------------------------
The operation that was taking place was a "gtar -cvf /dev/md0 /usr". As
soon as I pulled the power on hdd, the tar stopped and it went into the
loop above.
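FWIW, with no spare configured, the array staying up degraded is the expected outcome of those messages. A quick way to spot it is that /proc/mdstat flags the failed member with (F) and shows fewer active members than configured, e.g. [2/1]. A minimal sketch of that check, using a sample mdstat line since the exact format varies by kernel version:

```shell
# Sample /proc/mdstat line for a degraded raid1 (assumed format, not captured from this box)
mdstat='md0 : active raid1 hdd1[1](F) hdb1[0] 102336 blocks [2/1] [U_]'

# The (F) marker indicates a failed member; on a real system you'd
# grep /proc/mdstat itself instead of this sample string.
if echo "$mdstat" | grep -q '(F)'; then
  echo "degraded"
fi
```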
-- Nathan