David Robinson wrote:
>
> It's best not to use mkraid --force, as it may resync from the dead drive. That's not
> a problem in testing/playing, but in production, if one drive fails and you put in a
> new one and then do a mkraid --force, there is the possibility that mkraid will use the
> replacement drive as the source and overwrite all the data on the raid with garbage, or
> whatever was on the new drive.
>
> I think the right command to do a resync is /sbin/raidhotadd ?? or
> /sbin/raidaddspare . Can't remember now, but one of those commands will let you
> specify the md device and then the dead partition, e.g.
> /dev/md0
> drive 0 /dev/sda0
> drive 1 /dev/sdb0
> if /dev/sdb0 dies and needs to be rebuilt
> /sbin/raidhotadd /dev/md0 /dev/sdb0
> Done!
>
> cat /proc/mdstat should show the mirror being rebuilt.
Ah. Ok. I hadn't actually run the install yet, so those commands weren't
linked in.
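Once the tools are linked in, I gather the rebuild sequence from your description would be roughly the following (raidtools commands as you named them; the exact names are from memory on both our parts, so worth verifying against the installed binaries):

```shell
# Detach the failed mirror half, if the kernel hasn't already dropped it
raidhotremove /dev/md0 /dev/sdb0
# Re-add the replaced partition; the kernel should start the resync from /dev/sda0
raidhotadd /dev/md0 /dev/sdb0
# Watch the rebuild progress
cat /proc/mdstat
```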
> This should rebuild off the device /dev/sda0.
>
> Are you sure there is nothing else using the two spare drives you put in that
> might cause all those errors? e.g. swap.
Definitely not. In one of my tests there weren't even any partitions on
the drives. (I stopped doing that after a while and was testing with
100M partitions, since 4GB takes too long to resync for testing
purposes.)
The O/S is completely on hda; hdb and hdd are both empty and are being
used for testing the raid stuff.
Here is what it looks like in syslog:
-------------------------
Jun 3 15:51:30 triton kernel: hdb: hdb1
Jun 3 15:51:33 triton kernel: hdb: hdb1
Jun 3 15:52:02 triton kernel: hdd: hdd1
Jun 3 15:52:04 triton kernel: hdd: hdd1
Jun 3 15:52:28 triton kernel: bind<hdb1,1>
Jun 3 15:52:28 triton kernel: bind<hdd1,2>
Jun 3 15:52:28 triton kernel: hdd1's event counter: 00000000
Jun 3 15:52:28 triton kernel: hdb1's event counter: 00000000
Jun 3 15:52:28 triton kernel: md: md0: raid array is not clean -- starting background reconstruction
Jun 3 15:52:28 triton kernel: md0: max total readahead window set to 128k
Jun 3 15:52:28 triton kernel: md0: 1 data-disks, max readahead per data-disk: 128k
Jun 3 15:52:28 triton kernel: raid1: device hdd1 operational as mirror 1
Jun 3 15:52:28 triton kernel: raid1: device hdb1 operational as mirror 0
Jun 3 15:52:28 triton kernel: raid1: raid set md0 not clean; reconstructing mirrors
Jun 3 15:52:28 triton kernel: raid1: raid set md0 active with 2 out of 2 mirrors
Jun 3 15:52:28 triton kernel: md: updating md0 RAID superblock on device
Jun 3 15:52:28 triton kernel: hdd1 [events: 00000001](write) hdd1's sb offset: 102400
Jun 3 15:52:28 triton kernel: md: syncing RAID array md0
Jun 3 15:52:28 triton kernel: md: minimum _guaranteed_ reconstruction speed: 100 KB/sec.
Jun 3 15:52:28 triton kernel: md: using maximum available idle IO bandwith for reconstruction.
Jun 3 15:52:28 triton kernel: md: using 128k window.
Jun 3 15:52:28 triton kernel: hdb1 [events: 00000001](write) hdb1's sb offset: 104320
Jun 3 15:52:28 triton kernel: .
Jun 3 15:53:07 triton kernel: md: md0: sync done.
Jun 3 15:54:52 triton login: nneul login on tty2
Jun 3 15:55:00 triton ksu[850]: 'ksu troot' authenticated [EMAIL PROTECTED] for nneul on /dev/tty2
Jun 3 15:55:00 triton ksu[850]: Account troot: authorization for [EMAIL PROTECTED] successful
Jun 3 15:55:18 triton kernel: hdd: status error: status=0xff { Busy }
Jun 3 15:55:18 triton kernel: hdd: DMA disabled
Jun 3 15:55:18 triton kernel: hdd: drive not ready for command
Jun 3 15:55:48 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:55:48 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:55:48 triton kernel: hdd: drive not ready for command
Jun 3 15:56:18 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:56:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:56:18 triton kernel: end_request: I/O error, dev 16:41 (hdd), sector 82846
Jun 3 15:56:18 triton kernel: raid1: Disk failure on hdd1, disabling device.
Jun 3 15:56:18 triton kernel: Operation continuing on 1 devices
Jun 3 15:56:18 triton kernel: hdd: drive not ready for command
Jun 3 15:56:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:56:18 triton kernel: hdd: drive not ready for command
Jun 3 15:56:18 triton kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 3 15:56:18 triton kernel: md: recovery thread finished ...
Jun 3 15:56:48 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:56:48 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:56:48 triton kernel: hdd: drive not ready for command
Jun 3 15:57:18 triton kernel: ide1: reset timed-out, status=0x80
Jun 3 15:57:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:57:18 triton kernel: end_request: I/O error, dev 16:41 (hdd), sector 82848
Jun 3 15:57:18 triton kernel: hdd: drive not ready for command
Jun 3 15:57:18 triton kernel: hdd: status timeout: status=0x80 { Busy }
Jun 3 15:57:18 triton kernel: hdd: drive not ready for command
Jun 3 15:57:18 triton kernel: md: recovery thread got woken up ...
Jun 3 15:57:18 triton kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 3 15:57:18 triton kernel: md: recovery thread finished ...
<just continues to repeat the same thing over and over again>
-------------------------------------
The operation that was taking place was a "gtar -cvf /dev/md0 /usr". As
soon as I pulled the power on hdd, the tar stopped and it went into the
loop above.
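FWIW, with no spare configured, the array staying up degraded is the expected outcome of those messages. A quick way to spot it is that /proc/mdstat flags the failed member with (F) and shows fewer active members than configured, e.g. [2/1]. A minimal sketch of that check, using a sample mdstat line since the exact format varies by kernel version:

```shell
# Sample /proc/mdstat line for a degraded raid1 (assumed format, not captured from this box)
mdstat='md0 : active raid1 hdd1[1](F) hdb1[0] 102336 blocks [2/1] [U_]'

# The (F) marker indicates a failed member; on a real system you'd
# grep /proc/mdstat itself instead of this sample string.
if echo "$mdstat" | grep -q '(F)'; then
  echo "degraded"
fi
```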
-- Nathan