Thanks guys, I did a backup and rebuilt on the new disks. The 3rd/final disk
is now rebuilding, so I guess I should be happy.
About the one error every 13TB, I guess we'll have to wait for btrfs :)
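For the curious, the ~13TB figure follows directly from the quoted 1-in-10^14-bits unrecoverable read error rate:

```shell
# URE rate of 1 per 10^14 bits read:
# 10^14 bits / 8 bits-per-byte / 10^12 bytes-per-TB = 12.5 TB per expected error
awk 'BEGIN { printf "%.1f\n", 1e14 / 8 / 1e12 }'   # prints 12.5
```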

More seriously now: the raid-5 array of new disks will be my primary
storage, leaving the three older disks idle. I would like to use the old
ones as a backup for the new array. My plan is to assemble the old disks
into a second raid-5 unit, take a daily LVM snapshot of the primary unit,
then 'dump' the filesystems onto the backup raid unit. Any better
suggestions?
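Roughly what I have in mind, as a sketch; the volume group, LV name, and backup mount point below are made-up placeholders, not my real setup:

```shell
#!/bin/sh
# Daily backup sketch: snapshot the primary LV, dump it to the backup
# array, then drop the snapshot. All names here are placeholders.
set -e

VG=vg_primary            # assumed volume group on the new raid-5 unit
LV=data                  # assumed logical volume to back up
BACKUP=/mnt/backup       # assumed mount point of the old-disks raid unit

# Create a snapshot so dump sees a frozen, consistent filesystem
lvcreate --size 5G --snapshot --name ${LV}_snap /dev/$VG/$LV

# Level-0 dump of the snapshot onto the backup unit
dump -0 -f $BACKUP/${LV}-$(date +%F).dump /dev/$VG/${LV}_snap

# Remove the snapshot so it doesn't fill up and get invalidated
lvremove -f /dev/$VG/${LV}_snap
```

The snapshot size (5G here) just has to absorb the writes that happen on the origin while the dump runs.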

Actually, I'm not sure whether the backup unit should be raid-5 at all,
knowing that raid-5 is slow for writing!

Kostas, about your performance problems: make sure you have the latest
firmware; AFAIK the release notes mention performance improvements for ext3
specifically. I was too scared to re-flash my own card though ;)
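For anyone else setting this up, the verify/smartd advice Kostas gives below might look something like this on a 3ware card; the controller, unit, and port numbers are examples, use whatever tw_cli shows on your box:

```shell
# Kick off a verify on unit 0 of controller 0 (or run this from cron
# every few days, as Kostas suggests)
tw_cli /c0/u0 start verify

# Check each disk's SMART data through the 3ware controller; with
# "-d 3ware,N" the N is the port number, /dev/twa0 the controller device
smartctl -d 3ware,0 -a /dev/twa0
smartctl -d 3ware,1 -a /dev/twa0
smartctl -d 3ware,2 -a /dev/twa0

# /etc/smartd.conf entries so smartd mails a warning before a rebuild
# finds the bad blocks for you (device and address are placeholders):
#   /dev/twa0 -d 3ware,0 -a -m root@localhost
#   /dev/twa0 -d 3ware,1 -a -m root@localhost
#   /dev/twa0 -d 3ware,2 -a -m root@localhost
```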

Regards

On Sat, May 3, 2008 at 2:22 PM, Kostas Georgiou <[EMAIL PROTECTED]>
wrote:

> On Thu, May 01, 2008 at 09:35:19PM +0300, Ahmed Kamal wrote:
>
> > Hello,
> > I'm working on a server with a 3w-9550SX controller, with 3x500G disks
> > in a raid-5 and 1x500G hot spare. One night, a disk fails, and the
> > server crashes! Working on the server, I see that many filesystems
> > were destroyed beyond repair!! This was too bad to hear. Some LVM
> > volumes were repaired, others were restored from backup. The bad disk
> > was removed. I learnt that 3ware controllers aren't really high
> > quality, and they probably corrupt the FSs.
> >
> > Since all disks are the same age, I thought I'd buy new disks to
> > replace the old ones. I bought 4x500G barracuda-ES drives, which
> > should be high quality. Here lies my problem. I need to replace the 3
> > running disks with 3 new disks, and add an extra one as hot spare. I
> > am scared to do that, because the standard way is to "fail" a disk,
> > rebuild on a new one, then repeat for the other 2 disks till all 3 are
> > replaced. Now this puts me in a vulnerable situation: if I "fail" a
> > disk, and while rebuilding another disk naturally fails, all data is
> > gone! Is there any other "wise" way to do what I want safely? I
> > contacted 3w support, and they just insist I should fail/rebuild, but
> > since I don't have much faith in their controllers or the old disks
> > ... any smarter way to do this?
>
> I have some 3ware controllers as well, and while I can't say they are
> the best (performance is horrible in many cases), I have never lost any
> data unless I had two dead disks in a RAID5.
>
> The most common reason for a rebuild to fail is that one of your
> remaining disks in the raid has a fault (bad blocks). The best way to
> deal with this is to have the 3ware card run a verify task every few
> days. Also have smartd running to monitor the disks so you get a
> warning.
>
> So before your rebuilds, *backup* your data if you can. Run a verify
> task so the controller/disks have a chance to correct any existing
> errors, check your disks with smartctl, and start with the one with the
> most bad blocks (if any).
>
> Sadly most SATA disks have an unrecoverable read error rate of 1/10^14
> or so, which means that statistically you'll get one error every ~13TB
> that you read. So during every rebuild you'll have roughly a 1/13
> chance to lose a block, whatever you do :(
>
> Cheers,
> Kostas
>
> _______________________________________________
> rhelv5-list mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/rhelv5-list
>