If you have a spare disk I would suggest trying the rebuild on that first. I'm 
always wary of upgrading drivers or firmware while the RAID volume is degraded.
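
Before touching anything I'd at least snapshot the current state so you know 
exactly what you started with. A minimal sketch, assuming OMSA's command-line 
tools are installed and the controller is number 0 (adjust to your setup):

    # Record controller, virtual disk and member disk state before
    # any firmware work (controller=0 is an assumption):
    omreport storage controller
    omreport storage vdisk controller=0
    omreport storage pdisk controller=0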

If you are going to do the FW upgrade from a boot floppy/CD, you could always 
pull the disks out before you do the actual upgrade (I haven't done this in 
years).

What method did the tech describe for breaking the RAID-ness? And how would 
you end up with two identical sets of data when one set is obviously out of 
sync from the constant fail/rebuild cycle?

Another option I would try before a FW upgrade would be to take the disk 
that's not rebuilding, put it in another slot, initialize it so that it is 
completely wiped, then put it back in the original slot and see if it suffers 
the same issue.
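
For the wipe itself, if the disk is visible to an OS outside the array, a 
rough sketch like this would clear any metadata (the /dev/sdb name is purely 
an assumption -- triple-check the device, this is destructive):

    # DESTRUCTIVE: zero the start of the disk (/dev/sdb is assumed!)
    dd if=/dev/zero of=/dev/sdb bs=1M count=64
    # LSI metadata can also live at the end of the disk, so zero the
    # last 64MB as well (blockdev reports size in 512-byte sectors):
    SECTORS=$(blockdev --getsz /dev/sdb)
    dd if=/dev/zero of=/dev/sdb bs=512 seek=$((SECTORS - 131072)) count=131072

Initializing it through the controller BIOS would achieve much the same thing 
if you'd rather not do it from a live OS.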

Like I said, I don't like doing FW upgrades when the array is messed up.

I am paranoid though.

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of david o'donnell
Sent: 05 August 2009 23:01
To: [email protected]
Subject: Re: R200 SAS 6/iR RAID-1 failing to rebuild

I don't see why you would have to kill off the RAID-ness before doing the 
firmware or driver update!

This does not seem correct to me; I would go back and double-check that. I 
just did a firmware upgrade on a PERC without trouble. It is an older 
controller, but I would think the same principles apply.

Before you do anything, I would take several backups of the data, preferably 
a few to disk as well.
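
Even a quick rsync of the important bits to another box is better than 
nothing -- a minimal sketch, with the hostname and paths as placeholders only:

    # One-off copy of the data to another machine (host and paths
    # are placeholders -- substitute your own):
    rsync -aHx --numeric-ids /srv/data/ backuphost:/backups/r200/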


Message: 5
Date: Wed, 5 Aug 2009 17:22:41 +0100
From: "Faris Raouf" <[email protected]>
Subject: R200 SAS 6/iR RAID-1 failing to rebuild
To: <[email protected]>
Message-ID: <009301ca15e8$f5257870$df7069...@net>
Content-Type: text/plain;    charset="us-ascii"

Hi all,

Although this will start a bit OT, I'm going to get on-topic in a bit.

A quick look through the logs on one of our R200s has flagged what looks
like a failing HD in our RAID-1 mirrored pair. Essentially I'm seeing
several entries like this:

Aug  1 16:12:13 vzbeta kernel: mptbase: ioc0: RAID STATUS CHANGE for PhysDisk 0 id=1
Aug  1 16:12:13 vzbeta kernel: mptbase: ioc0:   PhysDisk is now missing
Aug  1 16:12:13 vzbeta kernel: mptbase: ioc0: RAID STATUS CHANGE for PhysDisk 0 id=1
Aug  1 16:12:15 vzbeta kernel: mptbase: ioc0:   PhysDisk is now missing, out of sync
Aug  1 16:12:19 vzbeta kernel: mptbase: ioc0: RAID STATUS CHANGE for VolumeID 0
Aug  1 16:12:19 vzbeta kernel: mptbase: ioc0:   volume is now degraded, enabled
Aug  1 16:12:19 vzbeta kernel: mptbase: ioc0: RAID STATUS CHANGE for PhysDisk 0 id=1
Aug  1 16:12:19 vzbeta kernel: mptbase: ioc0:   PhysDisk is now online, out of sync
Aug  1 16:12:19 vzbeta kernel: mptbase: ioc0: Initiating recovery
Aug  1 16:12:19 vzbeta kernel: sd 0:1:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 0, sc=ffff810234622500, mf = ffff81023e382b00, idx=6
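
For reference, these entries are easy to pull out of syslog, assuming the
stock CentOS location:

    # Show the controller's RAID status changes from the system log
    grep 'mptbase: ioc0: RAID STATUS CHANGE' /var/log/messages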


Having a look in OMSA shows the array rebuilding; then, after a few percent
complete, it starts again from 0%. In other words, it looks like it is
trying to rebuild, failing, then trying again, over and over.
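
The rebuild progress can also be watched from the shell -- a sketch, assuming
OMSA's CLI is installed and the controller is number 0:

    # Poll the virtual disk state and rebuild progress every 30s
    # (controller=0 is an assumption):
    watch -n 30 omreport storage vdisk controller=0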


I've just had a chat with a nice Dell support peep and he's asked me to
upgrade the firmware on the PERC (I'm one release behind -- not sure how
that happened) and to upgrade the driver before we do anything else (e.g.
replace the drive).

The key step that's worrying me is that before I can upgrade the firmware
and driver I apparently need to basically kill off the RAID-ness on the
mirrored pair, so I end up with two individual hard disks.

Has anyone done this? What I really want to know is will doing this result
in total data loss?

My thinking is that since these are mirrored pairs I should end up with two
ordinary hard disks with identical data on them (even if one probably
doesn't work) when I delete the virtual volume. But I'd love to know if this
is really what's going to happen. Can anyone advise?
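
If I do end up with two plain disks, I suppose one sanity check would be to
mount the good one read-only before trusting it -- a sketch, with the device
and mount point as assumptions:

    # Mount the surviving half of the mirror read-only to confirm
    # the filesystem is intact (/dev/sdb1 and /mnt/check assumed):
    mkdir -p /mnt/check
    mount -o ro /dev/sdb1 /mnt/check
    ls /mnt/check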

Once I've done that I'm happy with doing the firmware update, but not so
happy about doing the driver update. This is where we get back on topic.

A while ago I think someone posted something about some kind of problem with
the latest DKMS? I can't remember the details. Am I imagining things, or is
there an issue? We use CentOS 5.3 on these systems.
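
In case it matters, the DKMS state on a box can be listed with:

    # List DKMS-managed modules and their build/install status
    dkms status
    # and the installed dkms package version on CentOS:
    rpm -q dkms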

Any advice will be very much appreciated.


Thanks,

Faris.



_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq



