Re: Disk failure during grow, what is the current state.

2008-02-06 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Wed, 6 Feb 2008 12:58:55 -
From: Steve Fairbairn [EMAIL PROTECTED]
Reply-To: Steve Fairbairn [EMAIL PROTECTED]
Subject: Disk failure during grow, what is the current state.
To: linux-raid@vger.kernel.org



As you can see, one of the original 5 devices (sdd1) has failed and been
automatically removed.  The reshape has stopped, but the new disk seems
to be in and clean which is the bit I don't understand.  The new disk
hasn't been added to the size, so it would seem that md has switched it
to being used as a spare instead (possibly as the grow hadn't
completed?).

How come it seems to have recovered so nicely?
Is there something I can do to check its integrity?
Was it just so much quicker than 2 days because it switched to only
having to sort out the 1 disk? Would it be safe to run an fsck to check
the integrity of the fs?  I don't want to inadvertently blat the raid
array by 'using' it when it's in a dodgy state.

I have unmounted the drive for the time being, so that it doesn't get
any writes until I know what state it is really in.



- End message from [EMAIL PROTECTED] -

If a drive fails during a reshape, the reshape will just continue.
The blocks which were on the failed drive are calculated from the
other disks, and writes to the failed disk are simply omitted.
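
To illustrate how a missing block can be calculated from the other disks, here is a minimal Python sketch of RAID5-style XOR parity. It is only a toy model of the idea, not the md driver's code; the function name `xor_blocks` and the sample byte strings are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe with four hypothetical data blocks; parity is the XOR of all of them.
data = [
    b"\x01\x02\x03\x04",
    b"\x10\x20\x30\x40",
    b"\xaa\xbb\xcc\xdd",
    b"\x0f\x0e\x0d\x0c",
]
parity = xor_blocks(data)

# Pretend the disk holding data[2] has failed: XOR the surviving data
# blocks with the parity block to recover what was on it.
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[2])
```

Because XOR is its own inverse, the same routine produces the parity and rebuilds any single missing block, which is why an array with one failed drive can still be read but has no redundancy left.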

The result is a raid5 with a failed drive.
You should get a new drive asap to restore the redundancy.
Also it's kinda important that you don't run kernel 2.6.23, because it
has a nasty bug which would be triggered in this scenario.
The reshape probably increased in speed after the system was no longer
actively used and I/O bandwidth was freed up.
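
A quick way to check for the kernel version mentioned above is to look at the release string. The Python sketch below assumes that any release beginning with 2.6.23 (including point releases such as 2.6.23.1) should be treated as suspect; that is a cautious assumption, not something stated in this thread. The helper name `is_affected_kernel` is hypothetical.

```python
import platform


def is_affected_kernel(release):
    """Return True if the release string belongs to the 2.6.23 series.

    Assumption: every 2.6.23.x release is treated as affected.
    """
    # Drop any suffix such as "-rc1" or "-generic", then compare the
    # first three numeric components.
    version = release.split("-")[0].split(".")
    return version[:3] == ["2", "6", "23"]


if __name__ == "__main__":
    release = platform.release()
    if is_affected_kernel(release):
        print("Kernel %s is in the 2.6.23 series" % release)
    else:
        print("Kernel %s is not in the 2.6.23 series" % release)
```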

Kind regards,
Alex.








RE: Disk failure during grow, what is the current state.

2008-02-06 Thread Steve Fairbairn
  -Original Message-
  From: Steve Fairbairn [mailto:[EMAIL PROTECTED]
  Sent: 06 February 2008 15:02
  To: 'Nagilum'
  Subject: RE: Disk failure during grow, what is the current state.
  
  
     Array Size : 1953535744 (1863.04 GiB 2000.42 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
  
  Surely the added disk should now have been added to the Array
  Size?  5 * 500GB is 2500GB, not 2000GB.  This is why I don't
  think the reshape has continued.  As for speeding up because
  I/O bandwidth was freed up, that doesn't really hold either,
  because the system was already not being used before I added
  the disk, and I didn't unmount the drive until this morning,
  after it claimed it had finished doing anything.
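
The size figures quoted above can be checked with a little arithmetic. The Python sketch below assumes the sizes are reported in KiB (which matches the GiB values printed next to them) and that the grow was from 5 to 6 devices, which the thread implies but does not state outright. The function name `raid5_capacity_kib` is made up for the example.

```python
def raid5_capacity_kib(num_devices, used_dev_size_kib):
    """Usable RAID5 capacity: one device's worth of space holds parity."""
    return (num_devices - 1) * used_dev_size_kib


USED_DEV_SIZE_KIB = 488383936  # "Used Dev Size" from the quoted output

before_grow = raid5_capacity_kib(5, USED_DEV_SIZE_KIB)
after_grow = raid5_capacity_kib(6, USED_DEV_SIZE_KIB)

print(before_grow)  # 1953535744, the "Array Size" quoted above
print(after_grow)   # 2441919680, about 2500 GB
```

Under those assumptions, the reported Array Size is exactly the 5-device figure, consistent with a reshape that had not yet completed.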
  

Thanks again to Alex for his comments.  I've just rebooted the box, and
the reshape has continued on the degraded array and an RMA has been
raised for the faulty disk.

Thanks,

Steve.

