Re: FailSpare event?

2007-01-11 Thread Mike Hardy

google BadBlockHowto

Any "just google it" response sounds glib, but this is actually how to
do it :-)

If you're new to md and mdadm, don't forget to actually remove the drive
from the array before you start working on it with 'dd'.
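Something like this, with hypothetical names, before you touch the disk with dd:

mdadm /dev/md0 --fail /dev/sdc1     # mark the suspect member faulty (if md hasn't already)
mdadm /dev/md0 --remove /dev/sdc1   # pull it out of the array
# now follow the BadBlockHowto and work on /dev/sdc with dd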

-Mike

Mike wrote:
 On Fri, 12 Jan 2007, Neil Brown might have said:
 
 On Thursday January 11, [EMAIL PROTECTED] wrote:
 So I'm ok for the moment? Yes, I need to find the error and fix everything
 back to the (S) state.
 Yes, OK for the moment.

 The messages in $HOST:/var/log/messages for the time of the email are:

 Jan 11 16:04:25 elo kernel: sd 2:0:4:0: SCSI error: return code = 0x802
 Jan 11 16:04:25 elo kernel: sde: Current: sense key: Hardware Error
 Jan 11 16:04:25 elo kernel: Additional sense: Internal target failure
 Jan 11 16:04:25 elo kernel: Info fld=0x10b93c4d
 Jan 11 16:04:25 elo kernel: end_request: I/O error, dev sde, sector 280575053
 Jan 11 16:04:25 elo kernel: raid5: Disk failure on sde2, disabling device. Operation continuing on 5 devices
 Given the sector number it looks likely that it was a superblock
 update.
 No idea how bad an 'internal target failure' is.  Maybe powercycling
 the drive would 'fix' it, maybe not.

 On AIX boxes I can blink the drives to identify a bad/failing device. Is 
 there
 a way to blink the drives in linux?
 Unfortunately not.

 NeilBrown

 
 I found the smartctl command. I have a 'long' test running in the background.
 I checked this drive and the other drives. This drive has been used the least
 (confirms it is a spare?) and is the only one with 'Total uncorrected errors' > 0.
 
 How do I determine the error, correct the error, or clear the error?
 
 Mike
 
 [EMAIL PROTECTED] ~]# smartctl -a /dev/sde
 smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
 Home page is http://smartmontools.sourceforge.net/
 
 Device: SEAGATE  ST3146707LC  Version: D703
 Serial number: 3KS30WY8
 Device type: disk
 Transport protocol: Parallel SCSI (SPI-4)
 Local Time is: Thu Jan 11 17:00:26 2007 CST
 Device supports SMART and is Enabled
 Temperature Warning Enabled
 SMART Health Status: OK
 
 Current Drive Temperature: 48 C
 Drive Trip Temperature:68 C
 Elements in grown defect list: 0
 Vendor (Seagate) cache information
   Blocks sent to initiator = 66108
   Blocks received from initiator = 147374656
   Blocks read from cache and sent to initiator = 42215
   Number of read and write commands whose size <= segment size = 12635583
   Number of read and write commands whose size > segment size = 0
 Vendor (Seagate/Hitachi) factory information
   number of hours powered up = 3943.42
   number of minutes until next internal SMART test = 94
 
 Error counter log:
            Errors Corrected by           Total   Correction     Gigabytes    Total
                ECC          rereads/      errors   algorithm      processed    uncorrected
            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
 read:        354        0         0       354        354          0.546           0
 write:         0        0         0         0          0        185.871           1
 
 Non-medium error count:0
 
 SMART Self-test log
 Num  Test              Status                      segment  LifeTime  LBA_first_err [SK ASC ASQ]
      Description                                   number   (hours)
 # 1  Background long   Completed, segment failed   -        3943      - [-   -    -]
 
 Long (extended) Self Test duration: 2726 seconds [45.4 minutes]
 


Re: RAID1 repair issue with 2.6.16.36 kernel

2007-01-08 Thread Mike Hardy


Michel Lespinasse wrote:
 Hi,
 
 I'm hitting a small issue with a RAID1 array and a 2.6.16.36 kernel.
 
 Debian's mdadm package has a checkarray process which runs monthly and
 checks the RAID arrays. Among other things, this process does an
 echo check > /sys/block/md1/md/sync_action. Looking into my RAID1
 array, I noticed that /sys/block/md1/md/mismatch_cnt was set to 128 -
 so there is a small amount of unsynchronized blocks in my RAID1 partition.
 
 I tried to fix the issue by writing repair into /sys/block/md1/md/sync_action
 but the command was refused:
 
 # cat /sys/block/md0/md/sync_action
 idle
 # echo repair > /sys/block/md1/md/sync_action
 echo: write error: invalid argument
 
 I looked at the sources for my kernel (2.6.16.36) and noticed that in md.c
 action_store(), the following code rejects the repair action (but accepts
 everything else and treats it as a repair):
 
 if (cmd_match(page, "check"))
         set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
 else if (cmd_match(page, "repair"))
         return -EINVAL;
 
 So I tried to issue a repair the hacky way:
 
 # echo asdf > /sys/block/md1/md/sync_action
 # cat /sys/block/md1/md/sync_action
 repair
 # cat /proc/mdstat
 Personalities : [raid1]
 ...
 md1 : active raid1 hdg2[1] hde2[0]
   126953536 blocks [2/2] [UU]
   [==..]  resync = 14.2% (18054976/126953536) finish=53.7min speed=33773K/sec
 ...
 unused devices: <none>
 # ... wait one hour ...
 # cat /sys/block/md1/md/sync_action
 idle
 # cat /sys/block/md1/md/mismatch_cnt
 128
 
 The kernel (still 2.6.16.36) reports it has repaired the array, but another
 check still shows 128 mismatched blocks:
 
 # echo check > /sys/block/md1/md/sync_action
 # cat /sys/block/md1/md/sync_action
 check

When I did the check, while I still had mismatches (and a SMART test was
failing, so the drive definitely had problems) I didn't notice the error
count going up on the drive, which I thought was odd and probably a bug.

 # ... wait one hour ...
 # cat /sys/block/md1/md/mismatch_cnt
 128

I had the same problem with mismatch_cnt not decreasing. It seems to me
that either it shouldn't be a counter (i.e. each mismatch should be
associated with a block, and the count should be decreased when that
block checks out in the future), or the mismatch and error count should
be cleared out when a repair or check is run.

If it doesn't ever go back to zero though, it will be very difficult to
write a reliable monitor for array health based on those files. I'm not
sure it could ever be made perfectly reliable actually, so those files
end up not being useful.

It's clear that something was done in the repair step though, as a SMART
test on the drive worked after that.

 So I'm a bit confused about how to proceed now...

Well, the way I proceeded, since it didn't seem to me that I could rely
on the array mismatch count or per-drive error counts was to fail the
drive out of the array and re-add it.

Everything was reset then.
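In concrete terms, roughly this (using your md1/hdg2 as the example; pick
whichever member you trust less):

mdadm /dev/md1 --fail /dev/hdg2
mdadm /dev/md1 --remove /dev/hdg2
mdadm /dev/md1 --add /dev/hdg2    # the full resync rewrites it from the remaining disk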

 
 I looked at the source for debian's linux-2.6_2.6.18-8 kernel and I see
 that the issue with the inverted cmd_match(page, repair) condition is
 fixed there. So I assume you guys found this issue sometime between 2.6.16
 and 2.6.18.
 
 Would you by any chance also know why the repair process did not work
 with 2.6.16.36? Has any related bug been fixed recently? Should I
 try again with a newer kernel, or should I rather avoid this for now?
 
 Assuming the fix is small, is there any reason not to backport it into
 2.6.16.x ?
 
 I would be grateful for any suggestions.
 
 Thanks,
 

-Mike


Re: raidreconf for 5 x 320GB - 8 x 320GB

2006-11-17 Thread Mike Hardy

You don't want to use raidreconf unless I'm misunderstanding your goal -
I have also had success with raidreconf but have had data-loss failures
as well (I've posted to the list about it if you search). The data-loss
failures were after I had run tests that showed me it should work.

raidreconf is no longer maintained, so it's a dead end to try to hunt
down the failures.

Luckily, Neil Brown has added raid5 reshape support (same thing
raidreconf did) to the md driver, so you can just use 'mdadm --grow'
commands to do what you want.
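The general shape of it, with hypothetical device names (and as below, try it
on scratch partitions first):

mdadm --add /dev/md0 /dev/sdf1 /dev/sdg1 /dev/sdh1   # new disks come in as spares
mdadm --grow /dev/md0 --raid-devices=8               # kick off the 5 -> 8 reshape
# watch /proc/mdstat; when the reshape finishes, grow the filesystem (resize2fs or similar)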

So I'd say to update your kernel to the newest 2.6.18.xx or whatever is
out, update mdadm, and give that a shot with your test partitions. The
new versions are working fine as near as I can tell, and I've got them
in use (FC5 machines - you can see their versions, and call me foolish
for putting FC in production if you want) in a production environment
with no issues.

-Mike

Timo Bernack wrote:
 Hi there,
 
 i am running a 5-disk RAID5 using mdadm on a suse 10.1 system. As the
 array is running out of space, i consider adding three more HDDs. Before
 i set up the current array, i made a small test with raidreconf:
 
 - build a 4-disk RAID5 /dev/md0 (with only 1.5gb for each partition)
 with 4.5gb userspace in total
 - put an ext3 filesystem on it
 - copy some data to it -- some episodes of American Dad ;-)
 - use raidreconf to add a 5th disk
 - use resize2fs to make use of the new additional space
 - check for the video-clips.. all fine (also compared checksums)
 
 This test was a full success, but of course it was very small scaled, so
 maybe there are issues that only come up when there is (much) more space
 involved. That leads to my questions:
 
 What are potential sources for failures (and thus, losing all data)
 reconfiguring the array using the method described above? Loss of power
 during the process (which would take quite some time, 24 hours minimum,
 i think) is one of them, i suppose. But are there known issues with
 raidreconf, concerning the 2TB-barrier, for example?
 
 I know that raidreconf is quite outdated, but it did what it promised on
 my system. I heard of the possibility to achieve the same result just by
 using mdadm, but this required a newer version of mdadm, and upgrading
 it and using a method that i can't test beforehand scares me a little --
 a little more than letting out raidreconf on my precious data does ;-).
 
 All comments will be greatly appreciated!
 
 
 Timo
 
 P.S.:
 I do have a backup, but since it is scattered to a huge stack of CDs /
 DVDs (about 660 disks) it would be a terrible pain-in-the-ass to be
 forced to restore it again. In fact, getting away from storing my data
 using a DVD-burner was the main reason to build up the array at all. It
 took me about 1 week (!) to copy all these disks, as you can easily
 imagine.
 
 -
 Hardware:
 - Board / CPU: ASUS M2NPV-VM (4 x S-ATA onboard) / AMD Sempron 3200+ AM2
 - Add. S-ATA-Controller: Promise SATA300 TX4
 - HDDs: 5 x Western Digital Caviar SE 320GB SATA II (WD3200JS)
 
 Software (OpenSUSE 10.1 Default-Installation):
 - Kernel: 2.6.16
 - mdadm - v2.2 - 5 December 2005
 -


Re: Checking individual drive state

2006-11-05 Thread Mike Hardy


dean gaudet wrote:
 On Sun, 5 Nov 2006, Bradshaw wrote:


 I don't know how to scan the one disk for bad sectors, stopping the array and
 doing an fsck or similar throws errors, so I need help in determining whether
 the disc itself is faulty.
 
 try swapping the cable first.  after that swap ports with another disk and 
 see if the problem follows the port or the disk.
 
 you can see if smartctl -a (from smartmontools) tells you anything 
 interesting.  (it can be quite difficult, to impossible, to understand 
 smartctl -a output though.  but if you've got errors in the SMART error 
 log that's a good place to start.)

I don't think SMART output is that hard to understand.

And checking the entire drive for errors is as easy as 'smartctl -t long
/dev/drive' usually. If it is SATA as you say, you may need to put a
'-d ata' in there.

Wait for however long it says to wait, then do a 'smartctl -a
/dev/drive' and you should see the self test log at the bottom. Did it
finish? If not, there are bad sectors. If there are bad sectors, you
should google the string 'BadBlockHowTo' to see if you can clear them
(after failing the drive out of the array).
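In other words, roughly this (device name hypothetical):

smartctl -d ata -t long /dev/sda   # '-d ata' only needed for SATA
# ...wait however many minutes smartctl tells you...
smartctl -a /dev/sda               # the self-test log at the bottom shows pass/fail and the first bad LBA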

Note that this won't tell you anything about cables or controllers or
power or anything else that could and may be wrong. It's just for the
drive media and firmware.

-Mike


Re: New features?

2006-10-31 Thread Mike Hardy


Neil Brown wrote:
 On Tuesday October 31, [EMAIL PROTECTED] wrote:

 1 Warm swap - replacing drives without taking down the array but maybe
 having to type in a few commands. Presumably a sata or sata/raid
 interface issue. (True hot swap is nice but not worth delaying warm-
 swap.)
 
 I believe that 2.6.18 has SATA hot-swap, so this should be available
  now ... providing you can find out what commands to use.

I forgot that 2.6.18 has SATA hot-swap; has anyone tested that?

FWIW, SCSI (or SAS now, using SCSI or SATA drives) has full hot-swap
with completely online drive exchanges. I have done this on recent
kernels in production and it works.

 
 2 Adding new disks to arrays. Allows incremental upgrades and to take
 advantage of the hard disk equivalent of Moore's law.
 
 Works for raid5 and linear.  Raid6 one day.

Also works for raid1!


 4. Uneven disk sizes, eg adding a 400GB disk to a 2x200GB mirror to
 create a 400GB mirror. Together with 2 and 3, allows me to continuously
 expand a disk array.
 
 So you have a RAID1 (md) from sda and sdb, both 200GB, and you now have a
 sdc which is 400GB.
 So
mdadm /dev/md0 -a /dev/sdc
mdadm /dev/md0 -f /dev/sda
mdadm /dev/md0 -r /dev/sda
# wait for recovery

Could be:

mdadm /dev/md0 -a /dev/sdc
mdadm --grow /dev/md0 --raid-devices=3 # 3-disk mirror
# wait for recovery
# don't forget grub-install /dev/sda (or similar)!
mdadm /dev/md0 -f /dev/sda
mdadm /dev/md0 -r /dev/sda
mdadm --grow /dev/md0 --raid-devices=2 # 2-disk again

# Run a 'smartctl -d ata -t long /dev/sdb' before next line...

mdadm /dev/md0 -f /dev/sdb
mdadm /dev/md0 -r /dev/sdb
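# (the linear concat of the two old 200GB disks becomes the second 400GB mirror half)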
mdadm -C /dev/md1 -l linear -n 2 /dev/sda /dev/sdb
mdadm /dev/md0 -a /dev/md1
# wait for recovery
mdadm --grow /dev/md0 --size=max
 
 You do run with a degraded array for a while, but you can do it
 entirely online.
 It might be possible to decrease the time when the array is degraded,
 but it is too late at night to think about that.

All I did was decrease the degradation time, but hey it could help. And
don't forget the long SMART test before running degraded for real. Could
save you some pain.

-Mike


Re: future hardware

2006-10-21 Thread Mike Hardy


Justin Piszcz wrote:

 cards perhaps.  Or, after reading that article, consider SAS maybe..?


I hate to be the guy that breaks out the unsubstantiated anecdotal
evidence, but I've got a RAID10 with 4x300GB Maxtor SAS drives, and I've
already had two trigger their internal SMART "I'm about to fail" message.

They've been in service now for around 2 months, and they do have an
okay temperature, and I have not been beating the crap out of them.

More than a little disappointing.

They are fast though...

-Mike


Re: MegaRaid problems..

2006-10-12 Thread Mike Hardy


Gordon Henderson wrote:
 This might not be strictly on-topic here, but you may provide
 enlightenment, as a lot of web searching hasn't helped me so far )-:
 
 A client has bought some Dell hardware - Dell 1950 1U server, 2 on-board
 SATA drives connected to a Fusion MPT SAS controller. This works just
 fine. The on-board drives are mirrored using s/w RAID, which is great and
 just how I want it.
 
 The server also has 2 x Dell PERC dual-port SAS Raid Cards which have LSI
 MegaRaid chipssets on them. One cable from each raid card connect to half


 this is 2.6.18), and at boot time the dmesg output sees the drives in the
 external enclosure, but does not associate them to sdX drives! The
 underlying distro is Debian stable, but I doubt theres anything of issue
 there.

I have several Dell 2950s (same chassis) and they have this problem.

You can't do the PERC card and get JBOD basically. The PERC5 card has no
JBOD mode, whereas the PERC4 card did.

Dell said they may get a BIOS update, but wouldn't commit.

In the meantime, you have to exchange the PERC5 card for a SAS5 card,
then you can have JBOD.

I was a little disappointed, as the PERC5 card can drive 6 or 8 devices,
but the SAS5 card can only drive 4. Lame.

-Mike


Re: checking state of RAID (for automated notifications)

2006-09-06 Thread Mike Hardy

berlin % rpm -qf /usr/lib/nagios/plugins/contrib/check_linux_raid.pl
nagios-plugins-1.4.1-1.2.fc4.rf

It is built in to my nagios plugins package at least, and works great.
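If you'd rather roll your own, a minimal sketch of the idea (not the nagios
plugin itself, which does more, e.g. reporting resync progress) would be
something like this - an underscore in the [UU_] status means a missing or
failed member:

#!/bin/sh
# check every mdX status line in /proc/mdstat for a '_' in the [..] field
if grep -A1 '^md' /proc/mdstat | grep -q '\[.*_.*\]'; then
    echo "RAID CRITICAL - degraded array"
    exit 2
else
    echo "RAID OK"
    exit 0
fi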

-Mike

Tomasz Chmielewski wrote:
 I would like to have RAID status monitored by nagios.
 
 This sounds like a simple script, but I'm not sure what approach is
 correct.
 
 
 Considering, that the health status of /proc/mdstat looks like this:
 
 # cat /proc/mdstat
 Personalities : [raid1] [raid10]
 md2 : active raid10 sda2[4] sdd2[3] sdc2[2] sdb2[1]
   779264640 blocks super 1.0 64K chunks 2 near-copies [4/4] [UUUU]
 
 md1 : active raid1 sdd1[1] sdc1[0]
   1076224 blocks [2/2] [UU]
 
 md0 : active raid1 sdb1[1] sda1[0]
   1076224 blocks [2/2] [UU]
 
 unused devices: <none>
 
 
 What should my script be checking?
 
 Does the number of U (8 for this host) letters indicate that RAID is
 healthy?
 Or should I count in_sync in cat /sys/block/md*/md/rd*/state?
 Perhaps the two approaches are the same, though.
 
 
 What's the best way to determine that the RAID is running fine?
 
 


Re: Care and feeding of RAID?

2006-09-05 Thread Mike Hardy


Steve Cousins wrote:

 MAILADDR [EMAIL PROTECTED]
 ARRAY /dev/md0 level=raid5 num-devices=3
 UUID=39d07542:f3c97e69:fbb63d9d:64a052d3
 devices=/dev/sdb1,/dev/sdc1,/dev/sdd1

If you list the devices explicitly, you're opening the possibility for
errors when the devices are re-ordered following insertion (or removal)
of any other SATA or SCSI (or USB storage) device.

I think what you want is a "DEVICE partitions" line accompanied by ARRAY
lines that have the UUID attribute you've already got in there.
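In other words, something like this (keeping the UUID you already have; the
mail address is the usual placeholder):

DEVICE partitions
MAILADDR [EMAIL PROTECTED]
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=39d07542:f3c97e69:fbb63d9d:64a052d3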

-Mike


Re: RAID over Firewire

2006-08-23 Thread Mike Hardy

Richard Scobie wrote:
 Dexter Filmore wrote:
 
 Of all modes I wouldn't use a linear setup for backups. One disk dies
 - all data is lost.
 
 I'd go for an external raid5 solution, tho those tend to be slow and
 expensive.

 
 Unfortunately budget is the overriding factor here. Unlike RAID 0, I
 thought there may be a way of recovering data from undamaged disks in a
 linear array, although I guess the file system used has some say in this.
 
 I hope to mitgate the risk somewhat by regularly using smartd to do long
 self tests on the disks.


Long self tests will just tell you that you lost a block before RAID or
the FS notices it; it's not going to stop the block (and your data) from
going away.

One more disk and you have raid 5 at least with the same storage
capacity. md will transparently (to the OS, you'll get a log message)
recover from single block errors in raid5.

I'm not sure SMART works over firewire anyway. That's a question.

http://smartmontools.sourceforge.net/:

As for USB and FireWire (ieee1394) disks and tape drives, the news is
not good. They appear to Linux as SCSI devices but their implementations
do not usually support those SCSI commands needed by smartmontools.

Note that page is slightly out of date - they mention SMART for SATA is
supported through a patch to mainline, but it is in fact mainline now.

-Mike


Re: Is mdadm --create safe for existing arrays ?

2006-08-16 Thread Mike Hardy

Warning: I'm not certain this info is correct (I test on fake loopback
arrays before taking my own advice - be warned). More authoritative
folks are more than welcome to correct me or disagree.

create is safe on existing arrays in general, so long as you get the old
device order correct in the new create statement, and you use the
'missing' keyword appropriately so resyncs don't start immediately and
you can mount the device to make sure your data is there. Once you're
certain, you add a drive in place of the missing component, sync up, and
you're set.

In this case, that'd be an 'mdadm --create -l1 -n2 /dev/md0 /dev/sda1
missing'. You should have an array there that you can test.
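Once you've mounted it and convinced yourself the data is intact, the second
half is just (with /dev/sdb1 standing in for whatever you add back):

mdadm /dev/md0 --add /dev/sdb1   # fills the 'missing' slot and starts the resync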

But wait!

If the superblock wasn't persistent before, it's possible that the
device is using the space that would be used for the superblock for
filesystem information - it may not be reserved for md use. This is
where I'm not sure.

If that's the case, re-creating with persistent superblocks may clobber
the end of your filesystem, and you may not notice until you try to use it.

There was a thread quite recently (one week? two? can't quite remember)
specifically about putting a non-raid FS into a raid set that touched on
these issues, and how to do the FS shrink so it would have room for the
raid superblock. I'd refer to that. The goal being to shrink 1MB or so
off the FS, create the raid, then grow the FS to max again (or let it
be, whatever)

-Mike

Peter Greis wrote:
 Greetings,
 
 I have a SuSE 10.0 raid-1 root which will not properly
 boot, and I have noticed that / and /boot have
 non-persistent super blocks (which I read is required
 for booting).
 
 So, how do I change / and /boot to make the super
 blocks persistent ? Is it safe to run mdadm --create
 /dev/md0 --raid-devices=2 --level=1 /dev/sda1
 /dev/sdb1 without losing any data?
 
 regards,
 
 Peter
 
 PS Yes, I have googled extensively without finding a
 conclusive answer.
  
 
 Peter Greis
 freethinker gmbh
 Stäfa Switzerland
 



Re: Raid5 reshape

2006-06-19 Thread Mike Hardy


Nigel J. Terry wrote:

 One comment - As I look at the rebuild, which is now over 20%, the time
 till finish makes no sense. It did make sense when the first reshape
 started. I guess your estimating / averaging algorithm doesn't work for
 a restarted reshape. A minor cosmetic issue - see below
 
 Nigel
 [EMAIL PROTECTED] ~]$ cat /proc/mdstat
 Personalities : [raid5] [raid4]
 md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
  490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
  []  reshape = 22.7% (55742816/245111552) finish=5.8min speed=542211K/sec


Unless something has changed recently the parity-rebuild-interrupted /
restarted-parity-rebuild case shows the same behavior.

It's probably the same chunk of code (I haven't looked, bad hacker!
bad!), but I thought I'd mention it in case Neil goes looking.

The speed is truly impressive though. I'll almost be sorry to see it
fixed :-)

-Mike


Re: raid5 disaster

2006-05-23 Thread Mike Hardy

Bruno Seoane wrote:

 mdadm -C -l5 -n5 
 -c=128 /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdc1 /dev/sda1
 
 I took the device order from the mdadm output on a working device. Is this
 the way the command is supposed to be assembled?
 
 Is there anything else I should consider or any other valid solution to gain
 access to my data?


If you create the array, it will immediately start resyncing unless you
list one of the devices in your command line as missing. Just pick one
(ideally one of the ones that isn't getting picked up anyway) and put
'missing' in its place.

Using missing is the only way to have it be read-only in the data
regions. That'll let you make a mistake and still be able to recover
data after you find the right command line.
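For example, keeping the order you listed but leaving one slot out (here
sdc1's slot, purely as an illustration - use whichever drive isn't being
picked up):

mdadm -C -l5 -n5 -c 128 /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1 missing /dev/sda1
mount -o ro /dev/md0 /mnt/check   # read-only until you're sure the data lines up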

-Mike


Re: slackware -current softraid5 boot problem - additional info

2006-05-09 Thread Mike Hardy

Something fishy here

Dexter Filmore wrote:

 # mdadm -E /dev/sdd

Device /dev/sdd

 # cat /proc/mdstat
 Personalities : [raid5]
 md0 : active raid5 sda1[0] sdd1[3] sdc1[2] sdb1[1]
   732563712 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]

Components that are all the first partition.

Are you using the whole disk, or the first partition?

It appears that to some extent, you are using both.

Perhaps some confusion on that point between your boot scripts and your
manual run explains things?


-Mike


Re: raid5 resizing

2006-05-01 Thread Mike Hardy

Neil Brown wrote:
 On Monday May 1, [EMAIL PROTECTED] wrote:
 
Hey folks.

There's no point in using LVM on a raid5 setup if all you intend to do
in the future is resize the filesystem on it, is there? The new raid5
resizing code takes care of providing the extra space and then as long
as the say ext3 filesystem is created with resize_inode all should be
sweet. Right? Or have I missed something crucial here? :)
 
 
 You are correct.  md/raid5 makes the extra space available all by
 itself. 

Further - even if you don't create the filesystem with the right amount
of extra metadata space for online resizing, you can resize any ext2/3
filesystem offline, and it doesn't take very long. You just use
resize2fs instead of ext2online.

-Mike


Re: md: Change ENOTSUPP to EOPNOTSUPP

2006-05-01 Thread Mike Hardy

Paul Clements wrote:
 Gil wrote:
 
 So for those of us using other filesystems (e.g. ext2/3), is there
 some way to determine whether or not barriers are available?
 
 
 You'll see something like this in your system log if barriers are not
 supported:
 
 Apr  3 16:44:01 adam kernel: JBD: barrier-based sync failed on md0 -
 disabling barriers
 
 
 Otherwise, assume that they are. But like Neil said, it shouldn't matter
 to a user whether they are supported or not. Filesystems will work
 correctly either way.

This seems very important to me to understand thoroughly, so please
forgive me if I'm being dense.

What I'm not sure of in the above is: for what definition of "working"?

For the definition where the code simply doesn't bomb out, or for the
stricter definition that despite write caching at the drive level there
is no point where there could possibly be a data inconsistency between
what the filesystem thinks is written and what got written, power loss
or no?

My understanding to this point is that with write caching and no barrier
support, you would still care as power loss would give you a window of
inconsistency.

With the exception of the very minor situation Neil mentioned about the
first write through md not being a superblock write...

-Mike


Re: data recovery on raid5

2006-04-21 Thread Mike Hardy

Recreate the array from the constituent drives in the order you mention,
with 'missing' in place of the first drive that failed?

It won't resync because it has a missing drive.

If you created it correctly, the data will be there.

If you didn't create it correctly, you can keep trying permutations of
4-disk arrays with one missing until you see your data, and you should
find it.
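Based on the superblocks you attached, the first permutation I'd try would be
something like this (chunk size and layout taken from the -E output, since
mdadm's defaults differ; treat it strictly read-only until your data shows up):

mdadm -C /dev/md0 -l5 -n4 -c 32 --layout=left-asymmetric \
      missing /dev/etherd/e0.0 /dev/etherd/e0.2 /dev/etherd/e0.3
mount -o ro /dev/md0 /mnt/check   # look, but don't write or fsck until it all checks out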

-Mike

Sam Hopkins wrote:
 Hello,
 
 I have a client with a failed raid5 that is in desperate need of the
 data that's on the raid.  The attached file holds the mdadm -E
 superblocks that are hopefully the keys to the puzzle.  Linux-raid
 folks, if you can give any help here it would be much appreciated.
 
 # mdadm -V
 mdadm - v1.7.0 - 11 August 2004
 # uname -a
 Linux hazel 2.6.13-gentoo-r5 #1 SMP Sat Jan 21 13:24:15 PST 2006 i686 
 Intel(R) Pentium(R) 4 CPU 2.40GHz GenuineIntel GNU/Linux
 
 Here's my take:
 
 Logfiles show that last night drive /dev/etherd/e0.4 failed and around
 noon today /dev/etherd/e0.0 failed.  This jibes with the superblock
 dates and info.
 
 My assessment is that since the last known good configuration was
 0 missing
 1 /dev/etherd/e0.0
 2 /dev/etherd/e0.2
 3 /dev/etherd/e0.3
 
 then we should shoot for this.  I couldn't figure out how to get there
 using mdadm -A since /dev/etherd/e0.0 isn't in sync with e0.2 or e0.3.
 If anyone can suggest a way to get this back using -A, please chime in.
 
 The alternative is to recreate the array with this configuration hoping
 the data blocks will all line up properly so the filesystem can be mounted
 and data retrieved.  It looks like the following command is the right
 way to do this, but not being an expert I (and the client) would like
 someone else to verify the sanity of this approach.
 
 Will
 
 mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]
 
 do what we want?
 
 Linux-raid folks, please reply-to-all as we're probably all not on
 the list.
 
 Thanks for your help,
 
 Sam
 
 
 
 
 /dev/etherd/e0.0:
   Magic : a92b4efc
 Version : 00.90.00
UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
   Creation Time : Mon Jan  3 03:16:48 2005
  Raid Level : raid5
 Device Size : 195360896 (186.31 GiB 200.05 GB)
Raid Devices : 4
   Total Devices : 5
 Preferred Minor : 0
 
 Update Time : Fri Apr 21 12:45:07 2006
   State : clean
  Active Devices : 3
 Working Devices : 4
  Failed Devices : 1
   Spare Devices : 1
Checksum : 4cc955da - correct
  Events : 0.3488315
 
  Layout : left-asymmetric
  Chunk Size : 32K
 
    Number   Major   Minor   RaidDevice   State
 this     1     152       0        1        active sync   /dev/etherd/e0.0
 
    0     0       0       0        0        removed
    1     1     152       0        1        active sync   /dev/etherd/e0.0
    2     2     152      32        2        active sync   /dev/etherd/e0.2
    3     3     152      48        3        active sync   /dev/etherd/e0.3
    4     4     152      16        0        spare         /dev/etherd/e0.1
 /dev/etherd/e0.2:
   Magic : a92b4efc
 Version : 00.90.00
UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
   Creation Time : Mon Jan  3 03:16:48 2005
  Raid Level : raid5
 Device Size : 195360896 (186.31 GiB 200.05 GB)
Raid Devices : 4
   Total Devices : 5
 Preferred Minor : 0
 
 Update Time : Fri Apr 21 14:03:12 2006
   State : clean
  Active Devices : 2
 Working Devices : 3
  Failed Devices : 3
   Spare Devices : 1
Checksum : 4cc991e9 - correct
  Events : 0.3493633
 
  Layout : left-asymmetric
  Chunk Size : 32K
 
    Number   Major   Minor   RaidDevice   State
 this     2     152      32        2        active sync   /dev/etherd/e0.2
 
    0     0       0       0        0        removed
    1     1       0       0        1        faulty removed
    2     2     152      32        2        active sync   /dev/etherd/e0.2
    3     3     152      48        3        active sync   /dev/etherd/e0.3
    4     4     152      16        4        spare         /dev/etherd/e0.1
 /dev/etherd/e0.3:
   Magic : a92b4efc
 Version : 00.90.00
UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
   Creation Time : Mon Jan  3 03:16:48 2005
  Raid Level : raid5
 Device Size : 195360896 (186.31 GiB 200.05 GB)
Raid Devices : 4
   Total Devices : 5
 Preferred Minor : 0
 
 Update Time : Fri Apr 21 14:03:12 2006
   State : clean
  Active Devices : 2
 Working Devices : 3
  Failed Devices : 3
   Spare Devices : 1
Checksum : 4cc991fb - correct
  Events : 0.3493633
 
  Layout : left-asymmetric
  Chunk Size : 32K
 
    Number   Major   Minor   RaidDevice   State
 this     3     152      48        3        active sync   /dev/etherd/e0.3
 
    0     0       0       0        0        removed
    1     1       0       0        1        faulty removed
    2 

Re: A failed-disk-how-to anywhere?

2006-04-09 Thread Mike Hardy

Brad Campbell wrote:
 Martin Stender wrote:
 
 Hi there!

 I have two identical disks sitting on a Promise dual channel IDE
 controller. I guess both disks are primary's then.

 One of the disks have failed, so I bought a new disk, took out the
 failed disk, and put in the new one.
 That might seem a little naive, and apparently it was, since the
 system won't boot up now.
 It boots fine, when only the old, healthy disk is connected.


 My initial thought would be you have hde and hdg in a raid-1 and nothing
 on the on-board controllers. hde has failed and when you removed it your
 controller tried the 1st disk it could find (hdg) to boot off.. Bingo..
 away we go.
 You plug a new shiny disk into hde and now the controller tries to boot
 off that, except it's blank and therefore a no-go.
 
 I'd either try and force the controller to boot off hdg (which might be
 a controller bios option) or swap hde & hdg.. then it might boot and let
 you create your partitions on hdg and then add it back into the mirror.


I'd add another stab in the dark and guess that you didn't install your
boot loader on both drives.

Not that I've ever done that before (ok, a few times, most recently two
days ago, sigh)

Typically the BIOS will try all hard drives and so it should have rolled
to one that worked, but if only the failed drive had the boot loader
then you are of course not bootable.

I solved this by booting rescue mode, starting up the raid arrays,
mounting them, and manually grub installing. Here's a good page for the
grub incantations:
http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID_mirror_and_LVM2_on_top_of_RAID#Bootloader_installation_and_configuration

-Mike


Re: addendum: was Re: recovering data on a failed raid-0 installation

2006-03-31 Thread Mike Hardy

Well, honestly I'm not really sure. I've never done this as I only use
the redundant raid levels, and when they're gone, things are a complete
hash and there's no hope. In fact, with raid-0 (striping, right? not
linear/append?) I believe you are in the same boat. Each large file will
have half its contents on the disk that died. So really, there's very
little hope.

Anyway, I'll try to give you pointers to what I would try anyway, with
as much detail as I can.

First, you just need to get the raid device up. It sounds like you are
actually already doing that, but who knows. If you have one drive but
not the other, you could make a sparse file that is the same size as the
disk you lost. I know this is possible, but haven't done it so you'll
have to see for yourself - I think there are examples in linux-raid
archives in reference to testing very large raid arrays. Loopback mount
the file as a device (losetup is he command to use here) and now you
have a virtual device of the same size as the drive you lost.

Recreate the raid array using the drive you have, and the new virtual
drive in place of the one you lost. It's probably best to do this with
non-persistent superblocks and just generally as read-only as possible
for data preservation on the drive you have.

So now you have a raid array.

For the filesystem, well, I don't know. That's a mess. I assume it's
possible to mount the filesystem with some degree of force (probably
literally a -force argument) as well as read-only. You may need to point
at a different superblock, who knows?

You just want to get the filesystem to mount somehow, any way you need
to, but hopefully in a read-only mode.

I would not even attempt to fsck it.

At this point, you have a mostly busted filesystem on a fairly broken
raid setup, but it might be possible to pull some data out of it, who
knows? You could pull what looks like data but is instead garbage to
though - if you don't have md5sums of the files you get (if you get any)
it'll be hard to tell without checking them all.

Honestly, that's as much as I can think of.

I know I'm just repeating myself when I say this, but raid is no
replacement for backups. They have different purposes, and backups are
no less necessary. I was sorry to hear you didn't have any, because that
probably seals the coffin on your data.

With regard to people recommending you get a pro. In this field (data
recovery) there are software guys (most of the people on this list) that
can do a lot while the platters are spinning and there are hardware guys
(the pros I think most people are talking about). They have physical
tools that can get data out of platters that wouldn't spin otherwise.

There's nothing the folks on the list can do really other than recommend
going to see (or shipping the drive to) one of those dudes. When you
get the replacement drive back from them with your data on it, then
we're back in software land and you may have half a chance.

That said, it sounded like you had already tried to fsck the filesystem
on this thing, so you may have hashed the remaining drive. It's hard to
say. Truly bleak though...

-Mike

Technomage wrote:
 mike.
 
 given the problem, I have a request.
 
 
 On Friday 31 March 2006 15:55, Mike Hardy wrote:
 
I can't imagine how to coax a filesystem to work when it's missing half
it's contents, but maybe a combination of forcing a start on the raid
and read-only FS mounts could make it hobble along.
 
 
 we will test any well laid out plan. 
 
 lay out for us (from beginning to end) all the steps required, in your test. 
 do not be afraid to detail the obvious. it is better that we be in good 
 communication than to be working on assumptions. it will save you a lot of 
 frustration trying to correct for our assumptions, if there are none. 
 
 tmh


Re: Recommendations for supported 4-port SATA PCI card ?

2006-03-30 Thread Mike Hardy

Addonics adst114 was the cheapest one I've found that works. I found it
for $41 at thenerds.net but you may be better at the price searching
than me.

It's a Silicon Images 3114 chip, driven by the sata_sil driver

I honestly don't recall if it was out-of-the-box working on FC4, but the
updated kernels drive it fine, and FC5 (with 2.6.16+) should be fine
with it.

-Mike

Ian Thurlbeck wrote:
 
 Dear All
 
 I have 4x500GB Maxtor SATA drives and I want to attach
 these to a 4-port SATA PCI card and RAID5 them using md
 
 Could anybody recommend a card that will have out of
 box support on a Fedora system ?
 
 Many thanks
 
 Ian


Re: how to clone a disk

2006-03-11 Thread Mike Hardy

I can think of two things I'd do slightly differently...

Do a smartctl -t long on each disk before you do anything, to verify
that you don't have single sector errors on other drives.

Use ddrescue for better results copying a failing drive.
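Roughly, with hypothetical device names:

smartctl -t long /dev/sdb        # repeat for each member; add '-d ata' for SATA drives
# ...wait for the test to finish...
smartctl -l selftest /dev/sdb    # check the self-test log for each one
ddrescue /dev/sdc /dev/sde /root/sdc-rescue.log   # failing disk -> replacement, with a log so you can resume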

-Mike

PFC wrote:
 
 I have a raid5 array that contain 4 disk and 1 spare disk. now i saw one
 disk have sign of going fail via smart log.
 
 
 Better safe than sorry... replace the failing disk and resync,
 that's all.
 
  You might want to do cat /dev/md# > /dev/null, or cat /dev/hd? >
  /dev/null first. This is to be sure there isn't some yet-unseen bad
  sector on some other drive which would screw your resync.


Re: No syncing after crash. Is this a software raid bug?

2006-03-01 Thread Mike Hardy

Why would you not be happy? Resyncs in general are bad since they
indicate your data is possibly out-of-sync, and the resync itself
consumes an enormous amount of resources.

This is a feature of new-ish md driver code that more aggressively marks
the array as clean after writes.

The end result is that the array will most likely be clean in all
circumstances, even a crash, and you simply won't need to resync.

That's a good thing!

-Mike

Kasper Dupont wrote:
 I have a FC4 installation (upgraded from FC3) using kernel
 version 2.6.15-1.1831_FC4. I see some symptoms in the software
 raid, which I'm not quite happy about.
 
 After an unclean shutdown caused by a crash or power failure,
 it does not resync the md devices. I have tried comparing the
 contents of the two mirrors for each of the md devices. And I
 found that on the swap device, there were differences.
 
 Isn't this a bug in the software raid? Shouldn't it always
 resync after reboot, if there could possibly be any difference
 between the contents on the two disks?
 
 I know that as long as only swap is affected, it is not going
 to cause data loss. But how can I be sure it is not going to
 happen on file systems as well?
 
 Should I report this as a bug in Fedora Core or did I miss
 something?
 


Re: Question: array locking, possible?

2006-02-07 Thread Mike Hardy


Chris Osicki wrote:

 
 To rephrase my question, is there any way to make it visible to the
 other host that the array is up an running on the this host?
 
 Any comments, ideas?

Would that not imply an unlock command before you could run the array
on the other host?

Would that not then break the automatic fail-over you want, as no
machine that died or hung would issue the unlock command, meaning that
the fail-over node could not then use the disks?

It's an interesting idea; I just can't think of a way to make it work
unattended.

It might be possible to wrap the 'mdadm' binary with a script that checks
(maybe via some deep check using ssh to execute remote commands, or just
a ping) the host's status and just prints a little table of host status
that can only be avoided by passing a special --yes-i-know flag to the
wrapper.


-Mike


Re: Newbie questions: Max Active disks, RAID migration, Compiling mdadm 2.3

2006-02-04 Thread Mike Hardy

If you remove the '-Werror' it'll compile and work, but you still can't
convert a raid 0 to a raid 5. Your raid level understanding is off as
well: raid 5 is a parity block rotating around all drives; you were
thinking of raid 4, which has a single parity disk. Migrating raid 0 to
raid 4 (and vice versa) should be possible technically, but I don't
think it's implemented anywhere.

You should be able to have more than 4 active drives though. I am at
this moment building an array with 6 components, I'm running a few with
more than that, and these are by no means the largest arrays that people
are running - just examples of it working.

-Mike

Martin Ritchie wrote:
 Sorry if these are total newbie questions.
 
 Why can't I have more than 4 active drives in my md RAID?
 
 Why can't I easily migrate a RAID 0 to RAID 5. As I see it RAID 0 is 
 just RAID 5 with a failed parity check drive?
 
 Perhaps this is a limitation of the old v1.11 that FC4 updates to.
 
 I tried to compile 2.3 but I get this error:
 
 $make
 gcc -Wall -Werror -Wstrict-prototypes -DCONFFILE=\"/etc/mdadm.conf\" -
 ggdb -DSendmail=\"/usr/sbin/sendmail -t\"   -c -o super0.o super0.c
 In file included from super0.c:31:
 /usr/include/asm/byteorder.h:6:2: error: #warning using private  kernel
 header; include <endian.h> instead!
 make: *** [super0.o] Error 1
 
 I'm not too familiar with compiling this sort of thing. (I usually  live
 further away from the hardware and endian issues). I'm guessing  there
 is some sort of option i have to specify to say that this  should use
 the private kernel headers. Including endian.h instead  didn't help:
 
 $make
 gcc -Wall -Werror -Wstrict-prototypes -DCONFFILE=\"/etc/mdadm.conf\" -
 ggdb -DSendmail=\"/usr/sbin/sendmail -t\"   -c -o super0.o super0.c
 cc1: warnings being treated as errors
 super0.c: In function ‘add_internal_bitmap0’:
 super0.c:737: warning: implicit declaration of function ‘__cpu_to_le32’
 super0.c:742: warning: implicit declaration of function ‘__cpu_to_le64’
 make: *** [super0.o] Error 1
 
 Oh just because I know it is going to be an issue I'm building on a 
 Athlon 64... my first 64bit linux box so I'm sure there are going to  be
 gotchas that I've not thought about.
 
 Is there somewhere I over looked for finding this information.
 
 TIA



Re: ludicrous speed: raid6 reconstruction

2006-02-03 Thread Mike Hardy

I saw this on my array, and other(s) have reported it as well.

Apparently the reconstruction speed algorithm doesn't understand that
it's not syncing all the blocks and hilarity ensues. I believe that was
it, anyway.

Either that or you really have a hell of a server :-)

-Mike

jurriaan wrote:
 Personalities : [linear] [raid0] [raid1] [raid5] [raid4] [raid6] 
 md0 : active raid6 sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] 
 sda1[0]
   1465175424 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
   [==..]  resync = 91.0% (39104/244195904) finish=0.0min speed=7369041K/sec
   bitmap: 23/233 pages [92KB], 512KB chunk
 I am reminded of Spaceball's 'ludicrous speed' here. This is after a
 reboot from 2.6.16-rc1-mm3 (where the array was built) to 2.6.16-rc1-mm5
 (where rebuilding continued thanks to the bitmap).


Re: raid reconstruction speed

2006-01-19 Thread Mike Hardy

PFC wrote:

 When rebuilding md1, it does not realize accesses to md0 wait for
 the  same disks. Thus reconstruction of md1 runs happily at full speed,
 and the  machine is dog slow, because the OS and everything is on md0.
 (I cat /dev/zero to a file on md1 to slow the rebuild so it would
 let me  start a web browser so I don't get bored to death)

echo 1 > /proc/sys/dev/raid/speed_limit_max (or similar?)

You can do that in /etc/rc.local or something to make sure it sticks,
then you'll be able to use your machine while any array rebuilds.

I guess the feature you're asking for is for md to guess that accessing
any partition component on a disk that has a partition being rebuilt
should throttle the rebuild, right?

Can that heuristic be successful at all times? I think it might.

Does md have enough information to do that? I don't know...

-Mike


Re: Fwd: Linux MD raid5 and reiser4... Any experience ?

2006-01-06 Thread Mike Hardy

Slightly off-topic, but:

Simon Valiquette wrote:
 Francois Barre a écrit :


   On production server with large RAID array, I tend to like very
 much XFS and trust it more than ReiserFS (I had some bad experience
 with ReiserFS in the past).  You can also grow a XFS filesystem live,
 which is really nice.

I didn't know this until recently, but ext2/3 can be grown online as
well (using 'ext2online'), given that you create it originally with
enough block group descriptor table room to support the size you're
growing to.

From the man page for mke2fs:

-E extended-options
   Set  extended options for the filesystem.  Extended options are
   comma separated, and may take  an  argument  using  the  equals
   (’=’) sign.  The -E option used to be -R in earlier versions of
   mke2fs.  The -R option is still accepted for backwards compati-
   bility.   The following extended options are supported:
 stride=stripe-size
   Configure  the  filesystem  for  a  RAID array with
   stripe-size filesystem blocks per stripe.

 resize=max-online-resize
   Reserve  enough  space  so  that  the  block  group
   descriptor  table  can grow to support a filesystem
   that has max-online-resize blocks.

I have done it, and it works.
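For example (sizes and device names made up; the resize value is in
filesystem blocks):

mke2fs -j -E resize=500000000 /dev/md0   # reserve descriptor-table room up front
# ...later, after the underlying md device has been grown...
ext2online /dev/md0                      # grow the mounted ext3 filesystem to fill the device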

-Mike



Re: raid5 write performance

2005-11-18 Thread Mike Hardy

Moreover, and I'm sure Neil will chime in here, isn't the clean/unclean
thing designed to prevent this exact scenario?

The array is marked unclean immediately prior to write, then the write
and parity write happens, then the array is marked clean.

If you crash during the write but before parity is correct, the array is
unclean and you resync (quickly now thanks to intent logging if you have
that on).

The non-parity blocks that were partially written are then the
responsibility of your journalling filesystem, which should make sure
there is no corruption, silent or otherwise.

If I'm misunderstanding that, I'd love to be corrected. I was under the
impression that the silent corruption issue was mythical at this point
and if it's not I'd like to know.

-Mike

Dan Stromberg wrote:
 Would it really be that much slower to have a journal of RAID 5 writes?
 
 On Fri, 2005-11-18 at 15:05 +0100, Jure Pečar wrote:
 
Hi all,

Currently zfs is a major news in the storage area. It is very interesting to 
 read various details about it on various blogs of Sun employees. Among the 
more interesting I found was this:

http://blogs.sun.com/roller/page/bonwick?entry=raid_z

The point the guy makes is that it is impossible to atomically both write 
data and update parity, which leaves a window of crash that would silently 
leave on-disk data+paritiy in an inconsistent state. Then he mentions that 
there are software only workarounds for that but that they are very very slow.

It's interesting that my expirience with veritas raid5 for example is just 
that: slow to the point of being unuseable. Now, I'm wondering what kind of 
magic does linux md raid5 does, since its write performance is quite good? 
Or, does it actually do something regarding this? :)

 Neil?

 
 


Re: raid5 write performance

2005-11-18 Thread Mike Hardy


Guy wrote:

 It is not just a parity issue.  If you have a 4 disk RAID 5, you can't be
 sure which if any have written the stripe.  Maybe the parity was updated,
 but nothing else.  Maybe the parity and 2 data disks, leaving 1 data disk
 with old data.
 
 Beyond that, md does write caching.  I don't think the file system can tell
 when a write is truly complete.  I don't recall ever having a Linux system
 crash, so I am not worried.  But power failures cause the same risk, or
 maybe more.  I have seen power failures, even with a UPS!

Good points there Guy - I do like your example. I'll go further with
crashing too and say that I actually crash outright occasionally.
Usually when building out new machines where I don't know the proper
driver tweaks, or failing hardware, but it happens without power loss.
It's important to get this correct and well understood.

That said, unless I hear otherwise from someone that works in the code,
I think md won't report the write as complete to upper layers until it
actually is. I don't believe it does write-caching, and regardless, if
it does it must not do it until some durable representation of the data
is committed to hardware and the parity stays dirty until redundancy is
committed.

Building on that, barring hardware write-caching, I think with a
journalling FS like ext3 and md only reporting the write complete when
it really is, things won't be trusted at the FS level unless they're
durably written to hardware.

I think that's sufficient to prove consistency across crashes.

For example, even if you crash during an update to a file smaller than a
stripe, the stripe will be dirty so the bad parity will be discarded and
the filesystem won't trust the blocks that didn't get reported back as
written by md. So that file update is lost, but the FS is consistent and
all the data it can reach is consistent with what it thinks is there.

So, I continue to believe silent corruption is mythical. I'm still open
to good explanation it's not though.

-Mike


raidreconf / growing raid 5 doesn't seem to work anymore

2005-04-03 Thread Mike Hardy

Hello all -

This is more of a cautionary tale than anything, as I have not attempted
to determine the root cause or anything, but I have been able to add a
disk to a raid5 array using raidreconf in the past and my last attempt
looked like it worked but still scrambled the filesystem.

So, if you're thinking of relying on raidreconf (instead of a
backup/restore cycle) to grow your raid 5 array, I'd say its probably
time to finally invest in enough backup space. Or you could dig in and
test raidreconf until you know it will work.

I'll paste the commands and their output in below so you can see what
happened - raidreconf appeared to work just fine, but the file-system is
completely corrupted as far as I can tell. Maybe I just did something
wrong though. I used a "make no changes" mke2fs command to generate the
list of alternate superblock locations. They could be wrong, but the
first one being corrupt is enough by itself to be a fail mark for
raidreconf.

This isn't a huge deal in my opinion, as this actually is my backup
array, but it would have been cool if it had worked. I'm not going to be
able to do any testing on it past this point though as I'm going to
rsync the main array onto this thing ASAP...

-Mike


---
marvin/root # raidreconf -o /etc/raidtab -n /etc/raidtab.new -m /dev/md2
Working with device /dev/md2
Parsing /etc/raidtab
Parsing /etc/raidtab.new
Size of old array: 2441960010 blocks,  Size of new array: 2930352012 blocks
Old raid-disk 0 has 953890 chunks, 244195904 blocks
Old raid-disk 1 has 953890 chunks, 244195904 blocks
Old raid-disk 2 has 953890 chunks, 244195904 blocks
Old raid-disk 3 has 953890 chunks, 244195904 blocks
Old raid-disk 4 has 953890 chunks, 244195904 blocks
New raid-disk 0 has 953890 chunks, 244195904 blocks
New raid-disk 1 has 953890 chunks, 244195904 blocks
New raid-disk 2 has 953890 chunks, 244195904 blocks
New raid-disk 3 has 953890 chunks, 244195904 blocks
New raid-disk 4 has 953890 chunks, 244195904 blocks
New raid-disk 5 has 953890 chunks, 244195904 blocks
Using 256 Kbyte blocks to move from 256 Kbyte chunks to 256 Kbyte chunks.
Detected 256024 KB of physical memory in system
A maximum of 292 outstanding requests is allowed
---
I will grow your old device /dev/md2 of 3815560 blocks
to a new device /dev/md2 of 4769450 blocks
using a block-size of 256 KB
Is this what you want? (yes/no): yes
Converting 3815560 block device to 4769450 block device
Allocated free block map for 5 disks
6 unique disks detected.
Working (\) [03815560/03815560]
[]
Source drained, flushing sink.
Reconfiguration succeeded, will update superblocks...
Updating superblocks...
handling MD device /dev/md2
analyzing super-block
disk 0: /dev/hdc1, 244196001kB, raid superblock at 244195904kB
disk 1: /dev/hde1, 244196001kB, raid superblock at 244195904kB
disk 2: /dev/hdg1, 244196001kB, raid superblock at 244195904kB
disk 3: /dev/hdi1, 244196001kB, raid superblock at 244195904kB
disk 4: /dev/hdk1, 244196001kB, raid superblock at 244195904kB
disk 5: /dev/hdj1, 244196001kB, raid superblock at 244195904kB
Array is updated with kernel.
Disks re-inserted in array... Hold on while starting the array...
Maximum friend-freeing depth: 8
Total wishes hooked:3815560
Maximum wishes hooked:  292
Total gifts hooked: 3815560
Maximum gifts hooked:   200
Congratulations, your array has been reconfigured,
and no errors seem to have occured.
marvin/root # cat /proc/mdstat
Personalities : [raid1] [raid5]
md1 : active raid1 hda1[0] hdb1[1]
  146944 blocks [2/2] [UU]

md3 : active raid1 hda2[0] hdb2[1]
  440384 blocks [2/2] [UU]

md2 : active raid5 hdj1[5] hdk1[4] hdi1[3] hdg1[2] hde1[1] hdc1[0]
  1220979200 blocks level 5, 256k chunk, algorithm 0 [6/6] [UU]
  [=...]  resync =  7.7% (19008512/244195840)
finish=434.5min speed=8635K/sec
md0 : active raid1 hda3[0] hdb3[1]
  119467264 blocks [2/2] [UU]

unused devices: none
marvin/root # mount /backup
mount: wrong fs type, bad option, bad superblock on /dev/md2,
   or too many mounted file systems
   (aren't you trying to mount an extended partition,
   instead of some logical partition inside?)
marvin/root # fsck.ext3 -C 0 -v /dev/md2
e2fsck 1.35 (28-Feb-2004)
fsck.ext3: Filesystem revision too high while trying to open /dev/md2
The filesystem revision is apparently too high for this version of e2fsck.
(Or the filesystem superblock is corrupt)


The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 device

marvin/root # mke2fs -j -m 1 -n -v
Usage: mke2fs [-c|-t|-l filename] [-b block-size] [-f 

Re: raidreconfig advice

2005-03-12 Thread Mike Hardy

Max Waterman wrote:
OK, I am going to try to expand the capacity of my raid5 array and I 
want to make sure I've got it right.
Not a bad idea, as it's all or nothing...
Disk /dev/hdg: 200.0 GB, 200049647616 bytes
Disk /dev/hdi: 200.0 GB, 200049647616 bytes
Disk /dev/hdk: 200.0 GB, 200049647616 bytes
Disk /dev/sda: 200.0 GB, 200049647616 bytes
Disk /dev/sdb: 200.0 GB, 200049647616 bytes
Disk /dev/sdc: 200.0 GB, 200049647616 bytes
Disk /dev/sdd: 200.0 GB, 200049647616 bytes
They certainly all looked the same (including the C/H/S counts)
This leaves me with sdc which I can try to add. If that goes OK, I'll 
trash the backup and add sd[ab] too.
I'd be very wary of this, for two reasons. First, you have the backup 
during the add for a reason; if anything goes wrong, there goes your 
data. Second, where would you ever back your raid up to? What about fs 
corruption?

The rule of thumb with databases is to always have enough contiguous 
scratch space to dump and restore your biggest table. With large RAID, 
you should always be able to dump and restore your largest raid device, 
imho. It's a bunch more disk, yes, but you'll need it at some point, I 
promise. Many future tears can be averted...

2) Where do I get raidreconfig from? Google wasn't much help.
I saw you noticed it's raidreconf - you should be set there.
3) Are there any instructions for raidreconfig? I understand it uses 
some non-mdadm config files as from/to input.
The man page is great - honest. Two conf files (current and future) and 
you're set.
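
(A sketch of what the two files look like - the directives are from the
raidtab format, but the devices, counts and chunk size here are made up,
so match them to your real array:)

# /etc/raidtab - the array as it exists today
raiddev /dev/md0
    raid-level            5
    nr-raid-disks         3
    persistent-superblock 1
    chunk-size            256
    device /dev/hdg1
    raid-disk 0
    device /dev/hdi1
    raid-disk 1
    device /dev/hdk1
    raid-disk 2

# /etc/raidtab.new - identical, but nr-raid-disks is bumped and the new
# member is appended
raiddev /dev/md0
    raid-level            5
    nr-raid-disks         4
    persistent-superblock 1
    chunk-size            256
    device /dev/hdg1
    raid-disk 0
    device /dev/hdi1
    raid-disk 1
    device /dev/hdk1
    raid-disk 2
    device /dev/sdc1
    raid-disk 3

Then it's 'raidreconf -o /etc/raidtab -n /etc/raidtab.new -m /dev/md0'
to do the actual move.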

The last question should be an open-ended "is there anything else?"
1) Run a long SMART test on all drives first. Imagine if you get a bad 
block during the reconfig...
2) Validate your backup (just in case)
3) ?? It takes a long time to do, be patient I guess
4) You could use the script I posted earlier that sets up a loopback-device 
practice raid set, to practice on perhaps (if you really wanted)

Good luck-
-Mike


Re: md Grow for Raid 5

2005-03-08 Thread Mike Hardy
Frank Wittig wrote:
It actually is available.
I've tested it and it worked fine for me. But taking a backup is highly 
recommended.
The trick is not to use mdadm, since growing with mdadm is not possible 
at the moment. Use raid-tools instead.
The program raidreconf comes along with raidtools. This prog takes two 
raid-tab files as input which describe the array configuration before 
and after reconfiguration. (See man raidreconf for further details)
I'll second both major points here:
raidreconf does work, but it can fail and leave things completely 
destroyed (imagine one bad block somewhere after parity was partially 
migrated), so take a backup.

Given that you're taking a backup already, then creating a new array 
(with its optimized resync) might be faster, if it's an online backup.

I'm 2 for 4 now on raidreconf working, with the two failures (sadly) 
being of the operator-error variety - raidreconf is picky and fails 
slowly if your disk sizes aren't what it expects, I found. It got to the 
end and ran out of space on me due to a slightly different 250GB disk 
size once. The other was a bad block along the way - I should have done 
smartctl -t long on all drives prior to the resize. Both lessons learned...
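
(For anyone else about to do this, something along these lines is what I
mean - the device list is an example, substitute your actual members:)

for d in /dev/hdc /dev/hde /dev/hdg /dev/hdi /dev/hdk; do
    smartctl -t long $d          # kick off a long self-test on each member
done
# some 45-90 minutes later, depending on the drive:
for d in /dev/hdc /dev/hde /dev/hdg /dev/hdi /dev/hdk; do
    smartctl -l selftest $d      # check the self-test log for failures
done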

-Mike


Re: md Grow for Raid 5

2005-03-08 Thread Mike Hardy
berk walker wrote:
Have you guys seen/tried mdadm 1.90?  I am delightfully experiencing the 
I believe the mdadm-based grow does not work for raid5, only for 
raid0 or raid1. raidreconf is actually capable of adding disks to raid5 
and re-laying out the stripes / moving parity blocks, etc.

You're very correct about needing to grow the FS after growing the 
device, though. Most FSes have tools for that, or there's LVM...
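
(A sketch of the filesystem-grow step for ext2/ext3, with an illustrative
device name and assuming you can take it offline:)

umount /dev/md2
fsck.ext3 -f /dev/md2     # resize2fs wants a freshly checked filesystem
resize2fs /dev/md2        # with no size given it grows to fill the device
# ...then mount it again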

-Mike


Re: RAID 1 on a server with possible mad mobo

2005-03-08 Thread Mike Hardy

Colin McDonald wrote:
Is it a bad idea to write the grub to a software mirror? Is it written
to a specific disk when this is done?
The Software Raid and Grub HOW-TO
http://lists.us.dell.com/pipermail/linux-poweredge/2003-July/014331.html
I use grub+raid1 on the root drive of a number of machines, but you do 
have to be careful: as I understand it, grub is not raid-aware and just 
picks a drive, whereas lilo was raid-aware and wrote the boot sector 
on all components.

After following the directions at the link above, I've been able to boot 
the machine off each component (during failure testing), so the 
directions appear to work for me.
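
(The recipe in that HOW-TO boils down to installing grub onto each half
of the mirror by hand; a sketch assuming a two-disk /dev/hda + /dev/hdb
mirror with /boot on the first partition of each - adjust names to suit:)

# install grub on the first disk, then map the second disk in as (hd0)
# and install there too, so the box still boots if the first disk dies
grub --batch <<EOF
root (hd0,0)
setup (hd0)
device (hd0) /dev/hdb
root (hd0,0)
setup (hd0)
quit
EOF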

-Mike


Re: [PATCH md 9 of 9] Optimise reconstruction when re-adding a recently failed drive.

2005-02-17 Thread Mike Hardy
NeilBrown wrote:
When an array is degraded, bits in the intent-bitmap are
never cleared. So if a recently failed drive is re-added, we only need
to reconstruct the blocks that are still reflected in the
bitmap.
This patch adds support for this re-adding.
Hi there -
If I understand this correctly, this means that:
1) if I had a raid1 mirror (for example) that has no writes to it since 
a resync
2) a drive fails out, and some writes occur
3) when I re-add the drive, only the areas where the writes occurred 
would be re-synced?

I can think of a bunch of peripheral questions around this scenario, and 
bad sectors / bad sector clearing, but I may not be understanding the 
basic idea, so I wanted to ask first.
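
(For reference, a sketch of how I imagine exercising that once it lands
in a released md/mdadm - device names are examples, and the exact option
spellings may differ by version:)

mdadm --grow --bitmap=internal /dev/md0       # add a write-intent bitmap
mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
# ...writes happen while the array runs degraded...
mdadm /dev/md0 --re-add /dev/sdb1             # only bitmap-flagged regions resync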

-Mike


Re: [PATCH md 2 of 4] Fix raid6 problem

2005-02-13 Thread Mike Hardy

Mark Hahn wrote:
Interesting - the private mail was from me, and I've got two dual 
Opterons in service. The one with significantly more PCI activity has 
significantly more problems then the one with less PCI activity.

that's pretty odd, since the most intense IO devices I know of 
are cluster interconnect (quadrics, myrinet, infiniband),
and those vendors *love* opterons.  I've never heard any of them
say other than that Opteron IO handling is noticeably better than
Intel's.
Sure, but which variables are changed between the rigs the vendors 
loved, and the rig we're having problems with?

otoh, I could easily believe that if you're running the Opteron 
systems in acts-like-a-faster-xeon mode (ie, not x86_64),
you might be exercising some less-tested paths.
It's running x86_64 (Fedora Core 3), and the problem is rooted in the 
chipset, I believe. I don't think it's Opterons per se; I think it's just 
the Athlon take two - which is to say that it's a wonderful chip, but 
some of the chipsets it's saddled with are horrible, and careful 
selection (as well as heavy testing prior to putting a machine in 
service) is essential.

-Mike


Re: Migrating from SINGLE DISK to RAID1

2005-02-01 Thread Mike Hardy
Robert Heinzmann wrote:
Hello,
can someone verify if the following statements are true ?
- It's not possible to simply convert an existing partition with a 
filesystem on it to a raid1 mirror set.
I believe you're right, but I'm not totally sure on this one. I'd take 
the second disk, create a new RAID1 with the first drive "missing" on 
the mdadm --create command line, copy everything over to it, put grub on 
it, then test that it boots correctly by pulling the first drive out.

Only once the RAID1 is working (in degraded mode), and you're booting off 
it, should you add the original first drive back into the RAID set to 
complete the pair. With that process the question is somewhat moot, 
although I'm interested in the real answer too.
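
(A sketch of that sequence, assuming the new disk is /dev/hdb with a
matching partition table and the original is /dev/hda - names and
filesystem are illustrative:)

mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/hdb1
mkfs.ext3 /dev/md0
mount /dev/md0 /mnt/newroot
# copy everything over, install grub on the new disk, test booting from it,
# and only then hand the original partition to the mirror:
mdadm /dev/md0 --add /dev/hda1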

- Using a former disk of a raid1 array as a usual disk (not mounted as a 
degraded /dev/mdX, but instead mounted as /dev/sdX or /dev/hdX) is 
successful.

This is because the MD device layer reports the device size as the size of 
the disk minus the superblock offset during the creation of a filesystem on 
the MD device. Thus the used size of the disk, when mounting it as /dev/sdX 
or /dev/hdX, is some KB smaller than it could be, but no data is lost.
This matches my experience, although autodiscovery can get in your way, 
as you mention yourself

-Mike


Re: Broken harddisk

2005-01-29 Thread Mike Hardy
Guy wrote:
For future reference:
Everyone should do a nightly disk test to prevent bad blocks from hiding
undetected.  smartd, badblocks or dd can be used.  Example:
dd if=/dev/sda of=/dev/null bs=64k
Just create a nice little script that emails you the output.  Put this
script in a nightly cron job to run while the system is idle.
While I agree with your purpose 100%, Guy, I respectfully disagree with 
the method. If at all possible, you should use tools that access the 
SMART capabilities of the device so that you get more than a read test - 
you also get statistics on the various other health parameters the drive 
checks, some of which can serve as fair warning of impending death before 
you get bad blocks.

http://smartmontools.sf.net is the place for fresh packages, and 
smartd can be set up with a config file to run tests on any schedule you 
like, emailing you urgent results as it gets them, or just putting 
information of general interest in the logs that Logwatch picks up.
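
(A minimal sketch of that kind of smartd.conf - the devices, schedule and
mail address are examples; see the smartd.conf man page for the details:)

# monitor everything, run a short self-test nightly at 2am and a long one
# Saturdays at 3am, and mail anything urgent to root
/dev/hda -a -s (S/../.././02|L/../../6/03) -m root
/dev/hdc -a -s (S/../.././02|L/../../6/03) -m root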

If your drives don't talk SMART (older ones don't, and it doesn't work 
through all interfaces either) then by all means take Guy's advice. A 
'dd' test is certainly valuable. But if they do talk SMART, I think it's 
better.

-Mike


Re: migrating raid-1 to different drive geometry ?

2005-01-25 Thread Mike Hardy

Robin Bowes wrote:
Mike Hardy wrote:
To grow the component count on raid5 you have to use raidreconf, which can 
work, but will toast the array if anything goes bad. I have personally 
had it work, and not work, in different instances. The failures were 
not necessarily raidreconf's fault, but the point is it's not fault 
tolerant: it starts at the first stripe, laying things out the new 
way, and if it doesn't finish, and finish correctly, you are in an 
irretrievably inconsistent state.

Bah, too bad.
I don't need it yet, but at some stage I'd like to be able to add 
another 250GB drive(s) to my array and grow the array to use the 
additional space in a safe/failsafe way.

Perhaps by the time I come to need it this might be possible?
Well, I want to be clear here, as whoever wrote raidreconf deserves 
some respect, and I don't want to appear to be disparaging it.

raidreconf works. I'm not aware of any bugs in it.
Further, if mdadm were to implement the feature of adding components to a 
raid5 array, I'm guessing it would look exactly the same as raidreconf, 
simply because of the work it has to do (re-configuring each stripe, 
moving parity blocks and data blocks around, etc). It's just the way the 
raid5 disk layout is.

So, since raidreconf does work, it's definitely possible now, but you 
have to make absolutely, amazingly sure of three things:

1) the component size you add is at least as large as the rest of the 
components (it'll barf at the end if not)
2) the old and new configurations you feed raidreconf are perfect (or 
what happens is undefined)
3) you have absolutely no bad blocks on any component, as it will read 
each block on each component and write each block on each component 
(that's a tall order these days; if you get a bad block, what can it do?)

If any of those things go bad, your array goes bad, but it's not the 
algorithm's fault, as far as I can tell. It's constrained by the 
problem's requirements. So I'd add:

4) you have a perfect, fresh backup of the array ;-)

Honestly, I've done it, and it does work, it's just touchy. You can 
practice with it with loop devices (check for a raid5 loop array creator 
and destructor script I posted a week or so back) if you want to see it.
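
(Not that script exactly, but the general idea of the loopback practice
rig looks something like this - file names, sizes and device numbers are
all just examples:)

for i in 0 1 2 3 4; do
    dd if=/dev/zero of=/tmp/raidfile$i bs=1M count=100   # small backing files
    losetup /dev/loop$i /tmp/raidfile$i                  # bind them to loop devices
done
mdadm --create /dev/md9 --level=5 --raid-devices=5 /dev/loop[0-4]
mkfs.ext3 /dev/md9
# ...practice on the throwaway array, then tear it down:
mdadm --stop /dev/md9
for i in 0 1 2 3 4; do losetup -d /dev/loop$i; done

To drive raidreconf itself you'd describe the same loop devices in an
old/new raidtab pair rather than using mdadm, but the throwaway array is
the point.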

-Mike


raid5 chunk calculator (was Re: the dreaded double disk failure)

2005-01-16 Thread Mike Hardy
Mike Hardy wrote:
Mike Hardy wrote:
What I'm thinking of doing is writing a small (or, as small as 
possible, anyway) perl program that can take a few command line 
arguments (like the array construction information) and know how to 
read the data blocks on the array, and calculate parity, as a 
baseline. If perl offends you, sorry, I'm quicker at it than C by a 
long-shot, and I don't really care about speed here, just speed of 
development.

Here's the shell script I'm using as a test harness. It creates a 
loopback raid5 system, fills it up with random data, and then takes the 
md5sum. It has a few modes of operation (to initialize or not as it 
starts or stops the array).
Probably bad form to keep replying to myself, but what the heck.
Ok, I've got a basic perl program together where you specify an
arbitrary raid5 array layout, an array component, and a sector address
in that component, and it can tell you:
 a) what the computed value of the sector's chunk should be
 b) if the real data in the chunk matches the computed value
It still needs more structure and cleaning to be useful (it needs a loop
to be a general parity checker, or some write logic to be a
bad-sector-clearance script). However, the basic raid math seems to work,
in the testing I threw at it with the test-array creation script I posted
earlier, and it might already be useful to others.
If anyone checks it out and finds bugs I need to fix or can think of a
use for it other than what I'm thinking, let me know, and that'll save
me time or show me where I'm missing useful abstractions so I can clean
it up properly.
Otherwise I'm going to do a lot more testing, wrap this up tomorrow, and
(hopefully!) fix the unreadable sectors on the second bad drive in my
array with it.
-Mike
#!/usr/bin/perl -w

#
# raid5 perl utility
#   Copyright (C) 2005 Mike Hardy [EMAIL PROTECTED]
#
# This script understands the default linux raid5 disk layout,
# and can be used to check parity in an array stripe, or to calculate
# the data that should be present in a chunk with a read error.
#
# Constructive criticism, detailed bug reports, patches, etc gladly accepted!
#
# Thanks to Ashford Computer Consulting Service for their handy RAID
# information:
#http://www.accs.com/p_and_p/RAID/index.html
#
# Thanks also to the various linux kernel hackers that have worked on 'md',
# the header files and source code were quite informative when writing this.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
#
# You should have received a copy of the GNU General Public License
# (for example /usr/src/linux/COPYING); if not, write to the Free
# Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
#

my @array_components = (
    "/dev/loop0",
    "/dev/loop1",
    "/dev/loop2",
    "/dev/loop3",
    "/dev/loop4",
    "/dev/loop5",
    "/dev/loop6",
    "/dev/loop7"
);

my $chunk_size = 64 * 1024; # chunk size is 64K
my $sectors_per_chunk = $chunk_size / 512;


# Problem - I have a bad sector on one disk in an array
my %component = (
    sector => 2032,
    device => "/dev/loop3"
);


# 1) Get the array-related info for that sector
# 2) See if it was the parity disk or not
# 2a) If it was the parity disk, calculate the parity
# 2b) If it was not the parity disk, calculate its value from parity
# 3) Write the data back into the sector

(
 $component{array_chunk},
 $component{chunk_offset}, 
 $component{stripe},
 $component{parity_device}
 ) = getInfoForComponentAddress($component{sector}, $component{device});

foreach my $KEY (keys(%component)) {
    print $KEY . " = " . $component{$KEY} . "\n";
}

# We started with the information on the bad sector, and now we know how it
# fits into the array
# Lets see if we can fix the bad sector with the information at hand

# Build up the list of devices to xor in order to derive our value
my $xor_count = -1;
for (my $i = 0; $i <= $#array_components; $i++) {

# skip ourselves as we roll through
next if ($component{device} eq $array_components[$i]);

# skip the parity chunk as we roll through
next if ($component{parity_device} eq $array_components[$i]);

$xor_devices{++$xor_count} = $array_components[$i];

    print
        "Adding xor device " .
        $array_components[$i] . " as xor device " .
        $xor_count . "\n";
}

# If we are not the parity device, put the parity device at the end
if (!($component{device} eq $component{parity_device})) {

$xor_devices{++$xor_count} = $component{parity_device};

    print
        "Adding parity device " .
        $component{parity_device} . " as xor device " .
        $xor_count . "\n";
}


# pre-calculate the device