----- Message from [EMAIL PROTECTED] ---------
Date: Fri, 4 Jan 2008 09:37:24 +1100
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
Subject: Re: PROBLEM: RAID5 reshape data corruption
To: Nagilum <[EMAIL PROTECTED]>
Cc: [email protected], Dan Williams
<[EMAIL PROTECTED]>, "H. Peter Anvin" <[EMAIL PROTECTED]>I'm not just interested in a simple behaviour fix I'm also interested in what actually happens and if possible a repair program for that kind of data corruption.What happens is that when reshape happens while a device is missing, the data on that device should be computed from the other data devices and parity. However because of the above bug, the data is copied into the new layout before the compute is complete. This means that the data that was on that device is really lost beyond recovery. I'm really sorry about that, but there is nothing that can be done to recover the lost data.
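(For reference, the reconstruction Neil describes is plain XOR across the stripe: a missing device's chunk is the XOR of all surviving chunks, data and parity, in the same stripe. A minimal Python sketch with made-up chunk contents:)

```python
# Minimal illustration of RAID5 reconstruction: a missing device's
# chunk is the XOR of the other chunks (data and parity) in the same
# stripe. The chunk contents here are made up for the example.
def xor_chunks(chunks):
    """XOR a list of equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

# A 4+1 stripe: four data chunks and their parity.
data = [b"\x11" * 4, b"\x22" * 4, b"\x33" * 4, b"\x44" * 4]
parity = xor_chunks(data)

# If the device holding data[2] goes missing, its contents can be
# recomputed from the surviving data chunks plus parity:
recovered = xor_chunks([data[0], data[1], data[3], parity])
assert recovered == data[2]
```

This is the compute that, per the bug above, did not finish before the chunks were copied into the new layout.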
Thanks a lot, Neil! I can confirm your findings: the data in the chunks is the data from the broken device. Now to my particular case:
I still have the old disk and I haven't touched the array since. I just ran a dd_rescue -r (reverse) on the old disk, and as I expected most of it (>99%) is still readable. So what I want to do is read the chunks from that disk - starting at the end and working down to the 4% point where the reshape was interrupted by the disk read error - and replace the corresponding chunks on md0.
That should restore most of the data. Now, in order to do so, I need to know how to calculate the different positions of the chunks.
So for the old disk I have:
nas:~# mdadm -E /dev/sdg
/dev/sdg:
Magic : a92b4efc
Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
Creation Time : Sat Sep 15 21:11:41 2007
Raid Level : raid5
Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
Array Size : 2441543360 (2328.44 GiB 2500.14 GB)
Raid Devices : 6
Total Devices : 7
Preferred Minor : 0
Reshape pos'n : 118360960 (112.88 GiB 121.20 GB)
Delta Devices : 1 (5->6)
Update Time : Fri Nov 23 20:05:50 2007
State : active
Active Devices : 6
Working Devices : 7
Failed Devices : 0
Spare Devices : 1
Checksum : 9a8358c4 - correct
Events : 0.677965
Layout : left-symmetric
Chunk Size : 16K
Number Major Minor RaidDevice State
this 3 8 96 3 active sync /dev/sdg
0 0 8 0 0 active sync /dev/sda
1 1 8 16 1 active sync /dev/sdb
2 2 8 32 2 active sync /dev/sdc
3 3 8 96 3 active sync /dev/sdg
4 4 8 64 4 active sync /dev/sde
5 5 8 80 5 active sync /dev/sdf
6 6 8 48 6 spare /dev/sdd
The current array is:
nas:~# mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Sat Sep 15 21:11:41 2007
Raid Level : raid5
Array Size : 2441543360 (2328.44 GiB 2500.14 GB)
Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
Raid Devices : 6
Total Devices : 6
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Jan 5 17:53:54 2008
State : clean
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 16K
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
Events : 0.986918
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 8 16 1 active sync /dev/sdb
2 8 32 2 active sync /dev/sdc
3 8 48 3 active sync /dev/sdd
4 8 64 4 active sync /dev/sde
5 8 80 5 active sync /dev/sdf
At the moment I'm thinking about writing a small perl program that
will generate me a shell script or makefile containing dd commands
that will copy the chunks from the drive to /dev/md0. I don't care if
that will be dog slow as long as I get most of my data back. (I'd
probably go forward instead of backward to take advantage of the
readahead, after I've determined the exact start chunk.)
For that I need to know one more thing. Used Dev Size is 488308672k for md0 as well as for the disk, with a 16k chunk size: 488308672/16 = 30519292 chunks per device. So the first dd would look like:

dd if=/dev/sdg of=/dev/md0 bs=16k count=1 skip=30519291 seek=X

The big question now is how to calculate X. Since I have a working testcase I can do a lot of testing before touching the real thing. The formula for X will probably contain a 5 for the 5(+1) devices the raid spans now, a 4 for the 4(+1) devices the raid spanned before the reshape, a 3 for the device number of the disk that failed, and of course the skip/current chunk number.
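Here is a rough Python sketch of the mapping as I understand it, assuming the left-symmetric layout mdadm reports (parity in stripe s sits on device (N-1) - (s % N), data starts on the next device and wraps). The rotation formula is my own reading of the layout and needs to be verified against the testcase before touching the real array:

```python
# Sketch of the chunk mapping for the restore, assuming standard
# left-symmetric RAID5 layout. Device counts and numbers are taken
# from the mdadm output above; verify against the testcase first.
OLD_DEVS = 5          # devices before the reshape (4 data + 1 parity)
FAILED_DEV = 3        # RaidDevice number of the failed disk (sdg)
CHUNKS_PER_DEV = 488308672 // 16   # 30519292 16k chunks per device

def old_logical_chunk(c):
    """Map physical chunk c on the failed device to its logical chunk
    number in the old 5-device array, or None if c held parity."""
    parity_dev = (OLD_DEVS - 1) - (c % OLD_DEVS)
    if parity_dev == FAILED_DEV:
        return None                    # parity chunk, nothing to restore
    d = (FAILED_DEV - parity_dev - 1) % OLD_DEVS
    return c * (OLD_DEVS - 1) + d

def dd_command(c):
    """Emit the dd command restoring physical chunk c, if any.

    Writing through /dev/md0 means seek=X is simply the logical chunk
    number: md recomputes parity in the new layout by itself."""
    x = old_logical_chunk(c)
    if x is None:
        return None
    return f"dd if=/dev/sdg of=/dev/md0 bs=16k count=1 skip={c} seek={x}"

# Example: the next-to-last chunk on the disk (the very last one turns
# out to hold parity in this layout and would be skipped).
print(dd_command(CHUNKS_PER_DEV - 2))
```

If this is right, X never needs the new device count at all, because writing through /dev/md0 addresses logical chunks directly and md handles the new 6-device layout and parity itself.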
Can you help me come up with it? Thanks again for looking into the whole issue.

Alex.