Re: data recovery on raid5

2006-04-22 Thread David Greaves
Sam Hopkins wrote:
 Hello,

 I have a client with a failed raid5 that is in desperate need of the
 data that's on the raid.  The attached file holds the mdadm -E
 superblocks that are hopefully the keys to the puzzle.  Linux-raid
 folks, if you can give any help here it would be much appreciated.
   
snip
 Linux-raid folks, please reply-to-all as we're probably all not on
 the list.
   
If you're going to post messages to public mailing lists (and solicit
help and private cc's!!!) then you should not be using mechanisms like
the one below. Please Google if you don't understand why not.

I've been getting so much junk mail that I'm resorting to
a draconian mechanism to avoid the mail.  In order
to make sure that there's a real person sending mail, I'm
asking you to explicitly enable access.  To do that, send
mail to sah at this domain with the token:
qSGTt
in the subject of your mail message.  After that, you
shouldn't get any bounces from me.  Sorry if this is
an inconvenience.

David



Re: data recovery on raid5

2006-04-22 Thread Molle Bestefich
Sam Hopkins wrote:
 mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]

While it should work, a bit drastic perhaps?
I'd start with mdadm --assemble --force.

With --force, mdadm will pull the event counter of the most-recently
failed drive up to the current value, which should give you a readable
array.
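Something along these lines, I think (a sketch only - the device names are
taken from Sam's post, so double-check them against your setup):

mdadm --assemble --force /dev/md0 /dev/etherd/e0.0 /dev/etherd/e0.2 /dev/etherd/e0.3
cat /proc/mdstat    # expect md0 active but degraded, e.g. [4/3]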

After that, you could try running a check by echo'ing check into
sync_action.
If the check succeeds, fine, hotadd the last drive to your array and
MD will start resync'ing.
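Roughly like this (assuming your kernel exposes the md sysfs check interface;
the drive name for the hot-add is a guess from the logs, so substitute
whichever drive you left out):

echo check > /sys/block/md0/md/sync_action    # start a read-only parity check
cat /proc/mdstat                              # watch its progress
mdadm /dev/md0 --add /dev/etherd/e0.4         # hot-add the last drive; MD resyncs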

If the check fails because of a bad block, you'll have to make a decision.
Live with the lost blocks, or try and reconstruct from the first kicked disk.

I posted a patch this week that will allow you to forcefully get the
array started with all of the disks - but beware, MD wasn't made with
this in mind and will probably be confused and sometimes pick data
from the first-kicked drive over data from the other drives.  Only
forcefully start the array with all drives if you absolutely have
to...

Oh, and I'm not an expert by any means, so take everything I say with
a grain of salt :-).


Re: data recovery on raid5

2006-04-22 Thread Jonathan
Having a raid fail on Friday evening is pretty bad timing - not that there is 
perhaps any good time for such a thing.  I'm the sys-admin for the 
machine in question (apologies for starting a new thread rather than 
replying - I just subscribed to the list).


From my reading, it seems like maybe:

mdadm --assemble /dev/md0 --uuid=8fe1fe85:eeb90460:c525faab:cdaab792 
/dev/etherd/e0.[01234]


would be a thing to try?

Frankly, I'm terrified that I'll screw this up - I'm not too savvy with raid.

Following is a record of the only thing that I've done so far:

Please note that /dev/md1 is composed of 5 additional drives which share 
the same hardware as the failed /dev/md0, but are in no other way related.


We're seriously considering sending the drives to a data recovery place 
and spending a bazillion bucks to recover the data.  If anyone reading 
this feels confident that they can help us rebuild this array and get us 
to a place where we can copy the data off of it, please send mail to 
[EMAIL PROTECTED].  We'll be happy to pay you for your services.  I'll 
post a summary of what we did when all is done.


help, please.

Comparing the superblocks below with those posted yesterday, you can see 
that things have changed.  I'm pulling my hair out - I hope I didn't bork 
our data.


-- Jonathan

hazel /tmp # df -H
Filesystem Size   Used  Avail Use% Mounted on
/dev/hda4   67G   5.8G58G  10% /
udev   526M   177k   526M   1% /dev
/dev/hda3  8.1G34M   7.7G   1% /tmp
none   526M  0   526M   0% /dev/shm
/dev/md1   591G34M   561G   1% /md1
hazel /tmp # mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]
mdadm: /dev/etherd/e0.0 appears to be part of a raid array:
   level=5 devices=4 ctime=Mon Jan  3 03:16:48 2005
mdadm: /dev/etherd/e0.2 appears to be part of a raid array:
   level=5 devices=4 ctime=Mon Jan  3 03:16:48 2005
mdadm: /dev/etherd/e0.3 appears to contain an ext2fs file system
   size=720300416K  mtime=Wed Oct  5 16:39:28 2005
mdadm: /dev/etherd/e0.3 appears to be part of a raid array:
   level=5 devices=4 ctime=Mon Jan  3 03:16:48 2005
Continue creating array? y
mdadm: array /dev/md0 started.
hazel /tmp # aoe-stat
   e0.0eth1  up
   e0.1eth1  up
   e0.2eth1  up
   e0.3eth1  up
   e0.4eth1  up
   e0.5eth1  up
   e0.6eth1  up
   e0.7eth1  up
   e0.8eth1  up
   e0.9eth1  up
hazel /tmp # cat /proc/mdstat
Personalities : [raid5]
md1 : active raid5 etherd/e0.9[4] etherd/e0.8[3] etherd/e0.7[2] 
etherd/e0.6[1] etherd/e0.5[0]

 586082688 blocks level 5, 32k chunk, algorithm 0 [4/4] []

md0 : active raid5 etherd/e0.3[3] etherd/e0.2[2] etherd/e0.0[1]
 586082688 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]

unused devices: <none>
hazel /tmp # mkdir /md0
hazel /tmp # mount -r /dev/md0 /md0
mount: wrong fs type, bad option, bad superblock on /dev/md0,
  or too many mounted file systems
hazel /tmp # mount -t ext2 -r /dev/md0 /md0
mount: wrong fs type, bad option, bad superblock on /dev/md0,
  or too many mounted file systems
hazel /tmp # mdadm -S /dev/md0
hazel /tmp # aoe-stat
   e0.0eth1  up
   e0.1eth1  up
   e0.2eth1  up
   e0.3eth1  up
   e0.4eth1  up
   e0.5eth1  up
   e0.6eth1  up
   e0.7eth1  up
   e0.8eth1  up
   e0.9eth1  up
hazel /tmp # cat /proc/mdstat
Personalities : [raid5]
md1 : active raid5 etherd/e0.9[4] etherd/e0.8[3] etherd/e0.7[2] 
etherd/e0.6[1] etherd/e0.5[0]

 586082688 blocks level 5, 32k chunk, algorithm 0 [4/4] []

unused devices: <none>
hazel /tmp # mdadm -E /dev/etherd/e0.[01234]
/dev/etherd/e0.0:
 Magic : a92b4efc
   Version : 00.90.02
  UUID : ec0bdbb3:f625880f:dbf65130:057d069c
 Creation Time : Fri Apr 21 22:56:18 2006
Raid Level : raid5
   Device Size : 195360896 (186.31 GiB 200.05 GB)
  Raid Devices : 4
 Total Devices : 3
Preferred Minor : 0

   Update Time : Fri Apr 21 22:56:18 2006
 State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
 Spare Devices : 0
  Checksum : 1742f65 - correct
Events : 0.3493634

Layout : left-symmetric
Chunk Size : 64K

 Number   Major   Minor   RaidDevice State
this 1 15201  active sync   /dev/etherd/e0.0

  0 0   000  removed
  1 1 15201  active sync   /dev/etherd/e0.0
  2 2 152   322  active sync   /dev/etherd/e0.2
  3 3 152   483  active sync   

Re: data recovery on raid5

2006-04-22 Thread Molle Bestefich
Jonathan wrote:
 # mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]

I think you should have tried mdadm --assemble --force first, as I
proposed earlier.

By doing the above, you have effectively replaced your version 0.9.0
superblocks with version 0.9.2.  I don't know if version 0.9.2
superblocks are larger than 0.9.0, Neil hasn't responded to that yet. 
Potentially hazardous, who knows.

Anyway.
This is from your old superblock as described by Sam Hopkins:

 /dev/etherd/blah:
  Chunk Size : 32K

This is from what you've just posted:
 /dev/etherd/blah:
  Chunk Size : 64K

If I were you, I'd recreate your superblocks now, but with the correct
chunk size (use -c).

 We'll be happy to pay you for your services.

I'll be modest and charge you a penny per byte of data recovered, ho hum.


Re: data recovery on raid5

2006-04-22 Thread Molle Bestefich
Jonathan wrote:
 I was already terrified of screwing things up
 now I'm afraid of making things worse

Adrenalin... makes life worth living there for a sec, doesn't it ;o)

 based on what was posted before is this a sensible thing to try?
 mdadm -C /dev/md0 -c 32 -n 4 -l 5 missing /dev/etherd/e0.[023]

Yes, looks exactly right.

 Is what I've done to the superblock size recoverable?

I don't think you've done anything at all.
I just *don't know* if you have, that's all.

Was just trying to say that it wasn't super-cautious of you to begin
with, that's all :-).

 I don't understand how mdadm --assemble would know what to do,
 which is why I didn't try it initially.

By giving it --force, you tell it to forcefully mount the array even
though it might be damaged.
That means including some disks (the freshest ones) that are out of sync.

That help?


Re: data recovery on raid5

2006-04-22 Thread Jonathan


Well, the block sizes are back to 32k now, but I still had no luck 
mounting /dev/md0 once I created the array.  Below is a record of what I 
just tried:


how safe should the following be?

mdadm --assemble /dev/md0 --uuid=8fe1fe85:eeb90460:c525faab:cdaab792 
/dev/etherd/e0.[01234]


I am *really* not interested in making my situation worse.

-- Jonathan

hazel /virtual # mdadm -C /dev/md0 -c 32 -n 4 -l 5 missing 
/dev/etherd/e0.[023]

mdadm: /dev/etherd/e0.0 appears to be part of a raid array:
   level=5 devices=4 ctime=Fri Apr 21 22:56:18 2006
mdadm: /dev/etherd/e0.2 appears to be part of a raid array:
   level=5 devices=4 ctime=Fri Apr 21 22:56:18 2006
mdadm: /dev/etherd/e0.3 appears to contain an ext2fs file system
   size=720300416K  mtime=Wed Oct  5 16:39:28 2005
mdadm: /dev/etherd/e0.3 appears to be part of a raid array:
   level=5 devices=4 ctime=Fri Apr 21 22:56:18 2006
Continue creating array? y
mdadm: array /dev/md0 started.
hazel /virtual # cat /proc/mdstat
Personalities : [raid5]
md1 : active raid5 etherd/e0.9[4] etherd/e0.8[3] etherd/e0.7[2] 
etherd/e0.6[1] etherd/e0.5[0]

 586082688 blocks level 5, 32k chunk, algorithm 0 [4/4] []

md0 : active raid5 etherd/e0.3[3] etherd/e0.2[2] etherd/e0.0[1]
 586082688 blocks level 5, 32k chunk, algorithm 2 [4/3] [_UUU]

unused devices: <none>
hazel /virtual # mount -t ext2 -r /dev/md0 /md0
mount: wrong fs type, bad option, bad superblock on /dev/md0,
  or too many mounted file systems
hazel /virtual # mdadm -S /dev/md0
hazel /virtual # cat /proc/mdstat
Personalities : [raid5]
md1 : active raid5 etherd/e0.9[4] etherd/e0.8[3] etherd/e0.7[2] 
etherd/e0.6[1] etherd/e0.5[0]

 586082688 blocks level 5, 32k chunk, algorithm 0 [4/4] []

unused devices: <none>
hazel /virtual # mdadm -E /dev/etherd/e0.[01234]
/dev/etherd/e0.0:
 Magic : a92b4efc
   Version : 00.90.02
  UUID : 518b5d59:44292ca3:6c358813:c6f00804
 Creation Time : Sat Apr 22 13:25:40 2006
Raid Level : raid5
   Device Size : 195360896 (186.31 GiB 200.05 GB)
  Raid Devices : 4
 Total Devices : 3
Preferred Minor : 0

   Update Time : Sat Apr 22 13:25:40 2006
 State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
 Spare Devices : 0
  Checksum : 6aaa56f - correct
Events : 0.3493635

Layout : left-symmetric
Chunk Size : 32K

 Number   Major   Minor   RaidDevice State
this 1 15201  active sync   /dev/etherd/e0.0

  0 0   000  removed
  1 1 15201  active sync   /dev/etherd/e0.0
  2 2 152   322  active sync   /dev/etherd/e0.2
  3 3 152   483  active sync   /dev/etherd/e0.3
/dev/etherd/e0.1:
 Magic : a92b4efc
   Version : 00.90.00
  UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
 Creation Time : Mon Jan  3 03:16:48 2005
Raid Level : raid5
   Device Size : 195360896 (186.31 GiB 200.05 GB)
  Raid Devices : 4
 Total Devices : 5
Preferred Minor : 0

   Update Time : Fri Apr 21 14:03:12 2006
 State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 3
 Spare Devices : 1
  Checksum : 4cc991d7 - correct
Events : 0.3493633

Layout : left-asymmetric
Chunk Size : 32K

 Number   Major   Minor   RaidDevice State
this 4 152   164  spare   /dev/etherd/e0.1

  0 0   000  removed
  1 1   001  faulty removed
  2 2 152   322  active sync   /dev/etherd/e0.2
  3 3 152   483  active sync   /dev/etherd/e0.3
  4 4 152   164  spare   /dev/etherd/e0.1
/dev/etherd/e0.2:
 Magic : a92b4efc
   Version : 00.90.02
  UUID : 518b5d59:44292ca3:6c358813:c6f00804
 Creation Time : Sat Apr 22 13:25:40 2006
Raid Level : raid5
   Device Size : 195360896 (186.31 GiB 200.05 GB)
  Raid Devices : 4
 Total Devices : 3
Preferred Minor : 0

   Update Time : Sat Apr 22 13:25:40 2006
 State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
 Spare Devices : 0
  Checksum : 6aaa591 - correct
Events : 0.3493635

Layout : left-symmetric
Chunk Size : 32K

 Number   Major   Minor   RaidDevice State
this 2 152   322  active sync   /dev/etherd/e0.2

  0 0   000  removed
  1 1 15201  active sync   /dev/etherd/e0.0
  2 2 152   322  active sync   /dev/etherd/e0.2
  3 3 152   483  active sync   /dev/etherd/e0.3
/dev/etherd/e0.3:
 Magic : a92b4efc
   Version : 00.90.02
  UUID : 518b5d59:44292ca3:6c358813:c6f00804
 Creation Time : Sat Apr 22 13:25:40 2006
Raid Level : raid5
   Device Size : 195360896 (186.31 GiB 200.05 GB)
  Raid Devices : 4
 Total Devices : 3
Preferred Minor : 0

   Update Time : Sat Apr 

Re: data recovery on raid5

2006-04-22 Thread Molle Bestefich
Jonathan wrote:
 Well, the block sizes are back to 32k now, but I still had no luck
 mounting /dev/md0 once I created the array.

Ahem, I missed something.
Sorry, the 'a' was hard to spot.

Your array used layout : left-asymmetric, while the superblock you've
just created has layout: left-symmetric.

Try again, but add the option --parity=left-asymmetric


Re: data recovery on raid5

2006-04-22 Thread Molle Bestefich
Jonathan wrote:
 how safe should the following be?

 mdadm --assemble /dev/md0 --uuid=8fe1fe85:eeb90460:c525faab:cdaab792
 /dev/etherd/e0.[01234]

You can hardly do --assemble anymore.
After you have recreated superblocks on some of the devices, those are
conceptually part of a different raid array.  At least as seen by MD.

 I am *really* not interested in making my situation worse.

We'll keep going till you've got your data back.
Recreating superblocks again on e0.{0,2,3} can't hurt, since you've
already done this and thereby nuked the old superblocks.

You can shake your own hand and thank yourself now (oh, and Sam too)
for posting all the debug output you have.  Otherwise we would
probably never have spotted nor known about the parity/chunk size
differences :o).


Re: data recovery on raid5

2006-04-22 Thread Jonathan
hazel /virtual # mdadm -C /dev/md0 -c 32 -n 4 -l 5 
--parity=left-asymmetric missing /dev/etherd/e0.[023]

mdadm: /dev/etherd/e0.0 appears to be part of a raid array:
   level=5 devices=4 ctime=Sat Apr 22 13:25:40 2006
mdadm: /dev/etherd/e0.2 appears to be part of a raid array:
   level=5 devices=4 ctime=Sat Apr 22 13:25:40 2006
mdadm: /dev/etherd/e0.3 appears to contain an ext2fs file system
   size=720300416K  mtime=Wed Oct  5 16:39:28 2005
mdadm: /dev/etherd/e0.3 appears to be part of a raid array:
   level=5 devices=4 ctime=Sat Apr 22 13:25:40 2006
Continue creating array? y
mdadm: array /dev/md0 started.
hazel /virtual # mount -t ext2 -r /dev/md0 /md0
hazel /virtual # df -H
Filesystem Size   Used  Avail Use% Mounted on
/dev/hda4   67G   5.8G58G  10% /
udev   526M   177k   526M   1% /dev
/dev/hda3  8.1G34M   7.7G   1% /tmp
none   526M  0   526M   0% /dev/shm
/dev/md1   591G11G   551G   2% /virtual
/dev/md0   591G54G   507G  10% /md0

now I'm doing a:

(cd /md0 && tar cf - . ) | (cd /virtual/recover/ && tar xvfp -)

thank you thank you thank you thank you thank you thank you




Re: data recovery on raid5

2006-04-22 Thread Mike Tran

Sam Hopkins wrote:


Hello,

I have a client with a failed raid5 that is in desperate need of the
data that's on the raid.  The attached file holds the mdadm -E
superblocks that are hopefully the keys to the puzzle.  Linux-raid
folks, if you can give any help here it would be much appreciated.

# mdadm -V
mdadm - v1.7.0 - 11 August 2004
# uname -a
Linux hazel 2.6.13-gentoo-r5 #1 SMP Sat Jan 21 13:24:15 PST 2006 i686 Intel(R) 
Pentium(R) 4 CPU 2.40GHz GenuineIntel GNU/Linux

Here's my take:

Logfiles show that last night drive /dev/etherd/e0.4 failed and around
noon today /dev/etherd/e0.0 failed.  This jibes with the superblock
dates and info.

My assessment is that since the last known good configuration was
0 missing
1 /dev/etherd/e0.0
2 /dev/etherd/e0.2
3 /dev/etherd/e0.3

then we should shoot for this.  I couldn't figure out how to get there
using mdadm -A since /dev/etherd/e0.0 isn't in sync with e0.2 or e0.3.
If anyone can suggest a way to get this back using -A, please chime in.

The alternative is to recreate the array with this configuration hoping
the data blocks will all line up properly so the filesystem can be mounted
and data retrieved.  It looks like the following command is the right
way to do this, but not being an expert I (and the client) would like
someone else to verify the sanity of this approach.

Will

mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]

do what we want?

Linux-raid folks, please reply-to-all as we're probably all not on
the list.

 

Yes, I would re-create the array with 1 missing disk.  Mount read-only and 
verify your data.  If things are ok, remount read-write and remember to 
add a new disk to fix the degraded array.


With the missing keyword there is no resync/recovery, so the data on disk 
will be left intact.
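
In other words, something along the lines of (a sketch; chunk size and 
parity layout are taken from the original superblocks, and e0.4 stands in 
for whatever disk you add back afterwards):

mdadm -C /dev/md0 -c 32 -l 5 -n 4 --parity=left-asymmetric missing /dev/etherd/e0.[023]
mount -t ext2 -o ro /dev/md0 /md0        # read-only first, verify the data
mount -o remount,rw /dev/md0 /md0        # only once you trust what you see
mdadm /dev/md0 --add /dev/etherd/e0.4    # new disk; MD rebuilds the degraded array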


--
Regards,
Mike T.



Re: data recovery on raid5

2006-04-22 Thread Christian Pedaschus
Nice to hear you got your data back.
Now it's perhaps a good time to donate some money to some
people/OSS projects for saving your ass ;) ;)

greets, chris





Re: data recovery on raid5

2006-04-22 Thread David Greaves
Molle Bestefich wrote:
 Anyway, a quick cheat sheet might come in handy:
   
Which is why I posted about a wiki a few days back :)

I'm progressing it and I'll see if we can't get something up.

There's a lot of info on the list and it would be nice to get it a
little more focused...

David

-- 



Re: data recovery on raid5

2006-04-22 Thread Neil Brown
On Saturday April 22, [EMAIL PROTECTED] wrote:
 Jonathan wrote:
  # mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]
 
 I think you should have tried mdadm --assemble --force first, as I
 proposed earlier.
 
 By doing the above, you have effectively replaced your version 0.9.0
 superblocks with version 0.9.2.  I don't know if version 0.9.2
 superblocks are larger than 0.9.0, Neil hasn't responded to that yet. 
 Potentially hazardous, who knows.

There is no difference in the superblock between 0.90.0 and 0.90.2.

md has always used version numbers, but always in a confusing way.
There should be two completely separate version numbers: the version
for the format of the superblock, and the version for the software
implementation.  md confuses these two.

To try to sort it out, I have decided that:
 - The 'major' version number is the overall choice of superblock
This is currently 0 or 1
 - The 'minor' version encodes minor variation in the superblock.
For version 1, this is different locations (there are other bits
in the superblock to allow new fields to be added)
For version 0, it is currently only used to make sure old software
  doesn't try to assemble an array which is undergoing a
  reshape, as that would confuse it totally.

 - The 'patchlevel' is used to indicate feature availability in
   the implementation.  It really shouldn't be stored in the superblock,
   but it is for historical reasons.  It is not checked when
   validating a superblock.
  To quote from md.h

/*
 * MD_PATCHLEVEL_VERSION indicates kernel functionality.
 * >=1 means different superblock formats are selectable using SET_ARRAY_INFO
 *     and major_version/minor_version accordingly
 * >=2 means that Internal bitmaps are supported by setting MD_SB_BITMAP_PRESENT
 *     in the super status byte
 * >=3 means that bitmap superblock version 4 is supported, which uses
 *     little-endian representation rather than host-endian
 */

Hope that helps.

NeilBrown


Re: data recovery on raid5

2006-04-21 Thread Mike Hardy

Recreate the array from the constituent drives in the order you mention,
with 'missing' in place of the first drive that failed?

It won't resync because it has a missing drive.

If you created it correctly, the data will be there.

If you didn't create it correctly, you can keep trying permutations of
4-disk arrays with one missing until you see your data, and you should
find it.
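
If it comes to that, a brute-force pass could look something like the
following (hypothetical and untested - every -C rewrites the superblocks,
so keep notes of what you try; chunk size and parity are taken from the
superblocks quoted below):

for order in "e0.0 e0.2 e0.3" "e0.0 e0.3 e0.2" "e0.2 e0.0 e0.3" \
             "e0.2 e0.3 e0.0" "e0.3 e0.0 e0.2" "e0.3 e0.2 e0.0"
do
    mdadm -S /dev/md0 2>/dev/null
    devs=""
    for d in $order; do devs="$devs /dev/etherd/$d"; done
    echo y | mdadm -C /dev/md0 -c 32 -l 5 -n 4 --parity=left-asymmetric missing $devs
    # a clean read-only fsck is a reasonable hint that the ordering is right
    fsck.ext2 -n /dev/md0 > /dev/null 2>&1 && { echo "candidate: missing $order"; break; }
done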

-Mike

Sam Hopkins wrote:
 Hello,
 
 I have a client with a failed raid5 that is in desperate need of the
 data that's on the raid.  The attached file holds the mdadm -E
 superblocks that are hopefully the keys to the puzzle.  Linux-raid
 folks, if you can give any help here it would be much appreciated.
 
 # mdadm -V
 mdadm - v1.7.0 - 11 August 2004
 # uname -a
 Linux hazel 2.6.13-gentoo-r5 #1 SMP Sat Jan 21 13:24:15 PST 2006 i686 
 Intel(R) Pentium(R) 4 CPU 2.40GHz GenuineIntel GNU/Linux
 
 Here's my take:
 
 Logfiles show that last night drive /dev/etherd/e0.4 failed and around
 noon today /dev/etherd/e0.0 failed.  This jibes with the superblock
 dates and info.
 
 My assessment is that since the last known good configuration was
 0 missing
 1 /dev/etherd/e0.0
 2 /dev/etherd/e0.2
 3 /dev/etherd/e0.3
 
 then we should shoot for this.  I couldn't figure out how to get there
 using mdadm -A since /dev/etherd/e0.0 isn't in sync with e0.2 or e0.3.
 If anyone can suggest a way to get this back using -A, please chime in.
 
 The alternative is to recreate the array with this configuration hoping
 the data blocks will all line up properly so the filesystem can be mounted
 and data retrieved.  It looks like the following command is the right
 way to do this, but not being an expert I (and the client) would like
 someone else to verify the sanity of this approach.
 
 Will
 
 mdadm -C /dev/md0 -n 4 -l 5 missing /dev/etherd/e0.[023]
 
 do what we want?
 
 Linux-raid folks, please reply-to-all as we're probably all not on
 the list.
 
 Thanks for your help,
 
 Sam
 
 
 
 
 /dev/etherd/e0.0:
   Magic : a92b4efc
 Version : 00.90.00
UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
   Creation Time : Mon Jan  3 03:16:48 2005
  Raid Level : raid5
 Device Size : 195360896 (186.31 GiB 200.05 GB)
Raid Devices : 4
   Total Devices : 5
 Preferred Minor : 0
 
 Update Time : Fri Apr 21 12:45:07 2006
   State : clean
  Active Devices : 3
 Working Devices : 4
  Failed Devices : 1
   Spare Devices : 1
Checksum : 4cc955da - correct
  Events : 0.3488315
 
  Layout : left-asymmetric
  Chunk Size : 32K
 
   Number   Major   Minor   RaidDevice State
 this 1 15201  active sync   /dev/etherd/e0.0
 
0 0   000  removed
1 1 15201  active sync   /dev/etherd/e0.0
2 2 152   322  active sync   /dev/etherd/e0.2
3 3 152   483  active sync   /dev/etherd/e0.3
4 4 152   160  spare   /dev/etherd/e0.1
 /dev/etherd/e0.2:
   Magic : a92b4efc
 Version : 00.90.00
UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
   Creation Time : Mon Jan  3 03:16:48 2005
  Raid Level : raid5
 Device Size : 195360896 (186.31 GiB 200.05 GB)
Raid Devices : 4
   Total Devices : 5
 Preferred Minor : 0
 
 Update Time : Fri Apr 21 14:03:12 2006
   State : clean
  Active Devices : 2
 Working Devices : 3
  Failed Devices : 3
   Spare Devices : 1
Checksum : 4cc991e9 - correct
  Events : 0.3493633
 
  Layout : left-asymmetric
  Chunk Size : 32K
 
   Number   Major   Minor   RaidDevice State
 this 2 152   322  active sync   /dev/etherd/e0.2
 
0 0   000  removed
1 1   001  faulty removed
2 2 152   322  active sync   /dev/etherd/e0.2
3 3 152   483  active sync   /dev/etherd/e0.3
4 4 152   164  spare   /dev/etherd/e0.1
 /dev/etherd/e0.3:
   Magic : a92b4efc
 Version : 00.90.00
UUID : 8fe1fe85:eeb90460:c525faab:cdaab792
   Creation Time : Mon Jan  3 03:16:48 2005
  Raid Level : raid5
 Device Size : 195360896 (186.31 GiB 200.05 GB)
Raid Devices : 4
   Total Devices : 5
 Preferred Minor : 0
 
 Update Time : Fri Apr 21 14:03:12 2006
   State : clean
  Active Devices : 2
 Working Devices : 3
  Failed Devices : 3
   Spare Devices : 1
Checksum : 4cc991fb - correct
  Events : 0.3493633
 
  Layout : left-asymmetric
  Chunk Size : 32K
 
   Number   Major   Minor   RaidDevice State
 this 3 152   483  active sync   /dev/etherd/e0.3
 
0 0   000  removed
1 1   001  faulty removed
2