Re: assemble vs create an array.......

2008-02-02 Thread Dragos

Hello,
I am not sure if you have received my email from last week with the 
results of the different combinations prescribed (it contained html code).
Anyway, I did a read-only mount to check the partition and was happy to see a 
lot of files intact. A few seemed destroyed, but I am not sure. I tried 
an xfs_check on the partition and it told me:


ERROR: The filesystem has valuable metadata changes in a log which 
needs to be replayed. Mount the filesystem to replay the log, and 
unmount it before re-running xfs_check. If you are unable to mount the 
filesystem, then use the xfs_repair -L option to destroy the log and 
attempt a repair.


Since I am unable to mount the partition, should I use the -L option with 
xfs_repair, or let it run without it?
Again, please let me know if I should resend my previous email with the 
log file of xfs_repair -n.


Thank you for your time,
Dragos


David Chinner wrote:

On Thu, Dec 06, 2007 at 07:39:28PM +0300, Michael Tokarev wrote:
  

What to do is to give xfs_repair a try for each permutation,
but again without letting it actually fix anything.
Just run it in read-only mode and see which combination
of drives gives fewer errors, or no fatal errors (there
may be several similar combinations, with the same order
of drives but with a different drive missing).



Ugggh. 

  

It's sad that xfs refuses to mount when the structure needs
cleaning - the best way here is to actually mount it
and see what it looks like, instead of trying repair
tools. 



It's self-protection - if you try to write to a corrupted filesystem,
you'll only make the corruption worse. Mounting involves log
recovery, which writes to the filesystem.

  

Is there some option to force-mount it still
(in readonly mode, knowing it may oops the kernel, etc.)?



Sure you can: mount -o ro,norecovery dev mtpt

But if you hit corruption it will still shut down on you. If
the machine oopses then that is a bug.

  

thread prompted me to think.  If I can't force-mount it
(or browse it in other ways) as I can almost always
do with (somewhat?) broken ext[23] just to examine things,
maybe I'm trying it before it's mature enough? ;)



Hehe ;)

For maximum uber-XFS-guru points, learn to browse your filesystem
with xfs_db. :P

Cheers,

Dave.
  



Re: assemble vs create an array.......

2007-12-06 Thread Dragos

Thank you.
I want to make sure I understand.

1- Does it matter which permutation of drives I use for xfs_repair (as 
long as it tells me that the Structure needs cleaning)? When it comes to 
linux I consider myself at intermediate level, but I am a beginner when 
it comes to raid and filesystem issues.


2- After I do it, assuming that it worked, how do I reintegrate the 
'missing' drive while keeping my data?


Thank you for your time.
Dragos


David Greaves wrote:

Dragos wrote:
  

Thank you for your very fast answers.

First I tried 'fsck -n' on the existing array. The answer was that if I
wanted to check an XFS partition I should use 'xfs_check'. That seems to
say that my array was formatted with xfs, not reiserfs. Am I correct?

Then I tried the different permutations:
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1
/dev/sdb1
mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 missing
/dev/sdb1
mount /dev/md0 temp
mdadm --stop --scan



[etc]

  

With some arrays mount reported:
  mount: you must specify the filesystem type
and with others:
  mount: Structure needs cleaning

No choice seems to have been successful.



OK, not as good as you could have hoped for.

Make sure you have the latest xfs tools.

You may want to try xfs_repair; you can use the -n option (I think - check
the man page).

You may need to force it to ignore the log.

David



  



Re: assemble vs create an array.......

2007-12-06 Thread Michael Tokarev
[Cc'd to xfs list as it contains something related]

Dragos wrote:
 Thank you.
 I want to make sure I understand.

[Some background for the XFS list.  The talk is about a broken Linux software
raid (the reason for the breakage isn't relevant anymore).  The OP seems to
have lost the order of drives in his array, and now tries to create a new array
on top, trying different combinations of drives.  The filesystem there
WAS XFS.  One point is that linux refuses to mount it, saying
"structure needs cleaning".  This all is mostly md-related, but there
are several XFS-related questions and concerns too.]

 
 1- Does it matter which permutation of drives I use for xfs_repair (as
 long as it tells me that the Structure needs cleaning)? When it comes to
 linux I consider myself at intermediate level, but I am a beginner when
 it comes to raid and filesystem issues.

The permutation DOES MATTER - for all the devices.
Linux, when mounting an fs, only looks at the superblock of the filesystem,
which is usually located at the beginning of the device.

So in each case where Linux actually recognizes the filesystem (instead of
seeing complete garbage), the same device is the first one - i.e., this
way you have found your first device.  The rest may still be out of order.

Raid5 data is laid out like this (with 3 drives for simplicity; it's similar
with more drives):

        DiskA   DiskB   DiskC
Blk0    Data0   Data1   P0
Blk1    P1      Data2   Data3
Blk2    Data4   P2      Data5
Blk3    Data6   Data7   P3
... and so on ...

where your actual data blocks are Data0, Data1, ... DataN,
and PX are parity blocks.

As long as DiskA remains in this position, the beginning of
the array is the Data0 block -- hence linux sees the beginning
of the filesystem and recognizes it.  But you can still switch
DiskB and DiskC, and the rest of the data will be
complete garbage; only the data blocks on DiskA will be in
place.

So you still need to find the order of the other drives
(you have already found your first drive, DiskA).

Note also that if the Data1 block is all zeros (a situation
which is unlikely for a non-empty filesystem), P0 (the first
parity block) will be exactly the same as Data0, because
XORing anything with zeros gives back the same thing
(XOR is the operation used to calculate parity blocks in
RAID5).  So there's still a remote chance you'll see TWO
apparent first disks...

What to do is to give xfs_repair a try for each permutation,
but again without letting it actually fix anything.
Just run it in read-only mode and see which combination
of drives gives fewer errors, or no fatal errors (there
may be several similar combinations, with the same order
of drives but with a different drive missing).
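
(As a rough sketch of that read-only check -- the md device and drive names
here are only examples taken from this thread -- each attempt might look like:)

  mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1 /dev/sdb1
  xfs_repair -n /dev/md0 2>&1 | tee repair-try1.log   # -n: report only, change nothing
  mdadm --stop /dev/md0
  # repeat for each ordering, then compare the logs; the ordering with the
  # fewest (or no fatal) errors is the likely original layout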

It's sad that xfs refuses to mount when the structure needs
cleaning - the best way here is to actually mount it
and see what it looks like, instead of trying repair
tools.  Is there some option to force-mount it still
(in readonly mode, knowing it may oops the kernel, etc.)?

I'm not very familiar with xfs yet - it seems to be
much faster than ext3 for our workload (mostly databases),
and I'm experimenting with it slowly.  But this very
thread prompted me to think.  If I can't force-mount it
(or browse it in other ways) as I can almost always
do with (somewhat?) broken ext[23] just to examine things,
maybe I'm trying it before it's mature enough? ;)  Note
the smile, but note there's a bit of truth in every joke... :)

 2- After I do it, assuming that it worked, how do I reintegrate the
 'missing' drive while keeping my data?

Just add it back -- mdadm --add /dev/mdX /dev/sdYZ.
But don't do that till you actually see your data.
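
(A minimal sketch, with example device names, once the data is confirmed
readable:)

  mdadm --add /dev/md0 /dev/sdc1
  cat /proc/mdstat        # watch the rebuild of the re-added drive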

/mjt


Re: assemble vs create an array.......

2007-12-06 Thread Eric Sandeen
Michael Tokarev wrote:

 It's sad that xfs refuses to mount when the structure needs
 cleaning - the best way here is to actually mount it
 and see what it looks like, instead of trying repair
 tools.  Is there some option to force-mount it still
 (in readonly mode, knowing it may oops the kernel, etc.)?

Depends on what went wrong, but in general that error means that metadata
corruption was encountered which was sufficient for xfs to abort
whatever it was doing.  It's not done lightly; it's likely bailing out
because it had no other choice.

You can't force-mount something which is sufficiently corrupted that
xfs can't understand it anymore...  IOW you can't traverse and read
corrupted/scrambled metadata; no mount option can help you.  :)

If the shutdown were encountered during use, you could maybe avoid the
bad metadata.  If it's during mount that's probably a more fundamental
problem.

The kernel messages when you get the "structure needs cleaning" error would
be a clue as to what it actually hit.
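
(For instance -- exact output will vary -- something like:)

  dmesg | grep -i xfs | tail -n 50   # recent XFS kernel messages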

-Eric


Re: assemble vs create an array.......

2007-12-06 Thread David Chinner
On Thu, Dec 06, 2007 at 07:39:28PM +0300, Michael Tokarev wrote:
 What to do is to give xfs_repair a try for each permutation,
 but again without letting it actually fix anything.
 Just run it in read-only mode and see which combination
 of drives gives fewer errors, or no fatal errors (there
 may be several similar combinations, with the same order
 of drives but with a different drive missing).

Ugggh. 

 It's sad that xfs refuses to mount when the structure needs
 cleaning - the best way here is to actually mount it
 and see what it looks like, instead of trying repair
 tools. 

It's self-protection - if you try to write to a corrupted filesystem,
you'll only make the corruption worse. Mounting involves log
recovery, which writes to the filesystem.

 Is there some option to force-mount it still
 (in readonly mode, knowing it may oops the kernel, etc.)?

Sure you can: mount -o ro,norecovery dev mtpt

But if you hit corruption it will still shut down on you. If
the machine oopses then that is a bug.
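
(A minimal sketch of that, with example device and mount point names:)

  mount -o ro,norecovery /dev/md0 /mnt/recovery
  # browse read-only; the log is not replayed, so recent changes may be missing
  umount /mnt/recovery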

 thread prompted me to think.  If I can't force-mount it
 (or browse it in other ways) as I can almost always
 do with (somewhat?) broken ext[23] just to examine things,
 maybe I'm trying it before it's mature enough? ;)

Hehe ;)

For maximum uber-XFS-guru points, learn to browse your filesystem
with xfs_db. :P
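
(For instance, a read-only xfs_db session to dump the superblock -- the device
name is just an example:)

  xfs_db -r /dev/md0
  xfs_db> sb 0      # select superblock 0
  xfs_db> p         # print its fields
  xfs_db> quit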

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: assemble vs create an array.......

2007-12-05 Thread David Greaves
Dragos wrote:
 Thank you for your very fast answers.
 
 First I tried 'fsck -n' on the existing array. The answer was that if I
 wanted to check an XFS partition I should use 'xfs_check'. That seems to
 say that my array was formatted with xfs, not reiserfs. Am I correct?
 
 Then I tried the different permutations:
 mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1
 /dev/sdb1
 mount /dev/md0 temp
 mdadm --stop --scan
 
 mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 missing
 /dev/sdb1
 mount /dev/md0 temp
 mdadm --stop --scan
 
[etc]

 
 With some arrays mount reported:
   mount: you must specify the filesystem type
 and with others:
   mount: Structure needs cleaning
 
 No choice seems to have been successful.

OK, not as good as you could have hoped for.

Make sure you have the latest xfs tools.

You may want to try xfs_repair; you can use the -n option (I think - check
the man page).

You may need to force it to ignore the log.
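
(Roughly, with an example device name; -L is destructive, so keep it as a
last resort:)

  xfs_repair -n /dev/md0     # dry run: report problems, change nothing
  # xfs_repair -L /dev/md0   # only if the log can't be replayed by mounting; zeroes the log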

David





Re: assemble vs create an array.......

2007-12-04 Thread Dragos

Thank you for your very fast answers.

First I tried 'fsck -n' on the existing array. The answer was that if I 
wanted to check an XFS partition I should use 'xfs_check'. That seems to 
say that my array was formatted with xfs, not reiserfs. Am I correct?


Then I tried the different permutations:
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1 
/dev/sdb1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 missing 
/dev/sdb1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 /dev/sdb1 
missing

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 missing 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 /dev/sdc1 
missing

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdc1 
missing

mount /dev/md0 temp
mdadm --stop --scan

With some arrays mount reported:
  mount: you must specify the filesystem type
and with others:
  mount: Structure needs cleaning

No choice seems to have been successful.
Please let me know of other ideas.

Thank you again,
Dragos

PS: Also, the array was already reporting 'mount: Structure needs 
cleaning' after I had recreated the array.



David Greaves wrote:

Neil Brown wrote:
  

On Thursday November 29, [EMAIL PROTECTED] wrote:

2. Do you know of any way to recover from this mistake? Or at least what 
filesystem it was formatted with.
  

It may not have been lost - yet.


  

If you created the same array with the same devices and layout etc,
the data will still be there, untouched.
Try to assemble the array and use fsck on it.


To be safe I'd use fsck -n (check the man page as this is odd for reiserfs)


  

When you create a RAID5 array, all that is changed is the metadata (at
the end of the device) and one drive is changed to be the xor of all
the others.


In other words, one of your 3 drives has just been erased.
Unless you know the *exact* command you used and have the dmesg output to hand
then we won't know which one.

Now what you need to do is to try all the permutations of creating a degraded
array using 2 of the drives and specify the 3rd as 'missing':

So something like:
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 /dev/sdc1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing /dev/sdc1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdc1 missing
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 /dev/sdd1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing /dev/sdd1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdd1 missing
etc etc

It is important to create the array using a 'missing' device so the xor data
isn't written.

There is a program here: http://linux-raid.osdl.org/index.php/Permute_array.pl
that may help...

David


  



Re: assemble vs create an array.......

2007-11-30 Thread Bryce

Dragos wrote:

Hello,
I had created a raid 5 array on 3 232GB SATA drives. I had created one 
partition (for /home) formatted with either xfs or reiserfs (I do not 
recall).
Last week I reinstalled my box from scratch with Ubuntu 7.10, with 
mdadm v. 2.6.2-1ubuntu2.
Then I made a rookie mistake: I used --create instead of --assemble. The 
recovery completed. I then stopped the array, realizing the mistake.


1. Please make the warning more descriptive: ALL DATA WILL BE LOST, 
when attempting to create an array over an existing one.
2. Do you know of any way to recover from this mistake? Or at least 
what filesystem it was formatted with.


Any help would be greatly appreciated. I have hundreds of family 
digital pictures and videos that are irreplaceable.

Thank you in advance,
Dragos

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Meh...
I do that all the time for testing.
The raid metadata is separate from the FS, in that you can trash it as 
much as you like and the FS it refers to will be fine, as long as you 
don't decide to mkfs over it.
If you've got an old /var/log/messages kicking around from when the raid was 
correct, you should be able to extract the order, e.g.:


RAID5 conf printout:
--- rd:5 wd:5
disk 0, o:1, dev:sdf1
disk 1, o:1, dev:sde1
disk 2, o:1, dev:sdg1
disk 3, o:1, dev:sdc1
disk 4, o:1, dev:sdd1

Unfortunately, there is no point looking at mdadm -E on the participating 
disks, as you've already trashed the information there.

Anyway, from the above, the recreation of the array would be:

mdadm -C -l5 -n5 -c128  /dev/md0 /dev/sdf1 /dev/sde1 /dev/sdg1 /dev/sdc1 
/dev/sdd1
(where -l5 = raid 5, -n5 = number of participating drives and -c128 = 
chunk size of 128K)
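
(After recreating, something like this can confirm the geometry matches the
old printout -- purely illustrative:)

  mdadm --detail /dev/md0    # check level, chunk size and device order
  cat /proc/mdstat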


If you don't have the configuration printout, then you're left with an 
exhaustive brute-force search of the disk combinations. Unfortunately the 
number of possible orderings grows factorially, and going beyond 8 disks is 
a suicidally *bad* idea:


2=2
3=6
4=24
5=120
6=720
7=5040
8=40320

You only have 3 drives so only 6 possible combinations to try (unlike 
myself with 5)


So, just write yourself a small script with all 6 combinations and run 
them through a piece of shell similar to this pseudo-script:


lvchange -an /dev/VolGroup01/LogVol00   # if you use lvm at all (change as appropriate or discard)

mdadm --stop --scan
yes | mdadm -C -l5 -n3 /dev/md0 /dev/sdd1 /dev/sde1 /dev/sdf1   # (replaceable combinations)

lvchange -ay /dev/VolGroup01/LogVol00   # if you use lvm (or discard)
mount /dev/md0 /mnt
# Let's use mount's success return code to indicate we're able to
# mount the FS again, and bail out (man mount)

if [ $? -eq 0 ] ; then
    exit 0
fi





Re: assemble vs create an array.......

2007-11-30 Thread David Greaves
Neil Brown wrote:
 On Thursday November 29, [EMAIL PROTECTED] wrote:
 2. Do you know of any way to recover from this mistake? Or at least what 
 filesystem it was formatted with.
It may not have been lost - yet.


 If you created the same array with the same devices and layout etc,
 the data will still be there, untouched.
 Try to assemble the array and use fsck on it.
To be safe I'd use fsck -n (check the man page as this is odd for reiserfs)


 When you create a RAID5 array, all that is changed is the metadata (at
 the end of the device) and one drive is changed to be the xor of all
 the others.
In other words, one of your 3 drives has just been erased.
Unless you know the *exact* command you used and have the dmesg output to hand
then we won't know which one.

Now what you need to do is to try all the permutations of creating a degraded
array using 2 of the drives and specify the 3rd as 'missing':

So something like:
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 /dev/sdc1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing /dev/sdc1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdc1 missing
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 /dev/sdd1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing /dev/sdd1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdd1 missing
etc etc

It is important to create the array using a 'missing' device so the xor data
isn't written.

There is a program here: http://linux-raid.osdl.org/index.php/Permute_array.pl
that may help...

David




Re: assemble vs create an array.......

2007-11-30 Thread Dragos

I forgot one thing.
After re-creating the array (which is what deleted my data in the first 
place), 'mount' was giving me this answer:

  mount: Structure needs cleaning

Thank you for your time,
Dragos


assemble vs create an array.......

2007-11-29 Thread Dragos

Hello,
I had created a raid 5 array on 3 232GB SATA drives. I had created one 
partition (for /home) formatted with either xfs or reiserfs (I do not 
recall).
Last week I reinstalled my box from scratch with Ubuntu 7.10, with mdadm 
v. 2.6.2-1ubuntu2.
Then I made a rookie mistake: I used --create instead of --assemble. The 
recovery completed. I then stopped the array, realizing the mistake.


1. Please make the warning more descriptive: ALL DATA WILL BE LOST, when 
attempting to create an array over an existing one.
2. Do you know of any way to recover from this mistake? Or at least what 
filesystem it was formatted with.


Any help would be greatly appreciated. I have hundreds of family digital 
pictures and videos that are irreplaceable.

Thank you in advance,
Dragos



Re: assemble vs create an array.......

2007-11-29 Thread Neil Brown
On Thursday November 29, [EMAIL PROTECTED] wrote:
 Hello,
 I had created a raid 5 array on 3 232GB SATA drives. I had created one 
 partition (for /home) formatted with either xfs or reiserfs (I do not 
 recall).
 Last week I reinstalled my box from scratch with Ubuntu 7.10, with mdadm 
 v. 2.6.2-1ubuntu2.
 Then I made a rookie mistake: I used --create instead of --assemble. The 
 recovery completed. I then stopped the array, realizing the mistake.
 
 1. Please make the warning more descriptive: ALL DATA WILL BE LOST, when 
 attempting to create an array over an existing one.

No matter how loud the warning is, people will get it wrong... unless
I make it actually impossible to corrupt data (which may not be
possible) in which case it will inconvenience many more people.

 2. Do you know of any way to recover from this mistake? Or at least what 
 filesystem it was formatted with.

If you created the same array with the same devices and layout etc,
the data will still be there, untouched.
Try to assemble the array and use fsck on it.

When you create a RAID5 array, all that is changed is the metadata (at
the end of the device) and one drive is changed to be the xor of all
the others.
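
(A sketch of that suggestion, with example device names:)

  mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1
  fsck -n /dev/md0           # -n: check only, don't modify (see the fsck man page)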

 
 Any help would be greatly appreciated. I have hundreds of family digital 
 pictures and videos that are irreplaceable.

You have probably heard it before, but RAID is no replacement for
backups. 
My photos are on two separate computers, one with RAID.  And I will
be backing them up to DVD any day now ... really!!   Or maybe next
year, if I remember :-)

NeilBrown