Re: [Lustre-discuss] Completely lost MGT/MDT

2013-07-08 Thread Венедикт Федотов
On 06/29/2013 12:24 AM, Dilger, Andreas wrote:
 On 2013/28/06 3:25 PM, Andrus, Brian Contractor bdand...@nps.edu wrote:
 
 Basically, I was adding capacity to a system while doing a fresh install.
 Turns out /dev/sda which used to be the disk in the bottom slot became
 the disk in the top slot instead.
 That happened to be where the MDT was, which was promptly repartitioned
 and formatted.

 Not exactly something I was expecting
 
 Presumably you have no backups or snapshots of the MDT device?  Lustre can
 handle a lot of inconsistency between the MDT and OSTs, even without
 running lfsck.
 
 Also, there was once a similar situation with a reformatted MDT that was
 partly recovered using the ext3grep utility.  This allowed finding the
 filename-inode mappings in the dirents in directory leaf blocks, and the
 .. dirent allowed connecting the parent directories.  In Lustre 2.x, the
 link xattr on the MDT inodes could also be used to recover the filenames
 even if the directory entries are lost.

I guess you are referring to the recovery I did at DDN (*)? Actually,
ext3grep didn't do what we needed, so I wrote our own tool. In the
meantime Kit did another recovery, further improved the tools and uploaded
them to

http://code.google.com/p/decode-ost-attr/
http://code.google.com/p/mdt-recovery/

Cheers,
Венедикт


PS: (*) Sorry, for certain reasons I'm writing under an alias.
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Completely lost MGT/MDT

2013-06-28 Thread Andrus, Brian Contractor
Basically, I was adding capacity to a system while doing a fresh install. Turns 
out /dev/sda which used to be the disk in the bottom slot became the disk in 
the top slot instead.
That happened to be where the MDT was, which was promptly repartitioned and 
formatted.

Not exactly something I was expecting


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238




 -Original Message-
 From: Colin Faber [mailto:colin_fa...@xyratex.com]
 Sent: Wednesday, June 26, 2013 5:08 PM
 To: Andrus, Brian Contractor
 Cc: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Completely lost MGT/MDT
 
 Can you describe the failure in more detail?
 
 Andrus, Brian Contractor bdand...@nps.edu wrote:
 
 All,
 
 We have a sizeable filesystem and during a hardware upgrade, our MDT
 disk was completely lost.
 I am trying to find if and how to recover from such an event, but am not
 finding anything.
 
 We were running lustre 2.3 and have upgraded to 2.4 (or are in the process
 of it).
 
 Can anyone point me in the right direction here?
 
 Thanks in advance,
 
 
 Brian Andrus
 ITACS/Research Computing
 Naval Postgraduate School
 Monterey, California
 voice: 831-656-6238
 
 


Re: [Lustre-discuss] Completely lost MGT/MDT

2013-06-28 Thread Dilger, Andreas
On 2013/28/06 3:25 PM, Andrus, Brian Contractor bdand...@nps.edu wrote:

Basically, I was adding capacity to a system while doing a fresh install.
Turns out /dev/sda which used to be the disk in the bottom slot became
the disk in the top slot instead.
That happened to be where the MDT was, which was promptly repartitioned
and formatted.

Not exactly something I was expecting

Presumably you have no backups or snapshots of the MDT device?  Lustre can
handle a lot of inconsistency between the MDT and OSTs, even without
running lfsck.

Also, there was once a similar situation with a reformatted MDT that was
partly recovered using the ext3grep utility.  This allowed finding the
filename-inode mappings in the dirents in directory leaf blocks, and the
.. dirent allowed connecting the parent directories.  In Lustre 2.x, the
link xattr on the MDT inodes could also be used to recover the filenames
even if the directory entries are lost.

This won't help as much if the whole disk has been overwritten by an OS
install, but if only part of the MDT was overwritten you may be surprised
how much is recoverable with ext4.

First order is to make a copy of the whole disk before you try any further
changes (this lets you try things and restart without losing any data if
things go badly).
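
As a rough illustration of the copy step above, here is a minimal Python sketch that images a device into a file in fixed-size chunks and returns a SHA-256 of what was read, so that a later re-read can be compared against it. The function name copy_image, the chunk size, and the paths in the usage note are illustrative assumptions and not anything from this thread; plain dd does the same job.

```python
import hashlib


def copy_image(src_path, dst_path, chunk_size=4 * 1024 * 1024):
    """Copy src_path to dst_path chunk by chunk.

    Returns (bytes_copied, sha256_hex) for the data that was read.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of device or file
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

Something like copy_image("/dev/XXX", "/backup/mdt.img") would be run as root, with the (hypothetical) /backup/mdt.img living on a different disk. The sketch stops at the first read error, which should be acceptable for a healthy disk that was merely reformatted.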

Repartition the disk as it was before (possibly without any partition
table at all for Lustre, or it could be dumped into an image file if not
too huge).  Then build and run the findsuper utility from the e2fsprogs
code (I've attached it here) and try and find any existing (old)
superblocks from before the reformat.  You can tell superblocks from the
same filesystem by the same start/end/blocks and increasing group number:

byte_offset  byte_start     byte_end  fs_blocks  blksz  grp  mkfs/mount_time           sb_uuid   label
    1049600     1048576    525336576     512000   1024    0  Wed Sep 12 16:39:47 2012  8f8531a2
    9438208     1048576    525336576     512000   1024    1  Wed Sep 12 16:39:47 2012  8f8531a2
   26215424     1048576    525336576     512000   1024    3  Wed Sep 12 16:39:47 2012  8f8531a2
   42992640     1048576    525336576     512000   1024    5  Wed Sep 12 16:39:47 2012  8f8531a2
   59769856     1048576    525336576     512000   1024    7  Wed Sep 12 16:39:47 2012  8f8531a2
   76547072     1048576    525336576     512000   1024    9  Wed Sep 12 16:39:47 2012  8f8531a2
  135266304     1048576   8590983168    2097152   4096    1  Tue Jan 18 15:06:12 2011  e1e13f16  boot
  210764800     1048576    525336576     512000   1024   25  Wed Sep 12 16:39:47 2012  8f8531a2
  227542016     1048576    525336576     512000   1024   27  Wed Sep 12 16:39:47 2012  8f8531a2
  403701760     1048576   8590983168    2097152   4096    3  Tue Jan 18 15:06:12 2011  e1e13f16  boot
  412091392     1048576    525336576     512000   1024   49  Wed Sep 12 16:39:47 2012  8f8531a2
  525337600   525336576   9115271168    2097152   4096    0  Tue Jan 18 15:06:12 2011  e1e13f16  root_fc13
  659554304   525336576   9115271168    2097152   4096    1  Tue Jan 18 15:06:12 2011  e1e13f16  root_fc13
  659750912   525533184  17705402368    4194304   4096    1  Thu Jan 13 14:29:26 2011  6740a155
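
To make the "same start/end/blocks" rule concrete, here is a small Python sketch that groups a few rows from the listing above by filesystem and derives, for each superblock copy, its block number relative to the start of that filesystem (the kind of value e2fsck's -b option takes). The function name group_superblocks and the tuple layout are assumptions made for illustration; findsuper itself does not emit this structure.

```python
from collections import defaultdict

# (byte_offset, byte_start, byte_end, fs_blocks, blksz, grp),
# copied from a few rows of the findsuper listing above.
ROWS = [
    (1049600, 1048576, 525336576, 512000, 1024, 0),
    (9438208, 1048576, 525336576, 512000, 1024, 1),
    (135266304, 1048576, 8590983168, 2097152, 4096, 1),
    (403701760, 1048576, 8590983168, 2097152, 4096, 3),
    (525337600, 525336576, 9115271168, 2097152, 4096, 0),
    (659554304, 525336576, 9115271168, 2097152, 4096, 1),
]


def group_superblocks(rows):
    """Group superblock hits that describe the same filesystem.

    Returns {(byte_start, byte_end, fs_blocks, blksz): [(grp, sb_block), ...]}
    with each list sorted by group number.
    """
    groups = defaultdict(list)
    for offset, start, end, blocks, blksz, grp in rows:
        # A plausible superblock satisfies end = start + blocks * blksz.
        if start + blocks * blksz != end:
            continue
        # Block number of this copy, counted from the filesystem start.
        sb_block = (offset - start) // blksz
        groups[(start, end, blocks, blksz)].append((grp, sb_block))
    return {key: sorted(hits) for key, hits in groups.items()}


if __name__ == "__main__":
    for key, hits in group_superblocks(ROWS).items():
        print(key, hits)
```

Hits that share a key and show increasing group numbers most likely belong to one old filesystem; a key with a single stray hit is more likely leftover data from something else.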



Then, run e2fsck -fn -b {block} -B 4096 /dev/XXX for one of the MDT
superblocks (which will clobber the old superblocks).  This will
potentially recover some of your old MDT filesystem into lost+found, and
you can move these into a directory called ROOT at the top.  Use
getfattr to extract the filenames from the link xattr.
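
For the link xattr step, the raw value still has to be decoded into parent FID and filename pairs. The Python sketch below shows one way that could look. It rests on assumptions that should be checked against the headers of the Lustre version in use: that the attribute is named trusted.link, that it starts with a 24-byte little-endian header (magic 0x11EAF1DF, record count, total length, two padding words), and that each record is a 2-byte big-endian length, a 16-byte big-endian parent FID (seq, oid, ver), and then the name. The function name parse_link_ea is made up for this example.

```python
import struct

# Assumed layout; verify against the Lustre headers for your version.
LINK_EA_MAGIC = 0x11EAF1DF
HEADER = struct.Struct("<IIQII")  # magic, reccount, total length, 2 x padding
FID = struct.Struct(">QII")       # parent FID: seq, oid, ver (big-endian)
RECLEN = struct.Struct(">H")      # record length, big-endian, unaligned


def parse_link_ea(blob):
    """Decode a raw link xattr value into [((seq, oid, ver), name), ...]."""
    if len(blob) < HEADER.size:
        raise ValueError("link xattr shorter than its header")
    magic, reccount, _total_len, _pad1, _pad2 = HEADER.unpack_from(blob, 0)
    if magic != LINK_EA_MAGIC:
        raise ValueError("unexpected link xattr magic: %#x" % magic)

    entries = []
    pos = HEADER.size
    min_len = RECLEN.size + FID.size
    for _ in range(reccount):
        if pos + min_len > len(blob):
            raise ValueError("truncated link xattr record")
        (reclen,) = RECLEN.unpack_from(blob, pos)
        if reclen < min_len or pos + reclen > len(blob):
            raise ValueError("bad link xattr record length")
        fid = FID.unpack_from(blob, pos + RECLEN.size)
        # The name fills the rest of the record and is not NUL-terminated.
        name = blob[pos + min_len:pos + reclen]
        entries.append((fid, name.decode("utf-8", "replace")))
        pos += reclen
    return entries
```

With the recovered MDT mounted locally as ldiskfs, the hex value printed by getfattr -n trusted.link -e hex for a file in lost+found could be fed in as parse_link_ea(bytes.fromhex(value[2:])), where value is the 0x-prefixed string. A file with several hard links would yield several entries.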

Hope this helps.  This is one reason why I encourage everyone to make full
dd backups of their MDT device.  It doesn't take much space, but is
critical to the whole filesystem.

Cheers, Andreas

 -Original Message-
 From: Colin Faber [mailto:colin_fa...@xyratex.com]
 Sent: Wednesday, June 26, 2013 5:08 PM
 To: Andrus, Brian Contractor
 Cc: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Completely lost MGT/MDT
 
 Can you describe the failure in more detail?
 
 Andrus, Brian Contractor bdand...@nps.edu wrote:
 
 All,
 
 We have a sizeable filesystem and during a hardware upgrade, our MDT
 disk was completely lost.
 I am trying to find if and how to recover from such an event, but am not
 finding anything.
 
 We were running lustre 2.3 and have upgraded to 2.4 (or are in the process
 of it).
 
 Can anyone point me in the right direction here?
 
 Thanks in advance,



Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




findsuper.c
Description: findsuper.c


Re: [Lustre-discuss] Completely lost MGT/MDT

2013-06-26 Thread Jeff Johnson
I am not aware of any tool or method to recover from a lost MGT/MDT. Do 
you have any recent backups of your MDT device?

I would hold on to your MDT device with care and see if someone can help 
you resurrect it.

--Jeff


On 6/26/13 3:01 PM, Andrus, Brian Contractor wrote:
 All,

 We have a sizeable filesystem and during a hardware upgrade, our MDT disk was 
 completely lost.
 I am trying to find if and how to recover from such an event, but am not 
 finding anything.

 We were running lustre 2.3 and have upgraded to 2.4 (or are in the process of 
 it).

 Can anyone point me in the right direction here?

 Thanks in advance,


 Brian Andrus
 ITACS/Research Computing
 Naval Postgraduate School
 Monterey, California
 voice: 831-656-6238




-- 
--
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

/* New Address */
4170 Morena Boulevard, Suite D - San Diego, CA 92117



Re: [Lustre-discuss] Completely lost MGT/MDT

2013-06-26 Thread Colin Faber
Can you describe the failure in more detail?

Andrus, Brian Contractor bdand...@nps.edu wrote:

All,

We have a sizeable filesystem and during a hardware upgrade, our MDT disk was 
completely lost.
I am trying to find if and how to recover from such an event, but am not 
finding anything.

We were running lustre 2.3 and have upgraded to 2.4 (or are in the process of 
it).

Can anyone point me in the right direction here?

Thanks in advance,


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238

