Excuse me, I said 'lfsck' below, but I meant 'fsck'.
________________________________________
From: lustre-discuss [[email protected]] on behalf of 
Patrick Farrell [[email protected]]
Sent: Tuesday, October 27, 2015 11:06 AM
To: Chris Hunter; [email protected]
Subject: Re: [lustre-discuss] recovery MDT ".." directory entries (LU-5626)

Chris,

I had the joy of taking this one apart personally.  We mostly let lfsck do the 
repair and moved on, accepting that some of the dentries were trashed.  I 
think, for important things, our field staff did some manual recovery with the 
e2fsprogs tools, but it was not a common enough problem that we documented a 
procedure.

If you read LU-5626 carefully, there's an explanation of the exact nature of 
the damage, and having that should let you make partial recoveries by hand.  
I'm not familiar with the ll_recover_lost_found_objs tool, but I doubt it would 
prove helpful in this instance.

Note that there are two forms of this corruption.  In the first, you move a 
directory that was created before dirdata was enabled, and the '..' entry ends 
up in the wrong place.  This does not trouble Lustre, but fsck reports it as an 
error and will 'correct' it by creating a new '..' dentry in the correct 
location, which (usually) overwrites one existing dentry in the directory.
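
To make the first form concrete, here is a toy model in Python.  It is only a 
sketch of the behaviour described above, not the real on-disk format or the 
e2fsck code: the function name, the (name, inode) tuples and the inode numbers 
are all made up for illustration, and dropping the stray '..' is my 
assumption.  It treats slot 1 of the directory as the place fsck expects '..' 
to live, and shows how writing '..' there displaces whatever dentry was in 
that slot.

```python
def fsck_repair_dotdot(dentries, parent_ino):
    """Toy model of fsck forcing '..' into the second directory slot.

    dentries is a list of (name, inode) tuples in on-disk order, with '.'
    assumed to be in slot 0.  Returns (repaired_list, displaced_entry).
    """
    # Already well-formed: '..' sits in slot 1, nothing to repair.
    if len(dentries) >= 2 and dentries[1][0] == "..":
        return list(dentries), None

    # Whatever currently occupies slot 1 gets overwritten by the new '..'.
    displaced = dentries[1] if len(dentries) > 1 else None

    # Assumption: the misplaced '..' further down is discarded as a duplicate.
    tail = [d for d in dentries[2:] if d[0] != ".."]
    return [dentries[0], ("..", parent_ino)] + tail, displaced


# A moved pre-dirdata directory: '..' has ended up in slot 2 instead of slot 1.
damaged = [(".", 100), ("data.txt", 201), ("..", 50), ("notes.txt", 202)]
repaired, displaced = fsck_repair_dotdot(damaged, parent_ino=50)
print(repaired)   # [('.', 100), ('..', 50), ('notes.txt', 202)]
print(displaced)  # ('data.txt', 201)
```

In this model 'data.txt' is the dentry that is lost from the directory; its 
inode still exists, so noting the name-to-inode mapping before the repair is 
what makes a partial recovery by hand possible afterwards.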

I don't *think* that one causes the MDT to go read only, but I could be wrong.  
I *think* what causes the MDT to go read only is the other problem:

When you have a non-htree directory (not too many items in it, all directory 
entries in a single inode) that is in the bad state described above (with the 
'..' dentry in the wrong place after being moved) and that directory has enough 
files added to it that it becomes an htree directory, the resulting directory 
is corrupted more severely.  We never sorted out the precise details of this - 
I believe we chose to simply delete any directories in this state.  (I think 
lfsck did it for us, but can't recall for sure.)

I'd advise reading LU-5626 with care, and I'd also suggest you might turn off 
'dirdata' on your MDT until you have this under control.  That will at least 
prevent any more directories from ending up in either of these bad states if 
you use the filesystem without updating Lustre to a version with the LU-5626 
patch in it.
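
If you want to confirm whether dirdata is currently set on the MDT, one 
option is to look at the "Filesystem features:" line that 'dumpe2fs -h' 
prints.  Below is a small Python sketch that checks that line for a feature 
name.  The helper name and the sample header text are hypothetical, and I am 
assuming the feature is listed there as 'dirdata'; check against the real 
output from your own device.

```python
def has_feature(dumpe2fs_header, feature):
    """Return True if `feature` is listed on the 'Filesystem features:' line."""
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            # Everything after the first colon is a space-separated list.
            return feature in line.split(":", 1)[1].split()
    return False


# Made-up header text for illustration only, not real output from an MDT.
SAMPLE_HEADER = (
    "Filesystem volume name:   example-MDT0000\n"
    "Filesystem features:      has_journal ext_attr dir_index filetype dirdata\n"
)

print(has_feature(SAMPLE_HEADER, "dirdata"))
```

In practice you would feed it the captured text of 'dumpe2fs -h' run against 
the MDT device rather than the sample string.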

- Patrick
________________________________________
From: lustre-discuss [[email protected]] on behalf of 
Chris Hunter [[email protected]]
Sent: Tuesday, October 27, 2015 10:22 AM
To: [email protected]
Subject: [lustre-discuss]  recovery MDT ".." directory entries (LU-5626)

We have a Lustre 1.8 filesystem that was upgraded to Lustre 2.x, and the
"dirdata" feature was enabled. We encountered the LU-5626/LU-2638 issue with
".." directory entries. Are there established recovery steps for this
issue?

If I run fsck, the directory entries will be moved into lost+found.
I assume the next step is to run the ll_recover_lost_found_objs tool?

Can you share any advice/experience about recovery?

thanks,
chris hunter
[email protected]

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org