Martin,

Our observation at the time was that lfsck did not add the fid to the .. dentry 
unless there was already space in the appropriate location.  I don't remember 
digging into the details, but that is what we saw.  (Since it 
meant lfsck namespace was behaving, in a sense, correctly, we were initially 
puzzled but decided it was all right.  I seem to remember reading a comment 
somewhere that the developers decided rearranging the dentries was too hard, so 
they'd only add fids where space was already present.)

It's possible we didn't get that quite right, though it would have to be 
partial somehow - misplaced .. dentries with fids were definitely not universal 
after running the namespace lfsck. (Drawing on experience from other sites here 
as well.)

In any case, directories with bad .. dentries can be identified with fsck 
anyway.

- Patrick
________________________________________
From: Martin Hecht [[email protected]]
Sent: Wednesday, November 04, 2015 3:42 AM
To: Patrick Farrell; Mohr Jr, Richard Frank (Rick Mohr)
Cc: [email protected]
Subject: Re: [lustre-discuss] recovery MDT ".." directory entries (LU-5626)

On 11/04/2015 03:23 AM, Patrick Farrell wrote:
> PAF: Remember, the specific conditions are pretty tight.  Created under 1.8, 
> not empty (if it's empty, the .. dentry is not misplaced when moved) but also 
> non-htree, then moved with dirdata enabled, and then grown to this larger 
> size.  How many existing (small) directories do you move and then add a bunch 
> of files to?  It's a pretty rare operation.  We only hit it at Martin's site 
> because of an automated tool they have to re-arrange user/job directories.
Well, not only because of the tool, especially since no files are added
anymore once the tool has moved the directories.
However, our mechanism gives the users a reason to move their data
from time to time (that's not the intention of the mechanism, but that's
how some users react).

But I'm not quite sure anymore if moving the directories is really a
precondition to run into LU-5626.
We have run the background lfsck which adds the FID to the existing
dentries. This might be an important detail, because in our case a
second '..' entry containing the FID was presumably created by lfsck (in
the wrong place), and not by moving the directory. As I currently
understand it, the user then only has to add some files to trigger the LBUG.
A subsequent e2fsck will not only find this particular directory but all
other small directories with a '..' entry in the wrong place. When
e2fsck tries to fix these directories, some entries are overwritten by
the FID and these files are then moved to lost+found.
If one of these first entries happens to be a small subdirectory, I
believe there is a chance to run into the same issue again, when you
move everything back to the original location after the e2fsck and
someone starts adding files in these subdirectories.

However, the preconditions are still quite narrow: small directories,
not empty, created without fid, then converted by lfsck (or
alternatively moved to a different place which would also create the
second '..' entry). To trigger the LBUG, files need to be added to one of
these directories, and for a second occurrence of the LBUG the same
conditions must hold for another subdirectory, which must have been at
the very beginning of the directory.
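
The preconditions above can be written down as a small checklist. This is only an illustrative sketch in Python: the class and field names are hypothetical labels for the conditions named in this thread, nothing here reads a real MDT, and whether lfsck really creates the misplaced entry is still an open question in this discussion.

```python
from dataclasses import dataclass


@dataclass
class DirState:
    """Hypothetical description of one directory on the MDT."""
    created_without_fid: bool      # e.g. created under 1.8
    was_empty: bool                # empty when the FID was added
    is_htree: bool                 # already htree-indexed, i.e. not "small"
    converted_by_lfsck: bool       # FID added by the namespace lfsck
    moved_with_dirdata: bool       # moved after dirdata was enabled
    files_added_afterwards: bool   # directory grew after the FID was added


def may_have_misplaced_dotdot(d: DirState) -> bool:
    """Small, non-empty, created without FID, then given a FID by either path."""
    fid_added = d.converted_by_lfsck or d.moved_with_dirdata
    return (d.created_without_fid
            and not d.was_empty
            and not d.is_htree
            and fid_added)


def may_trigger_lbug(d: DirState) -> bool:
    """The LBUG additionally needs files to be added afterwards."""
    return may_have_misplaced_dotdot(d) and d.files_added_afterwards


if __name__ == "__main__":
    example = DirState(
        created_without_fid=True,
        was_empty=False,
        is_htree=False,
        converted_by_lfsck=True,
        moved_with_dirdata=False,
        files_added_afterwards=True,
    )
    print(may_trigger_lbug(example))
```

The two paths (lfsck conversion or a move) are combined with "or" because either one would create the second '..' entry according to the description above.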

Martin


_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
