On 2010-11-30, at 15:46, Bob Ball wrote:
> Thanks.  Can you tell me how to do the mapping back to the MDS inode?  For 
> example, is 1162976 in the list below the MDS inode?  May as well look.

You can use the "ll_decode_filter_fid" tool on the OST object files (e.g. the 
"1162976" file below) and it will print out the MDS inode number and generation.
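
A minimal sketch of that lookup, assuming the OST is mounted as ldiskfs. The helper names (nonempty_objects, decode_objects) and the DECODE override are hypothetical conveniences for illustration; only ll_decode_filter_fid itself is a Lustre tool. Its output format differs between releases, so none is shown here.

```shell
#!/bin/sh
# Print the non-empty regular files directly inside one OST object
# directory, one path per line. Zero-length objects carry no data.
nonempty_objects() {
    find "$1" -maxdepth 1 -type f -size +0c | sort
}

# Run the decoder on every non-empty object in the directory.
# DECODE is a hypothetical override, e.g. DECODE=echo for a dry run.
decode_objects() {
    nonempty_objects "$1" | while IFS= read -r obj; do
        "${DECODE:-ll_decode_filter_fid}" "$obj"
    done
}
```

Usage would be something like "decode_objects /mnt/ost/O/0/d0", where /mnt/ost stands in for wherever the OST is mounted.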

> On 11/30/2010 4:17 PM, Andreas Dilger wrote:
>> On 2010-11-30, at 11:17, Bob Ball wrote:
>>> [r...@umdist03 d0]# ls -l
>>> total 182976
>>> -rw-rw-rw- 1 daits users 45002956 Jul  5 20:52 1162976
>>> -rw-rw-rw- 1 daits users 44569036 Jul  7 02:53 1200608
>>> -rw-rw-rw- 1 daits users 49108913 Jun 28 04:43 1218976
>>> -rw-rw-rw- 1 daits users 48658429 Jul 16 13:29 1254176
>>> -rwSrwSrw- 1 root  root         0 Sep  2 15:11 128
>>> -rwSrwSrw- 1 root  root         0 Sep  2 15:11 9152
>>> -rwSrwSrw- 1 root  root         0 Sep  2 15:11 9216
>>> -rwSrwSrw- 1 root  root         0 Sep  2 15:11 9248
>>> 
>>> Some time back we had an MDT issue, and upon running e2fsck, saw a LOT
>>> of corrupted entries that were just deleted.  I suspect that these may
>>> have been entries pointing to these files?
>> Likely, yes.
>> 
>>> "lfs find" comes up empty handed for this OST, indeed, there are 6 OST
>>> here, each with about 10GB worth of files of this kind.  Are those 60GB
>>> just lost?  Short of pawing through these, by hand, to see what we can
>>> make of the content, is there a snowball's chance in Hades of identifying
>>> these files?
>> They can be mapped back to an MDS inode number, and the user/group 
>> information is intact, but that doesn't help if the MDS inodes were deleted 
>> by e2fsck since there will not be any file name available.
>> 
>>> Can I simply copy them out of this "ldiskfs" mount of the file system,
>>> back into some recovery directory in the real file system, so that users
>>> can pick through them?
>> Yes, just rsync the non-zero-length files from the ldiskfs-mounted OST 
>> filesystem into a new "lost+found" directory created in the lustre 
>> mountpoint on a client.  If you "chmod 1775 /path/to/lustre/lost+found" the 
>> owners of the file will be able to read/delete their files, but others will 
>> not (like /tmp).
>> 
>>> After they are moved, the file system will be reformatted and returned to 
>>> use.
>> The whole Lustre filesystem, or the OST?  If you are replacing the OST, then 
>> you should still do a backup of last_rcvd, CONFIGS/, and O/0/LAST_ID from 
>> the OST, and then restore them to the OST after it is reformatted.  This 
>> process was very recently discussed on this list.
>> 
>>> On 11/30/2010 8:53 AM, Bob Ball wrote:
>>>> OK, thanks.  Scary, to see errors out of lfs find.
>>>> 
>>>> bob
>>>> 
>>>> On 11/30/2010 1:47 AM, Andreas Dilger wrote:
>>>>> On 2010-11-29, at 20:18, Bob Ball wrote:
>>>>>> I have an odd problem.  I am trying to empty all files from a set of OST
>>>>>> as indicated below, by making a list via lfs find and then sending that
>>>>>> list to lfs_migrate.  However, I have just gotten this message back from
>>>>>> the lfs find:
>>>>>> 
>>>>>> llapi_semantic_traverse: Failed to open
>>>>>> '/lustre/umt3/data13/daits/p15.6.3.10/prod/W1J_munu216465_simul': No
>>>>>> such file or directory (2)
>>>>>> error: find failed for umt3-OST0021.
>>>>> This may mean that the file was deleted while "lfs find" was running.
>>>>> 
>>>>>> On the OSS, I see this but not much else:
>>>>>> LustreError: 5226:0:(ldlm_resource.c:861:ldlm_resource_add()) lvbo_init
>>>>>> failed for resource 9101: rc -2
>>>>>> 
>>>>>> Can someone give me an idea of what is wrong  here?  And what can be
>>>>>> done about it, if anything?
>>>>> This might mean that the file was deleted at the same time the MDS 
>>>>> crashed, and the objects were removed but the MDS file was not.  It is 
>>>>> possible to just delete this file using the "unlink" command - it does 
>>>>> not contain any data in any case.
>>>>> 
>>>>> Cheers, Andreas
>>>>> --
>>>>> Andreas Dilger
>>>>> Lustre Technical Lead
>>>>> Oracle Corporation Canada Inc.
>>>>> 
>>>>> 
>>>>> 
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> [email protected]
>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>> 
>>>> 

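For the copy-out step quoted above (non-zero-length files from the ldiskfs-mounted OST into a "lost+found" directory on a client), here is a rough sketch. It uses find and cp -p rather than rsync, purely to keep the example free of dependencies. The function name, the OST label argument, and the assumption that object files have purely numeric-looking names (as in the listing) are mine, not part of any Lustre tool.

```shell
#!/bin/sh
# Copy every non-empty object file under SRC into the flat directory DEST.
# TAG (e.g. the OST name) is prefixed to each copy so that objects with the
# same id from different OSTs cannot overwrite each other.
# Only names starting with a digit are taken, which skips LAST_ID.
copy_nonempty() {
    src=$1
    dest=$2
    tag=$3
    find "$src" -type f -size +0c -name '[0-9]*' | while IFS= read -r obj; do
        # -p keeps mode and timestamps, and ownership when run as root
        cp -p "$obj" "$dest/$tag-$(basename "$obj")"
    done
}
```

Something like "copy_nonempty /mnt/ost/O/0 /path/to/lustre/lost+found OST0021" per OST, followed by the chmod 1775 on the lost+found directory, would then let owners pick through their own files. /mnt/ost is a placeholder for the real ldiskfs mountpoint.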

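If only the OST is being replaced, the backup and restore of last_rcvd, CONFIGS/, and O/0/LAST_ID mentioned above could be sketched as below. This covers only those three paths and assumes the OST is mounted as ldiskfs in both steps. The function name and the mountpoints are hypothetical, and the earlier list discussion remains the reference for the full procedure.

```shell
#!/bin/sh
# Copy the three OST configuration items from one tree to another.
# The same function serves for backup (OST -> backup dir) and for
# restore (backup dir -> reformatted OST).
copy_ost_config() {
    from=$1
    to=$2
    mkdir -p "$to/O/0" &&
    cp -p "$from/last_rcvd" "$to/last_rcvd" &&
    cp -pR "$from/CONFIGS" "$to/" &&
    cp -p "$from/O/0/LAST_ID" "$to/O/0/LAST_ID"
}
```

For example, "copy_ost_config /mnt/ost /root/ost-backup" before the reformat and "copy_ost_config /root/ost-backup /mnt/ost" afterwards, with both paths being placeholders.
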
Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
