Hi Andreas,

Thanks.

With debugfs /dev/nvme6n1, I get:
debugfs:  ls -l O
 393217   40755 (2)      0      0    4096 28-Jul-2021 17:06 .
      2   40755 (2)      0      0    4096 28-Jul-2021 17:02 ..
 393218   40755 (2)      0      0    4096 28-Jul-2021 17:02 200000003
 524291   40755 (2)      0      0    4096 28-Jul-2021 17:02 1
 655364   40755 (2)      0      0    4096 28-Jul-2021 17:02 10
 786437   40755 (2)      0      0    4096 28-Jul-2021 17:06 0
 917510   40755 (2)      0      0    4096 28-Jul-2021 17:06 23c0000402
 1048583   40755 (2)      0      0    4096 28-Jul-2021 17:06 23c0000401
 1179656   40755 (2)      0      0    4096 28-Jul-2021 17:06 23c0000400

Then e.g.:
debugfs:  stat O/23c0000400
Inode: 1179656   Type: directory    Mode:  0755   Flags: 0x80000
Generation: 2411782533    Version: 0x00000000:00000000
User:     0   Group:     0   Project:     0   Size: 4096
File ACL: 0
Links: 34   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x6101806b:306016bc -- Wed Jul 28 17:06:03 2021
 atime: 0x6101806b:2d83aad8 -- Wed Jul 28 17:06:03 2021
 mtime: 0x6101806b:306016bc -- Wed Jul 28 17:06:03 2021
crtime: 0x6101806b:2d83aad8 -- Wed Jul 28 17:06:03 2021
Size of extra inode fields: 32
Extended attributes:
  lma: fid=[0x120008:0x8fc0e185:0x0] compat=c incompat=0
EXTENTS:
(0):33989


But then on a client:
lfs fid2path /snap8 [0x120008:0x8fc0e185:0x0]
lfs fid2path: cannot find '[0x120008:0x8fc0e185:0x0]': No such file or directory

(and likewise for the others).
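For reference, a minimal Python sketch of pulling the FID out of the "stat" output above and splitting it into its three fields (sequence, object ID, version). The helper name lma_fid is hypothetical, and the only input format assumed is the "lma: fid=[...]" line exactly as debugfs printed it here.

```python
import re

# Matches a bracketed Lustre FID such as [0x120008:0x8fc0e185:0x0].
FID_RE = re.compile(r"\[(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\]")


def lma_fid(stat_output):
    """Return (seq, oid, ver) from the 'lma:' xattr line, or None if absent."""
    for line in stat_output.splitlines():
        line = line.strip()
        if line.startswith("lma:"):
            match = FID_RE.search(line)
            if match:
                # int(..., 16) accepts the leading "0x" prefix.
                return tuple(int(part, 16) for part in match.groups())
    return None


# Excerpt copied from the debugfs "stat O/23c0000400" output in this message.
sample = """Extended attributes:
  lma: fid=[0x120008:0x8fc0e185:0x0] compat=c incompat=0
EXTENTS:
(0):33989"""

fid = lma_fid(sample)
if fid is not None:
    seq, oid, ver = fid
    print("seq=%#x oid=%#x ver=%#x" % (seq, oid, ver))
```

The function only reads text, so it can be run on saved debugfs output away from the OSS.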

Not quite sure what you meant by the O/*/d* as there are no directories within O/, and there is no d/ or d*/ either at top level or within O/


Running (on the OST):
lctl lfsck_start -M snap8-OST004e
seems to work (at least, doesn't return any error).

However, lctl lfsck_query -M snap8-OST004e   gives:
Fail to query LFSCK: Inappropriate ioctl for device


Thanks,
Alastair.


On Sat, 4 Sep 2021, Andreas Dilger wrote:

You could run debugfs on that OST and use "ls -l" to examine the O/*/d*
directories for large objects, then "stat" any suspicious objects within
debugfs to dump the parent FID, and "lfs fid2path" on a client to determine
the path.

Alternately, see "lctl-lfsck-start.8" man page for options to link orphan
objects to the .lustre/lost+found directory if you think there are no files
referencing those objects.

Cheers, Andreas
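
One way to apply the first suggestion to a long listing is sketched below in Python: it parses debugfs "ls -l" output and picks out the largest entries. The column layout (inode, mode, type, uid, gid, size, date, time, name) is assumed from the listing at the top of this thread. The function names and the sample rows are made up for illustration and are not real data from this filesystem.

```python
def parse_debugfs_ls(text):
    """Parse debugfs `ls -l` output into (name, inode, size_bytes) tuples."""
    entries = []
    for line in text.splitlines():
        # Expected columns: inode mode (type) uid gid size date time name
        fields = line.strip().split(None, 8)
        if len(fields) != 9:
            continue
        if not fields[0].isdigit() or not fields[5].isdigit():
            continue
        entries.append((fields[8], int(fields[0]), int(fields[5])))
    return entries


def largest(entries, min_size):
    """Entries of at least min_size bytes, biggest first."""
    big = [entry for entry in entries if entry[2] >= min_size]
    return sorted(big, key=lambda entry: entry[2], reverse=True)


# Hypothetical listing of one d* directory (invented rows, same column layout).
sample = """
 1179700  100666 (1)      0      0    1048576 28-Jul-2021 17:10 1234
 1179701  100666 (1)      0      0    5497558138880 28-Jul-2021 17:11 1235
 1179702  100666 (1)      0      0    4096 28-Jul-2021 17:12 1236
"""

for name, inode, size in largest(parse_debugfs_ls(sample), min_size=1 << 30):
    print(name, inode, size)
```

Each name reported this way would then be passed to "stat" inside debugfs, as described above.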

On Sep 4, 2021, at 00:54, Alastair Basden <[email protected]> wrote:

Ah, of course - has to be done on a client.

None of these files are on the dodgy OST.

Any further suggestions?  Essentially we have what seems to be a full OST with 
nothing on it.

Thanks,
Alastair.

On Sat, 4 Sep 2021, Andreas Dilger wrote:

$ man lfs-fid2path.1
lfs-fid2path(1)               user utilities               lfs-fid2path(1)

NAME
     lfs fid2path - print the pathname(s) for a file identifier

SYNOPSIS
     lfs fid2path [OPTION]... <FSNAME|MOUNT_POINT> <FID>...

DESCRIPTION
     lfs fid2path maps a numeric Lustre File IDentifier (FID) to one or
     more pathnames that have hard links to that file.  This allows
     resolving filenames for FIDs used in console error messages, and
     resolving all of the pathnames for a file that has multiple hard
     links.  Pathnames are resolved relative to the MOUNT_POINT
     specified, or relative to the filesystem mount point if FSNAME is
     provided.

OPTIONS
     -f, --print-fid
            Print the FID with the path.

     -c, --print-link
            Print the current link number with each pathname or parent
            directory.

     -l, --link=LINK
            If a file has multiple hard links, then print only the
            specified LINK, starting at link 0.  If multiple FIDs are
            given, but only one pathname is needed for each file, use
            --link=0.

EXAMPLES
     $ lfs fid2path /mnt/testfs [0x200000403:0x11f:0x0]
            /mnt/testfs/etc/hosts


On Sep 3, 2021, at 14:51, Alastair Basden <[email protected]> wrote:

Hi,

lctl get_param mdt.*.exports.*.open_files  returns:
[email protected]_files=
[0x20000b90e:0x10aa:0x0]
[email protected]_files=
[0x20000b90e:0x21b3:0x0]
[email protected]_files=
[0x20000b90e:0x21b3:0x0]
[0x20000b90e:0x21b4:0x0]
[0x20000b90c:0x1574:0x0]
[0x20000b90c:0x1575:0x0]
[0x20000b90c:0x1576:0x0]

There don't seem to be many open, so I don't think open files are the problem.

Not sure which bit of this I need to use with lfs fid2path either...

Cheers,
Alastair.


On Fri, 3 Sep 2021, Andreas Dilger wrote:

You can also check "mdt.*.exports.*.open_files" on the MDTs for a list of FIDs open on 
each client, and use "lfs fid2path" to resolve them to a pathname.
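
As a rough illustration of that step, the Python sketch below collects the bracketed FIDs from saved "lctl get_param mdt.*.exports.*.open_files" output, drops duplicates, and prints one "lfs fid2path" command per FID for running on a client. The function names are hypothetical, the "mdt.<target>.exports.<nid>.open_files=" header lines are placeholders, and the FIDs are the ones reported later in this thread.

```python
import re
import shlex

# A whole line consisting of one bracketed FID, e.g. [0x20000b90e:0x10aa:0x0].
FID_LINE = re.compile(r"^\[0x[0-9a-fA-F]+:0x[0-9a-fA-F]+:0x[0-9a-fA-F]+\]$")


def open_fids(get_param_output):
    """Unique FIDs in first-seen order; 'name=' header lines are skipped."""
    fids = []
    for line in get_param_output.splitlines():
        line = line.strip()
        if FID_LINE.match(line) and line not in fids:
            fids.append(line)
    return fids


def fid2path_commands(fids, mount_point):
    """Build shell-safe `lfs fid2path` command lines (brackets get quoted)."""
    return [
        "lfs fid2path {} {}".format(shlex.quote(mount_point), shlex.quote(fid))
        for fid in fids
    ]


# Header lines are placeholders; the FIDs are copied from this thread.
sample = """mdt.<target>.exports.<nid>.open_files=
[0x20000b90e:0x10aa:0x0]
mdt.<target>.exports.<nid>.open_files=
[0x20000b90e:0x21b3:0x0]
mdt.<target>.exports.<nid>.open_files=
[0x20000b90e:0x21b3:0x0]
[0x20000b90e:0x21b4:0x0]
[0x20000b90c:0x1574:0x0]
[0x20000b90c:0x1575:0x0]
[0x20000b90c:0x1576:0x0]
"""

for command in fid2path_commands(open_fids(sample), "/snap8"):
    print(command)
```

Only the bracketed values are passed to "lfs fid2path"; the parameter-name lines just identify which client export holds the file open.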

On Sep 3, 2021, at 02:09, Degremont, Aurelien via lustre-discuss
<[email protected]> wrote:

Hi

It could be a bug, but most of the time this is due to an open-unlinked file,
typically a log file that has been deleted but is still held open by some
process, which keeps writing to it until it fills the OSTs it is using.

Look for such files on your clients (use lsof).

Aurélien


On 03/09/2021 09:50, "lustre-discuss on behalf of Alastair Basden"
<[email protected] on behalf of [email protected]> wrote:

Hi,

We have a file system where each OST is a single SSD.

One of those is reporting as 100% full (lfs df -h /snap8):
snap8-OST004d_UUID          5.8T        2.0T        3.5T  37% /snap8[OST:77]
snap8-OST004e_UUID          5.8T        5.5T        7.5G 100% /snap8[OST:78]
snap8-OST004f_UUID          5.8T        2.0T        3.4T  38% /snap8[OST:79]

However, I can't find any files on it:
lfs find --ost snap8-OST004e /snap8/
returns nothing.
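
To spot this kind of imbalance across many OSTs, a small Python sketch along these lines could scan saved "lfs df -h" output and report the OSTs at or above a usage threshold. The function name and the default threshold are arbitrary choices for illustration, and the six-column layout is assumed from the three lines shown above.

```python
def full_osts(lfs_df_output, threshold=90):
    """Return (uuid, use_percent, available) for OST lines at/above threshold."""
    hits = []
    for line in lfs_df_output.splitlines():
        # Expected columns: UUID size used available use% mountpoint[OST:n]
        fields = line.split()
        if len(fields) != 6:
            continue
        if "[OST:" not in fields[5] or not fields[4].endswith("%"):
            continue
        percent = int(fields[4].rstrip("%"))
        if percent >= threshold:
            hits.append((fields[0], percent, fields[3]))
    return hits


# The three lines quoted in this message.
sample = """snap8-OST004d_UUID          5.8T        2.0T        3.5T  37% /snap8[OST:77]
snap8-OST004e_UUID          5.8T        5.5T        7.5G 100% /snap8[OST:78]
snap8-OST004f_UUID          5.8T        2.0T        3.4T  38% /snap8[OST:79]"""

for uuid, percent, available in full_osts(sample):
    print(uuid, percent, available)
```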

I guess that it has filled up, and that there is some bug or other that is
now preventing proper behaviour - but I could be wrong.

Does anyone have any suggestions?

Essentially, I'd like to find some of the files and delete or migrate
some, and thus return it to useful production.

Cheers,
Alastair.
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud







