Thank you Mr. Dilger & Mr. Rick. Your explanations helped me a lot. Now I can breathe easy knowing that everything is fine with the file system.
Regards,
Ihsan

On Wed, Nov 12, 2025 at 2:02 AM Andreas Dilger <[email protected]> wrote:

> For ZFS OSTs/MDTs, they will not "run out of inodes" before they run out
> of usable space. In other words, for ZFS the used inode percentage == used
> space percentage, and both will hit 100% when the OST is full.
>
> That is fine (and expected). It may be that the MDT has free space (==
> free inodes), but this is just a small fraction of the total space in the
> filesystem (a few percent at most). It is typical to over-provision the
> MDTs a bit, to allow adding more OSTs to the filesystem easily, and to
> store log files (e.g. changelogs).
>
> Having the "lfs df" information as well as "lfs df -i" would be helpful,
> but I don't think there is anything actually wrong here. As Rick wrote,
> the "free inodes" is just an estimate for ZFS ("free space / average object
> size"), so they will not "run out".
>
> Cheers, Andreas
>
> On Nov 11, 2025, at 3:53 PM, Mohr, Rick via lustre-discuss <[email protected]> wrote:
> >
> > Ihsan,
> >
> > Lustre doesn't allocate inodes in the same way that you're probably used
> > to thinking about, say in the context of an ext4 filesystem. The inode
> > usage you see for each mdt/ost is just the inode usage of the underlying
> > filesystem (ldiskfs or zfs). Lustre itself doesn't have a list of inodes
> > that it gives out. Instead, Lustre identifies a file using a 128-bit FID
> > (File Identifier) that is unique for each file in the filesystem. A new
> > FID is allocated when a file is created, and FIDs are not reused. The
> > number of files that Lustre can hold will be limited by capacity and
> > numbers of inodes on the individual mdts/osts, but the total number of
> > FIDs will be much larger than that (so that Lustre won't run out of FIDs
> > before running out of resources on the backend). Commands like 'ls -i'
> > will list an inode number for a Lustre file, but it isn't actually an
> > inode.
> > Lustre just has a way to convert the 128-bit FID number into a 64-bit
> > number that it displays for the inode number.
> >
> > Using files with stripe count of 1 (which you are doing) will help to
> > conserve ost inodes. But since you are using zfs for the ost backend, you
> > should keep in mind that those inode numbers are just estimates. ZFS
> > doesn't have fixed numbers of inodes like ldiskfs does. So it's possible
> > you could have more files than what might be indicated by the ost inode
> > usage. I am not a zfs expert, so I am not sure how that inode estimate is
> > calculated or how accurate it might be.
> >
> > --Rick
> >
> > On 11/11/25, 2:48 AM, "Ihsan Ur Rahman" <[email protected]> wrote:
> >
> > Thank you Rick for the detailed explanation.
> >
> > For the OSTs, we have used zfs as the backend file system. If the MDT is
> > responsible for giving out the inodes, then why are OST inodes consumed?
> > If the OSTs are also giving out inodes to store data, I am afraid that
> > sooner or later we will run out of inodes on the OSTs.
> > We are using a stripe count of 1.
> >
> > Regards,
> >
> > Ihsan
> >
> > On Tue, Nov 11, 2025 at 1:41 AM Mohr, Rick <[email protected]> wrote:
> > >
> > > Ihsan,
> > >
> > > Roughly speaking, every file/dir in lustre will consume one inode on the
> > > mdt that hosts it, and each file will also consume one inode on each ost
> > > that has a stripe allocated to that file. The exact inode usage can be
> > > complicated with more advanced features like DNE, PFL, etc. but that is a
> > > simple estimate on how inodes are used.
> > >
> > > Now, how the inode usage is presented is a bit tricky. In your case, the
> > > mdts have 156.7M and 127.7M inodes used for a combined total of 284.4M
> > > inodes.
> > > Since inode usage for filesystems is usually an indication of how
> > > many files/dirs exist on the filesystem, the sum of the mdt inode usage is
> > > reported as the overall filesystem inode usage. (Because even though a file
> > > with stripe_count=4 might consume 1 inode on an mdt and 4 inodes on 4
> > > different osts, it still only counts as 1 file. So it only adds 1 to the
> > > total inode usage and not 5.)
> > >
> > > Free inodes are calculated differently. In the simplest case, a file
> > > with stripe_count=1 would consume 1 mdt inode and 1 ost inode. Since your
> > > filesystem has a lot more mdt inodes than ost inodes, lustre assumes that
> > > the number of ost inodes is the limiting factor, so it uses the sum of all
> > > the free ost inodes as the total number of free inodes remaining. If you
> > > add up 17.9M+16.6M+..., you will get 173.6M which basically matches the
> > > number of free inodes. (The total number of filesystem inodes is then the
> > > sum of the used inodes and free inodes.) Of course, the calculation of
> > > free inodes can be off depending on the circumstances. If you use DoM, then
> > > it is possible to have a small file that consumes an inode on a mdt but
> > > doesn't consume any ost inodes, which means your filesystem could
> > > accommodate more additional files than the 173.7M indicated by the number
> > > of free inodes. On the other hand, if you created files with
> > > stripe_count=10, you would only be able to create about 16.4M files. Since
> > > the total inode usage on your mdts is 284.4M, but the total inode usage on
> > > all your osts is around 125M, I'm guessing maybe you are using DoM for a
> > > bunch of small files.
> > >
> > > The above explanation assumes you are using ldiskfs for the backend,
> > > which formats the mdts and osts with a fixed number of inodes. If you are
> > > using zfs for the backend, then I think the inode values for each mdt/ost
> > > are merely estimates anyway since zfs doesn't have fixed inodes like
> > > ldiskfs does.
> > >
> > > Hope that helps.
> > >
> > > --Rick
> > >
> > > On 11/10/25, 6:30 AM, "lustre-discuss on behalf of Ihsan Ur Rahman via
> > > lustre-discuss" <[email protected]> wrote:
> > >
> > > Hello lustre folks,
> > >
> > > In the Lustre file system, who is responsible for giving out inodes? As
> > > per my understanding, it is the MDS/MGS that gives out inodes.
> > > Below is the output of the inode distribution in our lustre file
> > > system. Is this correct? I ask because the OSTs are also giving out inodes,
> > > and most of them are more than 40% used.
> > >
> > > lfs df -ih /mnt/lust-das
> > > UUID                    Inodes   IUsed   IFree IUse% Mounted on
> > > lust-das-MDT0000_UUID   745.2M  156.7M  588.5M   22% /mnt/lust-das[MDT:0]
> > > lust-das-MDT0001_UUID   745.2M  127.7M  617.6M   18% /mnt/lust-das[MDT:1]
> > > lust-das-OST0000_UUID    30.4M   12.5M   17.9M   42% /mnt/lust-das[OST:0]
> > > lust-das-OST0001_UUID    29.1M   12.5M   16.6M   43% /mnt/lust-das[OST:1]
> > > lust-das-OST0002_UUID    30.2M   12.5M   17.7M   42% /mnt/lust-das[OST:2]
> > > lust-das-OST0003_UUID    30.7M   12.5M   18.2M   41% /mnt/lust-das[OST:3]
> > > lust-das-OST0004_UUID    29.7M   12.5M   17.3M   42% /mnt/lust-das[OST:4]
> > > lust-das-OST0005_UUID    29.8M   12.5M   17.3M   42% /mnt/lust-das[OST:5]
> > > lust-das-OST0006_UUID    29.9M   12.5M   17.4M   42% /mnt/lust-das[OST:6]
> > > lust-das-OST0007_UUID    29.8M   12.5M   17.3M   42% /mnt/lust-das[OST:7]
> > > lust-das-OST0008_UUID    28.8M   12.5M   16.4M   44% /mnt/lust-das[OST:8]
> > > lust-das-OST0009_UUID    30.0M   12.5M   17.5M   42% /mnt/lust-das[OST:9]
> > >
> > > filesystem_summary:    458.1M  284.4M  173.7M   63% /mnt/lust-das
> > >
> > > Regards,
> > > Ihsan
> >
> > _______________________________________________
> > lustre-discuss mailing list
> > [email protected]
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
> Cheers, Andreas
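
For anyone who wants to check the arithmetic in Rick's explanation, here is a minimal Python sketch that recomputes the filesystem_summary line from the per-target numbers in the "lfs df -ih" output quoted above. It only follows the rule of thumb described in the thread (used = sum of MDT used inodes, free = sum of OST free inodes when the OSTs are the limiting factor). The helper names summarize and new_files_at_full_stripe are made up for illustration and are not part of any Lustre tool, and this is not how Lustre itself is implemented.

```python
# Sketch only: recomputes the "filesystem_summary" line from the per-target
# values in the quoted "lfs df -ih" output (units: millions of inodes).
# The helper names are hypothetical, not part of any Lustre tool.

mdt_used = [156.7, 127.7]
ost_used = [12.5] * 10
ost_free = [17.9, 16.6, 17.7, 18.2, 17.3, 17.3, 17.4, 17.3, 16.4, 17.5]


def summarize(mdt_used, ost_free):
    """Return (total, used, free) following the rule described in the thread.

    Assumes the OSTs are the limiting factor for new files, which holds here
    because the MDTs have far more free inodes than the OSTs.
    """
    used = sum(mdt_used)  # each file/dir counts once, on the MDT hosting it
    free = sum(ost_free)  # a stripe_count=1 file needs one OST inode
    return used + free, used, free


def new_files_at_full_stripe(ost_free):
    """Files that fit when every file has a stripe on every OST."""
    return min(ost_free)  # the OST with the fewest free inodes fills first


total, used, free = summarize(mdt_used, ost_free)
print(f"used  = {used:.1f}M")
print(f"free  = {free:.1f}M")
print(f"total = {total:.1f}M")
print(f"OST inodes in use = {sum(ost_used):.1f}M")
print(f"files at stripe_count=10 = {new_files_at_full_stripe(ost_free):.1f}M")
```

The free total comes out as 173.6M rather than the 173.7M shown by lfs, presumably because the per-OST values in the output are already rounded. With a ZFS backend, all of these figures are estimates in any case, as Andreas and Rick point out.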
