Lustre won't automatically spread files across multiple MDTs.  You can use
"lfs mkdir -i <mdt_index> <dir>" to create a remote directory on a specific
MDT, which causes all files created in that directory to reside on the same
MDT.  You can also stripe a directory across multiple MDTs with "lfs mkdir -c
<stripe_count> <dir>".  My guess is that no remote or striped directories were
ever created, so all files end up on MDT0 by default.  Look for the section
titled "Creating a sub-directory on a specific MDT" in the Lustre manual for
more details.  Hopefully that will resolve your issues.
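As a concrete illustration (the directory names below are placeholders, and
the commands must be run on a Lustre client with appropriate privileges, so
treat this as a sketch rather than a tested recipe):

```shell
# Create a new directory whose metadata (and that of all files created
# in it) lives on MDT1 instead of the default MDT0:
lfs mkdir -i 1 /mnt/lustre/project-a

# Or stripe a new directory across all three MDTs so new entries are
# distributed among them:
lfs mkdir -c 3 /mnt/lustre/scratch

# Check where a directory's metadata actually lives:
lfs getdirstripe /mnt/lustre/project-a

# Existing directories can also be migrated off the full MDT with
# directory migration, which frees inodes on MDT0:
lfs migrate -m 1 /mnt/lustre/old-project
```

As far as I know, migrating a large directory tree takes a while and the
tree should be quiescent while "lfs migrate -m" runs, so schedule it
accordingly.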

--Rick


On 1/7/25, 3:29 AM, "lustre-discuss on behalf of Ihsan Ur Rahman"
<[email protected] on behalf of
[email protected]> wrote:


Hello Lustre folks,
New to this forum and to Lustre as well.

We have a Lustre system, and users are getting a "no space left on device"
error.  After checking, we realised that the inodes are full on one of the
MDTs.

lfs df -ihv /mnt/lustre/
UUID                   Inodes   IUsed   IFree IUse% Mounted on
lustre-MDT0000_UUID    894.0M  894.0M      58  100% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    894.0M     313  894.0M    1% /mnt/lustre[MDT:1]
lustre-MDT0002_UUID    894.0M     313  894.0M    1% /mnt/lustre[MDT:2]
lustre-OST0000_UUID      4.0G   26.2M    4.0G    1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      4.0G   26.1M    4.0G    1% /mnt/lustre[OST:1]
lustre-OST0002_UUID      4.0G   28.1M    4.0G    1% /mnt/lustre[OST:2]
lustre-OST0003_UUID      4.0G   26.6M    4.0G    1% /mnt/lustre[OST:3]
lustre-OST0004_UUID      4.0G   28.2M    4.0G    1% /mnt/lustre[OST:4]
lustre-OST0005_UUID      4.0G   27.3M    4.0G    1% /mnt/lustre[OST:5]
lustre-OST0006_UUID      4.0G   27.5M    4.0G    1% /mnt/lustre[OST:6]
lustre-OST0007_UUID      4.0G   28.0M    4.0G    1% /mnt/lustre[OST:7]
lustre-OST0008_UUID      4.0G   27.5M    4.0G    1% /mnt/lustre[OST:8]
lustre-OST0009_UUID      4.0G   26.4M    4.0G    1% /mnt/lustre[OST:9]
lustre-OST000a_UUID      4.0G   27.9M    4.0G    1% /mnt/lustre[OST:10]
lustre-OST000b_UUID      4.0G   28.4M    4.0G    1% /mnt/lustre[OST:11]
lustre-OST000c_UUID      4.0G   28.3M    4.0G    1% /mnt/lustre[OST:12]
lustre-OST000d_UUID      4.0G   27.8M    4.0G    1% /mnt/lustre[OST:13]
lustre-OST000e_UUID      4.0G   27.6M    4.0G    1% /mnt/lustre[OST:14]
lustre-OST000f_UUID      4.0G   27.1M    4.0G    1% /mnt/lustre[OST:15]
lustre-OST0010_UUID      4.0G   26.5M    4.0G    1% /mnt/lustre[OST:16]
lustre-OST0011_UUID      4.0G   27.3M    4.0G    1% /mnt/lustre[OST:17]
lustre-OST0012_UUID      4.0G   27.1M    4.0G    1% /mnt/lustre[OST:18]
lustre-OST0013_UUID      4.0G   28.8M    4.0G    1% /mnt/lustre[OST:19]
lustre-OST0014_UUID      4.0G   28.2M    4.0G    1% /mnt/lustre[OST:20]
lustre-OST0015_UUID      4.0G   26.1M    4.0G    1% /mnt/lustre[OST:21]
lustre-OST0016_UUID      4.0G   27.2M    4.0G    1% /mnt/lustre[OST:22]
lustre-OST0017_UUID      4.0G   28.7M    4.0G    1% /mnt/lustre[OST:23]
lustre-OST0018_UUID      4.0G   28.5M    4.0G    1% /mnt/lustre[OST:24]
lustre-OST0019_UUID      4.0G   28.3M    4.0G    1% /mnt/lustre[OST:25]
lustre-OST001a_UUID      4.0G   27.3M    4.0G    1% /mnt/lustre[OST:26]
lustre-OST001b_UUID      4.0G   27.0M    4.0G    1% /mnt/lustre[OST:27]
lustre-OST001c_UUID      4.0G   28.8M    4.0G    1% /mnt/lustre[OST:28]
lustre-OST001d_UUID      4.0G   28.5M    4.0G    1% /mnt/lustre[OST:29]

filesystem_summary:      2.6G  894.0M    1.7G   34% /mnt/lustre

After some searching on Google, I found that open files on the compute nodes
can keep consuming inodes.  With the command below I got the list of nodes
where files were held open:

lctl get_param mdt.*.exports.*.open_files

I logged in to each server, identified the open files with lsof, and killed
the processes holding them, but that still did not free the inodes for us.
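The steps I followed were roughly these (mount point and details are
generic here):

```shell
# On the MDS, list which client exports have files open:
lctl get_param mdt.*.exports.*.open_files

# On each client node reported above, find the processes holding
# files open under the Lustre mount:
lsof /mnt/lustre

# ...then kill those processes, hoping to release open-but-unlinked
# files.  (Killing processes only frees inodes for files that were
# already deleted while still held open; inodes used by ordinary
# existing files are freed only by deleting the files.)
```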

Our primary goal is to bring the inode usage on the full MDT down from 100%
to below 90%; after that we can spread the inode load across the other two
MDTs.
We would appreciate your guidance and support.

regards,

Ihsan 

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
