The DNE auto-split functionality is disabled by default and was never fully completed (e.g. it does not preserve inode numbers across a split). It had a significant performance/latency impact while splitting a directory that was in active use, which is exactly when you would want it to kick in, so I wouldn't recommend using it at this time.
Instead, development efforts were focused on DNE MDT space balancing. This adds two different features that allow all of the MDTs in a filesystem to be used without user/admin intervention (though it is still possible to manually create directories on specific MDTs as before).

The "round-robin" MDT selection ("lfs setdirstripe -D --max-depth-rr=N -c 1 -i -1") for top-level directories (enabled for the top 3 levels of the filesystem by default) will, as the name suggests, round-robin new directories across all of the available MDTs while their space is evenly balanced (within 5% free space*inodes by default). That is important to distribute *new* directories across MDTs in new filesystems when e.g. .../home/$user or .../project/$project or .../scratch/$user are being created.

The "space balance" MDT selection ("lctl set_param lmv.*.qos_threshold_rr=N" on the *CLIENT*) kicks in when MDT space usage becomes imbalanced (free space*inodes difference above 5% by default), and then starts selecting the MDT for *new* directories based on the ratio of free space*inodes. That allows the MDTs to return toward balance over time, without causing a performance imbalance when it isn't necessary.

Note that both of these heuristics operate on *single-stripe directories* and not regular files, so the MDT balance will not be perfect if some directory tree has millions more files/subdirectories than another. However, the main issue being avoided is the *very* common case of MDT0000 getting full and MDT0001..N being (almost) totally unused. These features also make the MDT *usage* balance pretty good as a result, so it is a win-win. For most filesystems, the MDT capacity is not the limiting factor (it only makes up a few percent of the total storage).
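To make the two heuristics above a bit more concrete, here is a toy Python model of them. It is only a sketch under stated assumptions, not the actual Lustre LMV code: the names (MDT, MDTSelector, is_balanced) are made up for illustration, and the imbalance measure (relative gap between the largest and smallest free space*inodes product) is my assumption about how "within 5%" could be evaluated.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class MDT:
    """Hypothetical stand-in for one metadata target."""
    index: int
    free_blocks: int
    free_inodes: int

    @property
    def weight(self) -> int:
        # "free space * inodes", the balance metric named in the text.
        return self.free_blocks * self.free_inodes


def is_balanced(mdts, threshold=0.05):
    """Assumed measure: relative gap between the largest and smallest weight."""
    weights = [m.weight for m in mdts]
    hi, lo = max(weights), min(weights)
    if hi == 0:
        return True  # nothing free anywhere, so no MDT is preferable
    return (hi - lo) / hi <= threshold


class MDTSelector:
    """Pick an MDT for each *new* directory (toy model only)."""

    def __init__(self, mdts, threshold=0.05, rng=None):
        self.mdts = mdts
        self.threshold = threshold
        self.rng = rng or random.Random()
        self._next = 0

    def pick(self) -> MDT:
        if is_balanced(self.mdts, self.threshold):
            # Balanced: plain round-robin across all MDTs.
            mdt = self.mdts[self._next % len(self.mdts)]
            self._next += 1
            return mdt
        # Imbalanced: choose randomly, weighted by free space * inodes.
        weights = [m.weight for m in self.mdts]
        return self.rng.choices(self.mdts, weights=weights, k=1)[0]


# Balanced filesystem: new directories simply rotate over the MDTs.
even = MDTSelector([MDT(i, 1000, 1000) for i in range(4)])
print([even.pick().index for _ in range(8)])

# MDT0000 nearly full: new directories mostly land on the other MDTs.
skewed = [MDT(0, 100, 100)] + [MDT(i, 1000, 1000) for i in range(1, 4)]
uneven = MDTSelector(skewed, rng=random.Random(0))
print(Counter(uneven.pick().index for _ in range(3000)))
```

The model never updates the free counts, so it only shows which MDT would be chosen for a given snapshot of usage. In a real filesystem the weights change as directories fill up, which is what lets the MDTs drift back toward balance.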
Cheers, Andreas

On Mar 23, 2023, at 15:31, Bertschinger, Thomas Andrew Hjorth via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:

> Hello,
>
> We've been experimenting with DNEv3 recently and have run into this issue: https://jira.whamcloud.com/browse/LU-7607 where the directory inode number changes after auto-split. In addition to the problem noted with backups that track the inode number, we have found that file access through a previously open file descriptor is broken post migration. This can occur when a shell's CWD is the affected directory. For example:
>
>     mds0 # lctl get_param mdt.mylustre-MDT0000.{dir_split_count,enable_dir_auto_split}
>     mdt.mylustre-MDT0000.dir_split_count=100
>     mdt.mylustre-MDT0000.enable_dir_auto_split=1
>
>     client $ pwd
>     /mnt/mylustre/dnetest
>     client $ for i in {0..100}; do touch file$i; done
>     client $ ls
>     ls: cannot open directory '.': Operation not permitted
>     client $ ls file0
>     ls: cannot access 'file0': No such file or directory
>     client $ ls /mnt/mylustre/dnetest/file0
>     /mnt/mylustre/dnetest/file0
>
> (This is from a build of the current master branch.)
>
> We believe users will certainly encounter this, because users monitor output directories of jobs as they run. Therefore this issue is a dealbreaker with DNEv3 for us.
>
> I wanted to ask about the status of the linked issue, since it looks like it hasn't been updated in a while. Would the resolution to LU-7607 be expected to fix the file access problem I've noted here or will this require additional changes to resolve?
>
> Thanks!
> - Thomas Bertschinger
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

--
Andreas Dilger
Lustre Principal Architect
Whamcloud