** Description changed:

+ [ Impact ]
+
+ Users running Lustre filesystems on Ubuntu experience severe performance
+ degradation with 'du' on large directories (10K+ files). Operations that
+ should take seconds can take minutes.
+
+ The issue occurs because, by default, coreutils sorts directory entries by
+ increasing inode number for large directories. This breaks Lustre's
+ "statahead" prefetching feature, which activates when it detects sequential
+ file access patterns and prefetches metadata accordingly. When coreutils
+ sorts entries by inode (and the inode numbers are scattered), Lustre sees
+ random access and doesn't activate statahead at all. Every file access then
+ becomes an individual network round-trip to the server instead of being
+ served from the prefetched cache.
+
+ Testing the du command after applying the patch to coreutils demonstrates
+ a ~5x performance improvement, with a ~9x improvement reported in production
+ environments. This primarily affects HPC and research environments, where
+ Lustre is commonly deployed.
+
+ The fix adds Lustre's filesystem magic number (0x0BD00BD0) to the list of
+ filesystems where inode sorting should be skipped, matching existing
+ behavior for NFS, CIFS, and tmpfs. The change is minimal, has been accepted
+ upstream in gnulib (which provides lib/fts.c to coreutils), and only affects
+ Lustre filesystems; all other filesystems remain unchanged.
+
+ [ Test Plan ]
+
+ This test requires setting up a Lustre filesystem environment to reproduce
+ the performance issue. The test can be performed in VMs using LXD.
+
+ ** Set Up Lustre Environment (one-time) **
+
+ 1. Create VMs on the host machine:
+    $ lxc launch ubuntu:noble lustre-server --vm \
+        -c limits.cpu=2 -c limits.memory=4GiB -c security.secureboot=false
+    $ lxc launch ubuntu:noble lustre-client --vm \
+        -c limits.cpu=2 -c limits.memory=2GiB -c security.secureboot=false
+
+ 2. Build and configure Lustre server (on lustre-server VM):
+    Install dependencies:
+    $ sudo add-apt-repository universe -y
+    $ sudo apt update && sudo apt upgrade -y
+    $ sudo apt install -y build-essential git libtool m4 autoconf \
+        libzfslinux-dev zfsutils-linux zfs-dkms dkms \
+        linux-headers-$(uname -r) libyaml-dev bison flex libmount-dev \
+        debhelper devscripts quilt python3 python-is-python3 libkeyutils-dev \
+        pkg-config libnl-3-dev libnl-genl-3-dev zlib1g-dev module-assistant \
+        libreadline-dev libpython3-dev swig
+
+    Install Lustre:
+    $ git clone https://github.com/lustre/lustre-release.git
+    $ cd lustre-release
+    $ sh autogen.sh && ./configure --with-zfs --disable-ldiskfs && make debs
+    $ cd debs
+    $ sudo dpkg -i lustre-server-modules-*.deb lustre-server-utils_*.deb
+    $ sudo depmod -a
+    $ sudo modprobe lustre
+
+    Configure networking:
+    $ sudo lnetctl lnet configure
+    $ IFACE=$(ip -o -4 route show to default | awk '{print $5}')
+    $ sudo lnetctl net del --net tcp
+    $ sudo lnetctl net add --net tcp0 --if $IFACE
+
+    Create and mount storage (start from here if restarting server):
+    $ truncate -s 5G /tmp/mdt.img && truncate -s 10G /tmp/ost.img
+    $ sudo losetup /dev/loop0 /tmp/mdt.img
+    $ sudo losetup /dev/loop1 /tmp/ost.img
+    $ SERVER_IP=$(hostname -I | awk '{print $1}')
+    $ sudo mkfs.lustre --fsname=testfs --mgs --mdt --backfstype=zfs \
+        --index=0 --reformat mdtpool/mdt /dev/loop0
+    $ sudo mkfs.lustre --fsname=testfs --ost --backfstype=zfs \
+        --mgsnode=${SERVER_IP}@tcp0 --index=0 --reformat ostpool/ost /dev/loop1
+    $ sudo mkdir -p /mnt/mdt /mnt/ost
+    $ sudo mount -t lustre mdtpool/mdt /mnt/mdt
+    $ sudo mount -t lustre ostpool/ost /mnt/ost
+
+ 3. Build and configure Lustre client (on lustre-client VM):
+    Install dependencies:
+    $ sudo add-apt-repository universe -y
+    $ sudo apt update && sudo apt upgrade -y
+    For Noble:
+    $ sudo apt install -y build-essential git libtool m4 autoconf dkms \
+        gperf texinfo help2man linux-headers-$(uname -r) libyaml-dev bison \
+        flex libmount-dev debhelper devscripts quilt python3 \
+        python-is-python3 libkeyutils-dev pkg-config libnl-3-dev \
+        libnl-genl-3-dev zlib1g-dev module-assistant libreadline-dev \
+        libpython3-dev swig libssl-dev
+    For Jammy:
+    $ sudo apt install -y git m4 autoconf dkms gperf texinfo help2man \
+        linux-headers-$(uname -r) libreadline-dev python3 python-is-python3 \
+        libpython3-dev libkrb5-dev libkeyutils-dev flex bison libmount-dev \
+        quilt swig libtool make libnl-genl-3-dev libnl-3-dev zlib1g-dev \
+        pkg-config libhwloc-dev libyaml-dev ed dpatch libsnmp-dev \
+        mpi-default-dev libncurses5-dev libncurses-dev gnupg libelf-dev gcc \
+        libssl-dev bc wget bzip2 build-essential udev kmod cpio \
+        module-assistant debhelper devscripts python3-distutils-extra rsync
+
+    Install Lustre:
+    $ git clone https://github.com/lustre/lustre-release.git
+    $ cd lustre-release
+    For Noble:
+    $ sh autogen.sh && ./configure --disable-server && make debs
+    For Jammy:
+    $ git checkout tags/2.15.5
+    In debian/control and debian/control.main, remove all occurrences of:
+      linux-headers-generic | linux-headers-amd64 | linux-headers-arm64,
+      linux-image | linux-image-amd64 | linux-image-arm64,
+      linux-headers-generic | linux-headers-amd64
+    $ sh autogen.sh && ./configure --disable-server --with-o2ib=no && \
+        make debs
+    For both releases:
+    $ cd debs && sudo dpkg -i *.deb
+
+    Configure networking (start from here if restarting client):
+    $ sudo modprobe lustre
+    $ sudo lnetctl lnet configure
+    $ IFACE=$(ip -o -4 route show to default | awk '{print $5}')
+    $ sudo lnetctl net del --net tcp
+    $ sudo lnetctl net add --net tcp0 --if $IFACE
+
+    Mount Lustre filesystem (replace SERVER_IP with the server's IP address):
+    $ sudo mkdir -p /mnt/lustre
+    $ sudo mount -t lustre SERVER_IP@tcp0:/testfs /mnt/lustre
+
+ ** Reproduce the Performance Issue **
+
+ 4. Create test files with scattered inodes (on lustre-client VM):
+    $ mkdir -p /mnt/lustre/bench
+    $ cd /mnt/lustre/bench
+    $ seq 1 100000 | shuf | xargs -P 20 -I {} touch file_{}
+
+ 5. Test unpatched coreutils (build outside /mnt/lustre/bench so the source
+    tree does not end up in the measured directory):
+    $ cd ~
+    $ pull-lp-source coreutils noble-updates
+    $ cd coreutils-*
+    $ ./bootstrap && export FORCE_UNSAFE_CONFIGURE=1 && ./configure && make
+    If you get the error "parse-datetime.tab.h: No such file or directory":
+    $ cd lib
+    $ bison -d parse-datetime.y -o parse-datetime.tab.c
+    $ cd .. && make
+    $ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
+    $ sudo lctl set_param llite.*.max_cached_mb=0
+    $ sudo lctl set_param llite.*.max_cached_mb=1024
+    $ sudo lctl set_param llite.*.statahead_stats=0
+    $ time ./src/du /mnt/lustre/bench
+    $ sudo lctl get_param llite.*.statahead_stats | grep -E \
+        "(statahead total|hit_total)"
+
+    Measured results:
+    - Time: ~35+ seconds
+    - statahead total: 0 (not activated)
+    - hit_total: 0 (N/A since statahead did not activate)
+
+ ** Verify the Fix **
+
+ 6. Test patched coreutils:
+    Follow the instructions above to build the patched coreutils, then:
+    $ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
+    $ sudo lctl set_param llite.*.max_cached_mb=0
+    $ sudo lctl set_param llite.*.max_cached_mb=1024
+    $ sudo lctl set_param llite.*.statahead_stats=0
+    $ time ./src/du /mnt/lustre/bench
+    $ sudo lctl get_param llite.*.statahead_stats | grep -E \
+        "(statahead total|hit_total)"
+
+    Measured results:
+    - Time: ~6 seconds (~5x faster)
+    - statahead total: 1 (activated)
+    - hit_total: ~99,000+ (prefetching worked)
+
+ [ Where problems could occur ]
+
+ The change modifies coreutils' file tree traversal logic in lib/fts.c,
+ specifically adding Lustre to the list of filesystems where directory entry
+ sorting is skipped. If the filesystem detection is incorrect or the change
+ has unintended side effects, problems could manifest in several ways:
+
+ 1. Incorrect filesystem detection: If the Lustre magic number check
+    (0x0BD00BD0) incorrectly matches a different filesystem type, that
+    filesystem would skip inode sorting when it shouldn't, potentially
+    causing performance degradation on that filesystem.
+
+ 2. Broader FTS impact: Since this change affects the FTS code shared by
+    several coreutils utilities (du, rm, chmod, chown, chgrp, chcon), any
+    regression would impact all of these tools, not just du. Users could
+    experience performance issues or incorrect behavior across file
+    traversal operations.
+
+ [ Other Info ]
+
+ Kernel Compatibility Limitation: Testing was only performed on Noble and
+ Jammy. Lustre does not currently support the newer kernel versions in
+ Questing and Resolute. Attempts to build Lustre on kernels 6.17+ fail with
+ compilation errors due to kernel API changes:
+ - 'struct page' no longer has 'index' member
+ - 'dev_get_flags()' function removed from kernel API
+
+ Changelog Handling: The upstream gnulib ChangeLog entries were not
+ backported as these are maintained in the gnulib repository, not coreutils.
+ Only the code changes to lib/fts.c were applied.
+
+ Upstream Status: This fix has been accepted upstream in response to bug
+ report [1], with the fix committed in [2].
+
+ [1] - https://bugs.gnu.org/80106
+ [2] - https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3
+
+ Original Description: the original description of the case follows.
+
  Quoting the upstream bug report:

  "The gnulib function dirent_inode_sort_may_be_useful() should return false
  for Lustre (i.e. #define S_MAGIC_LUSTRE 0x0BD00BD0 as seen in
  lustre/include/uapi/linux/lustre/lustre_user.h in the Lustre source tree as
  LL_SUPER_MAGIC [1]). Sorting dirents negatively impacts du performance on
  Lustre because it interferes with Lustre's ability to prefetch file
  metadata (via statahead). For context, Lustre is an open-source (GPLv2)
  out-of-tree Linux filesystem commonly used for HPC applications."

  This patch was merged upstream as:
  https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3

  Could we cherry-pick this to Ubuntu gnulib?
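
The magic-number check described above can be illustrated from the shell. This is a minimal sketch, not the code in lib/fts.c: the function name fts_sort_decision is hypothetical, and the NFS, CIFS, and tmpfs values are the usual Linux magic numbers rather than anything stated in this bug. It takes a filesystem type in hex (the form that `stat -f -c %t` prints) and reports whether inode sorting would be skipped once Lustre is on the list.

```shell
#!/bin/sh
# Sketch only: mirrors the decision described for the patched fts code.
# fts_sort_decision is a hypothetical helper, not part of coreutils.

fts_sort_decision() {
    # Normalize the input: lowercase, drop an optional 0x prefix and any
    # leading zeros, since stat prints the type without them (bd00bd0).
    magic=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/^0x//' -e 's/^0*//')
    case "$magic" in
        6969)     echo "skip" ;;  # NFS   (0x6969)
        ff534d42) echo "skip" ;;  # CIFS  (0xFF534D42)
        1021994)  echo "skip" ;;  # tmpfs (0x01021994)
        bd00bd0)  echo "skip" ;;  # Lustre (0x0BD00BD0), added by the fix
        *)        echo "sort" ;;  # all other filesystems keep inode sorting
    esac
}

# Literal inputs so the script runs without a Lustre mount:
fts_sort_decision 0x0BD00BD0   # Lustre magic number from this bug
fts_sort_decision ef53         # ext4 magic number, for comparison
```

On the client VM from the test plan, running `fts_sort_decision "$(stat -f -c %t /mnt/lustre)"` should print "skip" if the mount reports Lustre's magic number, which is a quick way to confirm the test directory is on the filesystem type the patch targets.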
** Description changed: [ Impact ] - Users running Lustre filesystems on Ubuntu experience severe performance - degradation with 'du' on large directories (10K+) files. Operations that should take seconds can take minutes. - - The issue occurs because by default, coreutils sorts directory entries by - increasing inode number for large directories. This breaks Lustre's - "statahead" prefetching feature, which activates when it detects sequential - file access patterns and prefetches metadata accordingly. When coreutils sorts - entries by inode (which are scattered), Lustre sees random access and doesn't - activate statahead at all, forcing every file access to be an individual + Users running Lustre filesystems on Ubuntu experience severe performance + degradation with 'du' on large directories (10K+) files. Operations that + should take seconds can take minutes. + + The issue occurs because by default, coreutils sorts directory entries by + increasing inode number for large directories. This breaks Lustre's + "statahead" prefetching feature, which activates when it detects sequential + file access patterns and prefetches metadata accordingly. When coreutils sorts + entries by inode (which are scattered), Lustre sees random access and doesn't + activate statahead at all, forcing every file access to be an individual network round-trip to the server instead of being served from prefetched cache. - Testing the du command after applying the patch to coreutils demonstrates ~5x - performance improvement, with ~9x improvement reported in production + Testing the du command after applying the patch to coreutils demonstrates ~5x + performance improvement, with ~9x improvement reported in production environments. This primarily affects HPC and research environments, where Lustre is commonly deployed. 
- The fix adds Lustre's filesystem magic number (0x0BD00BD0) to the list of - filesystems where inode sorting should be skipped, matching existing behavior - for NFS, CIFS, and tmpfs. The change is minimal, has been accepted upstream in - coreutils, and only affects Lustre filesystems - all other filesystems remain + The fix adds Lustre's filesystem magic number (0x0BD00BD0) to the list of + filesystems where inode sorting should be skipped, matching existing behavior + for NFS, CIFS, and tmpfs. The change is minimal, has been accepted upstream in + coreutils, and only affects Lustre filesystems - all other filesystems remain unchanged. [ Test Plan ] - This test requires setting up a Lustre filesystem environment to reproduce + This test requires setting up a Lustre filesystem environment to reproduce the performance issue. The test can be performed in VMs using LXD. ** Setup Lustre Environment (one-time) ** 1. Create VMs on the host machine: - $ lxc launch ubuntu:noble lustre-server --vm \ - -c limits.cpu=2 -c limits.memory=4GiB -c security.secureboot=false - $ lxc launch ubuntu:noble lustre-client --vm \ - -c limits.cpu=2 -c limits.memory=2GiB -c security.secureboot=false + $ lxc launch ubuntu:noble lustre-server --vm \ + -c limits.cpu=2 -c limits.memory=4GiB -c security.secureboot=false + $ lxc launch ubuntu:noble lustre-client --vm \ + -c limits.cpu=2 -c limits.memory=2GiB -c security.secureboot=false 2. 
Build and configure Lustre server (on lustre-server VM): - Install dependencies: - $ sudo add-apt-repository universe -y - $ sudo apt update && sudo apt upgrade -y - $ sudo apt install -y build-essential git libtool m4 autoconf \ - libzfslinux-dev zfsutils-linux zfs-dkms dkms \ - linux-headers-$(uname -r) libyaml-dev bison flex libmount-dev \ - debhelper devscripts quilt python3 python-is-python3 libkeyutils-dev \ - pkg-config libnl-3-dev libnl-genl-3-dev zlib1g-dev module-assistant \ - libreadline-dev libpython3-dev swig - - Install Lustre: - $ git clone https://github.com/lustre/lustre-release.git - $ cd lustre-release - $ sh autogen.sh && ./configure --with-zfs --disable-ldiskfs && make debs - $ cd debs - $ sudo dpkg -i lustre-server-modules-*.deb lustre-server-utils_*.deb - $ sudo depmod -a - $ sudo modprobe lustre - - Configure networking: - $ sudo lnetctl lnet configure - $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') - $ sudo lnetctl net del --net tcp - $ sudo lnetctl net add --net tcp0 --if $IFACE - - Create and mount storage (start from here if restarting server): - $ truncate -s 5G /tmp/mdt.img && truncate -s 10G /tmp/ost.img - $ sudo losetup /dev/loop0 /tmp/mdt.img - $ sudo losetup /dev/loop1 /tmp/ost.img - $ SERVER_IP=$(hostname -I | awk '{print $1}') - $ sudo mkfs.lustre --fsname=testfs --mgs --mdt --backfstype=zfs \ - --index=0 --reformat mdtpool/mdt /dev/loop0 - $ sudo mkfs.lustre --fsname=testfs --ost --backfstype=zfs \ - --mgsnode=${SERVER_IP}@tcp0 --index=0 --reformat ostpool/ost /dev/loop1 - $ sudo mkdir -p /mnt/mdt /mnt/ost - $ sudo mount -t lustre mdtpool/mdt /mnt/mdt - $ sudo mount -t lustre ostpool/ost /mnt/ost + Install dependencies: + $ sudo add-apt-repository universe -y + $ sudo apt update && sudo apt upgrade -y + $ sudo apt install -y build-essential git libtool m4 autoconf \ + libzfslinux-dev zfsutils-linux zfs-dkms dkms \ + linux-headers-$(uname -r) libyaml-dev bison flex libmount-dev \ + debhelper devscripts quilt 
python3 python-is-python3 libkeyutils-dev \ + pkg-config libnl-3-dev libnl-genl-3-dev zlib1g-dev module-assistant \ + libreadline-dev libpython3-dev swig + + Install Lustre: + $ git clone https://github.com/lustre/lustre-release.git + $ cd lustre-release + $ sh autogen.sh && ./configure --with-zfs --disable-ldiskfs && make debs + $ cd debs + $ sudo dpkg -i lustre-server-modules-*.deb lustre-server-utils_*.deb + $ sudo depmod -a + $ sudo modprobe lustre + + Configure networking: + $ sudo lnetctl lnet configure + $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') + $ sudo lnetctl net del --net tcp + $ sudo lnetctl net add --net tcp0 --if $IFACE + + Create and mount storage (start from here if restarting server): + $ truncate -s 5G /tmp/mdt.img && truncate -s 10G /tmp/ost.img + $ sudo losetup /dev/loop0 /tmp/mdt.img + $ sudo losetup /dev/loop1 /tmp/ost.img + $ SERVER_IP=$(hostname -I | awk '{print $1}') + $ sudo mkfs.lustre --fsname=testfs --mgs --mdt --backfstype=zfs \ + --index=0 --reformat mdtpool/mdt /dev/loop0 + $ sudo mkfs.lustre --fsname=testfs --ost --backfstype=zfs \ + --mgsnode=${SERVER_IP}@tcp0 --index=0 --reformat ostpool/ost /dev/loop1 + $ sudo mkdir -p /mnt/mdt /mnt/ost + $ sudo mount -t lustre mdtpool/mdt /mnt/mdt + $ sudo mount -t lustre ostpool/ost /mnt/ost 3. 
Build and configure Lustre client (on lustre-client VM): - Install dependencies: - $ sudo add-apt-repository universe -y - $ sudo apt update && sudo apt upgrade -y - For Noble: - $ sudo apt install -y build-essential git libtool m4 autoconf dkms \ - gperf texinfo help2man linux-headers-$(uname -r) libyaml-dev bison \ - flex libmount-dev debhelper devscripts quilt python3 \ - python-is-python3 libkeyutils-dev pkg-config libnl-3-dev \ - libnl-genl-3-dev zlib1g-dev module-assistant libreadline-dev \ - libpython3-dev swig libssl-dev - For Jammy: - $ sudo apt install -y git m4 autoconf dkms gperf texinfo help2man \ - linux-headers-$(uname -r) libreadline-dev python3 python-is-python3 \ - libpython3-dev libkrb5-dev libkeyutils-dev flex bison libmount-dev \ - quilt swig libtool make libnl-genl-3-dev libnl-3-dev zlib1g-dev \ - pkg-config libhwloc-dev libyaml-dev ed dpatch libsnmp-dev \ - mpi-default-dev libncurses5-dev libncurses-dev gnupg libelf-dev gcc \ - libssl-dev bc wget bzip2 build-essential udev kmod cpio \ - module-assistant debhelper devscripts python3-distutils-extra rsync - - Install Lustre: - $ git clone https://github.com/lustre/lustre-release.git - $ cd lustre-release - For Noble: - $ sh autogen.sh && ./configure --disable-server && make debs - For Jammy: - $ git checkout tags/2.15.5 - Remove all occurrences of - linux-headers-generic | linux-headers-amd64 | linux-headers-arm64, - linux-image | linux-image-amd64 | linux-image-arm64, - linux-headers-generic | linux-headers-amd64 - in debian/control and debian/control.main - $ sh autogen.sh && ./configure --disable-server --with-o2ib=no && \ - make debs - $ cd debs && sudo dpkg -i *.deb - - Configure networking (start from here if restarting client): - $ sudo modprobe lustre - $ sudo lnetctl lnet configure - $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') - $ sudo lnetctl net del --net tcp - $ sudo lnetctl net add --net tcp0 --if $IFACE - - Mount Lustre filesystem (replace SERVER_IP with actual 
server IP): - $ sudo mkdir -p /mnt/lustre - $ sudo mount -t lustre SERVER_IP@tcp0:/testfs /mnt/lustre + Install dependencies: + $ sudo add-apt-repository universe -y + $ sudo apt update && sudo apt upgrade -y + For Noble: + $ sudo apt install -y build-essential git libtool m4 autoconf dkms \ + gperf texinfo help2man linux-headers-$(uname -r) libyaml-dev bison \ + flex libmount-dev debhelper devscripts quilt python3 \ + python-is-python3 libkeyutils-dev pkg-config libnl-3-dev \ + libnl-genl-3-dev zlib1g-dev module-assistant libreadline-dev \ + libpython3-dev swig libssl-dev + For Jammy: + $ sudo apt install -y git m4 autoconf dkms gperf texinfo help2man \ + linux-headers-$(uname -r) libreadline-dev python3 python-is-python3 \ + libpython3-dev libkrb5-dev libkeyutils-dev flex bison libmount-dev \ + quilt swig libtool make libnl-genl-3-dev libnl-3-dev zlib1g-dev \ + pkg-config libhwloc-dev libyaml-dev ed dpatch libsnmp-dev \ + mpi-default-dev libncurses5-dev libncurses-dev gnupg libelf-dev gcc \ + libssl-dev bc wget bzip2 build-essential udev kmod cpio \ + module-assistant debhelper devscripts python3-distutils-extra rsync + + Install Lustre: + $ git clone https://github.com/lustre/lustre-release.git + $ cd lustre-release + For Noble: + $ sh autogen.sh && ./configure --disable-server && make debs + For Jammy: + $ git checkout tags/2.15.5 + Remove all occurrences of + linux-headers-generic | linux-headers-amd64 | linux-headers-arm64, + linux-image | linux-image-amd64 | linux-image-arm64, + linux-headers-generic | linux-headers-amd64 + in debian/control and debian/control.main + $ sh autogen.sh && ./configure --disable-server --with-o2ib=no && \ + make debs + $ cd debs && sudo dpkg -i *.deb + + Configure networking (start from here if restarting client): + $ sudo modprobe lustre + $ sudo lnetctl lnet configure + $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') + $ sudo lnetctl net del --net tcp + $ sudo lnetctl net add --net tcp0 --if $IFACE + + Mount Lustre 
filesystem (replace SERVER_IP with actual server IP): + $ sudo mkdir -p /mnt/lustre + $ sudo mount -t lustre SERVER_IP@tcp0:/testfs /mnt/lustre ** Reproduce the Performance Issue ** 4. Create test files with scattered inodes (on lustre-client VM): - $ mkdir -p /mnt/lustre/bench - $ cd /mnt/lustre/bench - $ seq 1 100000 | shuf | xargs -P 20 -I {} touch file_{} + $ mkdir -p /mnt/lustre/bench + $ cd /mnt/lustre/bench + $ seq 1 100000 | shuf | xargs -P 20 -I {} touch file_{} 5. Test unpatched coreutils: - $ pull-lp-source coreutils noble-updates - $ cd coreutils-* - $ ./bootstrap && export FORCE_UNSAFE_CONFIGURE=1 && ./configure && make - If you get error "parse-datetime.tab.h: No such file or directory": - $ cd lib - $ bison -d parse-datetime.y -o parse-datetime.tab.c - Run make again - $ sync && echo 3 > /proc/sys/vm/drop_caches - $ sudo lctl set_param llite.*.max_cached_mb=0 - $ sudo lctl set_param llite.*.max_cached_mb=1024 - $ sudo lctl set_param llite.*.statahead_stats=0 - $ time ./src/du /mnt/lustre/bench - $ sudo lctl get_param llite.*.statahead_stats | grep -E \ - "(statahead total|hit_total)" - - Measured results: - - Time: ~35+ seconds - - statahead total: 0 (not activated) - - hit_total: 0 (N/A since statahead did not activate) + $ pull-lp-source coreutils noble-updates + $ cd coreutils-* + $ ./bootstrap && export FORCE_UNSAFE_CONFIGURE=1 && ./configure && make + If you get error "parse-datetime.tab.h: No such file or directory": + $ cd lib + $ bison -d parse-datetime.y -o parse-datetime.tab.c + Run make again + $ sync && echo 3 > /proc/sys/vm/drop_caches + $ sudo lctl set_param llite.*.max_cached_mb=0 + $ sudo lctl set_param llite.*.max_cached_mb=1024 + $ sudo lctl set_param llite.*.statahead_stats=0 + $ time ./src/du /mnt/lustre/bench + $ sudo lctl get_param llite.*.statahead_stats | grep -E \ + "(statahead total|hit_total)" + + Measured results: + - Time: ~35+ seconds + - statahead total: 0 (not activated) + - hit_total: 0 (N/A since statahead did not 
activate) ** Verify the Fix ** 6. Test patched coreutils: - Follow the instructions above to build the patched coreutils, then: - $ sync && echo 3 > /proc/sys/vm/drop_caches - $ sudo lctl set_param llite.*.max_cached_mb=0 - $ sudo lctl set_param llite.*.max_cached_mb=1024 - $ sudo lctl set_param llite.*.statahead_stats=0 - $ time ./src/du /mnt/lustre/bench - $ sudo lctl get_param llite.*.statahead_stats | grep -E \ - "(statahead total|hit_total)" - - Measured results: - - Time: ~6 seconds (5x faster) - - statahead total: 1 (activated) - - hit_total: ~99,000+ (prefetching worked) + Follow the instructions above to build the patched coreutils, then: + $ sync && echo 3 > /proc/sys/vm/drop_caches + $ sudo lctl set_param llite.*.max_cached_mb=0 + $ sudo lctl set_param llite.*.max_cached_mb=1024 + $ sudo lctl set_param llite.*.statahead_stats=0 + $ time ./src/du /mnt/lustre/bench + $ sudo lctl get_param llite.*.statahead_stats | grep -E \ + "(statahead total|hit_total)" + + Measured results: + - Time: ~6 seconds (5x faster) + - statahead total: 1 (activated) + - hit_total: ~99,000+ (prefetching worked) [ Where problems could occur ] - The change modifies coreutils' file tree traversal logic in lib/fts.c, - specifically adding Lustre to the list of filesystems where directory entry - sorting is skipped. If the filesystem detection is incorrect or the change + The change modifies coreutils' file tree traversal logic in lib/fts.c, + specifically adding Lustre to the list of filesystems where directory entry + sorting is skipped. If the filesystem detection is incorrect or the change has unintended side effects, problems could manifest in several ways: 1. Incorrect filesystem detection: If the Lustre magic number check - (0x0BD00BD0) incorrectly matches a different filesystem type, that - filesystem would skip inode sorting when it shouldn't, potentially causing - performance degradation on other filesystems. - - 2. 
Broader FTS impact: Since this change affects the core FTS library used by - multiple utilities (du, ls, find, chmod, chown), any regression would - impact all these tools, not just du. Users could experience performance - issues or incorrect behavior across file traversal operations. + (0x0BD00BD0) incorrectly matches a different filesystem type, that + filesystem would skip inode sorting when it shouldn't, potentially causing + performance degradation on other filesystems. + + 2. Broader FTS impact: Since this change affects the core FTS library used by + multiple utilities (du, ls, find, chmod, chown), any regression would + impact all these tools, not just du. Users could experience performance + issues or incorrect behavior across file traversal operations. [ Other Info ] Kernel Compatibility Limitation: Testing was only performed on Noble and Jammy. Lustre does not currently support the newer kernel versions in Questing and Resolute. Attempts to build Lustre on kernels 6.17+ fail with compilation errors due to kernel API changes: - 'struct page' no longer has 'index' member - 'dev_get_flags()' function removed from kernel API - Changelog Handling: The upstream gnulib ChangeLog entries were not backported - as these are maintained in the gnulib repository, not coreutils. Only the code - changes to lib/fts.c were applied. - - Upstream Status: This fix has been accepted upstream in response to bug report + Changelog Handling: The upstream gnulib ChangeLog entries were not backported + as these are maintained in the gnulib repository, not coreutils. Only the code + changes to lib/fts.c were applied. + + Upstream Status: This fix has been accepted upstream in response to bug report [1], with the fix committed in [2]. 
[1] - https://bugs.gnu.org/80106 [2] - https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3 Original Description: Find the original description of the case below: Quoting the upstream bug report: "The gnulib function dirent_inode_sort_may_be_useful() should return false for Lustre (i.e. #define S_MAGIC_LUSTRE 0x0BD00BD0 as seen in lustre/include/uapi/linux/lustre/lustre_user.h in the Lustre source tree as LL_SUPER_MAGIC [1]). Sorting dirents negatively impacts du performance on Lustre because it interferes with Lustre's ability to prefetch file metadata (via statahead). For context, Lustre is an open-source (GPLv2) out-of-tree Linux filesystem commonly used for HPC applications." This patch was merged upstream as: https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3 Could we cherry-pick this to Ubuntu gnulib? ** Description changed: [ Impact ] Users running Lustre filesystems on Ubuntu experience severe performance degradation with 'du' on large directories (10K+) files. Operations that should take seconds can take minutes. The issue occurs because by default, coreutils sorts directory entries by increasing inode number for large directories. This breaks Lustre's "statahead" prefetching feature, which activates when it detects sequential file access patterns and prefetches metadata accordingly. When coreutils sorts entries by inode (which are scattered), Lustre sees random access and doesn't activate statahead at all, forcing every file access to be an individual network round-trip to the server instead of being served from prefetched cache. Testing the du command after applying the patch to coreutils demonstrates ~5x performance improvement, with ~9x improvement reported in production - environments. This primarily affects HPC and research environments, where Lustre is commonly deployed. + environments. This primarily affects HPC and research environments, + where Lustre is commonly deployed. 
The fix adds Lustre's filesystem magic number (0x0BD00BD0) to the list of filesystems where inode sorting should be skipped, matching existing behavior for NFS, CIFS, and tmpfs. The change is minimal, has been accepted upstream in coreutils, and only affects Lustre filesystems - all other filesystems remain unchanged. [ Test Plan ] This test requires setting up a Lustre filesystem environment to reproduce the performance issue. The test can be performed in VMs using LXD. ** Setup Lustre Environment (one-time) ** 1. Create VMs on the host machine: $ lxc launch ubuntu:noble lustre-server --vm \ -c limits.cpu=2 -c limits.memory=4GiB -c security.secureboot=false $ lxc launch ubuntu:noble lustre-client --vm \ -c limits.cpu=2 -c limits.memory=2GiB -c security.secureboot=false 2. Build and configure Lustre server (on lustre-server VM): Install dependencies: $ sudo add-apt-repository universe -y $ sudo apt update && sudo apt upgrade -y $ sudo apt install -y build-essential git libtool m4 autoconf \ libzfslinux-dev zfsutils-linux zfs-dkms dkms \ linux-headers-$(uname -r) libyaml-dev bison flex libmount-dev \ debhelper devscripts quilt python3 python-is-python3 libkeyutils-dev \ pkg-config libnl-3-dev libnl-genl-3-dev zlib1g-dev module-assistant \ libreadline-dev libpython3-dev swig Install Lustre: $ git clone https://github.com/lustre/lustre-release.git $ cd lustre-release $ sh autogen.sh && ./configure --with-zfs --disable-ldiskfs && make debs $ cd debs $ sudo dpkg -i lustre-server-modules-*.deb lustre-server-utils_*.deb $ sudo depmod -a $ sudo modprobe lustre Configure networking: $ sudo lnetctl lnet configure $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') $ sudo lnetctl net del --net tcp $ sudo lnetctl net add --net tcp0 --if $IFACE Create and mount storage (start from here if restarting server): $ truncate -s 5G /tmp/mdt.img && truncate -s 10G /tmp/ost.img $ sudo losetup /dev/loop0 /tmp/mdt.img $ sudo losetup /dev/loop1 /tmp/ost.img $ SERVER_IP=$(hostname 
-I | awk '{print $1}')
   $ sudo mkfs.lustre --fsname=testfs --mgs --mdt --backfstype=zfs \
       --index=0 --reformat mdtpool/mdt /dev/loop0
   $ sudo mkfs.lustre --fsname=testfs --ost --backfstype=zfs \
       --mgsnode=${SERVER_IP}@tcp0 --index=0 --reformat ostpool/ost /dev/loop1
   $ sudo mkdir -p /mnt/mdt /mnt/ost
   $ sudo mount -t lustre mdtpool/mdt /mnt/mdt
   $ sudo mount -t lustre ostpool/ost /mnt/ost

3. Build and configure Lustre client (on lustre-client VM):

   Install dependencies:
   $ sudo add-apt-repository universe -y
   $ sudo apt update && sudo apt upgrade -y

   For Noble:
   $ sudo apt install -y build-essential git libtool m4 autoconf dkms \
       gperf texinfo help2man linux-headers-$(uname -r) libyaml-dev bison \
       flex libmount-dev debhelper devscripts quilt python3 \
       python-is-python3 libkeyutils-dev pkg-config libnl-3-dev \
       libnl-genl-3-dev zlib1g-dev module-assistant libreadline-dev \
       libpython3-dev swig libssl-dev

   For Jammy:
   $ sudo apt install -y git m4 autoconf dkms gperf texinfo help2man \
       linux-headers-$(uname -r) libreadline-dev python3 python-is-python3 \
       libpython3-dev libkrb5-dev libkeyutils-dev flex bison libmount-dev \
       quilt swig libtool make libnl-genl-3-dev libnl-3-dev zlib1g-dev \
       pkg-config libhwloc-dev libyaml-dev ed dpatch libsnmp-dev \
       mpi-default-dev libncurses5-dev libncurses-dev gnupg libelf-dev gcc \
       libssl-dev bc wget bzip2 build-essential udev kmod cpio \
       module-assistant debhelper devscripts python3-distutils-extra rsync

   Install Lustre:
   $ git clone https://github.com/lustre/lustre-release.git
   $ cd lustre-release

   For Noble:
   $ sh autogen.sh && ./configure --disable-server && make debs

   For Jammy:
   $ git checkout tags/2.15.5
   Remove all occurrences of
     linux-headers-generic | linux-headers-amd64 | linux-headers-arm64,
     linux-image | linux-image-amd64 | linux-image-arm64,
     linux-headers-generic | linux-headers-amd64
   in debian/control and debian/control.main
   $ sh autogen.sh && ./configure --disable-server --with-o2ib=no && \
       make debs

   $ cd debs && sudo dpkg -i *.deb

   Configure networking (start from here if restarting client):
   $ sudo modprobe lustre
   $ sudo lnetctl lnet configure
   $ IFACE=$(ip -o -4 route show to default | awk '{print $5}')
   $ sudo lnetctl net del --net tcp
   $ sudo lnetctl net add --net tcp0 --if $IFACE

   Mount Lustre filesystem (replace SERVER_IP with the actual server IP):
   $ sudo mkdir -p /mnt/lustre
   $ sudo mount -t lustre SERVER_IP@tcp0:/testfs /mnt/lustre

** Reproduce the Performance Issue **

4. Create test files with scattered inodes (on lustre-client VM):
   $ mkdir -p /mnt/lustre/bench
   $ cd /mnt/lustre/bench
   $ seq 1 100000 | shuf | xargs -P 20 -I {} touch file_{}

5. Test unpatched coreutils:
   $ pull-lp-source coreutils noble-updates
   $ cd coreutils-*
   $ ./bootstrap && export FORCE_UNSAFE_CONFIGURE=1 && ./configure && make

   If you get the error "parse-datetime.tab.h: No such file or directory":
   $ cd lib
   $ bison -d parse-datetime.y -o parse-datetime.tab.c
   Then return to the top-level directory and run make again.

   $ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
   $ sudo lctl set_param llite.*.max_cached_mb=0
   $ sudo lctl set_param llite.*.max_cached_mb=1024
   $ sudo lctl set_param llite.*.statahead_stats=0
   $ time ./src/du /mnt/lustre/bench
   $ sudo lctl get_param llite.*.statahead_stats | grep -E \
       "(statahead total|hit_total)"

   Measured results:
   - Time: ~35+ seconds
   - statahead total: 0 (not activated)
   - hit_total: 0 (N/A since statahead did not activate)

** Verify the Fix **

6. Test patched coreutils:
   Follow the instructions above to build the patched coreutils, then:
   $ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
   $ sudo lctl set_param llite.*.max_cached_mb=0
   $ sudo lctl set_param llite.*.max_cached_mb=1024
   $ sudo lctl set_param llite.*.statahead_stats=0
   $ time ./src/du /mnt/lustre/bench
   $ sudo lctl get_param llite.*.statahead_stats | grep -E \
       "(statahead total|hit_total)"

   Measured results:
   - Time: ~6 seconds (5x faster)
   - statahead total: 1 (activated)
   - hit_total: ~99,000+ (prefetching worked)

[ Where problems could occur ]

The change modifies coreutils' file tree traversal logic in lib/fts.c,
specifically adding Lustre to the list of filesystems where directory entry
sorting is skipped. If the filesystem detection is incorrect or the change has
unintended side effects, problems could manifest in several ways:

1. Incorrect filesystem detection: If the Lustre magic number check
   (0x0BD00BD0) incorrectly matches a different filesystem type, that
   filesystem would skip inode sorting when it shouldn't, potentially causing
   performance degradation on other filesystems.

2. Broader FTS impact: Since this change affects the core FTS library used by
   multiple utilities (du, ls, find, chmod, chown), any regression would impact
   all these tools, not just du. Users could experience performance issues or
   incorrect behavior across file traversal operations.

[ Other Info ]

Kernel Compatibility Limitation: Testing was only performed on Noble and Jammy.
Lustre does not currently support the newer kernel versions in Questing and
Resolute. Attempts to build Lustre on kernels 6.17+ fail with compilation
errors due to kernel API changes:
- 'struct page' no longer has 'index' member
- 'dev_get_flags()' function removed from kernel API

Changelog Handling: The upstream gnulib ChangeLog entries were not backported
as these are maintained in the gnulib repository, not coreutils. Only the code
changes to lib/fts.c were applied.
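The magic-number check discussed under [ Where problems could occur ] can be
illustrated from the shell when validating a test setup. This is only a rough
sketch, not the gnulib code: it assumes GNU stat, whose "-f -c %t" format
prints the filesystem type in hexadecimal, and the helper names
is_lustre_magic and fs_magic_of are hypothetical.

```shell
# Lustre superblock magic, as quoted in the upstream bug report.
LUSTRE_MAGIC=0x0BD00BD0

# is_lustre_magic HEX (hypothetical helper, for illustration only):
# succeeds if HEX, with or without a 0x prefix, equals the Lustre magic.
# Returns 2 for empty or non-hexadecimal input.
is_lustre_magic() {
    hex=${1#0x}
    hex=${hex#0X}
    case $hex in
        ''|*[!0-9A-Fa-f]*) return 2 ;;
    esac
    # Compare numerically so leading zeros and letter case do not matter.
    [ "$((0x$hex))" -eq "$(($LUSTRE_MAGIC))" ]
}

# fs_magic_of PATH (hypothetical helper): print the filesystem type of PATH
# in hex. Assumes GNU coreutils stat.
fs_magic_of() {
    stat -f -c %t -- "$1"
}
```

On the client VM, running
is_lustre_magic "$(fs_magic_of /mnt/lustre)" && echo "Lustre mount"
should print the message if the mount reports the magic number above; on any
other filesystem type the check is expected to fail.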
Upstream Status: This fix has been accepted upstream in response to bug report
[1], with the fix committed in [2].

[1] - https://bugs.gnu.org/80106
[2] - https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3

Original Description:

Find the original description of the case below:

Quoting the upstream bug report:

"The gnulib function dirent_inode_sort_may_be_useful() should return false for
Lustre (i.e. #define S_MAGIC_LUSTRE 0x0BD00BD0 as seen in
lustre/include/uapi/linux/lustre/lustre_user.h in the Lustre source tree as
LL_SUPER_MAGIC [1]). Sorting dirents negatively impacts du performance on
Lustre because it interferes with Lustre's ability to prefetch file metadata
(via statahead). For context, Lustre is an open-source (GPLv2) out-of-tree
Linux filesystem commonly used for HPC applications."

This patch was merged upstream as:
https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3

Could we cherry-pick this to Ubuntu gnulib?

** Description changed:

[ Impact ]

Users running Lustre filesystems on Ubuntu experience severe performance
degradation with 'du' on large directories (10K+ files). Operations that
should take seconds can take minutes.

The issue occurs because by default, coreutils sorts directory entries by
increasing inode number for large directories. This breaks Lustre's
"statahead" prefetching feature, which activates when it detects sequential
file access patterns and prefetches metadata accordingly. When coreutils sorts
entries by inode (which are scattered), Lustre sees random access and doesn't
activate statahead at all, forcing every file access to be an individual
network round-trip to the server instead of being served from prefetched cache.

Testing the du command after applying the patch to coreutils demonstrates ~5x
performance improvement, with ~9x improvement reported in production
environments. This primarily affects HPC and research environments, where
Lustre is commonly deployed.

The fix adds Lustre's filesystem magic number (0x0BD00BD0) to the list of
filesystems where inode sorting should be skipped, matching existing behavior
for NFS, CIFS, and tmpfs. The change is minimal, has been accepted upstream in
coreutils, and only affects Lustre filesystems - all other filesystems remain
unchanged.

[ Test Plan ]

This test requires setting up a Lustre filesystem environment to reproduce
the performance issue. The test can be performed in VMs using LXD.

** Setup Lustre Environment (one-time) **

1.
Create VMs on the host machine: $ lxc launch ubuntu:noble lustre-server --vm \ -c limits.cpu=2 -c limits.memory=4GiB -c security.secureboot=false $ lxc launch ubuntu:noble lustre-client --vm \ -c limits.cpu=2 -c limits.memory=2GiB -c security.secureboot=false 2. Build and configure Lustre server (on lustre-server VM): Install dependencies: $ sudo add-apt-repository universe -y $ sudo apt update && sudo apt upgrade -y $ sudo apt install -y build-essential git libtool m4 autoconf \ libzfslinux-dev zfsutils-linux zfs-dkms dkms \ linux-headers-$(uname -r) libyaml-dev bison flex libmount-dev \ debhelper devscripts quilt python3 python-is-python3 libkeyutils-dev \ pkg-config libnl-3-dev libnl-genl-3-dev zlib1g-dev module-assistant \ libreadline-dev libpython3-dev swig Install Lustre: $ git clone https://github.com/lustre/lustre-release.git $ cd lustre-release $ sh autogen.sh && ./configure --with-zfs --disable-ldiskfs && make debs $ cd debs $ sudo dpkg -i lustre-server-modules-*.deb lustre-server-utils_*.deb $ sudo depmod -a $ sudo modprobe lustre Configure networking: $ sudo lnetctl lnet configure $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') $ sudo lnetctl net del --net tcp $ sudo lnetctl net add --net tcp0 --if $IFACE Create and mount storage (start from here if restarting server): $ truncate -s 5G /tmp/mdt.img && truncate -s 10G /tmp/ost.img $ sudo losetup /dev/loop0 /tmp/mdt.img $ sudo losetup /dev/loop1 /tmp/ost.img $ SERVER_IP=$(hostname -I | awk '{print $1}') $ sudo mkfs.lustre --fsname=testfs --mgs --mdt --backfstype=zfs \ --index=0 --reformat mdtpool/mdt /dev/loop0 $ sudo mkfs.lustre --fsname=testfs --ost --backfstype=zfs \ --mgsnode=${SERVER_IP}@tcp0 --index=0 --reformat ostpool/ost /dev/loop1 $ sudo mkdir -p /mnt/mdt /mnt/ost $ sudo mount -t lustre mdtpool/mdt /mnt/mdt $ sudo mount -t lustre ostpool/ost /mnt/ost 3. 
Build and configure Lustre client (on lustre-client VM): Install dependencies: $ sudo add-apt-repository universe -y $ sudo apt update && sudo apt upgrade -y For Noble: $ sudo apt install -y build-essential git libtool m4 autoconf dkms \ gperf texinfo help2man linux-headers-$(uname -r) libyaml-dev bison \ flex libmount-dev debhelper devscripts quilt python3 \ python-is-python3 libkeyutils-dev pkg-config libnl-3-dev \ libnl-genl-3-dev zlib1g-dev module-assistant libreadline-dev \ libpython3-dev swig libssl-dev For Jammy: $ sudo apt install -y git m4 autoconf dkms gperf texinfo help2man \ linux-headers-$(uname -r) libreadline-dev python3 python-is-python3 \ libpython3-dev libkrb5-dev libkeyutils-dev flex bison libmount-dev \ quilt swig libtool make libnl-genl-3-dev libnl-3-dev zlib1g-dev \ pkg-config libhwloc-dev libyaml-dev ed dpatch libsnmp-dev \ mpi-default-dev libncurses5-dev libncurses-dev gnupg libelf-dev gcc \ libssl-dev bc wget bzip2 build-essential udev kmod cpio \ module-assistant debhelper devscripts python3-distutils-extra rsync Install Lustre: $ git clone https://github.com/lustre/lustre-release.git $ cd lustre-release For Noble: $ sh autogen.sh && ./configure --disable-server && make debs For Jammy: $ git checkout tags/2.15.5 Remove all occurrences of linux-headers-generic | linux-headers-amd64 | linux-headers-arm64, linux-image | linux-image-amd64 | linux-image-arm64, linux-headers-generic | linux-headers-amd64 in debian/control and debian/control.main $ sh autogen.sh && ./configure --disable-server --with-o2ib=no && \ make debs $ cd debs && sudo dpkg -i *.deb Configure networking (start from here if restarting client): $ sudo modprobe lustre $ sudo lnetctl lnet configure $ IFACE=$(ip -o -4 route show to default | awk '{print $5}') $ sudo lnetctl net del --net tcp $ sudo lnetctl net add --net tcp0 --if $IFACE Mount Lustre filesystem (replace SERVER_IP with actual server IP): $ sudo mkdir -p /mnt/lustre $ sudo mount -t lustre SERVER_IP@tcp0:/testfs 
/mnt/lustre ** Reproduce the Performance Issue ** 4. Create test files with scattered inodes (on lustre-client VM): $ mkdir -p /mnt/lustre/bench $ cd /mnt/lustre/bench $ seq 1 100000 | shuf | xargs -P 20 -I {} touch file_{} 5. Test unpatched coreutils: $ pull-lp-source coreutils noble-updates $ cd coreutils-* $ ./bootstrap && export FORCE_UNSAFE_CONFIGURE=1 && ./configure && make If you get error "parse-datetime.tab.h: No such file or directory": $ cd lib $ bison -d parse-datetime.y -o parse-datetime.tab.c Run make again $ sync && echo 3 > /proc/sys/vm/drop_caches $ sudo lctl set_param llite.*.max_cached_mb=0 $ sudo lctl set_param llite.*.max_cached_mb=1024 $ sudo lctl set_param llite.*.statahead_stats=0 $ time ./src/du /mnt/lustre/bench $ sudo lctl get_param llite.*.statahead_stats | grep -E \ "(statahead total|hit_total)" Measured results: - Time: ~35+ seconds - statahead total: 0 (not activated) - hit_total: 0 (N/A since statahead did not activate) ** Verify the Fix ** 6. Test patched coreutils: Follow the instructions above to build the patched coreutils, then: $ sync && echo 3 > /proc/sys/vm/drop_caches $ sudo lctl set_param llite.*.max_cached_mb=0 $ sudo lctl set_param llite.*.max_cached_mb=1024 $ sudo lctl set_param llite.*.statahead_stats=0 $ time ./src/du /mnt/lustre/bench $ sudo lctl get_param llite.*.statahead_stats | grep -E \ "(statahead total|hit_total)" Measured results: - Time: ~6 seconds (5x faster) - statahead total: 1 (activated) - hit_total: ~99,000+ (prefetching worked) [ Where problems could occur ] The change modifies coreutils' file tree traversal logic in lib/fts.c, specifically adding Lustre to the list of filesystems where directory entry sorting is skipped. If the filesystem detection is incorrect or the change has unintended side effects, problems could manifest in several ways: 1. 
Incorrect filesystem detection: If the Lustre magic number check (0x0BD00BD0) incorrectly matches a different filesystem type, that filesystem would skip inode sorting when it shouldn't, potentially causing performance degradation on other filesystems. 2. Broader FTS impact: Since this change affects the core FTS library used by multiple utilities (du, ls, find, chmod, chown), any regression would impact all these tools, not just du. Users could experience performance issues or incorrect behavior across file traversal operations. [ Other Info ] Kernel Compatibility Limitation: Testing was only performed on Noble and Jammy. Lustre does not currently support the newer kernel versions in Questing and Resolute. - Attempts to build Lustre on kernels 6.17+ fail with compilation errors due to kernel API changes: + Attempts to build Lustre on kernels 6.17+ fail with compilation errors due to + kernel API changes: - 'struct page' no longer has 'index' member - 'dev_get_flags()' function removed from kernel API Changelog Handling: The upstream gnulib ChangeLog entries were not backported as these are maintained in the gnulib repository, not coreutils. Only the code changes to lib/fts.c were applied. Upstream Status: This fix has been accepted upstream in response to bug report [1], with the fix committed in [2]. [1] - https://bugs.gnu.org/80106 [2] - https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3 Original Description: Find the original description of the case below: Quoting the upstream bug report: "The gnulib function dirent_inode_sort_may_be_useful() should return false for Lustre (i.e. #define S_MAGIC_LUSTRE 0x0BD00BD0 as seen in lustre/include/uapi/linux/lustre/lustre_user.h in the Lustre source tree as LL_SUPER_MAGIC [1]). Sorting dirents negatively impacts du performance on Lustre because it interferes with Lustre's ability to prefetch file metadata (via statahead). 
  For context, Lustre is an open-source (GPLv2) out-of-tree Linux
  filesystem commonly used for HPC applications."

  This patch was merged upstream as:
  https://github.com/coreutils/gnulib/commit/578b8d7dc5e3fc00d308660fa60bb529a2e42bb3

  Could we cherry-pick this to Ubuntu gnulib?

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2137373

Title:
  Slow du performance on Lustre for large directories

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/2137373/+subscriptions

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
