On Mar 26, 2025, at 18:28, 권세훈 via lustre-discuss <[email protected]> wrote:
>
> Hello.
>
> My name is Sehoon Kwon, and I'm a developer at Gluesys, a storage solution
> provider based in South Korea.
>
> We are currently working with Lustre version 2.15.5, and during testing in
> a RoCE environment we encountered an LBUG. Checking the community issue
> tracker (LU-16637), we confirmed that a similar issue had been reported
> and resolved in a later release (Lustre 2.16).
>
> We also noted that there had been an effort to backport the fix to the
> b2_15 branch; however, based on our investigation, it appears that the
> patch has not yet been merged. As the stability of the fix remains
> unverified in this branch, we are preparing to evaluate the patch
> internally, referring to the Maloo-based testing you conducted as a
> reference.
>
> We have backported the commit addressing LU-16637 to our ZFS-based Lustre
> 2.15.5 environment and successfully completed the build, along with
> several other fixes.
>
> Following the Testing HOWTO on the Lustre wiki, we executed sanity.sh and
> observed that the script includes nearly 1000 test cases. However, in some
> shared test logs from Whamcloud, we noticed that only around 300 tests
> were actually run.
>
> We would appreciate your clarification on the following points:
>
> • Are there any default test sets or predefined exclusions when running
> sanity.sh? Alternatively, does Whamcloud maintain an internal list of
> commonly executed tests?

The number of subtests that are run depends on the configuration. The script
prints a message for each subtest that is skipped, for example because it
depends on a newer server version, on two or more MDTs or OSTs, on missing
tools, etc.

> • For the 2.15 branch, is there any recommended test suite or guideline
> for verifying backported patches?

The tests that should be run depend on what the patch is changing. We run
nearly all of the tests for every patch (about 150h of testing with different
configurations, kernels, features, etc.), unless the patch does not change
any functional code and is marked "trivial", in which case it only runs about
6-8h of testing (sanity, sanity-lnet).
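As a rough sketch of how subtest selection works when running sanity.sh by
hand: the Testing HOWTO describes ONLY and EXCEPT environment variables that
the test framework honors, so you can run just the subtests touched by your
backport or skip known-problematic ones. The paths and subtest numbers below
are placeholders, not recommendations; check lustre/tests/test-framework.sh on
your branch for the exact variables it supports.

```shell
# Assumed layout: tests installed under /usr/lib64/lustre/tests, or run
# from lustre/tests/ in a source tree.
cd /usr/lib64/lustre/tests

# Run only selected subtests (subtest numbers here are placeholders):
ONLY="27a 27b 51" ./sanity.sh

# Run the full suite but skip specific subtests:
EXCEPT="42a 77g" ./sanity.sh

# Alternatively, drive whole suites through the auster wrapper so that
# results are recorded for later review:
# ./auster -v -r sanity
```

Note that these commands require a configured Lustre test cluster (or the
local llmount.sh setup from the same directory), so they are not runnable in
isolation.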
> • In addition to the sanity suite, we are aware of several other test
> categories. If there is a commonly used baseline set for general
> validation, your guidance would be greatly appreciated.
>
> We aim to align our testing with community standards and ensure
> compatibility and stability, so any information or reference materials you
> could provide would be of great help.

Nearly all of the tests run in review testing will pass. However, given the
distributed nature of the filesystem and the fact that the tests run in VMs,
some subtests fail intermittently. It should be possible to re-run the failed
tests and have them pass.

You are welcome to push the backported patch to the b2_15 branch of the
fs/lustre repository in Gerrit. Please follow the submission guidelines:

https://wiki.lustre.org/Submitting_Changes
https://wiki.lustre.org/Using_Gerrit
https://wiki.lustre.org/Commit_Comments

Since this is a backported patch, please add the following labels to indicate
that it is backported from the master branch (any backported patch on the
b2_15 branch will have these labels):

Lustre-change: https://jira.whamcloud.com/browse/LU-nnnnn
Lustre-commit: {git commit hash of master patch}

and remove the existing "Reviewed-on:", "Reviewed-by: Oleg Drokin", and
"Tested-by:" labels from the patch.

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud/DDN

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
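Putting the labels above together, a backported commit message for this
particular patch would look roughly like the sketch below. Only the
Lustre-change/Lustre-commit labels come from this thread; the subject line,
description, and hash are placeholders you would take from the original
master commit:

```
LU-16637 <subsystem>: <subject copied from the master patch>

<commit description copied from the master patch>

Lustre-change: https://jira.whamcloud.com/browse/LU-16637
Lustre-commit: {git commit hash of master patch}

Signed-off-by: <your name and email>
Change-Id: <added by the Gerrit commit-msg hook>
```

The Reviewed-on:, Reviewed-by:, and Tested-by: lines from the original master
commit are dropped, as noted above, so that reviewers of the b2_15 backport
add their own.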
