Hi all,
This patchset removes READ_ONLY_THP_FOR_FS Kconfig and enables creating
file-backed THPs for FSes with large folio support (the supported orders
need to include PMD_ORDER) by default, including for writable files. It
is an in-place replacement of V5 in mm-new. It affects Mike Rapoport's
"make MM selftests more CI friendly", since "selftests/mm: khugepaged:
use kselftest framework" needs to be updated. I updated it and put it at
the end of this cover letter.
Before the patchset, the status of creating read-only THPs is below:
                          | PF | MADV_COLLAPSE | khugepaged |
--------------------------|----|---------------|------------|
large folio FSes only     | ✓  | x             | x          |
READ_ONLY_THP_FOR_FS only | x  | ✓             | ✓          |
both                      | ✓  | ✓             | ✓          |
where "READ_ONLY_THP_FOR_FS only" implies FSes without large folio support.
Now without READ_ONLY_THP_FOR_FS:
                                 | PF | MADV_COLLAPSE | khugepaged |
---------------------------------|----|---------------|------------|
large folio FSes (read-only fd)  | ✓  | ✓             | ✓          |
large folio FSes (read-write fd) | ✓  | ✓             | ✓*         |
no large folio FSes              | x  | x             | x          |
* khugepaged only collapses clean folios from writable files. Userspace
must flush dirty folios explicitly before khugepaged can collapse them.
MADV_COLLAPSE handles the flush automatically via its writeback-and-retry
path. Collapsing writable MAP_PRIVATE pagecache folios is still not
supported, since PMD THP CoW only faults in at PTE level to avoid long
CoW latency, and file_backed_vma_is_retractable() prevents it.
This means no-large-folio FSes need to add large folio support (the
supported orders need to include PMD_ORDER), so that they can leverage
file THP creation.
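As a rough illustration of the khugepaged footnote above, userspace could
write back a dirty MAP_SHARED range before hinting khugepaged. This is
only a sketch: the helper names are hypothetical, msync(MS_SYNC) as the
flush is an assumption (fsync() on the fd would be another option), and
nothing prevents the range from being dirtied again before khugepaged
scans it. The MADV_COLLAPSE fallback value is the one from the Linux UAPI
headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* Linux UAPI value; older libc headers may lack it */
#endif

/*
 * Hypothetical helper: write back dirty pagecache behind a MAP_SHARED
 * mapping so that its folios are clean when khugepaged looks at them.
 * Returns 0 on success, -1 with errno set on failure.
 */
int flush_dirty_range(void *addr, size_t len)
{
	assert(addr && len);
	return msync(addr, len, MS_SYNC);
}

/*
 * Hypothetical helper: mark the range as a khugepaged candidate.
 * May fail with EINVAL on kernels built without THP support.
 */
int hint_khugepaged(void *addr, size_t len)
{
	return madvise(addr, len, MADV_HUGEPAGE);
}

/*
 * Hypothetical helper: ask for a synchronous collapse instead. Per the
 * cover letter, MADV_COLLAPSE deals with dirty folios itself, so no
 * explicit flush is issued here. Callers should expect failures on
 * kernels or filesystems without the required support.
 */
int request_collapse(void *addr, size_t len)
{
	return madvise(addr, len, MADV_COLLAPSE);
}
```

A caller relying on khugepaged would use flush_dirty_range() followed by
hint_khugepaged(); a caller that wants the collapse immediately would use
request_collapse() alone and check errno on failure.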
To prevent breaking file THP support for large folio FSes,
1. patches 1-4 enable the support, so that without READ_ONLY_THP_FOR_FS,
file THP still works for large folio FSes,
2. patch 5 removes READ_ONLY_THP_FOR_FS Kconfig,
3. patches 6-12 remove code related to READ_ONLY_THP_FOR_FS,
4. patches 13-14 enable clean pagecache folio collapse for writable files.
NOTE: collapsing writable MAP_PRIVATE pagecache folios is not supported,
since:
1. PMD THP CoW only faults in at PTE level to avoid long CoW latency,
2. the first check, due to 1, in file_backed_vma_is_retractable() prevents it.
Overview
===
1. collapse_file() checks the to-be-collapsed folios for dirtiness after
they are locked and unmapped, to make sure no new writes happen.
Previously, mapping->nr_thps and inode->i_writecount were used to
truncate read-only THPs before an fd became writable.
2. hugepage_enabled() is true for anon, shmem, and file-backed cases
if the global khugepaged control is on; otherwise, khugepaged is turned
off for the file-backed case, and anon and shmem depend on per-size
control knobs.
3. collapse_file() from mm/khugepaged.c, instead of checking
CONFIG_READ_ONLY_THP_FOR_FS, makes sure the mapping_max_folio_order()
of struct address_space of the file is at least PMD_ORDER.
4. file_thp_enabled() checks mapping_max_folio_order() instead of
CONFIG_READ_ONLY_THP_FOR_FS and no longer checks if the file is opened
read-only. The dirty folio check after try_to_unmap() (Change 1)
handles writable files correctly.
5. truncate_inode_partial_folio() calls folio_split() directly instead
of the removed try_folio_split_to_order(), since large folios can
only show up on a FS with large folio support.
6. nr_thps is removed from struct address_space, since it is no longer
needed to drop all read-only THPs from a FS without large folio
support when the fd becomes writable. Its related filemap_nr_thps*()
are removed too.
7. folio_check_splittable() no longer checks READ_ONLY_THP_FOR_FS.
8. collapse_file() only calls filemap_flush() for read-only files.
Blindly flushing dirty folios from writable files would cause
undesirable system-wide writeback; userspace is expected to flush
explicitly, or use MADV_COLLAPSE which handles it via its retry path.
9. Updated comments and selftests in various places.
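
The eligibility rule behind items 3 and 4 (with the minimum-order check
from the V5 changelog entry below) can be modelled in userspace as
follows. This is a sketch only, not the kernel helper: struct
model_mapping, model_pmd_folio_support() and MODEL_PMD_ORDER are
hypothetical names, and the value 9 is PMD_ORDER for x86-64 with 4 KiB
base pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value: PMD_ORDER on x86-64 with 4 KiB base pages. */
#define MODEL_PMD_ORDER 9

/* Hypothetical stand-in for the folio order range of an address_space. */
struct model_mapping {
	unsigned int min_folio_order;
	unsigned int max_folio_order;
};

/*
 * A mapping can hold a PMD-sized folio only if PMD order lies within
 * its supported [min, max] folio order range.
 */
bool model_pmd_folio_support(const struct model_mapping *m)
{
	assert(m->min_folio_order <= m->max_folio_order);
	return m->min_folio_order <= MODEL_PMD_ORDER &&
	       m->max_folio_order >= MODEL_PMD_ORDER;
}
```

In this model, a mapping with orders 0..9 or 0..11 qualifies, a mapping
limited to order 0 does not, and neither does one whose minimum order is
above MODEL_PMD_ORDER.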
Changelog
===
From V5[6]:
1. added mapping_min_folio_order(mapping) <= PMD_ORDER check to
mapping_pmd_folio_support() in Patch 1 to correctly handle
filesystems whose minimum folio order exceeds PMD_ORDER. Also
improved the kernel-doc comment per David's suggestions.
2. cleaned up Patch 11 per David's review: use const for open_opt and
mmap_prot, remove mmap_opt (use MAP_SHARED for both read-only and
read-write mappings), inline file_fault_common() into separate
file_fault_read() and file_fault_write() functions, fix "read only"
typo to "read-only", update usage message to "with PMD-sized large
folio support". Also fixed run_vmtests.sh to use elif test_selected
thp for the SKIP case to avoid spurious [SKIP] output per Nico's
report.
3. revised stale comment in Patch 13: removed "There won't be new dirty
pages" and updated "khugepaged only works on read-only fd" to reflect
that writable files are now supported; merged the comment blocks per
David's suggestion.
From V4[5]:
1. fixed Patch 1's compilation error in !CONFIG_TRANSPARENT_HUGEPAGE
2. changed Patch 3 to no longer enable collapse for read-write fd but only
allow read-only fd.
3. added two new patches to enable clean pagecache folio collapse for
writable files:
- Patch 13: remove inode_is_open_for_write() from file_thp_enabled()
so that khugepaged and MADV_COLLAPSE can process writable files.
filemap_flush() in collapse_file() is now conditionalized on the file
being read-only, to avoid repeatedly writing back dirty folios from
writable files.
- Patch 14: add read_write_file_read_ops and read_write_file_write_ops
to the khugepaged selftest to cover the new writable-file collapse paths.
From V3[4]:
1. added a TODO comment in patch 1 noting that the is_shmem exception in
the VM_WARN_ON_ONCE() check can be removed once shmem always calls
mapping_set_large_folios() on its mapping. Used VM_WARN_ON_ONCE() in
mapping_pmd_thp_support() instead.
2. fixed the dirty folio bail-out path in patch 2: add xas_unlock_irq()
and folio_putback_lru() before the goto, which were missing and would
have left the XA lock held and the LRU isolation ref leaked.
3. renamed hugepage_pmd_enabled() to hugepage_enabled() to reflect it
controls khugepaged for all transparent hugepage types.
4. reverted the comment in hugepage_enabled() in patch 4 to the original;
only removed the phrase "when configured in," which referred to
CONFIG_READ_ONLY_THP_FOR_FS.
5. fixed commit message in patch 6: the dirty folio check is added after
try_to_unmap() in collapse_file(), not after try_to_unmap_flush().
From V2[3]:
1. removed unnecessary check in collapse_scan_file().
2. removed inode_is_open_for_write() check in file_thp_enabled().
3. changed hugepage_enabled() to return true if khugepaged global
control is on instead of false. cleaned up anon and shmem code in the
function.
4. moved folio dirtiness check after try_to_unmap() but before
try_to_unmap_flush(), since that is sufficient to prevent new writes.
5. reordered patch 4 and 5, so that khugepaged behavior does not change
after READ_ONLY_THP_FOR_FS is removed.
6. added read-write file test in khugepaged selftest.
7. removed the read-only file restriction from guard-region selftest.
From V1[2]:
1. removed inode_is_open_for_write() check in collapse_file(), since the
added folio dirtiness check after try_to_unmap_flush() should be
sufficient to prevent writes to candidate folios.
2. removed READ_ONLY_THP_FOR_FS check in hugepage_enabled(), please
see Patch 5 and item 2 in the overview for more details.
3. moved the patch removing READ_ONLY_THP_FOR_FS Kconfig after enabling
khugepaged and MADV_COLLAPSE to create read-only THPs.
4. added mapping_pmd_thp_support() helper function.
5. used VM_WARN_ON_ONCE() in collapse_file() for mapping eligibility check
and address alignment check instead of if + return error code. Always
allow shmem, since MADV_COLLAPSE ignores the shmem huge config.
6. added mapping eligibility check in collapse_scan_file().
7. removed trailing ; for folio_split() in the !CONFIG_TRANSPARENT_HUGEPAGE
case.
8. simplified code in folio_check_splittable() after removing
READ_ONLY_THP_FOR_FS code.
9. clarified that read-only THP works for FSes with PMD THP support by
default.
From RFC[1]:
1. instead of removing the READ_ONLY_THP_FOR_FS functionality entirely,
turn it on by default for all FSes with large folio support whose
supported orders include PMD_ORDER.
Suggestions and comments are welcome.
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Link: https://lore.kernel.org/all/[email protected]/ [3]
Link: https://lore.kernel.org/all/[email protected]/ [4]
Link: https://lore.kernel.org/all/[email protected]/ [5]
Link: https://lore.kernel.org/all/[email protected]/ [6]
For Andrew to update "selftests/mm: khugepaged: use kselftest framework"
from Mike Rapoport's "make MM selftests more CI friendly" series.
===
From 29f1e70373419e304ba7a69bc78fb43ba40ebfed Mon Sep 17 00:00:00 2001
From: "Mike Rapoport (Microsoft)" <[email protected]>
Date: Mon, 11 May 2026 19:27:58 +0300
Subject: [PATCH] selftests/mm: khugepaged: use kselftest framework
Convert khugepaged tests to use kselftest framework for reporting and
tracking successful and failing runs.
The conversion is mostly about replacing printf()/perror() + exit() pairs
with their ksft_ counterparts.
The nice colored success and failure indications are left intact.
Replace the progress report in collapse_compound_extreme() with a single
ksft_print_msg() to avoid headache with formatting and make the test
output more concise.
Link: https://lore.kernel.org/[email protected]
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Tested-by: Luiz Capitulino <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dev Jain <[email protected]>
Cc: Donet Tom <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Lance Yang <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nico Pache <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Li Wang <[email protected]>
Cc: Sarthak Sharma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---
tools/testing/selftests/mm/khugepaged.c | 321 ++++++++++--------------
1 file changed, 132 insertions(+), 189 deletions(-)
diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
index 7f61bfa455e96..a2a3a52031031 100644
--- a/tools/testing/selftests/mm/khugepaged.c
+++ b/tools/testing/selftests/mm/khugepaged.c
@@ -86,17 +86,19 @@ static int exit_status;
static void success(const char *msg)
{
printf(" \e[32m%s\e[0m\n", msg);
+ exit_status = KSFT_PASS;
}
static void fail(const char *msg)
{
printf(" \e[31m%s\e[0m\n", msg);
- exit_status++;
+ exit_status = KSFT_FAIL;
}
static void skip(const char *msg)
{
printf(" \e[33m%s\e[0m\n", msg);
+ exit_status = KSFT_SKIP;
}
static void restore_settings_atexit(void)
@@ -104,22 +106,24 @@ static void restore_settings_atexit(void)
if (skip_settings_restore)
return;
- printf("Restore THP and khugepaged settings...");
+ ksft_print_msg("Restore THP and khugepaged settings...");
thp_restore_settings();
success("OK");
skip_settings_restore = true;
+ ksft_print_cnts();
+ exit(exit_status);
}
static void restore_settings(int sig)
{
/* exit() will invoke the restore_settings_atexit handler. */
- exit(sig ? EXIT_FAILURE : exit_status);
+ exit(sig ? KSFT_FAIL : exit_status);
}
static void save_settings(void)
{
- printf("Save THP and khugepaged settings...");
+ ksft_print_msg("Save THP and khugepaged settings...");
if ((read_only_file_ops || read_write_file_read_ops ||
read_write_file_write_ops) &&
finfo.type == VMA_FILE)
@@ -145,19 +149,13 @@ static void get_finfo(const char *dir)
finfo.dir = dir;
stat(finfo.dir, &path_stat);
- if (!S_ISDIR(path_stat.st_mode)) {
- printf("%s: Not a directory (%s)\n", __func__, finfo.dir);
- exit(EXIT_FAILURE);
- }
+ if (!S_ISDIR(path_stat.st_mode))
+ ksft_exit_fail_msg("%s: Not a directory (%s)\n", __func__, finfo.dir);
if (snprintf(finfo.path, sizeof(finfo.path), "%s/" TEST_FILE,
- finfo.dir) >= sizeof(finfo.path)) {
- printf("%s: Pathname is too long\n", __func__);
- exit(EXIT_FAILURE);
- }
- if (statfs(finfo.dir, &fs)) {
- perror("statfs()");
- exit(EXIT_FAILURE);
- }
+ finfo.dir) >= sizeof(finfo.path))
+ ksft_exit_fail_msg("%s: Pathname is too long\n", __func__);
+ if (statfs(finfo.dir, &fs))
+ ksft_exit_fail_perror("statfs()");
finfo.type = fs.f_type == TMPFS_MAGIC ? VMA_SHMEM : VMA_FILE;
if (finfo.type == VMA_SHMEM)
return;
@@ -165,40 +163,30 @@ static void get_finfo(const char *dir)
/* Find owning device's queue/read_ahead_kb control */
if (snprintf(path, sizeof(path), "/sys/dev/block/%d:%d/uevent",
major(path_stat.st_dev), minor(path_stat.st_dev))
- >= sizeof(path)) {
- printf("%s: Pathname is too long\n", __func__);
- exit(EXIT_FAILURE);
- }
- if (read_file(path, buf, sizeof(buf)) < 0) {
- perror("read_file(read_num)");
- exit(EXIT_FAILURE);
- }
+ >= sizeof(path))
+ ksft_exit_fail_msg("%s: Pathname is too long\n", __func__);
+ if (read_file(path, buf, sizeof(buf)) < 0)
+ ksft_exit_fail_perror("read_file(read_num)");
if (strstr(buf, "DEVTYPE=disk")) {
/* Found it */
if (snprintf(finfo.dev_queue_read_ahead_path,
sizeof(finfo.dev_queue_read_ahead_path),
"/sys/dev/block/%d:%d/queue/read_ahead_kb",
major(path_stat.st_dev), minor(path_stat.st_dev))
- >= sizeof(finfo.dev_queue_read_ahead_path)) {
- printf("%s: Pathname is too long\n", __func__);
- exit(EXIT_FAILURE);
- }
+ >= sizeof(finfo.dev_queue_read_ahead_path))
+ ksft_exit_fail_msg("%s: Pathname is too long\n", __func__);
return;
}
- if (!strstr(buf, "DEVTYPE=partition")) {
- printf("%s: Unknown device type: %s\n", __func__, path);
- exit(EXIT_FAILURE);
- }
+ if (!strstr(buf, "DEVTYPE=partition"))
+ ksft_exit_fail_msg("%s: Unknown device type: %s\n", __func__, path);
/*
* Partition of block device - need to find actual device.
* Using naming convention that devnameN is partition of
* device devname.
*/
str = strstr(buf, "DEVNAME=");
- if (!str) {
- printf("%s: Could not read: %s", __func__, path);
- exit(EXIT_FAILURE);
- }
+ if (!str)
+ ksft_exit_fail_msg("%s: Could not read: %s", __func__, path);
str += 8;
end = str;
while (*end) {
@@ -207,16 +195,13 @@ static void get_finfo(const char *dir)
if (snprintf(finfo.dev_queue_read_ahead_path,
sizeof(finfo.dev_queue_read_ahead_path),
"/sys/block/%s/queue/read_ahead_kb",
- str) >= sizeof(finfo.dev_queue_read_ahead_path)) {
- printf("%s: Pathname is too long\n", __func__);
- exit(EXIT_FAILURE);
- }
+ str) >= sizeof(finfo.dev_queue_read_ahead_path))
+ ksft_exit_fail_msg("%s: Pathname is too long\n", __func__);
return;
}
++end;
}
- printf("%s: Could not read: %s\n", __func__, path);
- exit(EXIT_FAILURE);
+ ksft_exit_fail_msg("%s: Could not read: %s\n", __func__, path);
}
static bool check_swap(void *addr, unsigned long size)
@@ -229,26 +214,19 @@ static bool check_swap(void *addr, unsigned long size)
ret = snprintf(addr_pattern, MAX_LINE_LENGTH, "%08lx-",
(unsigned long) addr);
- if (ret >= MAX_LINE_LENGTH) {
- printf("%s: Pattern is too long\n", __func__);
- exit(EXIT_FAILURE);
- }
-
+ if (ret >= MAX_LINE_LENGTH)
+ ksft_exit_fail_msg("%s: Pattern is too long\n", __func__);
fp = fopen(PID_SMAPS, "r");
- if (!fp) {
- printf("%s: Failed to open file %s\n", __func__, PID_SMAPS);
- exit(EXIT_FAILURE);
- }
+ if (!fp)
+ ksft_exit_fail_msg("%s: Failed to open file %s\n", __func__, PID_SMAPS);
if (!check_for_pattern(fp, addr_pattern, buffer, sizeof(buffer)))
goto err_out;
ret = snprintf(addr_pattern, MAX_LINE_LENGTH, "Swap:%19ld kB",
size >> 10);
- if (ret >= MAX_LINE_LENGTH) {
- printf("%s: Pattern is too long\n", __func__);
- exit(EXIT_FAILURE);
- }
+ if (ret >= MAX_LINE_LENGTH)
+ ksft_exit_fail_msg("%s: Pattern is too long\n", __func__);
/*
* Fetch the Swap: in the same block and check whether it got
* the expected number of hugeepages next.
@@ -271,10 +249,8 @@ static void *alloc_mapping(int nr)
p = mmap(BASE_ADDR, nr * hpage_pmd_size, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
- if (p != BASE_ADDR) {
- printf("Failed to allocate VMA at %p\n", BASE_ADDR);
- exit(EXIT_FAILURE);
- }
+ if (p != BASE_ADDR)
+ ksft_exit_fail_msg("Failed to allocate VMA at %p\n", BASE_ADDR);
return p;
}
@@ -324,19 +300,13 @@ static void *alloc_hpage(struct mem_ops *ops)
* khugepaged on low-load system (like a test machine), which
* would cause MADV_COLLAPSE to fail with EAGAIN.
*/
- printf("Allocate huge page...");
- if (madvise_collapse_retry(p, hpage_pmd_size)) {
- perror("madvise(MADV_COLLAPSE)");
- exit(EXIT_FAILURE);
- }
- if (!ops->check_huge(p, 1)) {
- perror("madvise(MADV_COLLAPSE)");
- exit(EXIT_FAILURE);
- }
- if (madvise(p, hpage_pmd_size, MADV_HUGEPAGE)) {
- perror("madvise(MADV_HUGEPAGE)");
- exit(EXIT_FAILURE);
- }
+ ksft_print_msg("Allocate huge page...");
+ if (madvise_collapse_retry(p, hpage_pmd_size))
+ ksft_exit_fail_perror("madvise(MADV_COLLAPSE)");
+ if (!ops->check_huge(p, 1))
+ ksft_exit_fail_perror("madvise(MADV_COLLAPSE)");
+ if (madvise(p, hpage_pmd_size, MADV_HUGEPAGE))
+ ksft_exit_fail_perror("madvise(MADV_HUGEPAGE)");
success("OK");
return p;
}
@@ -346,11 +316,9 @@ static void validate_memory(int *p, unsigned long start, unsigned long end)
int i;
for (i = start / page_size; i < end / page_size; i++) {
- if (p[i * page_size / sizeof(*p)] != i + 0xdead0000) {
- printf("Page %d is corrupted: %#x\n",
- i, p[i * page_size / sizeof(*p)]);
- exit(EXIT_FAILURE);
- }
+ if (p[i * page_size / sizeof(*p)] != i + 0xdead0000)
+ ksft_exit_fail_msg("Page %d is corrupted: %#x\n",
+ i, p[i * page_size / sizeof(*p)]);
}
}
@@ -383,14 +351,12 @@ static void *file_setup_area_common(int nr_hpages, enum file_setup_ops setup)
unsigned long size;
unlink(finfo.path); /* Cleanup from previous failed tests */
- printf("Creating %s for collapse%s...", finfo.path,
- finfo.type == VMA_SHMEM ? " (tmpfs)" : "");
+ ksft_print_msg("Creating %s for collapse%s...", finfo.path,
+ finfo.type == VMA_SHMEM ? " (tmpfs)" : "");
fd = open(finfo.path, O_CREAT | O_RDWR | O_TRUNC | O_EXCL,
777);
- if (fd < 0) {
- perror("open()");
- exit(EXIT_FAILURE);
- }
+ if (fd < 0)
+ ksft_exit_fail_perror("open()");
size = nr_hpages * hpage_pmd_size;
if (ftruncate(fd, size)) {
@@ -411,22 +377,17 @@ static void *file_setup_area_common(int nr_hpages, enum file_setup_ops setup)
close(fd);
munmap(p, size);
success("OK");
-
- printf("Opening %s %s for collapse...", finfo.path,
+ ksft_print_msg("Opening %s %s for collapse...", finfo.path,
setup == FILE_SETUP_READ_ONLY_FS ? "read-only" :
setup == FILE_SETUP_READ_WRITE_FS_READ_DATA ?
"read-write (read)" :
"read-write (write)");
finfo.fd = open(finfo.path, open_opt, 777);
- if (finfo.fd < 0) {
- perror("open()");
- exit(EXIT_FAILURE);
- }
+ if (finfo.fd < 0)
+ ksft_exit_fail_perror("open()");
p = mmap(BASE_ADDR, size, mmap_prot, MAP_SHARED, finfo.fd, 0);
- if (p == MAP_FAILED || p != BASE_ADDR) {
- perror("mmap()");
- exit(EXIT_FAILURE);
- }
+ if (p == MAP_FAILED || p != BASE_ADDR)
+ ksft_exit_fail_perror("mmap()");
/* Drop page cache */
write_file("/proc/sys/vm/drop_caches", "3", 2);
@@ -458,10 +419,8 @@ static void file_cleanup_area(void *p, unsigned long size)
static void file_fault_read(void *p, unsigned long start, unsigned long end)
{
- if (madvise(((char *)p) + start, end - start, MADV_POPULATE_READ)) {
- perror("madvise(MADV_POPULATE_READ)");
- exit(EXIT_FAILURE);
- }
+ if (madvise(((char *)p) + start, end - start, MADV_POPULATE_READ))
+ ksft_exit_fail_perror("madvise(MADV_POPULATE_READ)");
}
static void file_fault_read_and_flush(void *p, unsigned long start, unsigned long end)
@@ -476,10 +435,8 @@ static void file_fault_read_and_flush(void *p, unsigned long start, unsigned lon
static void file_fault_write(void *p, unsigned long start, unsigned long end)
{
- if (madvise(((char *)p) + start, end - start, MADV_POPULATE_WRITE)) {
- perror("madvise(MADV_POPULATE_WRITE)");
- exit(EXIT_FAILURE);
- }
+ if (madvise(((char *)p) + start, end - start, MADV_POPULATE_WRITE))
+ ksft_exit_fail_perror("madvise(MADV_POPULATE_WRITE)");
}
static bool file_check_huge(void *addr, int nr_hpages)
@@ -501,20 +458,14 @@ static void *shmem_setup_area(int nr_hpages)
unsigned long size = nr_hpages * hpage_pmd_size;
finfo.fd = memfd_create("khugepaged-selftest-collapse-shmem", 0);
- if (finfo.fd < 0) {
- perror("memfd_create()");
- exit(EXIT_FAILURE);
- }
- if (ftruncate(finfo.fd, size)) {
- perror("ftruncate()");
- exit(EXIT_FAILURE);
- }
+ if (finfo.fd < 0)
+ ksft_exit_fail_perror("memfd_create()");
+ if (ftruncate(finfo.fd, size))
+ ksft_exit_fail_perror("ftruncate()");
p = mmap(BASE_ADDR, size, PROT_READ | PROT_WRITE, MAP_SHARED, finfo.fd,
0);
- if (p != BASE_ADDR) {
- perror("mmap()");
- exit(EXIT_FAILURE);
- }
+ if (p != BASE_ADDR)
+ ksft_exit_fail_perror("mmap()");
return p;
}
@@ -588,7 +539,7 @@ static void __madvise_collapse(const char *msg, char *p, int nr_hpages,
int ret;
struct thp_settings settings = *thp_current_settings();
- printf("%s...", msg);
+ ksft_print_msg("%s...", msg);
/*
* read&write file collapse succeeds for MADV_COLLAPSE because dirty
@@ -621,10 +572,8 @@ static void madvise_collapse(const char *msg, char *p, int nr_hpages,
struct mem_ops *ops, bool expect)
{
/* Sanity check */
- if (!ops->check_huge(p, 0)) {
- printf("Unexpected huge page\n");
- exit(EXIT_FAILURE);
- }
+ if (!ops->check_huge(p, 0))
+ ksft_exit_fail_msg("Unexpected huge page\n");
__madvise_collapse(msg, p, nr_hpages, ops, expect);
}
@@ -636,17 +585,15 @@ static bool wait_for_scan(const char *msg, char *p, int nr_hpages,
int timeout = 6; /* 3 seconds */
/* Sanity check */
- if (!ops->check_huge(p, 0)) {
- printf("Unexpected huge page\n");
- exit(EXIT_FAILURE);
- }
+ if (!ops->check_huge(p, 0))
+ ksft_exit_fail_msg("Unexpected huge page\n");
madvise(p, nr_hpages * hpage_pmd_size, MADV_HUGEPAGE);
/* Wait until the second full_scan completed */
full_scans = thp_read_num("khugepaged/full_scans") + 2;
- printf("%s...", msg);
+ ksft_print_msg("%s...", msg);
while (timeout--) {
if (ops->check_huge(p, nr_hpages))
break;
@@ -713,7 +660,7 @@ static void alloc_at_fault(void)
p = alloc_mapping(1);
*p = 1;
- printf("Allocate huge page on fault...");
+ ksft_print_msg("Allocate huge page on fault...");
if (check_huge_anon(p, 1, hpage_pmd_size))
success("OK");
else
@@ -722,12 +669,14 @@ static void alloc_at_fault(void)
thp_pop_settings();
madvise(p, page_size, MADV_DONTNEED);
- printf("Split huge PMD on MADV_DONTNEED...");
+ ksft_print_msg("Split huge PMD on MADV_DONTNEED...");
if (check_huge_anon(p, 0, hpage_pmd_size))
success("OK");
else
fail("Fail");
munmap(p, hpage_pmd_size);
+
+ ksft_test_result_report(exit_status, "allocate on fault and split\n");
}
static void collapse_full(struct collapse_context *c, struct mem_ops *ops)
@@ -742,6 +691,8 @@ static void collapse_full(struct collapse_context *c, struct mem_ops *ops)
ops, true);
validate_memory(p, 0, size);
ops->cleanup_area(p, size);
+
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_empty(struct collapse_context *c, struct mem_ops *ops)
@@ -751,6 +702,7 @@ static void collapse_empty(struct collapse_context *c, struct mem_ops *ops)
p = ops->setup_area(1);
c->collapse("Do not collapse empty PTE table", p, 1, ops, false);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_single_pte_entry(struct collapse_context *c, struct mem_ops *ops)
@@ -762,6 +714,7 @@ static void collapse_single_pte_entry(struct collapse_context *c, struct mem_ops
c->collapse("Collapse PTE table with single PTE entry present", p,
1, ops, true);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_max_ptes_none(struct collapse_context *c, struct mem_ops *ops)
@@ -801,6 +754,7 @@ static void collapse_max_ptes_none(struct collapse_context *c, struct mem_ops *o
skip:
ops->cleanup_area(p, hpage_pmd_size);
thp_pop_settings();
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_swapin_single_pte(struct collapse_context *c, struct mem_ops *ops)
@@ -810,11 +764,9 @@ static void collapse_swapin_single_pte(struct collapse_context *c, struct mem_op
p = ops->setup_area(1);
ops->fault(p, 0, hpage_pmd_size);
- printf("Swapout one page...");
- if (madvise(p, page_size, MADV_PAGEOUT)) {
- perror("madvise(MADV_PAGEOUT)");
- exit(EXIT_FAILURE);
- }
+ ksft_print_msg("Swapout one page...");
+ if (madvise(p, page_size, MADV_PAGEOUT))
+ ksft_exit_fail_perror("madvise(MADV_PAGEOUT)");
if (check_swap(p, page_size)) {
success("OK");
} else {
@@ -827,6 +779,7 @@ static void collapse_swapin_single_pte(struct collapse_context *c, struct mem_op
validate_memory(p, 0, hpage_pmd_size);
out:
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_max_ptes_swap(struct collapse_context *c, struct mem_ops *ops)
@@ -837,11 +790,9 @@ static void collapse_max_ptes_swap(struct collapse_context *c, struct mem_ops *o
p = ops->setup_area(1);
ops->fault(p, 0, hpage_pmd_size);
- printf("Swapout %d of %d pages...", max_ptes_swap + 1, hpage_pmd_nr);
- if (madvise(p, (max_ptes_swap + 1) * page_size, MADV_PAGEOUT)) {
- perror("madvise(MADV_PAGEOUT)");
- exit(EXIT_FAILURE);
- }
+ ksft_print_msg("Swapout %d of %d pages...", max_ptes_swap + 1, hpage_pmd_nr);
+ if (madvise(p, (max_ptes_swap + 1) * page_size, MADV_PAGEOUT))
+ ksft_exit_fail_perror("madvise(MADV_PAGEOUT)");
if (check_swap(p, (max_ptes_swap + 1) * page_size)) {
success("OK");
} else {
@@ -855,12 +806,10 @@ static void collapse_max_ptes_swap(struct collapse_context *c, struct mem_ops *o
if (c->enforce_pte_scan_limits) {
ops->fault(p, 0, hpage_pmd_size);
- printf("Swapout %d of %d pages...", max_ptes_swap,
+ ksft_print_msg("Swapout %d of %d pages...", max_ptes_swap,
hpage_pmd_nr);
- if (madvise(p, max_ptes_swap * page_size, MADV_PAGEOUT)) {
- perror("madvise(MADV_PAGEOUT)");
- exit(EXIT_FAILURE);
- }
+ if (madvise(p, max_ptes_swap * page_size, MADV_PAGEOUT))
+ ksft_exit_fail_perror("madvise(MADV_PAGEOUT)");
if (check_swap(p, max_ptes_swap * page_size)) {
success("OK");
} else {
@@ -874,6 +823,7 @@ static void collapse_max_ptes_swap(struct collapse_context *c, struct mem_ops *o
}
out:
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_single_pte_entry_compound(struct collapse_context *c, struct mem_ops *ops)
@@ -890,7 +840,7 @@ static void collapse_single_pte_entry_compound(struct collapse_context *c, struc
}
madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE);
- printf("Split huge page leaving single PTE mapping compound page...");
+ ksft_print_msg("Split huge page leaving single PTE mapping compound page...");
madvise(p + page_size, hpage_pmd_size - page_size, MADV_DONTNEED);
if (ops->check_huge(p, 0))
success("OK");
@@ -902,6 +852,7 @@ static void collapse_single_pte_entry_compound(struct collapse_context *c, struc
validate_memory(p, 0, page_size);
skip:
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_full_of_compound(struct collapse_context *c, struct mem_ops *ops)
@@ -909,7 +860,7 @@ static void collapse_full_of_compound(struct collapse_context *c, struct mem_ops
void *p;
p = alloc_hpage(ops);
- printf("Split huge page leaving single PTE page table full of compound pages...");
+ ksft_print_msg("Split huge page leaving single PTE page table full of compound pages...");
madvise(p, page_size, MADV_NOHUGEPAGE);
madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE);
if (ops->check_huge(p, 0))
@@ -921,6 +872,7 @@ static void collapse_full_of_compound(struct collapse_context *c, struct mem_ops
true);
validate_memory(p, 0, hpage_pmd_size);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_compound_extreme(struct collapse_context *c, struct mem_ops *ops)
@@ -929,16 +881,12 @@ static void collapse_compound_extreme(struct collapse_context *c, struct mem_ops
int i;
p = ops->setup_area(1);
+ ksft_print_msg("Construct PTE page table full of different PTE-mapped compound pages\n");
for (i = 0; i < hpage_pmd_nr; i++) {
- printf("\rConstruct PTE page table full of different PTE-mapped compound pages %3d/%d...",
- i + 1, hpage_pmd_nr);
-
madvise(BASE_ADDR, hpage_pmd_size, MADV_HUGEPAGE);
ops->fault(BASE_ADDR, 0, hpage_pmd_size);
- if (!ops->check_huge(BASE_ADDR, 1)) {
- printf("Failed to allocate huge page\n");
- exit(EXIT_FAILURE);
- }
+ if (!ops->check_huge(BASE_ADDR, 1))
+ ksft_exit_fail_msg("Failed to allocate huge page\n");
madvise(BASE_ADDR, hpage_pmd_size, MADV_NOHUGEPAGE);
p = mremap(BASE_ADDR - i * page_size,
@@ -946,20 +894,16 @@ static void collapse_compound_extreme(struct collapse_context *c, struct mem_ops
(i + 1) * page_size,
MREMAP_MAYMOVE | MREMAP_FIXED,
BASE_ADDR + 2 * hpage_pmd_size);
- if (p == MAP_FAILED) {
- perror("mremap+unmap");
- exit(EXIT_FAILURE);
- }
+ if (p == MAP_FAILED)
+ ksft_exit_fail_perror("mremap+unmap");
p = mremap(BASE_ADDR + 2 * hpage_pmd_size,
(i + 1) * page_size,
(i + 1) * page_size + hpage_pmd_size,
MREMAP_MAYMOVE | MREMAP_FIXED,
BASE_ADDR - (i + 1) * page_size);
- if (p == MAP_FAILED) {
- perror("mremap+alloc");
- exit(EXIT_FAILURE);
- }
+ if (p == MAP_FAILED)
+ ksft_exit_fail_perror("mremap+alloc");
}
ops->cleanup_area(BASE_ADDR, hpage_pmd_size);
@@ -974,6 +918,7 @@ static void collapse_compound_extreme(struct collapse_context *c, struct mem_ops
validate_memory(p, 0, hpage_pmd_size);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_fork(struct collapse_context *c, struct mem_ops *ops)
@@ -983,18 +928,17 @@ static void collapse_fork(struct collapse_context *c, struct mem_ops *ops)
p = ops->setup_area(1);
- printf("Allocate small page...");
+ ksft_print_msg("Allocate small page...");
ops->fault(p, 0, page_size);
if (ops->check_huge(p, 0))
success("OK");
else
fail("Fail");
- printf("Share small page over fork()...");
+ ksft_print_msg("Share small page over fork()...");
if (!fork()) {
/* Do not touch settings on child exit */
skip_settings_restore = true;
- exit_status = 0;
if (ops->check_huge(p, 0))
success("OK");
@@ -1011,15 +955,16 @@ static void collapse_fork(struct collapse_context *c, struct mem_ops *ops)
}
wait(&wstatus);
- exit_status += WEXITSTATUS(wstatus);
+ exit_status = WEXITSTATUS(wstatus);
- printf("Check if parent still has small page...");
+ ksft_print_msg("Check if parent still has small page...");
if (ops->check_huge(p, 0))
success("OK");
else
fail("Fail");
validate_memory(p, 0, page_size);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *ops)
@@ -1028,18 +973,17 @@ static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *o
void *p;
p = alloc_hpage(ops);
- printf("Share huge page over fork()...");
+ ksft_print_msg("Share huge page over fork()...");
if (!fork()) {
/* Do not touch settings on child exit */
skip_settings_restore = true;
- exit_status = 0;
if (ops->check_huge(p, 1))
success("OK");
else
fail("Fail");
- printf("Split huge page PMD in child process...");
+ ksft_print_msg("Split huge page PMD in child process...");
madvise(p, page_size, MADV_NOHUGEPAGE);
madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE);
if (ops->check_huge(p, 0))
@@ -1060,15 +1004,16 @@ static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *o
}
wait(&wstatus);
- exit_status += WEXITSTATUS(wstatus);
+ exit_status = WEXITSTATUS(wstatus);
- printf("Check if parent still has huge page...");
+ ksft_print_msg("Check if parent still has huge page...");
if (ops->check_huge(p, 1))
success("OK");
else
fail("Fail");
validate_memory(p, 0, hpage_pmd_size);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops *ops)
@@ -1078,18 +1023,17 @@ static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops
void *p;
p = alloc_hpage(ops);
- printf("Share huge page over fork()...");
+ ksft_print_msg("Share huge page over fork()...");
if (!fork()) {
/* Do not touch settings on child exit */
skip_settings_restore = true;
- exit_status = 0;
if (ops->check_huge(p, 1))
success("OK");
else
fail("Fail");
- printf("Trigger CoW on page %d of %d...",
+ ksft_print_msg("Trigger CoW on page %d of %d...",
hpage_pmd_nr - max_ptes_shared - 1, hpage_pmd_nr);
ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared - 1) * page_size);
if (ops->check_huge(p, 0))
@@ -1101,7 +1045,7 @@ static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops
1, ops, !c->enforce_pte_scan_limits);
if (c->enforce_pte_scan_limits) {
- printf("Trigger CoW on page %d of %d...",
+ ksft_print_msg("Trigger CoW on page %d of %d...",
hpage_pmd_nr - max_ptes_shared, hpage_pmd_nr);
ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared) *
page_size);
@@ -1120,15 +1064,16 @@ static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops
}
wait(&wstatus);
- exit_status += WEXITSTATUS(wstatus);
+ exit_status = WEXITSTATUS(wstatus);
- printf("Check if parent still has huge page...");
+ ksft_print_msg("Check if parent still has huge page...");
if (ops->check_huge(p, 1))
success("OK");
else
fail("Fail");
validate_memory(p, 0, hpage_pmd_size);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void madvise_collapse_existing_thps(struct collapse_context *c,
@@ -1145,6 +1090,7 @@ static void madvise_collapse_existing_thps(struct collapse_context *c,
__madvise_collapse("Re-collapse PMD-mapped hugepage", p, 1, ops, true);
validate_memory(p, 0, hpage_pmd_size);
ops->cleanup_area(p, hpage_pmd_size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
/*
@@ -1172,6 +1118,7 @@ static void madvise_retracted_page_tables(struct collapse_context *c,
true);
validate_memory(p, 0, size);
ops->cleanup_area(p, size);
+ ksft_test_result_report(exit_status, "%s\n", __func__);
}
static void usage(void)
@@ -1280,10 +1227,8 @@ static int nr_test_cases;
#define TEST(t, c, o) do { \
if (c && o) { \
- if (nr_test_cases >= MAX_TEST_CASES) { \
- printf("MAX_TEST_CASES is too small\n"); \
- exit(EXIT_FAILURE); \
- } \
+ if (nr_test_cases >= MAX_TEST_CASES) \
+ ksft_exit_fail_msg("MAX_TEST_CASES is too small\n"); \
test_cases[nr_test_cases++] = (struct test_case){ \
.ctx = c, \
.ops = o, \
@@ -1316,10 +1261,10 @@ int main(int argc, char **argv)
.read_ahead_kb = 0,
};
- if (!thp_is_enabled()) {
- printf("Transparent Hugepages not available\n");
- return KSFT_SKIP;
- }
+ ksft_print_header();
+
+ if (!thp_is_enabled())
+ ksft_exit_skip("Transparent Hugepages not available\n");
parse_test_type(argc, argv);
@@ -1327,10 +1272,8 @@ int main(int argc, char **argv)
page_size = getpagesize();
hpage_pmd_size = read_pmd_pagesize();
- if (!hpage_pmd_size) {
- printf("Reading PMD pagesize failed");
- exit(EXIT_FAILURE);
- }
+ if (!hpage_pmd_size)
+ ksft_exit_fail_msg("Reading PMD pagesize failed\n");
hpage_pmd_nr = hpage_pmd_size / page_size;
hpage_pmd_order = __builtin_ctz(hpage_pmd_nr);
@@ -1346,8 +1289,6 @@ int main(int argc, char **argv)
save_settings();
thp_push_settings(&default_settings);
- alloc_at_fault();
-
TEST(collapse_full, khugepaged_context, anon_ops);
TEST(collapse_full, khugepaged_context, read_only_file_ops);
TEST(collapse_full, khugepaged_context, read_write_file_read_ops);
@@ -1425,11 +1366,13 @@ int main(int argc, char **argv)
TEST(madvise_retracted_page_tables, madvise_context,
read_write_file_read_ops);
TEST(madvise_retracted_page_tables, madvise_context, shmem_ops);
- exit_status = KSFT_PASS;
+ ksft_set_plan(nr_test_cases + 1);
+
+ alloc_at_fault();
for (int i = 0; i < nr_test_cases; i++) {
struct test_case *t = &test_cases[i];
- printf("\nRun test: %s (%s:%s)\n", t->desc, t->ctx->name, t->ops->name);
+ ksft_print_msg("\nRun test: %s (%s:%s)\n", t->desc, t->ctx->name, t->ops->name);
t->fn(t->ctx, t->ops);
}
--
2.53.0
Zi Yan (14):
mm/khugepaged: remove READ_ONLY_THP_FOR_FS check
mm/khugepaged: add folio dirty check after try_to_unmap()
mm/huge_memory: remove READ_ONLY_THP_FOR_FS from file_thp_enabled()
mm/khugepaged: remove READ_ONLY_THP_FOR_FS check in hugepage_enabled()
mm: remove READ_ONLY_THP_FOR_FS Kconfig option
mm: fs: remove filemap_nr_thps*() functions and their users
fs: remove nr_thps from struct address_space
mm/huge_memory: remove folio split check for READ_ONLY_THP_FOR_FS
mm/truncate: use folio_split() in truncate_inode_partial_folio()
fs/btrfs: remove a comment referring to READ_ONLY_THP_FOR_FS
selftests/mm: remove READ_ONLY_THP_FOR_FS in khugepaged
selftests/mm: remove READ_ONLY_THP_FOR_FS code from guard-regions
mm/khugepaged: enable clean pagecache folio collapse for writable
files
selftests/mm: add writable-file collapse tests for khugepaged
fs/btrfs/defrag.c | 3 -
fs/inode.c | 3 -
fs/open.c | 27 ---
include/linux/fs.h | 5 -
include/linux/huge_mm.h | 25 +--
include/linux/pagemap.h | 50 +++---
include/linux/shmem_fs.h | 2 +-
mm/Kconfig | 11 --
mm/filemap.c | 1 -
mm/huge_memory.c | 39 +----
mm/khugepaged.c | 107 ++++++------
mm/truncate.c | 8 +-
tools/testing/selftests/mm/guard-regions.c | 18 +-
tools/testing/selftests/mm/khugepaged.c | 184 ++++++++++++++++-----
tools/testing/selftests/mm/run_vmtests.sh | 12 +-
15 files changed, 254 insertions(+), 241 deletions(-)
--
2.53.0