On 9 May 2026, at 4:13, David Hildenbrand (Arm) wrote: > On 4/29/26 17:35, Zi Yan wrote: >> collapse_file() is capable of collapsing pagecache folios from writable >> files to PMD folios. Now enable clean pagecache folio collapse in addition >> to read-only pagecache folio collapse by removing the >> inode_is_open_for_write() from file_thp_enabled() and only performing >> filemap_flush() if the file is read-only. >> >> This means userspace needs to explicitly flush the content of pagecache >> folios before khugepaged can collapse the folios, or use >> madvise(MADV_COLLAPSE), which does the flush in the retry. The reason is >> that blindly enabling dirty pagecache folio from writable files collapse >> makes khugepaged flush these folios all the time. It is undesirable to >> cause system level pagecache flushes. >> >> To properly support dirty pagecache folio collapse, filemap_flush() needs >> to be avoided. Potentially, merging associated buffer instead of dropping >> it with filemap_release_folio() might be needed. >> >> NOTE: this breaks khugepaged selftests for writable file pagecache >> collapse, which is set to fail all the time. The next commit fix it. >> >> Signed-off-by: Zi Yan <[email protected]> >> --- >> mm/huge_memory.c | 2 +- >> mm/khugepaged.c | 9 ++++++++- >> 2 files changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 9b3abb98a7e51..e1e9d59db6e70 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -97,7 +97,7 @@ static inline bool file_thp_enabled(struct vm_area_struct >> *vma) >> if (!mapping_pmd_folio_support(vma->vm_file->f_mapping)) >> return false; >> >> - return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode); >> + return S_ISREG(inode->i_mode); >> } >> >> /* If returns true, we are unable to access the VMA's folios. */ >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 1ee15b48962a3..fb7ff643973cc 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -2345,7 +2345,14 @@ static enum scan_result collapse_file(struct >> mm_struct *mm, unsigned long addr, >> * forcing writeback in loop. >> */ >> xas_unlock_irq(&xas); >> - filemap_flush(mapping); >> + /* >> + * Only flush for read-only files. Writable >> + * files can have their folios dirty at any >> + * time; blindly flushing them would cause >> + * undesirable system-wide writeback. >> + */ > > That comment should really be merged in the comment above. > > Also, there we say "khugepaged only works on read-only fd" ... which is now > just > wrong? > > Please revise that whole comment as you incorporate your comment. Sure.
> > Apart from that I guess this is fine ... or we'll learn rather quickly, haha. :) Best Regards, Yan, Zi

