On 4/29/26 17:35, Zi Yan wrote:
> collapse_file() is capable of collapsing pagecache folios from writable
> files to PMD folios. Now enable clean pagecache folio collapse in addition
> to read-only pagecache folio collapse by removing the
> inode_is_open_for_write() from file_thp_enabled() and only performing
> filemap_flush() if the file is read-only.
> 
> This means userspace needs to explicitly flush the content of pagecache
> folios before khugepaged can collapse the folios, or use
> madvise(MADV_COLLAPSE), which does the flush in the retry. The reason is
> that blindly enabling dirty pagecache folio from writable files collapse
> makes khugepaged flush these folios all the time. It is undesirable to
> cause system level pagecache flushes.
> 
> To properly support dirty pagecache folio collapse, filemap_flush() needs
> to be avoided. Potentially, merging associated buffer instead of dropping
> it with filemap_release_folio() might be needed.
> 
> NOTE: this breaks khugepaged selftests for writable file pagecache
> collapse, which is set to fail all the time. The next commit fix it.
> 
> Signed-off-by: Zi Yan <[email protected]>
> ---
>  mm/huge_memory.c | 2 +-
>  mm/khugepaged.c  | 9 ++++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9b3abb98a7e51..e1e9d59db6e70 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -97,7 +97,7 @@ static inline bool file_thp_enabled(struct vm_area_struct 
> *vma)
>       if (!mapping_pmd_folio_support(vma->vm_file->f_mapping))
>               return false;
>  
> -     return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> +     return S_ISREG(inode->i_mode);
>  }
>  
>  /* If returns true, we are unable to access the VMA's folios. */
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1ee15b48962a3..fb7ff643973cc 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2345,7 +2345,14 @@ static enum scan_result collapse_file(struct mm_struct 
> *mm, unsigned long addr,
>                                * forcing writeback in loop.
>                                */
>                               xas_unlock_irq(&xas);
> -                             filemap_flush(mapping);
> +                             /*
> +                              * Only flush for read-only files. Writable
> +                              * files can have their folios dirty at any
> +                              * time; blindly flushing them would cause
> +                              * undesirable system-wide writeback.
> +                              */

That comment should really be merged in the comment above.

Also, there we say "khugepaged only works on read-only fd" ... which is now just
wrong?

Please revise that whole comment as you incorporate your comment.

Apart from that I guess this is fine ... or we'll learn rather quickly, haha.

-- 
Cheers,

David

Reply via email to