On Fri, Mar 27, 2026 at 11:12:46AM -0400, Zi Yan wrote:
> On 27 Mar 2026, at 8:42, Lorenzo Stoakes (Oracle) wrote:
>
> > On Thu, Mar 26, 2026 at 09:42:50PM -0400, Zi Yan wrote:
> >> Replace it with a check on the max folio order of the file's address space
> >> mapping, making sure PMD_ORDER is supported.
> >>
> >> Signed-off-by: Zi Yan <[email protected]>
> >> ---
> >> mm/huge_memory.c | 6 +++---
> >> 1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index c7873dbdc470..1da1467328a3 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -89,9 +89,6 @@ static inline bool file_thp_enabled(struct
> >> vm_area_struct *vma)
> >> {
> >> struct inode *inode;
> >>
> >> - if (!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS))
> >> - return false;
> >> -
> >> if (!vma->vm_file)
> >> return false;
> >>
> >> @@ -100,6 +97,9 @@ static inline bool file_thp_enabled(struct
> >> vm_area_struct *vma)
> >> if (IS_ANON_FILE(inode))
> >> return false;
> >>
> >> + if (mapping_max_folio_order(inode->i_mapping) < PMD_ORDER)
> >> + return false;
> >> +
> >
> > At this point I think this should be a separate function quite honestly and
> > share it with 2/10's use, and then you can put the comment in here re: anon
> > shmem etc.
> >
> > Though that won't apply here of course as shmem_allowable_huge_orders()
> > would
> > have been invoked :)
> >
> > But no harm in refactoring it anyway, and the repetitive < PMD_ORDER stuff
> > is
> > unfortunate.
> >
> > Buuut having said that is this right actually?
> >
> > Because we have:
> >
> > if (((in_pf || smaps)) && vma->vm_ops->huge_fault)
> > return orders;
> >
> > Above it, and now you're enabling huge folio file systems to do non-page
> > fault
> > THP and that's err... isn't that quite a big change?
>
> That is what READ_ONLY_THP_FOR_FS does, creating THPs after page faults,
> right?
> This patchset changes the condition from all FSes to FSes with large folio
> support.
No, READ_ONLY_THP_FOR_FS operates differently.
It explicitly _only_ is allowed for MADV_COLLAPSE and only if the file is
mounted read-only.
So due to:
if (((in_pf || smaps)) && vma->vm_ops->huge_fault)
return orders;
if (((!in_pf || smaps)) && file_thp_enabled(vma))
return orders;
| PF | MADV_COLLAPSE | khugepaged |
|-----------|---------------|------------|
large folio fs | ✓ | x | x |
READ_ONLY_THP_FOR_FS | x | ✓ | ✓ |
After this change:
| PF | MADV_COLLAPSE | khugepaged |
|-----------|---------------|------------|
large folio fs | ✓ | ✓ | ? |
(I hope we're not enabling khugepaged for large folio fs - which shouldn't
be necessary anyway as we try to give them folios on page fault and they
use thp-friendly get_unused_area etc. :)
We shouldn't be doing this.
It should remain:
| PF | MADV_COLLAPSE | khugepaged |
|-----------|---------------|------------|
large folio fs | ✓ | x | x |
If we're going to remove it, we should first _just remove it_, not
simultaneously increase the scope of what all the MADV_COLLAPSE code is
doing without any confidence in any of it working properly.
And it makes the whole series misleading - you're actually _enabling_ a
feature not (only) _removing_ one.
So let's focus as David suggested on one thing at a time, incrementally.
And let's please try and sort some of this confusing mess out in the code
if at all possible...
>
> Will add a helper, mapping_support_pmd_folio(), for
> mapping_max_folio_order(inode->i_mapping) < PMD_ORDER.
>
> >
> > So yeah probably no to this patch as is :) we should just drop
> > file_thp_enabled()?
>
>
>
> >
> >> return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> >> }
> >>
> >> --
> >> 2.43.0
> >>
>
>
> Best Regards,
> Yan, Zi
Cheers, Lorenzo