On 27 Mar 2026, at 11:29, Lorenzo Stoakes (Oracle) wrote: > On Fri, Mar 27, 2026 at 11:12:46AM -0400, Zi Yan wrote: >> On 27 Mar 2026, at 8:42, Lorenzo Stoakes (Oracle) wrote: >> >>> On Thu, Mar 26, 2026 at 09:42:50PM -0400, Zi Yan wrote: >>>> Replace it with a check on the max folio order of the file's address space >>>> mapping, making sure PMD_ORDER is supported. >>>> >>>> Signed-off-by: Zi Yan <[email protected]> >>>> --- >>>> mm/huge_memory.c | 6 +++--- >>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>>> index c7873dbdc470..1da1467328a3 100644 >>>> --- a/mm/huge_memory.c >>>> +++ b/mm/huge_memory.c >>>> @@ -89,9 +89,6 @@ static inline bool file_thp_enabled(struct >>>> vm_area_struct *vma) >>>> { >>>> struct inode *inode; >>>> >>>> - if (!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) >>>> - return false; >>>> - >>>> if (!vma->vm_file) >>>> return false; >>>> >>>> @@ -100,6 +97,9 @@ static inline bool file_thp_enabled(struct >>>> vm_area_struct *vma) >>>> if (IS_ANON_FILE(inode)) >>>> return false; >>>> >>>> + if (mapping_max_folio_order(inode->i_mapping) < PMD_ORDER) >>>> + return false; >>>> + >>> >>> At this point I think this should be a separate function quite honestly and >>> share it with 2/10's use, and then you can put the comment in here re: anon >>> shmem etc. >>> >>> Though that won't apply here of course as shmem_allowable_huge_orders() >>> would >>> have been invoked :) >>> >>> But no harm in refactoring it anyway, and the repetitive < PMD_ORDER stuff >>> is >>> unfortunate. >>> >>> Buuut having said that is this right actually? >>> >>> Because we have: >>> >>> if (((in_pf || smaps)) && vma->vm_ops->huge_fault) >>> return orders; >>> >>> Above it, and now you're enabling huge folio file systems to do non-page >>> fault >>> THP and that's err... isn't that quite a big change? >> >> That is what READ_ONLY_THP_FOR_FS does, creating THPs after page faults, >> right? >> This patchset changes the condition from all FSes to FSes with large folio >> support. > > No, READ_ONLY_THP_FOR_FS operates differently. > > It explicitly _only_ is allowed for MADV_COLLAPSE and only if the file is > mounted read-only. > > So due to: > > if (((in_pf || smaps)) && vma->vm_ops->huge_fault) > return orders; > > if (((!in_pf || smaps)) && file_thp_enabled(vma)) > return orders; > > | PF | MADV_COLLAPSE | khugepaged | > |-----------|---------------|------------| > large folio fs | ✓ | x | x | > READ_ONLY_THP_FOR_FS | x | ✓ | ✓ | > > After this change: > > | PF | MADV_COLLAPSE | khugepaged | > |-----------|---------------|------------| > large folio fs | ✓ | ✓ | ? | > > (I hope we're not enabling khugepaged for large folio fs - which shouldn't > be necessary anyway as we try to give them folios on page fault and they > use thp-friendly get_unused_area etc. :) > > We shouldn't be doing this. > > It should remain: > > | PF | MADV_COLLAPSE | khugepaged | > |-----------|---------------|------------| > large folio fs | ✓ | x | x | > > If we're going to remove it, we should first _just remove it_, not > simultaneously increase the scope of what all the MADV_COLLAPSE code is > doing without any confidence in any of it working properly. > > And it makes the whole series misleading - you're actually _enabling_ a > feature not (only) _removing_ one.
That is what my RFC patch does, but David and willy told me to do this. :) IIUC, with READ_ONLY_THP_FOR_FS, FSes with large folio support will get THP via MADV_COLLAPSE or khugepaged. So removing the code like I did in RFC would cause regressions. I guess I need to rename the series to avoid confusion. How about? Remove read-only THP support for FSes without large folio support. [1] https://lore.kernel.org/all/[email protected]/ > > So let's focus as David suggested on one thing at a time, incrementally. > > And let's please try and sort some of this confusing mess out in the code > if at all possible... > >> >> Will add a helper, mapping_support_pmd_folio(), for >> mapping_max_folio_order(inode->i_mapping) < PMD_ORDER. >> >>> >>> So yeah probably no to this patch as is :) we should just drop >>> file_thp_enabled()? >> >> >> >>> >>>> return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode); >>>> } >>>> >>>> -- >>>> 2.43.0 >>>> >> >> >> Best Regards, >> Yan, Zi > > Cheers, Lorenzo Best Regards, Yan, Zi

