On Tue, May 20, 2025 at 4:09 AM Baolin Wang <baolin.w...@linux.alibaba.com> wrote:
>
> Sorry for late reply.
>
> On 2025/5/17 14:47, Nico Pache wrote:
> > On Thu, May 15, 2025 at 9:20 PM Baolin Wang
> > <baolin.w...@linux.alibaba.com> wrote:
> >>
> >>
> >>
> >> On 2025/5/15 11:22, Nico Pache wrote:
> >>> khugepaged scans anons PMD ranges for potential collapse to a hugepage.
> >>> To add mTHP support we use this scan to instead record chunks of utilized
> >>> sections of the PMD.
> >>>
> >>> khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
> >>> that represents chunks of utilized regions. We can then determine what
> >>> mTHP size fits best and in the following patch, we set this bitmap while
> >>> scanning the anon PMD. A minimum collapse order of 2 is used as this is
> >>> the lowest order supported by anon memory.
> >>>
> >>> max_ptes_none is used as a scale to determine how "full" an order must
> >>> be before being considered for collapse.
> >>>
> >>> When attempting to collapse an order that has its order set to "always"
> >>> lets always collapse to that order in a greedy manner without
> >>> considering the number of bits set.
> >>>
> >>> Signed-off-by: Nico Pache <npa...@redhat.com>
> >>
> >> Sigh. You still haven't addressed or explained the issues I previously
> >> raised [1], so I don't know how to review this patch again...
> > Can you still reproduce this issue?
>
> Yes, I can still reproduce this issue with today's (5/20) mm-new branch.
>
> I've disabled PMD-sized THP in my system:
> [root]# cat /sys/kernel/mm/transparent_hugepage/enabled
> always madvise [never]
> [root]# cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> always inherit madvise [never]
>
> And I tried calling madvise() with MADV_COLLAPSE for anonymous memory,
> and I can still see it collapsing to a PMD-sized THP.

Hi Baolin! Thank you for your reply and willingness to test again :)

I didn't realize we were talking about madvise collapse. This makes
sense now. I also figured out why I could "reproduce" it before: my
script was always enabling the THP settings in two places, and I only
commented out one to test this. This time I was doing more manual
testing.

The original design of madvise_collapse ignores the sysfs settings and
collapses even if you have an order disabled. I believe this behavior is
wrong, but it is by design. I spent some time playing around with
madvise collapses with and without my changes. This is not a new thing:
I reproduced the issue in 6.11 (Fedora 41), and I think it's been
possible since the inception of madvise collapse 3 years ago.

I noticed similar behavior on one of my RFCs since it was "breaking"
selftests, and the fix was to reincorporate this broken sysfs behavior.

Commit 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage
collapse") says:

  "This call is independent of the system-wide THP sysfs settings, but
  will fail for memory marked VM_NOHUGEPAGE."

The second condition holds true (it fails for VM_NOHUGEPAGE), but I
don't know if we actually want madvise_collapse to be independent of the
system-wide settings. So I'll ask the authors:

+David Rientjes +zoke...@google.com

Was this brought up as a concern when this feature was first introduced?
Was there any pushback, and if so, what was the outcome of the
discussion?

I can easily fix this, and it would further simplify the code (by
removing the is_khugepaged and friends).

As David H. has brought up in other discussions around similar topics,
never should mean never. Is this the only exception we should allow?

Thanks!

>
> > I can no longer reproduce this issue, that's why I posted... although
> > I should have followed up, and looked into what the original issue
> > was. Nothing really sticks out so perhaps something in mm-new was
> > broken and pulled out... not sure.
> >
> > It should now follow the expected behavior, which is that no mTHP
> > collapse occurs because if the PMD size is disabled so is khugepaged
> > collapse.
> >
> > Lmk if you are still experiencing this issue please.
> >
> > Cheers,
> > -- Nico
> >>
> >> [1]
> >> https://lore.kernel.org/all/83a66442-b7c7-42e7-af4e-fd211d8ed...@linux.alibaba.com/
> >>
>
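
For anyone scripting a reproducer: the sysfs files quoted above mark the
active value with brackets, and a per-size "inherit" defers to the
top-level file. Below is a minimal sketch of a check a test script could
run before calling madvise() with MADV_COLLAPSE, to confirm that
PMD-sized THP really is disabled in both places. The helper names
selected_thp_mode() and pmd_thp_disabled() are hypothetical, made up for
this illustration. They are not kernel or selftest APIs, and the sketch
only parses strings; it does not read sysfs or perform the collapse.

```python
import re


def selected_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed (active) value of a THP 'enabled' sysfs string.

    Hypothetical helper: e.g. 'always madvise [never]' yields 'never'.
    """
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError(f"no bracketed value in {sysfs_text!r}")
    return match.group(1)


def pmd_thp_disabled(global_text: str, pmd_text: str) -> bool:
    """True when the PMD-sized setting resolves to 'never'.

    Hypothetical helper: 'inherit' in the per-size file defers to the
    top-level 'enabled' value, so both strings are needed.
    """
    pmd_mode = selected_thp_mode(pmd_text)
    if pmd_mode == "inherit":
        pmd_mode = selected_thp_mode(global_text)
    return pmd_mode == "never"
```

Passing in the contents of /sys/kernel/mm/transparent_hugepage/enabled
and .../hugepages-2048kB/enabled lets a script assert the "never" state
up front, so a stale setting left enabled in one of the two places does
not go unnoticed.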