Re: [PATCH v7 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support

Baolin Wang Wed, 28 May 2025 07:04:47 -0700



On 2025/5/28 17:26, David Hildenbrand wrote:

On 22.05.25 11:39, Baolin Wang wrote:



On 2025/5/21 18:23, Nico Pache wrote:

On Tue, May 20, 2025 at 4:09 AM Baolin Wang
<[email protected]> wrote:

Sorry for late reply.

On 2025/5/17 14:47, Nico Pache wrote:
On Thu, May 15, 2025 at 9:20 PM Baolin Wang
<[email protected]> wrote:
On 2025/5/15 11:22, Nico Pache wrote:
khugepaged scans anons PMD ranges for potential collapse to ahugepage.To add mTHP support we use this scan to instead record chunks ofutilized
sections of the PMD.
khugepaged_scan_bitmap uses a stack struct to recursively scan abitmapthat represents chunks of utilized regions. We can then determinewhatmTHP size fits best and in the following patch, we set thisbitmap whilescanning the anon PMD. A minimum collapse order of 2 is used asthis is
the lowest order supported by anon memory.
max_ptes_none is used as a scale to determine how "full" an ordermust
be before being considered for collapse.
When attempting to collapse an order that has its order set to"always"
lets always collapse to that order in a greedy manner without
considering the number of bits set.

Signed-off-by: Nico Pache <[email protected]>
Sigh. You still haven't addressed or explained the issues Ipreviously
raised [1], so I don't know how to review this patch again...
Can you still reproduce this issue?
Yes, I can still reproduce this issue with today's (5/20) mm-newbranch.
I've disabled PMD-sized THP in my system:
[root]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
[root]# cat/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
always inherit madvise [never]

And I tried calling madvise() with MADV_COLLAPSE for anonymous memory,
and I can still see it collapsing to a PMD-sized THP.

Hi Baolin ! Thank you for your reply and willingness to test again :)

I didn't realize we were talking about madvise collapse-- this makes
sense now. I also figured out why I could "reproduce" it before. My
script was always enabling the THP settings in two places, and I only
commented out one to test this. But this time I was doing more manual
testing.

The original design of madvise_collapse ignores the sysfs and
collapses even if you have an order disabled. I believe this behavior
is wrong, but by design. I spent some time playing around with madvise
collapses with and w/o my changes. This is not a new thing, I
reproduced the issue in 6.11 (Fedora 41), and I think its been
possible since the inception of madvise collapse 3 years ago. I
noticed a similar behavior on one of my RFC since it was "breaking"
selftests, and the fix was to reincorporate this broken sysfs
behavior.


OK. Thanks for the explanation.

7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepagecollapse")

"This call is independent of the system-wide THP sysfs settings, but
will fail for memory marked VM_NOHUGEPAGE."

The second condition holds true (and fails for VM_NOHUGEPAGE), but I
dont know if we actually want madvise_collapse to be independent of
the system-wide.


This design principle surprised me a bit, and I failed to find the
reason in the commit log. I agree that "never should mean never," and we
should respect the THP/mTHP sysfs setting. Additionally, for the
'shmem_enabled' sysfs interface controlled for shmem/tmpfs, THP collapse
can still be prohibited through the 'deny' configuration. The rules here
are somewhat confusing.

I recall that we decided to overwrite "VM_NOHUGEPAGE", because theassumption is that the same app that triggered MADV_NOHUGEPAGE triggersthe collapse. So the app decides on its own behavior.

Similarly, allowing for collapsing in a VM without VM_HUGEPAGE in the"madvise" mode would be fine.


But in the "never" case, we should just "never" collapse.


OK. Let's fix the "never" case first. Thanks.

Re: [PATCH v7 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support

Reply via email to