On 2025/9/2 00:46, David Hildenbrand wrote:
On 29.08.25 03:55, Baolin Wang wrote:


On 2025/8/28 18:48, Dev Jain wrote:

On 28/08/25 3:16 pm, Baolin Wang wrote:
(Sorry for chiming in late)

On 2025/8/22 22:10, David Hildenbrand wrote:
One could also easily support the value 255 (HPAGE_PMD_NR / 2 - 1),
but I'm not sure if we have to add that for now.

Yeah, not so sure about this. This is a 'just have to know' too, and
yes, you might add it to the docs, but people are going to be mightily
confused, esp. if it's a calculated value.

I don't see any other way around having a separate tunable if we don't
just have something VERY simple like on/off.

Yeah, not advocating that we add support for values other than 0/511,
really.


Also, the mentioned issue honestly sounds like something that needs to
be fixed elsewhere, in the algorithm used to figure out mTHP ranges (I
may be wrong, and am happy to stand corrected if this is somehow
inherent, but it really feels that way).

I think the creep is unavoidable for certain values.

If you have the first two pages of a PMD area populated, and you
allow for at least half of the #PTEs to be none/zero, you'd collapse
first an order-2 folio, then an order-3 ... until you reached PMD order.
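
A minimal sketch of that creep, as a toy model rather than the actual
khugepaged code. It assumes that the populated PTEs are contiguous from
the start of the PMD range, that orders 2..9 are all enabled, that each
scan collapses to the highest order whose limit is satisfied, and that
the limit is scaled per order as
max_ptes_none >> (HPAGE_PMD_ORDER - order). The scaling rule and the
helper names are assumptions made for illustration.

```python
HPAGE_PMD_ORDER = 9                     # 2MB PMD with 4KB base pages
HPAGE_PMD_NR = 1 << HPAGE_PMD_ORDER     # 512 PTEs per PMD


def scaled_max_ptes_none(max_ptes_none, order):
    """Assumed per-order limit: scale the PMD-level value down by order."""
    return max_ptes_none >> (HPAGE_PMD_ORDER - order)


def simulate(populated, max_ptes_none, orders=range(2, HPAGE_PMD_ORDER + 1)):
    """Return the sequence of orders collapsed over repeated scans.

    'populated' is the number of present PTEs, assumed contiguous from
    the start of the PMD-aligned range.
    """
    history = []
    while True:
        # Each scan tries the highest enabled order first.
        for order in sorted(orders, reverse=True):
            nr = 1 << order
            none = nr - min(populated, nr)
            limit = scaled_max_ptes_none(max_ptes_none, order)
            if populated < nr and none <= limit:
                populated = nr          # collapse fills the whole range
                history.append(order)
                break
        else:
            return history              # no order qualified: stable state


if __name__ == "__main__":
    for value in (0, 255, 256, 511):
        print(value, simulate(populated=2, max_ptes_none=value))
```

In this model, with two pages populated, 256 (HPAGE_PMD_NR / 2) walks
through every order from 2 up to 9, 255 never collapses anything,
511 collapses straight to PMD order, and 0 collapses nothing.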

So for now we really should just support 0 / 511 to say "don't
collapse if there are holes" vs. "always collapse if there is at
least one pte used".

If we only allow setting 0 or 511, as Nico mentioned before, "At 511,
no mTHP collapses would ever occur anyway, unless you have 2MB
disabled and other mTHP sizes enabled. Technically, at 511, only the
highest enabled order would ever be collapsed."

I didn't understand this statement. At 511, mTHP collapses will occur if
khugepaged cannot get a PMD folio. Our goal is to collapse to the
highest order folio.

Yes, I’m not saying that it’s incorrect behavior when set to 511. What I
mean is, as in the example I gave below, users may only want to allow a
large order collapse when the number of present PTEs reaches half of the
large folio, in order to avoid RSS bloat.

How do these users control allocation at fault time where this parameter is completely ignored?

Sorry, I did not get your point. Why does the 'max_pte_none' need to control allocation at fault time? Could you be more specific? Thanks.
