On Mon Jun 15, 2026 at 2:38 PM UTC, Vlastimil Babka (SUSE) wrote:
> On 6/12/26 17:29, Gregory Price wrote:
>> On Wed, Jun 10, 2026 at 04:12:52PM -0400, Gregory Price wrote:
>>> On Wed, Jun 10, 2026 at 08:59:59PM +0200, David Hildenbrand (Arm) wrote:
>>> > >
>>> > > I understand this question in two ways:
>>> > >
>>> > > 1) Can we disallow PAGE allocation and limit this to FOLIO allocation
>>> >
>>> > Yes. Can we only allow folios to be allocated from private memory nodes.
>>> > So let me reply to that one below.
>>> >
>>> ... snip ...
>>> >
>>> > At LSF/MM we talked about how GFP flags are bad and how deriving stuff
>>> > from the context might be better. I think there was also talk about how
>>> > the memalloc_* interface might be a better way forward. Maybe we would
>>> > start giving the allocator more context ("we are allocating a folio").
>>> >
>>> > The following is incomplete (esp. hugetlb stuff I assume), just as some
>>> > idea:
>>> >
>>>
>>> I will still probably send the next RFC version tomorrow or Friday,
>>> as I want to get some eyes on the __GFP_PRIVATE-less pattern.
>>>
>>> Also, I made a new `anondax` driver which enables userland testing
>>> of this functionality without any specialty hardware.
>>>
>>
>> (apologies for the length of this email: this will all be covered in
>> the coming cover letter, but I just wanted to share a bit of a preview)
>>
>> ===
>>
>> Just another small update - I am planning to post the RFC today once I
>> get some mild cleanup done. It will be based on the dax atomic hotplug
>>
>> https://lore.kernel.org/linux-mm/[email protected]/
>>
>> But a couple specific details regarding the memalloc pieces that I've
>> learned the past couple of days playing with it.
>>
>> 1) memalloc_folio is required to ensure non-folio allocations don't land
>> on the private node, even if it happens within a memalloc_private
>> context. Since memalloc_folio may be useful in contexts outside of
>> private nodes, I kept this as a separate flag.
>>
>> If we think there will *never* be additional users of memalloc_folio,
>> then we could fold _folio into _private to save the flag for now and
>> add it back when we actually need it.
>>
>> 2) memalloc_private is needed to unlock private nodes, but in the
>> original NOFALLBACK-only design, you also needed __GFP_THISNODE.
>>
>> This is *highly* restrictive. I found when playing with mbind that
>> MPOL_BIND + __GFP_THISNODE generates a WARN (valid WARN, it normally
>> implies a bug).
>>
>> That leads me to #3
>
> I think the memalloc approach is dangerous due to unexpected nesting. There
> might be nested page allocations in page allocation itself (due to some
> debugging option). But also interrupts do not change what "current" points
> to. Suddenly those could start requesting folios and/or private nodes and be
> surprised, I'm afraid.

Minor side-note: couldn't we just define it such that the allocator
ignores the context when not in_task() (and warns if you try to enter
the context while not currently in_task())?

I don't think this would change the conclusion much, since it doesn't
help with the nesting issues. I'm mostly curious whether I'm missing a
detail here.

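
To make the in_task() idea concrete, here is a rough userspace model of
what I have in mind. It is only a sketch: PF_MEMALLOC_PRIVATE,
memalloc_private_save()/memalloc_private_restore() and
alloc_may_use_private() are hypothetical names modelled on the existing
memalloc_*_save()/restore() pattern, and in_task() and "current" are
simulated with plain variables so the thing compiles outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; not an existing PF_* value. */
#define PF_MEMALLOC_PRIVATE 0x1u

struct task {
	unsigned int flags;
};

/* Userspace stand-ins for kernel state (assumptions for this sketch). */
static struct task init_task;
static struct task *current_task = &init_task;	/* models "current" */
static bool sim_in_task = true;			/* false models IRQ context */
static int warn_count;				/* models WARN_ON_ONCE() firing */

static bool in_task(void)
{
	return sim_in_task;
}

/* Enter the scope. Refuse (and warn) when not in task context. */
static unsigned int memalloc_private_save(void)
{
	unsigned int old = current_task->flags & PF_MEMALLOC_PRIVATE;

	if (!in_task()) {
		warn_count++;
		return old;	/* flags untouched, restore becomes a no-op */
	}
	current_task->flags |= PF_MEMALLOC_PRIVATE;
	return old;
}

/* Leave the scope by putting back the previous state of the bit. */
static void memalloc_private_restore(unsigned int old)
{
	current_task->flags =
		(current_task->flags & ~PF_MEMALLOC_PRIVATE) | old;
}

/* Allocator-side check: the scope is only honoured in task context. */
static bool alloc_may_use_private(void)
{
	return in_task() && (current_task->flags & PF_MEMALLOC_PRIVATE);
}
```

With this, an interrupt that fires while the task is inside the scope
sees alloc_may_use_private() == false even though "current" still has
the bit set. A nested allocation made in task context (e.g. from a debug
option inside the page allocator) would still inherit the scope, so this
does nothing for the nesting problem.
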
> The memalloc scopes only work well when they restrict the context wrt
> reclaim, and allocations in IRQ have to be already restricted heavily
> (atomic) so further memalloc restrictions don't do anything in practice. But
> to make them change other aspects of the allocations like this won't work.