On 6/10/26 18:37, Gregory Price wrote:
> On Wed, Jun 10, 2026 at 05:00:33PM +0200, David Hildenbrand (Arm) wrote:
>> On 6/10/26 12:41, Gregory Price wrote:
>>>
>>> Notably: slub.c injects __GFP_THISNODE internally on behalf of kmalloc,
>>> which causes spillage into private nodes because slub allows private
>>> nodes in its mask. I think this is fixable.
>>>
>>> I have to inspect some other __GFP_THISNODE users (hugetlb, some arch
>>> code, etc), but it seems like fully dropping the FALLBACK entries and
>>> requiring __GFP_THISNODE might be sufficient.
>>
>> Sorry, I haven't been able to follow up so far, and not sure if that's what
>> you are discussing here ...
>>
>> After the LSF/MM session, I was wondering: if we focus on allowing only
>> folio allocations to end up on private memory nodes for now, could the
>> __GFP_THISNODE approach work there?
>>
>> Essentially, disallow any allocations on non-folio paths, and allow folio
>> allocation only with __GFP_THISNODE set.
>>
>> I have to find time to read the other mails in this thread, on my todo list.
>>
>> So sorry if that is precisely what is being discussed here.
>>
>
> So, I remember this being asked, and I didn't fully grok the request.
>
> I'm still not sure I fully understand the question, so apologies if I'm
> answering the wrong things here.
>
> I understand this question in two ways:
>
> 1) Can we disallow PAGE allocation and limit this to FOLIO allocation
Yes: can we allow only folios to be allocated from private memory nodes? Let
me reply to that one below.
> 2) Can we disallow [Feature] (i.e. slab) allocation targeting the node.
>
>
> 1) Can we disallow page allocation and limit this to folios?
>
> No, I don't think so.
>
> Folio allocations are written in terms of page allocations, we would
> have to rewrite folio allocation interfaces and introduce a bunch of
> boilerplate for the sake of this.
>
> struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
> 		int preferred_nid, nodemask_t *nodemask)
> {
> 	struct page *page;
>
> 	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid,
> 					   nodemask);
> 	if (page)
> 		set_page_refcounted(page);
> 	return page;
> }
>
> struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order,
> 		int preferred_nid, nodemask_t *nodemask)
> {
> 	struct page *page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
> 					preferred_nid, nodemask);
> 	return page_rmappable_folio(page);
> }
At LSF/MM we talked about how GFP flags are bad and how deriving things from the
context might be better. I think there was also talk about the memalloc_*
interface being a better way forward. Maybe we could start giving the
allocator more context ("we are allocating a folio").

The following is incomplete (especially the hugetlb parts, I assume), just to
sketch the idea:
From 64aaff5f40497201ecc089c3339df6576184c433 Mon Sep 17 00:00:00 2001
From: "David Hildenbrand (Arm)" <[email protected]>
Date: Wed, 10 Jun 2026 20:55:49 +0200
Subject: [PATCH] tmp
Signed-off-by: David Hildenbrand (Arm) <[email protected]>
---
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 11 +++++++++++
mm/mempolicy.c | 14 ++++++++++++--
mm/page_alloc.c | 7 ++++++-
4 files changed, 30 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ee06cba5c6f5..9c850b7be6bf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1778,7 +1778,7 @@ extern struct pid *cad_pid;
* I am cleaning dirty pages
from some other bdi. */
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* Randomize virtual address
space */
-#define PF__HOLE__00800000 0x00800000
+#define PF_MEMALLOC_FOLIO 0x00800000 /* Allocating a folio that can
end up on
private memory nodes */
#define PF__HOLE__01000000 0x01000000
#define PF__HOLE__02000000 0x02000000
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to
meddle with
cpus_mask */
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 95d0040df584..2101a447c084 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -471,6 +471,17 @@ static inline void memalloc_pin_restore(unsigned int flags)
memalloc_flags_restore(flags);
}
+static inline unsigned int memalloc_folio_save(void)
+{
+ return memalloc_flags_save(PF_MEMALLOC_FOLIO);
+}
+
+static inline void memalloc_folio_restore(unsigned int flags)
+{
+ memalloc_flags_restore(flags);
+}
+
+
#ifdef CONFIG_MEMCG
DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
/**
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 36699fabd3c2..a78b0e5a1fce 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2506,8 +2506,13 @@ static struct page *alloc_pages_mpol(gfp_t gfp, unsigned
int order,
struct folio *folio_alloc_mpol_noprof(gfp_t gfp, unsigned int order,
struct mempolicy *pol, pgoff_t ilx, int nid)
{
- struct page *page = alloc_pages_mpol(gfp | __GFP_COMP, order, pol,
+ struct page *page;
+ unsigned int flags;
+
+ flags = memalloc_folio_save();
+ page = alloc_pages_mpol(gfp | __GFP_COMP, order, pol,
ilx, nid);
+ memalloc_folio_restore(flags);
if (!page)
return NULL;
@@ -2588,7 +2593,12 @@ EXPORT_SYMBOL(alloc_pages_noprof);
struct folio *folio_alloc_noprof(gfp_t gfp, unsigned int order)
{
- return page_rmappable_folio(alloc_pages_noprof(gfp | __GFP_COMP,
order));
+ struct folio *folio;
+ unsigned int flags;
+
+ flags = memalloc_folio_save();
+ folio = page_rmappable_folio(alloc_pages_noprof(gfp | __GFP_COMP,
order));
+ memalloc_folio_restore(flags);
+ return folio;
}
EXPORT_SYMBOL(folio_alloc_noprof);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ee902a468c2f..37434b37f7af 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5345,8 +5345,13 @@ EXPORT_SYMBOL(__alloc_pages_noprof);
struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int
preferred_nid,
nodemask_t *nodemask)
{
- struct page *page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
+ struct page *page;
+ unsigned int flags;
+
+ flags = memalloc_folio_save();
+ page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
preferred_nid, nodemask);
+ memalloc_folio_restore(flags);
return page_rmappable_folio(page);
}
EXPORT_SYMBOL(__folio_alloc_noprof);
--
2.43.0
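
To make the intended semantics a bit more concrete, here is a small userspace
model of the scoping, nothing more than a sketch. Assumptions: task_flags
stands in for current->flags, the save/restore helpers follow my reading of
the existing memalloc_flags_save()/memalloc_flags_restore() pattern (save
returns only the bits it newly set, restore clears exactly those),
MODEL_GFP_THISNODE is a made-up stand-in for __GFP_THISNODE, and
private_node_allowed() is a hypothetical allocator-side check that the patch
above does not contain.

```c
#include <assert.h>
#include <stdbool.h>

#define PF_MEMALLOC_FOLIO	0x00800000u	/* bit value taken from the patch */
#define MODEL_GFP_THISNODE	0x1u		/* hypothetical stand-in for __GFP_THISNODE */

static unsigned int task_flags;			/* models current->flags */

/* Set @flags; return only the bits that were not already set. */
static unsigned int memalloc_flags_save(unsigned int flags)
{
	unsigned int newly_set = ~task_flags & flags;

	task_flags |= flags;
	return newly_set;
}

/* Clear exactly the bits the matching save call reported as newly set. */
static void memalloc_flags_restore(unsigned int flags)
{
	task_flags &= ~flags;
}

static unsigned int memalloc_folio_save(void)
{
	return memalloc_flags_save(PF_MEMALLOC_FOLIO);
}

static void memalloc_folio_restore(unsigned int flags)
{
	memalloc_flags_restore(flags);
}

/*
 * Hypothetical allocator-side check (NOT in the patch): a private node is
 * only eligible inside a folio scope and with the THISNODE bit set.
 */
static bool private_node_allowed(unsigned int gfp)
{
	return (task_flags & PF_MEMALLOC_FOLIO) && (gfp & MODEL_GFP_THISNODE);
}

/* Models a folio wrapper: open the scope, "allocate", close the scope. */
static bool model_folio_alloc(unsigned int gfp)
{
	unsigned int flags = memalloc_folio_save();
	bool allowed = private_node_allowed(gfp);

	memalloc_folio_restore(flags);
	return allowed;
}

/* An inner scope must not clear the flag that an outer scope set. */
static bool model_nested_scope_keeps_flag(void)
{
	unsigned int outer = memalloc_folio_save();
	bool still_set;

	(void)model_folio_alloc(MODEL_GFP_THISNODE);	/* inner save returns 0 */
	assert(task_flags & PF_MEMALLOC_FOLIO);
	still_set = task_flags & PF_MEMALLOC_FOLIO;
	memalloc_folio_restore(outer);

	return still_set && !(task_flags & PF_MEMALLOC_FOLIO);
}
```

In this model a plain page-style caller (no scope open) is never eligible for
a private node, and a folio caller is eligible only when it also passes the
THISNODE bit. Whether the real check belongs in the page allocator, and what
it should look at, is exactly the part that is still open.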
--
Cheers,
David