On Tue, Jun 09, 2026 at 04:58:14AM +0100, Matthew Wilcox wrote:
> OK, here's how I'd structure this:

Thanks a lot for looking into this and writing this Matthew!
Looks workable, let's see if there's rough consensus around this.
Two questions to make sure I understand.

>
> 1. Introduce PG_zeroed for buddy pages
> 2. Set it if init_on_free is set

Not 100% sure why we want this bit. And I am not sure this works
actually because init_on_free does kernel_init_pages and does not
flush cache on arm32. You will notice that user_alloc_needs_zeroing
ignores init_on_free. Right?

How about we skip step 2, make the patchset a bit smaller?

> 3. Set it from balloon driver
>
> https://lore.kernel.org/lkml/c7094de807c0e963526686e1d245bc76193b1a92.1776689093.git....@redhat.com/
>
>
> but add FPI_ZEROED instead of an extra bool parameter.
>
> 4. Introduce page_is_zeroed like this:
>
> static inline bool page_is_zeroed(const struct page *page)
> {
> 	/*
> 	 * lru.next has bit 2 set if the page is already zeroed.
> 	 * Callers may simply overwrite it once they no longer
> 	 * need to preserve that information.
> 	 */
> 	return (unsigned long)page->lru.next & BIT(2);
> }
>
> (you'll notice this is similar to page_is_pfmemalloc() but it doesn't
> need to be in mm.h)
>
> This step is going to be a bit fiddly. We weren't expecting to return
> multiple flags in page->lru.next, so clear_page_pfmemalloc() just sets
> page->lru.next to NULL. So somewhere we need to make sure that
> page->lru.next is definitely NULL, and then allow both the zeroed and
> pfmemalloc flags to be set in it.
>
> The important part of this is that it allows the zeroed flag to be
> returned from the page allocator without introducing pghint_t like you
> did in v2.
>
> 5. Now you can start skipping various zeroing steps higher in the call
> chain.
> I understand David's disgust with vma_alloc_zeroed_movable_folio()
> but that is surely a separate cleanup and nothing to do with this
> patchset.

One other question: would people like to see it as a single patchset
or multiple ones 1-4?
Multiple ones would be easier to review but of course this means no
actual perf gain until part 5 is merged. Is that acceptable?

-- 
MST
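
For illustration only, here is a minimal userspace sketch of the step 4
idea as I read it: clear page->lru.next once, then let the zeroed and
pfmemalloc hints be OR-ed into it independently. The struct definitions
are stand-ins, not the kernel's. PAGE_HINT_PFMEMALLOC, PAGE_HINT_ZEROED,
page_clear_hints() and page_set_hint() are hypothetical names that appear
nowhere in the thread. Bit 2 comes from the quoted page_is_zeroed(); bit 1
for pfmemalloc is an assumption mirroring the existing helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct list_head { struct list_head *next, *prev; };
struct page { struct list_head lru; };

#define BIT(n) (1UL << (n))

/* Hypothetical names. Bit 2 is from the quoted proposal; bit 1 is assumed. */
#define PAGE_HINT_PFMEMALLOC BIT(1)
#define PAGE_HINT_ZEROED     BIT(2)

/* Make sure lru.next is definitely NULL before any hint is set. */
static inline void page_clear_hints(struct page *page)
{
	page->lru.next = NULL;
}

/* OR a hint in, so an earlier hint is preserved rather than overwritten. */
static inline void page_set_hint(struct page *page, unsigned long hint)
{
	page->lru.next =
		(struct list_head *)((uintptr_t)page->lru.next | hint);
}

static inline bool page_is_zeroed(const struct page *page)
{
	return (uintptr_t)page->lru.next & PAGE_HINT_ZEROED;
}

static inline bool page_is_pfmemalloc(const struct page *page)
{
	return (uintptr_t)page->lru.next & PAGE_HINT_PFMEMALLOC;
}

/* Both hints can be carried at once and queried independently. */
void hint_example(void)
{
	struct page p = { { NULL, NULL } };

	page_clear_hints(&p);
	page_set_hint(&p, PAGE_HINT_ZEROED);
	page_set_hint(&p, PAGE_HINT_PFMEMALLOC);
	assert(page_is_zeroed(&p) && page_is_pfmemalloc(&p));
}
```

The point of the sketch is only that setting one hint must not clobber
the other, which a plain assignment to page->lru.next would do.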

