Re: [PATCH] x86/mm: fix vmemmap leak on memory hot-remove

Mike Rapoport Tue, 19 May 2026 21:50:05 -0700

(adding Vishal)

On Wed, May 20, 2026 at 01:59:49AM +0900, Juhyung Park wrote:
> On Wed, May 20, 2026 at 1:41 AM Dave Hansen <[email protected]> wrote:
> >
> > On 5/19/26 09:27, Juhyung Park wrote:
> > > Hi Dave,
> > >
> > > On Wed, May 20, 2026 at 1:02 AM Dave Hansen <[email protected]> wrote:
> > >>
> > >> On 5/19/26 08:10, Juhyung Park wrote:
> > >>>  #endif
> > >>>       } else {
> > >>> -             pagetable_free(page_ptdesc(page));
> > >>> +             /*
> > >>> +              * Use __free_pages() to honor @order: vmemmap PMD leaves
> > >>> +              * freed here are not compound pages, so pagetable_free()
> > >>> +              * would leak 511 of 512 pages per 2 MB chunk.
> > >>> +              */
> > >>> +             __free_pages(page, order);
> > >>>       }
> > >>>  }
> > >>
> > >> I find myself really wondering how much of this came from a human and
> > >> how much from the LLM. Could you share that with us?
> > >
> > > Not my first kernel contribution, just so you know. (first in mm tho)
> > >
> > > I asked Claude to write both the commit body and comment and it was
> > > too verbose. I manually trimmed it down.
> > > Sorry if it still sounds too LLM-ish.
> >
> > Yeah, it still sounded really LLM-ish to me. Still rather chatty.
> >
> > > This was tested on a VM with virtualized CXL device and toggling it
> > > back and forth was visibly causing leaks. kmemleak was unable to catch
> > > this (rightfully so), so I skeptically asked Claude to see if it can
> > > figure it out while pwd was the kernel source the VM was running.
> > > "Access the VM at "ssh -p2223 [email protected]". There's a memory
> > > leak whenever CXL memory switches modes via: daxctl reconfigure-device
> > > --mode=system-ram dax0.0 --force, daxctl reconfigure-device
> > > --mode=devdax dax0.0 --force. Figure out why. If you need to reboot
> > > the VM, do not do it yourself and ask me."
> > >
> > > It did in 6 minutes and it basically told me to revert bf9e4e30f353. I
> > > was very skeptical and reviewed manually (with my short knowledge of
> > > mm) why this would be a correct fix.
> >
> > Neato.
> >
> > >> We're trying to get _away_ from using the 'struct page' APIs on page
> > >> tables. This goes backwards. Worst case, do:
> > >>
> > >>         /* vmemmap PMD leaves are not compound pages */
> > >>         for (i = 0; i < 1<<order; i++)
> > >>                 pagetable_free(page_ptdesc(&page[i]));
> > >>
> > >> Right?
> > >
> > > Shouldn't I worry about the loop overhead? With order == 9, that's 512
> > > iterations. That's compounded to O(N) when the entire memory size is
> > > in consideration.
> >
> > Is it optimal? No.
> >
> > Will anybody ever notice? Also no.
> >
> > Will anybody ever care? No sir.
> 
> Just spun a test with that loop. It doesn't fix the leak.
> 
> I hate to be the guy that copy-pastas LLM but this is outside my
> knowledge of mm. Claude suggests:
> "Each pagetable_free() on the tails is a no-op: When
> alloc_pages_node(node, gfp, order=9) returns without __GFP_COMP, the
> buddy allocator only sets _refcount = 1 on the head page. The other
> 511 pages (page[1] … page[511]) have _refcount = 0. There's no
> compound metadata, so they aren't "tails" in the folio sense either —
> they're just contiguous pages whose refcounts the allocator never
> touched."
> 
> Any ideas?
> 
> Thanks.
> 
> >
> > Can you measure the difference? I'd wager a beer: No again.
> >
> > Even if someone manages to notice, then you have a clear path to fix it
> > *right*: fix the ptdesc data structure to represent high-order allocations.


-- 
Sincerely yours,
Mike.
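
The refcount explanation quoted in the thread can be modeled in user space.
What follows is a toy sketch, not kernel code: struct toy_page, toy_alloc(),
toy_free_one() and toy_free_order() are hypothetical names invented for
illustration. It assumes, as the quoted explanation does, that a non-compound
order-9 allocation leaves only the head page with a reference, and that
dropping a reference on a page that holds none frees nothing.

```c
#include <stddef.h>

#define TOY_ORDER  9
#define TOY_NPAGES (1 << TOY_ORDER)   /* 512 pages of 4 KB = 2 MB */

/* Toy stand-in for struct page: a refcount and a "freed" flag only. */
struct toy_page {
	int refcount;
	int freed;
};

/* Model a non-compound high-order allocation: only the head is referenced. */
static void toy_alloc(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].refcount = 0;
		pages[i].freed = 0;
	}
	pages[0].refcount = 1;
}

/*
 * Model an order-0 free of a single page: the page is released only if a
 * reference was held and this drop takes it to zero. ASSUMPTION: a page
 * with refcount 0 is simply left alone.
 */
static void toy_free_one(struct toy_page *p)
{
	if (p->refcount > 0 && --p->refcount == 0)
		p->freed = 1;
}

/* Model a free that honors the order: the head's last reference releases all n pages. */
static void toy_free_order(struct toy_page *pages, size_t n)
{
	if (pages[0].refcount > 0 && --pages[0].refcount == 0)
		for (size_t i = 0; i < n; i++)
			pages[i].freed = 1;
}

static int toy_count_leaked(const struct toy_page *pages, size_t n)
{
	int leaked = 0;

	for (size_t i = 0; i < n; i++)
		if (!pages[i].freed)
			leaked++;
	return leaked;
}

/* Per-page loop over the chunk, as in the suggested workaround. */
int leaked_with_loop(void)
{
	struct toy_page pages[TOY_NPAGES];

	toy_alloc(pages, TOY_NPAGES);
	for (size_t i = 0; i < TOY_NPAGES; i++)
		toy_free_one(&pages[i]);
	return toy_count_leaked(pages, TOY_NPAGES);
}

/* Single free of the whole chunk using the allocation order. */
int leaked_with_order(void)
{
	struct toy_page pages[TOY_NPAGES];

	toy_alloc(pages, TOY_NPAGES);
	toy_free_order(pages, TOY_NPAGES);
	return toy_count_leaked(pages, TOY_NPAGES);
}
```

Under these assumptions the per-page loop releases only the head, which
matches the "511 of 512 pages per 2 MB chunk" figure in the patch comment.
The order-aware free releases the whole chunk. The model says nothing about
what the real allocator does on a refcount underflow.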
