On Thu, Mar 5, 2026 at 9:55 AM Usama Arif <[email protected]> wrote:
>
>
>
> On 02/03/2026 21:20, Nico Pache wrote:
> > On Thu, Feb 26, 2026 at 4:34 AM Usama Arif <[email protected]> wrote:
> >>
> >> Device memory migration has two call sites that split huge PMDs:
> >>
> >> migrate_vma_split_unmapped_folio():
> >>   Called from migrate_vma_pages() when migrating a PMD-mapped THP to a
> >>   destination that doesn't support compound pages.  It splits the PMD
> >>   then splits the folio via folio_split_unmapped().
> >>
> >>   If the PMD split fails, folio_split_unmapped() would operate on an
> >>   unsplit folio with inconsistent page table state.  Propagate -ENOMEM
> >>   to skip this page's migration. This is safe as folio_split_unmapped
> >>   failure would be propagated in a similar way.
> >>
> >> migrate_vma_insert_page():
> >>   Called from migrate_vma_pages() when inserting a page into a VMA
> >>   during migration back from device memory.  If a huge zero PMD exists
> >>   at the target address, it must be split before PTE insertion.
> >>
> >>   If the split fails, the subsequent pte_alloc() and set_pte_at() would
> >>   operate on a PMD slot still occupied by the huge zero entry.  Use
> >>   goto abort, consistent with other allocation failures in this function.
> >>
> >> Signed-off-by: Usama Arif <[email protected]>
> >> ---
> >>  mm/migrate_device.c | 16 ++++++++++++++--
> >>  1 file changed, 14 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> >> index 78c7acf024615..bc53e06fd9735 100644
> >> --- a/mm/migrate_device.c
> >> +++ b/mm/migrate_device.c
> >> @@ -909,7 +909,13 @@ static int migrate_vma_split_unmapped_folio(struct 
> >> migrate_vma *migrate,
> >>         int ret = 0;
> >>
> >>         folio_get(folio);
> >
> > Should we be concerned about this folio_get? Are we incrementing a
> > reference that was already held if we back out of the split?
> >
> > -- Nico
>
>
>
> Hi Nico,
>
> Thanks for pointing this out. It spun out to an entire investigation for me 
> [1].

Hey Usama,

I'm sorry my question lead you down a rabbit hole but I'm glad you did
the proper investigation and found the correct answer :) Thanks for
looking into it and for clearing that up via a comment!

Cheers,
-- Nico

>
> Similar to [1], I inserted trace prints [2] and created a new 
> __split_huge_pmd2
> that always returns -ENOMEM. Without folio_put on error [3], we get a 
> refcount of 2.
>
>        hmm-tests-129     [000] .l...     1.485514: __migrate_device_finalize: 
> FINALIZE[0]: src=ffb48827440e8000 dst=ffb48827440e8000 src==dst=1 
> refcount_src=2 mapcount_src=0 order_src=9 migrate=0 BEFORE 
> remove_migration_ptes
>        hmm-tests-129     [000] .l...     1.485517: __migrate_device_finalize: 
> FINALIZE[0]: src=ffb48827440e8000 refcount=3 mapcount=1 AFTER 
> remove_migration_ptes
>        hmm-tests-129     [000] .l...     1.485518: __migrate_device_finalize: 
> FINALIZE[0]: src=ffb48827440e8000 refcount=2 AFTER folio_put(src)
>
>
> With folio_put on error [4], we get a refcount of 1.
>
>        hmm-tests-129     [001] .....     1.492216: __migrate_device_finalize: 
> FINALIZE[0]: src=fff7b8be840f0000 dst=fff7b8be840f0000 src==dst=1 
> refcount_src=1 mapcount_src=0 order_src=9 migrate=0 BEFORE 
> remove_migration_ptes
>        hmm-tests-129     [001] .....     1.492219: __migrate_device_finalize: 
> FINALIZE[0]: src=fff7b8be840f0000 refcount=2 mapcount=1 AFTER 
> remove_migration_ptes
>        hmm-tests-129     [001] .....     1.492220: __migrate_device_finalize: 
> FINALIZE[0]: src=fff7b8be840f0000 refcount=1 AFTER folio_put(src)
>
>
> So we need folio_put for split_huge_pmd_address failure, but NOT for
> folio_split_unmapped.
>
>
> [1] 
> https://lore.kernel.org/all/[email protected]/
> [2] https://gist.github.com/uarif1/6abe4bedb85814e9be8d48a4fe742b41
> [3] https://gist.github.com/uarif1/f718af2113bc1a33484674b61b9dafcc
> [4] https://gist.github.com/uarif1/03c42f2549eaf2bc555e8b03e07a63c8
>


Reply via email to