On Thu, Mar 5, 2026 at 9:55 AM Usama Arif <[email protected]> wrote: > > > > On 02/03/2026 21:20, Nico Pache wrote: > > On Thu, Feb 26, 2026 at 4:34 AM Usama Arif <[email protected]> wrote: > >> > >> Device memory migration has two call sites that split huge PMDs: > >> > >> migrate_vma_split_unmapped_folio(): > >> Called from migrate_vma_pages() when migrating a PMD-mapped THP to a > >> destination that doesn't support compound pages. It splits the PMD > >> then splits the folio via folio_split_unmapped(). > >> > >> If the PMD split fails, folio_split_unmapped() would operate on an > >> unsplit folio with inconsistent page table state. Propagate -ENOMEM > >> to skip this page's migration. This is safe as folio_split_unmapped > >> failure would be propagated in a similar way. > >> > >> migrate_vma_insert_page(): > >> Called from migrate_vma_pages() when inserting a page into a VMA > >> during migration back from device memory. If a huge zero PMD exists > >> at the target address, it must be split before PTE insertion. > >> > >> If the split fails, the subsequent pte_alloc() and set_pte_at() would > >> operate on a PMD slot still occupied by the huge zero entry. Use > >> goto abort, consistent with other allocation failures in this function. > >> > >> Signed-off-by: Usama Arif <[email protected]> > >> --- > >> mm/migrate_device.c | 16 ++++++++++++++-- > >> 1 file changed, 14 insertions(+), 2 deletions(-) > >> > >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c > >> index 78c7acf024615..bc53e06fd9735 100644 > >> --- a/mm/migrate_device.c > >> +++ b/mm/migrate_device.c > >> @@ -909,7 +909,13 @@ static int migrate_vma_split_unmapped_folio(struct > >> migrate_vma *migrate, > >> int ret = 0; > >> > >> folio_get(folio); > > > > Should we be concerned about this folio_get? Are we incrementing a > > reference that was already held if we back out of the split? > > > > -- Nico > > > > Hi Nico, > > Thanks for pointing this out. It spun out to an entire investigation for me > [1].
Hey Usama, I'm sorry my question lead you down a rabbit hole but I'm glad you did the proper investigation and found the correct answer :) Thanks for looking into it and for clearing that up via a comment! Cheers, -- Nico > > Similar to [1], I inserted trace prints [2] and created a new > __split_huge_pmd2 > that always returns -ENOMEM. Without folio_put on error [3], we get a > refcount of 2. > > hmm-tests-129 [000] .l... 1.485514: __migrate_device_finalize: > FINALIZE[0]: src=ffb48827440e8000 dst=ffb48827440e8000 src==dst=1 > refcount_src=2 mapcount_src=0 order_src=9 migrate=0 BEFORE > remove_migration_ptes > hmm-tests-129 [000] .l... 1.485517: __migrate_device_finalize: > FINALIZE[0]: src=ffb48827440e8000 refcount=3 mapcount=1 AFTER > remove_migration_ptes > hmm-tests-129 [000] .l... 1.485518: __migrate_device_finalize: > FINALIZE[0]: src=ffb48827440e8000 refcount=2 AFTER folio_put(src) > > > With folio_put on error [4], we get a refcount of 1. > > hmm-tests-129 [001] ..... 1.492216: __migrate_device_finalize: > FINALIZE[0]: src=fff7b8be840f0000 dst=fff7b8be840f0000 src==dst=1 > refcount_src=1 mapcount_src=0 order_src=9 migrate=0 BEFORE > remove_migration_ptes > hmm-tests-129 [001] ..... 1.492219: __migrate_device_finalize: > FINALIZE[0]: src=fff7b8be840f0000 refcount=2 mapcount=1 AFTER > remove_migration_ptes > hmm-tests-129 [001] ..... 1.492220: __migrate_device_finalize: > FINALIZE[0]: src=fff7b8be840f0000 refcount=1 AFTER folio_put(src) > > > So we need folio_put for split_huge_pmd_address failure, but NOT for > folio_split_unmapped. > > > [1] > https://lore.kernel.org/all/[email protected]/ > [2] https://gist.github.com/uarif1/6abe4bedb85814e9be8d48a4fe742b41 > [3] https://gist.github.com/uarif1/f718af2113bc1a33484674b61b9dafcc > [4] https://gist.github.com/uarif1/03c42f2549eaf2bc555e8b03e07a63c8 >
