Splitting a partially mapped folio caused a regression in the mremap
section of the Intel Xe SVM test suite, resulting in the following stack
trace:
INFO: task kworker/u65:2:1642 blocked for more than 30 seconds.
[ 212.624286] Tainted: G S W 6.18.0-rc6-xe+ #1719
[ 212.638288] Workqueue: xe_page_fault_work_queue xe_pagefault_queue_work [xe]
[ 212.638323] Call Trace:
[ 212.638324]  <TASK>
[ 212.638325]  __schedule+0x4b0/0x990
[ 212.638330]  schedule+0x22/0xd0
[ 212.638331]  io_schedule+0x41/0x60
[ 212.638333]  migration_entry_wait_on_locked+0x1d8/0x2d0
[ 212.638336]  ? __pfx_wake_page_function+0x10/0x10
[ 212.638339]  migration_entry_wait+0xd2/0xe0
[ 212.638341]  hmm_vma_walk_pmd+0x7c9/0x8d0
[ 212.638343]  walk_pgd_range+0x51d/0xa40
[ 212.638345]  __walk_page_range+0x75/0x1e0
[ 212.638347]  walk_page_range_mm+0x138/0x1f0
[ 212.638349]  hmm_range_fault+0x59/0xa0
[ 212.638351]  drm_gpusvm_get_pages+0x194/0x7b0 [drm_gpusvm_helper]
[ 212.638354]  drm_gpusvm_range_get_pages+0x2d/0x40 [drm_gpusvm_helper]
[ 212.638355]  __xe_svm_handle_pagefault+0x259/0x900 [xe]
[ 212.638375]  ? update_load_avg+0x7f/0x6c0
[ 212.638377]  ? update_curr+0x13d/0x170
[ 212.638379]  xe_svm_handle_pagefault+0x37/0x90 [xe]
[ 212.638396]  xe_pagefault_queue_work+0x2da/0x3c0 [xe]
[ 212.638420]  process_one_work+0x16e/0x2e0
[ 212.638422]  worker_thread+0x284/0x410
[ 212.638423]  ? __pfx_worker_thread+0x10/0x10
[ 212.638425]  kthread+0xec/0x210
[ 212.638427]  ? __pfx_kthread+0x10/0x10
[ 212.638428]  ? __pfx_kthread+0x10/0x10
[ 212.638430]  ret_from_fork+0xbd/0x100
[ 212.638433]  ? __pfx_kthread+0x10/0x10
[ 212.638434]  ret_from_fork_asm+0x1a/0x30
[ 212.638436]  </TASK>

The issue appears to be that migration PTEs are not properly removed
after a split because the retry handling after a split failure or
success is incorrect. Upon failure, collect the remainder of the range
as skipped; upon success, continue the collection from the current
position rather than restarting from the beginning of the range.

Also, while here, fix migrate_vma_split_folio() to only take a
reference on and lock the new fault folio if it is different from the
original fault folio (i.e., it is possible the folio being split is not
the same as the original fault folio).
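Condensed, the intended flow for the large-folio case in
migrate_vma_collect_pmd() after this fixup looks roughly as follows.
This is an illustrative sketch only, abridged from the hunk below; see
the actual diff for the complete change:

	if (folio && folio_test_large(folio)) {
		arch_leave_lazy_mmu_mode();
		pte_unmap_unlock(ptep, ptl);

		if (migrate_vma_split_folio(folio, migrate->fault_page)) {
			/* Split failed: record the rest of the range as skipped. */
			if (unmapped)
				flush_tlb_range(walk->vma, start, end);
			return migrate_vma_collect_skip(addr, end, walk);
		}

		/*
		 * Split succeeded: resume collection from the current
		 * position (addr) rather than rewinding to the start of
		 * the range.
		 */
		goto again;
	}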
Cc: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Joshua Hahn <[email protected]>
Cc: Rakie Kim <[email protected]>
Cc: Byungchul Park <[email protected]>
Cc: Gregory Price <[email protected]>
Cc: Ying Huang <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Nico Pache <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Dev Jain <[email protected]>
Cc: Barry Song <[email protected]>
Cc: Lyude Paul <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Simona Vetter <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Mika Penttilä <[email protected]>
Cc: Francois Dugast <[email protected]>
Cc: Balbir Singh <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
---
This fixup should be squashed into the patch "mm/migrate_device: handle
partially mapped folios during" in mm/mm-unstable
---
 mm/migrate_device.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index fa42d2ebd024..4506e96dcd20 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -110,8 +110,10 @@ static int migrate_vma_split_folio(struct folio *folio,
 		folio_unlock(folio);
 		folio_put(folio);
 	} else if (folio != new_fault_folio) {
-		folio_get(new_fault_folio);
-		folio_lock(new_fault_folio);
+		if (new_fault_folio != fault_folio) {
+			folio_get(new_fault_folio);
+			folio_lock(new_fault_folio);
+		}
 		folio_unlock(folio);
 		folio_put(folio);
 	}
@@ -266,10 +268,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			return 0;
 	}
 
-	ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
+	ptep = pte_offset_map_lock(mm, pmdp, start, &ptl);
 	if (!ptep)
 		goto again;
 	arch_enter_lazy_mmu_mode();
+	ptep += (addr - start) / PAGE_SIZE;
 
 	for (; addr < end; addr += PAGE_SIZE, ptep++) {
 		struct dev_pagemap *pgmap;
@@ -351,16 +354,18 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 		if (folio && folio_test_large(folio)) {
 			int ret;
 
+			arch_leave_lazy_mmu_mode();
 			pte_unmap_unlock(ptep, ptl);
 			ret = migrate_vma_split_folio(folio,
 						      migrate->fault_page);
 
 			if (ret) {
-				ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
-				goto next;
+				if (unmapped)
+					flush_tlb_range(walk->vma, start, end);
+
+				return migrate_vma_collect_skip(addr, end, walk);
 			}
 
-			addr = start;
 			goto again;
 		}
 		mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
-- 
2.34.1
