Like teaching the zap, mprotect, rmap-walk ... code separately.

I'm sure you'll find a way to break this down so I don't walk out of a
review with a headache ;)


:) I had smaller chunks earlier, but then ran into the "don't add the
change unless you use the change" problem


It's perfectly reasonable to have something like

mm/huge_memory: teach copy_huge_pmd() about huge device-private entries
mm/huge_memory: support splitting device-private folios

...

etc :)
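
E.g., the copy_huge_pmd() patch could then be fairly small. A rough,
hypothetical sketch, loosely modeled on what copy_nonpresent_pte()
already does for PTE-sized device-private entries (folio ref/rmap
duplication left out):

if (unlikely(is_swap_pmd(pmd))) {
        swp_entry_t entry = pmd_to_swp_entry(pmd);

        if (is_writable_device_private_entry(entry)) {
                /* fork: downgrade to a readable device-private entry */
                entry = make_readable_device_private_entry(swp_offset(entry));
                pmd = swp_entry_to_pmd(entry);
                if (pmd_swp_soft_dirty(*src_pmd))
                        pmd = pmd_swp_mksoft_dirty(pmd);
                set_pmd_at(src_mm, addr, src_pmd, pmd);
        }
        ...
}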

[...]

Careful: There is is_readable_exclusive_migration_entry(). So don't
change the !is_readable_migration_entry(entry) to
is_writable_migration_entry(entry), because it's wrong.


Ack, I assume you are referring to potential prot_none entries?

readable_exclusive entries are used to maintain the PageAnonExclusive bit
right now for migration entries. So it's not related to prot_none.
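
For reference, this is roughly how the flavor gets picked when the
migration entry is created (paraphrasing try_to_migrate_one(); names
simplified):

if (writable)
        entry = make_writable_migration_entry(page_to_pfn(subpage));
else if (anon_exclusive)
        /* remembers PageAnonExclusive across migration */
        entry = make_readable_exclusive_migration_entry(page_to_pfn(subpage));
else
        entry = make_readable_migration_entry(page_to_pfn(subpage));

So !is_readable_migration_entry(entry) matches both the writable and the
readable-exclusive flavor, while is_writable_migration_entry(entry) only
matches the first one.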

[...]

-            WARN_ONCE(1, "Non present huge pmd without pmd migration enabled!");
+
+            if (!thp_migration_supported())
+                WARN_ONCE(1, "Non present huge pmd without pmd migration enabled!");
+
+            if (is_pmd_device_private_entry(orig_pmd)) {
+                folio_remove_rmap_pmd(folio, &folio->page, vma);
+                WARN_ON_ONCE(folio_mapcount(folio) < 0);

Can we just move that into the folio_is_device_private() check below?

The check you mean?

The whole thing like

if (...) {
        folio_remove_rmap_pmd(folio, &folio->page, vma);
        WARN_ON_ONCE(folio_mapcount(folio) < 0);
        folio_put(folio);
}


[...]


Why do we have to flush? pmd_clear() might be sufficient? In the PTE case
we use pte_clear().

Without the flush, other entities will not see the cleared pmd. And isn't
pte_clear() only used when should_defer_flush() is true?

It's a non-present page entry, so there should be no TLB entry to flush.
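
IOW, something like this should be sufficient (sketch; instead of a
flushing variant like pmdp_huge_clear_flush()):

orig_pmd = *pmdp;       /* non-present device-private entry */
pmd_clear(pmdp);        /* no CPU can have a TLB entry for it */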



[...]

           pmde = pmd_mksoft_dirty(pmde);
       if (is_writable_migration_entry(entry))
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index e05e14d6eacd..0ed337f94fcd 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -136,6 +136,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
                * page table entry. Other special swap entries are not
                * migratable, and we ignore regular swapped page.
                */
+            struct folio *folio;
+
               entry = pte_to_swp_entry(pte);
               if (!is_device_private_entry(entry))
                   goto next;
@@ -147,6 +149,51 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
                   pgmap->owner != migrate->pgmap_owner)
                   goto next;
+            folio = page_folio(page);
+            if (folio_test_large(folio)) {
+                struct folio *new_folio;
+                struct folio *new_fault_folio = NULL;
+
+                /*
+                 * The reason for finding a present pmd with a
+                 * device-private pte mapping a large folio is a
+                 * partial unmap. Split the folio now for the
+                 * migration to be handled correctly.
+                 */

There are also other cases, like any VMA splits. Not sure if that makes a
difference; either way, the folio is PTE-mapped.


Ack, I can clarify that the folio is just pte-mapped, or remove the comment.

Sounds good.
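
FWIW, a very rough sketch of the split step being discussed (hypothetical;
the real code must drop the pte lock before splitting, and the
new_fault_folio handling elided above is left out):

folio = page_folio(page);
if (folio_test_large(folio)) {
        /*
         * The large folio is only pte-mapped here (partial unmap,
         * VMA split, ...); split it so that each pte maps a small
         * folio the migration code can handle.
         */
        folio_get(folio);
        if (!folio_trylock(folio)) {
                folio_put(folio);
                goto next;
        }
        ret = split_folio(folio);
        folio_unlock(folio);
        folio_put(folio);
        if (ret)
                goto next;
}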


--
Cheers

David / dhildenb
