Migration: find correct vma in new_vma_page()

Linux Kernel Mailing List Wed, 14 Nov 2007 19:59:40 -0800

Gitweb:     http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3ad33b2436b545cbe8b28e53f3710432cad457ab
Commit:     3ad33b2436b545cbe8b28e53f3710432cad457ab
Parent:     e1a1c997afe907e6ec4799e4be0f38cffd8b418c
Author:     Lee Schermerhorn <[EMAIL PROTECTED]>
AuthorDate: Wed Nov 14 16:59:10 2007 -0800
Committer:  Linus Torvalds <[EMAIL PROTECTED]>
CommitDate: Wed Nov 14 18:45:38 2007 -0800


    Migration: find correct vma in new_vma_page()
    
    We hit the BUG_ON() in mm/rmap.c:vma_address() when trying to migrate via
    mbind(MPOL_MF_MOVE) a non-anon region that spans multiple vmas.  For
    anon-regions, we just fail to migrate any pages beyond the 1st vma in the
    range.
    
    This occurs because do_mbind() collects a list of pages to migrate by
    calling check_range().  check_range() walks the task's mm, spanning vmas as
    necessary, to collect the migratable pages into a list.  Then, do_mbind()
    calls migrate_pages() passing the list of pages, a function to allocate new
    pages based on vma policy [new_vma_page()], and a pointer to the first vma
    of the range.
    
    For each page in the list, new_vma_page() calls page_address_in_vma()
    passing the page and the vma [first in range] to obtain the address to
    pass to alloc_page_vma().  The page address is needed to get interleaving
    policy correct.  If the pages in the list come from multiple vmas,
    eventually, new_vma_page() will pass a page to page_address_in_vma()
    with the incorrect vma.  For !PageAnon pages, this will result in a bug
    check in rmap.c:vma_address().  For anon pages, vma_address() will just
    return -EFAULT and fail the migration.
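
To make the failure concrete, here is a minimal userspace sketch of the
offset arithmetic in vma_address() as it stands after this patch.  The
struct model_vma type, MODEL_PAGE_SHIFT, and the two sample vmas are
assumptions invented for illustration, not kernel definitions, and the
real page_address_in_vma() also checks the page's mapping before it
reaches this arithmetic.

```c
#include <errno.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields used here. */
struct model_vma {
	unsigned long vm_start;	/* first address mapped by the vma */
	unsigned long vm_end;	/* first address past the vma */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

/*
 * Same arithmetic as the patched vma_address(): compute where the page
 * at offset pgoff would sit in vma, or return -EFAULT if that address
 * falls outside the vma.
 */
static unsigned long model_vma_address(unsigned long pgoff,
				       const struct model_vma *vma)
{
	unsigned long address;

	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	if (address < vma->vm_start || address >= vma->vm_end)
		return (unsigned long)-EFAULT;
	return address;
}

/* Two adjacent vmas over one object, e.g. split by differing policies. */
static const struct model_vma model_first  = { 0x10000, 0x14000, 0 };
static const struct model_vma model_second = { 0x14000, 0x18000, 4 };
```

With these sample values, the page at offset 5 belongs to model_second
and resolves to 0x15000 there; asking model_first about the same page
yields -EFAULT.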
    
    This patch modifies new_vma_page() to check the return value from
    page_address_in_vma().  If the return value is -EFAULT, new_vma_page()
    searches forward via vm_next for the vma that maps the page--i.e., that
    does not return -EFAULT.  This assumes that the pages in the list handed
    to migrate_pages() are in address order.  This is currently the case.
    The patch documents this assumption in a new comment block for
    new_vma_page().
    
    If new_vma_page() cannot locate the vma mapping the page in a forward
    search in the mm, it will pass a NULL vma to alloc_page_vma().  This will
    result in the allocation using the task policy, if any, else system default
    policy.  This situation is unlikely, but the patch documents this behavior
    with a comment.
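
The forward search and the NULL fallback can be sketched in userspace as
follows.  This is only a model under stated assumptions: struct model_vma,
model_vma_address() and model_find_vma() are hypothetical names, and the
model covers just the offset check, not the mapping checks that the
kernel's page_address_in_vma() also performs.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields used here. */
struct model_vma {
	unsigned long vm_start, vm_end, vm_pgoff;
	struct model_vma *vm_next;	/* next vma in address order */
};

/* Offset check modelled on the patched vma_address(). */
static unsigned long model_vma_address(unsigned long pgoff,
				       const struct model_vma *vma)
{
	unsigned long address;

	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	if (address < vma->vm_start || address >= vma->vm_end)
		return (unsigned long)-EFAULT;
	return address;
}

/*
 * Mirror of the loop added to new_vma_page(): start at first and walk
 * vm_next until some vma maps pgoff.  Returns that vma, or NULL if the
 * walk runs off the end of the list; *address is written only on success.
 */
static struct model_vma *model_find_vma(unsigned long pgoff,
					struct model_vma *first,
					unsigned long *address)
{
	struct model_vma *vma = first;

	while (vma) {
		unsigned long a = model_vma_address(pgoff, vma);

		if (a != (unsigned long)-EFAULT) {
			*address = a;
			break;
		}
		vma = vma->vm_next;
	}
	return vma;	/* NULL here corresponds to the task/default policy case */
}
```

A NULL return corresponds to the case where alloc_page_vma() is handed a
NULL vma and falls back to the task or system default policy.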
    
    Note, this patch results in restarting from the first vma in a multi-vma
    range each time new_vma_page() is called.  If this is not acceptable, we
    can make the vma argument a pointer, both in new_vma_page() and its caller
    unmap_and_move() so that the value held by the loop in migrate_pages()
    always passes down the last vma in which a page was found.  This will
    require changes to all new_page_t functions passed to migrate_pages().  Is
    this necessary?
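
For discussion, the alternative raised above might look roughly like the
userspace sketch below, where the caller keeps a cursor that remembers the
last vma in which a page was found.  Nothing here is part of this patch:
struct model_vma, model_vma_address() and model_find_vma_cached() are
hypothetical names, and leaving the cursor untouched on failure is an
assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields used here. */
struct model_vma {
	unsigned long vm_start, vm_end, vm_pgoff;
	struct model_vma *vm_next;	/* next vma in address order */
};

/* Offset check modelled on the patched vma_address(). */
static unsigned long model_vma_address(unsigned long pgoff,
				       const struct model_vma *vma)
{
	unsigned long address;

	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	if (address < vma->vm_start || address >= vma->vm_end)
		return (unsigned long)-EFAULT;
	return address;
}

/*
 * Search forward from *cursor instead of from the first vma of the
 * range.  On success, advance *cursor so the next lookup resumes at the
 * vma that mapped this page.  On failure, return NULL and leave *cursor
 * as it was (an assumption of this sketch).
 */
static struct model_vma *model_find_vma_cached(unsigned long pgoff,
					       struct model_vma **cursor,
					       unsigned long *address)
{
	struct model_vma *vma;

	for (vma = *cursor; vma; vma = vma->vm_next) {
		unsigned long a = model_vma_address(pgoff, vma);

		if (a != (unsigned long)-EFAULT) {
			*address = a;
			*cursor = vma;
			return vma;
		}
	}
	return NULL;
}
```

Such a cursor leans even harder on the address-order assumption: once it
has advanced, a page that belongs to an earlier vma is no longer found.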
    
    For this patch to work, we can't bug check in vma_address() for pages
    outside the argument vma.  This patch removes the BUG_ON().  All other
    callers [besides new_vma_page()] already check the return status.
    
    Tested on x86_64, 4 node NUMA platform.
    
    Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]>
    Acked-by: Christoph Lameter <[EMAIL PROTECTED]>
    Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
---
 mm/mempolicy.c |   21 +++++++++++++++++++--
 mm/rmap.c      |    7 ++++---
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index c1592a9..83c69f8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -722,12 +722,29 @@ out:
 
 }
 
+/*
+ * Allocate a new page for page migration based on vma policy.
+ * Start assuming that page is mapped by vma pointed to by @private.
+ * Search forward from there, if not.  N.B., this assumes that the
+ * list of pages handed to migrate_pages()--which is how we get here--
+ * is in virtual address order.
+ */
 static struct page *new_vma_page(struct page *page, unsigned long private, int **x)
 {
        struct vm_area_struct *vma = (struct vm_area_struct *)private;
+       unsigned long uninitialized_var(address);
 
-       return alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
-                                       page_address_in_vma(page, vma));
+       while (vma) {
+               address = page_address_in_vma(page, vma);
+               if (address != -EFAULT)
+                       break;
+               vma = vma->vm_next;
+       }
+
+       /*
+        * if !vma, alloc_page_vma() will use task or system default policy
+        */
+       return alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
 }
 #else
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 8990f90..dc3be5f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -183,7 +183,9 @@ static void page_unlock_anon_vma(struct anon_vma *anon_vma)
 }
 
 /*
- * At what user virtual address is page expected in vma?
+ * At what user virtual address is page expected in @vma?
+ * Returns virtual address or -EFAULT if page's index/offset is not
+ * within the range mapped the @vma.
  */
 static inline unsigned long
 vma_address(struct page *page, struct vm_area_struct *vma)
@@ -193,8 +195,7 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 
        address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
        if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
-               /* page should be within any vma from prio_tree_next */
-               BUG_ON(!PageAnon(page));
+               /* page should be within @vma mapping range */
                return -EFAULT;
        }
        return address;