This is a note to let you know that I've just added the patch titled

    hugetlb: fix race condition in hugetlb_fault()

to the 3.2-stable tree which can be found at:
    
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     hugetlb-fix-race-condition-in-hugetlb_fault.patch
and it can be found in the queue-3.2 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <[email protected]> know about it.


From 66aebce747eaf9bc456bf1f1b217d8db843031d0 Mon Sep 17 00:00:00 2001
From: Chris Metcalf <[email protected]>
Date: Thu, 12 Apr 2012 12:49:15 -0700
Subject: hugetlb: fix race condition in hugetlb_fault()

From: Chris Metcalf <[email protected]>

commit 66aebce747eaf9bc456bf1f1b217d8db843031d0 upstream.

The race is as follows:

Suppose a multi-threaded task forks a new process (on cpu A), thus
bumping up the ref count on all the pages.  While the fork is occurring
(and thus we have marked all the PTEs as read-only), another thread in
the original process (on cpu B) tries to write to a huge page, taking an
access violation from the write-protect and calling hugetlb_cow().  Now,
suppose the fork() fails.  It will undo the COW and decrement the ref
count on the pages, so the ref count on the huge page drops back to 1.
Meanwhile hugetlb_cow() also decrements the ref count by one on the
original page, since the original address space doesn't need it any
more, having copied a new page to replace the original page.  This
leaves the ref count at zero, and when we call unlock_page(), we panic.

        fork on CPU A                           fault on CPU B
        =============                           ==============
        ...
        down_write(&parent->mmap_sem);
        down_write_nested(&child->mmap_sem);
        ...
        while duplicating vmas
                if error
                        break;
        ...
        up_write(&child->mmap_sem);
        up_write(&parent->mmap_sem);            ...
                                                down_read(&parent->mmap_sem);
                                                ...
                                                lock_page(page);
                                                handle COW
                                                page_mapcount(old_page) == 2
                                                alloc and prepare new_page
        ...
        handle error
        page_remove_rmap(page);
        put_page(page);
        ...
                                                fold new_page into pte
                                                page_remove_rmap(page);
                                                put_page(page);
                                                ...
                                oops ==>        unlock_page(page);
                                                up_read(&parent->mmap_sem);

The solution is to take an extra reference to the page while we are
holding the lock on it.
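
As a rough illustration only, the reference counting described above can
be modelled in userspace.  This is a sketch, not kernel code: toy_page,
toy_get_page(), toy_put_page(), toy_unlock_page() and
toy_fault_survives() are hypothetical names invented for this example,
and the two CPUs are replayed as one fixed sequential interleaving rather
than as real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page (hypothetical, not the kernel type). */
struct toy_page {
	int  count;	/* models the page reference count */
	bool locked;	/* models the page lock bit */
	bool freed;	/* set once the last reference is dropped */
};

static void toy_get_page(struct toy_page *p)
{
	assert(!p->freed);	/* taking a ref on a freed page is a bug */
	p->count++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->count > 0);
	if (--p->count == 0)
		p->freed = true;	/* last reference gone: page is freed */
}

/* Returns true if the unlock hit a live page, false if it was freed. */
static bool toy_unlock_page(struct toy_page *p)
{
	if (p->freed)
		return false;	/* the point where the real kernel oopses */
	p->locked = false;
	return true;
}

/*
 * Replay the interleaving from the diagram.  With take_extra_ref set,
 * the fault path pins the page for as long as it holds the page lock.
 */
bool toy_fault_survives(bool take_extra_ref)
{
	/* After fork has duplicated the mapping: parent + child = 2. */
	struct toy_page page = { .count = 2, .locked = false, .freed = false };
	bool ok;

	/* CPU B: fault path pins (optionally) and locks the page. */
	if (take_extra_ref)
		toy_get_page(&page);
	page.locked = true;

	/* CPU A: failed fork drops the child's reference. */
	toy_put_page(&page);

	/* CPU B: COW path drops the parent's reference to the old page. */
	toy_put_page(&page);

	/* CPU B: unlock, then release the pin if one was taken. */
	ok = toy_unlock_page(&page);
	if (ok && take_extra_ref)
		toy_put_page(&page);
	return ok;
}
```

In this model toy_fault_survives(false) reports an unlock on a freed
page, while toy_fault_survives(true) does not, because the extra
reference keeps the count above zero until after the unlock.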

Signed-off-by: Chris Metcalf <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
 mm/hugetlb.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2686,6 +2686,7 @@ int hugetlb_fault(struct mm_struct *mm,
         * so no worry about deadlock.
         */
        page = pte_page(entry);
+       get_page(page);
        if (page != pagecache_page)
                lock_page(page);
 
@@ -2717,6 +2718,7 @@ out_page_table_lock:
        }
        if (page != pagecache_page)
                unlock_page(page);
+       put_page(page);
 
 out_mutex:
        mutex_unlock(&hugetlb_instantiation_mutex);


Patches currently in stable-queue which might be from [email protected] are

queue-3.2/hugetlb-fix-race-condition-in-hugetlb_fault.patch
