This is a note to let you know that I've just added the patch titled
mm: thp: fix BUG on mm->nr_ptes
to the 3.0-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
mm-thp-fix-bug-on-mm-nr_ptes.patch
and it can be found in the queue-3.0 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <[email protected]> know about it.
>From 1c641e84719429bbfe62a95ed3545ee7fe24408f Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <[email protected]>
Date: Mon, 5 Mar 2012 14:59:20 -0800
Subject: mm: thp: fix BUG on mm->nr_ptes
From: Andrea Arcangeli <[email protected]>
commit 1c641e84719429bbfe62a95ed3545ee7fe24408f upstream.
Dave Jones reports a few Fedora users hitting the BUG_ON(mm->nr_ptes...)
in exit_mmap() recently.
Quoting Hugh's discovery and explanation of the SMP race condition:
"mm->nr_ptes had unusual locking: down_read mmap_sem plus
page_table_lock when incrementing, down_write mmap_sem (or mm_users
0) when decrementing; whereas THP is careful to increment and
decrement it under page_table_lock.
Now most of those paths in THP also hold mmap_sem for read or write
(with appropriate checks on mm_users), but two do not: when
split_huge_page() is called by hwpoison_user_mappings(), and when
called by add_to_swap().
It's conceivable that the latter case is responsible for the
exit_mmap() BUG_ON mm->nr_ptes that has been reported on Fedora."
The simplest way to fix it without having to alter the locking is to make
split_huge_page() a noop in nr_ptes terms, so by counting the preallocated
pagetables that exists for every mapped hugepage. It was an arbitrary
choice not to count them and either way is not wrong or right, because
they are not used but they're still allocated.
Reported-by: Dave Jones <[email protected]>
Reported-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrea Arcangeli <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/huge_memory.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -641,6 +641,7 @@ static int __do_huge_pmd_anonymous_page(
set_pmd_at(mm, haddr, pmd, entry);
prepare_pmd_huge_pte(pgtable, mm);
add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
+ mm->nr_ptes++;
spin_unlock(&mm->page_table_lock);
}
@@ -759,6 +760,7 @@ int copy_huge_pmd(struct mm_struct *dst_
pmd = pmd_mkold(pmd_wrprotect(pmd));
set_pmd_at(dst_mm, addr, dst_pmd, pmd);
prepare_pmd_huge_pte(pgtable, dst_mm);
+ dst_mm->nr_ptes++;
ret = 0;
out_unlock:
@@ -857,7 +859,6 @@ static int do_huge_pmd_wp_page_fallback(
}
kfree(pages);
- mm->nr_ptes++;
smp_wmb(); /* make pte visible before pmd */
pmd_populate(mm, pmd, pgtable);
page_remove_rmap(page);
@@ -1016,6 +1017,7 @@ int zap_huge_pmd(struct mmu_gather *tlb,
VM_BUG_ON(page_mapcount(page) < 0);
add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
VM_BUG_ON(!PageHead(page));
+ tlb->mm->nr_ptes--;
spin_unlock(&tlb->mm->page_table_lock);
tlb_remove_page(tlb, page);
pte_free(tlb->mm, pgtable);
@@ -1310,7 +1312,6 @@ static int __split_huge_page_map(struct
pte_unmap(pte);
}
- mm->nr_ptes++;
smp_wmb(); /* make pte visible before pmd */
/*
* Up to this point the pmd is present and huge and
@@ -1925,7 +1926,6 @@ static void collapse_huge_page(struct mm
set_pmd_at(mm, address, pmd, _pmd);
update_mmu_cache(vma, address, entry);
prepare_pmd_huge_pte(pgtable, mm);
- mm->nr_ptes--;
spin_unlock(&mm->page_table_lock);
#ifndef CONFIG_NUMA
Patches currently in stable-queue which might be from [email protected] are
queue-3.0/mm-thp-fix-bug-on-mm-nr_ptes.patch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html