4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill.shute...@linux.intel.com>

commit 0a85e51d37645e9ce57e5e1a30859e07810ed07c upstream.

Patch series "thp: fix few MADV_DONTNEED races"

For MADV_DONTNEED to work properly with huge pages, it's critical to not
clear pmd intermittently unless you hold down_write(mmap_sem).

Otherwise MADV_DONTNEED can miss the THP which can lead to userspace
breakage.

See example of such race in commit message of patch 2/4.

All these races are found by code inspection.  I haven't seen them
triggered.  I don't think it's worth to apply them to stable@.

This patch (of 4):

Restructure code in preparation for a fix.

Link: 
http://lkml.kernel.org/r/20170302151034.27829-2-kirill.shute...@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
Acked-by: Vlastimil Babka <vba...@suse.cz>
Cc: Andrea Arcangeli <aarca...@redhat.com>
Cc: Hillf Danton <hillf...@alibaba-inc.com>
Cc: <sta...@vger.kernel.org>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
[jwang: adjust context for 4.4 kernel]
Signed-off-by: Jack Wang <jinpu.w...@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
---
 mm/huge_memory.c |   54 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 26 deletions(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1566,35 +1566,37 @@ int change_huge_pmd(struct vm_area_struc
 {
        struct mm_struct *mm = vma->vm_mm;
        spinlock_t *ptl;
+       pmd_t entry;
+       bool preserve_write;
+
        int ret = 0;
 
-       if (__pmd_trans_huge_lock(pmd, vma, &ptl) == 1) {
-               pmd_t entry;
-               bool preserve_write = prot_numa && pmd_write(*pmd);
-               ret = 1;
-
-               /*
-                * Avoid trapping faults against the zero page. The read-only
-                * data is likely to be read-cached on the local CPU and
-                * local/remote hits to the zero page are not interesting.
-                */
-               if (prot_numa && is_huge_zero_pmd(*pmd)) {
-                       spin_unlock(ptl);
-                       return ret;
-               }
-
-               if (!prot_numa || !pmd_protnone(*pmd)) {
-                       entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
-                       entry = pmd_modify(entry, newprot);
-                       if (preserve_write)
-                               entry = pmd_mkwrite(entry);
-                       ret = HPAGE_PMD_NR;
-                       set_pmd_at(mm, addr, pmd, entry);
-                       BUG_ON(!preserve_write && pmd_write(entry));
-               }
-               spin_unlock(ptl);
-       }
+       if (__pmd_trans_huge_lock(pmd, vma, &ptl) != 1)
+               return 0;
+
+       preserve_write = prot_numa && pmd_write(*pmd);
+       ret = 1;
+
+       /*
+        * Avoid trapping faults against the zero page. The read-only
+        * data is likely to be read-cached on the local CPU and
+        * local/remote hits to the zero page are not interesting.
+        */
+       if (prot_numa && is_huge_zero_pmd(*pmd))
+               goto unlock;
+
+       if (prot_numa && pmd_protnone(*pmd))
+               goto unlock;
 
+       entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
+       entry = pmd_modify(entry, newprot);
+       if (preserve_write)
+               entry = pmd_mkwrite(entry);
+       ret = HPAGE_PMD_NR;
+       set_pmd_at(mm, addr, pmd, entry);
+       BUG_ON(!preserve_write && pmd_write(entry));
+unlock:
+       spin_unlock(ptl);
        return ret;
 }
 


Reply via email to