4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrea Arcangeli <[email protected]>

commit 1e3921471354244f70fe268586ff94a97a6dd4df upstream.

This oops:

  kernel BUG at fs/hugetlbfs/inode.c:484!
  RIP: remove_inode_hugepages+0x3d0/0x410
  Call Trace:
    hugetlbfs_setattr+0xd9/0x130
    notify_change+0x292/0x410
    do_truncate+0x65/0xa0
    do_sys_ftruncate.constprop.3+0x11a/0x180
    SyS_ftruncate+0xe/0x10
    tracesys+0xd9/0xde

was caused by the lack of i_size check in hugetlb_mcopy_atomic_pte.

mmap() can still succeed beyond the end of the i_size after vmtruncate
zapped vmas in those ranges, but the faults must not succeed, and that
includes UFFDIO_COPY.

We could differentiate the retval to userland to represent a SIGBUS like
a page fault would do (vs SIGSEGV), but it doesn't seem very useful and
we'd need to pick a random retval as there's no meaningful syscall
retval that would differentiate from SIGSEGV and SIGBUS, there's just
-EFAULT.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrea Arcangeli <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: "Dr. David Alan Gilbert" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
 mm/hugetlb.c |   32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3977,6 +3977,9 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
                            unsigned long src_addr,
                            struct page **pagep)
 {
+       struct address_space *mapping;
+       pgoff_t idx;
+       unsigned long size;
        int vm_shared = dst_vma->vm_flags & VM_SHARED;
        struct hstate *h = hstate_vma(dst_vma);
        pte_t _dst_pte;
@@ -4014,13 +4017,24 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
        __SetPageUptodate(page);
        set_page_huge_active(page);
 
+       mapping = dst_vma->vm_file->f_mapping;
+       idx = vma_hugecache_offset(h, dst_vma, dst_addr);
+
        /*
         * If shared, add to page cache
         */
        if (vm_shared) {
-               struct address_space *mapping = dst_vma->vm_file->f_mapping;
-               pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
+               size = i_size_read(mapping->host) >> huge_page_shift(h);
+               ret = -EFAULT;
+               if (idx >= size)
+                       goto out_release_nounlock;
 
+               /*
+                * Serialization between remove_inode_hugepages() and
+                * huge_add_to_page_cache() below happens through the
+                * hugetlb_fault_mutex_table that here must be hold by
+                * the caller.
+                */
                ret = huge_add_to_page_cache(page, mapping, idx);
                if (ret)
                        goto out_release_nounlock;
@@ -4029,6 +4043,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
        ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
        spin_lock(ptl);
 
+       /*
+        * Recheck the i_size after holding PT lock to make sure not
+        * to leave any page mapped (as page_mapped()) beyond the end
+        * of the i_size (remove_inode_hugepages() is strict about
+        * enforcing that). If we bail out here, we'll also leave a
+        * page in the radix tree in the vm_shared case beyond the end
+        * of the i_size, but remove_inode_hugepages() will take care
+        * of it as soon as we drop the hugetlb_fault_mutex_table.
+        */
+       size = i_size_read(mapping->host) >> huge_page_shift(h);
+       ret = -EFAULT;
+       if (idx >= size)
+               goto out_release_unlock;
+
        ret = -EEXIST;
        if (!huge_pte_none(huge_ptep_get(dst_pte)))
                goto out_release_unlock;


Reply via email to