The memory barrier in __do_huge_pmd_anonymous_page does not work now.
lru_cache_add_lru uses a pagevec, so it can easily skip taking the
spinlock. The above rule is then broken and a user might see
inconsistent data.

I was not the first person to point out this problem. Mel and Peter
pointed it out a few months ago, and Peter further pointed out that
even spin_lock/unlock cannot guarantee the ordering.
http://marc.info/?t=134333512700004

        In particular:

                *A = a;
                LOCK
                UNLOCK
                *B = b;

        may occur as:

                LOCK, STORE *B, STORE *A, UNLOCK

Finally, Hugh pointed out that we don't even need a memory barrier
there, because __SetPageUptodate has already provided one explicitly
since Nick's [1].

So this patch fixes the comment in the THP path and adds the same
comment to do_anonymous_page, too, because everybody except Hugh was
missing that. It means we need a comment about it.

[1] 0ed361dec "mm: fix PageUptodate data race"

Cc: Mel Gorman <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Kamezawa Hiroyuki <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Acked-by: Andrea Arcangeli <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
---
* from v1
  * Add Acked-by from Andrea
  * Comment correction
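
For readers who want to experiment with the ordering rule the new
comments describe, here is a minimal user-space sketch. It is only an
analogy and not kernel code: it assumes C11 atomics and pthreads, and
the names page, mapped, fault_path, reader_sees_initialized_page and
run_demo are all hypothetical. The release fence stands in for the
write barrier inside __SetPageUptodate, and the pointer store stands in
for set_pmd_at()/set_pte_at().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096
#define FILL_BYTE  0xaa	/* non-zero so initialization is observable */

/* Hypothetical stand-ins: "page" plays the freshly allocated page,
 * "mapped" plays the page table entry that makes it reachable. */
static unsigned char page[PAGE_BYTES];
static _Atomic(unsigned char *) mapped;

/* Writer: initialize the page, order those stores, then publish it. */
static void *fault_path(void *arg)
{
	(void)arg;
	memset(page, FILL_BYTE, sizeof(page));	/* like clear_huge_page() */
	/* Analogy for the barrier inside __SetPageUptodate(): the stores
	 * above must be visible before the publishing store below. */
	atomic_thread_fence(memory_order_release);
	/* Analogy for set_pmd_at(): the page becomes reachable. */
	atomic_store_explicit(&mapped, page, memory_order_relaxed);
	return NULL;
}

/* Reader: wait until the page is published, then verify its contents. */
static int reader_sees_initialized_page(void)
{
	unsigned char *p;
	size_t i;

	while ((p = atomic_load_explicit(&mapped, memory_order_acquire)) == NULL)
		;	/* spin until the writer publishes */
	assert(p == page);
	for (i = 0; i < PAGE_BYTES; i++)
		if (p[i] != FILL_BYTE)
			return 0;	/* saw the page before its contents */
	return 1;
}

/* Returns 1 if the reader saw a fully initialized page, 0 if it saw
 * stale contents, -1 if the writer thread could not be created. */
int run_demo(void)
{
	pthread_t writer;
	int ok;

	if (pthread_create(&writer, NULL, fault_path, NULL) != 0)
		return -1;
	ok = reader_sees_initialized_page();
	pthread_join(writer, NULL);
	return ok;
}
```

Calling run_demo() from a main() and checking for 1 is enough to try
it. Without the release fence, the language no longer guarantees that
the reader sees the filled page, which is the same class of problem as
relying on a spinlock that may not be taken.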

 mm/huge_memory.c | 11 +++++------
 mm/memory.c      |  5 +++++
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e2f7f5aa..f2f17ff 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -713,6 +713,11 @@ static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
                return VM_FAULT_OOM;
 
        clear_huge_page(page, haddr, HPAGE_PMD_NR);
+       /*
+        * The memory barrier inside __SetPageUptodate makes sure that
+        * clear_huge_page writes become visible before the set_pmd_at()
+        * write.
+        */
        __SetPageUptodate(page);
 
        spin_lock(&mm->page_table_lock);
@@ -724,12 +729,6 @@ static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
        } else {
                pmd_t entry;
                entry = mk_huge_pmd(page, vma);
-               /*
-                * The spinlocking to take the lru_lock inside
-                * page_add_new_anon_rmap() acts as a full memory
-                * barrier to be sure clear_huge_page writes become
-                * visible after the set_pmd_at() write.
-                */
                page_add_new_anon_rmap(page, vma, haddr);
                set_pmd_at(mm, haddr, pmd, entry);
                pgtable_trans_huge_deposit(mm, pgtable);
diff --git a/mm/memory.c b/mm/memory.c
index 494526a..d0da51e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3196,6 +3196,11 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
        page = alloc_zeroed_user_highpage_movable(vma, address);
        if (!page)
                goto oom;
+       /*
+        * The memory barrier inside __SetPageUptodate makes sure that
+        * preceding stores to the page contents become visible before
+        * the set_pte_at() write.
+        */
        __SetPageUptodate(page);
 
        if (mem_cgroup_newpage_charge(page, mm, GFP_KERNEL))
-- 
1.8.2
