Subject: + mm-use-paravirt-friendly-ops-for-numa-hinting-ptes.patch added to -mm tree
To: 
[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected]
From: [email protected]
Date: Tue, 15 Apr 2014 13:17:30 -0700


The patch titled
     Subject: mm: use paravirt friendly ops for NUMA hinting ptes
has been added to the -mm tree.  Its filename is
     mm-use-paravirt-friendly-ops-for-numa-hinting-ptes.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-use-paravirt-friendly-ops-for-numa-hinting-ptes.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-use-paravirt-friendly-ops-for-numa-hinting-ptes.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: Mel Gorman <[email protected]>
Subject: mm: use paravirt friendly ops for NUMA hinting ptes

David Vrabel identified a regression when using automatic NUMA balancing
under Xen whereby page table entries were getting corrupted due to the use
of native PTE operations.  Quoting him:

        Xen PV guest page tables require that their entries use machine
        addresses if the present bit (_PAGE_PRESENT) is set, and (for
        successful migration) non-present PTEs must use pseudo-physical
        addresses.  This is because on migration MFNs in present PTEs are
        translated to PFNs (canonicalised) so they may be translated back
        to the new MFN in the destination domain (uncanonicalised).

        pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
        set and clear the _PAGE_PRESENT bit using pte_set_flags(),
        pte_clear_flags(), etc.

        In a Xen PV guest, these functions must translate MFNs to PFNs
        when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
        _PAGE_PRESENT.
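
To make the quoted requirement concrete, here is a minimal sketch in plain
user-space C.  It is a toy model, not kernel code: every toy_* name, the
bit positions and the four-entry P2M table are assumptions made up for the
illustration.  It only shows that flipping the present bit directly on the
raw value leaves a machine frame number in an entry that is no longer
present, which is the corruption described above.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: all names and bit positions here are hypothetical. */
typedef uint64_t pteval_t;

#define TOY_PAGE_PRESENT ((pteval_t)1 << 0)
#define TOY_PAGE_NUMA    ((pteval_t)1 << 8)
#define TOY_PAGE_SHIFT   12
#define TOY_NR_FRAMES    4

/* Pseudo-physical frame (PFN) -> machine frame (MFN); values are made up. */
static const uint64_t toy_p2m[TOY_NR_FRAMES] = { 7, 5, 6, 4 };

static inline uint64_t toy_pfn_to_mfn(uint64_t pfn)
{
	assert(pfn < TOY_NR_FRAMES);
	return toy_p2m[pfn];
}

/* Frame number stored in a raw entry, whatever kind of frame it is. */
static inline uint64_t toy_frame(pteval_t raw)
{
	return raw >> TOY_PAGE_SHIFT;
}

/* Raw entry for a present mapping of a PFN: it must hold the MFN. */
static inline pteval_t toy_make_present(uint64_t pfn)
{
	return (toy_pfn_to_mfn(pfn) << TOY_PAGE_SHIFT) | TOY_PAGE_PRESENT;
}

/* Native-style update: flips flag bits on the raw value, no translation. */
static inline pteval_t toy_native_mknuma(pteval_t raw)
{
	raw |= TOY_PAGE_NUMA;
	raw &= ~TOY_PAGE_PRESENT;
	return raw;
}

/* The quoted rule: present entries hold an MFN, non-present ones a PFN. */
static inline int toy_invariant_holds(pteval_t raw, uint64_t pfn)
{
	uint64_t expect = (raw & TOY_PAGE_PRESENT) ? toy_pfn_to_mfn(pfn) : pfn;

	return toy_frame(raw) == expect;
}
```

In this model, an entry built with toy_make_present(1) satisfies
toy_invariant_holds(), while the result of toy_native_mknuma() on it does
not, because the frame field still holds the MFN.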

His suggested fix converted p[te|md]_[set|clear]_flags to use
paravirt-friendly ops, but this is overkill.  He suggested an alternative
of using p[te|md]_modify in the NUMA page table operations, but this
does more work than necessary and would require looking up a VMA for
protections.

This patch modifies the NUMA page table operations to use the
paravirt-friendly accessors (p[te|md]_val and __p[te|md]) to set/clear the
flags of interest.  Unfortunately this will take a performance hit when
updating the PTEs on CONFIG_PARAVIRT kernels, but I do not see a way
around it that does not break Xen.
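
The read-modify-rebuild pattern can be sketched outside the kernel as
follows.  This is a toy model under stated assumptions, not the Xen
implementation: toy_pte_val() and toy_make_pte() are hypothetical stand-ins
for paravirt-backed pte_val() and __pte(), the bit positions are arbitrary,
and the P2M table is made up.  The point is only that when the flags are
changed on the value returned by the accessor and the entry is rebuilt
through the constructor, the frame translation follows the present bit.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: all names and bit positions here are hypothetical. */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;	/* raw, hardware-visible entry */

#define TOY_PAGE_PRESENT  ((pteval_t)1 << 0)
#define TOY_PAGE_ACCESSED ((pteval_t)1 << 5)
#define TOY_PAGE_NUMA     ((pteval_t)1 << 8)
#define TOY_PAGE_SHIFT    12
#define TOY_FLAGS_MASK    (((pteval_t)1 << TOY_PAGE_SHIFT) - 1)
#define TOY_NR_FRAMES     4

/* Pseudo-physical frame (PFN) -> machine frame (MFN); values are made up. */
static const uint64_t toy_p2m[TOY_NR_FRAMES] = { 7, 5, 6, 4 };

static inline uint64_t toy_pfn_to_mfn(uint64_t pfn)
{
	assert(pfn < TOY_NR_FRAMES);
	return toy_p2m[pfn];
}

static inline uint64_t toy_mfn_to_pfn(uint64_t mfn)
{
	for (uint64_t pfn = 0; pfn < TOY_NR_FRAMES; pfn++)
		if (toy_p2m[pfn] == mfn)
			return pfn;
	assert(!"unknown MFN");
	return 0;
}

/* Stand-in for a paravirt pte_val(): raw entry -> guest view (PFN). */
static inline pteval_t toy_pte_val(pte_t pte)
{
	pteval_t raw = pte.pte;

	if (raw & TOY_PAGE_PRESENT)
		raw = (toy_mfn_to_pfn(raw >> TOY_PAGE_SHIFT) << TOY_PAGE_SHIFT) |
		      (raw & TOY_FLAGS_MASK);
	return raw;
}

/* Stand-in for a paravirt __pte(): guest view -> raw entry. */
static inline pte_t toy_make_pte(pteval_t val)
{
	pte_t pte;

	if (val & TOY_PAGE_PRESENT)
		val = (toy_pfn_to_mfn(val >> TOY_PAGE_SHIFT) << TOY_PAGE_SHIFT) |
		      (val & TOY_FLAGS_MASK);
	pte.pte = val;
	return pte;
}

/* Same shape as the patched helpers: read, flip flags, rebuild. */
static inline pte_t toy_pte_mknuma(pte_t pte)
{
	pteval_t val = toy_pte_val(pte);

	val &= ~TOY_PAGE_PRESENT;
	val |= TOY_PAGE_NUMA;
	return toy_make_pte(val);
}

static inline pte_t toy_pte_mknonnuma(pte_t pte)
{
	pteval_t val = toy_pte_val(pte);

	val &= ~TOY_PAGE_NUMA;
	val |= TOY_PAGE_PRESENT | TOY_PAGE_ACCESSED;
	return toy_make_pte(val);
}
```

In this model, toy_pte_mknuma() on a present entry yields a non-present
entry carrying the PFN, and toy_pte_mknonnuma() on that result restores
the MFN.  Each helper pays for up to two translations, which is the kind
of extra work the performance note above refers to.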

Signed-off-by: Mel Gorman <[email protected]>
Acked-by: David Vrabel <[email protected]>
Tested-by: David Vrabel <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Anvin <[email protected]>
Cc: Fengguang Wu <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Steven Noonan <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

 include/asm-generic/pgtable.h |   31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff -puN include/asm-generic/pgtable.h~mm-use-paravirt-friendly-ops-for-numa-hinting-ptes include/asm-generic/pgtable.h
--- a/include/asm-generic/pgtable.h~mm-use-paravirt-friendly-ops-for-numa-hinting-ptes
+++ a/include/asm-generic/pgtable.h
@@ -693,24 +693,35 @@ static inline int pmd_numa(pmd_t pmd)
 #ifndef pte_mknonnuma
 static inline pte_t pte_mknonnuma(pte_t pte)
 {
-       pte = pte_clear_flags(pte, _PAGE_NUMA);
-       return pte_set_flags(pte, _PAGE_PRESENT|_PAGE_ACCESSED);
+       pteval_t val = pte_val(pte);
+
+       val &= ~_PAGE_NUMA;
+       val |= (_PAGE_PRESENT|_PAGE_ACCESSED);
+       return __pte(val);
 }
 #endif
 
 #ifndef pmd_mknonnuma
 static inline pmd_t pmd_mknonnuma(pmd_t pmd)
 {
-       pmd = pmd_clear_flags(pmd, _PAGE_NUMA);
-       return pmd_set_flags(pmd, _PAGE_PRESENT|_PAGE_ACCESSED);
+       pmdval_t val = pmd_val(pmd);
+
+       val &= ~_PAGE_NUMA;
+       val |= (_PAGE_PRESENT|_PAGE_ACCESSED);
+
+       return __pmd(val);
 }
 #endif
 
 #ifndef pte_mknuma
 static inline pte_t pte_mknuma(pte_t pte)
 {
-       pte = pte_set_flags(pte, _PAGE_NUMA);
-       return pte_clear_flags(pte, _PAGE_PRESENT);
+       pteval_t val = pte_val(pte);
+
+       val &= ~_PAGE_PRESENT;
+       val |= _PAGE_NUMA;
+
+       return __pte(val);
 }
 #endif
 
@@ -729,8 +740,12 @@ static inline void ptep_set_numa(struct
 #ifndef pmd_mknuma
 static inline pmd_t pmd_mknuma(pmd_t pmd)
 {
-       pmd = pmd_set_flags(pmd, _PAGE_NUMA);
-       return pmd_clear_flags(pmd, _PAGE_PRESENT);
+       pmdval_t val = pmd_val(pmd);
+
+       val &= ~_PAGE_PRESENT;
+       val |= _PAGE_NUMA;
+
+       return __pmd(val);
 }
 #endif
 
_

Patches currently in -mm which might be from [email protected] are

mm-use-paravirt-friendly-ops-for-numa-hinting-ptes.patch
x86-require-x86-64-for-automatic-numa-balancing.patch
x86-define-_page_numa-by-reusing-software-bits-on-the-pmd-and-pte-levels.patch
mm-introduce-do_shared_fault-and-drop-do_fault-fix-fix.patch
mm-compactionc-isolate_freepages_block-small-tuneup.patch
mm-only-force-scan-in-reclaim-when-none-of-the-lrus-are-big-enough.patch
do_shared_fault-check-that-mmap_sem-is-held.patch

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
