[tip:x86/pti] x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image

2018-04-12 Thread tip-bot for Dave Hansen
Commit-ID:  39114b7a743e6759bab4d96b7d9651d44d17e3f9
Gitweb: https://git.kernel.org/tip/39114b7a743e6759bab4d96b7d9651d44d17e3f9
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:17 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:06:00 +0200

x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image

Summary:

In current kernels, with PTI enabled, no pages are marked Global. This
potentially increases TLB misses.  But, the mechanism by which the Global
bit is set and cleared is rather haphazard.  This patch makes the process
more explicit.  In the end, it leaves us with Global entries in the page
tables for the areas truly shared by userspace and kernel and increases
TLB hit rates.

The place this patch really shines is on systems without PCIDs.  In this
case, we are using an lseek microbenchmark[1] to see how a reasonably
non-trivial syscall behaves.  Higher is better:

  No Global pages (baseline): 6077741 lseeks/sec
  88 Global Pages (this set): 7528609 lseeks/sec (+23.9%)

On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
huge for a kernel compile (lower is better):

  No Global pages (baseline): 186.951 seconds time elapsed  ( +-  0.35% )
  28 Global pages (this set): 185.756 seconds time elapsed  ( +-  0.09% )
   -1.195 seconds (-0.64%)

I also re-checked everything using the lseek1 test[1]:

  No Global pages (baseline): 15783951 lseeks/sec
  28 Global pages (this set): 16054688 lseeks/sec
 +270737 lseeks/sec (+1.71%)

The effect is more visible, but still modest.

Details:

The kernel page tables are inherited from head_64.S which rudely marks
them as _PAGE_GLOBAL.  For PTI, we have been relying on the grace of
$DEITY and some insane behavior in pageattr.c to clear _PAGE_GLOBAL.
This patch tries to do better.

First, stop filtering out "unsupported" bits from being cleared in the
pageattr code.  It's fine to filter out *setting* these bits but it
is insane to keep us from clearing them.

Then, *explicitly* go clear _PAGE_GLOBAL from the kernel identity map.
Do not rely on pageattr to do it magically.

After this patch, we can see that "GLB" shows up in each copy of the
page tables, that we have the same number of global entries in each
and that they are the *same* entries.

  /sys/kernel/debug/page_tables/current_kernel:11
  /sys/kernel/debug/page_tables/current_user:11
  /sys/kernel/debug/page_tables/kernel:11

  9caae8ad6a1fb53aca2407ec037f612d  current_kernel.GLB
  9caae8ad6a1fb53aca2407ec037f612d  current_user.GLB
  9caae8ad6a1fb53aca2407ec037f612d  kernel.GLB

A quick visual audit also shows that all the entries make sense.
0xfe00 is the cpu_entry_area and 0x81c0
is the entry/exit text:

  0xfe00-0xfe002000   8K ro GLB NX pte
  0xfe002000-0xfe003000   4K RW GLB NX pte
  0xfe003000-0xfe006000  12K ro GLB NX pte
  0xfe006000-0xfe007000   4K ro GLB x  pte
  0xfe007000-0xfe00d000  24K RW GLB NX pte
  0xfe02d000-0xfe02e000   4K ro GLB NX pte
  0xfe02e000-0xfe02f000   4K RW GLB NX pte
  0xfe02f000-0xfe032000  12K ro GLB NX pte
  0xfe032000-0xfe033000   4K ro GLB x  pte
  0xfe033000-0xfe039000  24K RW GLB NX pte
  0x81c0-0x81e0   2M ro PSE GLB x  pmd

[1] https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c

Signed-off-by: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205517.c80fb...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/init.c |  8 +---
 arch/x86/mm/pageattr.c | 12 +---
 arch/x86/mm/pti.c  | 25 +
 3 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 583a88c8a6ee..fec82b577c18 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -161,12 +161,6 @@ struct map_range {
 
 static int page_size_mask;
 
-static void enable_global_pages(void)
-{
-   if (!static_cpu_has(X86_FEATURE_PTI))
-   __supported_pte_mask |= _PAGE_GLOBAL;
-}
-
 static void __init probe_page_size_mask(void)
 {
/*
@@ -187,7 +181,7 @@ static void __init probe_page_size_mask(void)
__supported_pte_mask &= 
