On 01/04/2018 04:24 PM, Dave Hansen wrote:
> Changes from v1:
>  * update kernel-parameters.txt to clarify that the pti= option
>    is not just for disabling.  Also describe what 'pti=auto' does
>    and why
>  * Add a note about the presence of NX in the user portion of the
>    kernel page tables
>  * Clarify _additional_ 4k of PGD space
>  * Add a note about the runtime overhead of PCID without INVPCID
> 
> ---
> 
> From: Dave Hansen <dave.han...@linux.intel.com>
> 
> Add some details about how PTI works, what some of the downsides
> are, and how to debug it when things go wrong.
> 
> Also document the kernel parameter: 'nopti'.
> 
> Signed-off-by: Dave Hansen <dave.han...@linux.intel.com>
> Reviewed-by: Kees Cook <keesc...@chromium.org>
> Cc: Moritz Lipp <moritz.l...@iaik.tugraz.at>
> Cc: Daniel Gruss <daniel.gr...@iaik.tugraz.at>
> Cc: Michael Schwarz <michael.schw...@iaik.tugraz.at>
> Cc: Richard Fellner <richard.fell...@student.tugraz.at>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Linus Torvalds <torva...@linux-foundation.org>
> Cc: Hugh Dickins <hu...@google.com>
> Cc: x...@kernel.org
> ---
> 
>  b/Documentation/admin-guide/kernel-parameters.txt |   22 +-
>  b/Documentation/x86/pti.txt                       |  185 ++++++++++++++++++++++
>  2 files changed, 200 insertions(+), 7 deletions(-)

> diff -puN /dev/null Documentation/x86/pti.txt
> --- /dev/null 2017-12-15 13:48:30.454245127 -0800
> +++ b/Documentation/x86/pti.txt       2018-01-04 16:23:40.870819409 -0800
> @@ -0,0 +1,185 @@

> +The userspace copy is used when running userspace and mirrors the
> +mapping of userspace present in the kernel copy.  It maps a only

                                                       drop: a

> +the kernel data needed to enter and exit the kernel.  This data
> +is entirely contained in the 'struct cpu_entry_area' structure
> +which is placed in the fixmap and thus each CPU's copy of the
> +area has a compile-time-fixed virtual address.
> +

> +2. Runtime Cost
> +  a. CR3 manipulation to switch between the page table copies
> +     must be done at interrupt, syscall, and exception entry
> +     and exit (it can be skipped when the kernel is interrupted,
> +     though.)  Moves to CR3 are on the order of a hundred
> +     cycles, and are required every at entry and every at exit.

                                 at every entry and at every exit.

> +  d. Global pages are disabled for all kernel structures not
> +     mapped in both to kernel and userspace page tables.  This

               into both kernel and userspace page tables.

> +     feature of the MMU allows different processes to share TLB
> +     entries mapping the kernel.  Losing the feature means more
> +     TLB misses after a context switch.  The actual loss of
> +     performance is very small, however, never exceeding 1%.

> +  f. In addition to the fork()-time copying, there must also
> +     be an update to the userspace PGD any time a set_pgd() is done
> +     on a PGD used to map userspace.  This ensures that the kernel
> +     and userspace copies always map the same userspace
> +     memory.
> +  g. On systems without PCID support, each CR3 write flushes
> +     the entire TLB.  That means that each syscall, interrupt
> +     or exception flushes the TLB.
> +  h. On systems without INVPCID support, addresses can only be

This is the first mention of INVPCID. Probably needs more info
about what it is.

> +     flushed from the TLB for the current PCID.  When flushing
> +     a kernel address, we need to flush all PCIDs, so a single
> +     kernel address flush will require a TLB-flushing CR3 write
> +     upon the next use of every PCID.
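
For what it's worth, cases g. and h. come down to two CPUID feature
bits: PCID is CPUID.01H:ECX bit 17, and INVPCID (the instruction that
can invalidate TLB entries for a PCID other than the current one) is
CPUID.(EAX=07H,ECX=0):EBX bit 10.  A rough sketch of the distinction
follows.  The enum and function names are made up for illustration
and are not anything in the patch or the kernel; the caller is
assumed to supply the raw CPUID register values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the cases in the "Runtime Cost" list. */
enum pti_tlb_case {
	PTI_CASE_G,       /* no PCID: every CR3 write flushes the whole TLB */
	PTI_CASE_H,       /* PCID without INVPCID: only current PCID flushable */
	PTI_CASE_NEITHER  /* PCID and INVPCID both present */
};

/* Test a single bit of a 32-bit CPUID output register. */
static bool bit_set(uint32_t reg, unsigned int bit)
{
	assert(bit < 32);
	return (reg >> bit) & 1u;
}

/* CPUID.01H:ECX bit 17 reports PCID support. */
static bool has_pcid(uint32_t cpuid1_ecx)
{
	return bit_set(cpuid1_ecx, 17);
}

/* CPUID.(EAX=07H,ECX=0):EBX bit 10 reports INVPCID support. */
static bool has_invpcid(uint32_t cpuid7_ebx)
{
	return bit_set(cpuid7_ebx, 10);
}

/* Map the two feature bits onto items g. and h. of the list. */
static enum pti_tlb_case pti_classify(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx)
{
	if (!has_pcid(cpuid1_ecx))
		return PTI_CASE_G;
	if (!has_invpcid(cpuid7_ebx))
		return PTI_CASE_H;
	return PTI_CASE_NEITHER;
}
```

From userspace the two register values can be read with
__get_cpuid() and __get_cpuid_count() from GCC/Clang's <cpuid.h>.
Something along these lines in the text (the definition, not the
code) would probably be enough.
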
> +
> +Possible Future Work
> +====================
> +1. We can be more careful about not actually writing to CR3
> +   unless its value is actually changed.
> +2. Allow PTI to enabled/disabled at runtime in addition to the

                to be

> +   boot-time switching.
> +
> +Testing
> +========



-- 
~Randy
