[tip:x86/fpu] x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK

2015-06-09 Thread tip-bot for Qiaowei Ren
Commit-ID:  3c1d32300920a446c67d697cd6b80f012ad06028
Gitweb: http://git.kernel.org/tip/3c1d32300920a446c67d697cd6b80f012ad06028
Author: Qiaowei Ren 
AuthorDate: Sun, 7 Jun 2015 11:37:02 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 9 Jun 2015 12:24:30 +0200

x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK

MPX_BNDCFG_ADDR_MASK is defined two times, so this patch removes the
redundant one.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
Cc: Andrew Morton 
Cc: Dave Hansen 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20150607183702.5f129...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/mpx.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 0cdd16a..871e5e5 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -45,7 +45,6 @@
 #define MPX_BNDSTA_TAIL    2
 #define MPX_BNDCFG_TAIL    12
 #define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
-#define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
 #define MPX_BT_ADDR_MASK   (~((1UL<<MPX_BD_ENTRY_TAIL)-1))
 
 #define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
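
For reference, the surviving MPX_BNDCFG_ADDR_MASK clears the low MPX_BNDCFG_TAIL
(12) bits, which carry the enable/configuration flags, leaving only the
bounds-directory base address that the "bndcfgu" register points at. A
stand-alone sketch of that arithmetic (macro bodies copied from the hunk above;
the sample register value is made up for illustration):

#include <stdio.h>

#define MPX_BNDCFG_TAIL      12
#define MPX_BNDCFG_ADDR_MASK (~((1UL << MPX_BNDCFG_TAIL) - 1))

int main(void)
{
	/* Hypothetical BNDCFGU-style value: directory base in the high bits,
	 * enable/config flags in the low 12 bits. */
	unsigned long bndcfg = 0x00007f1234568001UL;

	/* Mask off the flag bits to recover the bounds directory base. */
	unsigned long bd_base = bndcfg & MPX_BNDCFG_ADDR_MASK;

	printf("bounds directory base: %#lx\n", bd_base);	/* 0x7f1234568000 */
	return 0;
}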


[tip:x86/mpx] x86, mpx: Add documentation on Intel MPX

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  5776563648f6437ede91c91cbad85862ca682b0b
Gitweb: http://git.kernel.org/tip/5776563648f6437ede91c91cbad85862ca682b0b
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:32 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:54 +0100

x86, mpx: Add documentation on Intel MPX

This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151832.7fdb1...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 Documentation/x86/intel_mpx.txt | 234 
 1 file changed, 234 insertions(+)

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..4472ed2
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,234 @@
+1. Intel(R) MPX Overview
+========================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
+introduced into Intel Architecture. Intel MPX provides hardware features
+that can be used in conjunction with compiler changes to check memory
+references, for those references whose compile-time normal intentions are
+usurped at runtime due to buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture Instruction
+Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection
+Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead, which
+can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How to get the advantage of MPX
+==================================
+
+For MPX to work, changes are required in the kernel, binutils and compiler.
+No source changes are required for applications, just a recompile.
+
+There are a lot of moving parts of this to all work right. The following
+is how we expect the compiler, application and kernel to work together.
+
+1) Application developer compiles with -fmpx. The compiler will add the
+   instrumentation as well as some setup code called early after the app
+   starts. New instruction prefixes are noops for old CPUs.
+2) That setup code allocates (virtual) space for the "bounds directory",
+   points the "bndcfgu" register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that location. We note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+
+To summarize, there are essentially three things interacting here:
+
+GCC with -fmpx:
+ * enables annotation of code with MPX instructions and prefixes
+ * inserts code early in the application to call in to the "gcc runtime"
+GCC MPX Runtime:
+ * Checks for hardware MPX support in cpuid leaf
+ * allocates virtual space for the bounds directory (malloc() essentially)
+ * points the hardware BNDCFGU register at the directory
+ * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to
+   start managing the bounds directories
+Kernel MPX Code:
+ * Checks for hardware MPX support in cpuid leaf
+ * Handles #BR exceptions and sends SIGSEGV to the app when it violates
+   bounds, like during a buffer overflow.
+ * When bounds are spilled in to an unallocated bounds table, the kernel
+   notices in the #BR exception, allocates the virtual space, then
+   updates the bounds directory to point to the new table. It keeps
+   special track of the memory with a VM_MPX flag.
+ * Frees unused bounds tables at the time that the memory they described
+   is unmapped.
+
+
+3. How does MPX kernel code work
+================================
+
+Han
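
The enablement handshake described in steps 2) and 3) above boils down to a
single prctl() call made by the MPX runtime once it has allocated the bounds
directory. A minimal user-space sketch, assuming kernel headers new enough to
define PR_MPX_ENABLE_MANAGEMENT (the fallback value below is an assumption
taken from the uapi prctl header, not from this patch):

#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_MPX_ENABLE_MANAGEMENT
#define PR_MPX_ENABLE_MANAGEMENT 43	/* assumed uapi value, not defined by this patch */
#endif

int main(void)
{
	/* Ask the kernel to start managing bounds tables for this process.
	 * On kernels or CPUs without MPX support the call simply fails. */
	if (prctl(PR_MPX_ENABLE_MANAGEMENT, 0, 0, 0, 0) != 0) {
		perror("PR_MPX_ENABLE_MANAGEMENT");
		return 1;
	}
	printf("kernel is now managing MPX bounds tables\n");
	return 0;
}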

[tip:x86/mpx] x86, mpx: Add MPX-specific mmap interface

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  57319d80e1d328e34cb24868a4f4405661485e30
Gitweb: http://git.kernel.org/tip/57319d80e1d328e34cb24868a4f4405661485e30
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:27 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

x86, mpx: Add MPX-specific mmap interface

We have chosen to perform the allocation of bounds tables in
kernel (See the patch "on-demand kernel allocation of bounds
tables") and to mark these VMAs with VM_MPX.

However, there is currently no suitable interface to actually do
this.  Existing interfaces, like do_mmap_pgoff(), have no way to
set a modified ->vm_ops or ->vm_flags and don't hold mmap_sem
long enough to let a caller do it.

This patch wraps mmap_region() and holds mmap_sem long enough to
make the modifications to the VMA which we need.

Also note the 32/64-bit #ifdef in the header.  We actually need
to do this at runtime eventually.  But, for now, we don't support
running 32-bit binaries on 64-bit kernels.  Support for this will
come in later patches.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151827.ce440...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 arch/x86/Kconfig   |  4 +++
 arch/x86/include/asm/mpx.h | 36 +++
 arch/x86/mm/Makefile   |  2 ++
 arch/x86/mm/mpx.c  | 86 ++
 4 files changed, 128 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ded8a67..967dfe0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -248,6 +248,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..7d7c5f5
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,36 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..72d13b0
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,86 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <linux/sched/sysctl.h>
+
+#include <asm/mman.h>
+#include <asm/mpx.h>
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * This is really a simplified "vm_mmap". it only handles MPX
+ * bounds tables (the bounds directory is user-allocated).
+ *
+ * Later on, we use the vma->vm_ops to uniquely identify these
+ * VMAs.
+ */
+static unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE
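
For a feel of the sizes involved: with the 64-bit constants above, the bounds
directory spans 1UL << (28+3) = 2GB of virtual space and each bounds table
1UL << (17+5) = 4MB (the 32-bit constants give 4MB and 16KB respectively). A
quick stand-alone check of that arithmetic, with the macros copied from the
header hunk above:

#include <stdio.h>

/* 64-bit values copied from the arch/x86/include/asm/mpx.h hunk above */
#define MPX_BD_ENTRY_OFFSET	28
#define MPX_BD_ENTRY_SHIFT	3
#define MPX_BT_ENTRY_OFFSET	17
#define MPX_BT_ENTRY_SHIFT	5

#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))

int main(void)
{
	printf("bounds directory: %lu MB\n", MPX_BD_SIZE_BYTES >> 20);	/* 2048 */
	printf("bounds table:     %lu MB\n", MPX_BT_SIZE_BYTES >> 20);	/* 4 */
	return 0;
}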

[tip:x86/mpx] x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  4aae7e436fa51faf4bf5d11b175aea82cfe8224a
Gitweb: http://git.kernel.org/tip/4aae7e436fa51faf4bf5d11b175aea82cfe8224a
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:25 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific

MPX-enabled applications using large swaths of memory can
potentially have large numbers of bounds tables in process
address space to save bounds information. These tables can take
up huge swaths of memory (as much as 80% of the memory on the
system) even if we clean them up aggressively. In the worst-case
scenario, the tables can be 4x the size of the data structure
being tracked. IOW, a 1-page structure can require 4 bounds-table
pages.

Being this huge, our expectation is that folks using MPX are
going to be keen on figuring out how much memory is being
dedicated to it. So we need a way to track memory use for MPX.

If we want to specifically track MPX VMAs we need to be able to
distinguish them from normal VMAs, and keep them from getting
merged with normal VMAs. A new VM_ flag set only on MPX VMAs does
both of those things. With this flag, MPX bounds-table VMAs can
be distinguished from other VMAs, and userspace can also walk
/proc/$pid/smaps to get memory usage for MPX.

In addition to this flag, we also introduce a special ->vm_ops
specific to MPX VMAs (see the patch "add MPX specific mmap
interface"), but currently different ->vm_ops do not by
themselves prevent VMA merging, so we still need this flag.

We understand that VM_ flags are scarce and are open to other
options.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151825.56562...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 fs/proc/task_mmu.c | 3 +++
 include/linux/mm.h | 6 ++
 2 files changed, 9 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4e0388c..f6734c6 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -552,6 +552,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+#ifdef CONFIG_X86_INTEL_MPX
+   [ilog2(VM_MPX)] = "mp",
+#endif
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b464611..f7606d3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -128,6 +128,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -155,6 +156,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
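
Because the flag shows up as "mp" in the VmFlags line of /proc/$pid/smaps,
userspace can total the memory held in MPX bounds tables with a simple scan.
A rough sketch (not part of the patch; it only relies on the "Size:" and
"VmFlags:" fields of smaps):

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/self/smaps", "r");
	char line[512];
	unsigned long cur_kb = 0, mpx_kb = 0;

	if (!f) {
		perror("smaps");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* "Size:" precedes "VmFlags:" within each VMA's block */
		if (sscanf(line, "Size: %lu kB", &cur_kb) == 1)
			continue;
		if (!strncmp(line, "VmFlags:", 8) && strstr(line, " mp "))
			mpx_kb += cur_kb;	/* this VMA is MPX bounds-table memory */
	}
	fclose(f);
	printf("MPX bounds-table memory: %lu kB\n", mpx_kb);
	return 0;
}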


[tip:x86/mpx] mips: Sync struct siginfo with general version

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  232b5fff5bad78ad00b94153fa90ca53bef6a444
Gitweb: http://git.kernel.org/tip/232b5fff5bad78ad00b94153fa90ca53bef6a444
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:20 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

mips: Sync struct siginfo with general version

New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for MIPS with
general version.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151820.f7edc...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 arch/mips/include/uapi/asm/siginfo.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */


[tip:x86/mpx] ia64: Sync struct siginfo with general version

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  53f037b08b5bebf47aa2b574a984e2f9fc7926f2
Gitweb: http://git.kernel.org/tip/53f037b08b5bebf47aa2b574a984e2f9fc7926f2
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:22 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

ia64: Sync struct siginfo with general version

New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for IA64 with
general version.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151822.82b3b...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 arch/ia64/include/uapi/asm/siginfo.h | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h 
b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
unsigned int _flags;/* see below */
unsigned long _isr; /* isr */
short _addr_lsb;/* lsb of faulting address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF (__SI_FAULT|3)  /* paragraph stack overflow */
+#define __SEGV_PSTKOVF (__SI_FAULT|4)  /* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV   3
+#define NSIGSEGV   4
 
 #undef NSIGTRAP
 #define NSIGTRAP   4


[tip:x86/mpx] mpx: Extend siginfo structure to include bound violation information

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Gitweb: http://git.kernel.org/tip/ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:19 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

mpx: Extend siginfo structure to include bound violation information

This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151819.1908c...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 include/uapi/asm-generic/siginfo.h | 9 -
 kernel/signal.c| 4 
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
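
On the receiving end, a handler installed with SA_SIGINFO can pull the violated
bounds out of the new fields when si_code is SEGV_BNDERR. A hedged sketch using
the names introduced by this patch (the C library headers must be new enough to
expose si_lower, si_upper and SEGV_BNDERR):

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void segv_handler(int sig, siginfo_t *si, void *uctx)
{
	if (si->si_code == SEGV_BNDERR)	/* failed address bound check */
		fprintf(stderr, "bounds violation at %p, allowed [%p, %p]\n",
			si->si_addr, si->si_lower, si->si_upper);
	_exit(1);	/* do not return and retry the faulting access */
}

int main(void)
{
	struct sigaction sa = { 0 };

	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigaction(SIGSEGV, &sa, NULL);

	/* ... run MPX-instrumented code here ... */
	return 0;
}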


[PATCH v9 06/12] mpx: extend siginfo structure to include bound violation information

2014-10-11 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v9 04/12] x86, mpx: add MPX to disabled features

2014-10-11 Thread Qiaowei Ren
This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as
both a runtime and compile-time check.

When CONFIG_X86_INTEL_MPX is disabled,
cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at
compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid
flag will be checked at runtime.

This patch must be applied after another of Dave's commits:
  381aa07a9b4e1f82969203e9e4863da2a157781d

Signed-off-by: Dave Hansen 
Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/disabled-features.h |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h 
b/arch/x86/include/asm/disabled-features.h
index 97534a7..f226df0 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -10,6 +10,12 @@
  * cpu_feature_enabled().
  */
 
+#ifdef CONFIG_X86_INTEL_MPX
+# define DISABLE_MPX   0
+#else
+# define DISABLE_MPX   (1<<(X86_FEATURE_MPX & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME   (1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR   (1<<(X86_FEATURE_K6_MTRR & 31))
@@ -34,6 +40,6 @@
 #define DISABLED_MASK6 0
 #define DISABLED_MASK7 0
 #define DISABLED_MASK8 0
-#define DISABLED_MASK9 0
+#define DISABLED_MASK9 (DISABLE_MPX)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
-- 
1.7.1

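The runtime half of cpu_feature_enabled() is the MPX cpuid flag; DISABLED_MASK9
corresponds to the feature word holding the CPUID.(EAX=7,ECX=0):EBX bits. From
userspace the same bit can be probed directly. A sketch assuming MPX is
advertised in EBX bit 14 of that leaf (the bit number comes from the ISA
extension reference, not from this patch):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Structured extended feature flags live in leaf 7, subleaf 0 */
	if (__get_cpuid_max(0, NULL) < 7) {
		puts("CPUID leaf 7 not available");
		return 1;
	}
	__cpuid_count(7, 0, eax, ebx, ecx, edx);

	puts((ebx & (1u << 14)) ? "MPX supported" : "MPX not supported");
	return 0;
}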


[PATCH v9 07/12] mips: sync struct siginfo with general version

2014-10-11 Thread Qiaowei Ren
New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for MIPS with
general version.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v9 05/12] x86, mpx: on-demand kernel allocation of bounds tables

2014-10-11 Thread Qiaowei Ren
MPX only has 4 hardware registers for storing bounds information.
If MPX-enabled code needs more than these 4 registers, it needs to
spill them somewhere. It has two special instructions for this
which allow the bounds to be moved between the bounds registers
and some new "bounds tables".

They are similar conceptually to a page fault and will be raised by
the MPX hardware both during bounds violations and when the tables
are not present. This patch handles those #BR exceptions for
not-present tables by carving the space out of the normal process's
address space (essentially calling the new mmap() interface introduced
earlier in this patch set) and then pointing the bounds-directory
over to it.

The tables *need* to be accessed and controlled by userspace because
the instructions for moving bounds in and out of them are extremely
frequent. They potentially happen every time a register pointing to
memory is dereferenced. Any direct kernel involvement (like a syscall)
to access the tables would obviously destroy performance.

 Why not do this in userspace? 

This patch is obviously doing this allocation in the kernel.
However, MPX does not strictly *require* anything in the kernel.
It can theoretically be done completely from userspace. Here are
a few ways this *could* be done. I don't think any of them are
practical in the real-world, but here they are.

Q: Can virtual space simply be reserved for the bounds tables so
   that we never have to allocate them?
A: As noted earlier, these tables are *HUGE*. An X-GB virtual
   area needs 4*X GB of virtual space, plus 2GB for the bounds
   directory. If we were to preallocate them for the 128TB of
   user virtual address space, we would need to reserve 512TB+2GB,
   which is larger than the entire virtual address space today.
   This means they can not be reserved ahead of time. Also, a
   single process's pre-populated bounds directory consumes 2GB
   of virtual *AND* physical memory. IOW, it's completely
   infeasible to prepopulate bounds directories.

Q: Can we preallocate bounds table space at the same time memory
   is allocated which might contain pointers that might eventually
   need bounds tables?
A: This would work if we could hook the site of each and every
   memory allocation syscall. This can be done for small,
   constrained applications. But, it isn't practical at a larger
   scale since a given app has no way of controlling how all the
   parts of the app might allocate memory (think libraries). The
   kernel is really the only place to intercept these calls.

Q: Could a bounds fault be handed to userspace and the tables
   allocated there in a signal handler instead of in the kernel?
A: (thanks to tglx) mmap() is not on the list of safe async
   handler functions and even if mmap() would work it still
   requires locking or nasty tricks to keep track of the
   allocation state there.

Having ruled out all of the userspace-only approaches for managing
bounds tables that we could think of, we create them on demand in
the kernel.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 +
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  101 
 arch/x86/kernel/traps.c|   52 ++-
 4 files changed, 173 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL    2
+#define MPX_BNDCFG_TAIL    12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+ * Dave Hansen 
+ */
+
+#include 
+#include 
+#include 
+
+/*
+ * With 32-bit mode, MPX_BT_SIZE_BYTES is 4MB, and the size of each
+ * bounds table is 16KB. With 64-bit mode, MPX_BT_SIZE_BYTES is 2GB,
+ * and the size of each bounds table is 4MB.
+ */
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr;
+   unsigned long expected_old_val = 0;
+   unsigned long actual_old_val = 0;
+   int ret = 0;
+
+   /*
+* Carve the virtual space out of userspace for the new
+* bounds table:
+*/
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return PTR_ERR((void *)bt_addr);
+   /*
+* Set the valid flag (kinda like _PAGE_PRESENT in a pte)
+*/
+   bt_addr = bt_addr | MPX_BD_ENTRY_VALID_FLAG;
+
+   /*
+* Go poke the address of the new bounds table in to the
+* bounds directory entry out in userspace memory.  Note:
+
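
The on-demand allocation above is driven by how the hardware splits a 64-bit
pointer: per the asm/mpx.h comments, bits [47:20] index the bounds directory,
bits [19:3] index the bounds table, and the low 3 bits are ignored. A small
stand-alone illustration of that split (the pointer value is made up; this is
not code from the patch):

#include <stdio.h>

/* 64-bit constants from asm/mpx.h: addr[47:20] -> directory index,
 * addr[19:3] -> table index, addr[2:0] ignored */
#define MPX_BD_ENTRY_OFFSET	28
#define MPX_BT_ENTRY_OFFSET	17
#define MPX_IGN_BITS		3

int main(void)
{
	unsigned long addr = 0x00007f5a1b2c3d48UL;	/* made-up pointer */
	unsigned long bt_index, bd_index;

	bt_index = (addr >> MPX_IGN_BITS) & ((1UL << MPX_BT_ENTRY_OFFSET) - 1);
	bd_index = (addr >> (MPX_IGN_BITS + MPX_BT_ENTRY_OFFSET)) &
		   ((1UL << MPX_BD_ENTRY_OFFSET) - 1);

	printf("bounds directory index: %#lx\n", bd_index);
	printf("bounds table index:     %#lx\n", bt_index);
	return 0;
}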

[PATCH v9 03/12] x86, mpx: add MPX specific mmap interface

2014-10-11 Thread Qiaowei Ren
We have to do the allocation of bounds tables in the kernel (see the
patch "on-demand kernel allocation of bounds tables"). Moreover, if we
want to track MPX VMAs we need to be able to stick a new VM_MPX flag
and a specific ->vm_ops for MPX in the vm_area_struct.

But there is no suitable interface to do this in the current kernel.
Existing interfaces, like do_mmap_pgoff(), cannot set a specific
->vm_ops in the vm_area_struct when a VMA is created. So, this patch
adds an MPX-specific mmap interface to do the allocation of bounds tables.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b663e1..e5bcc70 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <linux/sched/sysctl.h>
+#include <asm/mman.h>
+#include <asm/mpx.h>
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;
+
+   vma = find_vma(mm, ret);
+   if (!vma) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   vma->vm_ops = &mpx_vma_ops;
+
+   if (vm_flags & VM_LOCKED)
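For a concrete sense of the sizes implied by the mpx.h constants above, here is a
small standalone sketch (not part of the patch) that mirrors the
MPX_BD_SIZE_BYTES/MPX_BT_SIZE_BYTES arithmetic for the 64-bit case; it prints a
2 GB virtual bounds directory and 4 MB bounds tables.

#include <stdio.h>

/* constants copied from the 64-bit branch of the mpx.h hunk above */
#define MPX_BD_ENTRY_OFFSET	28
#define MPX_BD_ENTRY_SHIFT	3
#define MPX_BT_ENTRY_OFFSET	17
#define MPX_BT_ENTRY_SHIFT	5

#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))

int main(void)
{
	/* 1UL << 31 = 2147483648 bytes = 2 GB of virtual space per directory */
	printf("bounds directory: %lu bytes\n", MPX_BD_SIZE_BYTES);
	/* 1UL << 22 = 4194304 bytes = 4 MB per bounds table */
	printf("bounds table:     %lu bytes\n", MPX_BT_SIZE_BYTES);
	return 0;
}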

[PATCH v9 01/12] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-10-11 Thread Qiaowei Ren
MPX-enabled applications using large swaths of memory can potentially
have large numbers of bounds tables in process address space to save
bounds information. These tables can take up huge swaths of memory
(as much as 80% of the memory on the system) even if we clean them
up aggressively. In the worst-case scenario, the tables can be 4x the
size of the data structure being tracked. IOW, a 1-page structure can
require 4 bounds-table pages.

Being this huge, our expectation is that folks using MPX are going to
be keen on figuring out how much memory is being dedicated to it. So
we need a way to track memory use for MPX.

If we want to specifically track MPX VMAs we need to be able to
distinguish them from normal VMAs, and keep them from getting merged
with normal VMAs. A new VM_ flag set only on MPX VMAs does both of
those things. With this flag, MPX bounds-table VMAs can be distinguished
from other VMAs, and userspace can also walk /proc/$pid/smaps to get
memory usage for MPX.

Besides this flag, we also introduce a specific ->vm_ops for MPX VMAs
(see the patch "add MPX specific mmap interface"), but currently VMAs
with different ->vm_ops are not prevented from merging. We
understand that VM_ flags are scarce and are open to other options.

Signed-off-by: Qiaowei Ren 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
-- 
1.7.1
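Since the point of the new flag is letting userspace account for MPX memory
through /proc/$pid/smaps, here is a rough standalone sketch (not part of the
patch) of such a walk; it assumes the "mp" VmFlags string added above and the
usual smaps layout in which a mapping's "Size:" line precedes its "VmFlags:" line.

#include <stdio.h>
#include <string.h>

/* sum the Size: of every mapping whose VmFlags contain "mp" (VM_MPX) */
int main(void)
{
	FILE *f = fopen("/proc/self/smaps", "r");
	char line[512];
	unsigned long size_kb = 0, mpx_kb = 0;

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* remember the size of the mapping we are currently inside */
		if (sscanf(line, "Size: %lu kB", &size_kb) == 1)
			continue;
		/* VmFlags is the last field of each mapping's block */
		if (!strncmp(line, "VmFlags:", 8) && strstr(line, " mp"))
			mpx_kb += size_kb;
	}
	fclose(f);
	printf("MPX bounds-table memory: %lu kB\n", mpx_kb);
	return 0;
}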



[PATCH v9 08/12] ia64: sync struct siginfo with general version

2014-10-11 Thread Qiaowei Ren
New fields describing bound violations are added to the generic struct
siginfo. This impacts MIPS and IA64, which extend the generic
struct siginfo. This patch syncs the IA64 struct with the
generic version.

Signed-off-by: Qiaowei Ren 
---
 arch/ia64/include/uapi/asm/siginfo.h |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h 
b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
unsigned int _flags;/* see below */
unsigned long _isr; /* isr */
short _addr_lsb;/* lsb of faulting address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF (__SI_FAULT|3)  /* paragraph stack overflow */
+#define __SEGV_PSTKOVF (__SI_FAULT|4)  /* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV   3
+#define NSIGSEGV   4
 
 #undef NSIGTRAP
 #define NSIGTRAP   4
-- 
1.7.1
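For reference, a minimal user-space consumer of the new _addr_bnd fields might
look like the sketch below; SEGV_BNDERR, si_lower and si_upper are taken from
the companion patch "mpx: extend siginfo structure to include bound violation
information" (not shown in full here), so treat those names as assumptions
rather than something defined by this ia64 patch.

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void bnd_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	/* fprintf() is not async-signal-safe; acceptable for a demo only */
	if (info->si_code == SEGV_BNDERR)
		fprintf(stderr, "bounds violation at %p, bounds [%p, %p]\n",
			info->si_addr, info->si_lower, info->si_upper);
	_exit(1);
}

int main(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_flags = SA_SIGINFO;
	sa.sa_sigaction = bnd_handler;
	sigaction(SIGSEGV, &sa, NULL);
	/* ... run MPX-instrumented code here ... */
	return 0;
}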



[PATCH v9 12/12] x86, mpx: add documentation on Intel MPX

2014-10-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  245 +++
 1 files changed, 245 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..3c20a17
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,245 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
+introduced into Intel Architecture. Intel MPX provides hardware features
+that can be used in conjunction with compiler changes to check memory
+references, for those references whose compile-time normal intentions are
+usurped at runtime due to buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture Instruction
+Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection
+Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead, which
+can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How to get the advantage of MPX
+==
+
+For MPX to work, changes are required in the kernel, binutils and compiler.
+No source changes are required for applications, just a recompile.
+
+There are a lot of moving parts of this to all work right. The following
+is how we expect the compiler, application and kernel to work together.
+
+1) Application developer compiles with -fmpx. The compiler will add the
+   instrumentation as well as some setup code called early after the app
+   starts. New instruction prefixes are noops for old CPUs.
+2) That setup code allocates (virtual) space for the "bounds directory",
+   points the "bndcfgu" register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that location. We note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+
+To summarize, there are essentially three things interacting here:
+
+GCC with -fmpx:
+ * enables annotation of code with MPX instructions and prefixes
+ * inserts code early in the application to call in to the "gcc runtime"
+GCC MPX Runtime:
+ * Checks for hardware MPX support in cpuid leaf
+ * allocates virtual space for the bounds directory (malloc() essentially)
+ * points the hardware BNDCFGU register at the directory
+ * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to
+   start managing the bounds directories
+Kernel MPX Code:
+ * Checks for hardware MPX support in cpuid leaf
+ * Handles #BR exceptions and sends SIGSEGV to the app when it violates
+   bounds, like during a buffer overflow.
+ * When bounds are spilled in to an unallocated bounds table, the kernel
+   notices in the #BR exception, allocates the virtual space, then
+   updates the bounds directory to point to the new table. It keeps
+   special track of the memory with a VM_MPX flag.
+ * Frees unused bounds tables at the time that the memory they described
+   is unmapped.
+
+
+3. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * new bounds tables (BT) need to be allocated to save bounds.
+  * bounds violation caused by MPX instructions.
+
+We hook #BR handler to handle these two new situations.
+
+On-demand kernel allocation of bounds tables
+
+
+MPX only has 4 hardware registers for storing bounds informati
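A compressed user-space sketch of steps 1-3 above (normally performed by the
GCC MPX runtime, not by hand) is shown below; the CPUID bit (leaf 7, subleaf 0,
EBX bit 14), the prctl values guarded by #ifndef, and the 2 GB 64-bit directory
size are assumptions layered on top of this documentation, and a reasonably
recent <cpuid.h> is assumed.

#define _GNU_SOURCE
#include <cpuid.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/prctl.h>

#ifndef PR_MPX_ENABLE_MANAGEMENT
#define PR_MPX_ENABLE_MANAGEMENT	43	/* assumed value */
#define PR_MPX_DISABLE_MANAGEMENT	44	/* assumed value */
#endif

#define MPX_BD_SIZE_BYTES	(1UL << (28 + 3))	/* 2 GB, 64-bit */

int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	void *bd;

	/* step 1/2: does the CPU have MPX at all? (CPUID.7.0:EBX bit 14) */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) ||
	    !(ebx & (1u << 14))) {
		fprintf(stderr, "no MPX in CPUID\n");
		return 1;
	}
	/* step 2: reserve virtual space for the bounds directory */
	bd = mmap(NULL, MPX_BD_SIZE_BYTES, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (bd == MAP_FAILED)
		return 1;
	/* the runtime would now point BNDCFGU at bd (via the xsave area) */
	/* step 3: ask the kernel to manage bounds tables on our behalf */
	if (prctl(PR_MPX_ENABLE_MANAGEMENT, 0, 0, 0, 0))
		perror("prctl(PR_MPX_ENABLE_MANAGEMENT)");
	return 0;
}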

[PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-11 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder; it implements a limited,
special-purpose decoder for MPX instructions, simply because the
generic decoder is very heavyweight, not just in terms of performance
but in terms of interface -- because it has to be.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <asm/ptrace.h>
+#include <asm/insn.h>
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 2103b5e..b7e4c0e 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -10,6 +10,275 @@
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced by the instruction:
+ * for rm=3, return the content of the rm reg;
+ * for rm!=3, calculate the address using SIB and Disp.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+
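The effective-address arithmetic that get_reg()/get_addr_ref() above perform
can be illustrated with the standalone sketch below; the ModRM/SIB bitfield
helpers are local stand-ins for the asm/insn.h macros, REX and displacement
handling are omitted for brevity, and the register values are made up.

#include <stdio.h>

/* local stand-ins for the ModRM/SIB helpers used by get_reg() above */
#define MODRM_MOD(b)	(((b) & 0xc0) >> 6)
#define MODRM_RM(b)	((b) & 0x07)
#define SIB_SCALE(b)	(((b) & 0xc0) >> 6)
#define SIB_INDEX(b)	(((b) & 0x38) >> 3)
#define SIB_BASE(b)	((b) & 0x07)

int main(void)
{
	/* register file in the same ax,cx,dx,bx,sp,bp,si,di order as pt_regs */
	unsigned long regs[8] = { 0x1000, 0x2000, 0x0008, 0, 0, 0, 0, 0 };
	unsigned char modrm = 0x04;	/* mod=0, rm=4: a SIB byte follows   */
	unsigned char sib   = 0xd1;	/* scale=3, index=2 (dx), base=1 (cx) */
	unsigned long addr;

	if (MODRM_MOD(modrm) == 3)
		addr = regs[MODRM_RM(modrm)];
	else
		addr = regs[SIB_BASE(sib)] +
		       regs[SIB_INDEX(sib)] * (1UL << SIB_SCALE(sib));

	/* 0x2000 + 0x8 * 8 = 0x2040 */
	printf("effective address = %#lx\n", addr);
	return 0;
}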

[PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-11 Thread Qiaowei Ren
There are two mappings in play: 1. The mapping with the actual data,
which userspace is munmap()ing or brk()ing away, etc... 2. The mapping
for the bounds table *backing* the data (it is tagged with mpx_vma_ops;
see the patch "add MPX specific mmap interface").

If userspace uses the prctl() introduced earlier in this patchset to
enable the management of bounds tables in the kernel, then when it unmaps
the first kind of mapping with the actual data, the kernel needs to free
the mapping for the bounds table backing that data. This patch calls
arch_unmap() at the very end of do_munmap() to do so. This walks the
directory to look at the entries covered by the data VMA, unmaps the
bounds table which is referenced from the directory, and then clears
the directory entry.

Unmapping of bounds tables is called under vm_munmap() of the data VMA.
So we have to check ->vm_ops to prevent recursion. This recursion represents
having bounds tables for bounds tables, which should not occur normally.
Being strict about it here helps ensure that we do not have an exploitable
stack overflow.

Once we unmap the bounds table, we would have a bounds directory entry
pointing at empty address space. That address space could now be allocated
for some other (random) use, and the MPX hardware is now going to go
trying to walk it as if it were a bounds table. That would be bad. So
any unmapping of a bounds table has to be accompanied by a corresponding
write to the bounds directory entry to mark it invalid. That write to
the bounds directory can fault.

Since we are doing the freeing from munmap() (and other paths like it),
we hold mmap_sem for write. If we fault, the page fault handler will
attempt to acquire mmap_sem for read and we will deadlock. For now, to
avoid deadlock, we disable page faults while touching the bounds directory
entry. This keeps us from being able to free the tables in this case.
This deficiency will be addressed in later patches.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 ++
 arch/x86/include/asm/mpx.h |9 +
 arch/x86/mm/mpx.c  |  317 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 350 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index e33ddb7..2b52d1b 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -111,4 +111,20 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 #endif
 }
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Userspace never asked us to manage the bounds tables,
+* so refuse to help.
+*/
+   if (!kernel_managing_mpx_tables(current->mm))
+   return;
+
+   mpx_notify_unmap(mm, vma, start, end);
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 32f13f5..a1a0155 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -48,6 +48,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -73,6 +80,8 @@ static inline int kernel_managing_mpx_tables(struct mm_struct 
*mm)
return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR);
 }
 unsigned long mpx_mmap(unsigned long len);
+void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 376f2ee..dcc6621 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -1,7 +1,16 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+
 #include <linux/kernel.h>
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 #include <asm/mman.h>
+#include <asm/mmu_context.h>
 #include <linux/sched/sysctl.h>
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -13,6 +22,11 @@ static struct vm_operations_struct mpx_vma_ops = {
.name = mpx_mapping_name,
 };
 
+int is_mpx_vma(struct vm_area_struct *vma)
+{
+   return (vma->vm_ops == &mpx_vma_ops);
+}
+
 /*
  * this is really a simplified "vm_mmap". it only handles mpx
  * related maps, including bounds table and bo
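To see how the new MPX_GET_BD_ENTRY_OFFSET()/MPX_GET_BT_ENTRY_OFFSET() macros
split a faulting pointer, here is a standalone sketch using the 64-bit
constants; the macros are adapted from the hunk above (1UL is used in the
masks so the arithmetic is safe in a plain user-space build).

#include <stdio.h>

/* 64-bit constants and entry-offset macros adapted from the hunk above */
#define MPX_BD_ENTRY_OFFSET	28
#define MPX_BD_ENTRY_SHIFT	3
#define MPX_BT_ENTRY_OFFSET	17
#define MPX_BT_ENTRY_SHIFT	5
#define MPX_IGN_BITS		3

#define MPX_BD_ENTRY_MASK	((1UL<<MPX_BD_ENTRY_OFFSET)-1)
#define MPX_BT_ENTRY_MASK	((1UL<<MPX_BT_ENTRY_OFFSET)-1)
#define MPX_GET_BD_ENTRY_OFFSET(addr)	((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
		MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
#define MPX_GET_BT_ENTRY_OFFSET(addr)	((((addr)>>MPX_IGN_BITS) & \
		MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)

int main(void)
{
	/* bits [47:20] pick the BD entry, [19:3] the BT entry, [2:0] ignored */
	unsigned long ptr = 0x00007f0012345678UL;

	printf("bounds directory byte offset: %#lx\n",
	       MPX_GET_BD_ENTRY_OFFSET(ptr));
	printf("bounds table byte offset:     %#lx\n",
	       MPX_GET_BT_ENTRY_OFFSET(ptr));
	return 0;
}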

[PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-11 Thread Qiaowei Ren
This patch adds two prctl() commands to provide an explicit interaction
mechanism to enable or disable the management of bounds tables in the kernel,
including on-demand kernel allocation (See the patch "on-demand kernel
allocation of bounds tables") and cleanup (See the patch "cleanup unused
bound tables"). Applications do not strictly need the kernel to manage
bounds tables and we expect some applications to use MPX without taking
advantage of the kernel support. This means the kernel can not simply
infer whether an application needs bounds table management from the
MPX registers. prctl() is an explicit signal from userspace.

PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to
require the kernel's help in managing bounds tables. And
PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace doesn't
want the kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT, the kernel
won't allocate or free bounds tables, even if the CPU supports the MPX
feature.

PR_MPX_ENABLE_MANAGEMENT will do an xsave, fetch the base address
of the bounds directory from the xsave buffer and then cache it into the new
field "bd_addr" of struct mm_struct. PR_MPX_DISABLE_MANAGEMENT will
set "bd_addr" to an invalid address. We can then check "bd_addr" to
judge whether the management of bounds tables in the kernel is enabled.

xsaves are expensive, so "bd_addr" is cached to reduce the
number of xsaves we have to do at munmap() time. But we still have to do
xsave to get the value of BNDSTATUS at #BR fault time. In addition,
with this caching, userspace can't just move the bounds directory
around willy-nilly. For sane applications, the base address of the bounds
directory won't change; otherwise we would be in a world of hurt.
But we will still check whether it has been changed by userspace at #BR fault
time.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |9 
 arch/x86/include/asm/mpx.h |   11 +
 arch/x86/include/asm/processor.h   |   18 +++
 arch/x86/kernel/mpx.c  |   88 
 arch/x86/kernel/setup.c|8 +++
 arch/x86/kernel/traps.c|   30 -
 arch/x86/mm/mpx.c  |   25 +++---
 fs/exec.c  |2 +
 include/asm-generic/mmu_context.h  |5 ++
 include/linux/mm_types.h   |3 +
 include/uapi/linux/prctl.h |6 +++
 kernel/sys.c   |   12 +
 12 files changed, 198 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 166af2a..e33ddb7 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -10,6 +10,7 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
+#include <asm/mpx.h>
 #ifndef CONFIG_PARAVIRT
 #include <asm-generic/mm_hooks.h>
 
@@ -102,4 +103,12 @@ do {   \
 } while (0)
 #endif
 
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+   struct vm_area_struct *vma)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   mm->bd_addr = MPX_INVALID_BOUNDS_DIR;
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..32f13f5 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -5,6 +5,12 @@
 #include <asm/ptrace.h>
 #include <asm/insn.h>
 
+/*
+ * NULL is theoretically a valid place to put the bounds
+ * directory, so point this at an invalid address.
+ */
+#define MPX_INVALID_BOUNDS_DIR ((void __user *)-1)
+
 #ifdef CONFIG_X86_64
 
 /* upper 28 bits [47:20] of the virtual address in 64-bit used to
@@ -43,6 +49,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
@@ -61,6 +68,10 @@ struct mpx_insn {
 
 #define MAX_MPX_INSN_SIZE  15
 
+static inline int kernel_managing_mpx_tables(struct mm_struct *mm)
+{
+   return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR);
+}
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 020142f..b35aefa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_ENABLE_MANAGEMENT(tsk) mpx_enable_management((tsk))
+#define MPX_DISABLE_MANAGEMENT(tsk)mpx_disable_management((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_enable_management(struct task_struct *tsk);
+extern int mpx_disable_management(struct task_struct *tsk);
+#else
+static inline int mpx_enable_management(struct task_struct *tsk)
+{
+  
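The mpx_enable_management()/mpx_disable_management() bodies are cut off in this
archive, so the sketch below is only a runnable model of the bookkeeping the
changelog describes, not the patch's code: cache the 4 KB-aligned BNDCFGU base
in a bd_addr field when the enable bit is set, and park bd_addr at
MPX_INVALID_BOUNDS_DIR otherwise. The low-12-bit base mask (BNDCFGU_BASE_MASK)
is an assumption about the BNDCFGU layout, not taken from this posting.

#include <stdio.h>

/* names mirrored from the hunks above; the base mask is an assumption */
#define MPX_INVALID_BOUNDS_DIR	((unsigned long)-1)
#define MPX_BNDCFG_ENABLE_FLAG	0x1UL
#define BNDCFGU_BASE_MASK	(~0xfffUL)	/* assumed: low 12 bits are flags */

struct fake_mm {
	unsigned long bd_addr;		/* stands in for mm->bd_addr */
};

static int kernel_managing_mpx_tables(struct fake_mm *mm)
{
	return mm->bd_addr != MPX_INVALID_BOUNDS_DIR;
}

/* what PR_MPX_ENABLE_MANAGEMENT boils down to: cache the directory base */
static int mpx_enable_management(struct fake_mm *mm, unsigned long bndcfgu)
{
	if (!(bndcfgu & MPX_BNDCFG_ENABLE_FLAG))
		return -1;		/* the app never enabled MPX in BNDCFGU */
	mm->bd_addr = bndcfgu & BNDCFGU_BASE_MASK;
	return 0;
}

static void mpx_disable_management(struct fake_mm *mm)
{
	mm->bd_addr = MPX_INVALID_BOUNDS_DIR;
}

int main(void)
{
	struct fake_mm mm = { .bd_addr = MPX_INVALID_BOUNDS_DIR };
	unsigned long bndcfgu = 0x7f1234568000UL | MPX_BNDCFG_ENABLE_FLAG;

	mpx_enable_management(&mm, bndcfgu);
	printf("managed=%d bd_addr=%#lx\n",
	       kernel_managing_mpx_tables(&mm), mm.bd_addr);
	mpx_disable_management(&mm);
	printf("managed=%d\n", kernel_managing_mpx_tables(&mm));
	return 0;
}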

[PATCH v9 00/12] Intel MPX support

2014-10-11 Thread Qiaowei Ren
Changes since v6:
  * because arch_vma_name is removed, this patchset has to set MPX
specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Changes since v8:
  * add new patch to rename cfg_reg_u and status_reg.
  * add new patch to use disabled features from Dave's patches.
  * add new patch to sync struct siginfo for IA64.
  * rename two new prctl() commands to PR_MPX_ENABLE_MANAGEMENT and
PR_MPX_DISABLE_MANAGEMENT, check whether the management of bounds
tables in kernel is enabled at #BR fault time, and add locking to
protect the access to 'bd_addr'.
  * update the documentation file to add more content about on-demand
allocation of bounds tables, etc..

Qiaowei Ren (12):
  mm: distinguish VMAs with different vm_ops
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT,
PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX


Qiaowei Ren (12):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT,
PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  245 +++
 arch/ia64/include/uapi/asm/siginfo.h |8 +-
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/disabled-features.h |8 +-
 arch/x86/include/asm/mmu_context.h   |   25 ++
 arch/x86/include/asm/mpx.h   |  101 ++
 arch/x86/include/asm/processor.h |   22 ++-
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c|  488 ++
 arch/x86/kernel/setup.c  |8 +
 arch/x86/kernel/traps.c  |   86 ++-
 arch/x86/mm/Makefile |2 +
 arch/x86/mm/mpx.c|  385 +++
 fs/exec.c|2 +
 fs/proc/task_mmu.c   |1 +
 include/asm-generic/mmu_context.h|   11 +
 include/linux/mm.h   |6 +
 include/linux/mm_types.h |3 +
 include/uapi/asm-generic/siginfo.h   |9 +-
 include/uapi/linux/prctl.h   |6 +
 kernel/signal.c  |4 +
 kernel/sys.c |   12 +
 mm/mmap.c|2 +
 24 files changed, 1436 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c
 create mode 100644 arch/x86/mm/mpx.c



[PATCH v9 02/12] x86, mpx: rename cfg_reg_u and status_reg

2014-10-11 Thread Qiaowei Ren
According to the Intel SDM extensions, the MPX configuration and status
registers are named BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and
status_reg to bndcfgu and bndstatus.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/processor.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..020142f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -379,8 +379,8 @@ struct bndregs_struct {
 } __packed;
 
 struct bndcsr_struct {
-   u64 cfg_reg_u;
-   u64 status_reg;
+   u64 bndcfgu;
+   u64 bndstatus;
 } __packed;
 
 struct xsave_hdr_struct {
-- 
1.7.1



[PATCH v9 02/12] x86, mpx: rename cfg_reg_u and status_reg

2014-10-11 Thread Qiaowei Ren
According to Intel SDM extension, MPX configuration and status registers
should be BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and
status_reg to bndcfgu and bndstatus.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/processor.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..020142f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -379,8 +379,8 @@ struct bndregs_struct {
 } __packed;
 
 struct bndcsr_struct {
-   u64 cfg_reg_u;
-   u64 status_reg;
+   u64 bndcfgu;
+   u64 bndstatus;
 } __packed;
 
 struct xsave_hdr_struct {
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v9 00/12] Intel MPX support

2014-10-11 Thread Qiaowei Ren
 toset MPX
specific -vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Changes since v8:
  * add new patch to rename cfg_reg_u and status_reg.
  * add new patch to use disabled features from Dave's patches.
  * add new patch to sync struct siginfo for IA64.
  * rename two new prctl() commands to PR_MPX_ENABLE_MANAGEMENT and
PR_MPX_DISABLE_MANAGEMENT, check whether the management of bounds
tables in kernel is enabled at #BR fault time, and add locking to
protect the access to 'bd_addr'.
  * update the documentation file to add more content about on-demand
allocation of bounds tables, etc..

Qiaowei Ren (12):
  mm: distinguish VMAs with different vm_ops
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disaabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT,
PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX


Qiaowei Ren (12):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disaabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT,
PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  245 +++
 arch/ia64/include/uapi/asm/siginfo.h |8 +-
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/disabled-features.h |8 +-
 arch/x86/include/asm/mmu_context.h   |   25 ++
 arch/x86/include/asm/mpx.h   |  101 ++
 arch/x86/include/asm/processor.h |   22 ++-
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c|  488 ++
 arch/x86/kernel/setup.c  |8 +
 arch/x86/kernel/traps.c  |   86 ++-
 arch/x86/mm/Makefile |2 +
 arch/x86/mm/mpx.c|  385 +++
 fs/exec.c|2 +
 fs/proc/task_mmu.c   |1 +
 include/asm-generic/mmu_context.h|   11 +
 include/linux/mm.h   |6 +
 include/linux/mm_types.h |3 +
 include/uapi/asm-generic/siginfo.h   |9 +-
 include/uapi/linux/prctl.h   |6 +
 kernel/signal.c  |4 +
 kernel/sys.c |   12 +
 mm/mmap.c|2 +
 24 files changed, 1436 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c
 create mode 100644 arch/x86/mm/mpx.c

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-11 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch does't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because the
generic decoder is very heavyweight not just in terms of performance
but in terms of interface -- because it has to.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include linux/types.h
 #include asm/ptrace.h
+#include asm/insn.h
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 2103b5e..b7e4c0e 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -10,6 +10,275 @@
 #include linux/syscalls.h
 #include asm/mpx.h
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn-modrm.value;
+   unsigned char sib = (unsigned char)insn-sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn-rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn-rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn-rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced be instruction
+ * for rm=3 returning the content of the rm reg
+ * for rm!=3 calculates the address using SIB and Disp
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn-modrm.value;
+   unsigned char sib = (unsigned char)insn-sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn-sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1  X86_SIB_SCALE(sib

[PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-11 Thread Qiaowei Ren
There are two mappings in play: 1. The mapping with the actual data,
which userspace is munmap()ing or brk()ing away, etc... 2. The mapping
for the bounds table *backing* the data (is tagged with mpx_vma_ops,
see the patch add MPX specific mmap interface).

If userspace use the prctl() indroduced earlier in this patchset to
enable the management of bounds tables in kernel, when it unmaps the
first kind of mapping with the actual data, kernel needs to free the
mapping for the bounds table backing the data. This patch calls
arch_unmap() at the very end of do_unmap() to do so. This will walk
the directory to look at the entries covered in the data vma and unmaps
the bounds table which is referenced from the directory and then clears
the directory entry.

Unmapping of bounds tables is called under vm_munmap() of the data VMA.
So we have to check -vm_ops to prevent recursion. This recursion represents
having bounds tables for bounds tables, which should not occur normally.
Being strict about it here helps ensure that we do not have an exploitable
stack overflow.

Once we unmap the bounds table, we would have a bounds directory entry
pointing at empty address space. That address space could now be allocated
for some other (random) use, and the MPX hardware is now going to go
trying to walk it as if it were a bounds table. That would be bad. So
any unmapping of a bounds table has to be accompanied by a corresponding
write to the bounds directory entry to have it invalid. That write to
the bounds directory can fault.

Since we are doing the freeing from munmap() (and other paths like it),
we hold mmap_sem for write. If we fault, the page fault handler will
attempt to acquire mmap_sem for read and we will deadlock. For now, to
avoid deadlock, we disable page faults while touching the bounds directory
entry. This keeps us from being able to free the tables in this case.
This deficiency will be addressed in later patches.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mmu_context.h |   16 ++
 arch/x86/include/asm/mpx.h |9 +
 arch/x86/mm/mpx.c  |  317 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 350 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index e33ddb7..2b52d1b 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -111,4 +111,20 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 #endif
 }
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Userspace never asked us to manage the bounds tables,
+* so refuse to help.
+*/
+   if (!kernel_managing_mpx_tables(current-mm))
+   return;
+
+   mpx_notify_unmap(mm, vma, start, end);
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 32f13f5..a1a0155 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -48,6 +48,13 @@
 #define MPX_BD_SIZE_BYTES (1UL(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  addr)(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS))  MPX_BD_ENTRY_MASK)  MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  addr)MPX_IGN_BITS)  \
+   MPX_BT_ENTRY_MASK)  MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -73,6 +80,8 @@ static inline int kernel_managing_mpx_tables(struct mm_struct 
*mm)
return (mm-bd_addr != MPX_INVALID_BOUNDS_DIR);
 }
 unsigned long mpx_mmap(unsigned long len);
+void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 376f2ee..dcc6621 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -1,7 +1,16 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren qiaowei@intel.com
+ * Dave Hansen dave.han...@intel.com
+ */
+
 #include linux/kernel.h
 #include linux/syscalls.h
 #include asm/mpx.h
 #include asm/mman.h
+#include asm/mmu_context.h
 #include linux/sched/sysctl.h
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -13,6 +22,11 @@ static struct vm_operations_struct mpx_vma_ops = {
.name = mpx_mapping_name,
 };
 
+int is_mpx_vma(struct

[PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-11 Thread Qiaowei Ren
This patch adds two prctl() commands to provide one explicit interaction
mechanism to enable or disable the management of bounds tables in kernel,
including on-demand kernel allocation (See the patch on-demand kernel
allocation of bounds tables) and cleanup (See the patch cleanup unused
bound tables). Applications do not strictly need the kernel to manage
bounds tables and we expect some applications to use MPX without taking
advantage of the kernel support. This means the kernel can not simply
infer whether an application needs bounds table management from the
MPX registers. prctl() is an explicit signal from userspace.

PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to
require kernel's help in managing bounds tables. And
PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace don't
want kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT, kernel
won't allocate and free the bounds table, even if the CPU supports MPX
feature.

PR_MPX_ENABLE_MANAGEMENT will do an xsave and fetch the base address
of bounds directory from the xsave buffer and then cache it into new
filed bd_addr of struct mm_struct. PR_MPX_DISABLE_MANAGEMENT will
set bd_addr to one invalid address. Then we can check bd_addr to
judge whether the management of bounds tables in kernel is enabled.

xsaves are expensive, so bd_addr is kept for caching to reduce the
number of we have to do at munmap() time. But we still have to do
xsave to get the value of BNDSTATUS at #BR fault time. In addition,
with this caching, userspace can't just move the bounds directory
around willy-nilly. For sane applications, base address of the bounds
directory won't be changed, otherwise we would be in a world of hurt.
But we will still check whether it is changed by users at #BR fault
time.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mmu_context.h |9 
 arch/x86/include/asm/mpx.h |   11 +
 arch/x86/include/asm/processor.h   |   18 +++
 arch/x86/kernel/mpx.c  |   88 
 arch/x86/kernel/setup.c|8 +++
 arch/x86/kernel/traps.c|   30 -
 arch/x86/mm/mpx.c  |   25 +++---
 fs/exec.c  |2 +
 include/asm-generic/mmu_context.h  |5 ++
 include/linux/mm_types.h   |3 +
 include/uapi/linux/prctl.h |6 +++
 kernel/sys.c   |   12 +
 12 files changed, 198 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 166af2a..e33ddb7 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -10,6 +10,7 @@
 #include asm/pgalloc.h
 #include asm/tlbflush.h
 #include asm/paravirt.h
+#include asm/mpx.h
 #ifndef CONFIG_PARAVIRT
 #include asm-generic/mm_hooks.h
 
@@ -102,4 +103,12 @@ do {   \
 } while (0)
 #endif
 
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+   struct vm_area_struct *vma)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   mm-bd_addr = MPX_INVALID_BOUNDS_DIR;
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..32f13f5 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -5,6 +5,12 @@
 #include asm/ptrace.h
 #include asm/insn.h
 
+/*
+ * NULL is theoretically a valid place to put the bounds
+ * directory, so point this at an invalid address.
+ */
+#define MPX_INVALID_BOUNDS_DIR ((void __user *)-1)
+
 #ifdef CONFIG_X86_64
 
 /* upper 28 bits [47:20] of the virtual address in 64-bit used to
@@ -43,6 +49,7 @@
 #define MPX_BT_SIZE_BYTES (1UL(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
@@ -61,6 +68,10 @@ struct mpx_insn {
 
 #define MAX_MPX_INSN_SIZE  15
 
+static inline int kernel_managing_mpx_tables(struct mm_struct *mm)
+{
+   return (mm-bd_addr != MPX_INVALID_BOUNDS_DIR);
+}
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 020142f..b35aefa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_ENABLE_MANAGEMENT(tsk) mpx_enable_management((tsk))
+#define MPX_DISABLE_MANAGEMENT(tsk)mpx_disable_management((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_enable_management(struct task_struct *tsk);
+extern int mpx_disable_management(struct task_struct *tsk);
+#else
+static inline int mpx_enable_management(struct

[PATCH v9 08/12] ia64: sync struct siginfo with general version

2014-10-11 Thread Qiaowei Ren
New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for IA64 with
general version.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/ia64/include/uapi/asm/siginfo.h |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h 
b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
unsigned int _flags;/* see below */
unsigned long _isr; /* isr */
short _addr_lsb;/* lsb of faulting address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF (__SI_FAULT|3)  /* paragraph stack overflow */
+#define __SEGV_PSTKOVF (__SI_FAULT|4)  /* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV   3
+#define NSIGSEGV   4
 
 #undef NSIGTRAP
 #define NSIGTRAP   4
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v9 12/12] x86, mpx: add documentation on Intel MPX

2014-10-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 Documentation/x86/intel_mpx.txt |  245 +++
 1 files changed, 245 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..3c20a17
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,245 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
+introduced into Intel Architecture. Intel MPX provides hardware features
+that can be used in conjunction with compiler changes to check memory
+references, for those references whose compile-time normal intentions are
+usurped at runtime due to buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture Instruction
+Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection
+Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead, which
+can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How to get the advantage of MPX
+==
+
+For MPX to work, changes are required in the kernel, binutils and compiler.
+No source changes are required for applications, just a recompile.
+
+There are a lot of moving parts of this to all work right. The following
+is how we expect the compiler, application and kernel to work together.
+
+1) Application developer compiles with -fmpx. The compiler will add the
+   instrumentation as well as some setup code called early after the app
+   starts. New instruction prefixes are noops for old CPUs.
+2) That setup code allocates (virtual) space for the bounds directory,
+   points the bndcfgu register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that locationWe note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+
+To summarize, there are essentially three things interacting here:
+
+GCC with -fmpx:
+ * enables annotation of code with MPX instructions and prefixes
+ * inserts code early in the application to call in to the gcc runtime
+GCC MPX Runtime:
+ * Checks for hardware MPX support in cpuid leaf
+ * allocates virtual space for the bounds directory (malloc() essentially)
+ * points the hardware BNDCFGU register at the directory
+ * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to
+   start managing the bounds directories
+Kernel MPX Code:
+ * Checks for hardware MPX support in cpuid leaf
+ * Handles #BR exceptions and sends SIGSEGV to the app when it violates
+   bounds, like during a buffer overflow.
+ * When bounds are spilled in to an unallocated bounds table, the kernel
+   notices in the #BR exception, allocates the virtual space, then
+   updates the bounds directory to point to the new table. It keeps
+   special track of the memory with a VM_MPX flag.
+ * Frees unused bounds tables at the time that the memory they described
+   is unmapped.
+
+
+3. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * new bounds tables (BT) need to be allocated to save bounds.
+  * bounds violation caused by MPX instructions.
+
+We hook #BR handler to handle these two new situations.
+
+On-demand kernel allocation of bounds tables
+
+
+MPX only has 4 hardware registers for storing bounds information. If
+MPX

[PATCH v9 03/12] x86, mpx: add MPX specific mmap interface

2014-10-11 Thread Qiaowei Ren
We have to do the allocation of bounds tables in kernel (See the patch
on-demand kernel allocation of bounds tables). Moreover, if we want
to track MPX VMAs we need to be able to stick new VM_MPX flag and a
specific vm_ops for MPX in the vma_area_struct.

But there are not suitable interfaces to do this in current kernel.
Existing interfaces, like do_mmap_pgoff(), could not stick specific
-vm_ops in the vma_area_struct when a VMA is created. So, this patch
adds MPX specific mmap interface to do the allocation of bounds tables.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b663e1..e5bcc70 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU  ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32  SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include linux/types.h
+#include asm/ptrace.h
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include linux/kernel.h
+#include linux/syscalls.h
+#include asm/mpx.h
+#include asm/mman.h
+#include linux/sched/sysctl.h
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return [mpx];
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * this is really a simplified vm_mmap. it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;
+
+   vma = find_vma(mm, ret);
+   if (!vma) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   vma->vm_ops = &mpx_vma_ops;
+
+   if (vm_flags & VM_LOCKED

[PATCH v9 01/12] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-10-11 Thread Qiaowei Ren
MPX-enabled applications using large swaths of memory can potentially
have large numbers of bounds tables in process address space to save
bounds information. These tables can take up huge swaths of memory
(as much as 80% of the memory on the system) even if we clean them
up aggressively. In the worst-case scenario, the tables can be 4x the
size of the data structure being tracked. IOW, a 1-page structure can
require 4 bounds-table pages.

Being this huge, our expectation is that folks using MPX are going to
be keen on figuring out how much memory is being dedicated to it. So
we need a way to track memory use for MPX.

If we want to specifically track MPX VMAs we need to be able to
distinguish them from normal VMAs, and keep them from getting merged
with normal VMAs. A new VM_ flag set only on MPX VMAs does both of
those things. With this flag, MPX bounds-table VMAs can be distinguished
from other VMAs, and userspace can also walk /proc/$pid/smaps to get
memory usage for MPX.

In addition to this flag, we also introduce a specific ->vm_ops for MPX
VMAs (see the patch "add MPX specific mmap interface"), but currently
VMAs with different ->vm_ops are not prevented from merging. We
understand that VM_ flags are scarce and are open to other options.
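
As an illustration of the smaps angle above, a hypothetical user-space
helper (not part of this series) could total the resident bounds-table
memory of a process roughly like this; it assumes the "[mpx]" vma name
added by the MPX mmap-interface patch shows up in /proc/$pid/smaps:

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/self/smaps", "r");	/* or /proc/$pid/smaps */
	char line[512];
	unsigned long start, end;
	long rss_kb, total_kb = 0;
	int in_mpx = 0;

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* A new mapping header looks like "start-end perms ..." */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2)
			in_mpx = (strstr(line, "[mpx]") != NULL);
		else if (in_mpx && sscanf(line, "Rss: %ld", &rss_kb) == 1)
			total_kb += rss_kb;
	}
	fclose(f);
	printf("[mpx] Rss: %ld kB\n", total_kb);
	return 0;
}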

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
-- 
1.7.1



[PATCH v9 07/12] mips: sync struct siginfo with general version

2014-10-11 Thread Qiaowei Ren
New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for MIPS with
general version.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v9 05/12] x86, mpx: on-demand kernel allocation of bounds tables

2014-10-11 Thread Qiaowei Ren
MPX only has 4 hardware registers for storing bounds information.
If MPX-enabled code needs more than these 4 registers, it needs to
spill them somewhere. It has two special instructions for this
which allow the bounds to be moved between the bounds registers
and some new bounds tables.

They are similar conceptually to a page fault and will be raised by
the MPX hardware both during bounds violations and when the tables
are not present. This patch handles those #BR exceptions for
not-present tables by carving the space out of the normal process's
address space (essentially calling the new mmap() interface introduced
earlier in this patch set) and then pointing the bounds-directory
over to it.

The tables *need* to be accessed and controlled by userspace because
the instructions for moving bounds in and out of them are extremely
frequent. They potentially happen every time a register pointing to
memory is dereferenced. Any direct kernel involvement (like a syscall)
to access the tables would obviously destroy performance.

 Why not do this in userspace? 

This patch is obviously doing this allocation in the kernel.
However, MPX does not strictly *require* anything in the kernel.
It can theoretically be done completely from userspace. Here are
a few ways this *could* be done. I don't think any of them are
practical in the real-world, but here they are.

Q: Can virtual space simply be reserved for the bounds tables so
   that we never have to allocate them?
A: As noted earlier, these tables are *HUGE*. An X-GB virtual
   area needs 4*X GB of virtual space, plus 2GB for the bounds
   directory. If we were to preallocate them for the 128TB of
   user virtual address space, we would need to reserve 512TB+2GB,
   which is larger than the entire virtual address space today.
   This means they can not be reserved ahead of time. Also, a
   single process's pre-populated bounds directory consumes 2GB
   of virtual *AND* physical memory. IOW, it's completely
   infeasible to prepopulate bounds directories. (The arithmetic
   is spelled out in the short sketch after this Q&A.)

Q: Can we preallocate bounds table space at the same time memory
   is allocated which might contain pointers that might eventually
   need bounds tables?
A: This would work if we could hook the site of each and every
   memory allocation syscall. This can be done for small,
   constrained applications. But, it isn't practical at a larger
   scale since a given app has no way of controlling how all the
   parts of the app might allocate memory (think libraries). The
   kernel is really the only place to intercept these calls.

Q: Could a bounds fault be handed to userspace and the tables
   allocated there in a signal handler instead of in the kernel?
A: (thanks to tglx) mmap() is not on the list of safe async
   handler functions and even if mmap() would work it still
   requires locking or nasty tricks to keep track of the
   allocation state there.

Having ruled out all of the userspace-only approaches for managing
bounds tables that we could think of, we create them on demand in
the kernel.
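
A quick check of the numbers in the first answer above (the sketch
referenced there); this is plain user-space arithmetic, no kernel
interfaces involved:

#include <stdio.h>

int main(void)
{
	/* Figures quoted in the changelog: 64-bit, 128TB of user VA. */
	unsigned long long user_va  = 128ULL << 40;	/* 128TB            */
	unsigned long long bd_bytes = 2ULL << 30;	/* bounds directory */
	unsigned long long bt_bytes = 4 * user_va;	/* worst case: 4x   */

	printf("tables: %llu TB, directory: %llu GB\n",
	       bt_bytes >> 40, bd_bytes >> 30);		/* 512 TB + 2 GB */
	return 0;
}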

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   20 +
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  101 
 arch/x86/kernel/traps.c|   52 ++-
 4 files changed, 173 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
+#define MPX_BT_ADDR_MASK   (~((1UL<<MPX_BD_ENTRY_TAIL)-1))
+
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BD_ENTRY_VALID_FLAG0x1
 
 unsigned long mpx_mmap(unsigned long len);
 
+#ifdef CONFIG_X86_INTEL_MPX
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+#else
+static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index ada2e2d..9ece662 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -43,6 +43,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj

[PATCH v9 04/12] x86, mpx: add MPX to disabled features

2014-10-11 Thread Qiaowei Ren
This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as
both a runtime and compile-time check.

When CONFIG_X86_INTEL_MPX is disabled,
cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at
compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid
flag will be checked at runtime.
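
The mechanism can be pictured with a small toy model (this is not the
kernel implementation, just the idea): when the Kconfig option is off,
the "disabled" mask bit is a compile-time constant, so the compiler can
drop the whole runtime branch.

#include <stdbool.h>
#include <stdio.h>

#define FEATURE_MPX	14			/* hypothetical bit number */

#ifdef CONFIG_X86_INTEL_MPX
# define DISABLE_MPX	0
#else
# define DISABLE_MPX	(1u << FEATURE_MPX)
#endif

#define DISABLED_MASK	(DISABLE_MPX)

static bool cpu_has(int bit)			/* stand-in for a cpuid test */
{
	(void)bit;
	return true;				/* pretend the CPU has MPX */
}

static bool feature_enabled(int bit)
{
	if (DISABLED_MASK & (1u << bit))	/* constant-folded away */
		return false;
	return cpu_has(bit);			/* otherwise check at runtime */
}

int main(void)
{
	printf("MPX usable: %d\n", feature_enabled(FEATURE_MPX));
	return 0;
}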

This patch must be applied after another of Dave's commits:
  381aa07a9b4e1f82969203e9e4863da2a157781d

Signed-off-by: Dave Hansen dave.han...@linux.intel.com
Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/disabled-features.h |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h 
b/arch/x86/include/asm/disabled-features.h
index 97534a7..f226df0 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -10,6 +10,12 @@
  * cpu_feature_enabled().
  */
 
+#ifdef CONFIG_X86_INTEL_MPX
+# define DISABLE_MPX   0
+#else
+# define DISABLE_MPX   (1<<(X86_FEATURE_MPX & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME   (1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR   (1<<(X86_FEATURE_K6_MTRR & 31))
@@ -34,6 +40,6 @@
 #define DISABLED_MASK6 0
 #define DISABLED_MASK7 0
 #define DISABLED_MASK8 0
-#define DISABLED_MASK9 0
+#define DISABLED_MASK9 (DISABLE_MPX)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
-- 
1.7.1



[PATCH v9 06/12] mpx: extend siginfo structure to include bound violation information

2014-10-11 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.
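
For context, a user-space consumer would see these fields roughly as in
the sketch below. It assumes a libc whose headers already expose
SEGV_BNDERR and the new _addr_bnd members (si_lower/si_upper); the
fallback define mirrors the value this patch adds, and fprintf in a
signal handler is good enough for a demo only.

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

#ifndef SEGV_BNDERR
# define SEGV_BNDERR	3	/* assumed userspace value of (__SI_FAULT|3) */
#endif

static void br_handler(int sig, siginfo_t *si, void *ctx)
{
	(void)sig;
	(void)ctx;
	if (si->si_code == SEGV_BNDERR)
		fprintf(stderr, "bounds violation at %p, bounds [%p, %p]\n",
			si->si_addr, si->si_lower, si->si_upper);
	_exit(1);
}

int main(void)
{
	struct sigaction sa;

	sa.sa_sigaction = br_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, NULL);

	/* ... run MPX-instrumented code that eventually faults ... */
	return 0;
}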

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a machine
to run both MPX-enabled software and legacy software that is MPX-unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and system library support.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support are available in a
separate branch in the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

This patchset has been tested on a real internal hardware platform at Intel.
We have some simple unit tests in user space, which directly call MPX
instructions to produce #BR faults to let the kernel allocate bounds tables
and cause bounds violations. We also compiled several benchmarks with an
MPX-enabled GCC/Glibc and ICC, and ran them with this patch set.
We found a number of bugs in this code during these tests.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
decode mpx instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order track precisely MPX memory usage, add MPX specific
mmap interface and one VM_MPX flag to check whether a VMA
is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with the general version to avoid
    a build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set an MPX
    specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  

[PATCH v8 02/10] x86, mpx: add MPX specific mmap interface

2014-09-11 Thread Qiaowei Ren
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX-specific memory usage, this interface is added
to stick the new vm_flag VM_MPX in the vm_area_struct when creating a
bounds table or bounds directory.

These bounds tables can take huge amounts of memory.  In the
worst-case scenario, the tables can be 4x the size of the data
structure being tracked. IOW, a 1-page structure can require 4
bounds-table pages.

My expectation is that folks using MPX are going to be keen on
figuring out how much memory is being dedicated to it. With this
feature, plus some grepping in /proc/$pid/smaps one could take a
pretty good stab at it.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 778178f..935aa69 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <linux/sched/sysctl.h>
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;
+
+

[PATCH v8 03/10] x86, mpx: add macro cpu_has_mpx

2014-09-11 Thread Qiaowei Ren
In order to do performance optimization, this patch adds macro
cpu_has_mpx which will directly return 0 when MPX is not supported
by kernel.

The community gave a lot of comments on this macro cpu_has_mpx in the
previous version. Dave will introduce a patchset about disabled features
to fix it later.

In this code:
if (cpu_has_mpx)
do_some_mpx_thing();

The patch series from Dave will introduce a new macro cpu_feature_enabled()
(if merged after this patchset) to replace the cpu_has_mpx.
if (cpu_feature_enabled(X86_FEATURE_MPX))
do_some_mpx_thing();

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index bb9b258..82ec7ed 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -353,6 +353,12 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1



[PATCH v8 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-09-11 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v8 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-09-11 Thread Qiaowei Ren
An MPX-enabled application will possibly create a lot of bounds tables
in its process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
-- 
1.7.1



[PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-11 Thread Qiaowei Ren
Due to new fields about bound violation added into struct siginfo,
this patch syncs it with general version to avoid build issue.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v8 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-09-11 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal processes address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.
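
To make the directory-entry layout concrete, here is a small sketch of
how an entry is packed and unpacked using the masks this patch defines
(illustrative only; the in-kernel code below additionally has to access
user memory and cope with races):

#include <assert.h>
#include <stdint.h>

#define MPX_BD_ENTRY_VALID_FLAG	0x1UL
#define MPX_BD_ENTRY_TAIL	3	/* 64-bit value from this patch */
#define MPX_BT_ADDR_MASK	(~((1UL << MPX_BD_ENTRY_TAIL) - 1))

/* Pack a bounds-table address into a bounds-directory entry. */
static uint64_t bd_entry_pack(uint64_t bt_addr)
{
	return (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
}

/* Unpack it again; 0 means "no valid bounds table here yet". */
static uint64_t bd_entry_unpack(uint64_t bd_entry)
{
	if (!(bd_entry & MPX_BD_ENTRY_VALID_FLAG))
		return 0;
	return bd_entry & MPX_BT_ADDR_MASK;
}

int main(void)
{
	uint64_t entry = bd_entry_pack(0x7f0000400000UL);	/* example addr */

	assert(bd_entry_unpack(entry) == 0x7f0000400000UL);
	return 0;
}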

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 +++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   58 
 arch/x86/kernel/traps.c|   55 -
 4 files changed, 133 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#include 
+#include 
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return bt_addr;
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* there is a existing bounds table pointed at this bounds
+* directory entry, and so we need to free the bounds table
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return allocate_bt((long __user *)bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0d0e922..396a88b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -228,7 +229,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long 
error_code) \
 
 DO_ERROR(X86_TRAP_DE, SIGFPE,  "divide error", divide_error)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds",   bounds)
 DO_ERROR(X86_TRAP_UD, SIGILL,  "invalid opcode",   invalid_op)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment 
overrun",coprocessor_segment_overrun)
 DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS",  invalid_TSS)
@@ -278,6 +278,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, 
long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+   enum ctx_state prev_state;
+   unsigned long status;
+   struct xsave_struct *xsave_buf;
+   struct task_struct *tsk = current;
+
+   prev_state = exception_enter();
+   if (notify_die(DIE_TRAP, "bounds",

[PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-11 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resource on the x86 platform.

The base of the bounds directory is set into mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether one application is mpx enabled.
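
For reference, the runtime side would end up doing roughly the following
once it has allocated the bounds directory and enabled MPX via
XSAVE/XRSTOR. This is a sketch: the constants are the ones added below,
and the unused prctl arguments are simply passed as zero.

#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_MPX_REGISTER
# define PR_MPX_REGISTER	43	/* values added by this patch */
# define PR_MPX_UNREGISTER	44
#endif

int main(void)
{
	/* Let the kernel read BNDCFG once and cache bd_addr in mm_struct. */
	if (prctl(PR_MPX_REGISTER, 0, 0, 0, 0))
		perror("PR_MPX_REGISTER");

	/* ... run MPX-instrumented code ... */

	if (prctl(PR_MPX_UNREGISTER, 0, 0, 0, 0))
		perror("PR_MPX_UNREGISTER");
	return 0;
}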

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   55 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..b801fea 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 7ef6e39..b86873a 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,61 @@
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* runtime in the userspace will be responsible for allocation of
+* the bounds directory. Then, it will save the base of the bounds
+* directory into XSAVE/XRSTOR Save Area and enable MPX through
+* XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. In order to do
+* performance optimization, here we get the base of the bounds
+* directory and then save it into mm_struct to be used in future.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 enum reg_type {
REG_TYPE_RM = 0,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e0b286..760aee3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGISTER  44
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index ce81291..9a43587 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -91,6 +91,

[PATCH v8 10/10] x86, mpx: add documentation on Intel MPX

2014-09-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..ccffeee
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX.
+We need to decode MPX instructions to get violation address and
+set this address into extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to the violation address, and the new '_addr_bnd'
+field refers to the upper/lower bounds when a #BR is caused.
+
+Glibc will also be updated to support this new siginfo. So users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution for this issue is to hook do_munmap() to check
+whether one process is MPX enabled. If yes, those bounds tables covered
+in the virtual address region which is being unmapped will be freed also.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocation of the
+bounds directory. So the kernel has to use the XSAVE instruction to get
+the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. In order to do performance
+optimization, we have to add a new prctl command to get the base of the
+bounds directory once and reuse it in the future.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155#define PR_MPX_REGISTER 43
+156#define PR_MPX_UNREGISTER   44
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether one application is mpx enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in userspace. In fact, it is also not necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, a bounds table will be
+created in the kernel on demand and the kernel will not transfer this fault to
+userspace. So userspace can't receive a #BR fault for an invalid entry, and
+it is also not necessary for users to create bounds tables by themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set valid bit
+of bounds entry

[PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-11 Thread Qiaowei Ren
Since the kernel allocated those tables on-demand without userspace
knowledge, it is also responsible for freeing them when the associated
mappings go away.

Here, the solution for this issue is to hook do_munmap() to check
whether one process is MPX enabled. If yes, those bounds tables covered
in the virtual address region which is being unmapped will be freed also.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  252 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 285 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 166af2a..d13e01c 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifndef CONFIG_PARAVIRT
 #include 
 
@@ -102,4 +103,19 @@ do {   \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from MPX-enabled application.
+* If so, release this vma related bound tables.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e1b28e6..feb1f01 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -1,7 +1,16 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -77,3 +86,246 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr)
+{
+   int valid;
+
+   if (!access_ok(VERIFY_READ, (bd_entry), sizeof(*(bd_entry))))
+   return -EFAULT;
+
+   pagefault_disable();
+   if (get_user(*bt_addr, bd_entry))
+   goto out;
+   pagefault_enable();
+
+   valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero, and meanwhile
+* the valid bit is zero, one SIGSEGV will be produced due to
+* this unexpected situation.
+*/
+   if (!valid && *bt_addr)
+   return -EINVAL;
+   if (!valid)
+   return -ENOENT;
+
+   return 0;
+
+out:
+   pagefault_enable();
+   return -EFAULT;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static int __must_check zap_bt_entries(struct mm_struct *mm,
+   unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   /*
+* The table entry comes from userspace and could be
+* pointing anywhere, so make sure it is at least
+* pointing to valid memory.
+*/
+   if (!vma || !(vma->vm_flags & VM_MPX) ||
+   vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   zap_page_range(vma, start, end - start, NULL);
+   return 0;
+}
+
+static in

[PATCH v8 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-09-11 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because the
generic decoder is very heavyweight not just in terms of performance
but in terms of interface -- because it has to.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <asm/ptrace.h>
+#include <asm/insn.h>
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 88d660f..7ef6e39 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,275 @@
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced by the instruction
+ * for rm=3 returning the content of the rm reg
+ * for rm!=3 calculates the address using SIB and Disp
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+

[PATCH v8 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-09-11 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch does't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because the
generic decoder is very heavyweight not just in terms of performance
but in terms of interface -- because it has to.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include linux/types.h
 #include asm/ptrace.h
+#include asm/insn.h
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 88d660f..7ef6e39 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,275 @@
 #include linux/syscalls.h
 #include asm/mpx.h
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced be instruction
+ * for rm=3 returning the content of the rm reg
+ * for rm!=3 calculates the address using SIB and Disp
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));

[PATCH v8 10/10] x86, mpx: add documentation on Intel MPX

2014-09-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..ccffeee
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX,
+we need to decode the MPX instruction to get the violation address
+and set this address into the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field holds the lower/upper bounds when a #BR is caused.
+
+Glibc will also be updated to support this new siginfo. So users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution for this issue is to hook do_munmap() to check
+whether one process is MPX enabled. If yes, those bounds tables covered
+in the virtual address region which is being unmapped will be freed also.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocation of the
+bounds directory. So the kernel has to use the XSAVE instruction to get
+the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. In order to do performance
+optimization, we have to add a new prctl command to get the base of the
+bounds directory once so it can be used in the future.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155#define PR_MPX_REGISTER 43
+156#define PR_MPX_UNREGISTER   44
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether one application is mpx enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them from userspace. In fact, it is also not necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, a bounds table will
+be created in the kernel on demand and the kernel will not transfer this
+fault to userspace. So userspace can't receive a #BR fault for an invalid
+entry, and it is also not necessary for users to create bounds tables by
+themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set valid

[PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-11 Thread Qiaowei Ren
Since the kernel allocated those tables on-demand without userspace
knowledge, it is also responsible for freeing them when the associated
mappings go away.

Here, the solution for this issue is to hook do_munmap() to check
whether one process is MPX enabled. If yes, those bounds tables covered
in the virtual address region which is being unmapped will be freed also.
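
As a rough sketch of the bookkeeping involved (illustration only, not
part of this patch): given the MPX_GET_BD_ENTRY_OFFSET() helper added
below, the range of bounds-directory entries touched by an unmapped
region could be walked like this; mpx_bd_range_sketch() is an invented
name and its body is deliberately incomplete:

    /*
     * Illustration only: visit the bounds-directory entries whose
     * covered pointers fall inside [start, end).  Assumes the macros
     * from asm/mpx.h in this patch and mm->bd_addr from patch 08.
     */
    static void mpx_bd_range_sketch(struct mm_struct *mm,
                                    unsigned long start, unsigned long end)
    {
            unsigned long first = MPX_GET_BD_ENTRY_OFFSET(start);
            unsigned long last  = MPX_GET_BD_ENTRY_OFFSET(end - 1);
            unsigned long off;

            /* one bounds-directory entry is (1 << MPX_BD_ENTRY_SHIFT) bytes */
            for (off = first; off <= last; off += (1UL << MPX_BD_ENTRY_SHIFT)) {
                    long __user *bd_entry = mm->bd_addr + off;

                    /* read *bd_entry, then free the bounds table it points
                     * to if no other mapping still needs it ... */
            }
    }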

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  252 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 285 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 166af2a..d13e01c 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -10,6 +10,7 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
+#include <asm/mpx.h>
 #ifndef CONFIG_PARAVIRT
 #include <asm-generic/mm_hooks.h>
 
@@ -102,4 +103,19 @@ do {   \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from MPX-enabled application.
+* If so, release this vma related bound tables.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e1b28e6..feb1f01 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -1,7 +1,16 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren qiaowei@intel.com
+ * Dave Hansen dave.han...@intel.com
+ */
+
 #include <linux/kernel.h>
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 #include <asm/mman.h>
+#include <asm/mmu_context.h>
 #include <linux/sched/sysctl.h>
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -77,3 +86,246 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr)
+{
+   int valid;
+
+   if (!access_ok(VERIFY_READ, (bd_entry), sizeof(*(bd_entry))))
+   return -EFAULT;
+
+   pagefault_disable();
+   if (get_user(*bt_addr, bd_entry))
+   goto out;
+   pagefault_enable();
+
+   valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero, and meanwhile
+* the valid bit is zero, one SIGSEGV will be produced due to
+* this unexpected situation.
+*/
+   if (!valid && *bt_addr)
+   return -EINVAL;
+   if (!valid)
+   return -ENOENT;
+
+   return 0;
+
+out:
+   pagefault_enable();
+   return -EFAULT;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static int __must_check zap_bt_entries(struct mm_struct *mm,
+   unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   /*
+* The table entry comes from userspace and could be
+* pointing anywhere, so make sure it is at least
+* pointing to valid memory.
+*/
+   if (!vma || !(vma->vm_flags & VM_MPX

[PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-11 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resource on the x86 platform.

The base of the bounds directory is set into mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether one application is mpx enabled.
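
A minimal userspace sketch of how a runtime might use the two commands
(illustration only; it assumes the prctl numbers from this patch and
that MPX has already been enabled through XSAVE/XRSTOR by the runtime):

    #include <stdio.h>
    #include <sys/prctl.h>

    /* values from this patch; not yet in installed uapi headers */
    #ifndef PR_MPX_REGISTER
    #define PR_MPX_REGISTER   43
    #define PR_MPX_UNREGISTER 44
    #endif

    int main(void)
    {
            /*
             * The runtime has already allocated the bounds directory and
             * written its base into the XSAVE area; the kernel caches
             * that base in mm->bd_addr on this call.
             */
            if (prctl(PR_MPX_REGISTER, 0, 0, 0, 0))
                    perror("PR_MPX_REGISTER");

            /* ... run MPX-instrumented code ... */

            if (prctl(PR_MPX_UNREGISTER, 0, 0, 0, 0))
                    perror("PR_MPX_UNREGISTER");
            return 0;
    }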

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   55 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..b801fea 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 7ef6e39..b86873a 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,61 @@
 #include <linux/kernel.h>
 #include <linux/syscalls.h>
+#include <linux/prctl.h>
 #include <asm/mpx.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* runtime in the userspace will be responsible for allocation of
+* the bounds directory. Then, it will save the base of the bounds
+* directory into XSAVE/XRSTOR Save Area and enable MPX through
+* XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. In order to do
+* performance optimization, here we get the base of the bounds
+* directory and then save it into mm_struct to be used in future.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 enum reg_type {
REG_TYPE_RM = 0,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e0b286..760aee3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGISTER  44
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index ce81291..9a43587 100644

[PATCH v8 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-09-11 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal processes address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.
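
For a feel of the sizes and indexing involved, here is a small
userspace illustration using the 64-bit constants from this patch (the
program and its printouts are not part of the patch):

    #include <stdio.h>

    /* 64-bit layout: 28-bit BD index (8-byte entries), 17-bit BT index
     * (32-byte entries), 3 ignored low bits */
    #define BD_ENTRY_OFFSET 28
    #define BD_ENTRY_SHIFT  3
    #define BT_ENTRY_OFFSET 17
    #define BT_ENTRY_SHIFT  5
    #define IGN_BITS        3

    int main(void)
    {
            unsigned long ptr = 0x00007f1234567890UL;  /* arbitrary pointer */
            unsigned long bd_index = (ptr >> (BT_ENTRY_OFFSET + IGN_BITS)) &
                                     ((1UL << BD_ENTRY_OFFSET) - 1);
            unsigned long bt_index = (ptr >> IGN_BITS) &
                                     ((1UL << BT_ENTRY_OFFSET) - 1);

            /* 2048 MB directory, 4 MB per bounds table */
            printf("bounds directory size: %lu MB\n",
                   (1UL << (BD_ENTRY_OFFSET + BD_ENTRY_SHIFT)) >> 20);
            printf("bounds table size:     %lu MB\n",
                   (1UL << (BT_ENTRY_OFFSET + BT_ENTRY_SHIFT)) >> 20);
            printf("%#lx -> BD entry %lu, BT entry %lu\n",
                   ptr, bd_index, bt_index);
            return 0;
    }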

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   20 +++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   58 
 arch/x86/kernel/traps.c|   55 -
 4 files changed, 133 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
 #define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
 #define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
 #define MPX_BT_ADDR_MASK   (~((1UL<<MPX_BD_ENTRY_TAIL)-1))
 
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BD_ENTRY_VALID_FLAG0x1
 
 unsigned long mpx_mmap(unsigned long len);
 
+#ifdef CONFIG_X86_INTEL_MPX
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+#else
+static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index ada2e2d..9ece662 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -43,6 +43,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..88d660f
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,58 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return bt_addr;
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* there is a existing bounds table pointed at this bounds
+* directory entry, and so we need to free the bounds table
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return

[PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-11 Thread Qiaowei Ren
Due to new fields about bound violation added into struct siginfo,
this patch syncs it with general version to avoid build issue.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v8 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-09-11 Thread Qiaowei Ren
MPX-enabled application will possibly create a lot of bounds tables
in process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.
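
For concreteness, option (2) reduces to a plain VMA walk once the flag
exists; the function below is only an invented sketch, not something
this patch adds (the caller would need to hold mmap_sem):

    /* Illustration: account bounds-table memory by summing VM_MPX VMAs. */
    static unsigned long mpx_vma_bytes_sketch(struct mm_struct *mm)
    {
            struct vm_area_struct *vma;
            unsigned long bytes = 0;

            for (vma = mm->mmap; vma; vma = vma->vm_next)
                    if (vma->vm_flags & VM_MPX)
                            bytes += vma->vm_end - vma->vm_start;

            return bytes;
    }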

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
 [ilog2(VM_GROWSDOWN)]   = "gd",
 [ilog2(VM_PFNMAP)]  = "pf",
 [ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
 [ilog2(VM_LOCKED)]  = "lo",
 [ilog2(VM_IO)]  = "io",
 [ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
-- 
1.7.1



[PATCH v8 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-09-11 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.
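
A minimal userspace sketch of consuming the new fields (illustration
only; it assumes a glibc whose siginfo.h already exposes si_lower,
si_upper and SEGV_BNDERR as added by this series):

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static void br_handler(int sig, siginfo_t *si, void *ctx)
    {
            if (si->si_code == SEGV_BNDERR)
                    fprintf(stderr, "bound violation at %p, bounds [%p, %p]\n",
                            si->si_addr, si->si_lower, si->si_upper);
            _exit(1);
    }

    int main(void)
    {
            struct sigaction sa;

            memset(&sa, 0, sizeof(sa));
            sa.sa_sigaction = br_handler;
            sa.sa_flags = SA_SIGINFO;
            sigaction(SIGSEGV, &sa, NULL);

            /* ... run MPX-instrumented code that violates a bound ... */
            return 0;
    }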

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
 if (from->si_code == BUS_MCEERR_AR || from->si_code == 
 BUS_MCEERR_AO)
 err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
 err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v8 03/10] x86, mpx: add macro cpu_has_mpx

2014-09-11 Thread Qiaowei Ren
In order to do performance optimization, this patch adds macro
cpu_has_mpx which will directly return 0 when MPX is not supported
by kernel.

Community gave a lot of comments on this macro cpu_has_mpx in previous
version. Dave will introduce a patchset about disabled features to fix
it later.

In this code:
if (cpu_has_mpx)
do_some_mpx_thing();

The patch series from Dave will introduce a new macro cpu_feature_enabled()
(if merged after this patchset) to replace the cpu_has_mpx.
if (cpu_feature_enabled(X86_FEATURE_MPX))
do_some_mpx_thing();

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index bb9b258..82ec7ed 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -353,6 +353,12 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1



[PATCH v8 02/10] x86, mpx: add MPX specific mmap interface

2014-09-11 Thread Qiaowei Ren
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX specific memory usage, this interface is added
to stick the new vm_flag VM_MPX in the vm_area_struct when creating a
bounds table or bounds directory.

These bounds tables can take huge amounts of memory.  In the
worst-case scenario, the tables can be 4x the size of the data
structure being tracked. IOW, a 1-page structure can require 4
bounds-table pages.

My expectation is that folks using MPX are going to be keen on
figuring out how much memory is being dedicated to it. With this
feature, plus some grepping in /proc/$pid/smaps one could take a
pretty good stab at it.
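
The grepping mentioned above can be scripted trivially; here is a small
C illustration (not part of the patch, parsing deliberately naive) that
sums the Size: fields of all [mpx] mappings in /proc/self/smaps:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            FILE *f = fopen("/proc/self/smaps", "r");
            char line[256];
            unsigned long kb, total_kb = 0;
            int in_mpx = 0;

            if (!f)
                    return 1;
            while (fgets(line, sizeof(line), f)) {
                    char *dash  = strchr(line, '-');
                    char *space = strchr(line, ' ');

                    if (dash && space && dash < space)      /* mapping header */
                            in_mpx = strstr(line, "[mpx]") != NULL;
                    else if (in_mpx && sscanf(line, "Size: %lu kB", &kb) == 1)
                            total_kb += kb;
            }
            fclose(f);
            printf("MPX bounds memory: %lu kB\n", total_kb);
            return 0;
    }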

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 778178f..935aa69 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
def_bool y
 depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32  SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <linux/sched/sysctl.h>
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out

[PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed to allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in Intel(R) Architecture
Instruction Set Extensions Programming Reference.

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support is available in a
separate branch in common GCC SVN repository. See GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

This patchset has been tested on real internal hardware platform at Intel.
We have some simple unit tests in user space, which directly call MPX
instructions to produce #BR to let kernel allocate bounds tables and
cause bounds violations. We also compiled several benchmarks with an
MPX-enabled Gcc/Glibc and ICC, and ran them with this patch set.
We found a number of bugs in this code in these tests.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
decode mpx instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * for kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order track precisely MPX memory usage, add MPX specific
mmap interface and one VM_MPX flag to check whether a VMA
is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with general version to avoid
build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set MPX
specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add

[PATCH v7 02/10] x86, mpx: add MPX specific mmap interface

2014-07-20 Thread Qiaowei Ren
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX specific memory usage, this interface is added
to stick the new vm_flag VM_MPX in the vm_area_struct when creating a
bounds table or bounds directory.

These bounds tables can take huge amounts of memory.  In the
worst-case scenario, the tables can be 4x the size of the data
structure being tracked. IOW, a 1-page structure can require 4
bounds-table pages.

My expectation is that folks using MPX are going to be keen on
figuring out how much memory is being dedicated to it. With this
feature, plus some grepping in /proc/$pid/smaps one could take a
pretty good stab at it.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a8f749e..020db35 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -238,6 +238,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <linux/sched/sysctl.h>
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;
+
+

[PATCH v7 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-07-20 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index a4077e9..2131636 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
 err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
 err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v7 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-07-20 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal processes address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   60 
 arch/x86/kernel/traps.c|   55 +++-
 4 files changed, 135 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#include 
+#include 
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return bt_addr;
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* there is a existing bounds table pointed at this bounds
+* directory entry, and so we need to free the bounds table
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   pr_debug("Allocate bounds table %lx at entry %p\n",
+   bt_addr, bd_entry);
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return allocate_bt((long __user *)bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0d0e922..396a88b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include <asm/mpx.h>
 
 #ifdef CONFIG_X86_64
 #include 
@@ -228,7 +229,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long 
error_code) \
 
 DO_ERROR(X86_TRAP_DE, SIGFPE,  "divide error", divide_error)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds",   bounds)
 DO_ERROR(X86_TRAP_UD, SIGILL,  "invalid opcode",   invalid_op)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment 
overrun",coprocessor_segment_overrun)
 DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS",  invalid_TSS)
@@ -278,6 +278,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, 
long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+   enum ctx_state prev_state;
+   unsigned long status;
+   struct xsave_struct *xsave_buf;
+   struct task_stru

[PATCH v7 00/10] Intel MPX support

2014-07-20 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed to allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support is available in a
separate branch in common GCC SVN repository. See GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

In addition, this patchset has been tested on Intel internal hardware
platform for MPX testing.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
decode mpx instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * for kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order track precisely MPX memory usage, add MPX specific
mmap interface and one VM_MPX flag to check whether a VMA
is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with general version to avoid
build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set MPX
specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  127 +++
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/cpufeature.h|6 +
 arch/x86/include/asm/mmu_context.h   |   16 ++
 arch/x86/include/asm/mpx.h   |   91 
 arch/x86/include/asm/processor.h |   18 ++
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c 

[PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-20 Thread Qiaowei Ren
Since the kernel allocated those tables on-demand without userspace
knowledge, it is also responsible for freeing them when the associated
mappings go away.

Here, the solution for this issue is to hook do_munmap() to check
whether one process is MPX enabled. If yes, those bounds tables covered
in the virtual address region which is being unmapped will be freed also.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  181 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 214 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index be12c53..af70d4f 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,6 +6,7 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
+#include <asm/mpx.h>
 #ifndef CONFIG_PARAVIRT
 #include <asm-generic/mm_hooks.h>
 
@@ -96,4 +97,19 @@ do { \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from MPX-enabled application.
+* If so, release this vma related bound tables.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e1b28e6..d29ec9c 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -2,6 +2,7 @@
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 #include <asm/mman.h>
+#include <asm/mmu_context.h>
 #include <linux/sched/sysctl.h>
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -77,3 +78,183 @@ out:
	up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr,
+   unsigned int *valid)
+{
+   if (get_user(*bt_addr, bd_entry))
+   return -EFAULT;
+
+   *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero, and meanwhile
+* the valid bit is zero, one SIGSEGV will be produced due to
+* this unexpected situation.
+*/
+   if (!(*valid) && *bt_addr)
+   force_sig(SIGSEGV, current);
+
+   return 0;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   if (!vma || vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return;
+
+   zap_page_range(vma, start, end, NULL);
+}
+
+static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry,
+   unsigned long bt_addr)
+{
+   if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry,
+   bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0))
+   return;
+
+   /*
+* to avoid recursion, do_munmap() will check whether it comes
+* from one bounds table through VM_MPX flag.
+*/
+   do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+}
+
+/*
+ * If the bounds table pointed by bounds directory 'bd_entry' is
+ * not shared, unmap this whole bounds table. Otherwise, only free
+ * those backing physical pages of bounds table entries covered
+ * i

[PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-20 Thread Qiaowei Ren
In order to do performance optimization, this patch adds macro
cpu_has_mpx which will directly return 0 when MPX is not supported
by kernel.

Community gave a lot of comments on this macro cpu_has_mpx in previous
version. Dave will introduce a patchset about disabled features to fix
it later.

In this code:
if (cpu_has_mpx)
do_some_mpx_thing();

The patch series from Dave will introduce a new macro cpu_feature_enabled()
(if merged after this patchset) to replace the cpu_has_mpx.
if (cpu_feature_enabled(X86_FEATURE_MPX))
do_some_mpx_thing();

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e265ff9..f302d08 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-07-20 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because the
generic decoder is very heavyweight not just in terms of performance
but in terms of interface -- because it has to.
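
To make the decoding step concrete, a condensed sketch of what the
limited decoder has to reconstruct for a memory operand; it mirrors the
get_addr_ref() helper in the diff below, and assumes modrm/sib hold the
decoded ModRM/SIB bytes:

	/* effective address = base + index * 2^scale + displacement */
	if (X86_MODRM_MOD(modrm) == 3) {
		/* operand is a register, not memory */
		addr = get_reg(insn, regs, REG_TYPE_RM);
	} else if (insn->sib.nbytes) {
		addr = get_reg(insn, regs, REG_TYPE_BASE) +
		       get_reg(insn, regs, REG_TYPE_INDEX) *
				(1 << X86_SIB_SCALE(sib)) +
		       insn->displacement.value;
	} else {
		addr = get_reg(insn, regs, REG_TYPE_RM) +
		       insn->displacement.value;
	}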

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <asm/ptrace.h>
+#include <asm/insn.h>
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index f02dcea..c1957a8 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,275 @@
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced by the instruction:
+ * for rm==3, return the content of the rm reg;
+ * for rm!=3, calculate the address using SIB and Disp
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+

[PATCH v7 10/10] x86, mpx: add documentation on Intel MPX

2014-07-20 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX,
+we need to decode the MPX instruction to get the violation address
+and set this address into the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field holds the lower/upper bounds in effect when the
+#BR is raised.
+
+Glibc will also be updated to support this new siginfo, so users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution is to hook do_munmap() and check whether the
+process is MPX enabled. If so, the bounds tables covering the virtual
+address region being unmapped are freed as well.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to
+get the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, new prctl commands are added so that the base of the
+bounds directory is fetched once and cached for future use.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155    #define PR_MPX_REGISTER     41
+156    #define PR_MPX_UNREGISTER   42
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether one application is mpx enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in userspace. In fact, it is also not necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, the bounds table
+will be created in the kernel on demand and the kernel will not transfer
+this fault to userspace. So userspace can't receive a #BR fault for an
+invalid entry, and it is also not necessary for users to create bounds
+tables by themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set valid bit
+of bounds entry

[PATCH v7 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-07-20 Thread Qiaowei Ren
MPX-enabled application will possibly create a lot of bounds tables
in process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.
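
Once the flag exists, option (2) really is just a linear VMA walk; a
minimal sketch of the accounting it enables (illustrative only, locking
omitted):

static unsigned long mpx_table_bytes(struct mm_struct *mm)
{
	struct vm_area_struct *vma;
	unsigned long bytes = 0;

	for (vma = mm->mmap; vma; vma = vma->vm_next)
		if (vma->vm_flags & VM_MPX)
			bytes += vma->vm_end - vma->vm_start;
	return bytes;
}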

Signed-off-by: Qiaowei Ren 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cfa63ee..b2bc755 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e03dd29..44c75d7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+/* MPX specific bounds table or bounds directory (x86) */
+#define VM_MPX 0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 06/10] mips: sync struct siginfo with general version

2014-07-20 Thread Qiaowei Ren
Due to the new bound violation fields added to struct siginfo,
this patch syncs the MIPS version with the general one to avoid a build issue.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-07-20 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resources on the x86 platform.

The base of the bounds directory is set into mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether one application is mpx enabled.
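
From the application (or MPX runtime) side, the expected call sequence
looks roughly like the sketch below. Note that the documentation patch in
this series quotes the values 41/42 while the uapi header added here
defines 43/44, so a real caller should rely on the PR_MPX_* macros from
its kernel headers rather than hard-coding numbers:

	#include <stdio.h>
	#include <sys/prctl.h>

	static int enable_mpx_accounting(void)
	{
		/*
		 * The runtime must already have allocated the bounds
		 * directory and enabled MPX via XRSTOR before this call,
		 * otherwise the kernel sees BNDCFG disabled and the
		 * prctl() fails with EINVAL.
		 */
		if (prctl(PR_MPX_REGISTER, 0, 0, 0, 0)) {
			perror("PR_MPX_REGISTER");
			return -1;
		}
		return 0;
	}

	/* ... and on teardown: prctl(PR_MPX_UNREGISTER, 0, 0, 0, 0); */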

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   56 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6e0966e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index c1957a8..6b7e526 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,62 @@
 #include <linux/kernel.h>
 #include <linux/syscalls.h>
+#include <linux/prctl.h>
 #include <asm/mpx.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* runtime in the userspace will be responsible for allocation of
+* the bounds directory. Then, it will save the base of the bounds
+* directory into XSAVE/XRSTOR Save Area and enable MPX through
+* XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. In order to do
+* performance optimization, here we get the base of the bounds
+* directory and then save it into mm_struct to be used in future.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   pr_debug("MPX BD base address %p\n", mm->bd_addr);
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 enum reg_type {
REG_TYPE_RM = 0,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 96c5750..131b5b3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGISTER  44
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 

[PATCH v7 06/10] mips: sync struct siginfo with general version

2014-07-20 Thread Qiaowei Ren
Due to the new bound violation fields added to struct siginfo,
this patch syncs the MIPS version with the general one to avoid a build issue.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-07-20 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resources on the x86 platform.

The base of the bounds directory is set into mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether one application is mpx enabled.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   56 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6e0966e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index c1957a8..6b7e526 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,62 @@
 #include linux/kernel.h
 #include linux/syscalls.h
+#include linux/prctl.h
 #include asm/mpx.h
+#include asm/i387.h
+#include asm/fpu-internal.h
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* runtime in the userspace will be responsible for allocation of
+* the bounds directory. Then, it will save the base of the bounds
+* directory into XSAVE/XRSTOR Save Area and enable MPX through
+* XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. In order to do
+* performance optimization, here we get the base of the bounds
+* directory and then save it into mm_struct to be used in future.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   pr_debug("MPX BD base address %p\n", mm->bd_addr);
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 enum reg_type {
REG_TYPE_RM = 0,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 96c5750..131b5b3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGISTER  44
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel

[PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-20 Thread Qiaowei Ren
In order to do performance optimization, this patch adds macro
cpu_has_mpx which will directly return 0 when MPX is not supported
by kernel.

Community gave a lot of comments on this macro cpu_has_mpx in previous
version. Dave will introduce a patchset about disabled features to fix
it later.

In this code:
if (cpu_has_mpx)
do_some_mpx_thing();

The patch series from Dave will introduce a new macro cpu_feature_enabled()
(if merged after this patchset) to replace the cpu_has_mpx.
if (cpu_feature_enabled(X86_FEATURE_MPX))
do_some_mpx_thing();

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e265ff9..f302d08 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-07-20 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because the
generic decoder is very heavyweight not just in terms of performance
but in terms of interface -- because it has to.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include linux/types.h
 #include asm/ptrace.h
+#include asm/insn.h
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index f02dcea..c1957a8 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,275 @@
 #include linux/syscalls.h
 #include asm/mpx.h
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn-modrm.value;
+   unsigned char sib = (unsigned char)insn-sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn-rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn-rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn-rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced by the instruction:
+ * for rm==3, return the content of the rm reg;
+ * for rm!=3, calculate the address using SIB and Disp
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn-modrm.value;
+   unsigned char sib = (unsigned char)insn-sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn-sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1  X86_SIB_SCALE(sib

[PATCH v7 10/10] x86, mpx: add documentation on Intel MPX

2014-07-20 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX,
+we need to decode the MPX instruction to get the violation address
+and set this address into the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field holds the lower/upper bounds in effect when the
+#BR is raised.
+
+Glibc will also be updated to support this new siginfo, so users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution for this issue is to hook do_munmap() to check
+whether one process is MPX enabled. If yes, those bounds tables covered
+in the virtual address region which is being unmapped will be freed also.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to
+get the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, new prctl commands are added so that the base of the
+bounds directory is fetched once and cached for future use.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155    #define PR_MPX_REGISTER     41
+156    #define PR_MPX_UNREGISTER   42
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether one application is mpx enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in userspace. In fact, it is also not necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, the bounds table
+will be created in the kernel on demand and the kernel will not transfer
+this fault to userspace. So userspace can't receive a #BR fault for an
+invalid entry, and it is also not necessary for users to create bounds
+tables by themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set valid

[PATCH v7 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-07-20 Thread Qiaowei Ren
MPX-enabled application will possibly create a lot of bounds tables
in process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cfa63ee..b2bc755 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
	[ilog2(VM_GROWSDOWN)]   = "gd",
	[ilog2(VM_PFNMAP)]  = "pf",
	[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
	[ilog2(VM_LOCKED)]  = "lo",
	[ilog2(VM_IO)]  = "io",
	[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e03dd29..44c75d7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+/* MPX specific bounds table or bounds directory (x86) */
+#define VM_MPX 0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-07-20 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal process's address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.
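
For orientation, the control flow this enables in the #BR handler looks
roughly like the sketch below. It is a simplification of the traps.c hunk
in this patch, assuming the usual BNDSTATUS status-code meanings; SIGBUS
on allocation failure matches the v5 changelog note:

	status = xsave_buf->bndcsr.status_reg;

	switch (status & MPX_BNDSTA_ERROR_CODE) {
	case 2:		/* invalid bounds directory entry: allocate a table */
		if (do_mpx_bt_fault(xsave_buf))
			force_sig(SIGBUS, current);	/* allocation failed */
		break;
	case 1:		/* bounds violation: siginfo handling comes in a later patch */
	default:	/* e.g. 3: unexpected status, kill the task */
		force_sig(SIGSEGV, current);
		break;
	}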

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   20 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   60 
 arch/x86/kernel/traps.c|   55 +++-
 4 files changed, 135 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1ULMPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK   (~((1ULMPX_BNDCFG_TAIL)-1))
+#define MPX_BT_ADDR_MASK   (~((1ULMPX_BD_ENTRY_TAIL)-1))
+
 #define MPX_BD_SIZE_BYTES (1UL(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BD_ENTRY_VALID_FLAG    0x1
 
 unsigned long mpx_mmap(unsigned long len);
 
+#ifdef CONFIG_X86_INTEL_MPX
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+#else
+static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 047f9ff..5e81e16 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -43,6 +43,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..f02dcea
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,60 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return bt_addr;
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* there is a existing bounds table pointed at this bounds
+* directory entry, and so we need to free the bounds table
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   pr_debug("Allocate bounds table %lx at entry %p\n",
+   bt_addr, bd_entry);
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base

[PATCH v7 00/10] Intel MPX support

2014-07-20 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a machine to
run both MPX-enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in Intel(R) Architecture
Instruction Set Extensions Programming Reference.

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
Currently, GCC compiler sources with MPX support are available in a
separate branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. Currently, MPX-enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main responsibilities:
providing handlers for bounds faults (#BR) and managing bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

In addition, this patchset has been tested on Intel internal hardware
platform for MPX testing.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
decode mpx instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order to track MPX memory usage precisely, add an MPX-specific
mmap interface and a VM_MPX flag to check whether a VMA
is an MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with the general version to avoid
a build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set MPX-specific
->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  127 +++
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/cpufeature.h|6 +
 arch/x86/include/asm/mmu_context.h   |   16 ++
 arch/x86/include/asm/mpx.h   |   91 
 arch/x86/include/asm/processor.h |   18 ++
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c|  415

[PATCH v7 02/10] x86, mpx: add MPX specific mmap interface

2014-07-20 Thread Qiaowei Ren
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX-specific memory usage, this interface is added
to stick the new vm_flag VM_MPX in the vm_area_struct when creating a
bounds table or bounds directory.

These bounds tables can take huge amounts of memory.  In the
worst-case scenario, the tables can be 4x the size of the data
structure being tracked. IOW, a 1-page structure can require 4
bounds-table pages.

My expectation is that folks using MPX are going to be keen on
figuring out how much memory is being dedicated to it. With this
feature, plus some grepping in /proc/$pid/smaps one could take a
pretty good stab at it.
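
To put numbers on that worst case, using the constants added below
(64-bit values):

    bounds directory:  1UL << (28 + 3) = 2GB of virtual space
    one bounds table:  1UL << (17 + 5) = 4MB of virtual space
    one table entry:   1 << 5          = 32 bytes, covering one
                       8-byte-aligned pointer (2^3 ignored bits)

So densely tracked pointers cost 32 bytes of table per 8 bytes of data,
which is where the 4x figure above comes from.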

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a8f749e..020db35 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -238,6 +238,10 @@ config HAVE_INTEL_TXT
def_bool y
	depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
	depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <linux/sched/sysctl.h>
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * this is really a simplified vm_mmap. it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out

[PATCH v7 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-07-20 Thread Qiaowei Ren
This patch adds new bound violation fields to the siginfo
structure. si_lower and si_upper are, respectively, the lower and
upper bounds in effect when the bound violation occurred.
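
For the consumer side, a sketch of what a handler built on these fields
looks like, assuming userspace headers that already carry SEGV_BNDERR and
the new members (with older headers the raw _sigfault._addr_bnd layout
has to be read instead):

	#include <signal.h>
	#include <stdio.h>
	#include <unistd.h>

	static void bnd_handler(int sig, siginfo_t *si, void *ctx)
	{
		if (si->si_code == SEGV_BNDERR) {
			/* not async-signal-safe; fine for a demo */
			fprintf(stderr, "bounds violation at %p (lower=%p upper=%p)\n",
				si->si_addr, si->si_lower, si->si_upper);
		}
		_exit(1);
	}

	int main(void)
	{
		struct sigaction sa = { 0 };

		sa.sa_sigaction = bnd_handler;
		sa.sa_flags = SA_SIGINFO;
		sigaction(SIGSEGV, &sa, NULL);
		/* ... run MPX-instrumented code here ... */
		return 0;
	}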

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index a4077e9..2131636 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
			err |= __put_user(from->si_addr_lsb, to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+		err |= __put_user(from->si_lower, to->si_lower);
+		err |= __put_user(from->si_upper, to->si_upper);
+#endif
		break;
	case __SI_CHLD:
		err |= __put_user(from->si_pid, to->si_pid);
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-20 Thread Qiaowei Ren
Since the kernel allocated those tables on-demand without userspace
knowledge, it is also responsible for freeing them when the associated
mappings go away.

Here, the solution for this issue is to hook do_munmap() to check
whether the process is MPX enabled. If it is, the bounds tables covering
the virtual address region being unmapped will be freed as well.
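
As a side note (editorial, not part of the patch), the 64-bit layout
constants in this series determine how much address space each bounds
table covers, which is what makes this per-range freeing feasible. A
small standalone calculation, assuming the 64-bit values of
MPX_BD_ENTRY_OFFSET, MPX_BT_ENTRY_OFFSET and MPX_IGN_BITS from this
series:

#include <stdio.h>

#define MPX_BD_ENTRY_OFFSET 28
#define MPX_BT_ENTRY_OFFSET 17
#define MPX_IGN_BITS         3

int main(void)
{
	/* One bounds table covers the addresses sharing bits [47:20],
	 * i.e. a 1MB slice of virtual address space. */
	unsigned long bt_coverage = 1UL << (MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS);
	/* The whole bounds directory spans 2^48 bytes of user space. */
	unsigned long long bd_coverage =
		(unsigned long long)bt_coverage << MPX_BD_ENTRY_OFFSET;

	printf("one BT covers %lu KB, BD spans %llu bytes\n",
	       bt_coverage >> 10, bd_coverage);
	return 0;
}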

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  181 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 214 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index be12c53..af70d4f 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,6 +6,7 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
+#include <asm/mpx.h>
 #ifndef CONFIG_PARAVIRT
 #include <asm-generic/mm_hooks.h>
 
@@ -96,4 +97,19 @@ do { \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from MPX-enabled application.
+* If so, release this vma related bound tables.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e1b28e6..d29ec9c 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -2,6 +2,7 @@
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 #include <asm/mman.h>
+#include <asm/mmu_context.h>
 #include <linux/sched/sysctl.h>
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -77,3 +78,183 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr,
+   unsigned int *valid)
+{
+   if (get_user(*bt_addr, bd_entry))
+   return -EFAULT;
+
+   *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero, and meanwhile
+* the valid bit is zero, one SIGSEGV will be produced due to
+* this unexpected situation.
+*/
+   if (!(*valid) && *bt_addr)
+   force_sig(SIGSEGV, current);
+
+   return 0;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   if (!vma || vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return;
+
+   zap_page_range(vma, start, end, NULL);
+}
+
+static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry,
+   unsigned long bt_addr)
+{
+   if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry,
+   bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0))
+   return;
+
+   /*
+* to avoid recursion, do_munmap() will check whether it comes
+* from one bounds table through VM_MPX flag.
+*/
+   do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+}
+
+/*
+ * If the bounds table pointed

[PATCH v6 00/10] Intel MPX support

2014-06-18 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To take advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and system library support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently, GCC compiler sources with MPX support are available in a
separate branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written in assembly, and
compile Glibc with the MPX-enabled GCC compiler. Currently, MPX-enabled
Glibc sources can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code, responsible for configuring and
enabling MPX, is needed in order to make use of MPX. For most
applications this runtime support will be available by linking to a
library supplied by the compiler, or it may come directly from the OS
once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main responsibilities:
providing handlers for bounds faults (#BR) and managing bounds memory.
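
As background for the #BR handling below, here is an editorial sketch
(not code from this series) of how the handler distinguishes the two
situations, using the error-code bits of BNDSTATUS masked with
MPX_BNDSTA_ERROR_CODE: code 1 is a bounds violation, code 2 means the
bounds table is missing, and anything else (e.g. 3, per the v3
changelog) kills the process:

enum br_action { BR_SEND_SIGSEGV, BR_ALLOC_BT, BR_UNEXPECTED };

/* Hypothetical helper: classify a #BR from the BNDSTATUS register. */
static enum br_action classify_br(unsigned long bndstatus)
{
	switch (bndstatus & 0x3) {	/* MPX_BNDSTA_ERROR_CODE */
	case 1:
		return BR_SEND_SIGSEGV;	/* bounds violation */
	case 2:
		return BR_ALLOC_BT;	/* bounds table missing */
	default:
		return BR_UNEXPECTED;	/* kill the task */
	}
}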

The high-level areas modified in the patchset are as follows:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with the MPX ISA is available, but it is always
possible to use SDE (Intel(R) Software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order to track MPX memory usage precisely, add an MPX-specific
mmap interface and one VM_MPX flag to check whether a VMA
is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with the general version to avoid
a build issue.


Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  127 +++
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/cpufeature.h|6 +
 arch/x86/include/asm/mmu_context.h   |   16 ++
 arch/x86/include/asm/mpx.h   |   91 
 arch/x86/include/asm/processor.h |   18 ++
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c|  413 ++
 arch/x86/kernel/traps.c  |   62 +-
 arch/x86/mm/Makefile |2 +
 arch/x86/mm/init_64.c|2 +
 arch/x86/mm/mpx.c|  247 
 fs/proc/task_mmu.c   |1 +
 include/a

[PATCH v6 03/10] x86, mpx: add macro cpu_has_mpx

2014-06-18 Thread Qiaowei Ren
In order to allow a performance optimization, this patch adds the macro
cpu_has_mpx, which evaluates to a constant 0 when MPX is not supported
by the kernel.
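
A hypothetical kernel-context caller (mirroring mpx_register() later in
this series) would use it as in the sketch below; with
CONFIG_X86_INTEL_MPX disabled, cpu_has_mpx is the constant 0, the check
always fails, and the MPX path can be optimized away entirely:

static int mpx_setup_example(void)
{
	if (!cpu_has_mpx)
		return -EINVAL;

	/* MPX is present: continue with MPX-specific setup. */
	return 0;
}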

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e265ff9..f302d08 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-06-18 Thread Qiaowei Ren
MPX-enabled application will possibly create a lot of bounds tables
in process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.
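
For illustration (not part of this patch), once the flag and the "[mpx]"
vma name from this change are in place, the bookkeeping can be observed
from userspace by scanning /proc/<pid>/smaps; a toy scanner might look
like this:

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512];
	int mpx_vmas = 0;
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f))
		if (strstr(line, "[mpx]"))
			mpx_vmas++;
	fclose(f);
	printf("MPX bounds-table VMAs: %d\n", mpx_vmas);
	return 0;
}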

Signed-off-by: Qiaowei Ren 
---
 arch/x86/mm/init_64.c |2 ++
 fs/proc/task_mmu.c|1 +
 include/linux/mm.h|2 ++
 3 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index f35c66c..2d41679 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1223,6 +1223,8 @@ int in_gate_area_no_mm(unsigned long addr)
 
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
+   if (vma->vm_flags & VM_MPX)
+   return "[mpx]";
if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
return "[vdso]";
if (vma == &gate_vma)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 442177b..09266bd 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -543,6 +543,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d677706..029c716 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+/* MPX specific bounds table or bounds directory (x86) */
+#define VM_MPX 0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-06-18 Thread Qiaowei Ren
This patch adds new fields describing bound violations to the siginfo
structure: si_lower and si_upper hold the lower and upper bounds that
were in effect when the bound violation occurred.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 6ea13c0..0fcf749 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2773,6 +2773,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-18 Thread Qiaowei Ren
This patch adds an MPX-specific mmap interface, which only handles
MPX-related maps: the bounds tables and the bounds directory.

In order to track MPX-specific memory usage, this interface sets the
new vm_flag VM_MPX in the vm_area_struct when creating a bounds table
or bounds directory.
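
A sketch of the expected in-kernel caller (compare allocate_bt() in the
#BR patch of this series); the helper name here is hypothetical, and the
return value is either a page-aligned address or a negative errno cast
to unsigned long:

static int alloc_one_bounds_table(void)
{
	unsigned long bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);

	if (IS_ERR_VALUE(bt_addr))
		return (int)bt_addr;

	/* bt_addr is now the base of a fresh VM_MPX-flagged mapping. */
	return 0;
}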

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 +++
 arch/x86/include/asm/mpx.h |   38 
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   58 
 4 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 25d2c6f..0194790 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..546c5d1
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,58 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick the new vm_flag VM_MPX in the vm_area_struct
+ * when creating a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Make bounds tables and bounds directory unlocked. */
+   if (vm_flags & VM_LOCKED)
+   vm_flags &= ~VM_LOCKED;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+
+out:
+   up_write(&mm->mmap_sem);
+   return ret;
+}
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-06-18 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resources on the x86 platform.

The base of the bounds directory is stored in mm_struct during
PR_MPX_REGISTER command execution. This member can then be used to
check whether an application is MPX enabled.
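
For illustration (not part of this patch), an MPX runtime could invoke
the new command roughly like this; the PR_MPX_REGISTER value is taken
from the prctl.h hunk below, and the call only succeeds after the
runtime has set up the bounds directory and enabled MPX via XRSTOR:

#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_MPX_REGISTER
#define PR_MPX_REGISTER 43	/* value proposed by this patch */
#endif

int main(void)
{
	if (prctl(PR_MPX_REGISTER, 0, 0, 0, 0))
		perror("PR_MPX_REGISTER");
	/* ... run bounds-checked code; PR_MPX_UNREGISTER on teardown ... */
	return 0;
}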

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   56 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6e0966e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 650b282..d8a2a09 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,62 @@
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* runtime in the userspace will be responsible for allocation of
+* the bounds directory. Then, it will save the base of the bounds
+* directory into XSAVE/XRSTOR Save Area and enable MPX through
+* XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. In order to do
+* performance optimization, here we get the base of the bounds
+* directory and then save it into mm_struct to be used in future.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   pr_debug("MPX BD base address %p\n", mm->bd_addr);
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
 static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8967e20..54b8011 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGIST

[PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-18 Thread Qiaowei Ren
This patch sets the bound violation fields of the siginfo struct in the
#BR exception handler by decoding the faulting user instruction and
constructing the faulting pointer.
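
To make the decoding step concrete (an editorial example, not code from
the patch): for a memory operand with a SIB byte, the faulting pointer
is reconstructed as base + index * scale + displacement, as in this toy
calculation with made-up register values:

#include <stdio.h>

int main(void)
{
	unsigned long base = 0x601000;	/* e.g. contents of RAX   */
	unsigned long index = 0x10;	/* e.g. contents of RBX   */
	unsigned int scale_bits = 2;	/* SIB.scale: 1 << 2 == 4 */
	long disp = 0x8;		/* signed displacement    */

	unsigned long addr = base + (index << scale_bits) + disp;

	printf("faulting pointer: 0x%lx\n", addr);	/* 0x601048 */
	return 0;
}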

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  294 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 323 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 4230c7b..650b282 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,270 @@
 #include 
 #include 
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * return the address being referenced by the instruction:
+ * for mod == 3, return the content of the register selected by rm;
+ * for mod != 3, calculate the address using SIB and displacement
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   }
+   addr += insn->displacement.value;
+   }
+
+   return addr;
+}
+
+/* Verify next sizeof(t) bytes can be on the same instruction */
+#define validate_next(t, insn, n)  \
+   ((insn)-&

[PATCH v6 09/10] x86, mpx: cleanup unused bound tables

2014-06-18 Thread Qiaowei Ren
When a user memory region is unmapped, the related bounds tables
become unused and need to be released as well. This patch cleans up
these unused bounds tables by hooking the unmap path.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  189 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 222 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index be12c53..af70d4f 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,6 +6,7 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
+#include <asm/mpx.h>
 #ifndef CONFIG_PARAVIRT
 #include <asm-generic/mm_hooks.h>
 
@@ -96,4 +97,19 @@ do { \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from MPX-enabled application.
+* If so, release this vma related bound tables.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 546c5d1..fd05cd4 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -2,6 +2,7 @@
 #include <linux/syscalls.h>
 #include <asm/mpx.h>
 #include <asm/mman.h>
+#include <asm/mmu_context.h>
 #include <linux/sched/sysctl.h>
 
 /*
@@ -56,3 +57,191 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr,
+   unsigned int *valid)
+{
+   if (get_user(*bt_addr, bd_entry))
+   return -EFAULT;
+
+   *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero, and meanwhile
+* the valid bit is zero, one SIGSEGV will be produced due to
+* this unexpected situation.
+*/
+   if (!(*valid) && *bt_addr)
+   force_sig(SIGSEGV, current);
+
+   pr_debug("get_bt: BD Entry (%p) - Table (%lx,%d)\n",
+   bd_entry, *bt_addr, *valid);
+   return 0;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   if (!vma || vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return;
+
+   zap_page_range(vma, start, end, NULL);
+   pr_debug("Bound table de-allocation %lx (%lx, %lx)\n",
+   bt_addr, start, end);
+}
+
+static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry,
+   unsigned long bt_addr)
+{
+   if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry,
+   bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0))
+   return;
+
+   pr_debug("Bound table de-allocation %lx at entry addr %p\n",
+   bt_addr, bd_entry);
+   /*
+* to avoid recursion, do_munmap() will check whether it comes
+* from one bounds table through VM_MPX flag.
+*/
+   do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+}
+
+/*
+ * If the bounds table pointed by bounds directory 'bd_entry' is
+ * not shared, unmap thi

[PATCH v6 10/10] x86, mpx: add documentation on Intel MPX

2014-06-18 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX,
+we need to decode the MPX instruction to get the violation address
+and set this address into the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field refers to the lower/upper bounds when a #BR is caused.
+
+Glibc will also be updated to support this new siginfo, so users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution for this issue is to hook do_munmap() to check
+whether the process is MPX enabled. If it is, the bounds tables covering
+the virtual address region being unmapped will be freed as well.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to get
+the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, new prctl commands are added so that the base of the
+bounds directory is fetched once and reused later.
+
+Two new prctl commands are added to register and unregister MPX related
+resources.
+
+155#define PR_MPX_REGISTER 41
+156#define PR_MPX_UNREGISTER   42
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether an application is MPX enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in userspace. In fact, it is not even necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, a bounds table
+will be created in the kernel on demand and the kernel will not forward
+this fault to userspace. So userspace cannot receive a #BR fault for an
+invalid entry, and it is not necessary for users to create bounds tables
+themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set valid bit
+of bounds entry

[PATCH v6 06/10] mips: sync struct siginfo with general version

2014-06-18 Thread Qiaowei Ren
Due to the new bound violation fields added to struct siginfo,
this patch syncs the MIPS version with the general one to avoid a
build issue.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-06-18 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal process's address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.
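
The "4x" figure above follows directly from the entry sizes (an
editorial check, not part of the patch): a 64-bit pointer occupies 8
bytes, while its bounds-table entry holds 4 longs of data, i.e. 32
bytes:

#include <stdio.h>

int main(void)
{
	/* 4 longs of bounds data per 8-byte tracked pointer. */
	printf("overhead: %zux\n", 4 * sizeof(long) / sizeof(void *));
	return 0;
}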

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   63 
 arch/x86/kernel/traps.c|   56 ++-
 4 files changed, 139 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
+#define MPX_BT_ADDR_MASK   (~((1UL<<MPX_BD_ENTRY_TAIL)-1))
+
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BD_ENTRY_VALID_FLAG0x1
+
 unsigned long mpx_mmap(unsigned long len);
 
+#ifdef CONFIG_X86_INTEL_MPX
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+#else
+static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f4d9600..3e81aed 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..4230c7b
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,63 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr)) {
+   pr_err("Bounds table allocation failed at entry addr %p\n",
+   bd_entry);
+   return bt_addr;
+   }
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* there is a existing bounds table pointed at this bounds
+* directory entry, and so we need to free the bounds table
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   pr_debug("Allocate bounds table %lx at entry %p\n",
+   bt_addr, bd_entry);
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return allocate_bt((long __user *)bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index f73b5d4..35b9b29 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long 
error_code) \
 
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE,  "divide error",
divide_error,FPE_INTDIV, regs->ip )
 DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow",
overflow  )
-DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds",  bounds  
  )
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL,  "invalid opcode",  
invalid_op,  ILL_ILLOPN, regs->ip )
 DO_ERROR (X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment overrun", 
coprocessor_segment_overrun  

[PATCH v6 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-06-18 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal processes address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h |   20 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   63 
 arch/x86/kernel/traps.c|   56 ++-
 4 files changed, 139 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1ULMPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK   (~((1ULMPX_BNDCFG_TAIL)-1))
+#define MPX_BT_ADDR_MASK   (~((1ULMPX_BD_ENTRY_TAIL)-1))
+
 #define MPX_BD_SIZE_BYTES (1UL(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BD_ENTRY_VALID_FLAG0x1
 
 unsigned long mpx_mmap(unsigned long len);
 
+#ifdef CONFIG_X86_INTEL_MPX
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+#else
+static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f4d9600..3e81aed 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..4230c7b
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,63 @@
+#include linux/kernel.h
+#include linux/syscalls.h
+#include asm/mpx.h
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr)) {
+   pr_err(Bounds table allocation failed at entry addr %p\n,
+   bd_entry);
+   return bt_addr;
+   }
+   bt_addr = (bt_addr  MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* there is a existing bounds table pointed at this bounds
+* directory entry, and so we need to free the bounds table
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   pr_debug(Allocate bounds table %lx at entry %p\n,
+   bt_addr, bd_entry);
+   return 0;
+
+out:
+   vm_munmap(bt_addr  MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf-bndcsr.cfg_reg_u  MPX_BNDCFG_ADDR_MASK

[PATCH v6 10/10] x86, mpx: add documentation on Intel MPX

2014-06-18 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX.
+We need to decode MPX instructions to get violation address and
+set this address into extended struct siginfo.
+
+The _sigfault feild of struct siginfo is extended as follow:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to violation address, and new '_addr_and'
+field refers to the upper/lower bounds when a #BR is caused.
+
+Glibc will be also updated to support this new siginfo. So user
+can get violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution for this issue is to hook do_munmap() to check
+whether one process is MPX enabled. If yes, those bounds tables covered
+in the virtual address region which is being unmapped will be freed also.
+
+Adding new prctl commands
+-
+
+Runtime library in userspace is responsible for allocation of bounds
+directory. So kernel have to use XSAVE instruction to get the base
+of bounds directory from BNDCFG register.
+
+But XSAVE is expected to be very expensive. In order to do performance
+optimization, we have to add new prctl command to get the base of
+bounds directory to be used in future.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155#define PR_MPX_REGISTER 41
+156#define PR_MPX_UNREGISTER   42
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether one application is mpx enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in the userspace. In fact, it is not also necessary
+for users to create bounds tables in the userspace.
+
+When #BR fault is produced due to invalid entry, bounds table will be
+created in kernel on demand and kernel will not transfer this fault to
+userspace. So usersapce can't receive #BR fault for invalid entry, and
+it is not also necessary for users to create bounds tables by themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set valid

[PATCH v6 06/10] mips: sync struct siginfo with general version

2014-06-18 Thread Qiaowei Ren
Due to new fields about bound violation added into struct siginfo,
this patch syncs it with general version to avoid build issue.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v6 09/10] x86, mpx: cleanup unused bound tables

2014-06-18 Thread Qiaowei Ren
When user memory region is unmapped, related bound tables
become unused and need to be released also. This patch cleanups
these unused bound tables through hooking unmap path.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  189 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 222 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index be12c53..af70d4f 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,6 +6,7 @@
 #include asm/pgalloc.h
 #include asm/tlbflush.h
 #include asm/paravirt.h
+#include asm/mpx.h
 #ifndef CONFIG_PARAVIRT
 #include asm-generic/mm_hooks.h
 
@@ -96,4 +97,19 @@ do { \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from MPX-enabled application.
+* If so, release this vma related bound tables.
+*/
+   if (mm-bd_addr  !(vma-vm_flags  VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  addr)(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS))  MPX_BD_ENTRY_MASK)  MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  addr)MPX_IGN_BITS)  \
+   MPX_BT_ENTRY_MASK)  MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 546c5d1..fd05cd4 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -2,6 +2,7 @@
 #include linux/syscalls.h
 #include asm/mpx.h
 #include asm/mman.h
+#include asm/mmu_context.h
 #include linux/sched/sysctl.h
 
 /*
@@ -56,3 +57,191 @@ out:
up_write(mm-mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr,
+   unsigned int *valid)
+{
+   if (get_user(*bt_addr, bd_entry))
+   return -EFAULT;
+
+   *valid = *bt_addr  MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr = MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero, and meanwhile
+* the valid bit is zero, one SIGSEGV will be produced due to
+* this unexpected situation.
+*/
+   if (!(*valid) && *bt_addr)
+   force_sig(SIGSEGV, current);
+
+   pr_debug("get_bt: BD Entry (%p) -> Table (%lx,%d)\n",
+   bd_entry, *bt_addr, *valid);
+   return 0;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   if (!vma || vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return;
+
+   zap_page_range(vma, start, end, NULL);
+   pr_debug("Bound table de-allocation %lx (%lx, %lx)\n",
+   bt_addr, start, end);
+}
+
+static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry,
+   unsigned long bt_addr)
+{
+   if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry,
+   bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0))
+   return;
+
+   pr_debug("Bound table de-allocation %lx at entry addr %p\n",
+   bt_addr, bd_entry);
+   /*
+* to avoid recursion, do_munmap() will check whether it comes
+* from one bounds table through VM_MPX flag.
+*/
+   do_munmap(mm, bt_addr

[PATCH v6 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-06-18 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister
MPX-related resources on the x86 platform.

The base of the bounds directory is stored in mm_struct during
PR_MPX_REGISTER command execution. This member can then be used to
check whether an application is MPX-enabled.
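
For illustration, the minimal user-space sketch below shows how an MPX
runtime might issue the new command. The PR_MPX_REGISTER value (43)
comes from the prctl.h hunk in this patch; passing the remaining prctl()
arguments as zero is an assumption, since the kernel/sys.c hunk is not
reproduced in this excerpt.

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/prctl.h>

#ifndef PR_MPX_REGISTER
#define PR_MPX_REGISTER 43	/* value added by this patch */
#endif

int main(void)
{
	/*
	 * The MPX runtime is expected to have allocated the bounds
	 * directory and enabled MPX through the XSAVE area before this
	 * call; this sketch only demonstrates the prctl() itself.
	 * Unused arguments are passed as zero (assumption).
	 */
	if (prctl(PR_MPX_REGISTER, 0, 0, 0, 0) != 0) {
		fprintf(stderr, "PR_MPX_REGISTER failed: %s\n",
			strerror(errno));
		return 1;
	}
	printf("bounds directory registered with the kernel\n");
	return 0;
}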

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   56 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG    0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6e0966e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)    mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 650b282..d8a2a09 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,62 @@
 #include <linux/kernel.h>
 #include <linux/syscalls.h>
+#include <linux/prctl.h>
 #include <asm/mpx.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* runtime in the userspace will be responsible for allocation of
+* the bounds directory. Then, it will save the base of the bounds
+* directory into XSAVE/XRSTOR Save Area and enable MPX through
+* XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. In order to do
+* performance optimization, here we get the base of the bounds
+* directory and then save it into mm_struct to be used in future.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   pr_debug("MPX BD base address %p\n", mm->bd_addr);
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
 static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8967e20..54b8011 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER    43
