Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-27 Thread H. Peter Anvin
On 01/27/2014 01:54 PM, Andy Lutomirski wrote:
> On 01/26/2014 06:10 PM, H. Peter Anvin wrote:
>> On 01/26/2014 05:55 PM, Ren Qiaowei wrote:
>>>
>>> Peter, you mean we should remove these two call and do what they do in
>>> user-space, right?
>>>
>>
>> Unless we think there is a benefit to the kernel to have a on/off switch
>> for the #BR exception (if disabled, all #BR exceptions are signals,
>> regardless of source.)
> 
> Yes.
> 
> For example, wouldn't UML want to have all of this stuff disabled?
> Presumably it would much prefer to receive the exception directly.
> 
> The same goes for seccomp users -- as it currently stands, this code
> allows mmap without a system call.
> 
> This probably means that the prctl should (optionally) take a parameter
> that fixes the address of the L1 table -- seccomp users would probably
> want that.  (Actually, everyone might -- this is going to have weird
> results if the L1 table moves.)
> 

That seems like a good argument.  If the table has been moved by
userspace we can deliver the signal as a violation.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-27 Thread Andy Lutomirski
On 01/26/2014 06:10 PM, H. Peter Anvin wrote:
> On 01/26/2014 05:55 PM, Ren Qiaowei wrote:
>>
>> Peter, you mean we should remove these two call and do what they do in
>> user-space, right?
>>
> 
> Unless we think there is a benefit to the kernel to have a on/off switch
> for the #BR exception (if disabled, all #BR exceptions are signals,
> regardless of source.)

Yes.

For example, wouldn't UML want to have all of this stuff disabled?
Presumably it would much prefer to receive the exception directly.

The same goes for seccomp users -- as it currently stands, this code
allows mmap without a system call.

This probably means that the prctl should (optionally) take a parameter
that fixes the address of the L1 table -- seccomp users would probably
want that.  (Actually, everyone might -- this is going to have weird
results if the L1 table moves.)

--Andy

> 
> There might be, would like other people's opinion.
> 
>   -hpa
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-27 Thread Andy Lutomirski
On 01/26/2014 01:08 AM, Qiaowei Ren wrote:
> This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
> commands on the x86 platform. These commands can be used to
> init and release MPX related resource.
> 
> A MMU notifier will be registered during PR_MPX_INIT
> command execution. So the bound tables can be automatically
> deallocated when one memory area is unmapped.
> 
> Signed-off-by: Qiaowei Ren 
> ---
>  arch/x86/Kconfig |4 ++
>  arch/x86/include/asm/mpx.h   |9 
>  arch/x86/include/asm/processor.h |   16 +++
>  arch/x86/kernel/mpx.c|   84 
> ++
>  include/uapi/linux/prctl.h   |6 +++
>  kernel/sys.c |   12 +
>  6 files changed, 131 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cd18b83..28916e1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -236,6 +236,10 @@ config HAVE_INTEL_TXT
>   def_bool y
>   depends on INTEL_IOMMU && ACPI
>  
> +config HAVE_INTEL_MPX
> + def_bool y
> + select MMU_NOTIFIER
> +
>  config X86_32_SMP
>   def_bool y
>   depends on X86_32 && SMP
> diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
> index d074153..9652e9e 100644
> --- a/arch/x86/include/asm/mpx.h
> +++ b/arch/x86/include/asm/mpx.h
> @@ -30,6 +30,15 @@
>  
>  #endif
>  
> +typedef union {
> + struct {
> + unsigned long ignored:MPX_IGN_BITS;
> + unsigned long l2index:MPX_L2_BITS;
> + unsigned long l1index:MPX_L1_BITS;
> + };
> + unsigned long addr;
> +} mpx_addr;
> +
>  void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
>  
>  #endif /* _ASM_X86_MPX_H */
> diff --git a/arch/x86/include/asm/processor.h 
> b/arch/x86/include/asm/processor.h
> index fdedd38..5962413 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -943,6 +943,22 @@ extern void start_thread(struct pt_regs *regs, unsigned 
> long new_ip,
>  extern int get_tsc_mode(unsigned long adr);
>  extern int set_tsc_mode(unsigned int val);
>  
> +#ifdef CONFIG_HAVE_INTEL_MPX
> +
> +/* Init/release a process' MPX related resource */
> +#define MPX_INIT(tsk)mpx_init((tsk))
> +#define MPX_RELEASE(tsk) mpx_release((tsk))
> +
> +extern int mpx_init(struct task_struct *tsk);
> +extern int mpx_release(struct task_struct *tsk);
> +
> +#else /* CONFIG_HAVE_INTEL_MPX */
> +
> +#define MPX_INIT(tsk)(-EINVAL)
> +#define MPX_RELEASE(tsk) (-EINVAL)
> +
> +#endif /* CONFIG_HAVE_INTEL_MPX */
> +
>  extern u16 amd_get_nb_id(int cpu);
>  
>  static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t 
> leaves)
> diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
> index e055e0e..9e91178 100644
> --- a/arch/x86/kernel/mpx.c
> +++ b/arch/x86/kernel/mpx.c
> @@ -1,5 +1,7 @@
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -7,6 +9,88 @@
>  #include 
>  #include 
>  
> +static struct mmu_notifier mpx_mn;
> +
> +static void mpx_invl_range_end(struct mmu_notifier *mn,
> + struct mm_struct *mm,
> + unsigned long start, unsigned long end)
> +{
> + struct xsave_struct *xsave_buf;
> + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
> + unsigned long bt_addr;
> + unsigned long bd_base;
> + unsigned long bd_entry, bde_start, bde_end;
> + mpx_addr lap;
> +
> + pgd_t *pgd;
> + pud_t *pud;
> + pmd_t *pmd;
> + pte_t *pte;
> +
> + /* ignore swap notifications */
> + pgd = pgd_offset(mm, start);
> + pud = pud_offset(pgd, start);
> + pmd = pmd_offset(pud, start);
> + pte = pte_offset_kernel(pmd, start);
> + if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte))
> + return;
> +
> + /* get bound directory base address */
> + fpu_xsave(>thread.fpu);

I may be wrong, but I think that this will corrupt things if you call it
from within kernel_fpu_begin.

One of these days I think we should just start unconditionally doing
xsaveopt on kernel entry.

> + xsave_buf = &(current->thread.fpu.state->xsave);
> + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
> +
> + /* get related bde range */
> + lap.addr = start;
> + bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT);
> +
> + lap.addr = end;
> + if (lap.ignored || lap.l2index)
> + bde_end = bd_base + (lap.l1index< + else
> + bde_end = bd_base + (lap.l1index< +
> + for (bd_entry = bde_start; bd_entry < bde_end;
> + bd_entry += sizeof(unsigned long)) {
> +
> + if (get_user(bt_addr, (long __user *)bd_entry))
> + return;
> +
> + if (bt_addr & 0x1) {
> + user_atomic_cmpxchg_inatomic(_addr,
> + (long __user *)bd_entry, bt_addr, 0);

Shouldn't this 

Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-27 Thread Andy Lutomirski
On 01/26/2014 01:08 AM, Qiaowei Ren wrote:
 This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
 commands on the x86 platform. These commands can be used to
 init and release MPX related resource.
 
 A MMU notifier will be registered during PR_MPX_INIT
 command execution. So the bound tables can be automatically
 deallocated when one memory area is unmapped.
 
 Signed-off-by: Qiaowei Ren qiaowei@intel.com
 ---
  arch/x86/Kconfig |4 ++
  arch/x86/include/asm/mpx.h   |9 
  arch/x86/include/asm/processor.h |   16 +++
  arch/x86/kernel/mpx.c|   84 
 ++
  include/uapi/linux/prctl.h   |6 +++
  kernel/sys.c |   12 +
  6 files changed, 131 insertions(+), 0 deletions(-)
 
 diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
 index cd18b83..28916e1 100644
 --- a/arch/x86/Kconfig
 +++ b/arch/x86/Kconfig
 @@ -236,6 +236,10 @@ config HAVE_INTEL_TXT
   def_bool y
   depends on INTEL_IOMMU  ACPI
  
 +config HAVE_INTEL_MPX
 + def_bool y
 + select MMU_NOTIFIER
 +
  config X86_32_SMP
   def_bool y
   depends on X86_32  SMP
 diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
 index d074153..9652e9e 100644
 --- a/arch/x86/include/asm/mpx.h
 +++ b/arch/x86/include/asm/mpx.h
 @@ -30,6 +30,15 @@
  
  #endif
  
 +typedef union {
 + struct {
 + unsigned long ignored:MPX_IGN_BITS;
 + unsigned long l2index:MPX_L2_BITS;
 + unsigned long l1index:MPX_L1_BITS;
 + };
 + unsigned long addr;
 +} mpx_addr;
 +
  void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
  
  #endif /* _ASM_X86_MPX_H */
 diff --git a/arch/x86/include/asm/processor.h 
 b/arch/x86/include/asm/processor.h
 index fdedd38..5962413 100644
 --- a/arch/x86/include/asm/processor.h
 +++ b/arch/x86/include/asm/processor.h
 @@ -943,6 +943,22 @@ extern void start_thread(struct pt_regs *regs, unsigned 
 long new_ip,
  extern int get_tsc_mode(unsigned long adr);
  extern int set_tsc_mode(unsigned int val);
  
 +#ifdef CONFIG_HAVE_INTEL_MPX
 +
 +/* Init/release a process' MPX related resource */
 +#define MPX_INIT(tsk)mpx_init((tsk))
 +#define MPX_RELEASE(tsk) mpx_release((tsk))
 +
 +extern int mpx_init(struct task_struct *tsk);
 +extern int mpx_release(struct task_struct *tsk);
 +
 +#else /* CONFIG_HAVE_INTEL_MPX */
 +
 +#define MPX_INIT(tsk)(-EINVAL)
 +#define MPX_RELEASE(tsk) (-EINVAL)
 +
 +#endif /* CONFIG_HAVE_INTEL_MPX */
 +
  extern u16 amd_get_nb_id(int cpu);
  
  static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t 
 leaves)
 diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
 index e055e0e..9e91178 100644
 --- a/arch/x86/kernel/mpx.c
 +++ b/arch/x86/kernel/mpx.c
 @@ -1,5 +1,7 @@
  #include linux/kernel.h
  #include linux/syscalls.h
 +#include linux/prctl.h
 +#include linux/mmu_notifier.h
  #include asm/processor.h
  #include asm/mpx.h
  #include asm/mman.h
 @@ -7,6 +9,88 @@
  #include asm/fpu-internal.h
  #include asm/alternative.h
  
 +static struct mmu_notifier mpx_mn;
 +
 +static void mpx_invl_range_end(struct mmu_notifier *mn,
 + struct mm_struct *mm,
 + unsigned long start, unsigned long end)
 +{
 + struct xsave_struct *xsave_buf;
 + unsigned long bt_size = 1UL  (MPX_L2_BITS+MPX_L2_SHIFT);
 + unsigned long bt_addr;
 + unsigned long bd_base;
 + unsigned long bd_entry, bde_start, bde_end;
 + mpx_addr lap;
 +
 + pgd_t *pgd;
 + pud_t *pud;
 + pmd_t *pmd;
 + pte_t *pte;
 +
 + /* ignore swap notifications */
 + pgd = pgd_offset(mm, start);
 + pud = pud_offset(pgd, start);
 + pmd = pmd_offset(pud, start);
 + pte = pte_offset_kernel(pmd, start);
 + if (!pte_present(*pte)  !pte_none(*pte)  !pte_file(*pte))
 + return;
 +
 + /* get bound directory base address */
 + fpu_xsave(current-thread.fpu);

I may be wrong, but I think that this will corrupt things if you call it
from within kernel_fpu_begin.

One of these days I think we should just start unconditionally doing
xsaveopt on kernel entry.

 + xsave_buf = (current-thread.fpu.state-xsave);
 + bd_base = xsave_buf-bndcsr.cfg_reg_u  MPX_BNDCFG_ADDR_MASK;
 +
 + /* get related bde range */
 + lap.addr = start;
 + bde_start = bd_base + (lap.l1index  MPX_L1_SHIFT);
 +
 + lap.addr = end;
 + if (lap.ignored || lap.l2index)
 + bde_end = bd_base + (lap.l1indexMPX_L1_SHIFT) + sizeof(long);
 + else
 + bde_end = bd_base + (lap.l1indexMPX_L1_SHIFT);
 +
 + for (bd_entry = bde_start; bd_entry  bde_end;
 + bd_entry += sizeof(unsigned long)) {
 +
 + if (get_user(bt_addr, (long __user *)bd_entry))
 + return;
 +
 + if (bt_addr  0x1) {
 + user_atomic_cmpxchg_inatomic(bt_addr,
 + (long __user 

Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-27 Thread Andy Lutomirski
On 01/26/2014 06:10 PM, H. Peter Anvin wrote:
 On 01/26/2014 05:55 PM, Ren Qiaowei wrote:

 Peter, you mean we should remove these two call and do what they do in
 user-space, right?

 
 Unless we think there is a benefit to the kernel to have a on/off switch
 for the #BR exception (if disabled, all #BR exceptions are signals,
 regardless of source.)

Yes.

For example, wouldn't UML want to have all of this stuff disabled?
Presumably it would much prefer to receive the exception directly.

The same goes for seccomp users -- as it currently stands, this code
allows mmap without a system call.

This probably means that the prctl should (optionally) take a parameter
that fixes the address of the L1 table -- seccomp users would probably
want that.  (Actually, everyone might -- this is going to have weird
results if the L1 table moves.)

--Andy

 
 There might be, would like other people's opinion.
 
   -hpa
 
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-27 Thread H. Peter Anvin
On 01/27/2014 01:54 PM, Andy Lutomirski wrote:
 On 01/26/2014 06:10 PM, H. Peter Anvin wrote:
 On 01/26/2014 05:55 PM, Ren Qiaowei wrote:

 Peter, you mean we should remove these two call and do what they do in
 user-space, right?


 Unless we think there is a benefit to the kernel to have a on/off switch
 for the #BR exception (if disabled, all #BR exceptions are signals,
 regardless of source.)
 
 Yes.
 
 For example, wouldn't UML want to have all of this stuff disabled?
 Presumably it would much prefer to receive the exception directly.
 
 The same goes for seccomp users -- as it currently stands, this code
 allows mmap without a system call.
 
 This probably means that the prctl should (optionally) take a parameter
 that fixes the address of the L1 table -- seccomp users would probably
 want that.  (Actually, everyone might -- this is going to have weird
 results if the L1 table moves.)
 

That seems like a good argument.  If the table has been moved by
userspace we can deliver the signal as a violation.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 10:10 AM, H. Peter Anvin wrote:

On 01/26/2014 05:55 PM, Ren Qiaowei wrote:


Peter, you mean we should remove these two call and do what they do in
user-space, right?



Unless we think there is a benefit to the kernel to have a on/off switch
for the #BR exception (if disabled, all #BR exceptions are signals,
regardless of source.)

There might be, would like other people's opinion.



These two syscalls are only used to release automaticlly the bound 
tables, and they will not set BNDSTATUS and BNDCFGU.


I want to remove them also. :) But if so, we have to release the bound 
tables in user-space.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread H. Peter Anvin
On 01/26/2014 05:55 PM, Ren Qiaowei wrote:
>
> Peter, you mean we should remove these two call and do what they do in
> user-space, right?
> 

Unless we think there is a benefit to the kernel to have a on/off switch
for the #BR exception (if disabled, all #BR exceptions are signals,
regardless of source.)

There might be, would like other people's opinion.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 11:14 PM, Ingo Molnar wrote:


* Ren, Qiaowei  wrote:


The size of one bound table is 4M bytes for 64bit, and 16K bytes for
32bit. It can not be accessed by user-space, and it will be accessed
automatically by hardware.


So, here's the bound-table allocation AFAICS:

+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);

What ensures that user-space cannot access (and in particular, modify)
the pages at bt_addr? It's a read-write anonymous mapping AFAICS.

Looks like that we can not be able to ensure this. I just mean that 
user-space doesn't know the bound tables, and it should not access them 
also.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 09:50 AM, H. Peter Anvin wrote:

On 01/26/2014 12:39 AM, Ingo Molnar wrote:


It will be only once per startup.


In that case it would be more efficient to make this part of the
binary execution environment so that exec() sets it up automatically,
not a separate prctl() syscall.



This is not necessarily possible, and in particular it might need to be
deferred until the MPX runtime has initialized.

What isn't clear to me is if these syscalls are needed at all, or if it
would be better to just let the MPX runtile set BNDSTATUS and BNDCFGU
directly in userspace.  The kernel cannot rely on them staying
consistent across userspace anyway.

Now, it might be beneficial for the kernel to have them anyway.  It's a
bit of a tough call.

-hpa

Peter, you mean we should remove these two call and do what they do in 
user-space, right?


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread H. Peter Anvin
On 01/26/2014 12:39 AM, Ingo Molnar wrote:
>>
>> It will be only once per startup.
> 
> In that case it would be more efficient to make this part of the 
> binary execution environment so that exec() sets it up automatically, 
> not a separate prctl() syscall.
> 

This is not necessarily possible, and in particular it might need to be
deferred until the MPX runtime has initialized.

What isn't clear to me is if these syscalls are needed at all, or if it
would be better to just let the MPX runtile set BNDSTATUS and BNDCFGU
directly in userspace.  The kernel cannot rely on them staying
consistent across userspace anyway.

Now, it might be beneficial for the kernel to have them anyway.  It's a
bit of a tough call.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Ren, Qiaowei  wrote:

> The size of one bound table is 4M bytes for 64bit, and 16K bytes for 
> 32bit. It can not be accessed by user-space, and it will be accessed 
> automatically by hardware.

So, here's the bound-table allocation AFAICS:

+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);

What ensures that user-space cannot access (and in particular, modify) 
the pages at bt_addr? It's a read-write anonymous mapping AFAICS.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

> -Original Message-
> From: Ingo Molnar [mailto:mingo.kernel@gmail.com] On Behalf Of Ingo
> Molnar
> Sent: Sunday, January 26, 2014 5:08 PM
> To: Ren, Qiaowei
> Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
> linux-kernel@vger.kernel.org; Peter Zijlstra; Linus Torvalds; Andrew Morton
> Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
> PR_MPX_RELEASE
> 
> 
> * Qiaowei Ren  wrote:
> 
> > @@ -7,6 +9,88 @@
> >  #include 
> >  #include 
> >
> > +static struct mmu_notifier mpx_mn;
> 
> > +static struct mmu_notifier_ops mpx_mmuops = {
> > +   .invalidate_range_end = mpx_invl_range_end, };
> > +
> > +int mpx_init(struct task_struct *tsk) {
> > +   if (!boot_cpu_has(X86_FEATURE_MPX))
> > +   return -EINVAL;
> > +
> > +   /* register mmu_notifier to delallocate the bound tables */
> > +   mpx_mn.ops = _mmuops;
> > +   mmu_notifier_register(_mn, current->mm);
> 
> 0)
> 
> I do think MPX should be supported by Linux, but this is the best thing I can 
> say
> about the code at the moment:
> 
> 1)
> 
> The above MMU notifier bit is absolutely disgusting: it adds an
> O(nr_mpx_tasks) overhead to every MMU operation!
> 
> MPX needs to be called from architecture methods. (And the MMU notifier
> should probably be removed, it literally invites such abuse.)
> 
> 2)
> 
> The whole MPX_INIT() macro wrappery is ugly beyond relief.
> 
> 3)
> 
> The kernel/sys.c bits are x86-only, it probably won't even build on other
> architectures.
> 
> 4)
> 
> I also argue that MPX setup should be automatic through the ELF
> loader:
> 
>  - MPX setup could also be initiated through the ELF notes section or
>so - similar to the executable bit stack attribute in ELF binaries.
>(See PT_GNU_STACK handling in fs/binfmt_elf.c.)
> 
>  - What is the typical life time of the bounds table? Does user-space
>get access to it? I don't see where it's discoverable to
>user-space. (for example for debuggers)
> 
> 5)
> 
> Teardown can be done through regular munmap() if the notifier is eliminated,
> so that step falls away as well.
> 
> 6)
> 
> How many entries are in the bounds table? Because mpx_invl_range_end()
> looks utterly unscalable if it has any size beyond 1-2 entries.
> 
The size of one bound table is 4M bytes for 64bit, and 16K bytes for 32bit. It 
can not be accessed by user-space, and it will be accessed automatically by 
hardware.

When #BR faults come, and the entry of the bound directory is invalid, one 
bound table have to be allocated to save value of the bounds. It is what the 
second patch does. As for the lifetime of the bound tables, we can see the 
following case:
  User application use malloc or mmap to dynamically allocate memory. when 
address in the memory region is first accessed, related entry in the bound 
directory will be checked and it should be invalid. Then one new bound table 
will be allocated.
  After a time, this memory area is released, and so the bound tables related 
with this memory area become meaningless and may be released. If we don't 
release these unnecessary bound tables, it will save the workload of allocation 
of new bound tables when the memory area related with the entry of the bound 
directory is allocated and accessed again. But in this way the loss of virtual 
space will have to be worried.

Anyway, what this patch does is only that when a piece of address space is 
unmapped, we destroy/unmap the bounds tables that correspond to that same 
address space. MMU notifier will add some overhead for those tasks which use 
MPX. But I have no better way to know when one address space is unmapped, and 
then destroy the related bounds tables. Do you have any suggestion?

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org
> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Ingo Molnar
> Sent: Sunday, January 26, 2014 4:40 PM
> To: Ren, Qiaowei
> Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
> linux-kernel@vger.kernel.org; Peter Zijlstra
> Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
> PR_MPX_RELEASE
> 
> 
> * Ren Qiaowei  wrote:
> 
> > On 01/26/2014 04:22 PM, Ingo Molnar wrote:
> > >
> > >* Qiaowei Ren  wrote:
> > >
> > >>This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands
> > >>on the x86 platform. These commands can be used to init and release
> > >>MPX related resource.
> > >>
> > >>A MMU notifier will be registered during PR_MPX_INIT command
> > >>execution. So the bound tables can be automatically deallocated when
> > >>one memory area is unmapped.
> > >>
> > >>Signed-off-by: Qiaowei Ren 
> > >>---
> > >>  arch/x86/Kconfig |4 ++
> > >>  arch/x86/include/asm/mpx.h   |9 
> > >>  arch/x86/include/asm/processor.h |   16 +++
> > >>  arch/x86/kernel/mpx.c|   84
> ++
> > >>  include/uapi/linux/prctl.h   |6 +++
> > >>  kernel/sys.c |   12 +
> > >>  6 files changed, 131 insertions(+), 0 deletions(-)
> > >
> > > Hm. What is the expected typical frequency of these syscalls for a
> > > larger application like a web browser? Only once per startup
> > > typically, or will they be called more frequently?
> >
> > It will be only once per startup.
> 
> In that case it would be more efficient to make this part of the binary 
> execution
> environment so that exec() sets it up automatically, not a separate prctl()
> syscall.
> 
Sorry, I guess what I said is not accurate. Normally it will be only once per 
startup. But user application maybe only partly want to use MPX, e.g. only one 
thread of process. For those cases, user application need to enable MPX for 
specific part of the code. So it is not enough to set it up automatically.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Qiaowei Ren  wrote:

> @@ -7,6 +9,88 @@
>  #include 
>  #include 
>  
> +static struct mmu_notifier mpx_mn;

> +static struct mmu_notifier_ops mpx_mmuops = {
> + .invalidate_range_end = mpx_invl_range_end,
> +};
> +
> +int mpx_init(struct task_struct *tsk)
> +{
> + if (!boot_cpu_has(X86_FEATURE_MPX))
> + return -EINVAL;
> +
> + /* register mmu_notifier to delallocate the bound tables */
> + mpx_mn.ops = _mmuops;
> + mmu_notifier_register(_mn, current->mm);

0)

I do think MPX should be supported by Linux, but this is the best 
thing I can say about the code at the moment:

1)

The above MMU notifier bit is absolutely disgusting: it adds an 
O(nr_mpx_tasks) overhead to every MMU operation!

MPX needs to be called from architecture methods. (And the MMU 
notifier should probably be removed, it literally invites such abuse.)

2)

The whole MPX_INIT() macro wrappery is ugly beyond relief.

3)

The kernel/sys.c bits are x86-only, it probably won't even build on 
other architectures.

4)

I also argue that MPX setup should be automatic through the ELF 
loader:

 - MPX setup could also be initiated through the ELF notes section or
   so - similar to the executable bit stack attribute in ELF binaries. 
   (See PT_GNU_STACK handling in fs/binfmt_elf.c.)

 - What is the typical life time of the bounds table? Does user-space
   get access to it? I don't see where it's discoverable to
   user-space. (for example for debuggers)

5)

Teardown can be done through regular munmap() if the notifier is 
eliminated, so that step falls away as well.

6)

How many entries are in the bounds table? Because mpx_invl_range_end() 
looks utterly unscalable if it has any size beyond 1-2 entries.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Ren Qiaowei  wrote:

> On 01/26/2014 04:22 PM, Ingo Molnar wrote:
> >
> >* Qiaowei Ren  wrote:
> >
> >>This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
> >>commands on the x86 platform. These commands can be used to
> >>init and release MPX related resource.
> >>
> >>A MMU notifier will be registered during PR_MPX_INIT
> >>command execution. So the bound tables can be automatically
> >>deallocated when one memory area is unmapped.
> >>
> >>Signed-off-by: Qiaowei Ren 
> >>---
> >>  arch/x86/Kconfig |4 ++
> >>  arch/x86/include/asm/mpx.h   |9 
> >>  arch/x86/include/asm/processor.h |   16 +++
> >>  arch/x86/kernel/mpx.c|   84 
> >> ++
> >>  include/uapi/linux/prctl.h   |6 +++
> >>  kernel/sys.c |   12 +
> >>  6 files changed, 131 insertions(+), 0 deletions(-)
> >
> > Hm. What is the expected typical frequency of these syscalls for a 
> > larger application like a web browser? Only once per startup 
> > typically, or will they be called more frequently?
> 
> It will be only once per startup.

In that case it would be more efficient to make this part of the 
binary execution environment so that exec() sets it up automatically, 
not a separate prctl() syscall.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 04:22 PM, Ingo Molnar wrote:


* Qiaowei Ren  wrote:


This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
init and release MPX related resource.

A MMU notifier will be registered during PR_MPX_INIT
command execution. So the bound tables can be automatically
deallocated when one memory area is unmapped.

Signed-off-by: Qiaowei Ren 
---
  arch/x86/Kconfig |4 ++
  arch/x86/include/asm/mpx.h   |9 
  arch/x86/include/asm/processor.h |   16 +++
  arch/x86/kernel/mpx.c|   84 ++
  include/uapi/linux/prctl.h   |6 +++
  kernel/sys.c |   12 +
  6 files changed, 131 insertions(+), 0 deletions(-)


Hm. What is the expected typical frequency of these syscalls for a
larger application like a web browser? Only once per startup
typically, or will they be called more frequently?



It will be only once per startup.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Qiaowei Ren  wrote:

> This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
> commands on the x86 platform. These commands can be used to
> init and release MPX related resource.
> 
> A MMU notifier will be registered during PR_MPX_INIT
> command execution. So the bound tables can be automatically
> deallocated when one memory area is unmapped.
> 
> Signed-off-by: Qiaowei Ren 
> ---
>  arch/x86/Kconfig |4 ++
>  arch/x86/include/asm/mpx.h   |9 
>  arch/x86/include/asm/processor.h |   16 +++
>  arch/x86/kernel/mpx.c|   84 
> ++
>  include/uapi/linux/prctl.h   |6 +++
>  kernel/sys.c |   12 +
>  6 files changed, 131 insertions(+), 0 deletions(-)

Hm. What is the expected typical frequency of these syscalls for a 
larger application like a web browser? Only once per startup 
typically, or will they be called more frequently?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Qiaowei Ren qiaowei@intel.com wrote:

 This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
 commands on the x86 platform. These commands can be used to
 init and release MPX related resource.
 
 A MMU notifier will be registered during PR_MPX_INIT
 command execution. So the bound tables can be automatically
 deallocated when one memory area is unmapped.
 
 Signed-off-by: Qiaowei Ren qiaowei@intel.com
 ---
  arch/x86/Kconfig |4 ++
  arch/x86/include/asm/mpx.h   |9 
  arch/x86/include/asm/processor.h |   16 +++
  arch/x86/kernel/mpx.c|   84 
 ++
  include/uapi/linux/prctl.h   |6 +++
  kernel/sys.c |   12 +
  6 files changed, 131 insertions(+), 0 deletions(-)

Hm. What is the expected typical frequency of these syscalls for a 
larger application like a web browser? Only once per startup 
typically, or will they be called more frequently?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 04:22 PM, Ingo Molnar wrote:


* Qiaowei Ren qiaowei@intel.com wrote:


This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
init and release MPX related resource.

A MMU notifier will be registered during PR_MPX_INIT
command execution. So the bound tables can be automatically
deallocated when one memory area is unmapped.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
  arch/x86/Kconfig |4 ++
  arch/x86/include/asm/mpx.h   |9 
  arch/x86/include/asm/processor.h |   16 +++
  arch/x86/kernel/mpx.c|   84 ++
  include/uapi/linux/prctl.h   |6 +++
  kernel/sys.c |   12 +
  6 files changed, 131 insertions(+), 0 deletions(-)


Hm. What is the expected typical frequency of these syscalls for a
larger application like a web browser? Only once per startup
typically, or will they be called more frequently?



It will be only once per startup.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Ren Qiaowei qiaowei@intel.com wrote:

 On 01/26/2014 04:22 PM, Ingo Molnar wrote:
 
 * Qiaowei Ren qiaowei@intel.com wrote:
 
 This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
 commands on the x86 platform. These commands can be used to
 init and release MPX related resource.
 
 A MMU notifier will be registered during PR_MPX_INIT
 command execution. So the bound tables can be automatically
 deallocated when one memory area is unmapped.
 
 Signed-off-by: Qiaowei Ren qiaowei@intel.com
 ---
   arch/x86/Kconfig |4 ++
   arch/x86/include/asm/mpx.h   |9 
   arch/x86/include/asm/processor.h |   16 +++
   arch/x86/kernel/mpx.c|   84 
  ++
   include/uapi/linux/prctl.h   |6 +++
   kernel/sys.c |   12 +
   6 files changed, 131 insertions(+), 0 deletions(-)
 
  Hm. What is the expected typical frequency of these syscalls for a 
  larger application like a web browser? Only once per startup 
  typically, or will they be called more frequently?
 
 It will be only once per startup.

In that case it would be more efficient to make this part of the 
binary execution environment so that exec() sets it up automatically, 
not a separate prctl() syscall.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Qiaowei Ren qiaowei@intel.com wrote:

 @@ -7,6 +9,88 @@
  #include asm/fpu-internal.h
  #include asm/alternative.h
  
 +static struct mmu_notifier mpx_mn;

 +static struct mmu_notifier_ops mpx_mmuops = {
 + .invalidate_range_end = mpx_invl_range_end,
 +};
 +
 +int mpx_init(struct task_struct *tsk)
 +{
 + if (!boot_cpu_has(X86_FEATURE_MPX))
 + return -EINVAL;
 +
 + /* register mmu_notifier to delallocate the bound tables */
 + mpx_mn.ops = mpx_mmuops;
 + mmu_notifier_register(mpx_mn, current-mm);

0)

I do think MPX should be supported by Linux, but this is the best 
thing I can say about the code at the moment:

1)

The above MMU notifier bit is absolutely disgusting: it adds an 
O(nr_mpx_tasks) overhead to every MMU operation!

MPX needs to be called from architecture methods. (And the MMU 
notifier should probably be removed, it literally invites such abuse.)

2)

The whole MPX_INIT() macro wrappery is ugly beyond relief.

3)

The kernel/sys.c bits are x86-only, it probably won't even build on 
other architectures.

4)

I also argue that MPX setup should be automatic through the ELF 
loader:

 - MPX setup could also be initiated through the ELF notes section or
   so - similar to the executable bit stack attribute in ELF binaries. 
   (See PT_GNU_STACK handling in fs/binfmt_elf.c.)

 - What is the typical life time of the bounds table? Does user-space
   get access to it? I don't see where it's discoverable to
   user-space. (for example for debuggers)

5)

Teardown can be done through regular munmap() if the notifier is 
eliminated, so that step falls away as well.

6)

How many entries are in the bounds table? Because mpx_invl_range_end() 
looks utterly unscalable if it has any size beyond 1-2 entries.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

 -Original Message-
 From: linux-kernel-ow...@vger.kernel.org
 [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Ingo Molnar
 Sent: Sunday, January 26, 2014 4:40 PM
 To: Ren, Qiaowei
 Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
 linux-kernel@vger.kernel.org; Peter Zijlstra
 Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
 PR_MPX_RELEASE
 
 
 * Ren Qiaowei qiaowei@intel.com wrote:
 
  On 01/26/2014 04:22 PM, Ingo Molnar wrote:
  
  * Qiaowei Ren qiaowei@intel.com wrote:
  
  This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands
  on the x86 platform. These commands can be used to init and release
  MPX related resource.
  
  A MMU notifier will be registered during PR_MPX_INIT command
  execution. So the bound tables can be automatically deallocated when
  one memory area is unmapped.
  
  Signed-off-by: Qiaowei Ren qiaowei@intel.com
  ---
arch/x86/Kconfig |4 ++
arch/x86/include/asm/mpx.h   |9 
arch/x86/include/asm/processor.h |   16 +++
arch/x86/kernel/mpx.c|   84
 ++
include/uapi/linux/prctl.h   |6 +++
kernel/sys.c |   12 +
6 files changed, 131 insertions(+), 0 deletions(-)
  
   Hm. What is the expected typical frequency of these syscalls for a
   larger application like a web browser? Only once per startup
   typically, or will they be called more frequently?
 
  It will be only once per startup.
 
 In that case it would be more efficient to make this part of the binary 
 execution
 environment so that exec() sets it up automatically, not a separate prctl()
 syscall.
 
Sorry, I guess what I said is not accurate. Normally it will be only once per 
startup. But user application maybe only partly want to use MPX, e.g. only one 
thread of process. For those cases, user application need to enable MPX for 
specific part of the code. So it is not enough to set it up automatically.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

 -Original Message-
 From: Ingo Molnar [mailto:mingo.kernel@gmail.com] On Behalf Of Ingo
 Molnar
 Sent: Sunday, January 26, 2014 5:08 PM
 To: Ren, Qiaowei
 Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
 linux-kernel@vger.kernel.org; Peter Zijlstra; Linus Torvalds; Andrew Morton
 Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
 PR_MPX_RELEASE
 
 
 * Qiaowei Ren qiaowei@intel.com wrote:
 
  @@ -7,6 +9,88 @@
   #include asm/fpu-internal.h
   #include asm/alternative.h
 
  +static struct mmu_notifier mpx_mn;
 
  +static struct mmu_notifier_ops mpx_mmuops = {
  +   .invalidate_range_end = mpx_invl_range_end, };
  +
  +int mpx_init(struct task_struct *tsk) {
  +   if (!boot_cpu_has(X86_FEATURE_MPX))
  +   return -EINVAL;
  +
  +   /* register mmu_notifier to delallocate the bound tables */
  +   mpx_mn.ops = mpx_mmuops;
  +   mmu_notifier_register(mpx_mn, current-mm);
 
 0)
 
 I do think MPX should be supported by Linux, but this is the best thing I can 
 say
 about the code at the moment:
 
 1)
 
 The above MMU notifier bit is absolutely disgusting: it adds an
 O(nr_mpx_tasks) overhead to every MMU operation!
 
 MPX needs to be called from architecture methods. (And the MMU notifier
 should probably be removed, it literally invites such abuse.)
 
 2)
 
 The whole MPX_INIT() macro wrappery is ugly beyond relief.
 
 3)
 
 The kernel/sys.c bits are x86-only, it probably won't even build on other
 architectures.
 
 4)
 
 I also argue that MPX setup should be automatic through the ELF
 loader:
 
  - MPX setup could also be initiated through the ELF notes section or
so - similar to the executable bit stack attribute in ELF binaries.
(See PT_GNU_STACK handling in fs/binfmt_elf.c.)
 
  - What is the typical life time of the bounds table? Does user-space
get access to it? I don't see where it's discoverable to
user-space. (for example for debuggers)
 
 5)
 
 Teardown can be done through regular munmap() if the notifier is eliminated,
 so that step falls away as well.
 
 6)
 
 How many entries are in the bounds table? Because mpx_invl_range_end()
 looks utterly unscalable if it has any size beyond 1-2 entries.
 
The size of one bound table is 4M bytes for 64bit, and 16K bytes for 32bit. It 
can not be accessed by user-space, and it will be accessed automatically by 
hardware.

When #BR faults come, and the entry of the bound directory is invalid, one 
bound table have to be allocated to save value of the bounds. It is what the 
second patch does. As for the lifetime of the bound tables, we can see the 
following case:
  User application use malloc or mmap to dynamically allocate memory. when 
address in the memory region is first accessed, related entry in the bound 
directory will be checked and it should be invalid. Then one new bound table 
will be allocated.
  After a time, this memory area is released, and so the bound tables related 
with this memory area become meaningless and may be released. If we don't 
release these unnecessary bound tables, it will save the workload of allocation 
of new bound tables when the memory area related with the entry of the bound 
directory is allocated and accessed again. But in this way the loss of virtual 
space will have to be worried.

Anyway, what this patch does is only that when a piece of address space is 
unmapped, we destroy/unmap the bounds tables that correspond to that same 
address space. MMU notifier will add some overhead for those tasks which use 
MPX. But I have no better way to know when one address space is unmapped, and 
then destroy the related bounds tables. Do you have any suggestion?

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ingo Molnar

* Ren, Qiaowei qiaowei@intel.com wrote:

 The size of one bound table is 4M bytes for 64bit, and 16K bytes for 
 32bit. It can not be accessed by user-space, and it will be accessed 
 automatically by hardware.

So, here's the bound-table allocation AFAICS:

+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL  (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);

What ensures that user-space cannot access (and in particular, modify) 
the pages at bt_addr? It's a read-write anonymous mapping AFAICS.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread H. Peter Anvin
On 01/26/2014 12:39 AM, Ingo Molnar wrote:

 It will be only once per startup.
 
 In that case it would be more efficient to make this part of the 
 binary execution environment so that exec() sets it up automatically, 
 not a separate prctl() syscall.
 

This is not necessarily possible, and in particular it might need to be
deferred until the MPX runtime has initialized.

What isn't clear to me is if these syscalls are needed at all, or if it
would be better to just let the MPX runtile set BNDSTATUS and BNDCFGU
directly in userspace.  The kernel cannot rely on them staying
consistent across userspace anyway.

Now, it might be beneficial for the kernel to have them anyway.  It's a
bit of a tough call.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 09:50 AM, H. Peter Anvin wrote:

On 01/26/2014 12:39 AM, Ingo Molnar wrote:


It will be only once per startup.


In that case it would be more efficient to make this part of the
binary execution environment so that exec() sets it up automatically,
not a separate prctl() syscall.



This is not necessarily possible, and in particular it might need to be
deferred until the MPX runtime has initialized.

What isn't clear to me is if these syscalls are needed at all, or if it
would be better to just let the MPX runtile set BNDSTATUS and BNDCFGU
directly in userspace.  The kernel cannot rely on them staying
consistent across userspace anyway.

Now, it might be beneficial for the kernel to have them anyway.  It's a
bit of a tough call.

-hpa

Peter, you mean we should remove these two call and do what they do in 
user-space, right?


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 11:14 PM, Ingo Molnar wrote:


* Ren, Qiaowei qiaowei@intel.com wrote:


The size of one bound table is 4M bytes for 64bit, and 16K bytes for
32bit. It can not be accessed by user-space, and it will be accessed
automatically by hardware.


So, here's the bound-table allocation AFAICS:

+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL  (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);

What ensures that user-space cannot access (and in particular, modify)
the pages at bt_addr? It's a read-write anonymous mapping AFAICS.

Looks like that we can not be able to ensure this. I just mean that 
user-space doesn't know the bound tables, and it should not access them 
also.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread H. Peter Anvin
On 01/26/2014 05:55 PM, Ren Qiaowei wrote:

 Peter, you mean we should remove these two call and do what they do in
 user-space, right?
 

Unless we think there is a benefit to the kernel to have a on/off switch
for the #BR exception (if disabled, all #BR exceptions are signals,
regardless of source.)

There might be, would like other people's opinion.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 10:10 AM, H. Peter Anvin wrote:

On 01/26/2014 05:55 PM, Ren Qiaowei wrote:


Peter, you mean we should remove these two call and do what they do in
user-space, right?



Unless we think there is a benefit to the kernel to have a on/off switch
for the #BR exception (if disabled, all #BR exceptions are signals,
regardless of source.)

There might be, would like other people's opinion.



These two syscalls are only used to release automaticlly the bound 
tables, and they will not set BNDSTATUS and BNDCFGU.


I want to remove them also. :) But if so, we have to release the bound 
tables in user-space.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/