Re: [PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-30 Thread Ren Qiaowei

On 10/31/2014 06:38 AM, Dave Hansen wrote:

+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+   struct mpx_insn insn;
+   uint8_t bndregno;
+   unsigned long addr_vio;
+
+   addr_vio = mpx_insn_decode(, regs);
+
+   bndregno = X86_MODRM_REG(insn.modrm.value);
+   if (bndregno > 3)
+   return;
+
+   /* Note: the upper 32 bits are ignored in 32-bit mode. */
+   info->si_lower = (void __user *)(unsigned long)
+   (xsave_buf->bndregs.bndregs[2*bndregno]);
+   info->si_upper = (void __user *)(unsigned long)
+   (~xsave_buf->bndregs.bndregs[2*bndregno+1]);
+   info->si_addr_lsb = 0;
+   info->si_signo = SIGSEGV;
+   info->si_errno = 0;
+   info->si_code = SEGV_BNDERR;
+   info->si_addr = (void __user *)addr_vio;
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 611b6ec..b2a916b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -284,6 +284,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long 
error_code)
unsigned long status;
struct xsave_struct *xsave_buf;
struct task_struct *tsk = current;
+   siginfo_t info;

prev_state = exception_enter();
if (notify_die(DIE_TRAP, "bounds", regs, error_code,
@@ -316,6 +317,11 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long 
error_code)
break;

case 1: /* Bound violation. */
+   do_mpx_bounds(regs, , xsave_buf);
+   do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs,
+   error_code, );
+   break;
+
case 0: /* No exception caused by Intel MPX operations. */
do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
break;



So, siginfo is stack-allocarted here.  do_mpx_bounds() can error out if
it sees an invalid bndregno.  We still send the signal with the 
whether or not we filled the 'info' in do_mpx_bounds().

Can't this leak some kernel stack out in the 'info'?



This should check the return value of do_mpx_bounds and should be fixed.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-30 Thread Ren Qiaowei

On 10/31/2014 06:38 AM, Dave Hansen wrote:

+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+   struct mpx_insn insn;
+   uint8_t bndregno;
+   unsigned long addr_vio;
+
+   addr_vio = mpx_insn_decode(insn, regs);
+
+   bndregno = X86_MODRM_REG(insn.modrm.value);
+   if (bndregno  3)
+   return;
+
+   /* Note: the upper 32 bits are ignored in 32-bit mode. */
+   info-si_lower = (void __user *)(unsigned long)
+   (xsave_buf-bndregs.bndregs[2*bndregno]);
+   info-si_upper = (void __user *)(unsigned long)
+   (~xsave_buf-bndregs.bndregs[2*bndregno+1]);
+   info-si_addr_lsb = 0;
+   info-si_signo = SIGSEGV;
+   info-si_errno = 0;
+   info-si_code = SEGV_BNDERR;
+   info-si_addr = (void __user *)addr_vio;
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 611b6ec..b2a916b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -284,6 +284,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long 
error_code)
unsigned long status;
struct xsave_struct *xsave_buf;
struct task_struct *tsk = current;
+   siginfo_t info;

prev_state = exception_enter();
if (notify_die(DIE_TRAP, bounds, regs, error_code,
@@ -316,6 +317,11 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long 
error_code)
break;

case 1: /* Bound violation. */
+   do_mpx_bounds(regs, info, xsave_buf);
+   do_trap(X86_TRAP_BR, SIGSEGV, bounds, regs,
+   error_code, info);
+   break;
+
case 0: /* No exception caused by Intel MPX operations. */
do_trap(X86_TRAP_BR, SIGSEGV, bounds, regs, error_code, NULL);
break;



So, siginfo is stack-allocarted here.  do_mpx_bounds() can error out if
it sees an invalid bndregno.  We still send the signal with the info
whether or not we filled the 'info' in do_mpx_bounds().

Can't this leak some kernel stack out in the 'info'?



This should check the return value of do_mpx_bounds and should be fixed.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-28 Thread Ren Qiaowei

On 10/28/2014 04:36 AM, Thomas Gleixner wrote:

On Mon, 27 Oct 2014, Ren, Qiaowei wrote:

On 2014-10-24, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:


This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch does't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because
the generic decoder is very heavyweight not just in terms of
performance but in terms of interface -- because it has to.


My question still stands why using the existing decoder is an issue.
Performance is a complete non issue in case of a bounds violation and
the interface argument is just silly, really.



As hpa said, we only need to decode several mpx instructions
including BNDCL/BNDCU, and general decoder looks like a little
heavy. Peter, what do you think about it?


You're repeating yourself. Care to read the discussion about this from
the last round of review again?



Ok. I will go through it again. Thanks.

- Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-28 Thread Ren Qiaowei

On 10/28/2014 04:49 AM, Thomas Gleixner wrote:

On Mon, 27 Oct 2014, Ren Qiaowei wrote:

If so, I guess that there are some questions needed to be considered:

1) Almost all palces which call do_munmap() will need to add
mpx_pre_unmap/post_unmap calls, like vm_munmap(), mremap(), shmdt(), etc..


What's the problem with that?



For example:

shmdt()
down_write(mm->mmap_sem);
vma = find_vma();
while (vma)
do_munmap();
up_write(mm->mmap_sem);

We could not simply add mpx_pre_unmap() before do_munmap() or 
down_write(). And seems like it is a little hard for shmdt() to be 
changed to match this solution, right?



2) before mpx_post_unmap() call, it is possible for those bounds tables within
mm->bd_remove_vmas to be re-used.

In this case, userspace may do new mapping and access one address which will
cover one of those bounds tables. During this period, HW will check if one
bounds table exist, if yes one fault won't be produced.


Errm. Before user space can use the bounds table for the new mapping
it needs to add the entries, right? So:

CPU 0   CPU 1

down_write(mm->bd_sem);
mpx_pre_unmap();
clear bounds directory entries  
unmap();
map()
write_bounds_entry()
trap()
  down_read(mm->bd_sem);
mpx_post_unmap();
up_write(mm->bd_sem);
  allocate_bounds_table();

That's the whole point of bd_sem.



Yes. Got it.


3) According to Dave, those bounds tables related to adjacent VMAs within the
start and the end possibly don't have to be fully unmmaped, and we only need
free the part of backing physical memory.


Care to explain why that's a problem?



I guess you mean one new field mm->bd_remove_vmas should be added into 
staruct mm, right?


For those VMAs which we only need to free part of backing physical 
memory, we could not clear bounds directory entries and should also mark 
the range of backing physical memory within this vma. If so, maybe there 
are too many new fields which will be added into mm struct, right?


Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-28 Thread Ren Qiaowei

On 10/28/2014 04:38 AM, Thomas Gleixner wrote:

On Mon, 27 Oct 2014, Ren, Qiaowei wrote:

On 2014-10-24, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:

+int mpx_enable_management(struct task_struct *tsk) {
+   struct mm_struct *mm = tsk->mm;
+   void __user *bd_base = MPX_INVALID_BOUNDS_DIR;


What's the point of initializing bd_base here. I had to look twice to
figure out that it gets overwritten by task_get_bounds_dir()



I just want to put task_get_bounds_dir() outside mm->mmap_sem holding.


What you want is not interesting at all. What's interesting is what
you do and what you send for review.



I see. Thanks.

- Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-28 Thread Ren Qiaowei

On 10/28/2014 04:49 AM, Thomas Gleixner wrote:

On Mon, 27 Oct 2014, Ren Qiaowei wrote:

If so, I guess that there are some questions needed to be considered:

1) Almost all palces which call do_munmap() will need to add
mpx_pre_unmap/post_unmap calls, like vm_munmap(), mremap(), shmdt(), etc..


What's the problem with that?



For example:

shmdt()
down_write(mm-mmap_sem);
vma = find_vma();
while (vma)
do_munmap();
up_write(mm-mmap_sem);

We could not simply add mpx_pre_unmap() before do_munmap() or 
down_write(). And seems like it is a little hard for shmdt() to be 
changed to match this solution, right?



2) before mpx_post_unmap() call, it is possible for those bounds tables within
mm-bd_remove_vmas to be re-used.

In this case, userspace may do new mapping and access one address which will
cover one of those bounds tables. During this period, HW will check if one
bounds table exist, if yes one fault won't be produced.


Errm. Before user space can use the bounds table for the new mapping
it needs to add the entries, right? So:

CPU 0   CPU 1

down_write(mm-bd_sem);
mpx_pre_unmap();
clear bounds directory entries  
unmap();
map()
write_bounds_entry()
trap()
  down_read(mm-bd_sem);
mpx_post_unmap();
up_write(mm-bd_sem);
  allocate_bounds_table();

That's the whole point of bd_sem.



Yes. Got it.


3) According to Dave, those bounds tables related to adjacent VMAs within the
start and the end possibly don't have to be fully unmmaped, and we only need
free the part of backing physical memory.


Care to explain why that's a problem?



I guess you mean one new field mm-bd_remove_vmas should be added into 
staruct mm, right?


For those VMAs which we only need to free part of backing physical 
memory, we could not clear bounds directory entries and should also mark 
the range of backing physical memory within this vma. If so, maybe there 
are too many new fields which will be added into mm struct, right?


Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-28 Thread Ren Qiaowei

On 10/28/2014 04:38 AM, Thomas Gleixner wrote:

On Mon, 27 Oct 2014, Ren, Qiaowei wrote:

On 2014-10-24, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:

+int mpx_enable_management(struct task_struct *tsk) {
+   struct mm_struct *mm = tsk-mm;
+   void __user *bd_base = MPX_INVALID_BOUNDS_DIR;


What's the point of initializing bd_base here. I had to look twice to
figure out that it gets overwritten by task_get_bounds_dir()



I just want to put task_get_bounds_dir() outside mm-mmap_sem holding.


What you want is not interesting at all. What's interesting is what
you do and what you send for review.



I see. Thanks.

- Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-28 Thread Ren Qiaowei

On 10/28/2014 04:36 AM, Thomas Gleixner wrote:

On Mon, 27 Oct 2014, Ren, Qiaowei wrote:

On 2014-10-24, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:


This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch does't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because
the generic decoder is very heavyweight not just in terms of
performance but in terms of interface -- because it has to.


My question still stands why using the existing decoder is an issue.
Performance is a complete non issue in case of a bounds violation and
the interface argument is just silly, really.



As hpa said, we only need to decode several mpx instructions
including BNDCL/BNDCU, and general decoder looks like a little
heavy. Peter, what do you think about it?


You're repeating yourself. Care to read the discussion about this from
the last round of review again?



Ok. I will go through it again. Thanks.

- Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 05/12] x86, mpx: on-demand kernel allocation of bounds tables

2014-10-26 Thread Ren Qiaowei

On 10/24/2014 08:08 PM, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:

+   /*
+* Go poke the address of the new bounds table in to the
+* bounds directory entry out in userspace memory.  Note:
+* we may race with another CPU instantiating the same table.
+* In that case the cmpxchg will see an unexpected
+* 'actual_old_val'.
+*/
+   ret = user_atomic_cmpxchg_inatomic(_old_val, bd_entry,
+  expected_old_val, bt_addr);


This is fully preemptible non-atomic context, right?

So this wants a proper comment, why using
user_atomic_cmpxchg_inatomic() is the right thing to do here.



Well, we will address it.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-26 Thread Ren Qiaowei

On 10/24/2014 10:40 PM, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:

Since we are doing the freeing from munmap() (and other paths like it),
we hold mmap_sem for write. If we fault, the page fault handler will
attempt to acquire mmap_sem for read and we will deadlock. For now, to
avoid deadlock, we disable page faults while touching the bounds directory
entry. This keeps us from being able to free the tables in this case.
This deficiency will be addressed in later patches.


This is wrong to begin with. You need a proper design for addressing
this short coming in the first place. Retrofitting it into your
current design will just not work at all or end up with some major
mess.

The problem to solve is, that the kernel needs to unmap the bounds
table if the data mapping which it covers goes away.

You decided to do that at the end of unmap with mmap_sem write
held. As I explained last time, that's not an absolute requirement,
it's just a random choice. And that choice is obvioulsy not a really
good one as your own observation of the page fault issue proves.

So perhaps we should look at it the other way round. We already know
before the actual unmap happens what's the start and the end of the
area.

So we can be way smarter and do the following:

mpx_pre_unmap(from, len);

down_write(mmap_sem);
do_unmap(from, len);
up_write(mmap_sem);

mpx_post_unmap(mpx_unmap);

int mpx_pre_unmap(...)
{
 down_write(mm->bd_sem);

 down_read(mm->mmap_sem);

 Here we do the following:

 1) Invalidate the bounds table entries for the to be unmapped area in
the bounds directory. This can fault as we only hold mmap sem for
read.

 2) Mark the corresponding VMAs which should be unmapped and
removed.  This can be done with mmap sem down read held as the
VMA chain cannot be changed and the 'Mark for removal" field is
protected by mm->bd_sem.

For each to be removed VMA we increment mm->bd_remove_vmas

 Holding mm->bd_sem also prevents that a new bound table to be
 inserted, if we do the whole protection thing right.

 up_read(mm->mmap_sem);
}

void mpx_post_unmap(void)
{
if (mm->bd_remove_vmas) {
   down_write(mm->mmap_sem);
   cleanup_to_be_removed_vmas();
   up_write(mm->mmap_sem);
}

up_write(mm->bd_sem);
}



If so, I guess that there are some questions needed to be considered:

1) Almost all palces which call do_munmap() will need to add 
mpx_pre_unmap/post_unmap calls, like vm_munmap(), mremap(), shmdt(), etc..


2) before mpx_post_unmap() call, it is possible for those bounds tables 
within mm->bd_remove_vmas to be re-used.


In this case, userspace may do new mapping and access one address which 
will cover one of those bounds tables. During this period, HW will check 
if one bounds table exist, if yes one fault won't be produced.


3) According to Dave, those bounds tables related to adjacent VMAs 
within the start and the end possibly don't have to be fully unmmaped, 
and we only need free the part of backing physical memory.



The mpx_pre_unmap/post_unmap calls can be either added explicit to the
relevant down_write/unmap/up_write code pathes or you hide it behind
some wrapper function.

Now you need to acquire mm->bd_sem for the fault handling code as well
in order to serialize the creation of new bound table entries. In that
case a down_read(mm->bd_sem) is sufficient as the cmpxchg() prevents
the insertion of multiple tables for the same directory entries.

So we'd end up with the following serialization scheme:

prctl enable/disable management:  down_write(mm->bd_sem);

bounds violation: down_read(mm->bd_sem);

directory entry allocation:   down_read(mm->bd_sem);

directory entry removal:  down_write(mm->bd_sem);

Now we need to check whether there is a potential deadlock lurking
around the corner. We get the following nesting scenarios:

- prctl enable/disable management:  down_write(mm->bd_sem);

No mmap_sem nesting at all

- bounds violation:  down_read(mm->bd_sem);

Might nest down_read(mm->mmap_sem) if the copy code from user space
faults.

- directory entry allocation:down_read(mm->bd_sem);

Nests down_write(mm->mmap_sem) for the VMA allocation.

Might nest down_read(mm->mmap_sem) if the cmpxchg() for the
directory entry faults

- directory entry removal:down_write(mm->bd_sem);

Might nest down_read(mm->mmap_sem) if the invalidation of the
directory entry faults

In other words here is the possible nesting:

   #1 down_read(mm->bd_sem);  down_read(mm->mmap_sem);

   #2 down_read(mm->bd_sem);  down_write(mm->mmap_sem);

   #3 down_write(mm->bd_sem); down_read(mm->mmap_sem);

   #4 down_write(mm->bd_sem); down_write(mm->mmap_sem);

We never execute any of those nested on a single task. The bounds
fault handler is never nested as it always comes 

RE: [PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-26 Thread Ren, Qiaowei


On 2014-10-24, Thomas Gleixner wrote:
> On Sun, 12 Oct 2014, Qiaowei Ren wrote:
>> +int mpx_enable_management(struct task_struct *tsk) {
>> +struct mm_struct *mm = tsk->mm;
>> +void __user *bd_base = MPX_INVALID_BOUNDS_DIR;
> 
> What's the point of initializing bd_base here. I had to look twice to
> figure out that it gets overwritten by task_get_bounds_dir()
> 

I just want to put task_get_bounds_dir() outside mm->mmap_sem holding.

>> @@ -285,6 +285,7 @@ dotraplinkage void do_bounds(struct pt_regs
>> *regs,
> long error_code)
>>  struct xsave_struct *xsave_buf;
>>  struct task_struct *tsk = current;
>>  siginfo_t info;
>> +int ret = 0;
>> 
>>  prev_state = exception_enter();
>>  if (notify_die(DIE_TRAP, "bounds", regs, error_code, @@ -312,8
>> +313,35 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long
> error_code)
>>   */
>>  switch (status & MPX_BNDSTA_ERROR_CODE) {
>>  case 2: /* Bound directory has invalid entry. */
>> -if (do_mpx_bt_fault(xsave_buf))
>> +down_write(>mm->mmap_sem);
> 
> The handling of mm->mmap_sem here is horrible. The only reason why you
> want to hold mmap_sem write locked in the first place is that you want
> to cover the allocation and the mm->bd_addr check.
> 
> I think it's wrong to tie this to mmap_sem in the first place. If MPX
> is enabled then you should have mm->bd_addr and an explicit mutex to protect 
> it.
> 
> So the logic would look like this:
> 
>mutex_lock(>bd_mutex);
>if (!kernel_managed(mm))
>   do_trap(); else if (do_mpx_bt_fault()) force_sig();
>mutex_unlock(>bd_mutex);
> No tricks with mmap_sem, no special return value handling. Straight
> forward code instead of a convoluted and error prone mess.
> 
> Hmm?
> 
I guess this is a good solution. If so, new field 'bd_sem' have to be added 
into struct mm_struct.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-26 Thread Ren, Qiaowei


On 2014-10-24, Thomas Gleixner wrote:
> On Sun, 12 Oct 2014, Qiaowei Ren wrote:
> 
>> This patch sets bound violation fields of siginfo struct in #BR
>> exception handler by decoding the user instruction and constructing
>> the faulting pointer.
>> 
>> This patch does't use the generic decoder, and implements a limited
>> special-purpose decoder to decode MPX instructions, simply because
>> the generic decoder is very heavyweight not just in terms of
>> performance but in terms of interface -- because it has to.
> 
> My question still stands why using the existing decoder is an issue.
> Performance is a complete non issue in case of a bounds violation and
> the interface argument is just silly, really.
> 

As hpa said, we only need to decode several mpx instructions including 
BNDCL/BNDCU, and general decoder looks like a little heavy. Peter, what do you 
think about it?

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-26 Thread Ren, Qiaowei


On 2014-10-24, Thomas Gleixner wrote:
 On Sun, 12 Oct 2014, Qiaowei Ren wrote:
 
 This patch sets bound violation fields of siginfo struct in #BR
 exception handler by decoding the user instruction and constructing
 the faulting pointer.
 
 This patch does't use the generic decoder, and implements a limited
 special-purpose decoder to decode MPX instructions, simply because
 the generic decoder is very heavyweight not just in terms of
 performance but in terms of interface -- because it has to.
 
 My question still stands why using the existing decoder is an issue.
 Performance is a complete non issue in case of a bounds violation and
 the interface argument is just silly, really.
 

As hpa said, we only need to decode several mpx instructions including 
BNDCL/BNDCU, and general decoder looks like a little heavy. Peter, what do you 
think about it?

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-26 Thread Ren, Qiaowei


On 2014-10-24, Thomas Gleixner wrote:
 On Sun, 12 Oct 2014, Qiaowei Ren wrote:
 +int mpx_enable_management(struct task_struct *tsk) {
 +struct mm_struct *mm = tsk-mm;
 +void __user *bd_base = MPX_INVALID_BOUNDS_DIR;
 
 What's the point of initializing bd_base here. I had to look twice to
 figure out that it gets overwritten by task_get_bounds_dir()
 

I just want to put task_get_bounds_dir() outside mm-mmap_sem holding.

 @@ -285,6 +285,7 @@ dotraplinkage void do_bounds(struct pt_regs
 *regs,
 long error_code)
  struct xsave_struct *xsave_buf;
  struct task_struct *tsk = current;
  siginfo_t info;
 +int ret = 0;
 
  prev_state = exception_enter();
  if (notify_die(DIE_TRAP, bounds, regs, error_code, @@ -312,8
 +313,35 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long
 error_code)
   */
  switch (status  MPX_BNDSTA_ERROR_CODE) {
  case 2: /* Bound directory has invalid entry. */
 -if (do_mpx_bt_fault(xsave_buf))
 +down_write(current-mm-mmap_sem);
 
 The handling of mm-mmap_sem here is horrible. The only reason why you
 want to hold mmap_sem write locked in the first place is that you want
 to cover the allocation and the mm-bd_addr check.
 
 I think it's wrong to tie this to mmap_sem in the first place. If MPX
 is enabled then you should have mm-bd_addr and an explicit mutex to protect 
 it.
 
 So the logic would look like this:
 
mutex_lock(mm-bd_mutex);
if (!kernel_managed(mm))
   do_trap(); else if (do_mpx_bt_fault()) force_sig();
mutex_unlock(mm-bd_mutex);
 No tricks with mmap_sem, no special return value handling. Straight
 forward code instead of a convoluted and error prone mess.
 
 Hmm?
 
I guess this is a good solution. If so, new field 'bd_sem' have to be added 
into struct mm_struct.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-26 Thread Ren Qiaowei

On 10/24/2014 10:40 PM, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:

Since we are doing the freeing from munmap() (and other paths like it),
we hold mmap_sem for write. If we fault, the page fault handler will
attempt to acquire mmap_sem for read and we will deadlock. For now, to
avoid deadlock, we disable page faults while touching the bounds directory
entry. This keeps us from being able to free the tables in this case.
This deficiency will be addressed in later patches.


This is wrong to begin with. You need a proper design for addressing
this short coming in the first place. Retrofitting it into your
current design will just not work at all or end up with some major
mess.

The problem to solve is, that the kernel needs to unmap the bounds
table if the data mapping which it covers goes away.

You decided to do that at the end of unmap with mmap_sem write
held. As I explained last time, that's not an absolute requirement,
it's just a random choice. And that choice is obvioulsy not a really
good one as your own observation of the page fault issue proves.

So perhaps we should look at it the other way round. We already know
before the actual unmap happens what's the start and the end of the
area.

So we can be way smarter and do the following:

mpx_pre_unmap(from, len);

down_write(mmap_sem);
do_unmap(from, len);
up_write(mmap_sem);

mpx_post_unmap(mpx_unmap);

int mpx_pre_unmap(...)
{
 down_write(mm-bd_sem);

 down_read(mm-mmap_sem);

 Here we do the following:

 1) Invalidate the bounds table entries for the to be unmapped area in
the bounds directory. This can fault as we only hold mmap sem for
read.

 2) Mark the corresponding VMAs which should be unmapped and
removed.  This can be done with mmap sem down read held as the
VMA chain cannot be changed and the 'Mark for removal field is
protected by mm-bd_sem.

For each to be removed VMA we increment mm-bd_remove_vmas

 Holding mm-bd_sem also prevents that a new bound table to be
 inserted, if we do the whole protection thing right.

 up_read(mm-mmap_sem);
}

void mpx_post_unmap(void)
{
if (mm-bd_remove_vmas) {
   down_write(mm-mmap_sem);
   cleanup_to_be_removed_vmas();
   up_write(mm-mmap_sem);
}

up_write(mm-bd_sem);
}



If so, I guess that there are some questions needed to be considered:

1) Almost all palces which call do_munmap() will need to add 
mpx_pre_unmap/post_unmap calls, like vm_munmap(), mremap(), shmdt(), etc..


2) before mpx_post_unmap() call, it is possible for those bounds tables 
within mm-bd_remove_vmas to be re-used.


In this case, userspace may do new mapping and access one address which 
will cover one of those bounds tables. During this period, HW will check 
if one bounds table exist, if yes one fault won't be produced.


3) According to Dave, those bounds tables related to adjacent VMAs 
within the start and the end possibly don't have to be fully unmmaped, 
and we only need free the part of backing physical memory.



The mpx_pre_unmap/post_unmap calls can be either added explicit to the
relevant down_write/unmap/up_write code pathes or you hide it behind
some wrapper function.

Now you need to acquire mm-bd_sem for the fault handling code as well
in order to serialize the creation of new bound table entries. In that
case a down_read(mm-bd_sem) is sufficient as the cmpxchg() prevents
the insertion of multiple tables for the same directory entries.

So we'd end up with the following serialization scheme:

prctl enable/disable management:  down_write(mm-bd_sem);

bounds violation: down_read(mm-bd_sem);

directory entry allocation:   down_read(mm-bd_sem);

directory entry removal:  down_write(mm-bd_sem);

Now we need to check whether there is a potential deadlock lurking
around the corner. We get the following nesting scenarios:

- prctl enable/disable management:  down_write(mm-bd_sem);

No mmap_sem nesting at all

- bounds violation:  down_read(mm-bd_sem);

Might nest down_read(mm-mmap_sem) if the copy code from user space
faults.

- directory entry allocation:down_read(mm-bd_sem);

Nests down_write(mm-mmap_sem) for the VMA allocation.

Might nest down_read(mm-mmap_sem) if the cmpxchg() for the
directory entry faults

- directory entry removal:down_write(mm-bd_sem);

Might nest down_read(mm-mmap_sem) if the invalidation of the
directory entry faults

In other words here is the possible nesting:

   #1 down_read(mm-bd_sem);  down_read(mm-mmap_sem);

   #2 down_read(mm-bd_sem);  down_write(mm-mmap_sem);

   #3 down_write(mm-bd_sem); down_read(mm-mmap_sem);

   #4 down_write(mm-bd_sem); down_write(mm-mmap_sem);

We never execute any of those nested on a single task. The bounds
fault handler is never nested as it always comes from user space. So
we should be 

Re: [PATCH v9 05/12] x86, mpx: on-demand kernel allocation of bounds tables

2014-10-26 Thread Ren Qiaowei

On 10/24/2014 08:08 PM, Thomas Gleixner wrote:

On Sun, 12 Oct 2014, Qiaowei Ren wrote:

+   /*
+* Go poke the address of the new bounds table in to the
+* bounds directory entry out in userspace memory.  Note:
+* we may race with another CPU instantiating the same table.
+* In that case the cmpxchg will see an unexpected
+* 'actual_old_val'.
+*/
+   ret = user_atomic_cmpxchg_inatomic(actual_old_val, bd_entry,
+  expected_old_val, bt_addr);


This is fully preemptible non-atomic context, right?

So this wants a proper comment, why using
user_atomic_cmpxchg_inatomic() is the right thing to do here.



Well, we will address it.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-10-13 Thread Ren, Qiaowei


On 2014-10-14, Hansen, Dave wrote:
> On 07/20/2014 11:09 PM, Andi Kleen wrote:
>> Qiaowei Ren  writes:
>>> This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
>>> commands. These commands can be used to register and unregister MPX
>>> related resource on the x86 platform.
>> 
>> Please provide a manpage for the API. This is needed for proper
>> review. Your description is far too vague.
> 
> Qiaowei, have you written this manpage yet?  I see the new patches,
> but no manpage yet.

It will be added into subsequent version or another mainline patchset.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-10-13 Thread Ren, Qiaowei


On 2014-10-14, Hansen, Dave wrote:
 On 07/20/2014 11:09 PM, Andi Kleen wrote:
 Qiaowei Ren qiaowei@intel.com writes:
 This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
 commands. These commands can be used to register and unregister MPX
 related resource on the x86 platform.
 
 Please provide a manpage for the API. This is needed for proper
 review. Your description is far too vague.
 
 Qiaowei, have you written this manpage yet?  I see the new patches,
 but no manpage yet.

It will be added into subsequent version or another mainline patchset.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-17 Thread Ren, Qiaowei


On 2014-09-18, Kevin Easton wrote:
> On Thu, Sep 18, 2014 at 12:40:29AM +0000, Ren, Qiaowei wrote:
>>> Would it be prudent to use an error code other than EINVAL for the
>>> "hardware doesn't support it" case?
>>> 
>> Seems like no specific error code for this case.
> 
> ENXIO would probably be OK.  It's not too important as long as it's
> documented.
> 
Yes. Looks like that ENXIO will be better.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-17 Thread Ren, Qiaowei


On 2014-09-16, Kevin Easton wrote:
> On Thu, Sep 11, 2014 at 04:46:48PM +0800, Qiaowei Ren wrote:
>> +
>> +int mpx_register(struct task_struct *tsk) {
>> +struct mm_struct *mm = tsk->mm;
>> +
>> +if (!cpu_has_mpx)
>> +return -EINVAL;
>> +
>> +/*
>> + * runtime in the userspace will be responsible for allocation of
>> + * the bounds directory. Then, it will save the base of the bounds
>> + * directory into XSAVE/XRSTOR Save Area and enable MPX through
>> + * XRSTOR instruction.
>> + *
>> + * fpu_xsave() is expected to be very expensive. In order to do
>> + * performance optimization, here we get the base of the bounds
>> + * directory and then save it into mm_struct to be used in future.
>> + */
>> +mm->bd_addr = task_get_bounds_dir(tsk);
>> +if (!mm->bd_addr)
>> +return -EINVAL;
>> +
>> +return 0;
>> +}
>> +
>> +int mpx_unregister(struct task_struct *tsk) {
>> +struct mm_struct *mm = current->mm;
>> +
>> +if (!cpu_has_mpx)
>> +return -EINVAL;
>> +
>> +mm->bd_addr = NULL;
>> +return 0;
>> +}
> 
> If that's changed, then mpx_register() and mpx_unregister() don't need
> a task_struct, just an mm_struct.
> 
Yes. An mm_struct is enough.

> Probably these functions should be locking mmap_sem.
> 
> Would it be prudent to use an error code other than EINVAL for the
> "hardware doesn't support it" case?
>
Seems like no specific error code for this case.

>> @@ -2011,6 +2017,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned
> long, arg2, unsigned long, arg3,
>>  me->mm->def_flags &= ~VM_NOHUGEPAGE;
>>  up_write(>mm->mmap_sem);
>>  break;
>> +case PR_MPX_REGISTER:
>> +error = MPX_REGISTER(me);
>> +break;
>> +case PR_MPX_UNREGISTER:
>> +error = MPX_UNREGISTER(me);
>> +break;
> 
> If you pass me->mm from prctl, that makes it clear that it's
> per-process not per-thread, just like PR_SET_DUMPABLE / PR_GET_DUMPABLE.
> 
> This code should also enforce nulls in arg2 / arg3 / arg4,/ arg5 if
> it's not using them, otherwise you'll be sunk if you ever want to use them 
> later.
> 
> It seems like it only makes sense for all threads using the mm to have
> the same bounds directory set.  If the interface was changed to
> directly pass the address, then could the kernel take care of setting
> it for *all* of the threads in the process? This seems like something
> that would be easier for the kernel to do than userspace.
> 
If the interface was changed to this, it will be possible for insane 
application to pass error bounds directory address to kernel. We still have to 
call fpu_xsave() to check this.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-17 Thread Ren, Qiaowei


On 2014-09-16, Kevin Easton wrote:
 On Thu, Sep 11, 2014 at 04:46:48PM +0800, Qiaowei Ren wrote:
 +
 +int mpx_register(struct task_struct *tsk) {
 +struct mm_struct *mm = tsk-mm;
 +
 +if (!cpu_has_mpx)
 +return -EINVAL;
 +
 +/*
 + * runtime in the userspace will be responsible for allocation of
 + * the bounds directory. Then, it will save the base of the bounds
 + * directory into XSAVE/XRSTOR Save Area and enable MPX through
 + * XRSTOR instruction.
 + *
 + * fpu_xsave() is expected to be very expensive. In order to do
 + * performance optimization, here we get the base of the bounds
 + * directory and then save it into mm_struct to be used in future.
 + */
 +mm-bd_addr = task_get_bounds_dir(tsk);
 +if (!mm-bd_addr)
 +return -EINVAL;
 +
 +return 0;
 +}
 +
 +int mpx_unregister(struct task_struct *tsk) {
 +struct mm_struct *mm = current-mm;
 +
 +if (!cpu_has_mpx)
 +return -EINVAL;
 +
 +mm-bd_addr = NULL;
 +return 0;
 +}
 
 If that's changed, then mpx_register() and mpx_unregister() don't need
 a task_struct, just an mm_struct.
 
Yes. An mm_struct is enough.

 Probably these functions should be locking mmap_sem.
 
 Would it be prudent to use an error code other than EINVAL for the
 hardware doesn't support it case?

Seems like no specific error code for this case.

 @@ -2011,6 +2017,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned
 long, arg2, unsigned long, arg3,
  me-mm-def_flags = ~VM_NOHUGEPAGE;
  up_write(me-mm-mmap_sem);
  break;
 +case PR_MPX_REGISTER:
 +error = MPX_REGISTER(me);
 +break;
 +case PR_MPX_UNREGISTER:
 +error = MPX_UNREGISTER(me);
 +break;
 
 If you pass me-mm from prctl, that makes it clear that it's
 per-process not per-thread, just like PR_SET_DUMPABLE / PR_GET_DUMPABLE.
 
 This code should also enforce nulls in arg2 / arg3 / arg4,/ arg5 if
 it's not using them, otherwise you'll be sunk if you ever want to use them 
 later.
 
 It seems like it only makes sense for all threads using the mm to have
 the same bounds directory set.  If the interface was changed to
 directly pass the address, then could the kernel take care of setting
 it for *all* of the threads in the process? This seems like something
 that would be easier for the kernel to do than userspace.
 
If the interface was changed to this, it will be possible for insane 
application to pass error bounds directory address to kernel. We still have to 
call fpu_xsave() to check this.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-17 Thread Ren, Qiaowei


On 2014-09-18, Kevin Easton wrote:
 On Thu, Sep 18, 2014 at 12:40:29AM +, Ren, Qiaowei wrote:
 Would it be prudent to use an error code other than EINVAL for the
 hardware doesn't support it case?
 
 Seems like no specific error code for this case.
 
 ENXIO would probably be OK.  It's not too important as long as it's
 documented.
 
Yes. Looks like that ENXIO will be better.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-16 Thread Ren, Qiaowei


On 2014-09-16, Hansen, Dave wrote:
> On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
>> +/*
>> + * Free the backing physical pages of bounds table 'bt_addr'.
>> + * Assume start...end is within that bounds table.
>> + */
>> +static int __must_check zap_bt_entries(struct mm_struct *mm,
>> +unsigned long bt_addr,
>> +unsigned long start, unsigned long end) {
>> +struct vm_area_struct *vma;
>> +
>> +/* Find the vma which overlaps this bounds table */
>> +vma = find_vma(mm, bt_addr);
>> +/*
>> + * The table entry comes from userspace and could be
>> + * pointing anywhere, so make sure it is at least
>> + * pointing to valid memory.
>> + */
>> +if (!vma || !(vma->vm_flags & VM_MPX) ||
>> +vma->vm_start > bt_addr ||
>> +vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
>> +return -EINVAL;
> 
> If someone did *ANYTHING* to split the VMA, this check would fail.  I
> think that's a little draconian, considering that somebody could do a
> NUMA policy on part of a VM_MPX VMA and cause it to be split.
> 
> This check should look across the entire 'bt_addr ->
> bt_addr+MPX_BT_SIZE_BYTES' range, find all of the VM_MPX VMAs, and zap
> only those.
> 
> If we encounter a non-VM_MPX vma, it should be ignored.
>
Ok.

>> +if (ret == -EFAULT)
>> +return ret;
>> +
>> +/*
>> + * unmap those bounds table which are entirely covered in this
>> + * virtual address region.
>> + */
> 
> Entirely covered *AND* not at the edges, right?
> 
Yes.

>> +bde_start = mm->bd_addr + MPX_GET_BD_ENTRY_OFFSET(start);
>> +bde_end = mm->bd_addr + MPX_GET_BD_ENTRY_OFFSET(end-1);
>> +for (bd_entry = bde_start + 1; bd_entry < bde_end; bd_entry++) {
> 
> This needs a big fat comment that it is only freeing the bounds tables that 
> are 1.
> fully covered 2. not at the edges of the mapping, even if full aligned
> 
> Does this get any nicer if we have unmap_side_bts() *ONLY* go after
> bounds tables that are partially owned by the region being unmapped?
> 
> It seems like we really should do this:
> 
>   for (each bt fully owned)
>   unmap_single_bt()
>   if (start edge unaligned)
>   free start edge
>   if (end edge unaligned)
>   free end edge
> 
> I bet the unmap_side_bts() code gets simpler if we do that, too.
> 
Maybe. I will try this.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-16 Thread Ren, Qiaowei


On 2014-09-16, Hansen, Dave wrote:
 On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
 +/*
 + * Free the backing physical pages of bounds table 'bt_addr'.
 + * Assume start...end is within that bounds table.
 + */
 +static int __must_check zap_bt_entries(struct mm_struct *mm,
 +unsigned long bt_addr,
 +unsigned long start, unsigned long end) {
 +struct vm_area_struct *vma;
 +
 +/* Find the vma which overlaps this bounds table */
 +vma = find_vma(mm, bt_addr);
 +/*
 + * The table entry comes from userspace and could be
 + * pointing anywhere, so make sure it is at least
 + * pointing to valid memory.
 + */
 +if (!vma || !(vma-vm_flags  VM_MPX) ||
 +vma-vm_start  bt_addr ||
 +vma-vm_end  bt_addr+MPX_BT_SIZE_BYTES)
 +return -EINVAL;
 
 If someone did *ANYTHING* to split the VMA, this check would fail.  I
 think that's a little draconian, considering that somebody could do a
 NUMA policy on part of a VM_MPX VMA and cause it to be split.
 
 This check should look across the entire 'bt_addr -
 bt_addr+MPX_BT_SIZE_BYTES' range, find all of the VM_MPX VMAs, and zap
 only those.
 
 If we encounter a non-VM_MPX vma, it should be ignored.

Ok.

 +if (ret == -EFAULT)
 +return ret;
 +
 +/*
 + * unmap those bounds table which are entirely covered in this
 + * virtual address region.
 + */
 
 Entirely covered *AND* not at the edges, right?
 
Yes.

 +bde_start = mm-bd_addr + MPX_GET_BD_ENTRY_OFFSET(start);
 +bde_end = mm-bd_addr + MPX_GET_BD_ENTRY_OFFSET(end-1);
 +for (bd_entry = bde_start + 1; bd_entry  bde_end; bd_entry++) {
 
 This needs a big fat comment that it is only freeing the bounds tables that 
 are 1.
 fully covered 2. not at the edges of the mapping, even if full aligned
 
 Does this get any nicer if we have unmap_side_bts() *ONLY* go after
 bounds tables that are partially owned by the region being unmapped?
 
 It seems like we really should do this:
 
   for (each bt fully owned)
   unmap_single_bt()
   if (start edge unaligned)
   free start edge
   if (end edge unaligned)
   free end edge
 
 I bet the unmap_side_bts() code gets simpler if we do that, too.
 
Maybe. I will try this.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-15 Thread Ren, Qiaowei


On 2014-09-15, One Thousand Gnomes wrote:
>> The base of the bounds directory is set into mm_struct during
>> PR_MPX_REGISTER command execution. This member can be used to check
>> whether one application is mpx enabled.
> 
> Not really because by the time you ask the question another thread
> might have decided to unregister it.
> 
> 
>> +int mpx_register(struct task_struct *tsk) {
>> +struct mm_struct *mm = tsk->mm;
>> +
>> +if (!cpu_has_mpx)
>> +return -EINVAL;
>> +
>> +/*
>> + * runtime in the userspace will be responsible for allocation of
>> + * the bounds directory. Then, it will save the base of the bounds
>> + * directory into XSAVE/XRSTOR Save Area and enable MPX through
>> + * XRSTOR instruction.
>> + *
>> + * fpu_xsave() is expected to be very expensive. In order to do
>> + * performance optimization, here we get the base of the bounds
>> + * directory and then save it into mm_struct to be used in future.
>> + */
>> +mm->bd_addr = task_get_bounds_dir(tsk);
>> +if (!mm->bd_addr)
>> +return -EINVAL;
> 
> What stops two threads calling this in parallel ?
>> +
>> +return 0;
>> +}
>> +
>> +int mpx_unregister(struct task_struct *tsk) {
>> +struct mm_struct *mm = current->mm;
>> +
>> +if (!cpu_has_mpx)
>> +return -EINVAL;
>> +
>> +mm->bd_addr = NULL;
> 
> or indeed calling this in parallel
> 
> What are the semantics across execve() ?
> 
This will not impact on the semantics of execve(). One runtime library for MPX 
will be provided (or merged into Glibc), and when the application starts, this 
runtime will be called to initialize MPX runtime environment, including calling 
prctl() to notify the kernel to start managing the bounds directories. You can 
see the discussion about exec(): https://lkml.org/lkml/2014/1/26/199 

It would be extremely unusual for an application to have some MPX and some 
non-MPX threads, since they would share the same address space and the non-MPX 
threads would mess up the bounds. That is to say, it looks like be unusual for 
one of these threads to call prctl() to enable or disable MPX. I guess we need 
to add some rules into documentation.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-15 Thread Ren, Qiaowei


On 2014-09-15, One Thousand Gnomes wrote:
 The base of the bounds directory is set into mm_struct during
 PR_MPX_REGISTER command execution. This member can be used to check
 whether one application is mpx enabled.
 
 Not really because by the time you ask the question another thread
 might have decided to unregister it.
 
 
 +int mpx_register(struct task_struct *tsk) {
 +struct mm_struct *mm = tsk-mm;
 +
 +if (!cpu_has_mpx)
 +return -EINVAL;
 +
 +/*
 + * runtime in the userspace will be responsible for allocation of
 + * the bounds directory. Then, it will save the base of the bounds
 + * directory into XSAVE/XRSTOR Save Area and enable MPX through
 + * XRSTOR instruction.
 + *
 + * fpu_xsave() is expected to be very expensive. In order to do
 + * performance optimization, here we get the base of the bounds
 + * directory and then save it into mm_struct to be used in future.
 + */
 +mm-bd_addr = task_get_bounds_dir(tsk);
 +if (!mm-bd_addr)
 +return -EINVAL;
 
 What stops two threads calling this in parallel ?
 +
 +return 0;
 +}
 +
 +int mpx_unregister(struct task_struct *tsk) {
 +struct mm_struct *mm = current-mm;
 +
 +if (!cpu_has_mpx)
 +return -EINVAL;
 +
 +mm-bd_addr = NULL;
 
 or indeed calling this in parallel
 
 What are the semantics across execve() ?
 
This will not impact on the semantics of execve(). One runtime library for MPX 
will be provided (or merged into Glibc), and when the application starts, this 
runtime will be called to initialize MPX runtime environment, including calling 
prctl() to notify the kernel to start managing the bounds directories. You can 
see the discussion about exec(): https://lkml.org/lkml/2014/1/26/199 

It would be extremely unusual for an application to have some MPX and some 
non-MPX threads, since they would share the same address space and the non-MPX 
threads would mess up the bounds. That is to say, it looks like be unusual for 
one of these threads to call prctl() to enable or disable MPX. I guess we need 
to add some rules into documentation.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-09-13 Thread Ren, Qiaowei


On 2014-09-13, Hansen, Dave wrote:
> On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
>> +static int allocate_bt(long __user *bd_entry) {
>> +unsigned long bt_addr, old_val = 0;
>> +int ret = 0;
>> +
>> +bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
>> +if (IS_ERR((void *)bt_addr))
>> +return bt_addr;
>> +bt_addr = (bt_addr & MPX_BT_ADDR_MASK) |
> MPX_BD_ENTRY_VALID_FLAG;
> 
> Qiaowei, why do we need the "& MPX_BT_ADDR_MASK" here?

It should be not necessary, and can be removed.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-13 Thread Ren, Qiaowei


On 2014-09-12, Thomas Gleixner wrote:
> On Fri, 12 Sep 2014, Ren, Qiaowei wrote:
>> On 2014-09-12, Thomas Gleixner wrote:
>>> On Thu, 11 Sep 2014, Qiaowei Ren wrote:
>>> 
>>>> Due to new fields about bound violation added into struct
>>>> siginfo, this patch syncs it with general version to avoid build issue.
>>> 
>>> You completely fail to explain which build issue is addressed by
>>> this patch. The code you added to kernel/signal.c which accesses
>>> _addr_bnd is guarded by
>>> 
>>> +#ifdef SEGV_BNDERR
>>> 
>>> which is not defined my MIPS. Also why is this only affecting MIPS
>>> and not any other architecture which provides its own struct siginfo ?
>>> 
>>> That patch makes no sense at all, at least not without a proper explanation.
>>> 
>> For arch=mips, siginfo.h (arch/mips/include/uapi/asm/siginfo.h) will
>> include general siginfo.h, and only replace general stuct siginfo
>> with mips specific struct siginfo. So SEGV_BNDERR will be defined
>> for all archs, and we will get error like "no _lower in struct
>> siginfo" when arch=mips.
>> 
>> In addition, only MIPS arch define its own struct siginfo, so this
>> is only affecting MIPS.
> 
> So IA64 does not count as an architecture and therefor does not need
> the same treatment, right?
> 
struct siginfo for IA64 should be also synced. I will do this next post.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-13 Thread Ren, Qiaowei


On 2014-09-12, Thomas Gleixner wrote:
 On Fri, 12 Sep 2014, Ren, Qiaowei wrote:
 On 2014-09-12, Thomas Gleixner wrote:
 On Thu, 11 Sep 2014, Qiaowei Ren wrote:
 
 Due to new fields about bound violation added into struct
 siginfo, this patch syncs it with general version to avoid build issue.
 
 You completely fail to explain which build issue is addressed by
 this patch. The code you added to kernel/signal.c which accesses
 _addr_bnd is guarded by
 
 +#ifdef SEGV_BNDERR
 
 which is not defined my MIPS. Also why is this only affecting MIPS
 and not any other architecture which provides its own struct siginfo ?
 
 That patch makes no sense at all, at least not without a proper explanation.
 
 For arch=mips, siginfo.h (arch/mips/include/uapi/asm/siginfo.h) will
 include general siginfo.h, and only replace general stuct siginfo
 with mips specific struct siginfo. So SEGV_BNDERR will be defined
 for all archs, and we will get error like no _lower in struct
 siginfo when arch=mips.
 
 In addition, only MIPS arch define its own struct siginfo, so this
 is only affecting MIPS.
 
 So IA64 does not count as an architecture and therefor does not need
 the same treatment, right?
 
struct siginfo for IA64 should be also synced. I will do this next post.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-09-13 Thread Ren, Qiaowei


On 2014-09-13, Hansen, Dave wrote:
 On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
 +static int allocate_bt(long __user *bd_entry) {
 +unsigned long bt_addr, old_val = 0;
 +int ret = 0;
 +
 +bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
 +if (IS_ERR((void *)bt_addr))
 +return bt_addr;
 +bt_addr = (bt_addr  MPX_BT_ADDR_MASK) |
 MPX_BD_ENTRY_VALID_FLAG;
 
 Qiaowei, why do we need the  MPX_BT_ADDR_MASK here?

It should be not necessary, and can be removed.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-11 Thread Ren, Qiaowei


On 2014-09-11, Hansen, Dave wrote:
> On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
>> +
>> +return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
>> +MPX_BNDCFG_ADDR_MASK);
>> +}
> 
> I don't think casting a u64 to a ulong, then to a pointer is useful.
> Just take the '(unsigned long)' out.

If so, this will spits out a warning on 32-bit:

arch/x86/kernel/mpx.c: In function 'task_get_bounds_dir':
arch/x86/kernel/mpx.c:21:9: warning: cast to pointer from integer of different 
size [-Wint-to-pointer-cast]

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-11 Thread Ren, Qiaowei


On 2014-09-11, Hansen, Dave wrote:
> On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
>> + * This function will be called by do_munmap(), and the VMAs
>> + covering
>> + * the virtual address region start...end have already been split
>> + if
>> + * necessary and remvoed from the VMA list.
> 
> "remvoed" -> "removed"
> 
>> +void mpx_unmap(struct mm_struct *mm,
>> +unsigned long start, unsigned long end) {
>> +int ret;
>> +
>> +ret = mpx_try_unmap(mm, start, end);
>> +if (ret == -EINVAL)
>> +force_sig(SIGSEGV, current);
>> +}
> 
> In the case of a fault during an unmap, this just ignores the
> situation and returns silently.  Where is the code to retry the
> freeing operation outside of mmap_sem?

Dave, you mean delayed_work code? According to our discussion, it will be 
deferred to another mainline post.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-11 Thread Ren, Qiaowei


On 2014-09-12, Thomas Gleixner wrote:
> On Thu, 11 Sep 2014, Qiaowei Ren wrote:
> 
>> Due to new fields about bound violation added into struct siginfo,
>> this patch syncs it with general version to avoid build issue.
> 
> You completely fail to explain which build issue is addressed by this
> patch. The code you added to kernel/signal.c which accesses _addr_bnd
> is guarded by
> 
> +#ifdef SEGV_BNDERR
> 
> which is not defined my MIPS. Also why is this only affecting MIPS and
> not any other architecture which provides its own struct siginfo ?
> 
> That patch makes no sense at all, at least not without a proper explanation.
>

For arch=mips, siginfo.h (arch/mips/include/uapi/asm/siginfo.h) will include 
general siginfo.h, and only replace general stuct siginfo with mips specific 
struct siginfo. So SEGV_BNDERR will be defined for all archs, and we will get 
error like "no _lower in struct siginfo" when arch=mips.

In addition, only MIPS arch define its own struct siginfo, so this is only 
affecting MIPS. 

Thanks,
Qiaowei

> 
>> Signed-off-by: Qiaowei Ren 
>> ---
>>  arch/mips/include/uapi/asm/siginfo.h |4 
>>  1 files changed, 4 insertions(+), 0 deletions(-)
>> diff --git a/arch/mips/include/uapi/asm/siginfo.h
>> b/arch/mips/include/uapi/asm/siginfo.h
>> index e811744..d08f83f 100644
>> --- a/arch/mips/include/uapi/asm/siginfo.h
>> +++ b/arch/mips/include/uapi/asm/siginfo.h
>> @@ -92,6 +92,10 @@ typedef struct siginfo {
>>  int _trapno;/* TRAP # which caused the signal */
>>  #endif
>>  short _addr_lsb;
>> +struct {
>> +void __user *_lower;
>> +void __user *_upper;
>> +} _addr_bnd;
>>  } _sigfault;
>>  
>>  /* SIGPOLL, SIGXFSZ (To do ...)  */
>> --
>> 1.7.1
>> 
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-11 Thread Ren, Qiaowei


On 2014-09-12, Thomas Gleixner wrote:
 On Thu, 11 Sep 2014, Qiaowei Ren wrote:
 
 Due to new fields about bound violation added into struct siginfo,
 this patch syncs it with general version to avoid build issue.
 
 You completely fail to explain which build issue is addressed by this
 patch. The code you added to kernel/signal.c which accesses _addr_bnd
 is guarded by
 
 +#ifdef SEGV_BNDERR
 
 which is not defined my MIPS. Also why is this only affecting MIPS and
 not any other architecture which provides its own struct siginfo ?
 
 That patch makes no sense at all, at least not without a proper explanation.


For arch=mips, siginfo.h (arch/mips/include/uapi/asm/siginfo.h) will include 
general siginfo.h, and only replace general stuct siginfo with mips specific 
struct siginfo. So SEGV_BNDERR will be defined for all archs, and we will get 
error like no _lower in struct siginfo when arch=mips.

In addition, only MIPS arch define its own struct siginfo, so this is only 
affecting MIPS. 

Thanks,
Qiaowei

 
 Signed-off-by: Qiaowei Ren qiaowei@intel.com
 ---
  arch/mips/include/uapi/asm/siginfo.h |4 
  1 files changed, 4 insertions(+), 0 deletions(-)
 diff --git a/arch/mips/include/uapi/asm/siginfo.h
 b/arch/mips/include/uapi/asm/siginfo.h
 index e811744..d08f83f 100644
 --- a/arch/mips/include/uapi/asm/siginfo.h
 +++ b/arch/mips/include/uapi/asm/siginfo.h
 @@ -92,6 +92,10 @@ typedef struct siginfo {
  int _trapno;/* TRAP # which caused the signal */
  #endif
  short _addr_lsb;
 +struct {
 +void __user *_lower;
 +void __user *_upper;
 +} _addr_bnd;
  } _sigfault;
  
  /* SIGPOLL, SIGXFSZ (To do ...)  */
 --
 1.7.1
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-11 Thread Ren, Qiaowei


On 2014-09-11, Hansen, Dave wrote:
 On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
 + * This function will be called by do_munmap(), and the VMAs
 + covering
 + * the virtual address region start...end have already been split
 + if
 + * necessary and remvoed from the VMA list.
 
 remvoed - removed
 
 +void mpx_unmap(struct mm_struct *mm,
 +unsigned long start, unsigned long end) {
 +int ret;
 +
 +ret = mpx_try_unmap(mm, start, end);
 +if (ret == -EINVAL)
 +force_sig(SIGSEGV, current);
 +}
 
 In the case of a fault during an unmap, this just ignores the
 situation and returns silently.  Where is the code to retry the
 freeing operation outside of mmap_sem?

Dave, you mean delayed_work code? According to our discussion, it will be 
deferred to another mainline post.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-11 Thread Ren, Qiaowei


On 2014-09-11, Hansen, Dave wrote:
 On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
 +
 +return (void __user *)(unsigned long)(xsave_buf-bndcsr.cfg_reg_u 
 +MPX_BNDCFG_ADDR_MASK);
 +}
 
 I don't think casting a u64 to a ulong, then to a pointer is useful.
 Just take the '(unsigned long)' out.

If so, this will spits out a warning on 32-bit:

arch/x86/kernel/mpx.c: In function 'task_get_bounds_dir':
arch/x86/kernel/mpx.c:21:9: warning: cast to pointer from integer of different 
size [-Wint-to-pointer-cast]

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
> On 07/23/2014 05:56 PM, Ren, Qiaowei wrote:
>> On 2014-07-24, Hansen, Dave wrote:
>>> On 07/22/2014 07:35 PM, Ren, Qiaowei wrote:
>>>> The checking about MPX feature should be as follow:
>>>> 
>>>> if (pcntxt_mask & XSTATE_EAGER) {
>>>> if (eagerfpu == DISABLE) {
>>>> pr_err("eagerfpu not present, disabling
> some
>>> xstate features: 0x%llx\n",
>>>> pcntxt_mask &
>>> XSTATE_EAGER);
>>>> pcntxt_mask &= ~XSTATE_EAGER; } else {
>>>> eagerfpu = ENABLE;
>>>> }
>>>> }
>>>> This patch was merged into kernel the ending of last year
>>>> (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/c
>>>> om
>>>> mi
>>>> t/?id=e7d820a5e549b3eb6c3f9467507566565646a669 )
>>> 
>>> Should we be doing a clear_cpu_cap(X86_FEATURE_MPX) in here?
>>> 
>>> This isn't major, but I can't _ever_ imagine a user being able to
>>> track down why MPX is not working from this message. Should we
>>> spruce it up somehow?
>> 
>> Maybe. If the error log "disabling some xstate features:" is changed
>> to "disabling MPX xstate features:", do you think it is OK?
> 
> That's better.  Is it really disabling MPX, though?
> 
> And shouldn't we clear the cpu feature bit too?

I am not sure. I am suspecting whether this checking should be moved before 
xstate_enable().

Peter, what do you think of it?

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
> On 07/23/2014 05:49 PM, Ren, Qiaowei wrote:
>> I can check a lot of debug information when one VMA and related
>> bounds tables are allocated and freed through adding a lot of
>> print() like log into kernel/runtime. Do you think this is enough?
> 
> I thought the entire reason we grabbed a VM_ flag was to make it
> possible to figure out without resorting to this.

All cleanup work certainly depends on this VM_ flag. In addition, as we 
discussed, this new VM_ flag can mainly have runtime know how much memory is 
occupied by MPX.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
> On 07/22/2014 07:35 PM, Ren, Qiaowei wrote:
>> The checking about MPX feature should be as follow:
>> 
>> if (pcntxt_mask & XSTATE_EAGER) {
>> if (eagerfpu == DISABLE) {
>> pr_err("eagerfpu not present, disabling some
> xstate features: 0x%llx\n",
>> pcntxt_mask &
> XSTATE_EAGER);
>> pcntxt_mask &= ~XSTATE_EAGER; } else { eagerfpu
>> = ENABLE;
>> }
>> }
>> This patch was merged into kernel the ending of last year
>> (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/com
>> mi
>> t/?id=e7d820a5e549b3eb6c3f9467507566565646a669 )
> 
> Should we be doing a clear_cpu_cap(X86_FEATURE_MPX) in here?
> 
> This isn't major, but I can't _ever_ imagine a user being able to
> track down why MPX is not working from this message.  Should we spruce it up 
> somehow?

Maybe. If the error log "disabling some xstate features:" is changed to 
"disabling MPX xstate features:", do you think it is OK?

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
> On 07/20/2014 10:38 PM, Qiaowei Ren wrote:
>> Since the kernel allocated those tables on-demand without userspace
>> knowledge, it is also responsible for freeing them when the
>> associated mappings go away.
>> 
>> Here, the solution for this issue is to hook do_munmap() to check
>> whether one process is MPX enabled. If yes, those bounds tables
>> covered in the virtual address region which is being unmapped will
>> be freed
> also.
> 
> This is the part of the code that I'm the most concerned about.
> 
> Could you elaborate on how you've tested this to make sure it works OK?

I can check a lot of debug information when one VMA and related bounds tables 
are allocated and freed through adding a lot of print() like log into 
kernel/runtime. Do you think this is enough?

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
 On 07/20/2014 10:38 PM, Qiaowei Ren wrote:
 Since the kernel allocated those tables on-demand without userspace
 knowledge, it is also responsible for freeing them when the
 associated mappings go away.
 
 Here, the solution for this issue is to hook do_munmap() to check
 whether one process is MPX enabled. If yes, those bounds tables
 covered in the virtual address region which is being unmapped will
 be freed
 also.
 
 This is the part of the code that I'm the most concerned about.
 
 Could you elaborate on how you've tested this to make sure it works OK?

I can check a lot of debug information when one VMA and related bounds tables 
are allocated and freed through adding a lot of print() like log into 
kernel/runtime. Do you think this is enough?

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
 On 07/22/2014 07:35 PM, Ren, Qiaowei wrote:
 The checking about MPX feature should be as follow:
 
 if (pcntxt_mask  XSTATE_EAGER) {
 if (eagerfpu == DISABLE) {
 pr_err(eagerfpu not present, disabling some
 xstate features: 0x%llx\n,
 pcntxt_mask 
 XSTATE_EAGER);
 pcntxt_mask = ~XSTATE_EAGER; } else { eagerfpu
 = ENABLE;
 }
 }
 This patch was merged into kernel the ending of last year
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/com
 mi
 t/?id=e7d820a5e549b3eb6c3f9467507566565646a669 )
 
 Should we be doing a clear_cpu_cap(X86_FEATURE_MPX) in here?
 
 This isn't major, but I can't _ever_ imagine a user being able to
 track down why MPX is not working from this message.  Should we spruce it up 
 somehow?

Maybe. If the error log disabling some xstate features: is changed to 
disabling MPX xstate features:, do you think it is OK?

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
 On 07/23/2014 05:49 PM, Ren, Qiaowei wrote:
 I can check a lot of debug information when one VMA and related
 bounds tables are allocated and freed through adding a lot of
 print() like log into kernel/runtime. Do you think this is enough?
 
 I thought the entire reason we grabbed a VM_ flag was to make it
 possible to figure out without resorting to this.

All cleanup work certainly depends on this VM_ flag. In addition, as we 
discussed, this new VM_ flag can mainly have runtime know how much memory is 
occupied by MPX.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-23 Thread Ren, Qiaowei


On 2014-07-24, Hansen, Dave wrote:
 On 07/23/2014 05:56 PM, Ren, Qiaowei wrote:
 On 2014-07-24, Hansen, Dave wrote:
 On 07/22/2014 07:35 PM, Ren, Qiaowei wrote:
 The checking about MPX feature should be as follow:
 
 if (pcntxt_mask  XSTATE_EAGER) {
 if (eagerfpu == DISABLE) {
 pr_err(eagerfpu not present, disabling
 some
 xstate features: 0x%llx\n,
 pcntxt_mask 
 XSTATE_EAGER);
 pcntxt_mask = ~XSTATE_EAGER; } else {
 eagerfpu = ENABLE;
 }
 }
 This patch was merged into kernel the ending of last year
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/c
 om
 mi
 t/?id=e7d820a5e549b3eb6c3f9467507566565646a669 )
 
 Should we be doing a clear_cpu_cap(X86_FEATURE_MPX) in here?
 
 This isn't major, but I can't _ever_ imagine a user being able to
 track down why MPX is not working from this message. Should we
 spruce it up somehow?
 
 Maybe. If the error log disabling some xstate features: is changed
 to disabling MPX xstate features:, do you think it is OK?
 
 That's better.  Is it really disabling MPX, though?
 
 And shouldn't we clear the cpu feature bit too?

I am not sure. I am suspecting whether this checking should be moved before 
xstate_enable().

Peter, what do you think of it?

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-22 Thread Ren, Qiaowei


On 2014-07-23, Hansen, Dave wrote:
> On 07/20/2014 10:38 PM, Qiaowei Ren wrote:
>> +#ifdef CONFIG_X86_INTEL_MPX
>> +#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX) #else #define
>> +cpu_has_mpx 0 #endif /* CONFIG_X86_INTEL_MPX */
> 
> Is this enough checking?  Looking at the extension reference, it says:
> 
>> 9.3.3 Enabling of Intel MPX States An OS can enable Intel MPX states to
>> support software operation using bounds registers with the following
>> steps: * Verify the processor supports XSAVE/XRSTOR/XSETBV/XGETBV
>> instructions and XCR0 by checking CPUID.1.ECX.XSAVE[bit 26]=1.
> 
> That, I assume the xsave code is already doing.
> 
>> * Verify the processor supports both Intel MPX states by checking
> CPUID.(EAX=0x0D, ECX=0):EAX[4:3] is 11b.
> 
> I see these bits _attempting_ to get set in pcntxt_mask via XCNTXT_MASK.
>  But, I don't see us ever actually checking that they _do_ get set.
> For instance, we do this for:
> 
>> if ((pcntxt_mask & XSTATE_FPSSE) != XSTATE_FPSSE) {
>> pr_err("FP/SSE not shown under xsave features
> 0x%llx\n",
>>pcntxt_mask);
>> BUG();
>> }

The checking about MPX feature should be as follow:

if (pcntxt_mask & XSTATE_EAGER) {
if (eagerfpu == DISABLE) {
pr_err("eagerfpu not present, disabling some xstate 
features: 0x%llx\n",
pcntxt_mask & XSTATE_EAGER);
pcntxt_mask &= ~XSTATE_EAGER;
} else {
eagerfpu = ENABLE;
}
}

This patch was merged into kernel the ending of last year 
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669
 )

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-22 Thread Ren, Qiaowei


On 2014-07-23, Hansen, Dave wrote:
 On 07/20/2014 10:38 PM, Qiaowei Ren wrote:
 +#ifdef CONFIG_X86_INTEL_MPX
 +#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX) #else #define
 +cpu_has_mpx 0 #endif /* CONFIG_X86_INTEL_MPX */
 
 Is this enough checking?  Looking at the extension reference, it says:
 
 9.3.3 Enabling of Intel MPX States An OS can enable Intel MPX states to
 support software operation using bounds registers with the following
 steps: * Verify the processor supports XSAVE/XRSTOR/XSETBV/XGETBV
 instructions and XCR0 by checking CPUID.1.ECX.XSAVE[bit 26]=1.
 
 That, I assume the xsave code is already doing.
 
 * Verify the processor supports both Intel MPX states by checking
 CPUID.(EAX=0x0D, ECX=0):EAX[4:3] is 11b.
 
 I see these bits _attempting_ to get set in pcntxt_mask via XCNTXT_MASK.
  But, I don't see us ever actually checking that they _do_ get set.
 For instance, we do this for:
 
 if ((pcntxt_mask  XSTATE_FPSSE) != XSTATE_FPSSE) {
 pr_err(FP/SSE not shown under xsave features
 0x%llx\n,
pcntxt_mask);
 BUG();
 }

The checking about MPX feature should be as follow:

if (pcntxt_mask  XSTATE_EAGER) {
if (eagerfpu == DISABLE) {
pr_err(eagerfpu not present, disabling some xstate 
features: 0x%llx\n,
pcntxt_mask  XSTATE_EAGER);
pcntxt_mask = ~XSTATE_EAGER;
} else {
eagerfpu = ENABLE;
}
}

This patch was merged into kernel the ending of last year 
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669
 )

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-07-21 Thread Ren, Qiaowei


On 2014-07-21, Andi Kleen wrote:
> Qiaowei Ren  writes:
>> + */
>> +#ifdef CONFIG_X86_64
>> +insn->x86_64 = 1;
>> +insn->addr_bytes = 8;
>> +#else
>> +insn->x86_64 = 0;
>> +insn->addr_bytes = 4;
>> +#endif
> 
> How would that handle compat mode on a 64bit kernel?
> Should likely look at the code segment instead of ifdef.
>> +/* Note: the upper 32 bits are ignored in 32-bit mode. */
> 
> Again correct for compat mode? I believe the upper bits are undefined.
> 
Compat mode will be supported on next patch in future, as 0/ patch mentioned. 
^-^

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v7 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-07-21 Thread Ren, Qiaowei


On 2014-07-21, Andi Kleen wrote:
 Qiaowei Ren qiaowei@intel.com writes:
 + */
 +#ifdef CONFIG_X86_64
 +insn-x86_64 = 1;
 +insn-addr_bytes = 8;
 +#else
 +insn-x86_64 = 0;
 +insn-addr_bytes = 4;
 +#endif
 
 How would that handle compat mode on a 64bit kernel?
 Should likely look at the code segment instead of ifdef.
 +/* Note: the upper 32 bits are ignored in 32-bit mode. */
 
 Again correct for compat mode? I believe the upper bits are undefined.
 
Compat mode will be supported on next patch in future, as 0/ patch mentioned. 
^-^

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-24 Thread Ren, Qiaowei
On 2014-06-25, Andy Lutomirski wrote:
> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei 
> wrote:
>> On 2014-06-24, Andy Lutomirski wrote:
>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>>>> Can the new vm_operation "name" be use for this?  The magic
>>>>> "always written to core dumps" feature might need to be reconsidered.
>>>> 
>>>> One thing I'd like to avoid is an MPX vma getting merged with a
>>>> non-MPX vma.  I don't see any code to prevent two VMAs with
>>>> different vm_ops->names from getting merged.  That seems like a
>>>> bit of a design oversight for ->name.  Right?
>>> 
>>> AFAIK there are no ->name users that don't also set ->close, for
>>> exactly that reason.  I'd be okay with adding a check for ->name, too.
>>> 
>>> Hmm.  If MPX vmas had a real struct file attached, this would all
>>> come for free. Maybe vmas with non-default vm_ops and file != NULL
>>> should never be mergeable?
>>> 
>>>> 
>>>> Thinking out loud a bit... There are also some more complicated
>>>> but more performant cleanup mechanisms that I'd like to go after in the 
>>>> future.
>>>> Given a page, we might want to figure out if it is an MPX page or not.
>>>> I wonder if we'll ever collide with some other user of vm_ops->name.
>>>> It looks fairly narrowly used at the moment, but would this keep
>>>> us from putting these pages on, say, a tmpfs mount?  Doesn't look
>>>> that way at the moment.
>>> 
>>> You could always check the vm_ops pointer to see if it's MPX.
>>> 
>>> One feature I've wanted: a way to have special per-process vmas that
>>> can be easily found.  For example, I want to be able to efficiently
>>> find out where the vdso and vvar vmas are.  I don't think this is
>>> currently supported.
>>> 
>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>> with
> non-MPX vmas, I guess the work flow should be as follow (use
> _install_special_mapping to get a new vma):
>> 
>> unsigned long mpx_mmap(unsigned long len) {
>> ..
>> static struct vm_special_mapping mpx_mapping = {
>> .name = "[mpx]",
>> .pages = no_pages,
>> };
>> 
>> ... vma = _install_special_mapping(mm, addr, len, vm_flags,
>> _mapping); ..
>> }
>> 
>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
> 
> Does this actually create a vma backed with real memory?  Doesn't this
> need to go through anon_vma or something?  _install_special_mapping
> completely prevents merging.
> 
Hmm, _install_special_mapping should completely prevent merging, even among MPX 
vmas.

So, could you tell me how to set MPX specific ->name to the vma when it is 
created? Seems like that I could not find such interface.

Thanks,
Qiaowei
N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a���
0��h���i

RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-24 Thread Ren, Qiaowei
On 2014-06-25, Andy Lutomirski wrote:
 On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei qiaowei@intel.com
 wrote:
 On 2014-06-24, Andy Lutomirski wrote:
 On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
 Can the new vm_operation name be use for this?  The magic
 always written to core dumps feature might need to be reconsidered.
 
 One thing I'd like to avoid is an MPX vma getting merged with a
 non-MPX vma.  I don't see any code to prevent two VMAs with
 different vm_ops-names from getting merged.  That seems like a
 bit of a design oversight for -name.  Right?
 
 AFAIK there are no -name users that don't also set -close, for
 exactly that reason.  I'd be okay with adding a check for -name, too.
 
 Hmm.  If MPX vmas had a real struct file attached, this would all
 come for free. Maybe vmas with non-default vm_ops and file != NULL
 should never be mergeable?
 
 
 Thinking out loud a bit... There are also some more complicated
 but more performant cleanup mechanisms that I'd like to go after in the 
 future.
 Given a page, we might want to figure out if it is an MPX page or not.
 I wonder if we'll ever collide with some other user of vm_ops-name.
 It looks fairly narrowly used at the moment, but would this keep
 us from putting these pages on, say, a tmpfs mount?  Doesn't look
 that way at the moment.
 
 You could always check the vm_ops pointer to see if it's MPX.
 
 One feature I've wanted: a way to have special per-process vmas that
 can be easily found.  For example, I want to be able to efficiently
 find out where the vdso and vvar vmas are.  I don't think this is
 currently supported.
 
 Andy, if you add a check for -name to avoid the MPX vmas merged
 with
 non-MPX vmas, I guess the work flow should be as follow (use
 _install_special_mapping to get a new vma):
 
 unsigned long mpx_mmap(unsigned long len) {
 ..
 static struct vm_special_mapping mpx_mapping = {
 .name = [mpx],
 .pages = no_pages,
 };
 
 ... vma = _install_special_mapping(mm, addr, len, vm_flags,
 mpx_mapping); ..
 }
 
 Then, we could check the -name to see if the VMA is MPX specific. Right?
 
 Does this actually create a vma backed with real memory?  Doesn't this
 need to go through anon_vma or something?  _install_special_mapping
 completely prevents merging.
 
Hmm, _install_special_mapping should completely prevent merging, even among MPX 
vmas.

So, could you tell me how to set MPX specific -name to the vma when it is 
created? Seems like that I could not find such interface.

Thanks,
Qiaowei
N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf��^jǫy�m��@A�a���
0��h���i

RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>> Can the new vm_operation "name" be use for this?  The magic "always
>>> written to core dumps" feature might need to be reconsidered.
>> 
>> One thing I'd like to avoid is an MPX vma getting merged with a
>> non-MPX vma.  I don't see any code to prevent two VMAs with
>> different vm_ops->names from getting merged.  That seems like a bit
>> of a design oversight for ->name.  Right?
> 
> AFAIK there are no ->name users that don't also set ->close, for
> exactly that reason.  I'd be okay with adding a check for ->name, too.
> 
> Hmm.  If MPX vmas had a real struct file attached, this would all come
> for free. Maybe vmas with non-default vm_ops and file != NULL should
> never be mergeable?
> 
>> 
>> Thinking out loud a bit... There are also some more complicated but
>> more performant cleanup mechanisms that I'd like to go after in the future.
>> Given a page, we might want to figure out if it is an MPX page or not.
>> I wonder if we'll ever collide with some other user of vm_ops->name.
>> It looks fairly narrowly used at the moment, but would this keep us
>> from putting these pages on, say, a tmpfs mount?  Doesn't look that
>> way at the moment.
> 
> You could always check the vm_ops pointer to see if it's MPX.
> 
> One feature I've wanted: a way to have special per-process vmas that
> can be easily found.  For example, I want to be able to efficiently
> find out where the vdso and vvar vmas are.  I don't think this is currently 
> supported.
> 
Andy, if you add a check for ->name to avoid the MPX vmas merged with non-MPX 
vmas, I guess the work flow should be as follow (use _install_special_mapping 
to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
..
static struct vm_special_mapping mpx_mapping = {
.name = "[mpx]",
.pages = no_pages,
};

...
vma = _install_special_mapping(mm, addr, len, vm_flags, _mapping);
..
}

Then, we could check the ->name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei



RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
>> +/* Make bounds tables and bouds directory unlocked. */
>> +if (vm_flags & VM_LOCKED)
>> +vm_flags &= ~VM_LOCKED;
> 
> Why?  I would expect MCL_FUTURE to lock these.
> 
Andy, I was just a little confused about LOCKED & POPULATE earlier and I 
thought VM_LOCKED is not necessary for MPX specific bounds tables. Now, this 
checking should be removed, and there should be mm_populate() for VM_LOCKED 
case after mmap_region():

   if (!IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED))
   mm_populate(addr, len);

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v6 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>> +/*
>> + * The error code field of the BNDSTATUS register communicates status
>> + * information of a bound range exception #BR or operation involving
>> + * bound directory.
>> + */
>> +switch (status & MPX_BNDSTA_ERROR_CODE) {
>> +case 2:
>> +/*
>> + * Bound directory has invalid entry.
>> + * No signal will be sent to the user space.
> 
> This comment is a lie.
> 
Hmm, thanks. 

>> + */
>> +if (do_mpx_bt_fault(xsave_buf))
>> +force_sig(SIGBUS, tsk);
> 
> Would it make sense to assign and use a new si_code value here?
> 
There is a new si_code SEGV_BNDERR for bounds violation reported by MPX. But in 
this case, it is mainly due to the failure caused by allocation of bounds 
table. I guess it is not necessary to add another new si_code value.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v6 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
 On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
 +/*
 + * The error code field of the BNDSTATUS register communicates status
 + * information of a bound range exception #BR or operation involving
 + * bound directory.
 + */
 +switch (status  MPX_BNDSTA_ERROR_CODE) {
 +case 2:
 +/*
 + * Bound directory has invalid entry.
 + * No signal will be sent to the user space.
 
 This comment is a lie.
 
Hmm, thanks. 

 + */
 +if (do_mpx_bt_fault(xsave_buf))
 +force_sig(SIGBUS, tsk);
 
 Would it make sense to assign and use a new si_code value here?
 
There is a new si_code SEGV_BNDERR for bounds violation reported by MPX. But in 
this case, it is mainly due to the failure caused by allocation of bounds 
table. I guess it is not necessary to add another new si_code value.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
 +/* Make bounds tables and bouds directory unlocked. */
 +if (vm_flags  VM_LOCKED)
 +vm_flags = ~VM_LOCKED;
 
 Why?  I would expect MCL_FUTURE to lock these.
 
Andy, I was just a little confused about LOCKED  POPULATE earlier and I 
thought VM_LOCKED is not necessary for MPX specific bounds tables. Now, this 
checking should be removed, and there should be mm_populate() for VM_LOCKED 
case after mmap_region():

   if (!IS_ERR_VALUE(addr)  (vm_flags  VM_LOCKED))
   mm_populate(addr, len);

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
 On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
 Can the new vm_operation name be use for this?  The magic always
 written to core dumps feature might need to be reconsidered.
 
 One thing I'd like to avoid is an MPX vma getting merged with a
 non-MPX vma.  I don't see any code to prevent two VMAs with
 different vm_ops-names from getting merged.  That seems like a bit
 of a design oversight for -name.  Right?
 
 AFAIK there are no -name users that don't also set -close, for
 exactly that reason.  I'd be okay with adding a check for -name, too.
 
 Hmm.  If MPX vmas had a real struct file attached, this would all come
 for free. Maybe vmas with non-default vm_ops and file != NULL should
 never be mergeable?
 
 
 Thinking out loud a bit... There are also some more complicated but
 more performant cleanup mechanisms that I'd like to go after in the future.
 Given a page, we might want to figure out if it is an MPX page or not.
 I wonder if we'll ever collide with some other user of vm_ops-name.
 It looks fairly narrowly used at the moment, but would this keep us
 from putting these pages on, say, a tmpfs mount?  Doesn't look that
 way at the moment.
 
 You could always check the vm_ops pointer to see if it's MPX.
 
 One feature I've wanted: a way to have special per-process vmas that
 can be easily found.  For example, I want to be able to efficiently
 find out where the vdso and vvar vmas are.  I don't think this is currently 
 supported.
 
Andy, if you add a check for -name to avoid the MPX vmas merged with non-MPX 
vmas, I guess the work flow should be as follow (use _install_special_mapping 
to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
..
static struct vm_special_mapping mpx_mapping = {
.name = [mpx],
.pages = no_pages,
};

...
vma = _install_special_mapping(mm, addr, len, vm_flags, mpx_mapping);
..
}

Then, we could check the -name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei



RE: [PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-19 Thread Ren, Qiaowei
On 2014-06-20, H. Peter Anvin wrote:
> On 06/19/2014 10:04 AM, Dave Hansen wrote:
>> 
>> Could you please support this position with some data?  I'm a bit
>> skeptical that instruction decoding is going to be a
>> performance-critical path.
>> 
>> I also don't see the extra field that you talked about in the
>> previous thread?  What's the extra field?  I see a 'limit' vs.
>> 'length', but you don't use 'length' at all, so I think you can use
>> it instead, or at least union it.
>> 
>> I've taken a quick stab at trying to consolidate things.  I think I
>> may have screwed up this:
>> 
>>  insn->limit = MAX_MPX_INSN_SIZE - bytes;
>> 
>> Qiaowei, is there anything fundamentally broken with what I've got here?
>> 
> 
Firstly instructions will be got from user pointer stored in 'ip', and then 
validate_next() will use 'limit' to make sure that next sizeof(t) bytes can be 
on the same instruction.

As hpa said, generic decoder, including struct insn and implementation of 
decoding, is very heavyweight because it has to. So MPX specific decoding 
should be better choice.

> So I encouraged Qiaowei to do a limited special-purpose decoder,
> simply because the glue to use the generic decoder was almost as
> large.  I am overall not a huge fan of using the generic decoder in
> constrained situation, because the generic decoder is very heavyweight
> not just in terms of performance but in terms of interface -- because it has 
> to.
> 
Thanks,
Qiaowei


RE: [PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-19 Thread Ren, Qiaowei
On 2014-06-19, Borislav Petkov wrote:
> On Thu, Jun 19, 2014 at 01:13:48AM +0000, Ren, Qiaowei wrote:
>> On 2014-06-18, Borislav Petkov wrote:
>>> On Wed, Jun 18, 2014 at 05:44:13PM +0800, Qiaowei Ren wrote:
>>> 
>>> This whole insn decoding machinery above looks like adapted from
>>> arch/x86/lib/insn.c. You should merge it with the generic code in
>>> insn.c instead of homegrowing it here only for the purposes of MPX.
>>> And if it doesn't work for your needs, you should should extend
>>> the generic code to do so.
> 
>> Petkov, as we discussed on initial version of this patchset, general
>> insn framework didn't work out well and I have tried to use generic
>> struct insn, insn_field, etc. for obvious benefits.
> 
> Let me repeat myself: "And if it doesn't work for your needs, you
> should extend the generic code to do so."
> 
> We don't do homegrown almost-copies of generic code.
>
I see. If possible, I will be very happy to use or extend generic code. But due 
to extra overhead caused by MPX, I have to use MPX specific decoding to do 
performance optimization.

You can check the discussion on this: https://lkml.org/lkml/2014/1/11/190

Thanks,
Qiaowei

N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a���
0��h���i

RE: [PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-19 Thread Ren, Qiaowei
On 2014-06-19, Borislav Petkov wrote:
 On Thu, Jun 19, 2014 at 01:13:48AM +, Ren, Qiaowei wrote:
 On 2014-06-18, Borislav Petkov wrote:
 On Wed, Jun 18, 2014 at 05:44:13PM +0800, Qiaowei Ren wrote:
 
 This whole insn decoding machinery above looks like adapted from
 arch/x86/lib/insn.c. You should merge it with the generic code in
 insn.c instead of homegrowing it here only for the purposes of MPX.
 And if it doesn't work for your needs, you should should extend
 the generic code to do so.
 
 Petkov, as we discussed on initial version of this patchset, general
 insn framework didn't work out well and I have tried to use generic
 struct insn, insn_field, etc. for obvious benefits.
 
 Let me repeat myself: And if it doesn't work for your needs, you
 should extend the generic code to do so.
 
 We don't do homegrown almost-copies of generic code.

I see. If possible, I will be very happy to use or extend generic code. But due 
to extra overhead caused by MPX, I have to use MPX specific decoding to do 
performance optimization.

You can check the discussion on this: https://lkml.org/lkml/2014/1/11/190

Thanks,
Qiaowei

N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf��^jǫy�m��@A�a���
0��h���i

RE: [PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-19 Thread Ren, Qiaowei
On 2014-06-20, H. Peter Anvin wrote:
 On 06/19/2014 10:04 AM, Dave Hansen wrote:
 
 Could you please support this position with some data?  I'm a bit
 skeptical that instruction decoding is going to be a
 performance-critical path.
 
 I also don't see the extra field that you talked about in the
 previous thread?  What's the extra field?  I see a 'limit' vs.
 'length', but you don't use 'length' at all, so I think you can use
 it instead, or at least union it.
 
 I've taken a quick stab at trying to consolidate things.  I think I
 may have screwed up this:
 
  insn-limit = MAX_MPX_INSN_SIZE - bytes;
 
 Qiaowei, is there anything fundamentally broken with what I've got here?
 
 
Firstly instructions will be got from user pointer stored in 'ip', and then 
validate_next() will use 'limit' to make sure that next sizeof(t) bytes can be 
on the same instruction.

As hpa said, generic decoder, including struct insn and implementation of 
decoding, is very heavyweight because it has to. So MPX specific decoding 
should be better choice.

 So I encouraged Qiaowei to do a limited special-purpose decoder,
 simply because the glue to use the generic decoder was almost as
 large.  I am overall not a huge fan of using the generic decoder in
 constrained situation, because the generic decoder is very heavyweight
 not just in terms of performance but in terms of interface -- because it has 
 to.
 
Thanks,
Qiaowei


RE: [PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-18 Thread Ren, Qiaowei
On 2014-06-18, Borislav Petkov wrote:
> On Wed, Jun 18, 2014 at 05:44:13PM +0800, Qiaowei Ren wrote:
> 
> This whole insn decoding machinery above looks like adapted from
> arch/x86/lib/insn.c. You should merge it with the generic code in
> insn.c instead of homegrowing it here only for the purposes of MPX.
> And if it doesn't work for your needs, you should should extend the
> generic code to do so. I think we even talked about this last time.
> 
> Also, make sure you run all your patches through checkpatch.pl before
> submitting.
>
Petkov, as we discussed on initial version of this patchset, general insn 
framework didn't work out well and I have tried to use generic struct insn, 
insn_field, etc. for obvious benefits.

Thanks,
Qiaowei



RE: [PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-18 Thread Ren, Qiaowei
On 2014-06-18, Borislav Petkov wrote:
 On Wed, Jun 18, 2014 at 05:44:13PM +0800, Qiaowei Ren wrote:
 
 This whole insn decoding machinery above looks like adapted from
 arch/x86/lib/insn.c. You should merge it with the generic code in
 insn.c instead of homegrowing it here only for the purposes of MPX.
 And if it doesn't work for your needs, you should should extend the
 generic code to do so. I think we even talked about this last time.
 
 Also, make sure you run all your patches through checkpatch.pl before
 submitting.

Petkov, as we discussed on initial version of this patchset, general insn 
framework didn't work out well and I have tried to use generic struct insn, 
insn_field, etc. for obvious benefits.

Thanks,
Qiaowei



Re: [PATCH v5 1/3] x86, mpx: add documentation on Intel MPX

2014-02-26 Thread Ren Qiaowei

On 02/27/2014 05:44 AM, H. Peter Anvin wrote:

On 02/26/2014 11:17 AM, Dave Hansen wrote:

On 02/23/2014 05:27 AM, Qiaowei Ren wrote:

+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.


Hi Qiaowei,

Does this mean that if userspace decided to map something in the way of
the bounds tables that it would break MPX?

Also, in the description, could we s/linear/virtual/?  Linear seems to
be the term that Intel likes to use in its documents, but we almost
universally call them virtual addresses in the kernel.



It might be useful to clarify, though, just so noone gets confused with
effective addresses (offsets).

Ok. Some words here come from some MPX public documents. I need to go 
through more carefully.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 1/3] x86, mpx: add documentation on Intel MPX

2014-02-26 Thread Ren Qiaowei

On 02/27/2014 05:19 AM, Dave Hansen wrote:

On 02/23/2014 05:27 AM, Qiaowei Ren wrote:

+The other case that generates a #BR is when a BNDSTX instruction
+attempts to save bounds to a BD entry marked as invalid. This is
+an indication that no BT exists for this entry. In this case the
+fault handler will allocate a new BT.


Hi Qiaowei,

Can you make an effort to move these comments over to the code?  The
bounds-table allocation function is a much more appropriate place for
this text than the Documentation/ file.


Ok. They will be moved.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 1/3] x86, mpx: add documentation on Intel MPX

2014-02-26 Thread Ren Qiaowei

On 02/27/2014 05:19 AM, Dave Hansen wrote:

On 02/23/2014 05:27 AM, Qiaowei Ren wrote:

+The other case that generates a #BR is when a BNDSTX instruction
+attempts to save bounds to a BD entry marked as invalid. This is
+an indication that no BT exists for this entry. In this case the
+fault handler will allocate a new BT.


Hi Qiaowei,

Can you make an effort to move these comments over to the code?  The
bounds-table allocation function is a much more appropriate place for
this text than the Documentation/ file.


Ok. They will be moved.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 1/3] x86, mpx: add documentation on Intel MPX

2014-02-26 Thread Ren Qiaowei

On 02/27/2014 05:44 AM, H. Peter Anvin wrote:

On 02/26/2014 11:17 AM, Dave Hansen wrote:

On 02/23/2014 05:27 AM, Qiaowei Ren wrote:

+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.


Hi Qiaowei,

Does this mean that if userspace decided to map something in the way of
the bounds tables that it would break MPX?

Also, in the description, could we s/linear/virtual/?  Linear seems to
be the term that Intel likes to use in its documents, but we almost
universally call them virtual addresses in the kernel.



It might be useful to clarify, though, just so noone gets confused with
effective addresses (offsets).

Ok. Some words here come from some MPX public documents. I need to go 
through more carefully.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v5 2/3] x86, mpx: hook #BR exception handler to allocate bound tables

2014-02-25 Thread Ren, Qiaowei

> -Original Message-
> From: H. Peter Anvin [mailto:h...@zytor.com]
> Sent: Tuesday, February 25, 2014 1:52 AM
> To: Hansen, Dave; Ren, Qiaowei; Thomas Gleixner; Ingo Molnar
> Cc: x...@kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v5 2/3] x86, mpx: hook #BR exception handler to allocate
> bound tables
> 
> On 02/24/2014 09:27 AM, Dave Hansen wrote:
> >
> > Can you talk a little bit about what the design is here?  Why does the
> > kernel have to do the allocation of the virtual address space?  Does
> > it really need to MAP_POPULATE?  bt_size looks like 4MB, and that's an
> > awful lot of memory to eat up at once.  Shouldn't we just let the
> > kernel demand-fault this like everything else?
> >
> 
> MAP_POPULATE definitely seems like the wrong thing.
> 
Oh. This option should be removed.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v5 2/3] x86, mpx: hook #BR exception handler to allocate bound tables

2014-02-25 Thread Ren, Qiaowei

 -Original Message-
 From: H. Peter Anvin [mailto:h...@zytor.com]
 Sent: Tuesday, February 25, 2014 1:52 AM
 To: Hansen, Dave; Ren, Qiaowei; Thomas Gleixner; Ingo Molnar
 Cc: x...@kernel.org; linux-kernel@vger.kernel.org
 Subject: Re: [PATCH v5 2/3] x86, mpx: hook #BR exception handler to allocate
 bound tables
 
 On 02/24/2014 09:27 AM, Dave Hansen wrote:
 
  Can you talk a little bit about what the design is here?  Why does the
  kernel have to do the allocation of the virtual address space?  Does
  it really need to MAP_POPULATE?  bt_size looks like 4MB, and that's an
  awful lot of memory to eat up at once.  Shouldn't we just let the
  kernel demand-fault this like everything else?
 
 
 MAP_POPULATE definitely seems like the wrong thing.
 
Oh. This option should be removed.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables

2014-02-20 Thread Ren Qiaowei

On 02/13/2014 04:19 AM, Andy Lutomirski wrote:

On 02/12/2014 10:36 AM, Qiaowei Ren wrote:

An access to an invalid bound directory entry will cause a #BR
exception. This patch hook #BR exception handler to allocate
one bound table and bind it with that buond directory entry.

This will avoid the need of forwarding the #BR exception
to the user space when bound directory has invalid entry.

Signed-off-by: Qiaowei Ren 
---
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);
+}


This still just loops on failure, right?


Seems like that SIGBUS should be raised if the allocation fail.

if (!do_mpx_bt_fault(xsave_buf))
force_sig(SIGBUS, tsk);

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables

2014-02-20 Thread Ren Qiaowei

On 02/13/2014 04:19 AM, Andy Lutomirski wrote:

On 02/12/2014 10:36 AM, Qiaowei Ren wrote:

An access to an invalid bound directory entry will cause a #BR
exception. This patch hook #BR exception handler to allocate
one bound table and bind it with that buond directory entry.

This will avoid the need of forwarding the #BR exception
to the user space when bound directory has invalid entry.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL  (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf-bndcsr.cfg_reg_u  MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf-bndcsr.status_reg;
+
+   bd_entry = status  MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry = bd_base)  (bd_entry  bd_base + bd_size))
+   allocate_bt(bd_entry);
+}


This still just loops on failure, right?


Seems like that SIGBUS should be raised if the allocation fail.

if (!do_mpx_bt_fault(xsave_buf))
force_sig(SIGBUS, tsk);

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/4] Intel MPX support

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 02:42 PM, Ingo Molnar wrote:


* Ren Qiaowei  wrote:


MPX kernel code, namely this patchset, has mainly the 2
responsibilities: provide handlers for bounds faults (#BR), and
manage bounds memory.


AFAICS the kernel side implementation causes no runtime overhead
for non-MPX workloads, and also causes no runtime overhead for
non-MPX hardware, right?


Yes.


Actually, I think that's not entirely true.

For example if within the same mm there's a lot of non-MPX threads and
an MPX thread, then the MMU notifier will be called for MMU operations
of every non-MPX thread as well!

So MPX state of a thread will slow down all the other non-MPX threads
as well.

The statement is only true for non-MPX tasks that have their separate
mm's that does not have a single MPX thread.



Yes. Though all non-MPX threads are slowed down, the whole process 
benefit from MPX.


Anyway, HPA suggest these syscalls, which use MMU notifier, should be 
not needed, we can do what they do in userspace runtime. What do you 
think about it? I guess that I should remove the third patch which adds 
new prctl() syscalls in next version of this patchset.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 02:42 PM, Andy Lutomirski wrote:

I just read it.  do_trap_no_signal presumably calls fixup_exception
because #UD uses it and #UD needs that handling.  (I'm guessing that
there is actually a legitimate use for a kernel fixup on #UD somewhere
-- there's probably something that isn't covered by cpuid.)

There should not be a #BR from the kernel using the fixup mechanism.
IMO if the exception comes from the kernel, it should unconditionally
call die.

Oh. I agree with you, and if a #BR from the kernel it should 
unconditionally call die.


if (!user_mode(regs))
die("bounds", regs, error_code);

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 01:21 PM, Andy Lutomirski wrote:

On Mon, Jan 27, 2014 at 7:35 PM, Ren Qiaowei  wrote:

On 01/28/2014 04:36 AM, Andy Lutomirski wrote:


+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);



What happens if this fails?  Retrying forever isn't very nice.


If allocation of the bound table fail, the related entry in the bound
directory is still invalid. The following access to this entry still produce
#BR fault.



By the "following access" I think you mean the same instruction that
just trapped -- it will trap again because the exception hasn't been
fixed up.  Then mmap will fail again, and you'll retry again, leading
to an infinite loop.


I don't mean the same instruction that just trapped.


I think that failure to fix up the exception should either let the
normal bounds error through or should raise SIGBUS.

Maybe we need HPA help answer this question. Peter, what do you think 
about it? If allocation of the bound table fail, what should we do?





+   if (!user_mode(regs)) {
+   if (!fixup_exception(regs)) {
+   tsk->thread.error_code = error_code;
+   tsk->thread.trap_nr = X86_TRAP_BR;
+   die("bounds", regs, error_code);
+   }



Why the fixup?  Unless I'm missing something, the kernel has no business
getting #BR on access to a user address.

Or are you adding code to allow the kernel to use MPX itself?  If so,
shouldn't this use an MPX-specific fixup to allow normal C code to use
this stuff?


It checks whether #BR come from user-space. You can see do_trap_no_signal().


Wasn't #BR using do_trap before?  do_trap doesn't call
fixup_exception.  I don't see why it should do it now.  (I also don't
think it should come from kernel space until someone adds kernel-mode
MPX support.)

do_trap() -> do_trap_no_signal() call similar code to check if the fault 
occurred in userspace or kernel space. You can see previous discussion 
for the first version of this patchset.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 1/4] x86, mpx: add documentation on Intel MPX

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 04:27 AM, Andy Lutomirski wrote:

On 01/26/2014 01:08 AM, Qiaowei Ren wrote:

+1) Providing handlers for bounds faults (#BR).
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults. If a bounds overflow occurs then a #BR is generated.
+The fault handler will decode MPX instructions to get violation
+address and set this address into extended struct siginfo.


Can you document exactly where the insn address and pointer value end
up?  (If it's in the (IMO hideous) cr2 field in ucontext, this needs to
be documented.  If it's somewhere useful in siginfo, that should also be
documented to save people the time it takes to figure it out.)


Ok. I will describe extended siginfo at this documentation.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 04:36 AM, Andy Lutomirski wrote:

+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);


What happens if this fails?  Retrying forever isn't very nice.

If allocation of the bound table fail, the related entry in the bound 
directory is still invalid. The following access to this entry still 
produce #BR fault.



+   if (!user_mode(regs)) {
+   if (!fixup_exception(regs)) {
+   tsk->thread.error_code = error_code;
+   tsk->thread.trap_nr = X86_TRAP_BR;
+   die("bounds", regs, error_code);
+   }


Why the fixup?  Unless I'm missing something, the kernel has no business
getting #BR on access to a user address.

Or are you adding code to allow the kernel to use MPX itself?  If so,
shouldn't this use an MPX-specific fixup to allow normal C code to use
this stuff?


It checks whether #BR come from user-space. You can see do_trap_no_signal().


+   goto exit;
+   }
+
+   if (!boot_cpu_has(X86_FEATURE_MPX)) {
+   do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
+   goto exit;


This, as well as the status == 0 case, should probably document that the
exception is from BOUND, not MPX.


Ok. I will add one comment for this.



+   break;
+
+   case 1: /* Bound violation. */
+   case 0: /* No MPX exception. */
+   do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
+   break;


What does "No Intel MPX exception" mean?  Surely that has business
sending #BR.

Oh. It comes from spec, and just mean it is not from MPX. :) I will 
change it to be accurate.



+
+   default:
+   break;


What does status 3 mean?  The docs say "reserved".  Presumably this
should log and kill the process.

I guess it should be a good suggestion.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 05:58 AM, Andy Lutomirski wrote:

On 01/26/2014 01:08 AM, Qiaowei Ren wrote:
(Why on earth does Intel not expose this stuff in cr2 or an MSR or
something?)



I guess it is due to some design reason.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 05:58 AM, Andy Lutomirski wrote:

On 01/26/2014 01:08 AM, Qiaowei Ren wrote:
(Why on earth does Intel not expose this stuff in cr2 or an MSR or
something?)



I guess it is due to some design reason.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 04:36 AM, Andy Lutomirski wrote:

+   bd_entry = status  MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry = bd_base)  (bd_entry  bd_base + bd_size))
+   allocate_bt(bd_entry);


What happens if this fails?  Retrying forever isn't very nice.

If allocation of the bound table fail, the related entry in the bound 
directory is still invalid. The following access to this entry still 
produce #BR fault.



+   if (!user_mode(regs)) {
+   if (!fixup_exception(regs)) {
+   tsk-thread.error_code = error_code;
+   tsk-thread.trap_nr = X86_TRAP_BR;
+   die(bounds, regs, error_code);
+   }


Why the fixup?  Unless I'm missing something, the kernel has no business
getting #BR on access to a user address.

Or are you adding code to allow the kernel to use MPX itself?  If so,
shouldn't this use an MPX-specific fixup to allow normal C code to use
this stuff?


It checks whether #BR come from user-space. You can see do_trap_no_signal().


+   goto exit;
+   }
+
+   if (!boot_cpu_has(X86_FEATURE_MPX)) {
+   do_trap(X86_TRAP_BR, SIGSEGV, bounds, regs, error_code, NULL);
+   goto exit;


This, as well as the status == 0 case, should probably document that the
exception is from BOUND, not MPX.


Ok. I will add one comment for this.



+   break;
+
+   case 1: /* Bound violation. */
+   case 0: /* No MPX exception. */
+   do_trap(X86_TRAP_BR, SIGSEGV, bounds, regs, error_code, NULL);
+   break;


What does No Intel MPX exception mean?  Surely that has business
sending #BR.

Oh. It comes from spec, and just mean it is not from MPX. :) I will 
change it to be accurate.



+
+   default:
+   break;


What does status 3 mean?  The docs say reserved.  Presumably this
should log and kill the process.

I guess it should be a good suggestion.

Thanks,
Qiaowei

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 1/4] x86, mpx: add documentation on Intel MPX

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 04:27 AM, Andy Lutomirski wrote:

On 01/26/2014 01:08 AM, Qiaowei Ren wrote:

+1) Providing handlers for bounds faults (#BR).
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults. If a bounds overflow occurs then a #BR is generated.
+The fault handler will decode MPX instructions to get violation
+address and set this address into extended struct siginfo.


Can you document exactly where the insn address and pointer value end
up?  (If it's in the (IMO hideous) cr2 field in ucontext, this needs to
be documented.  If it's somewhere useful in siginfo, that should also be
documented to save people the time it takes to figure it out.)


Ok. I will describe extended siginfo at this documentation.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 01:21 PM, Andy Lutomirski wrote:

On Mon, Jan 27, 2014 at 7:35 PM, Ren Qiaowei qiaowei@intel.com wrote:

On 01/28/2014 04:36 AM, Andy Lutomirski wrote:


+   bd_entry = status  MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry = bd_base)  (bd_entry  bd_base + bd_size))
+   allocate_bt(bd_entry);



What happens if this fails?  Retrying forever isn't very nice.


If allocation of the bound table fail, the related entry in the bound
directory is still invalid. The following access to this entry still produce
#BR fault.



By the following access I think you mean the same instruction that
just trapped -- it will trap again because the exception hasn't been
fixed up.  Then mmap will fail again, and you'll retry again, leading
to an infinite loop.


I don't mean the same instruction that just trapped.


I think that failure to fix up the exception should either let the
normal bounds error through or should raise SIGBUS.

Maybe we need HPA help answer this question. Peter, what do you think 
about it? If allocation of the bound table fail, what should we do?





+   if (!user_mode(regs)) {
+   if (!fixup_exception(regs)) {
+   tsk-thread.error_code = error_code;
+   tsk-thread.trap_nr = X86_TRAP_BR;
+   die(bounds, regs, error_code);
+   }



Why the fixup?  Unless I'm missing something, the kernel has no business
getting #BR on access to a user address.

Or are you adding code to allow the kernel to use MPX itself?  If so,
shouldn't this use an MPX-specific fixup to allow normal C code to use
this stuff?


It checks whether #BR come from user-space. You can see do_trap_no_signal().


Wasn't #BR using do_trap before?  do_trap doesn't call
fixup_exception.  I don't see why it should do it now.  (I also don't
think it should come from kernel space until someone adds kernel-mode
MPX support.)

do_trap() - do_trap_no_signal() call similar code to check if the fault 
occurred in userspace or kernel space. You can see previous discussion 
for the first version of this patchset.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 02:42 PM, Andy Lutomirski wrote:

I just read it.  do_trap_no_signal presumably calls fixup_exception
because #UD uses it and #UD needs that handling.  (I'm guessing that
there is actually a legitimate use for a kernel fixup on #UD somewhere
-- there's probably something that isn't covered by cpuid.)

There should not be a #BR from the kernel using the fixup mechanism.
IMO if the exception comes from the kernel, it should unconditionally
call die.

Oh. I agree with you, and if a #BR from the kernel it should 
unconditionally call die.


if (!user_mode(regs))
die(bounds, regs, error_code);

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/4] Intel MPX support

2014-01-27 Thread Ren Qiaowei

On 01/28/2014 02:42 PM, Ingo Molnar wrote:


* Ren Qiaowei qiaowei@intel.com wrote:


MPX kernel code, namely this patchset, has mainly the 2
responsibilities: provide handlers for bounds faults (#BR), and
manage bounds memory.


AFAICS the kernel side implementation causes no runtime overhead
for non-MPX workloads, and also causes no runtime overhead for
non-MPX hardware, right?


Yes.


Actually, I think that's not entirely true.

For example if within the same mm there's a lot of non-MPX threads and
an MPX thread, then the MMU notifier will be called for MMU operations
of every non-MPX thread as well!

So MPX state of a thread will slow down all the other non-MPX threads
as well.

The statement is only true for non-MPX tasks that have their separate
mm's that does not have a single MPX thread.



Yes. Though all non-MPX threads are slowed down, the whole process 
benefit from MPX.


Anyway, HPA suggest these syscalls, which use MMU notifier, should be 
not needed, we can do what they do in userspace runtime. What do you 
think about it? I guess that I should remove the third patch which adds 
new prctl() syscalls in next version of this patchset.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 10:10 AM, H. Peter Anvin wrote:

On 01/26/2014 05:55 PM, Ren Qiaowei wrote:


Peter, you mean we should remove these two call and do what they do in
user-space, right?



Unless we think there is a benefit to the kernel to have a on/off switch
for the #BR exception (if disabled, all #BR exceptions are signals,
regardless of source.)

There might be, would like other people's opinion.



These two syscalls are only used to release automaticlly the bound 
tables, and they will not set BNDSTATUS and BNDCFGU.


I want to remove them also. :) But if so, we have to release the bound 
tables in user-space.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 11:14 PM, Ingo Molnar wrote:


* Ren, Qiaowei  wrote:


The size of one bound table is 4M bytes for 64bit, and 16K bytes for
32bit. It can not be accessed by user-space, and it will be accessed
automatically by hardware.


So, here's the bound-table allocation AFAICS:

+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);

What ensures that user-space cannot access (and in particular, modify)
the pages at bt_addr? It's a read-write anonymous mapping AFAICS.

Looks like that we can not be able to ensure this. I just mean that 
user-space doesn't know the bound tables, and it should not access them 
also.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 09:53 AM, H. Peter Anvin wrote:

On 01/26/2014 05:34 PM, Ren Qiaowei wrote:



I agree with you and we should suppress all the warnings as possible. If
I use (unsgined long) to cast like the following code, what do you think
about it? sizeof(long) will be 4 for 32-bit.

 info->si_lower = (void __user *)(unsigned long)
 (xsave_buf->bndregs.bndregs[2*bndregno]);
 info->si_upper = (void __user *)(unsigned long)
 (~xsave_buf->bndregs.bndregs[2*bndregno+1]);



That is the way it is usually done, yes.  Add a comment saying something
like:

/* Note: the upper 32 bits are ignored in 32-bit mode. */

It is worth watching out a bit here, though, because you could be
running a 32-bit process on top of a 64-bit kernel.


Ok. I will update it in next version.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 09:50 AM, H. Peter Anvin wrote:

On 01/26/2014 12:39 AM, Ingo Molnar wrote:


It will be only once per startup.


In that case it would be more efficient to make this part of the
binary execution environment so that exec() sets it up automatically,
not a separate prctl() syscall.



This is not necessarily possible, and in particular it might need to be
deferred until the MPX runtime has initialized.

What isn't clear to me is if these syscalls are needed at all, or if it
would be better to just let the MPX runtile set BNDSTATUS and BNDCFGU
directly in userspace.  The kernel cannot rely on them staying
consistent across userspace anyway.

Now, it might be beneficial for the kernel to have them anyway.  It's a
bit of a tough call.

-hpa

Peter, you mean we should remove these two call and do what they do in 
user-space, right?


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 05:38 AM, David Rientjes wrote:

On Sun, 26 Jan 2014, Ren Qiaowei wrote:


arch/x86/kernel/mpx.c: In function ‘do_mpx_bounds’:
arch/x86/kernel/mpx.c:407:3: warning: cast to pointer from integer of
different size [-Wint-to-pointer-cast]
arch/x86/kernel/mpx.c:409:3: warning: cast to pointer from integer of
different size [-Wint-to-pointer-cast]

and the documentation says you explicitly want to support this config.

These types of warnings are usually indicative of real problems when
you're storing upper and lower bits in 32-bit fields after casting them
from 64-bit values.

I'm also not sure if the added fields to the generic struct siginfo can be
justified for this.


According to MPX spec, for 32-bit case, the upper 32-bits of 64-bits bound
register are ignored, and so casting to pointer from 64-bit values should be
not produce any problems.



Ok, so this is intended per the spec which nobody reading the code is
going to know and people who report the compile warnings are going to
continue to question it.

How are you planning on suppressing the warnings?  It will probably
require either

  - separate 64-bit and 32-bit helper functions to do_mpx_bounds() to
do appropriate casts before casting to a pointer, or

  - a macro defined as a no-op for 64-bit and as a cast to 32-bit value
for 32-bit configs that will be used in do_mpx_bounds() and casted
to the pointer.

I agree with you and we should suppress all the warnings as possible. If 
I use (unsgined long) to cast like the following code, what do you think 
about it? sizeof(long) will be 4 for 32-bit.


info->si_lower = (void __user *)(unsigned long)
(xsave_buf->bndregs.bndregs[2*bndregno]);
info->si_upper = (void __user *)(unsigned long)
(~xsave_buf->bndregs.bndregs[2*bndregno+1]);

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

> -Original Message-
> From: Ingo Molnar [mailto:mingo.kernel@gmail.com] On Behalf Of Ingo
> Molnar
> Sent: Sunday, January 26, 2014 5:08 PM
> To: Ren, Qiaowei
> Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
> linux-kernel@vger.kernel.org; Peter Zijlstra; Linus Torvalds; Andrew Morton
> Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
> PR_MPX_RELEASE
> 
> 
> * Qiaowei Ren  wrote:
> 
> > @@ -7,6 +9,88 @@
> >  #include 
> >  #include 
> >
> > +static struct mmu_notifier mpx_mn;
> 
> > +static struct mmu_notifier_ops mpx_mmuops = {
> > +   .invalidate_range_end = mpx_invl_range_end, };
> > +
> > +int mpx_init(struct task_struct *tsk) {
> > +   if (!boot_cpu_has(X86_FEATURE_MPX))
> > +   return -EINVAL;
> > +
> > +   /* register mmu_notifier to delallocate the bound tables */
> > +   mpx_mn.ops = _mmuops;
> > +   mmu_notifier_register(_mn, current->mm);
> 
> 0)
> 
> I do think MPX should be supported by Linux, but this is the best thing I can 
> say
> about the code at the moment:
> 
> 1)
> 
> The above MMU notifier bit is absolutely disgusting: it adds an
> O(nr_mpx_tasks) overhead to every MMU operation!
> 
> MPX needs to be called from architecture methods. (And the MMU notifier
> should probably be removed, it literally invites such abuse.)
> 
> 2)
> 
> The whole MPX_INIT() macro wrappery is ugly beyond relief.
> 
> 3)
> 
> The kernel/sys.c bits are x86-only, it probably won't even build on other
> architectures.
> 
> 4)
> 
> I also argue that MPX setup should be automatic through the ELF
> loader:
> 
>  - MPX setup could also be initiated through the ELF notes section or
>so - similar to the executable bit stack attribute in ELF binaries.
>(See PT_GNU_STACK handling in fs/binfmt_elf.c.)
> 
>  - What is the typical life time of the bounds table? Does user-space
>get access to it? I don't see where it's discoverable to
>user-space. (for example for debuggers)
> 
> 5)
> 
> Teardown can be done through regular munmap() if the notifier is eliminated,
> so that step falls away as well.
> 
> 6)
> 
> How many entries are in the bounds table? Because mpx_invl_range_end()
> looks utterly unscalable if it has any size beyond 1-2 entries.
> 
The size of one bound table is 4M bytes for 64bit, and 16K bytes for 32bit. It 
can not be accessed by user-space, and it will be accessed automatically by 
hardware.

When #BR faults come, and the entry of the bound directory is invalid, one 
bound table have to be allocated to save value of the bounds. It is what the 
second patch does. As for the lifetime of the bound tables, we can see the 
following case:
  User application use malloc or mmap to dynamically allocate memory. when 
address in the memory region is first accessed, related entry in the bound 
directory will be checked and it should be invalid. Then one new bound table 
will be allocated.
  After a time, this memory area is released, and so the bound tables related 
with this memory area become meaningless and may be released. If we don't 
release these unnecessary bound tables, it will save the workload of allocation 
of new bound tables when the memory area related with the entry of the bound 
directory is allocated and accessed again. But in this way the loss of virtual 
space will have to be worried.

Anyway, what this patch does is only that when a piece of address space is 
unmapped, we destroy/unmap the bounds tables that correspond to that same 
address space. MMU notifier will add some overhead for those tasks which use 
MPX. But I have no better way to know when one address space is unmapped, and 
then destroy the related bounds tables. Do you have any suggestion?

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org
> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Ingo Molnar
> Sent: Sunday, January 26, 2014 4:40 PM
> To: Ren, Qiaowei
> Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
> linux-kernel@vger.kernel.org; Peter Zijlstra
> Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
> PR_MPX_RELEASE
> 
> 
> * Ren Qiaowei  wrote:
> 
> > On 01/26/2014 04:22 PM, Ingo Molnar wrote:
> > >
> > >* Qiaowei Ren  wrote:
> > >
> > >>This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands
> > >>on the x86 platform. These commands can be used to init and release
> > >>MPX related resource.
> > >>
> > >>A MMU notifier will be registered during PR_MPX_INIT command
> > >>execution. So the bound tables can be automatically deallocated when
> > >>one memory area is unmapped.
> > >>
> > >>Signed-off-by: Qiaowei Ren 
> > >>---
> > >>  arch/x86/Kconfig |4 ++
> > >>  arch/x86/include/asm/mpx.h   |9 
> > >>  arch/x86/include/asm/processor.h |   16 +++
> > >>  arch/x86/kernel/mpx.c|   84
> ++
> > >>  include/uapi/linux/prctl.h   |6 +++
> > >>  kernel/sys.c |   12 +
> > >>  6 files changed, 131 insertions(+), 0 deletions(-)
> > >
> > > Hm. What is the expected typical frequency of these syscalls for a
> > > larger application like a web browser? Only once per startup
> > > typically, or will they be called more frequently?
> >
> > It will be only once per startup.
> 
> In that case it would be more efficient to make this part of the binary 
> execution
> environment so that exec() sets it up automatically, not a separate prctl()
> syscall.
> 
Sorry, I guess what I said is not accurate. Normally it will be only once per 
startup. But user application maybe only partly want to use MPX, e.g. only one 
thread of process. For those cases, user application need to enable MPX for 
specific part of the code. So it is not enough to set it up automatically.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 04:22 PM, Ingo Molnar wrote:


* Qiaowei Ren  wrote:


This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
init and release MPX related resource.

A MMU notifier will be registered during PR_MPX_INIT
command execution. So the bound tables can be automatically
deallocated when one memory area is unmapped.

Signed-off-by: Qiaowei Ren 
---
  arch/x86/Kconfig |4 ++
  arch/x86/include/asm/mpx.h   |9 
  arch/x86/include/asm/processor.h |   16 +++
  arch/x86/kernel/mpx.c|   84 ++
  include/uapi/linux/prctl.h   |6 +++
  kernel/sys.c |   12 +
  6 files changed, 131 insertions(+), 0 deletions(-)


Hm. What is the expected typical frequency of these syscalls for a
larger application like a web browser? Only once per startup
typically, or will they be called more frequently?



It will be only once per startup.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/4] Intel MPX support

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 04:19 PM, Ingo Molnar wrote:


* Qiaowei Ren  wrote:


This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support is available in a
separate branch in common GCC SVN repository. See GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.


AFAICS the kernel side implementation causes no runtime overhead for
non-MPX workloads, and also causes no runtime overhead for non-MPX
hardware, right?


Yes.


Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


Changes since v1:
   * check to see if #BR occurred in userspace or kernel space.
   * use generic structure and macro as much as possible when
 decode mpx instructions.

Changes since v2:
   * fix some compile warnings.
   * update documentation.

Qiaowei Ren (4):
   x86, mpx: add documentation on Intel MPX
   x86, mpx: hook #BR exception handler to allocate bound tables
   x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
   x86, mpx: extend siginfo structure to include bound violation
 information

  Documentation/x86/intel_mpx.txt|  226 
  arch/x86/Kconfig   |4 +
  arch/x86/include/asm/mpx.h |   63 ++
  arch/x86/include/asm/processor.h   |   16 ++
  arch/x86/kernel/Makefile   |1 +
  arch/x86/kernel/mpx.c  |  415 
  arch/x86/kernel/traps.c|   61 +-
  include/uapi/asm-generic/siginfo.h |9 +-
  include/uapi/linux/prctl.h |6 +
  kernel/signal.c|4 +
  kernel/sys.c   |   12 +
  11 files changed, 815 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/x86/intel_mpx.txt
  create mode 100644 arch/x86/include/asm/mpx.h
  create mode 100644 arch/x86/kernel/mpx.c


Ok, this summary was pretty good - it addresses my prior objections
wrt. submission quality. Once the details are fleshed out this sure
looks like a useful feature.


Thanks. I apologize for previous submission.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/4] Intel MPX support

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 04:19 PM, Ingo Molnar wrote:


* Qiaowei Ren qiaowei@intel.com wrote:


This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in Intel(R) Architecture
Instruction Set Extensions Programming Reference.

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support is available in a
separate branch in common GCC SVN repository. See GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.


AFAICS the kernel side implementation causes no runtime overhead for
non-MPX workloads, and also causes no runtime overhead for non-MPX
hardware, right?


Yes.


Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


Changes since v1:
   * check to see if #BR occurred in userspace or kernel space.
   * use generic structure and macro as much as possible when
 decode mpx instructions.

Changes since v2:
   * fix some compile warnings.
   * update documentation.

Qiaowei Ren (4):
   x86, mpx: add documentation on Intel MPX
   x86, mpx: hook #BR exception handler to allocate bound tables
   x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
   x86, mpx: extend siginfo structure to include bound violation
 information

  Documentation/x86/intel_mpx.txt|  226 
  arch/x86/Kconfig   |4 +
  arch/x86/include/asm/mpx.h |   63 ++
  arch/x86/include/asm/processor.h   |   16 ++
  arch/x86/kernel/Makefile   |1 +
  arch/x86/kernel/mpx.c  |  415 
  arch/x86/kernel/traps.c|   61 +-
  include/uapi/asm-generic/siginfo.h |9 +-
  include/uapi/linux/prctl.h |6 +
  kernel/signal.c|4 +
  kernel/sys.c   |   12 +
  11 files changed, 815 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/x86/intel_mpx.txt
  create mode 100644 arch/x86/include/asm/mpx.h
  create mode 100644 arch/x86/kernel/mpx.c


Ok, this summary was pretty good - it addresses my prior objections
wrt. submission quality. Once the details are fleshed out this sure
looks like a useful feature.


Thanks. I apologize for previous submission.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 04:22 PM, Ingo Molnar wrote:


* Qiaowei Ren qiaowei@intel.com wrote:


This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
init and release MPX related resource.

A MMU notifier will be registered during PR_MPX_INIT
command execution. So the bound tables can be automatically
deallocated when one memory area is unmapped.

Signed-off-by: Qiaowei Ren qiaowei@intel.com
---
  arch/x86/Kconfig |4 ++
  arch/x86/include/asm/mpx.h   |9 
  arch/x86/include/asm/processor.h |   16 +++
  arch/x86/kernel/mpx.c|   84 ++
  include/uapi/linux/prctl.h   |6 +++
  kernel/sys.c |   12 +
  6 files changed, 131 insertions(+), 0 deletions(-)


Hm. What is the expected typical frequency of these syscalls for a
larger application like a web browser? Only once per startup
typically, or will they be called more frequently?



It will be only once per startup.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

 -Original Message-
 From: linux-kernel-ow...@vger.kernel.org
 [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Ingo Molnar
 Sent: Sunday, January 26, 2014 4:40 PM
 To: Ren, Qiaowei
 Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
 linux-kernel@vger.kernel.org; Peter Zijlstra
 Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
 PR_MPX_RELEASE
 
 
 * Ren Qiaowei qiaowei@intel.com wrote:
 
  On 01/26/2014 04:22 PM, Ingo Molnar wrote:
  
  * Qiaowei Ren qiaowei@intel.com wrote:
  
  This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands
  on the x86 platform. These commands can be used to init and release
  MPX related resource.
  
  A MMU notifier will be registered during PR_MPX_INIT command
  execution. So the bound tables can be automatically deallocated when
  one memory area is unmapped.
  
  Signed-off-by: Qiaowei Ren qiaowei@intel.com
  ---
arch/x86/Kconfig |4 ++
arch/x86/include/asm/mpx.h   |9 
arch/x86/include/asm/processor.h |   16 +++
arch/x86/kernel/mpx.c|   84
 ++
include/uapi/linux/prctl.h   |6 +++
kernel/sys.c |   12 +
6 files changed, 131 insertions(+), 0 deletions(-)
  
   Hm. What is the expected typical frequency of these syscalls for a
   larger application like a web browser? Only once per startup
   typically, or will they be called more frequently?
 
  It will be only once per startup.
 
 In that case it would be more efficient to make this part of the binary 
 execution
 environment so that exec() sets it up automatically, not a separate prctl()
 syscall.
 
Sorry, I guess what I said is not accurate. Normally it will be only once per 
startup. But user application maybe only partly want to use MPX, e.g. only one 
thread of process. For those cases, user application need to enable MPX for 
specific part of the code. So it is not enough to set it up automatically.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren, Qiaowei

 -Original Message-
 From: Ingo Molnar [mailto:mingo.kernel@gmail.com] On Behalf Of Ingo
 Molnar
 Sent: Sunday, January 26, 2014 5:08 PM
 To: Ren, Qiaowei
 Cc: H. Peter Anvin; Thomas Gleixner; Ingo Molnar; x...@kernel.org;
 linux-kernel@vger.kernel.org; Peter Zijlstra; Linus Torvalds; Andrew Morton
 Subject: Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT,
 PR_MPX_RELEASE
 
 
 * Qiaowei Ren qiaowei@intel.com wrote:
 
  @@ -7,6 +9,88 @@
   #include asm/fpu-internal.h
   #include asm/alternative.h
 
  +static struct mmu_notifier mpx_mn;
 
  +static struct mmu_notifier_ops mpx_mmuops = {
  +   .invalidate_range_end = mpx_invl_range_end, };
  +
  +int mpx_init(struct task_struct *tsk) {
  +   if (!boot_cpu_has(X86_FEATURE_MPX))
  +   return -EINVAL;
  +
  +   /* register mmu_notifier to delallocate the bound tables */
  +   mpx_mn.ops = mpx_mmuops;
  +   mmu_notifier_register(mpx_mn, current-mm);
 
 0)
 
 I do think MPX should be supported by Linux, but this is the best thing I can 
 say
 about the code at the moment:
 
 1)
 
 The above MMU notifier bit is absolutely disgusting: it adds an
 O(nr_mpx_tasks) overhead to every MMU operation!
 
 MPX needs to be called from architecture methods. (And the MMU notifier
 should probably be removed, it literally invites such abuse.)
 
 2)
 
 The whole MPX_INIT() macro wrappery is ugly beyond relief.
 
 3)
 
 The kernel/sys.c bits are x86-only, it probably won't even build on other
 architectures.
 
 4)
 
 I also argue that MPX setup should be automatic through the ELF
 loader:
 
  - MPX setup could also be initiated through the ELF notes section or
so - similar to the executable bit stack attribute in ELF binaries.
(See PT_GNU_STACK handling in fs/binfmt_elf.c.)
 
  - What is the typical life time of the bounds table? Does user-space
get access to it? I don't see where it's discoverable to
user-space. (for example for debuggers)
 
 5)
 
 Teardown can be done through regular munmap() if the notifier is eliminated,
 so that step falls away as well.
 
 6)
 
 How many entries are in the bounds table? Because mpx_invl_range_end()
 looks utterly unscalable if it has any size beyond 1-2 entries.
 
The size of one bound table is 4M bytes for 64bit, and 16K bytes for 32bit. It 
can not be accessed by user-space, and it will be accessed automatically by 
hardware.

When #BR faults come, and the entry of the bound directory is invalid, one 
bound table have to be allocated to save value of the bounds. It is what the 
second patch does. As for the lifetime of the bound tables, we can see the 
following case:
  User application use malloc or mmap to dynamically allocate memory. when 
address in the memory region is first accessed, related entry in the bound 
directory will be checked and it should be invalid. Then one new bound table 
will be allocated.
  After a time, this memory area is released, and so the bound tables related 
with this memory area become meaningless and may be released. If we don't 
release these unnecessary bound tables, it will save the workload of allocation 
of new bound tables when the memory area related with the entry of the bound 
directory is allocated and accessed again. But in this way the loss of virtual 
space will have to be worried.

Anyway, what this patch does is only that when a piece of address space is 
unmapped, we destroy/unmap the bounds tables that correspond to that same 
address space. MMU notifier will add some overhead for those tasks which use 
MPX. But I have no better way to know when one address space is unmapped, and 
then destroy the related bounds tables. Do you have any suggestion?

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 05:38 AM, David Rientjes wrote:

On Sun, 26 Jan 2014, Ren Qiaowei wrote:


arch/x86/kernel/mpx.c: In function ‘do_mpx_bounds’:
arch/x86/kernel/mpx.c:407:3: warning: cast to pointer from integer of
different size [-Wint-to-pointer-cast]
arch/x86/kernel/mpx.c:409:3: warning: cast to pointer from integer of
different size [-Wint-to-pointer-cast]

and the documentation says you explicitly want to support this config.

These types of warnings are usually indicative of real problems when
you're storing upper and lower bits in 32-bit fields after casting them
from 64-bit values.

I'm also not sure if the added fields to the generic struct siginfo can be
justified for this.


According to MPX spec, for 32-bit case, the upper 32-bits of 64-bits bound
register are ignored, and so casting to pointer from 64-bit values should be
not produce any problems.



Ok, so this is intended per the spec which nobody reading the code is
going to know and people who report the compile warnings are going to
continue to question it.

How are you planning on suppressing the warnings?  It will probably
require either

  - separate 64-bit and 32-bit helper functions to do_mpx_bounds() to
do appropriate casts before casting to a pointer, or

  - a macro defined as a no-op for 64-bit and as a cast to 32-bit value
for 32-bit configs that will be used in do_mpx_bounds() and casted
to the pointer.

I agree with you and we should suppress all the warnings as possible. If 
I use (unsgined long) to cast like the following code, what do you think 
about it? sizeof(long) will be 4 for 32-bit.


info-si_lower = (void __user *)(unsigned long)
(xsave_buf-bndregs.bndregs[2*bndregno]);
info-si_upper = (void __user *)(unsigned long)
(~xsave_buf-bndregs.bndregs[2*bndregno+1]);

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 09:50 AM, H. Peter Anvin wrote:

On 01/26/2014 12:39 AM, Ingo Molnar wrote:


It will be only once per startup.


In that case it would be more efficient to make this part of the
binary execution environment so that exec() sets it up automatically,
not a separate prctl() syscall.



This is not necessarily possible, and in particular it might need to be
deferred until the MPX runtime has initialized.

What isn't clear to me is if these syscalls are needed at all, or if it
would be better to just let the MPX runtile set BNDSTATUS and BNDCFGU
directly in userspace.  The kernel cannot rely on them staying
consistent across userspace anyway.

Now, it might be beneficial for the kernel to have them anyway.  It's a
bit of a tough call.

-hpa

Peter, you mean we should remove these two call and do what they do in 
user-space, right?


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 09:53 AM, H. Peter Anvin wrote:

On 01/26/2014 05:34 PM, Ren Qiaowei wrote:



I agree with you and we should suppress all the warnings as possible. If
I use (unsgined long) to cast like the following code, what do you think
about it? sizeof(long) will be 4 for 32-bit.

 info-si_lower = (void __user *)(unsigned long)
 (xsave_buf-bndregs.bndregs[2*bndregno]);
 info-si_upper = (void __user *)(unsigned long)
 (~xsave_buf-bndregs.bndregs[2*bndregno+1]);



That is the way it is usually done, yes.  Add a comment saying something
like:

/* Note: the upper 32 bits are ignored in 32-bit mode. */

It is worth watching out a bit here, though, because you could be
running a 32-bit process on top of a 64-bit kernel.


Ok. I will update it in next version.

Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/26/2014 11:14 PM, Ingo Molnar wrote:


* Ren, Qiaowei qiaowei@intel.com wrote:


The size of one bound table is 4M bytes for 64bit, and 16K bytes for
32bit. It can not be accessed by user-space, and it will be accessed
automatically by hardware.


So, here's the bound-table allocation AFAICS:

+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL  (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);

What ensures that user-space cannot access (and in particular, modify)
the pages at bt_addr? It's a read-write anonymous mapping AFAICS.

Looks like that we can not be able to ensure this. I just mean that 
user-space doesn't know the bound tables, and it should not access them 
also.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-26 Thread Ren Qiaowei

On 01/27/2014 10:10 AM, H. Peter Anvin wrote:

On 01/26/2014 05:55 PM, Ren Qiaowei wrote:


Peter, you mean we should remove these two call and do what they do in
user-space, right?



Unless we think there is a benefit to the kernel to have a on/off switch
for the #BR exception (if disabled, all #BR exceptions are signals,
regardless of source.)

There might be, would like other people's opinion.



These two syscalls are only used to release automaticlly the bound 
tables, and they will not set BNDSTATUS and BNDCFGU.


I want to remove them also. :) But if so, we have to release the bound 
tables in user-space.


Thanks,
Qiaowei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   >