Re: [PATCH v8 00/10] Intel MPX support

2014-09-13 Thread Thomas Gleixner
On Fri, 12 Sep 2014, Dave Hansen wrote:

> OK, here's some revised text for patch 00/10.  Again, this will
> obviously be updated for the next post, but comments before that would
> be much appreciated.

That looks good. So much of this wants to end up in documentation as
well.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-13 Thread Thomas Gleixner
On Fri, 12 Sep 2014, Dave Hansen wrote:

> On 09/12/2014 12:21 PM, Thomas Gleixner wrote:
> > Yes, the most important question is WHY must the kernel handle the
> > bound table memory allocation in the first place. The "documentation"
> > patch completely fails to tell that.
> 
> This will become the description of "patch 04/10".  Feel free to wait

Thanks for writing this up! That helps a lot.

> until we repost these to read it, but I'm posting it here because it's
> going to be a couple of days before we actually get a new set of patches
> out.
> 
> Any suggestions for how much of this is appropriate for Documentation/
> would be much appreciated.  I don't have a good feel for it.

I think all of it. The kernels problem is definitely not that it
drains in documentation :)
 
> Having ruled out all of the userspace-only approaches for managing
> bounds tables that we could think of, we create them on demand
> in the kernel.

So what the documentation wants on top of this is the rule set which
describes the expected behaviour of sane applications and perhaps the
potential consequences for insane ones. Not that people care about
that much, but at least we can point them to documentation if they
come up with their weird ass "bug" reports :)

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-13 Thread Thomas Gleixner
On Fri, 12 Sep 2014, Dave Hansen wrote:

 On 09/12/2014 12:21 PM, Thomas Gleixner wrote:
  Yes, the most important question is WHY must the kernel handle the
  bound table memory allocation in the first place. The documentation
  patch completely fails to tell that.
 
 This will become the description of patch 04/10.  Feel free to wait

Thanks for writing this up! That helps a lot.

 until we repost these to read it, but I'm posting it here because it's
 going to be a couple of days before we actually get a new set of patches
 out.
 
 Any suggestions for how much of this is appropriate for Documentation/
 would be much appreciated.  I don't have a good feel for it.

I think all of it. The kernels problem is definitely not that it
drains in documentation :)
 
 Having ruled out all of the userspace-only approaches for managing
 bounds tables that we could think of, we create them on demand
 in the kernel.

So what the documentation wants on top of this is the rule set which
describes the expected behaviour of sane applications and perhaps the
potential consequences for insane ones. Not that people care about
that much, but at least we can point them to documentation if they
come up with their weird ass bug reports :)

Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-13 Thread Thomas Gleixner
On Fri, 12 Sep 2014, Dave Hansen wrote:

 OK, here's some revised text for patch 00/10.  Again, this will
 obviously be updated for the next post, but comments before that would
 be much appreciated.

That looks good. So much of this wants to end up in documentation as
well.

Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Dave Hansen
OK, here's some revised text for patch 00/10.  Again, this will
obviously be updated for the next post, but comments before that would
be much appreciated.

-

This patch set adds support for the Memory Protection eXtensions
(MPX) feature found in future Intel processors.  MPX is used in
conjunction with compiler changes to check memory references, and can be
used to catch buffer overflow or underflow.

For MPX to work, changes are required in the kernel, binutils and
compiler.  No source changes are required for applications, just a
recompile.

There are a lot of moving parts of this to all work right:

= Example Compiler / Application / Kernel Interaction =

1. Application developer compiles with -fmpx.  The compiler will add the
   instrumentation as well as some setup code called early after the app
   starts.  New instruction prefixes are noops for old CPUs.
2. That setup code allocates (virtual) space for the "bounds directory",
   points the "bndcfgu" register to the directory and notifies the
   kernel (via the new prctl()) that the app will be using MPX.
3. The kernel detects that the CPU has MPX, allows the new prctl() to
   succeed, and notes the location of the bounds directory.  We note it
   instead of reading it each time because the 'xsave' operation needed
   to access the bounds directory register is an expensive operation.
4. If the application needs to spill bounds out of the 4 registers, it
   issues a bndstx instruction.  Since the bounds directory is empty at
   this point, a bounds fault (#BR) is raised, the kernel allocates a
   bounds table (in the user address space) and makes the relevant
   entry in the bounds directory point to the new table. [1]
5. If the application violates the bounds specified in the bounds
   registers, a separate kind of #BR is raised which will deliver a
   signal with information about the violation in the 'struct siginfo'.
6. Whenever memory is freed, we know that it can no longer contain
   valid pointers, and we attempt to free the associated space in the
   bounds tables.  If an entire table becomes unused, we will attempt
   to free the table and remove the entry in the directory.

To summarize, there are essentially three things interacting here:

GCC with -fmpx:
 * enables annotation of code with MPX instructions and prefixes
 * inserts code early in the application to call in to the "gcc runtime"
GCC MPX Runtime:
 * Checks for hardware MPX support in cpuid leaf
 * allocates virtual space for the bounds directory (malloc()
   essentially)
 * points the hardware BNDCFGU register at the directory
 * calls a new prctl() to notify the kernel to start managing the
   bounds directories
Kernel MPX Code:
 * Checks for hardware MPX support in cpuid leaf
 * Handles #BR exceptions and sends SIGSEGV to the app when it violates
   bounds, like during a buffer overflow.
 * When bounds are spilled in to an unallocated bounds table, the kernel
   notices in the #BR exception, allocates the virtual space, then
   updates the bounds directory to point to the new table.  It keeps
   special track of the memory with a VM_MPX flag.
 * Frees unused bounds tables at the time that the memory they described
   is unmapped. (See "cleanup unused bound tables")

= Testing =

This patchset has been tested on real internal hardware platform at
Intel.  We have some simple unit tests in user space, which directly
call MPX instructions to produce #BR to let kernel allocate bounds
tables and cause bounds violations. We also compiled several benchmarks
with an MPX-enabled compiler and ran them with this patch set.  We found
a number of bugs in this code in these tests.

1. For more info on why the kernel does these allocations, see the patch
"on-demand kernel allocation of bounds tables"

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Dave Hansen
On 09/12/2014 12:21 PM, Thomas Gleixner wrote:
> On Thu, 11 Sep 2014, Dave Hansen wrote:
>> +When #BR fault is produced due to invalid entry, bounds table will be
>> +created in kernel on demand and kernel will not transfer this fault to
>> +userspace. So usersapce can't receive #BR fault for invalid entry, and
>> +it is not also necessary for users to create bounds tables by themselves.
>> +
>> +Certainly users can allocate bounds tables and forcibly point the bounds
>> +directory at them through XSAVE instruction, and then set valid bit
>> +of bounds entry to have this entry valid. But we have no way to track
>> +the memory usage of these user-created bounds tables. In regard to this,
>> +this behaviour is outlawed here.
> 
> So what's the point of declaring it outlawed? Nothing as far as I can
> see simply because you cannot enforce it. This is possible and people
> simply will do it.

All that we want to get across is: if the kernel didn't make the mess,
we're not going to clean it up.

Userspace is free to do whatever the heck it wants.  But, if it wants
the kernel to clean up the bounds tables, it needs to follow the rules
we're laying out here.

I think it boils down to two rules:
1. Don't move the bounds directory without telling the kernel.
2. The kernel will not free any memory which it did not allocate.

>> +2) We will not support the case that multiple bounds directory entries
>> +are pointed at the same bounds table.
>> +
>> +Users can be allowed to take multiple bounds directory entries and point
>> +them at the same bounds table. See more information "Intel(R) Architecture
>> +Instruction Set Extensions Programming Reference" (9.3.4).
>> +
>> +If userspace did this, it will be possible for kernel to unmap an in-use
>> +bounds table since it does not recognize sharing. So this behavior is
>> +also outlawed here.
> 
> Again, this is nothing you can enforce and just saying its outlawed
> does not prevent user space from doing it and then sending hard to
> decode bug reports where it complains about mappings silently
> vanishing under it.
> 
> So all you can do here is to write up a rule set how well behaving
> user space is supposed to use this facility and the kernel side of it. 

"Outlaw" was probably the wrong word.

I completely agree that all we can do is set up a set of rules for what
well-behaved userspace is expected to do.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Dave Hansen
On 09/12/2014 12:21 PM, Thomas Gleixner wrote:
> Yes, the most important question is WHY must the kernel handle the
> bound table memory allocation in the first place. The "documentation"
> patch completely fails to tell that.

This will become the description of "patch 04/10".  Feel free to wait
until we repost these to read it, but I'm posting it here because it's
going to be a couple of days before we actually get a new set of patches
out.

Any suggestions for how much of this is appropriate for Documentation/
would be much appreciated.  I don't have a good feel for it.

---

Subject: x86: mpx: on-demand kernel allocation of bounds tables
MPX only has 4 hardware registers for storing bounds information.
If MPX-enabled code needs more than these 4 registers, it needs
to spill them somewhere.  It has two special instructions for
this which allow the bounds to be moved between the bounds
registers and some new "bounds tables".

#BR exceptions are a new class of exceptions just for MPX.  They
are similar conceptually to a page fault and will be raised by
the MPX hardware during both bounds violations or when the tables
are not present.  This patch handles those #BR exceptions for
not-present tables by carving the space out of the normal
processes address space (essentially calling mmap() from inside
the kernel) and then pointing the bounds-directory over to it.

The tables *need* to be accessed and controlled by userspace
because the instructions for moving bounds in and out of them are
extremely frequent.  They potentially happen every time a
register points to memory.  Any direct kernel involvement (like a
syscall) to access the tables would obviously destroy
performance.

 Why not do this in userspace? 

This patch is obviously doing this allocation in the kernel.
However, MPX does not strictly *require* anything in the kernel.
It can theoretically be done completely from userspace.  Here are
a few ways this *could* be done.  I don't think any of them are
practical in the real-world, but here they are.

Q: Can virtual space simply be reserved for the bounds tables so
   that we never have to allocate them?
A: As noted earlier, these tables are *HUGE*.  An X-GB virtual
   area needs 4*X GB of virtual space, plus 2GB for the bounds
   directory.  If we were to preallocate them for the 128TB of
   user virtual address space, we would need to reserve 512TB+2GB,
   which is larger than the entire virtual address space today.
   This means they can not be reserved ahead of time.  Also, a
   single process's pre-popualated bounds directory consumes 2GB
   of virtual *AND* physical memory.  IOW, it's completely
   infeasible to prepopulate bounds directories.

Q: Can we preallocate bounds table space at the same time memory
   is allocated which might contain pointers that might eventually
   need bounds tables?
A: This would work if we could hook the site of each and every
   memory allocation syscall.  This can be done for small,
   constrained applications.  But, it isn't practical at a larger
   scale since a given app has no way of controlling how all the
   parts of the app migth allocate memory (think libraries).  The
   kernel is really the only place to intercept these calls.

Q: Could a bounds fault be handed to userspace and the tables
   allocated there in a signal handler intead of in the kernel?
A: (thanks to tglx) mmap() is not on the list of safe async
   handler functions and even if mmap() would work it still
   requires locking or nasty tricks to keep track of the
   allocation state there.

Having ruled out all of the userspace-only approaches for managing
bounds tables that we could think of, we create them on demand
in the kernel.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Thomas Gleixner
On Thu, 11 Sep 2014, Dave Hansen wrote:

> On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
> > MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
> > provide handlers for bounds faults (#BR), and manage bounds memory.
> 
> Qiaowei, We probably need to mention here what "bounds memory" is, and
> why it has to be managed, and who is responsible for the different pieces.
> 
> Who allocates the memory?
> Who fills the memory?
> When is it freed?
> 
> Thomas, do you have any other suggestions for things you'd like to see
> clarified?

Yes, the most important question is WHY must the kernel handle the
bound table memory allocation in the first place. The "documentation"
patch completely fails to tell that.

> +3. Tips
> +===
> +
> +1) Users are not allowed to create bounds tables and point the bounds
> +directory at them in the userspace. In fact, it is not also necessary
> +for users to create bounds tables in the userspace.

This misses to explain why. I studied the manual carefully and I have
no idea why you think this is a requirement.

MPX can be handled completely from user space. See below before you
answer.

> +When #BR fault is produced due to invalid entry, bounds table will be
> +created in kernel on demand and kernel will not transfer this fault to
> +userspace. So usersapce can't receive #BR fault for invalid entry, and
> +it is not also necessary for users to create bounds tables by themselves.
> +
> +Certainly users can allocate bounds tables and forcibly point the bounds
> +directory at them through XSAVE instruction, and then set valid bit
> +of bounds entry to have this entry valid. But we have no way to track
> +the memory usage of these user-created bounds tables. In regard to this,
> +this behaviour is outlawed here.

So what's the point of declaring it outlawed? Nothing as far as I can
see simply because you cannot enforce it. This is possible and people
simply will do it.

> +2) We will not support the case that multiple bounds directory entries
> +are pointed at the same bounds table.
> +
> +Users can be allowed to take multiple bounds directory entries and point
> +them at the same bounds table. See more information "Intel(R) Architecture
> +Instruction Set Extensions Programming Reference" (9.3.4).
> +
> +If userspace did this, it will be possible for kernel to unmap an in-use
> +bounds table since it does not recognize sharing. So this behavior is
> +also outlawed here.

Again, this is nothing you can enforce and just saying its outlawed
does not prevent user space from doing it and then sending hard to
decode bug reports where it complains about mappings silently
vanishing under it.

So all you can do here is to write up a rule set how well behaving
user space is supposed to use this facility and the kernel side of it. 

Now back to the original question WHY:

The only kind of "argument" you provided in the whole blurb is "if
user space handles the allocation we have no way to track the memory
usage of these tables".

So if the only value of this whole allocation endavour is that we have
a separate "name" entry in proc/$PID/maps then this definitely does
not justify the mess it creates. You'd be better off with creating a
syscall which allows putting a name tag on a anonymous
mapping. Seriously, that would be handy for other purposes than MPX as
well.

But after staring into the manual and the code trainwreck for a day, I
certainly know WHY you want to handle it in kernel space.

If user space wants to handle it, it needs to preallocate all the
Bound Table mappings simply because it cannot do so from the signal
handler which gets invoked on the #BR 'Invalid BD entry'. mmap is not
on the list of safe async handler functions and even if mmap would
work it still requires locking or nasty tricks to keep track of the
allocation state there.

Preallocation is simply not feasible, because user space does not know
about the requirements of libraries etc. So letting the kernel help
out here is the right approach.

All that information is completely missing in the "doc" and all
over the patch series. 

Thanks,

tglx





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Thomas Gleixner
On Thu, 11 Sep 2014, Dave Hansen wrote:

 On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
  MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
  provide handlers for bounds faults (#BR), and manage bounds memory.
 
 Qiaowei, We probably need to mention here what bounds memory is, and
 why it has to be managed, and who is responsible for the different pieces.
 
 Who allocates the memory?
 Who fills the memory?
 When is it freed?
 
 Thomas, do you have any other suggestions for things you'd like to see
 clarified?

Yes, the most important question is WHY must the kernel handle the
bound table memory allocation in the first place. The documentation
patch completely fails to tell that.

 +3. Tips
 +===
 +
 +1) Users are not allowed to create bounds tables and point the bounds
 +directory at them in the userspace. In fact, it is not also necessary
 +for users to create bounds tables in the userspace.

This misses to explain why. I studied the manual carefully and I have
no idea why you think this is a requirement.

MPX can be handled completely from user space. See below before you
answer.

 +When #BR fault is produced due to invalid entry, bounds table will be
 +created in kernel on demand and kernel will not transfer this fault to
 +userspace. So usersapce can't receive #BR fault for invalid entry, and
 +it is not also necessary for users to create bounds tables by themselves.
 +
 +Certainly users can allocate bounds tables and forcibly point the bounds
 +directory at them through XSAVE instruction, and then set valid bit
 +of bounds entry to have this entry valid. But we have no way to track
 +the memory usage of these user-created bounds tables. In regard to this,
 +this behaviour is outlawed here.

So what's the point of declaring it outlawed? Nothing as far as I can
see simply because you cannot enforce it. This is possible and people
simply will do it.

 +2) We will not support the case that multiple bounds directory entries
 +are pointed at the same bounds table.
 +
 +Users can be allowed to take multiple bounds directory entries and point
 +them at the same bounds table. See more information Intel(R) Architecture
 +Instruction Set Extensions Programming Reference (9.3.4).
 +
 +If userspace did this, it will be possible for kernel to unmap an in-use
 +bounds table since it does not recognize sharing. So this behavior is
 +also outlawed here.

Again, this is nothing you can enforce and just saying its outlawed
does not prevent user space from doing it and then sending hard to
decode bug reports where it complains about mappings silently
vanishing under it.

So all you can do here is to write up a rule set how well behaving
user space is supposed to use this facility and the kernel side of it. 

Now back to the original question WHY:

The only kind of argument you provided in the whole blurb is if
user space handles the allocation we have no way to track the memory
usage of these tables.

So if the only value of this whole allocation endavour is that we have
a separate name entry in proc/$PID/maps then this definitely does
not justify the mess it creates. You'd be better off with creating a
syscall which allows putting a name tag on a anonymous
mapping. Seriously, that would be handy for other purposes than MPX as
well.

But after staring into the manual and the code trainwreck for a day, I
certainly know WHY you want to handle it in kernel space.

If user space wants to handle it, it needs to preallocate all the
Bound Table mappings simply because it cannot do so from the signal
handler which gets invoked on the #BR 'Invalid BD entry'. mmap is not
on the list of safe async handler functions and even if mmap would
work it still requires locking or nasty tricks to keep track of the
allocation state there.

Preallocation is simply not feasible, because user space does not know
about the requirements of libraries etc. So letting the kernel help
out here is the right approach.

All that information is completely missing in the doc and all
over the patch series. 

Thanks,

tglx





--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Dave Hansen
On 09/12/2014 12:21 PM, Thomas Gleixner wrote:
 Yes, the most important question is WHY must the kernel handle the
 bound table memory allocation in the first place. The documentation
 patch completely fails to tell that.

This will become the description of patch 04/10.  Feel free to wait
until we repost these to read it, but I'm posting it here because it's
going to be a couple of days before we actually get a new set of patches
out.

Any suggestions for how much of this is appropriate for Documentation/
would be much appreciated.  I don't have a good feel for it.

---

Subject: x86: mpx: on-demand kernel allocation of bounds tables
MPX only has 4 hardware registers for storing bounds information.
If MPX-enabled code needs more than these 4 registers, it needs
to spill them somewhere.  It has two special instructions for
this which allow the bounds to be moved between the bounds
registers and some new bounds tables.

#BR exceptions are a new class of exceptions just for MPX.  They
are similar conceptually to a page fault and will be raised by
the MPX hardware during both bounds violations or when the tables
are not present.  This patch handles those #BR exceptions for
not-present tables by carving the space out of the normal
processes address space (essentially calling mmap() from inside
the kernel) and then pointing the bounds-directory over to it.

The tables *need* to be accessed and controlled by userspace
because the instructions for moving bounds in and out of them are
extremely frequent.  They potentially happen every time a
register points to memory.  Any direct kernel involvement (like a
syscall) to access the tables would obviously destroy
performance.

 Why not do this in userspace? 

This patch is obviously doing this allocation in the kernel.
However, MPX does not strictly *require* anything in the kernel.
It can theoretically be done completely from userspace.  Here are
a few ways this *could* be done.  I don't think any of them are
practical in the real-world, but here they are.

Q: Can virtual space simply be reserved for the bounds tables so
   that we never have to allocate them?
A: As noted earlier, these tables are *HUGE*.  An X-GB virtual
   area needs 4*X GB of virtual space, plus 2GB for the bounds
   directory.  If we were to preallocate them for the 128TB of
   user virtual address space, we would need to reserve 512TB+2GB,
   which is larger than the entire virtual address space today.
   This means they can not be reserved ahead of time.  Also, a
   single process's pre-popualated bounds directory consumes 2GB
   of virtual *AND* physical memory.  IOW, it's completely
   infeasible to prepopulate bounds directories.

Q: Can we preallocate bounds table space at the same time memory
   is allocated which might contain pointers that might eventually
   need bounds tables?
A: This would work if we could hook the site of each and every
   memory allocation syscall.  This can be done for small,
   constrained applications.  But, it isn't practical at a larger
   scale since a given app has no way of controlling how all the
   parts of the app migth allocate memory (think libraries).  The
   kernel is really the only place to intercept these calls.

Q: Could a bounds fault be handed to userspace and the tables
   allocated there in a signal handler intead of in the kernel?
A: (thanks to tglx) mmap() is not on the list of safe async
   handler functions and even if mmap() would work it still
   requires locking or nasty tricks to keep track of the
   allocation state there.

Having ruled out all of the userspace-only approaches for managing
bounds tables that we could think of, we create them on demand
in the kernel.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Dave Hansen
On 09/12/2014 12:21 PM, Thomas Gleixner wrote:
 On Thu, 11 Sep 2014, Dave Hansen wrote:
 +When #BR fault is produced due to invalid entry, bounds table will be
 +created in kernel on demand and kernel will not transfer this fault to
 +userspace. So usersapce can't receive #BR fault for invalid entry, and
 +it is not also necessary for users to create bounds tables by themselves.
 +
 +Certainly users can allocate bounds tables and forcibly point the bounds
 +directory at them through XSAVE instruction, and then set valid bit
 +of bounds entry to have this entry valid. But we have no way to track
 +the memory usage of these user-created bounds tables. In regard to this,
 +this behaviour is outlawed here.
 
 So what's the point of declaring it outlawed? Nothing as far as I can
 see simply because you cannot enforce it. This is possible and people
 simply will do it.

All that we want to get across is: if the kernel didn't make the mess,
we're not going to clean it up.

Userspace is free to do whatever the heck it wants.  But, if it wants
the kernel to clean up the bounds tables, it needs to follow the rules
we're laying out here.

I think it boils down to two rules:
1. Don't move the bounds directory without telling the kernel.
2. The kernel will not free any memory which it did not allocate.

 +2) We will not support the case that multiple bounds directory entries
 +are pointed at the same bounds table.
 +
 +Users can be allowed to take multiple bounds directory entries and point
 +them at the same bounds table. See more information Intel(R) Architecture
 +Instruction Set Extensions Programming Reference (9.3.4).
 +
 +If userspace did this, it will be possible for kernel to unmap an in-use
 +bounds table since it does not recognize sharing. So this behavior is
 +also outlawed here.
 
 Again, this is nothing you can enforce and just saying its outlawed
 does not prevent user space from doing it and then sending hard to
 decode bug reports where it complains about mappings silently
 vanishing under it.
 
 So all you can do here is to write up a rule set how well behaving
 user space is supposed to use this facility and the kernel side of it. 

Outlaw was probably the wrong word.

I completely agree that all we can do is set up a set of rules for what
well-behaved userspace is expected to do.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-12 Thread Dave Hansen
OK, here's some revised text for patch 00/10.  Again, this will
obviously be updated for the next post, but comments before that would
be much appreciated.

-

This patch set adds support for the Memory Protection eXtensions
(MPX) feature found in future Intel processors.  MPX is used in
conjunction with compiler changes to check memory references, and can be
used to catch buffer overflow or underflow.

For MPX to work, changes are required in the kernel, binutils and
compiler.  No source changes are required for applications, just a
recompile.

There are a lot of moving parts of this to all work right:

= Example Compiler / Application / Kernel Interaction =

1. Application developer compiles with -fmpx.  The compiler will add the
   instrumentation as well as some setup code called early after the app
   starts.  New instruction prefixes are noops for old CPUs.
2. That setup code allocates (virtual) space for the bounds directory,
   points the bndcfgu register to the directory and notifies the
   kernel (via the new prctl()) that the app will be using MPX.
3. The kernel detects that the CPU has MPX, allows the new prctl() to
   succeed, and notes the location of the bounds directory.  We note it
   instead of reading it each time because the 'xsave' operation needed
   to access the bounds directory register is an expensive operation.
4. If the application needs to spill bounds out of the 4 registers, it
   issues a bndstx instruction.  Since the bounds directory is empty at
   this point, a bounds fault (#BR) is raised, the kernel allocates a
   bounds table (in the user address space) and makes the relevant
   entry in the bounds directory point to the new table. [1]
5. If the application violates the bounds specified in the bounds
   registers, a separate kind of #BR is raised which will deliver a
   signal with information about the violation in the 'struct siginfo'.
6. Whenever memory is freed, we know that it can no longer contain
   valid pointers, and we attempt to free the associated space in the
   bounds tables.  If an entire table becomes unused, we will attempt
   to free the table and remove the entry in the directory.

To summarize, there are essentially three things interacting here:

GCC with -fmpx:
 * enables annotation of code with MPX instructions and prefixes
 * inserts code early in the application to call in to the gcc runtime
GCC MPX Runtime:
 * Checks for hardware MPX support in cpuid leaf
 * allocates virtual space for the bounds directory (malloc()
   essentially)
 * points the hardware BNDCFGU register at the directory
 * calls a new prctl() to notify the kernel to start managing the
   bounds directories
Kernel MPX Code:
 * Checks for hardware MPX support in cpuid leaf
 * Handles #BR exceptions and sends SIGSEGV to the app when it violates
   bounds, like during a buffer overflow.
 * When bounds are spilled in to an unallocated bounds table, the kernel
   notices in the #BR exception, allocates the virtual space, then
   updates the bounds directory to point to the new table.  It keeps
   special track of the memory with a VM_MPX flag.
 * Frees unused bounds tables at the time that the memory they described
   is unmapped. (See cleanup unused bound tables)

= Testing =

This patchset has been tested on real internal hardware platform at
Intel.  We have some simple unit tests in user space, which directly
call MPX instructions to produce #BR to let kernel allocate bounds
tables and cause bounds violations. We also compiled several benchmarks
with an MPX-enabled compiler and ran them with this patch set.  We found
a number of bugs in this code in these tests.

1. For more info on why the kernel does these allocations, see the patch
on-demand kernel allocation of bounds tables

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Dave Hansen
On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
> MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
> provide handlers for bounds faults (#BR), and manage bounds memory.

Qiaowei, We probably need to mention here what "bounds memory" is, and
why it has to be managed, and who is responsible for the different pieces.

Who allocates the memory?
Who fills the memory?
When is it freed?

Thomas, do you have any other suggestions for things you'd like to see
clarified?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support is available in a
separate branch in common GCC SVN repository. See GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

This patchset has been tested on real internal hardware platform at Intel.
We have some simple unit tests in user space, which directly call MPX
instructions to produce #BR to let kernel allocate bounds tables and
cause bounds violations. We also compiled several benchmarks with an
MPX-enabled Gcc/Glibc and ICC, an ran them with this patch set.
We found a number of bugs in this code in these tests.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
decode mpx instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * for kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order track precisely MPX memory usage, add MPX specific
mmap interface and one VM_MPX flag to check whether a VMA
is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct figinfo for mips with general version to avoid
build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset have toset MPX
specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add 

Re: [PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Dave Hansen
On 09/11/2014 01:46 AM, Qiaowei Ren wrote:
 MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
 provide handlers for bounds faults (#BR), and manage bounds memory.

Qiaowei, We probably need to mention here what bounds memory is, and
why it has to be managed, and who is responsible for the different pieces.

Who allocates the memory?
Who fills the memory?
When is it freed?

Thomas, do you have any other suggestions for things you'd like to see
clarified?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. MPX architecture is designed allow a machine to
run both MPX enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in Intel(R) Architecture
Instruction Set Extensions Programming Reference.

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, system libraries support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support is available in a
separate branch in common GCC SVN repository. See GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written on assembler, and
compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled
Glibc source can be found in Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly the 2 responsibilities:
provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

This patchset has been tested on real internal hardware platform at Intel.
We have some simple unit tests in user space, which directly call MPX
instructions to produce #BR to let kernel allocate bounds tables and
cause bounds violations. We also compiled several benchmarks with an
MPX-enabled Gcc/Glibc and ICC, an ran them with this patch set.
We found a number of bugs in this code in these tests.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
decode mpx instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors at documentation, and document
extended struct siginfo.
  * for kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order track precisely MPX memory usage, add MPX specific
mmap interface and one VM_MPX flag to check whether a VMA
is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct figinfo for mips with general version to avoid
build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset have toset MPX
specific -vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add