Re: [RFC PATCH] mm: extend memfd with ability to create "secret" memory areas

2020-08-26 Thread Florian Weimer
* Andy Lutomirski:

>> I _believe_ there are also things like AES-NI that can get strong
>> protection from stuff like this.  They load encryption keys into (AVX)
>> registers and then can do encrypt/decrypt operations without the keys
>> leaving the registers.  If the key was loaded from a secret memory area
>> right into the registers, I think the protection from cache attacks
>> would be pretty strong.
>
> Except for context switches :)

An rseq sequence could request that the AVX registers should be
cleared on context switch.  (I'm mostly kidding.)

I think the main issue is that we do not have a good established
programming model to actually use such features and completely avoid
making copies of secret data.


Re: [RFC PATCH] mm: extend memfd with ability to create "secret" memory areas

2020-08-26 Thread Andy Lutomirski
On Fri, Aug 14, 2020 at 11:09 AM Dave Hansen  wrote:
>
> On 8/14/20 10:46 AM, Andy Lutomirski wrote:
> > I'm a little unconvinced about the security benefits.  As far as I
> > know, UC memory will not end up in cache by any means (unless
> > aliased), but it's going to be tough to do much with UC data with
> > anything resembling reasonable performance without derived values
> > getting cached.
>
> I think this is much more in the category of raising the bar than
> providing any absolute security guarantees.

The problem here is that we're raising the bar in a way that is
weirdly architecture dependent, *extremely* nonperformant, and may not
even accomplish what it's trying to accomplish.

>
> Let's say you have a secret and you read it into some registers and then
> spill them on the stack.  You've got two cached copies, one for the
> primary data and another for the stack copy.  Secret areas don't get rid
> of the stack copy, but they do get rid of the other one.  One cache copy
> is better than two.  Bar raised. :)

If we have two bars right next to each other and we raise one of them,
did we really accomplish much?  I admit that having a secret in its
own dedicated cache line seems like an easier target than a secret in
a cache line that may be quickly overwritten by something else.  But
even user registers right now aren't specially protected -- pt_regs
lives in memory that is cached and probably has a predictable location,
especially if you execve() a setuid program.

>
> There are also some stronger protections, less in the bar-raising
> category.  On x86 at least, uncached accesses also crush speculation.
> You can't, for instance, speculatively get wrong values if you're not
> speculating in the first place.  I was thinking of things like Load
> Value Injection[1].

This seems genuinely useful, but it doesn't really address the fact
that requesting UC memory via PAT apparently has a good chance of
getting WB anyway.

>
> I _believe_ there are also things like AES-NI that can get strong
> protection from stuff like this.  They load encryption keys into (AVX)
> registers and then can do encrypt/decrypt operations without the keys
> leaving the registers.  If the key was loaded from a secret memory area
> right into the registers, I think the protection from cache attacks
> would be pretty strong.
>

Except for context switches :)
>
> 1.
> https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection


Re: [RFC PATCH] mm: extend memfd with ability to create "secret" memory areas

2020-08-14 Thread Dave Hansen
On 8/14/20 10:46 AM, Andy Lutomirski wrote:
> I'm a little unconvinced about the security benefits.  As far as I
> know, UC memory will not end up in cache by any means (unless
> aliased), but it's going to be tough to do much with UC data with
> anything resembling reasonable performance without derived values
> getting cached.

I think this is much more in the category of raising the bar than
providing any absolute security guarantees.

Let's say you have a secret and you read it into some registers and then
spill them on the stack.  You've got two cached copies, one for the
primary data and another for the stack copy.  Secret areas don't get rid
of the stack copy, but they do get rid of the other one.  One cache copy
is better than two.  Bar raised. :)

There are also some stronger protections, less in the bar-raising
category.  On x86 at least, uncached accesses also crush speculation.
You can't, for instance, speculatively get wrong values if you're not
speculating in the first place.  I was thinking of things like Load
Value Injection[1].

I _believe_ there are also things like AES-NI that can get strong
protection from stuff like this.  They load encryption keys into (AVX)
registers and then can do encrypt/decrypt operations without the keys
leaving the registers.  If the key was loaded from a secret memory area
right into the registers, I think the protection from cache attacks
would be pretty strong.


1.
https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection


Re: [RFC PATCH] mm: extend memfd with ability to create "secret" memory areas

2020-08-14 Thread Andy Lutomirski
On Thu, Jan 30, 2020 at 8:23 AM Mike Rapoport  wrote:
>
> Hi,
>
> This is essentially a resend of my attempt to implement "secret" mappings
> using a file descriptor [1].
>
> I've done a couple of experiments with secret/exclusive/whatever
> memory backed by a file descriptor using a chardev and the memfd_create
> syscall. There is indeed no need for a VM_ flag, but there are still
> places that would require special care, e.g. vm_normal_page() and
> madvise(MADV_DONTFORK), so it won't be completely free of core mm
> modifications.
>
> Below is a POC that implements extension to memfd_create() that allows
> mapping of a "secret" memory. The "secrecy" mode should be explicitly set
> using ioctl(), for now I've implemented exclusive and uncached mappings.
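[For concreteness, the POC interface described above might be exercised from userspace roughly as follows. The MFD_SECRET flag and the ioctl request names follow the RFC's description of the POC; they are not part of any released kernel ABI, and page_size and the error handling are placeholders:]

```c
/* Illustrative only: MFD_SECRET, MFD_SECRET_EXCLUSIVE and
 * MFD_SECRET_UNCACHED are the POC's proposed names, not a merged ABI. */
int fd = memfd_create("secret", MFD_SECRET);
if (fd < 0)
        err(1, "memfd_create");

/* Pick the secrecy mode explicitly before populating the area. */
ioctl(fd, MFD_SECRET_UNCACHED);   /* or MFD_SECRET_EXCLUSIVE */
ftruncate(fd, page_size);

void *p = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
/* Depending on the mode, the pages behind p are dropped from the
 * kernel direct map and/or mapped uncached. */
```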

Hi-

Sorry for the extremely delayed response.

I like the general concept, and I like the exclusive concept.  While
it is certainly annoying for the kernel to manage non-direct-mapped
pages, I think it's the future.  But I have serious concerns about the
uncached part, detailed below.

If it's done at all, I think it should be MFD_SECRET_X86_UNCACHED.  I
think that uncached memory is outside the scope of things that can
reasonably be considered to be architecture-neutral.  (For example, on
x86, UC and WC have very different semantics, and UC has quite
different properties than WB for things like atomics.  Also, the
performance of UC is interesting at best, and the ways to even
moderately efficiently read from UC memory or write to UC memory are
highly x86-specific.)

I'm a little unconvinced about the security benefits.  As far as I
know, UC memory will not end up in cache by any means (unless
aliased), but it's going to be tough to do much with UC data with
anything resembling reasonable performance without derived values
getting cached.  It's likely entirely impossible to do it reliably
without asm.  But even with plain WB memory, getting it into L1 really
should not be that bad unless major new vulnerabilities are
discovered.  And there are other approaches that could be more
arch-neutral and more performant.  For example, there could be an
option to flush a few cache lines on schedule out.  This way a task
could work on some (exclusive but WB) secret memory and have the cache
lines flushed if anything interrupts it.  Combined with turning SMT
off, this could offer comparable protection with much less overhead.

UC also doesn't seem reliable on x86, sadly.  From asking around,
there are at least a handful of scenarios under which the kernel can
ask the CPU for UC but get WB anyway.  Apparently Xen hypervisors will
do this unless the domain has privileged MMIO access, and ESXi will do
it under some set of common circumstances.  So unless we probe somehow
or have fancy enumeration or administrative configuration, I'm not
sure we can even get predictable behavior if we hand userspace a
supposedly UC mapping.  Giving user code WB when it thinks it has UC
could end badly.

--Andy