subject:"\[PATCH v17 18\/23\] platform\/x86\: Intel SGX driver"

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-21 Thread Jarkko Sakkinen

On Fri, Dec 21, 2018 at 10:28:09AM -0800, Sean Christopherson wrote:
> > Why would you want to pass EPC through user space to KVM rather than
> > KVM allocating it through kernel interfaces?
> 
> Delegating EPC management to userspace fits better with KVM's existing
> memory ABI.  KVM provides a single ioctl(), KVM_SET_USER_MEMORY_REGION[1],
> that allows userspace to create, move, modify and delete memory regions.
> 
> Skipping over a lot of details, there are essentially three options for
> exposing EPC to a KVM guest:
> 
>  1) Provide a dedicated KVM ioctl() to manage EPC without routing it
> through KVM_SET_USER_MEMORY_REGION.
> 
>  2) Add a flag to 'struct kvm_userspace_memory_region' that denotes an
> EPC memory region and mmap() / allocate EPC in KVM.
> 
>  3) Provide an ABI to allocate raw EPC and let userspace manage it like
> any other memory region.
> 
> Option (1) requires duplicating all of KVM_SET_USER_MEMORY_REGION's
> functionality unless the ioctl() is severly restricted.
> 
> Option (2) is an ugly abuse of KVM_SET_USER_MEMORY_REGION since the EPC
> flag would have completely different semantics than all other usage of
> KVM_SET_USER_MEMORY_REGION.
> 
> Thus, option (3).

OK, thank you for patience explaining this.

> Probably a better question to answer is why provide the ABI through
> /dev/sgx and not /dev/kvm.  IMO /dev/sgx is a more logical way to
> advertise support to userspace, e.g. userspace can simply check if
> /dev/sgx (or /dev/sgx/epc) exists vs. probing a KVM capability.

You have to understand that for a KVM non-expert like me it was really
important to get the context, which you kindly gave. I have never used
KVM's memory management API but now that I know how it works all of this
makes perfect sense. This is not a better question but it is definitely
a good follow up question :-)

I don't really understand you deduction here, however. If SGX was not
supported, why couldn't the hypothetical /dev/kvm functionality just
return an error?

For me it sounds a bit messy that KVM functionality, which is a client
to the SGX functionality, places some of its functionality to the SGX
core.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-21 Thread Sean Christopherson

On Wed, Dec 19, 2018 at 07:00:47AM +0200, Jarkko Sakkinen wrote:
> On Tue, Dec 18, 2018 at 10:53:49AM -0800, Sean Christopherson wrote:
> > What if we re-organize the ioctls in such a way that we leave open the
> > possibility of allocating raw EPC for KVM via /dev/sgx?  I'm not 100%
> > positive this approach will work[1], but conceptually it fits well with
> > KVM's memory model, e.g. KVM is aware of the GPA<->HVA association but
> > generally speaking doesn't know what's physically backing each memory
> > region.
> 
> Why would you want to pass EPC through user space to KVM rather than
> KVM allocating it through kernel interfaces?

Delegating EPC management to userspace fits better with KVM's existing
memory ABI.  KVM provides a single ioctl(), KVM_SET_USER_MEMORY_REGION[1],
that allows userspace to create, move, modify and delete memory regions.

Skipping over a lot of details, there are essentially three options for
exposing EPC to a KVM guest:

 1) Provide a dedicated KVM ioctl() to manage EPC without routing it
through KVM_SET_USER_MEMORY_REGION.

 2) Add a flag to 'struct kvm_userspace_memory_region' that denotes an
EPC memory region and mmap() / allocate EPC in KVM.

 3) Provide an ABI to allocate raw EPC and let userspace manage it like
any other memory region.

Option (1) requires duplicating all of KVM_SET_USER_MEMORY_REGION's
functionality unless the ioctl() is severly restricted.

Option (2) is an ugly abuse of KVM_SET_USER_MEMORY_REGION since the EPC
flag would have completely different semantics than all other usage of
KVM_SET_USER_MEMORY_REGION.

Thus, option (3).

Probably a better question to answer is why provide the ABI through
/dev/sgx and not /dev/kvm.  IMO /dev/sgx is a more logical way to
advertise support to userspace, e.g. userspace can simply check if
/dev/sgx (or /dev/sgx/epc) exists vs. probing a KVM capability.

Without EPC oversubscription in KVM, /dev/sgx is easily the best fit
since the EPC management would reside completely in x86/sgx, i.e. KVM
would essentially have zero code related to EPC management.

EPC oversubscription complicates things because the architecture forces
aspects of VMM oversubscription into the KVM domain, e.g. requires a
post-VMXON instruction (ENCLV) and a VM-Exit handler.   I still think
/dev/sgx is a better fit, my only concern is that the oversubscription
code would be even more heinous due to splitting responsibilities.
But, Andy's idea of having /dev/sgx/enclave vs. /dev/sgx/epc might help
avoid that entirely.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Wed, Dec 19, 2018 at 06:47:32AM +0200, Jarkko Sakkinen wrote:
> On Tue, Dec 18, 2018 at 07:44:18AM -0800, Sean Christopherson wrote:
> > My fd/inode knowledge is lacking, to say the least.  Whatever works, so
> > long as we have a way to uniquely identify enclaves.
> 
> I will simply trial and error :-) I think it should work since it does
> own an address space, but yeah, testing will tell. We can go also with
> anon inode if required.

I think this can be concluded with the fact that it is nice to be able
to multiplex the dev fd. That is the key reason for using anon inode.
You KVM comment locks the decision here.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 09:55:08PM -0800, Andy Lutomirski wrote:
> On Thu, Nov 15, 2018 at 5:08 PM Jarkko Sakkinen
>  wrote:
> >
> > Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> > can be used by applications to set aside private regions of code and
> > data. The code outside the enclave is disallowed to access the memory
> > inside the enclave by the CPU access control.
> 
> This is a very partial review.

Thank you, appreciate it.

> > +int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
> > + struct vm_area_struct **vma)
> > +{
> > +   struct vm_area_struct *result;
> > +   struct sgx_encl *encl;
> > +
> > +   result = find_vma(mm, addr);
> > +   if (!result || result->vm_ops != _vm_ops || addr < 
> > result->vm_start)
> > +   return -EINVAL;
> > +
> > +   encl = result->vm_private_data;
> > +   *vma = result;
> > +
> > +   return encl ? 0 : -ENOENT;
> > +}
> 
> I realize that this function may go away entirely but, if you keep it:
> what are the locking rules?  What, if anything, prevents another
> thread from destroying the enclave after sgx_encl_find() returns?

The kref inside the enclave is used to manage this but this function
directly does not prevent it (see for example sgx_encl_get). But yes,
this function does not give any guarantees (should probably have
a documentation stating this).

> > +static int sgx_validate_secs(const struct sgx_secs *secs,
> > +unsigned long ssaframesize)
> > +{
> 
> ...
> 
> > +   if (secs->attributes & SGX_ATTR_MODE64BIT) {
> > +   if (secs->size > sgx_encl_size_max_64)
> > +   return -EINVAL;
> > +   } else {
> > +   /* On 64-bit architecture allow 32-bit encls only in
> > +* the compatibility mode.
> > +*/
> > +   if (!test_thread_flag(TIF_ADDR32))
> > +   return -EINVAL;
> > +   if (secs->size > sgx_encl_size_max_32)
> > +   return -EINVAL;
> > +   }
> 
> Why do we need the 32-bit-on-64-bit check?  In general, anything that
> checks per-task or per-mm flags like TIF_ADDR32 is IMO likely to be
> problematic.  You're allowing 64-bit enclaves in 32-bit tasks, so I'm
> guessing you could just delete the check.

I guess you are right. I can remove this.


> 
> > +
> > +   if (!(secs->xfrm & XFEATURE_MASK_FP) ||
> > +   !(secs->xfrm & XFEATURE_MASK_SSE) ||
> > +   (((secs->xfrm >> XFEATURE_BNDREGS) & 1) !=
> > +((secs->xfrm >> XFEATURE_BNDCSR) & 1)) ||
> > +   (secs->xfrm & ~sgx_xfrm_mask))
> > +   return -EINVAL;
> 
> Do we need to check that the enclave doesn't use xfeatures that the
> kernel doesn't know about?  Or are they all safe by design in enclave
> mode?

Really good catch BTW. We don't check what kernel doesn't know about.

I'm not sure what harm this would have as the enclave cannot have much
effect to the outside world. Is there easy way to get similar mask
of kernel supported features as sgx_xfrm_mask? The safe play would
be to use such here as I don't have definitive answer to your second
question.

> 
> > +static int sgx_encl_pm_notifier(struct notifier_block *nb,
> > +   unsigned long action, void *data)
> > +{
> > +   struct sgx_encl *encl = container_of(nb, struct sgx_encl, 
> > pm_notifier);
> > +
> > +   if (action != PM_SUSPEND_PREPARE && action != 
> > PM_HIBERNATION_PREPARE)
> > +   return NOTIFY_DONE;
> 
> Hmm.  There's an argument to made that omitting this would better
> exercise the code that handles fully asynchronous loss of an enclave.
> Also, I think you're unnecessarily killing enclaves when suspend is
> attempted but fails.

Are you proposing to not do anything at all to the enclaves? There was
is a problem that it could lead to infinite #PF loop if we don't do
it.


> 
> > +
> > +static int sgx_get_key_hash(const void *modulus, void *hash)
> > +{
> > +   struct crypto_shash *tfm;
> > +   int ret;
> > +
> > +   tfm = crypto_alloc_shash("sha256", 0, CRYPTO_ALG_ASYNC);
> > +   if (IS_ERR(tfm))
> > +   return PTR_ERR(tfm);
> > +
> > +   ret = __sgx_get_key_hash(tfm, modulus, hash);
> > +
> > +   crypto_free_shash(tfm);
> > +   return ret;
> > +}
> > +
> 
> I'm so sorry you had to deal with this API.  Once Zinc lands, you
> could clean this up :)
> 
> 
> > +static int sgx_encl_get(unsigned long addr, struct sgx_encl **encl)
> > +{
> > +   struct mm_struct *mm = current->mm;
> > +   struct vm_area_struct *vma;
> > +   int ret;
> > +
> > +   if (addr & (PAGE_SIZE - 1))
> > +   return -EINVAL;
> > +
> > +   down_read(>mmap_sem);
> > +
> > +   ret = sgx_encl_find(mm, addr, );
> > +   if (!ret) {
> > +   *encl = vma->vm_private_data;
> > +
> > +   if ((*encl)->flags &

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Wed, Dec 19, 2018 at 07:00:47AM +0200, Jarkko Sakkinen wrote:
> On Tue, Dec 18, 2018 at 10:53:49AM -0800, Sean Christopherson wrote:
> > What if we re-organize the ioctls in such a way that we leave open the
> > possibility of allocating raw EPC for KVM via /dev/sgx?  I'm not 100%
> > positive this approach will work[1], but conceptually it fits well with
> > KVM's memory model, e.g. KVM is aware of the GPA<->HVA association but
> > generally speaking doesn't know what's physically backing each memory
> > region.
> 
> Why would you want to pass EPC through user space to KVM rather than
> KVM allocating it through kernel interfaces?
> 
> > Tangentially related, I think we should support allocating multiple
> > enclaves from a single /dev/sgx fd, i.e. a process shouldn't have to
> > open /dev/sgx every time it wants to create a new enclave.
> 
> I'm fine with this. It just requires to create anon inode. I'll just
> add a new field called 'enclave_fd' to struct sgx_enclave_create and
> that's all.
> 
> I think I have otherwise ingredients for v19 ready except where to swap.

If I follow your proposal here and allow to create multiple enclaves
(i.e. with anon inodes for each) with one descriptor, is that sufficient
API to later add what you want to KVM?

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Tue, Dec 18, 2018 at 10:53:49AM -0800, Sean Christopherson wrote:
> What if we re-organize the ioctls in such a way that we leave open the
> possibility of allocating raw EPC for KVM via /dev/sgx?  I'm not 100%
> positive this approach will work[1], but conceptually it fits well with
> KVM's memory model, e.g. KVM is aware of the GPA<->HVA association but
> generally speaking doesn't know what's physically backing each memory
> region.

Why would you want to pass EPC through user space to KVM rather than
KVM allocating it through kernel interfaces?

> Tangentially related, I think we should support allocating multiple
> enclaves from a single /dev/sgx fd, i.e. a process shouldn't have to
> open /dev/sgx every time it wants to create a new enclave.

I'm fine with this. It just requires to create anon inode. I'll just
add a new field called 'enclave_fd' to struct sgx_enclave_create and
that's all.

I think I have otherwise ingredients for v19 ready except where to swap.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Tue, Dec 18, 2018 at 07:44:18AM -0800, Sean Christopherson wrote:
> My fd/inode knowledge is lacking, to say the least.  Whatever works, so
> long as we have a way to uniquely identify enclaves.

I will simply trial and error :-) I think it should work since it does
own an address space, but yeah, testing will tell. We can go also with
anon inode if required.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Sean Christopherson

On Tue, Dec 18, 2018 at 07:44:18AM -0800, Sean Christopherson wrote:
> On Mon, Dec 17, 2018 at 08:59:54PM -0800, Andy Lutomirski wrote:
> > On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
> >  wrote:
> > >
> > 
> > > My brain is still sorting out the details, but I generally like the idea
> > > of allocating an anon inode when creating an enclave, and exposing the
> > > other ioctls() via the returned fd.  This is essentially the approach
> > > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> > > and vCPUS.  There are even similarities to accessing physical memory via
> > > multiple disparate domains, e.g. host kernel, host userspace and guest.
> > >
> > 
> > In my mind, opening /dev/sgx would give you the requisite inode.  I'm
> > not 100% sure that the chardev infrastructure allows this, but I think
> > it does.
> 
> My fd/inode knowledge is lacking, to say the least.  Whatever works, so
> long as we have a way to uniquely identify enclaves.

Actually, while we're dissecting the interface...

What if we re-organize the ioctls in such a way that we leave open the
possibility of allocating raw EPC for KVM via /dev/sgx?  I'm not 100%
positive this approach will work[1], but conceptually it fits well with
KVM's memory model, e.g. KVM is aware of the GPA<->HVA association but
generally speaking doesn't know what's physically backing each memory
region.

Tangentially related, I think we should support allocating multiple
enclaves from a single /dev/sgx fd, i.e. a process shouldn't have to
open /dev/sgx every time it wants to create a new enclave.

Something like this:

/dev/sgx
  |
  -> mmap() { return -EINVAL; }
  |
  -> unlocked_ioctl()
 |
 -> SGX_CREATE_ENCLAVE: { return alloc_enclave_fd(); }
 |  |
 |   -> mmap() { ... }
 |  | 
 |   -> get_unmapped_area() { 
 |  |   if (enclave->size) {
 |  |   if (!addr)
 |  |   addr = enclave->base;
 |  |   if (addr + len + pgoff > enclave->base + 
enclave->size)
 |  |   return -EINVAL;
 |  |   } else {
 |  |   if (!validate_size(len))
 |  |   return -EINVAL;
 |  |   addr = naturally_align(len);
 |  |   }
 |  |   }
 |  |
 |   -> unlocked_ioctl() {
 |  SGX_ENCLAVE_ADD_PAGE: { ... }
 |  SGX_ENCLAVE_INIT: { ... }
 |  SGX_ENCLAVE_REMOVE_PAGES: { ... }
 |  SGX_ENCLAVE_MODIFY_PAGES: { ... }
 |  }
 |
 -> SGX_CREATE_VIRTUAL_EPC: {return alloc_epc_fd(); }
|
 -> mmap() { ... }
|
 -> get_unmapped_area() { }
|
 -> unlocked_ioctl() {
SGX_VIRTUAL_EPC_???:
SGX_VIRTUAL_EPC_???:
}

[1] Delegating EPC management to /dev/sgx is viable for virtualizing SGX
without oversubscribing EPC to guests, but oversubscribing EPC in a
VMM requires handling EPC-related VM-Exits and using instructions
that will #UD if the CPU is not post-VMXON.  I *think* having KVM
forward VM-Exits to x86/sgx would work, but it's entirely possible
it'd be a complete cluster.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Sean Christopherson

On Tue, Dec 18, 2018 at 03:13:11PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 10:21:49PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 09:33:22PM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Dec 17, 2018 at 10:48:58AM -0800, Sean Christopherson wrote:
> > > > On Mon, Dec 17, 2018 at 08:43:33PM +0200, Jarkko Sakkinen wrote:
> > > > > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> > > > > > I'm pretty sure doing mmget() would result in circular dependencies 
> > > > > > and
> > > > > > a zombie enclave.  In the do_exit() case where a task is abruptly 
> > > > > > killed:
> > > > > >  
> > > > > >   - __mmput() is never called because the enclave holds a ref
> > > > > >   - sgx_encl_release() is never be called because its VMAs hold refs
> > > > > >   - sgx_vma_close() is never called because __mmput()->exit_mmap() 
> > > > > > is
> > > > > > blocked and the process itself is dead, i.e. won't unmap 
> > > > > > anything.
> > > > > 
> > > > > Right, it does, you are absolutely right. Tried it and removed the
> > > > > commit already.
> > > > > 
> > > > > Well, what we came up from your suggestion i.e. setting mm to NULL
> > > > > and checking that is very subtle change and does not have any such
> > > > > circular dependencies. We'll go with that.
> > > > 
> > > > We can't set mm to NULL as we need it to unregister the notifier, and
> > > > I'm fairly certain attempting to unregister in the release callback
> > > > will deadlock.
> > > 
> > > Noticed that too. mmu_notifier_unregister() requires a valid mm.
> > 
> > Both branches updated...
> 
> I'm not still seeing why you would want to call sgx_free_page() from
> sgx_invalidate(). Kind of resistant to adding extra logging just for
> checking for programming errors. What I would do if I had to debug
> there a leak would be simply put kretprobe on __sgx_free_page().

The WARN is needed to detect the leak in the first place.  And leaking
pages because EREMOVE fails usually means there's a serious bug.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 08:59:54PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
>  wrote:
> >
> 
> > My brain is still sorting out the details, but I generally like the idea
> > of allocating an anon inode when creating an enclave, and exposing the
> > other ioctls() via the returned fd.  This is essentially the approach
> > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> > and vCPUS.  There are even similarities to accessing physical memory via
> > multiple disparate domains, e.g. host kernel, host userspace and guest.
> >
> 
> In my mind, opening /dev/sgx would give you the requisite inode.  I'm
> not 100% sure that the chardev infrastructure allows this, but I think
> it does.

My fd/inode knowledge is lacking, to say the least.  Whatever works, so
long as we have a way to uniquely identify enclaves.

> > The only potential hiccup I can see is the build flow.  Currently,
> > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > (10x regression) when userspace is building multiple enclaves in parallel
> > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > but I've only confirmed the Golang case).  The issue is that allocating
> > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > to thrash and tank performance[1].
> 
> What's the issue, and how does a workqueue help?  I'm wondering if a
> nicer solution would be an ioctl to add lots of pages in a single
> call.

Adding pages via workqueue makes the ioctl itself fast enough to avoid
triggering Go's rescheduling.  A batched EADD flow would likely help,
I just haven't had the time to rework the userspace side to be able to
test the performance.

> >
> > Alternatively, we could change the EADD+EEXTEND flow to not insert the
> > added page's PFN into the owner's process space, i.e. force userspace to
> > fault when it runs the enclave.  But that only delays the issue because
> > eventually we'll want to account EPC pages, i.e. add a cgroup, at which
> > point we'll likely need current->mm anyways.
> 
> You should be able to account the backing pages to a cgroup without
> actually sticking them into the EPC, no?  Or am I misunderstanding?  I
> guess we'll eventually want a cgroup to limit use of the limited EPC
> resources.

It's the latter, a cgroup to limit EPC.  The mm is used to retrieve the
cgroup without having track e.g. the task_struct.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 09:02:03PM -0800, Andy Lutomirski wrote:
> That's what unmap_mapping_range(), etc do for you, no?  IOW make a
> struct address_space that represents the logical enclave address
> space, i.e. address 0 is the start and the pages count up from there.
> You can unmap pages whenever you want, and the core mm code will take
> care of zapping the pages from all vmas referencing that
> address_space.

OK, so it does. Did not have time to look at it last night (about
3AM) :-) Yes, we could use that to do the N process zapping.

Based on this discussion I can take the first steps with the swapping
code.

And yeah, I don't think we need anon inode for this one. Can just use
the dev inode (did not check in detail but on the surface looks like
it).

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 08:55:02PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 5:39 PM Jarkko Sakkinen
>  wrote:
> >
> > On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> > > The only potential hiccup I can see is the build flow.  Currently,
> > > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > > (10x regression) when userspace is building multiple enclaves in parallel
> > > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > > but I've only confirmed the Golang case).  The issue is that allocating
> > > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > > to thrash and tank performance[1].
> >
> > I don't see any major issues having that kthread. All the code that
> > maps the enclave would be removed.
> >
> > I would only allow to map enclave to process address space after the
> > enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.
> >
> 
> What's SGX_IOC_ENCLAVE_ATTACH?  Why would it be needed at all?  I
> would imagine that all pages would be faulted in as needed (or
> prefaulted as an optimization) and the enclave would just work in any
> process.

The way I see it the efficient way to implement this is to have the
enclave attached to a single process address space at a time.

#PF handler is trivial with multiple address spaces but swapping is
a bit tedious as you would need to zap N processes.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 10:21:49PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 09:33:22PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 10:48:58AM -0800, Sean Christopherson wrote:
> > > On Mon, Dec 17, 2018 at 08:43:33PM +0200, Jarkko Sakkinen wrote:
> > > > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> > > > > I'm pretty sure doing mmget() would result in circular dependencies 
> > > > > and
> > > > > a zombie enclave.  In the do_exit() case where a task is abruptly 
> > > > > killed:
> > > > >  
> > > > >   - __mmput() is never called because the enclave holds a ref
> > > > >   - sgx_encl_release() is never be called because its VMAs hold refs
> > > > >   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> > > > > blocked and the process itself is dead, i.e. won't unmap anything.
> > > > 
> > > > Right, it does, you are absolutely right. Tried it and removed the
> > > > commit already.
> > > > 
> > > > Well, what we came up from your suggestion i.e. setting mm to NULL
> > > > and checking that is very subtle change and does not have any such
> > > > circular dependencies. We'll go with that.
> > > 
> > > We can't set mm to NULL as we need it to unregister the notifier, and
> > > I'm fairly certain attempting to unregister in the release callback
> > > will deadlock.
> > 
> > Noticed that too. mmu_notifier_unregister() requires a valid mm.
> 
> Both branches updated...

I'm not still seeing why you would want to call sgx_free_page() from
sgx_invalidate(). Kind of resistant to adding extra logging just for
checking for programming errors. What I would do if I had to debug
there a leak would be simply put kretprobe on __sgx_free_page().

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-18 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 08:59:54PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
>  wrote:
> >
> 
> > My brain is still sorting out the details, but I generally like the idea
> > of allocating an anon inode when creating an enclave, and exposing the
> > other ioctls() via the returned fd.  This is essentially the approach
> > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> > and vCPUS.  There are even similarities to accessing physical memory via
> > multiple disparate domains, e.g. host kernel, host userspace and guest.
> >
> 
> In my mind, opening /dev/sgx would give you the requisite inode.  I'm
> not 100% sure that the chardev infrastructure allows this, but I think
> it does.

Yes, this is what I was thinking too i.e.

enclave_fd = open("/dev/sgx/", O_RDWR);

After this enclave_fd "is" the enclave up until the file is closed.

> > The only potential hiccup I can see is the build flow.  Currently,
> > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > (10x regression) when userspace is building multiple enclaves in parallel
> > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > but I've only confirmed the Golang case).  The issue is that allocating
> > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > to thrash and tank performance[1].
> 
> What's the issue, and how does a workqueue help?  I'm wondering if a
> nicer solution would be an ioctl to add lots of pages in a single
> call.

I don't think this really is an issue as long as the thread does not
depend on any VMAs.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Thu, Nov 15, 2018 at 5:08 PM Jarkko Sakkinen
 wrote:
>
> Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> can be used by applications to set aside private regions of code and
> data. The code outside the enclave is disallowed to access the memory
> inside the enclave by the CPU access control.

This is a very partial review.

> +int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
> + struct vm_area_struct **vma)
> +{
> +   struct vm_area_struct *result;
> +   struct sgx_encl *encl;
> +
> +   result = find_vma(mm, addr);
> +   if (!result || result->vm_ops != _vm_ops || addr < 
> result->vm_start)
> +   return -EINVAL;
> +
> +   encl = result->vm_private_data;
> +   *vma = result;
> +
> +   return encl ? 0 : -ENOENT;
> +}

I realize that this function may go away entirely but, if you keep it:
what are the locking rules?  What, if anything, prevents another
thread from destroying the enclave after sgx_encl_find() returns?

> +static int sgx_validate_secs(const struct sgx_secs *secs,
> +unsigned long ssaframesize)
> +{

...

> +   if (secs->attributes & SGX_ATTR_MODE64BIT) {
> +   if (secs->size > sgx_encl_size_max_64)
> +   return -EINVAL;
> +   } else {
> +   /* On 64-bit architecture allow 32-bit encls only in
> +* the compatibility mode.
> +*/
> +   if (!test_thread_flag(TIF_ADDR32))
> +   return -EINVAL;
> +   if (secs->size > sgx_encl_size_max_32)
> +   return -EINVAL;
> +   }

Why do we need the 32-bit-on-64-bit check?  In general, anything that
checks per-task or per-mm flags like TIF_ADDR32 is IMO likely to be
problematic.  You're allowing 64-bit enclaves in 32-bit tasks, so I'm
guessing you could just delete the check.

> +
> +   if (!(secs->xfrm & XFEATURE_MASK_FP) ||
> +   !(secs->xfrm & XFEATURE_MASK_SSE) ||
> +   (((secs->xfrm >> XFEATURE_BNDREGS) & 1) !=
> +((secs->xfrm >> XFEATURE_BNDCSR) & 1)) ||
> +   (secs->xfrm & ~sgx_xfrm_mask))
> +   return -EINVAL;

Do we need to check that the enclave doesn't use xfeatures that the
kernel doesn't know about?  Or are they all safe by design in enclave
mode?

> +static int sgx_encl_pm_notifier(struct notifier_block *nb,
> +   unsigned long action, void *data)
> +{
> +   struct sgx_encl *encl = container_of(nb, struct sgx_encl, 
> pm_notifier);
> +
> +   if (action != PM_SUSPEND_PREPARE && action != PM_HIBERNATION_PREPARE)
> +   return NOTIFY_DONE;

Hmm.  There's an argument to made that omitting this would better
exercise the code that handles fully asynchronous loss of an enclave.
Also, I think you're unnecessarily killing enclaves when suspend is
attempted but fails.

> +
> +static int sgx_get_key_hash(const void *modulus, void *hash)
> +{
> +   struct crypto_shash *tfm;
> +   int ret;
> +
> +   tfm = crypto_alloc_shash("sha256", 0, CRYPTO_ALG_ASYNC);
> +   if (IS_ERR(tfm))
> +   return PTR_ERR(tfm);
> +
> +   ret = __sgx_get_key_hash(tfm, modulus, hash);
> +
> +   crypto_free_shash(tfm);
> +   return ret;
> +}
> +

I'm so sorry you had to deal with this API.  Once Zinc lands, you
could clean this up :)


> +static int sgx_encl_get(unsigned long addr, struct sgx_encl **encl)
> +{
> +   struct mm_struct *mm = current->mm;
> +   struct vm_area_struct *vma;
> +   int ret;
> +
> +   if (addr & (PAGE_SIZE - 1))
> +   return -EINVAL;
> +
> +   down_read(>mmap_sem);
> +
> +   ret = sgx_encl_find(mm, addr, );
> +   if (!ret) {
> +   *encl = vma->vm_private_data;
> +
> +   if ((*encl)->flags & SGX_ENCL_SUSPEND)
> +   ret = SGX_POWER_LOST_ENCLAVE;
> +   else
> +   kref_get(&(*encl)->refcount);
> +   }

Hmm.  This version has explicit refcounting.

> +static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +   vma->vm_ops = _vm_ops;
> +   vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO |
> +VM_DONTCOPY;
> +
> +   return 0;
> +}
> +
> +static unsigned long sgx_get_unmapped_area(struct file *file,
> +  unsigned long addr,
> +  unsigned long len,
> +  unsigned long pgoff,
> +  unsigned long flags)
> +{
> +   if (len < 2 * PAGE_SIZE || (len & (len - 1)))
> +   return -EINVAL;
> +
> +   if (len > sgx_encl_size_max_64)
> +   return -EINVAL;
> +
> +   if (len > sgx_encl_size_max_32 && test_thread_flag(TIF_ADDR32))
> +   return -EINVAL;

Generally speaking, this type of

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 7:27 PM Jarkko Sakkinen
 wrote:
>
> On Tue, Dec 18, 2018 at 03:39:18AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> > > The only potential hiccup I can see is the build flow.  Currently,
> > > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > > (10x regression) when userspace is building multiple enclaves in parallel
> > > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > > but I've only confirmed the Golang case).  The issue is that allocating
> > > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > > to thrash and tank performance[1].
> >
> > I don't see any major issues having that kthread. All the code that
> > maps the enclave would be removed.
> >
> > I would only allow to map enclave to process address space after the
> > enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.
>
> Some refined thoughts.
>
> PTE insertion can done in the #PF handler. In fact, we can PoC this
> already with the current architecture (and I will right after sending
> v18).
>
> The backing space is a bit more nasty issue in the add pager thread.
> The previous shmem swapping would have been a better fit. Maybe that
> should be reconsidered?
>
> If shmem was used, all the commits up to "SGX Enclave Driver" could
> be reworked to the new model.
>
> When we think about the swapping code, there uprises some difficulties.
> Namely, when a page is swapped, the enclave must unmap the PTE from all
> processes that have it mapped.

That's what unmap_mapping_range(), etc do for you, no?  IOW make a
struct address_space that represents the logical enclave address
space, i.e. address 0 is the start and the pages count up from there.
You can unmap pages whenever you want, and the core mm code will take
care of zapping the pages from all vmas referencing that
address_space.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
 wrote:
>

> My brain is still sorting out the details, but I generally like the idea
> of allocating an anon inode when creating an enclave, and exposing the
> other ioctls() via the returned fd.  This is essentially the approach
> used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> and vCPUS.  There are even similarities to accessing physical memory via
> multiple disparate domains, e.g. host kernel, host userspace and guest.
>

In my mind, opening /dev/sgx would give you the requisite inode.  I'm
not 100% sure that the chardev infrastructure allows this, but I think
it does.

> The only potential hiccup I can see is the build flow.  Currently,
> EADD+EEXTEND is done via a work queue to avoid major performance issues
> (10x regression) when userspace is building multiple enclaves in parallel
> using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> but I've only confirmed the Golang case).  The issue is that allocating
> an EPC page acts like a blocking syscall when the EPC is under pressure,
> i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> to thrash and tank performance[1].

What's the issue, and how does a workqueue help?  I'm wondering if a
nicer solution would be an ioctl to add lots of pages in a single
call.

>
> Alternatively, we could change the EADD+EEXTEND flow to not insert the
> added page's PFN into the owner's process space, i.e. force userspace to
> fault when it runs the enclave.  But that only delays the issue because
> eventually we'll want to account EPC pages, i.e. add a cgroup, at which
> point we'll likely need current->mm anyways.

You should be able to account the backing pages to a cgroup without
actually sticking them into the EPC, no?  Or am I misunderstanding?  I
guess we'll eventually want a cgroup to limit use of the limited EPC
resources.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 5:39 PM Jarkko Sakkinen
 wrote:
>
> On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> > The only potential hiccup I can see is the build flow.  Currently,
> > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > (10x regression) when userspace is building multiple enclaves in parallel
> > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > but I've only confirmed the Golang case).  The issue is that allocating
> > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > to thrash and tank performance[1].
>
> I don't see any major issues having that kthread. All the code that
> maps the enclave would be removed.
>
> I would only allow to map enclave to process address space after the
> enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.
>

What's SGX_IOC_ENCLAVE_ATTACH?  Why would it be needed at all?  I
would imagine that all pages would be faulted in as needed (or
prefaulted as an optimization) and the enclave would just work in any
process.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Tue, Dec 18, 2018 at 03:39:18AM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> > The only potential hiccup I can see is the build flow.  Currently,
> > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > (10x regression) when userspace is building multiple enclaves in parallel
> > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > but I've only confirmed the Golang case).  The issue is that allocating
> > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > to thrash and tank performance[1].
> 
> I don't see any major issues having that kthread. All the code that
> maps the enclave would be removed.
> 
> I would only allow to map enclave to process address space after the
> enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.

Some refined thoughts.

PTE insertion can done in the #PF handler. In fact, we can PoC this
already with the current architecture (and I will right after sending
v18).

The backing space is a bit more nasty issue in the add pager thread.
The previous shmem swapping would have been a better fit. Maybe that
should be reconsidered?

If shmem was used, all the commits up to "SGX Enclave Driver" could
be reworked to the new model.

When we think about the swapping code, there uprises some difficulties.
Namely, when a page is swapped, the enclave must unmap the PTE from all
processes that have it mapped.

I have a one compromise solution for the problem above: make enclaves
shared BUT mutually exclusive. When you attach an enclave it gets
detached from the previous process that had it. This would still fully
implement the daemon example that Andy gave in earlier response.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 12:10:17PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 12:03 PM Dave Hansen  wrote:
> >
> > On 12/17/18 11:55 AM, Andy Lutomirski wrote:
> > >> You're effectively rebuilding reverse-mapping infrastructure here.  It's
> > >> a frequent thing for the core VM to need to go from 'struct page' back
> > >> to the page tables mapping it.  For that we go (logically)
> > >> page->{anon_vma,mapping}->vma->vm_mm->pagetable.
> > > This is a bit outside my expertise here, but doesn't
> > > unmap_mapping_range() do exactly what SGX wants?
> >
> > There's no 'struct page' for enclave memory as it stands.  That means no
> > page cache, and that means there's no 'struct address_space *mapping' in
> > the first place.
> >
> > Basically, the choice was made a long time ago to have SGX's memory
> > management live outside the core VM.  I've waffled back and forth on it,
> > but I do still think this is the right way to do it.
> 
> AFAICS a lack of struct page isn't a problem.  The core code seems to
> understand that address_space objects might cover non-struct-page
> memory.  Morally, enclave memory is a lot like hot-unpluggable PCI
> space.

I'm fine using it if it works. Will try it for v19.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> The only potential hiccup I can see is the build flow.  Currently,
> EADD+EEXTEND is done via a work queue to avoid major performance issues
> (10x regression) when userspace is building multiple enclaves in parallel
> using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> but I've only confirmed the Golang case).  The issue is that allocating
> an EPC page acts like a blocking syscall when the EPC is under pressure,
> i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> to thrash and tank performance[1].

I don't see any major issues having that kthread. All the code that
maps the enclave would be removed.

I would only allow to map enclave to process address space after the
enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Tue, Dec 18, 2018 at 03:17:25AM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 11:12:21AM -0800, Andy Lutomirski wrote:
> > I'm going to ask an obnoxious high-level question: why does an enclave
> > even refer to a specific mm?
> 
> The reason is that it has not been yet in focus in the review process
> and there has been other concerns.
> 
> At least the code is fairly stable i.e. working code is usually good
> starting point for making something different (ignoring the recent
> regression caused by the shmem to VMA migration).
> 
> > If I were designing this thing, and if I hadn't started trying to
> > implement it, my first thought would be that an enclave tracks its
> > linear address range, which is just a pair of numbers, and also keeps
> > track of a whole bunch of physical EPC pages, data structures, etc.
> > And that mmap() gets rejected unless the requested virtual address
> > matches the linear address range that the enclave wants and, aside
> > from that, just creates a VMA that keeps a reference to the enclave.
> > (And, for convenience, I suppose that the first mmap() call done
> > before any actual enclave setup happens could choose any address and
> > then cause the enclave to lock itself to that address, although a
> > regular anonymous PROT_NONE MAP_NORESERVE mapping would do just fine,
> > too.)  And the driver would explicitly allow multiple different mms to
> > have the same enclave mapped.  More importantly, a daemon could set up
> > an enclave without necessarily mapping it at all and then SCM_RIGHTS
> > the enclave over to the process that plans to run it.
> 
> The current SGX_IOC_ENCLAVE_CREATE ioctl would be trivial to change to
> use this approach. Instead looking up VMA with an enclave instance it
> would create a new enclave instance.
> 
> Then we could have SGX_IOC_ENCLAVE_ATTACH to attach an enclave to a VMA.
> 
> This does not sound too complicated.
> 
> > Now I'm sure this has all kinds of problems, such as the ISA possibly
> > making it rather obnoxious to add pages to the enclave without having
> > it mapped.  But these operations could, in principle, be done by
> 
> We do EADD in a kthread. What this would require to put current->mm
> into a request that it is processed by that thread. This would be
> doable with mmget().

Correction here. We need mm just for vm_insert_pfn(), which would be
removed, no need to pass mm.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 11:12:21AM -0800, Andy Lutomirski wrote:
> I'm going to ask an obnoxious high-level question: why does an enclave
> even refer to a specific mm?

The reason is that it has not been yet in focus in the review process
and there has been other concerns.

At least the code is fairly stable i.e. working code is usually good
starting point for making something different (ignoring the recent
regression caused by the shmem to VMA migration).

> If I were designing this thing, and if I hadn't started trying to
> implement it, my first thought would be that an enclave tracks its
> linear address range, which is just a pair of numbers, and also keeps
> track of a whole bunch of physical EPC pages, data structures, etc.
> And that mmap() gets rejected unless the requested virtual address
> matches the linear address range that the enclave wants and, aside
> from that, just creates a VMA that keeps a reference to the enclave.
> (And, for convenience, I suppose that the first mmap() call done
> before any actual enclave setup happens could choose any address and
> then cause the enclave to lock itself to that address, although a
> regular anonymous PROT_NONE MAP_NORESERVE mapping would do just fine,
> too.)  And the driver would explicitly allow multiple different mms to
> have the same enclave mapped.  More importantly, a daemon could set up
> an enclave without necessarily mapping it at all and then SCM_RIGHTS
> the enclave over to the process that plans to run it.

The current SGX_IOC_ENCLAVE_CREATE ioctl would be trivial to change to
use this approach. Instead looking up VMA with an enclave instance it
would create a new enclave instance.

Then we could have SGX_IOC_ENCLAVE_ATTACH to attach an enclave to a VMA.

This does not sound too complicated.

> Now I'm sure this has all kinds of problems, such as the ISA possibly
> making it rather obnoxious to add pages to the enclave without having
> it mapped.  But these operations could, in principle, be done by

We do EADD in a kthread. What this would require to put current->mm
into a request that it is processed by that thread. This would be
doable with mmget().

The deadlock that Sean mentioned would not exist since closing VMAs
is not bounded to the enclave life-cycle anymore.

So at least non-swapping ISA is easy to fit to this framework. I can
rework this for v19.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 12:15:47PM -0800, Dave Hansen wrote:
> On 12/17/18 12:10 PM, Andy Lutomirski wrote:
> >> There's no 'struct page' for enclave memory as it stands.  That means no
> >> page cache, and that means there's no 'struct address_space *mapping' in
> >> the first place.
> >>
> >> Basically, the choice was made a long time ago to have SGX's memory
> >> management live outside the core VM.  I've waffled back and forth on it,
> >> but I do still think this is the right way to do it.
> > AFAICS a lack of struct page isn't a problem.  The core code seems to
> > understand that address_space objects might cover non-struct-page
> > memory.  Morally, enclave memory is a lot like hot-unpluggable PCI
> > space.
> 
> Yeah, this is true.  The existing code seems to make it all the way from
> unmap_mapping_range() down to zap_page_range() without 'struct page'.
> 
> Overall, I think what Andy is saying here is that an open(/dev/sgx)
> should give you a "unique" enclave fd.  That fd can end up mapped into
> one or more processes either via fork() or the other ways fds end up
> getting handed around.  mmap() of this fd would be *required* to be
> MAP_SHARED.  That means you don't need to support COW, and the semantics
> are the same as any other MAP_SHARED mapping: children and parents and
> anybody mmap()'ing it must all coordinate.
> 
> This sounds interesting at least.  It might lead to an unholy mess in
> the driver, or it might be a great cleanup.  But, it does sound like
> something that would both potentially simplify the semantics and the
> implementation.

It's very similar to KVM's model, which has proven to be fairly robust,
so I don't think it'll be an unholy mess (famous last words).  It
probably won't be a "great" cleanup per se, but it definitely should
make the code more maintainable in the long run.

The other interesting aspect of the enclave fd approach is that it would
allow userspace to *execute* an enclave from multiple processes, so long
as it did the necessary multiplexing of pthreads to enclave threads.  I
think SGX2 instructions (dynamic EPC management) would even allow adding
new enclave threads on-demand.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 11:12:21AM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 10:47 AM Dave Hansen  wrote:
> >
> > On 12/17/18 10:43 AM, Jarkko Sakkinen wrote:
> > > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> > >> I'm pretty sure doing mmget() would result in circular dependencies and
> > >> a zombie enclave.  In the do_exit() case where a task is abruptly killed:
> > >>
> > >>   - __mmput() is never called because the enclave holds a ref
> > >>   - sgx_encl_release() is never be called because its VMAs hold refs
> > >>   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> > >> blocked and the process itself is dead, i.e. won't unmap anything.
> > > Right, it does, you are absolutely right. Tried it and removed the
> > > commit already.
> > >
> > > Well, what we came up from your suggestion i.e. setting mm to NULL
> > > and checking that is very subtle change and does not have any such
> > > circular dependencies. We'll go with that.
> >
> > This all screams that you need to break out this code from the massive
> > "18" patch and get the mm interactions reviewed more thoroughly.
> >
> > Also, no matter what method you go with, you have a bunch of commenting
> > and changelogging to do here.
> 
> I'm going to ask an obnoxious high-level question: why does an enclave
> even refer to a specific mm?

Primarily because that's what the code has "always" done.  I can't
speak for Jarkko, but I got involved with this joyful project long after
the code was originally written.

> If I were designing this thing, and if I hadn't started trying to
> implement it, my first thought would be that an enclave tracks its
> linear address range, which is just a pair of numbers, and also keeps
> track of a whole bunch of physical EPC pages, data structures, etc.
> And that mmap() gets rejected unless the requested virtual address
> matches the linear address range that the enclave wants and, aside
> from that, just creates a VMA that keeps a reference to the enclave.
> (And, for convenience, I suppose that the first mmap() call done
> before any actual enclave setup happens could choose any address and
> then cause the enclave to lock itself to that address, although a
> regular anonymous PROT_NONE MAP_NORESERVE mapping would do just fine,
> too.)  And the driver would explicitly allow multiple different mms to
> have the same enclave mapped.  More importantly, a daemon could set up
> an enclave without necessarily mapping it at all and then SCM_RIGHTS
> the enclave over to the process that plans to run it.

Hmm, this could work, the obvious weirdness would be ensuring the linear
range is available in the destination mm, but that'd be userspace's
problem.

I don't think we'd need to keep a reference to the enclave in the VMA.
The enclave's ref could be held by the fd.  Assuming the kernel is using
its private mapping to access the enclave, that's all we'd need to be
able to manipulate the enclave, e.g. reclaim EPC pages.  Userspace would
need to keep the fd alive in order to use the VMA, but that sort of goes
without saying.  The mm/VMA juggling today is for zapping/testing the
correct PTEs, but as you pointed out in a different email we can use
unmap_mapping_range(), with the enclave's fd being the source of the
address space passed to unmap_mapping_range().  Removing a VMA simply
means we don't need to zap it or test its age.
 
> Now I'm sure this has all kinds of problems, such as the ISA possibly
> making it rather obnoxious to add pages to the enclave without having
> it mapped.  But these operations could, in principle, be done by
> having the enclave own a private mm that's used just for setup.  While
> this would be vaguely annoying, Nadav's still-pending-but-nearly-done
> text_poke series adds all the infrastructure that's needed for the
> kernel to manage little private mms.  But some things get simpler --
> invalidating the enclave can presumably use the regular rmap APIs to
> zap all the PTEs in all VMAs pointing into the enclave.

We don't even need a private mm, we can (and already do) use the kernel's
translations for ENCLS instructions.  Hardware only enforces the linear
address stuff when it's actually in enclave mode, i.e. executing the
enclave.  ENCLS instructions aren't subject to the ELRANGE checks and
can use any VA->PA combination.

> So I'm not saying that you shouldn't do it the way you are now, but I
> do think that the changelog or at least some emails should explain
> *why* the enclave needs to keep a pointer to the creating process's
> mm.  And, if you do keep the current model, it would be nice to
> understand what happens if you do something awful like mremap()ing an
> enclave, or calling madvise on it, or otherwise abusing the vma.  Or
> doing fork(), for that matter.
> 
> I also find it suspicious that the various ioctl handlers
> systematically ignore their "filep" parameters and instead use
> find_vma() to find the relevant mm

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 09:33:22PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 10:48:58AM -0800, Sean Christopherson wrote:
> > On Mon, Dec 17, 2018 at 08:43:33PM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> > > > I'm pretty sure doing mmget() would result in circular dependencies and
> > > > a zombie enclave.  In the do_exit() case where a task is abruptly 
> > > > killed:
> > > >  
> > > >   - __mmput() is never called because the enclave holds a ref
> > > >   - sgx_encl_release() is never be called because its VMAs hold refs
> > > >   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> > > > blocked and the process itself is dead, i.e. won't unmap anything.
> > > 
> > > Right, it does, you are absolutely right. Tried it and removed the
> > > commit already.
> > > 
> > > Well, what we came up from your suggestion i.e. setting mm to NULL
> > > and checking that is very subtle change and does not have any such
> > > circular dependencies. We'll go with that.
> > 
> > We can't set mm to NULL as we need it to unregister the notifier, and
> > I'm fairly certain attempting to unregister in the release callback
> > will deadlock.
> 
> Noticed that too. mmu_notifier_unregister() requires a valid mm.

Both branches updated...

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 12:10 PM, Andy Lutomirski wrote:
>> There's no 'struct page' for enclave memory as it stands.  That means no
>> page cache, and that means there's no 'struct address_space *mapping' in
>> the first place.
>>
>> Basically, the choice was made a long time ago to have SGX's memory
>> management live outside the core VM.  I've waffled back and forth on it,
>> but I do still think this is the right way to do it.
> AFAICS a lack of struct page isn't a problem.  The core code seems to
> understand that address_space objects might cover non-struct-page
> memory.  Morally, enclave memory is a lot like hot-unpluggable PCI
> space.

Yeah, this is true.  The existing code seems to make it all the way from
unmap_mapping_range() down to zap_page_range() without 'struct page'.

Overall, I think what Andy is saying here is that an open(/dev/sgx)
should give you a "unique" enclave fd.  That fd can end up mapped into
one or more processes either via fork() or the other ways fds end up
getting handed around.  mmap() of this fd would be *required* to be
MAP_SHARED.  That means you don't need to support COW, and the semantics
are the same as any other MAP_SHARED mapping: children and parents and
anybody mmap()'ing it must all coordinate.

This sounds interesting at least.  It might lead to an unholy mess in
the driver, or it might be a great cleanup.  But, it does sound like
something that would both potentially simplify the semantics and the
implementation.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 12:03 PM Dave Hansen  wrote:
>
> On 12/17/18 11:55 AM, Andy Lutomirski wrote:
> >> You're effectively rebuilding reverse-mapping infrastructure here.  It's
> >> a frequent thing for the core VM to need to go from 'struct page' back
> >> to the page tables mapping it.  For that we go (logically)
> >> page->{anon_vma,mapping}->vma->vm_mm->pagetable.
> > This is a bit outside my expertise here, but doesn't
> > unmap_mapping_range() do exactly what SGX wants?
>
> There's no 'struct page' for enclave memory as it stands.  That means no
> page cache, and that means there's no 'struct address_space *mapping' in
> the first place.
>
> Basically, the choice was made a long time ago to have SGX's memory
> management live outside the core VM.  I've waffled back and forth on it,
> but I do still think this is the right way to do it.

AFAICS a lack of struct page isn't a problem.  The core code seems to
understand that address_space objects might cover non-struct-page
memory.  Morally, enclave memory is a lot like hot-unpluggable PCI
space.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 11:55 AM, Andy Lutomirski wrote:
>> You're effectively rebuilding reverse-mapping infrastructure here.  It's
>> a frequent thing for the core VM to need to go from 'struct page' back
>> to the page tables mapping it.  For that we go (logically)
>> page->{anon_vma,mapping}->vma->vm_mm->pagetable.
> This is a bit outside my expertise here, but doesn't
> unmap_mapping_range() do exactly what SGX wants?

There's no 'struct page' for enclave memory as it stands.  That means no
page cache, and that means there's no 'struct address_space *mapping' in
the first place.

Basically, the choice was made a long time ago to have SGX's memory
management live outside the core VM.  I've waffled back and forth on it,
but I do still think this is the right way to do it.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 11:53 AM Dave Hansen  wrote:
>
> On 12/17/18 11:49 AM, Jarkko Sakkinen wrote:
> >> Yeah, the code is built to have one VMA and only one VMA per enclave.
> >> You need to go over the origin of this restriction and what enforces this.
> > It is before ECREATE but after that you can split it with mprotect().
> >
> > Lets take an example. I'm not sure how we would acquire mm efficiently
> > in sgx_encl_page_reclaim() otherwise than having it as a field in encl.
>
> You're effectively rebuilding reverse-mapping infrastructure here.  It's
> a frequent thing for the core VM to need to go from 'struct page' back
> to the page tables mapping it.  For that we go (logically)
> page->{anon_vma,mapping}->vma->vm_mm->pagetable.

This is a bit outside my expertise here, but doesn't
unmap_mapping_range() do exactly what SGX wants?

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 11:25:47AM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 11:17 AM Dave Hansen  wrote:
> >
> > On 12/17/18 11:12 AM, Andy Lutomirski wrote:
> > > So I'm not saying that you shouldn't do it the way you are now, but I
> > > do think that the changelog or at least some emails should explain
> > > *why* the enclave needs to keep a pointer to the creating process's
> > > mm.  And, if you do keep the current model, it would be nice to
> > > understand what happens if you do something awful like mremap()ing an
> > > enclave, or calling madvise on it, or otherwise abusing the vma.  Or
> > > doing fork(), for that matter.
> >
> > Yeah, the code is built to have one VMA and only one VMA per enclave.
> > You need to go over the origin of this restriction and what enforces this.
> 
> There is a sad historical reason that you may regret keeping this
> restriction.  There are plenty of pieces of code out there that think
> it's reasonable to spawn a subprocess by calling fork() and then
> execve().  (This is *not* a sensible thing to do.  One should use
> posix_spawn() or some CLONE_VM variant.  But even fairly recent
> posix_spawn() implementations will fork().  So the driver has to do
> *something* sensible on fork() or a bunch of things that use SGX
> unsuspectingly via, for example, PKCS #11, are going to be very sad.
> I suppose you could make enclaves just not show up in the fork()ed
> children, but then you have a different problem: creating an enclave
> and then doing daemon() won't work.
> 
> Yes, POSIX traditions are rather silly.

ATM enclave VMAs are not copied on fork. Not sure how you would
implement COW semantics with enclaves.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 11:49 AM, Jarkko Sakkinen wrote:
>> Yeah, the code is built to have one VMA and only one VMA per enclave.
>> You need to go over the origin of this restriction and what enforces this.
> It is before ECREATE but after that you can split it with mprotect().
> 
> Lets take an example. I'm not sure how we would acquire mm efficiently
> in sgx_encl_page_reclaim() otherwise than having it as a field in encl.

You're effectively rebuilding reverse-mapping infrastructure here.  It's
a frequent thing for the core VM to need to go from 'struct page' back
to the page tables mapping it.  For that we go (logically)
page->{anon_vma,mapping}->vma->vm_mm->pagetable.

This, on the other hand, is trying to do page->encl->mm->pagetable.  You
could very easily have a VMA analog in there instead of jumping straight
to the mm.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 11:17:49AM -0800, Dave Hansen wrote:
> On 12/17/18 11:12 AM, Andy Lutomirski wrote:
> > So I'm not saying that you shouldn't do it the way you are now, but I
> > do think that the changelog or at least some emails should explain
> > *why* the enclave needs to keep a pointer to the creating process's
> > mm.  And, if you do keep the current model, it would be nice to
> > understand what happens if you do something awful like mremap()ing an
> > enclave, or calling madvise on it, or otherwise abusing the vma.  Or
> > doing fork(), for that matter.
> 
> Yeah, the code is built to have one VMA and only one VMA per enclave.
> You need to go over the origin of this restriction and what enforces this.

It is before ECREATE but after that you can split it with mprotect().

Lets take an example. I'm not sure how we would acquire mm efficiently
in sgx_encl_page_reclaim() otherwise than having it as a field in encl.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 11:37 AM, Jarkko Sakkinen wrote:
>> Suggestion:
>>
>> It looks like you only expect one VMA per enclave.  Things go bonkers if
>> this is not true.  So, instead of storing encl->mm, don't.  You can get
>> the mm from vma->vm_mm and you could just store encl->vma instead.
> The code actually supports having multiple VMAs per enclave.

That seems at least somewhat at odds with this comment:

> static void sgx_vma_open(struct vm_area_struct *vma)
> {
> struct sgx_encl *encl = vma->vm_private_data;
> 
> if (!encl)
> return;
> 
> /* kref cannot underflow because ECREATE ioctl checks that there is 
> only
>  * one single VMA for the enclave before proceeding.
>  */
> kref_get(>refcount);
> }

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 11:09:33AM -0800, Dave Hansen wrote:
> On 12/17/18 10:48 AM, Sean Christopherson wrote:
> > We can't set mm to NULL as we need it to unregister the notifier, and
> > I'm fairly certain attempting to unregister in the release callback
> > will deadlock.
> 
> Suggestion:
> 
> It looks like you only expect one VMA per enclave.  Things go bonkers if
> this is not true.  So, instead of storing encl->mm, don't.  You can get
> the mm from vma->vm_mm and you could just store encl->vma instead.

The code actually supports having multiple VMAs per enclave.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 10:46:25AM -0800, Sean Christopherson wrote:
> On Mon, Dec 17, 2018 at 08:23:19PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 10:09:57AM -0800, Sean Christopherson wrote:
> > > No, EREMOVE should never fail if the enclave is being released, i.e. all
> > > references to the enclave are gone.  And failure during sgx_encl_release()
> > > means we leaked an EPC page, which warrants a WARN.
> > 
> > Right that what I was suspecting as swapper should hold a ref to the
> > enclave while it is working on it. It is a programming error when this
> > happens.
> > 
> > Maybe change the boolean parameter to flags parameter have a flag to
> > use sgx_free_page()?
> 
> I tried that approach when I first split it to __sgx_free_page() and
> sgx_free_page(), but IMO the code is more difficult to read and harder
> to maintain since sgx_free_page() should be used except under special
> circumstances, e.g. race with reclaim or the freeing is "untrusted",
> i.e. requested by userspace via sgx_ioc_enclave_remove_pages().

I mean inside sgx_invalidate() call either __sgx_free_page() or
sgx_free_page() depending on a flag.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 10:48:58AM -0800, Sean Christopherson wrote:
> On Mon, Dec 17, 2018 at 08:43:33PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> > > I'm pretty sure doing mmget() would result in circular dependencies and
> > > a zombie enclave.  In the do_exit() case where a task is abruptly killed:
> > >  
> > >   - __mmput() is never called because the enclave holds a ref
> > >   - sgx_encl_release() is never be called because its VMAs hold refs
> > >   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> > > blocked and the process itself is dead, i.e. won't unmap anything.
> > 
> > Right, it does, you are absolutely right. Tried it and removed the
> > commit already.
> > 
> > Well, what we came up from your suggestion i.e. setting mm to NULL
> > and checking that is very subtle change and does not have any such
> > circular dependencies. We'll go with that.
> 
> We can't set mm to NULL as we need it to unregister the notifier, and
> I'm fairly certain attempting to unregister in the release callback
> will deadlock.

Noticed that too. mmu_notifier_unregister() requires a valid mm.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 11:17 AM Dave Hansen  wrote:
>
> On 12/17/18 11:12 AM, Andy Lutomirski wrote:
> > So I'm not saying that you shouldn't do it the way you are now, but I
> > do think that the changelog or at least some emails should explain
> > *why* the enclave needs to keep a pointer to the creating process's
> > mm.  And, if you do keep the current model, it would be nice to
> > understand what happens if you do something awful like mremap()ing an
> > enclave, or calling madvise on it, or otherwise abusing the vma.  Or
> > doing fork(), for that matter.
>
> Yeah, the code is built to have one VMA and only one VMA per enclave.
> You need to go over the origin of this restriction and what enforces this.

There is a sad historical reason that you may regret keeping this
restriction.  There are plenty of pieces of code out there that think
it's reasonable to spawn a subprocess by calling fork() and then
execve().  (This is *not* a sensible thing to do.  One should use
posix_spawn() or some CLONE_VM variant.  But even fairly recent
posix_spawn() implementations will fork().  So the driver has to do
*something* sensible on fork() or a bunch of things that use SGX
unsuspectingly via, for example, PKCS #11, are going to be very sad.
I suppose you could make enclaves just not show up in the fork()ed
children, but then you have a different problem: creating an enclave
and then doing daemon() won't work.

Yes, POSIX traditions are rather silly.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 11:12 AM, Andy Lutomirski wrote:
> So I'm not saying that you shouldn't do it the way you are now, but I
> do think that the changelog or at least some emails should explain
> *why* the enclave needs to keep a pointer to the creating process's
> mm.  And, if you do keep the current model, it would be nice to
> understand what happens if you do something awful like mremap()ing an
> enclave, or calling madvise on it, or otherwise abusing the vma.  Or
> doing fork(), for that matter.

Yeah, the code is built to have one VMA and only one VMA per enclave.
You need to go over the origin of this restriction and what enforces this.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 10:47 AM Dave Hansen  wrote:
>
> On 12/17/18 10:43 AM, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> >> I'm pretty sure doing mmget() would result in circular dependencies and
> >> a zombie enclave.  In the do_exit() case where a task is abruptly killed:
> >>
> >>   - __mmput() is never called because the enclave holds a ref
> >>   - sgx_encl_release() is never be called because its VMAs hold refs
> >>   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> >> blocked and the process itself is dead, i.e. won't unmap anything.
> > Right, it does, you are absolutely right. Tried it and removed the
> > commit already.
> >
> > Well, what we came up from your suggestion i.e. setting mm to NULL
> > and checking that is very subtle change and does not have any such
> > circular dependencies. We'll go with that.
>
> This all screams that you need to break out this code from the massive
> "18" patch and get the mm interactions reviewed more thoroughly.
>
> Also, no matter what method you go with, you have a bunch of commenting
> and changelogging to do here.

I'm going to ask an obnoxious high-level question: why does an enclave
even refer to a specific mm?

If I were designing this thing, and if I hadn't started trying to
implement it, my first thought would be that an enclave tracks its
linear address range, which is just a pair of numbers, and also keeps
track of a whole bunch of physical EPC pages, data structures, etc.
And that mmap() gets rejected unless the requested virtual address
matches the linear address range that the enclave wants and, aside
from that, just creates a VMA that keeps a reference to the enclave.
(And, for convenience, I suppose that the first mmap() call done
before any actual enclave setup happens could choose any address and
then cause the enclave to lock itself to that address, although a
regular anonymous PROT_NONE MAP_NORESERVE mapping would do just fine,
too.)  And the driver would explicitly allow multiple different mms to
have the same enclave mapped.  More importantly, a daemon could set up
an enclave without necessarily mapping it at all and then SCM_RIGHTS
the enclave over to the process that plans to run it.

Now I'm sure this has all kinds of problems, such as the ISA possibly
making it rather obnoxious to add pages to the enclave without having
it mapped.  But these operations could, in principle, be done by
having the enclave own a private mm that's used just for setup.  While
this would be vaguely annoying, Nadav's still-pending-but-nearly-done
text_poke series adds all the infrastructure that's needed for the
kernel to manage little private mms.  But some things get simpler --
invalidating the enclave can presumably use the regular rmap APIs to
zap all the PTEs in all VMAs pointing into the enclave.

So I'm not saying that you shouldn't do it the way you are now, but I
do think that the changelog or at least some emails should explain
*why* the enclave needs to keep a pointer to the creating process's
mm.  And, if you do keep the current model, it would be nice to
understand what happens if you do something awful like mremap()ing an
enclave, or calling madvise on it, or otherwise abusing the vma.  Or
doing fork(), for that matter.

I also find it suspicious that the various ioctl handlers
systematically ignore their "filep" parameters and instead use
find_vma() to find the relevant mm data structures.  That seems
backwards.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 10:48 AM, Sean Christopherson wrote:
> We can't set mm to NULL as we need it to unregister the notifier, and
> I'm fairly certain attempting to unregister in the release callback
> will deadlock.

Suggestion:

It looks like you only expect one VMA per enclave.  Things go bonkers if
this is not true.  So, instead of storing encl->mm, don't.  You can get
the mm from vma->vm_mm and you could just store encl->vma instead.

Doing that, you could even axe encl->base and encl->size, I think
because you just get those from the VMA itself.

That makes the relationship clearer: 1 VMA per enclave.  We also
implicitly understand that if you have a VMA, you implicitly have a ref
to the mm *and* the VMA is immutable.

If there were ever a path where encl->vma wasn't immutable, we'd have a
bug (or load of bugs) somewhere, right?

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 08:43:33PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> > I'm pretty sure doing mmget() would result in circular dependencies and
> > a zombie enclave.  In the do_exit() case where a task is abruptly killed:
> >  
> >   - __mmput() is never called because the enclave holds a ref
> >   - sgx_encl_release() is never be called because its VMAs hold refs
> >   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> > blocked and the process itself is dead, i.e. won't unmap anything.
> 
> Right, it does, you are absolutely right. Tried it and removed the
> commit already.
> 
> Well, what we came up from your suggestion i.e. setting mm to NULL
> and checking that is very subtle change and does not have any such
> circular dependencies. We'll go with that.

We can't set mm to NULL as we need it to unregister the notifier, and
I'm fairly certain attempting to unregister in the release callback
will deadlock.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 10:43 AM, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
>> I'm pretty sure doing mmget() would result in circular dependencies and
>> a zombie enclave.  In the do_exit() case where a task is abruptly killed:
>>  
>>   - __mmput() is never called because the enclave holds a ref
>>   - sgx_encl_release() is never be called because its VMAs hold refs
>>   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
>> blocked and the process itself is dead, i.e. won't unmap anything.
> Right, it does, you are absolutely right. Tried it and removed the
> commit already.
> 
> Well, what we came up from your suggestion i.e. setting mm to NULL
> and checking that is very subtle change and does not have any such
> circular dependencies. We'll go with that.

This all screams that you need to break out this code from the massive
"18" patch and get the mm interactions reviewed more thoroughly.

Also, no matter what method you go with, you have a bunch of commenting
and changelogging to do here.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 08:23:19PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 10:09:57AM -0800, Sean Christopherson wrote:
> > No, EREMOVE should never fail if the enclave is being released, i.e. all
> > references to the enclave are gone.  And failure during sgx_encl_release()
> > means we leaked an EPC page, which warrants a WARN.
> 
> Right that what I was suspecting as swapper should hold a ref to the
> enclave while it is working on it. It is a programming error when this
> happens.
> 
> Maybe change the boolean parameter to flags parameter have a flag to
> use sgx_free_page()?

I tried that approach when I first split it to __sgx_free_page() and
sgx_free_page(), but IMO the code is more difficult to read and harder
to maintain since sgx_free_page() should be used except under special
circumstances, e.g. race with reclaim or the freeing is "untrusted",
i.e. requested by userspace via sgx_ioc_enclave_remove_pages().

> 
> > That makes sense.
> 
> What do you think of Dave's proposal?
> 
> /Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 10:36:13AM -0800, Sean Christopherson wrote:
> I'm pretty sure doing mmget() would result in circular dependencies and
> a zombie enclave.  In the do_exit() case where a task is abruptly killed:
>  
>   - __mmput() is never called because the enclave holds a ref
>   - sgx_encl_release() is never be called because its VMAs hold refs
>   - sgx_vma_close() is never called because __mmput()->exit_mmap() is
> blocked and the process itself is dead, i.e. won't unmap anything.

Right, it does, you are absolutely right. Tried it and removed the
commit already.

Well, what we came up from your suggestion i.e. setting mm to NULL
and checking that is very subtle change and does not have any such
circular dependencies. We'll go with that.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 08:01:02PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 09:45:40AM -0800, Dave Hansen wrote:
> > > +struct sgx_encl *sgx_encl_alloc(struct sgx_secs *secs)
> > > +{
> > ...
> > > + kref_init(>refcount);
> > > + INIT_LIST_HEAD(>add_page_reqs);
> > > + INIT_RADIX_TREE(>page_tree, GFP_KERNEL);
> > > + mutex_init(>lock);
> > > + INIT_WORK(>add_page_work, sgx_add_page_worker);
> > > +
> > > + encl->mm = current->mm;  <-> +  
> > > encl->base = secs->base;
> > > + encl->size = secs->size;
> > > + encl->ssaframesize = secs->ssa_frame_size;
> > > + encl->backing = backing;
> > > +
> > > + return encl;
> > > +}
> > 
> > How is this OK without taking a reference on the mm?

It's subtle and the ordering is all kinds of weird, but technically we
are taking a reference on mm when the mmu_notifier is registered in
sgx_encl_create().  sgx_encl_alloc() and sgx_encl_create() are always
called in tandem and with mm->mm_users > 0, so we'll never use encl->mm
without holding a reference to mm.  We need to comment the weirdness
or maybe register the notifier before

> > I have a feeling a bunch of your bugs with the mmu notifiers and so
> > forth are because the refcounting is wrong here.

Eh, not really.  Maybe the mmu_notifier is more subtle, e.g. calling
do_unmap() after mmput() would be quite obvious, but there's no
fundamental bug, we just haven't needed to touch VMAs during release
prior to moving away from shmem.

> > Sean's SGX_ENCL_MM_RELEASED would, I think be unnecessary if you just
> > take a refcount here and release it when the enclave is destroyed.
> 
> Right, atomic_inc(encl->mm->count) here and once when releasing.
> 
> The we would not even need the whole mmu notifier in the first place.

I'm pretty sure doing mmget() would result in circular dependencies and
a zombie enclave.  In the do_exit() case where a task is abruptly killed:

  - __mmput() is never called because the enclave holds a ref
  - sgx_encl_release() is never be called because its VMAs hold refs
  - sgx_vma_close() is never called because __mmput()->exit_mmap() is
blocked and the process itself is dead, i.e. won't unmap anything.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 10:07:08AM -0800, Dave Hansen wrote:
> On 12/17/18 10:01 AM, Jarkko Sakkinen wrote:
> >>> + encl->mm = current->mm;  <-> +  
> >>> encl->base = secs->base;
> >>> + encl->size = secs->size;
> >>> + encl->ssaframesize = secs->ssa_frame_size;
> >>> + encl->backing = backing;
> >>> +
> >>> + return encl;
> >>> +}
> >> How is this OK without taking a reference on the mm?
> >>
> >> I have a feeling a bunch of your bugs with the mmu notifiers and so
> >> forth are because the refcounting is wrong here.
> >>
> >> Sean's SGX_ENCL_MM_RELEASED would, I think be unnecessary if you just
> >> take a refcount here and release it when the enclave is destroyed.
> > Right, atomic_inc(encl->mm->count) here and once when releasing.
> > 
> > The we would not even need the whole mmu notifier in the first place.
> 
> Please use mmget()/mmput().

There's now a patch to test on top of the master.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 10:09:57AM -0800, Sean Christopherson wrote:
> No, EREMOVE should never fail if the enclave is being released, i.e. all
> references to the enclave are gone.  And failure during sgx_encl_release()
> means we leaked an EPC page, which warrants a WARN.

Right that what I was suspecting as swapper should hold a ref to the
enclave while it is working on it. It is a programming error when this
happens.

Maybe change the boolean parameter to flags parameter have a flag to
use sgx_free_page()?

> That makes sense.

What do you think of Dave's proposal?

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 07:49:35PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 09:31:06AM -0800, Sean Christopherson wrote:
> > This doesn't work as-is.  sgx_encl_release() needs to use sgx_free_page()
> > and not __sgx_free_page() so that we get a WARN() if the page can't be
> > freed.  sgx_invalidate() needs to use __sgx_free_page() as freeing a page
> > can fail due to running concurrently with reclaim.  I'll play around with
> > the code a bit, there's probably a fairly clean way to share code between
> > the two flows.
> 
> Hmm... but why issue a warning in that case? It should be legit
> behaviour.

No, EREMOVE should never fail if the enclave is being released, i.e. all
references to the enclave are gone.  And failure during sgx_encl_release()
means we leaked an EPC page, which warrants a WARN.

The only legitimate reason __sgx_free_page() can fail in sgx_invalidate()
is because a page might be in the process of being reclaimed.  We could
theoretically WARN on EREMOVE failure in that case, but it'd make the code
a little fragile and it's not "fatal" in the sense that we get a second
chance to free the page during sgx_encl_release().

And unless I missed something, using sgx_invalidate() means were' leaking
all sgx_encl_page structs as well as the radix tree entries.

> > sgx_encl_release_worker() calls do_unmap() without checking the validity
> > of the page tables[1].  As is, the code doesn't even guarantee mm_struct
> > itself is valid.
> > 
> > The easiest fix I can think of is to add a SGX_ENCL_MM_RELEASED flag
> > that is set along with SGX_ENCL_DEAD in sgx_mmu_notifier_release(), and
> > only call do_unmap() if SGX_ENCL_MM_RELEASED is false.  Note that this
> > means we cant unregister the mmu_notifier until after do_unmap(), but
> > that's true no matter what since we're relying on the mmu_notifier to
> > hold a reference to mm_struct.  Patch attached.
> 
> OK, the fix change makes sense but I'm thinking that would it be a
> better idea just to set mm NULL and check that instead?

That makes sense.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

On 12/17/18 10:01 AM, Jarkko Sakkinen wrote:
>>> +   encl->mm = current->mm;  <-> +  
>>> encl->base = secs->base;
>>> +   encl->size = secs->size;
>>> +   encl->ssaframesize = secs->ssa_frame_size;
>>> +   encl->backing = backing;
>>> +
>>> +   return encl;
>>> +}
>> How is this OK without taking a reference on the mm?
>>
>> I have a feeling a bunch of your bugs with the mmu notifiers and so
>> forth are because the refcounting is wrong here.
>>
>> Sean's SGX_ENCL_MM_RELEASED would, I think be unnecessary if you just
>> take a refcount here and release it when the enclave is destroyed.
> Right, atomic_inc(encl->mm->count) here and once when releasing.
> 
> The we would not even need the whole mmu notifier in the first place.

Please use mmget()/mmput().

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 09:45:40AM -0800, Dave Hansen wrote:
> > +struct sgx_encl *sgx_encl_alloc(struct sgx_secs *secs)
> > +{
> ...
> > +   kref_init(>refcount);
> > +   INIT_LIST_HEAD(>add_page_reqs);
> > +   INIT_RADIX_TREE(>page_tree, GFP_KERNEL);
> > +   mutex_init(>lock);
> > +   INIT_WORK(>add_page_work, sgx_add_page_worker);
> > +
> > +   encl->mm = current->mm;  <-> +  
> > encl->base = secs->base;
> > +   encl->size = secs->size;
> > +   encl->ssaframesize = secs->ssa_frame_size;
> > +   encl->backing = backing;
> > +
> > +   return encl;
> > +}
> 
> How is this OK without taking a reference on the mm?
> 
> I have a feeling a bunch of your bugs with the mmu notifiers and so
> forth are because the refcounting is wrong here.
> 
> Sean's SGX_ENCL_MM_RELEASED would, I think be unnecessary if you just
> take a refcount here and release it when the enclave is destroyed.

Right, atomic_inc(encl->mm->count) here and once when releasing.

The we would not even need the whole mmu notifier in the first place.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 09:31:06AM -0800, Sean Christopherson wrote:
> This doesn't work as-is.  sgx_encl_release() needs to use sgx_free_page()
> and not __sgx_free_page() so that we get a WARN() if the page can't be
> freed.  sgx_invalidate() needs to use __sgx_free_page() as freeing a page
> can fail due to running concurrently with reclaim.  I'll play around with
> the code a bit, there's probably a fairly clean way to share code between
> the two flows.

Hmm... but why issue a warning in that case? It should be legit
behaviour.

> sgx_encl_release_worker() calls do_unmap() without checking the validity
> of the page tables[1].  As is, the code doesn't even guarantee mm_struct
> itself is valid.
> 
> The easiest fix I can think of is to add a SGX_ENCL_MM_RELEASED flag
> that is set along with SGX_ENCL_DEAD in sgx_mmu_notifier_release(), and
> only call do_unmap() if SGX_ENCL_MM_RELEASED is false.  Note that this
> means we cant unregister the mmu_notifier until after do_unmap(), but
> that's true no matter what since we're relying on the mmu_notifier to
> hold a reference to mm_struct.  Patch attached.

OK, the fix change makes sense but I'm thinking that would it be a
better idea just to set mm NULL and check that instead?

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dave Hansen

> +struct sgx_encl *sgx_encl_alloc(struct sgx_secs *secs)
> +{
...
> + kref_init(>refcount);
> + INIT_LIST_HEAD(>add_page_reqs);
> + INIT_RADIX_TREE(>page_tree, GFP_KERNEL);
> + mutex_init(>lock);
> + INIT_WORK(>add_page_work, sgx_add_page_worker);
> +
> + encl->mm = current->mm;  <-> +  
> encl->base = secs->base;
> + encl->size = secs->size;
> + encl->ssaframesize = secs->ssa_frame_size;
> + encl->backing = backing;
> +
> + return encl;
> +}

How is this OK without taking a reference on the mm?

I have a feeling a bunch of your bugs with the mmu notifiers and so
forth are because the refcounting is wrong here.

Sean's SGX_ENCL_MM_RELEASED would, I think be unnecessary if you just
take a refcount here and release it when the enclave is destroyed.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Mon, Dec 17, 2018 at 04:08:11PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 03:39:28PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 03:28:59PM +0200, Jarkko Sakkinen wrote:
> > > On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> > > > [  504.149548] [ cut here ]
> > > > [  504.149550] kernel BUG at 
> > > > /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> > > > [  504.150288] invalid opcode:  [#1] SMP
> > > > [  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 
> > > > 4.20.0-rc2+ #267
> > > > [  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> > > > 0.0.0 02/06/2015
> > > > [  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
> > > > [  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
> > > > [  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 00 
> > > > 45 31 e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f 
> > > > c3 <0f> 0b 49 89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 e8 fe ee
> > > > [  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
> > > > [  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 
> > > > 888467af16e0
> > > > [  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 
> > > > 88846bf9e798
> > > > [  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
> > > > 
> > > > [  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
> > > > 
> > > > [  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 
> > > > 888467af1b60
> > > > [  504.157378] FS:  () GS:88846f70() 
> > > > knlGS:
> > > > [  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
> > > > [  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 
> > > > 00360ee0
> > > > [  504.159054] DR0:  DR1:  DR2: 
> > > > 
> > > > [  504.159623] DR3:  DR6: fffe0ff0 DR7: 
> > > > 0400
> > > > [  504.160193] Call Trace:
> > > > [  504.160406]  __split_vma+0x16f/0x180
> > > > [  504.160706]  ? __switch_to_asm+0x40/0x70
> > > > [  504.161024]  __do_munmap+0xfb/0x450
> > > > [  504.161308]  sgx_encl_release_worker+0x44/0x70
> > > > [  504.161675]  process_one_work+0x200/0x3f0
> > > > [  504.162004]  worker_thread+0x2d/0x3d0
> > > > [  504.162301]  ? process_one_work+0x3f0/0x3f0
> > > > [  504.162645]  kthread+0x113/0x130
> > > > [  504.162912]  ? kthread_park+0x90/0x90
> > > > [  504.163209]  ret_from_fork+0x35/0x40
> > > > [  504.163503] Modules linked in: bridge stp llc
> > > > [  504.163866] ---[ end trace 83076139fc25e3e0 ]---
> > > 
> > > There was a race with release and swapping code that I thought I fixed,
> > > and this is looks like a race there. Have to recheck what I did not
> > > consider. Anyway, though to share this if you have time to look at it.
> > > That is the part where something is now unsync most probably.
> > 
> > I think I found it. I was careless to make sgx_encl_release() to use
> > sgx_invalidate(), which does not delete pages in the case when enclave
> > is already marked as dead. This was after I had fixed the race that I
> > had there in the first place. That is why I was puzzled why it suddenly
> > reappeared.
> > 
> > Would be nice to use sgx_invalidate() also in release for consistency in
> > semantics sake so maybe just delete this:
> > 
> > if (encl->flags & SGX_ENCL_DEAD)
> > return;

This doesn't work as-is.  sgx_encl_release() needs to use sgx_free_page()
and not __sgx_free_page() so that we get a WARN() if the page can't be
freed.  sgx_invalidate() needs to use __sgx_free_page() as freeing a page
can fail due to running concurrently with reclaim.  I'll play around with
the code a bit, there's probably a fairly clean way to share code between
the two flows.

> 
> Updated master, not at this point next.

Still broken (as Greg's parallel email points out).

sgx_encl_release_worker() calls do_unmap() without checking the validity
of the page tables[1].  As is, the code doesn't even guarantee mm_struct
itself is valid.

The easiest fix I can think of is to add a SGX_ENCL_MM_RELEASED flag
that is set along with SGX_ENCL_DEAD in sgx_mmu_notifier_release(), and
only call do_unmap() if SGX_ENCL_MM_RELEASED is false.  Note that this
means we cant unregister the mmu_notifier until after do_unmap(), but
that's true no matter what since we're relying on the mmu_notifier to
hold a reference to mm_struct.  Patch attached.

[1] https://www.spinics.net/lists/dri-devel/msg186827.html
>From 7cfdf34ec5b70392216b24853d6b8cc5e3192a92 Mon Sep 17 00:00:00 2001
From: Sean Christopherson 
Date: Mon, 17 Dec 2018 09:21:14 -0800
Subject: [PATCH] x86/sgx: Do not attempt to unmap enclave VMAs if mm_struct is
 defunct

Add a flag, SGX_ENCL_MM_RELEASED, to explicitly track the lifecycle of
the enclave's associated

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Dr. Greg

On Mon, Dec 17, 2018 at 04:13:15PM +0200, Jarkko Sakkinen wrote:

Good morning to everyone.

> On Mon, Dec 17, 2018 at 04:08:11PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 03:39:28PM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Dec 17, 2018 at 03:28:59PM +0200, Jarkko Sakkinen wrote:
> > > > On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> > > > > [  504.149548] [ cut here ]
> > > > > [  504.149550] kernel BUG at 
> > > > > /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> > > > > [  504.150288] invalid opcode:  [#1] SMP
> > > > > [  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 
> > > > > 4.20.0-rc2+ #267
> > > > > [  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
> > > > > BIOS 0.0.0 02/06/2015
> > > > > [  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
> > > > > [  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
> > > > > [  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 
> > > > > 00 45 31 e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 
> > > > > 41 5f c3 <0f> 0b 49 89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 
> > > > > e8 fe ee
> > > > > [  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
> > > > > [  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 
> > > > > 888467af16e0
> > > > > [  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 
> > > > > 88846bf9e798
> > > > > [  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
> > > > > 
> > > > > [  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
> > > > > 
> > > > > [  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 
> > > > > 888467af1b60
> > > > > [  504.157378] FS:  () GS:88846f70() 
> > > > > knlGS:
> > > > > [  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
> > > > > [  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 
> > > > > 00360ee0
> > > > > [  504.159054] DR0:  DR1:  DR2: 
> > > > > 
> > > > > [  504.159623] DR3:  DR6: fffe0ff0 DR7: 
> > > > > 0400
> > > > > [  504.160193] Call Trace:
> > > > > [  504.160406]  __split_vma+0x16f/0x180
> > > > > [  504.160706]  ? __switch_to_asm+0x40/0x70
> > > > > [  504.161024]  __do_munmap+0xfb/0x450
> > > > > [  504.161308]  sgx_encl_release_worker+0x44/0x70
> > > > > [  504.161675]  process_one_work+0x200/0x3f0
> > > > > [  504.162004]  worker_thread+0x2d/0x3d0
> > > > > [  504.162301]  ? process_one_work+0x3f0/0x3f0
> > > > > [  504.162645]  kthread+0x113/0x130
> > > > > [  504.162912]  ? kthread_park+0x90/0x90
> > > > > [  504.163209]  ret_from_fork+0x35/0x40
> > > > > [  504.163503] Modules linked in: bridge stp llc
> > > > > [  504.163866] ---[ end trace 83076139fc25e3e0 ]---

> > > > There was a race with release and swapping code that I thought
> > > > I fixed, and this is looks like a race there. Have to recheck
> > > > what I did not consider. Anyway, though to share this if you
> > > > have time to look at it.  That is the part where something is
> > > > now unsync most probably.

> > > I think I found it. I was careless to make sgx_encl_release() to
> > > use sgx_invalidate(), which does not delete pages in the case
> > > when enclave is already marked as dead. This was after I had
> > > fixed the race that I had there in the first place. That is why
> > > I was puzzled why it suddenly reappeared.

> > > Would be nice to use sgx_invalidate() also in release for consistency in
> > > semantics sake so maybe just delete this:
> > > 
> > >   if (encl->flags & SGX_ENCL_DEAD)
> > >   return;
> > 
> > Updated master, not at this point next.

> If I checked this right was that mmu_notifier_unregister() cause
> DEAD to set, and thus when sgx_invalidate() is executed, it returns
> without doing anything...

On a pristine jarkko-sgx/next local branch we commented out the 'if
(encl->flags & SGX_ENCL_DEAD) return' clause in the following
file/function:

arch/x86/kernel/cpu/sgx/driver/encl.c:sgx_invalidate()

And tested the kernel.

This fix seems to prevent the memory manager from getting
catastrophically corrupted but the EINIT ioctl still fails.

On the first invocation after a fresh boot the EINIT ioctl returns -1.

On subsequent invocations of the loader it returns EBUSY.  Every 8-10
invocations we get the -1 (EPERM -?) from the EINIT call and then it
returns to issueing EBUSY.

Here is a representative call trace from the loader utility:

---
address: 7ff5cbe0, create address: 7ff5cbe0
Non-token initialization requested.
EINIT retn: -1 / No error information
[SGXenclave.c,init_enclave,652]: Error location.
[sgx-load.c,main,180]: Error location.

address: 7f425520, create address: 7f425520

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Sean Christopherson

On Sat, Dec 15, 2018 at 05:22:31PM -0600, Dr. Greg wrote:
> On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> 
> Good afternoon, I hope the weekend is going well for everyone.
> 
> > On Fri, Dec 14, 2018 at 05:59:17PM -0600, Dr. Greg wrote:
> > > On Wed, Dec 12, 2018 at 08:00:36PM +0200, Jarkko Sakkinen wrote:
> > > 
> > > Good evening, I hope the week has gone well for everyone.
> > > 
> > > > On Mon, Dec 10, 2018 at 04:49:08AM -0600, Dr. Greg wrote:
> > > > > In the meantime, I wanted to confirm that your jarkko-sgx/master
> > > > > branch contains the proposed driver that is headed upstream.
> > > > > Before adding the SFLC patches we thought it best to run the
> > > > > driver through some testing in order to verify that any problems
> > > > > we generated where attributable to our work and not the base
> > > > > driver.
> > > >
> > > > The master branch is by definition unstable at the moment i.e. it
> > > > can sometimes (not often) contain unfinished changes. Use next for
> > > > testing.  I update next when I consider the master contents "stable
> > > > enough".
> > > 
> > > I noticed in the last day or so that you appeared to sync
> > > jarkko-sgx/master with jarkko-sgx/next, so I checked out a local
> > > branch against jarkko-sgx/next and ran it against our unit tests.
> > > Based on what we are seeing the driver is still experiencing issues
> > > with initialization of a non-trivial enclave.
> 
> > master branch is broken, looks like the VMA code Jarkko is reworking is
> > buggy.  I should be able to help debug this next week.
> > 
> > [  504.149548] [ cut here ]
> > [  504.149550] kernel BUG at 
> > /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> 
> Rodger, dodger.
> 
> Let us know when you think you have something working pushed up into
> one of the branches and we will put it on the bench here in the lab
> and see what our runtime is able to do with it.
> 
> BTW, your new vDSO work appears to be shaping up well.  Just out of
> curiosity though, how are you testing and validating the new vDSO
> based exception handler if it isn't possible to initialize and run an
> enclave with the new driver?

Cherry-pick the patches to a stable version of the driver, the vDSO code
doesn't use any of the SGX APIs.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 04:08:11PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 03:39:28PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 03:28:59PM +0200, Jarkko Sakkinen wrote:
> > > On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> > > > [  504.149548] [ cut here ]
> > > > [  504.149550] kernel BUG at 
> > > > /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> > > > [  504.150288] invalid opcode:  [#1] SMP
> > > > [  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 
> > > > 4.20.0-rc2+ #267
> > > > [  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> > > > 0.0.0 02/06/2015
> > > > [  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
> > > > [  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
> > > > [  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 00 
> > > > 45 31 e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f 
> > > > c3 <0f> 0b 49 89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 e8 fe ee
> > > > [  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
> > > > [  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 
> > > > 888467af16e0
> > > > [  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 
> > > > 88846bf9e798
> > > > [  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
> > > > 
> > > > [  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
> > > > 
> > > > [  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 
> > > > 888467af1b60
> > > > [  504.157378] FS:  () GS:88846f70() 
> > > > knlGS:
> > > > [  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
> > > > [  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 
> > > > 00360ee0
> > > > [  504.159054] DR0:  DR1:  DR2: 
> > > > 
> > > > [  504.159623] DR3:  DR6: fffe0ff0 DR7: 
> > > > 0400
> > > > [  504.160193] Call Trace:
> > > > [  504.160406]  __split_vma+0x16f/0x180
> > > > [  504.160706]  ? __switch_to_asm+0x40/0x70
> > > > [  504.161024]  __do_munmap+0xfb/0x450
> > > > [  504.161308]  sgx_encl_release_worker+0x44/0x70
> > > > [  504.161675]  process_one_work+0x200/0x3f0
> > > > [  504.162004]  worker_thread+0x2d/0x3d0
> > > > [  504.162301]  ? process_one_work+0x3f0/0x3f0
> > > > [  504.162645]  kthread+0x113/0x130
> > > > [  504.162912]  ? kthread_park+0x90/0x90
> > > > [  504.163209]  ret_from_fork+0x35/0x40
> > > > [  504.163503] Modules linked in: bridge stp llc
> > > > [  504.163866] ---[ end trace 83076139fc25e3e0 ]---
> > > 
> > > There was a race with release and swapping code that I thought I fixed,
> > > and this is looks like a race there. Have to recheck what I did not
> > > consider. Anyway, though to share this if you have time to look at it.
> > > That is the part where something is now unsync most probably.
> > 
> > I think I found it. I was careless to make sgx_encl_release() to use
> > sgx_invalidate(), which does not delete pages in the case when enclave
> > is already marked as dead. This was after I had fixed the race that I
> > had there in the first place. That is why I was puzzled why it suddenly
> > reappeared.
> > 
> > Would be nice to use sgx_invalidate() also in release for consistency in
> > semantics sake so maybe just delete this:
> > 
> > if (encl->flags & SGX_ENCL_DEAD)
> > return;
> 
> Updated master, not at this point next.

If I checked this right was that mmu_notifier_unregister() cause DEAD
to set, and thus when sgx_invalidate() is executed, it returns without
doing anything...

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 03:39:28PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 03:28:59PM +0200, Jarkko Sakkinen wrote:
> > On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> > > [  504.149548] [ cut here ]
> > > [  504.149550] kernel BUG at 
> > > /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> > > [  504.150288] invalid opcode:  [#1] SMP
> > > [  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 
> > > 4.20.0-rc2+ #267
> > > [  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> > > 0.0.0 02/06/2015
> > > [  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
> > > [  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
> > > [  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 00 
> > > 45 31 e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f 
> > > c3 <0f> 0b 49 89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 e8 fe ee
> > > [  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
> > > [  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 
> > > 888467af16e0
> > > [  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 
> > > 88846bf9e798
> > > [  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
> > > 
> > > [  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
> > > 
> > > [  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 
> > > 888467af1b60
> > > [  504.157378] FS:  () GS:88846f70() 
> > > knlGS:
> > > [  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
> > > [  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 
> > > 00360ee0
> > > [  504.159054] DR0:  DR1:  DR2: 
> > > 
> > > [  504.159623] DR3:  DR6: fffe0ff0 DR7: 
> > > 0400
> > > [  504.160193] Call Trace:
> > > [  504.160406]  __split_vma+0x16f/0x180
> > > [  504.160706]  ? __switch_to_asm+0x40/0x70
> > > [  504.161024]  __do_munmap+0xfb/0x450
> > > [  504.161308]  sgx_encl_release_worker+0x44/0x70
> > > [  504.161675]  process_one_work+0x200/0x3f0
> > > [  504.162004]  worker_thread+0x2d/0x3d0
> > > [  504.162301]  ? process_one_work+0x3f0/0x3f0
> > > [  504.162645]  kthread+0x113/0x130
> > > [  504.162912]  ? kthread_park+0x90/0x90
> > > [  504.163209]  ret_from_fork+0x35/0x40
> > > [  504.163503] Modules linked in: bridge stp llc
> > > [  504.163866] ---[ end trace 83076139fc25e3e0 ]---
> > 
> > There was a race with release and swapping code that I thought I fixed,
> > and this is looks like a race there. Have to recheck what I did not
> > consider. Anyway, though to share this if you have time to look at it.
> > That is the part where something is now unsync most probably.
> 
> I think I found it. I was careless to make sgx_encl_release() to use
> sgx_invalidate(), which does not delete pages in the case when enclave
> is already marked as dead. This was after I had fixed the race that I
> had there in the first place. That is why I was puzzled why it suddenly
> reappeared.
> 
> Would be nice to use sgx_invalidate() also in release for consistency in
> semantics sake so maybe just delete this:
> 
>   if (encl->flags & SGX_ENCL_DEAD)
>   return;

Updated master, not at this point next.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Mon, Dec 17, 2018 at 03:28:59PM +0200, Jarkko Sakkinen wrote:
> On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> > [  504.149548] [ cut here ]
> > [  504.149550] kernel BUG at 
> > /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> > [  504.150288] invalid opcode:  [#1] SMP
> > [  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 4.20.0-rc2+ 
> > #267
> > [  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> > 0.0.0 02/06/2015
> > [  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
> > [  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
> > [  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 00 45 
> > 31 e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 
> > <0f> 0b 49 89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 e8 fe ee
> > [  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
> > [  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 
> > 888467af16e0
> > [  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 
> > 88846bf9e798
> > [  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
> > 
> > [  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
> > 
> > [  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 
> > 888467af1b60
> > [  504.157378] FS:  () GS:88846f70() 
> > knlGS:
> > [  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
> > [  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 
> > 00360ee0
> > [  504.159054] DR0:  DR1:  DR2: 
> > 
> > [  504.159623] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [  504.160193] Call Trace:
> > [  504.160406]  __split_vma+0x16f/0x180
> > [  504.160706]  ? __switch_to_asm+0x40/0x70
> > [  504.161024]  __do_munmap+0xfb/0x450
> > [  504.161308]  sgx_encl_release_worker+0x44/0x70
> > [  504.161675]  process_one_work+0x200/0x3f0
> > [  504.162004]  worker_thread+0x2d/0x3d0
> > [  504.162301]  ? process_one_work+0x3f0/0x3f0
> > [  504.162645]  kthread+0x113/0x130
> > [  504.162912]  ? kthread_park+0x90/0x90
> > [  504.163209]  ret_from_fork+0x35/0x40
> > [  504.163503] Modules linked in: bridge stp llc
> > [  504.163866] ---[ end trace 83076139fc25e3e0 ]---
> 
> There was a race with release and swapping code that I thought I fixed,
> and this is looks like a race there. Have to recheck what I did not
> consider. Anyway, though to share this if you have time to look at it.
> That is the part where something is now unsync most probably.

I think I found it. I was careless to make sgx_encl_release() to use
sgx_invalidate(), which does not delete pages in the case when enclave
is already marked as dead. This was after I had fixed the race that I
had there in the first place. That is why I was puzzled why it suddenly
reappeared.

Would be nice to use sgx_invalidate() also in release for consistency in
semantics sake so maybe just delete this:

if (encl->flags & SGX_ENCL_DEAD)
return;

?

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Jarkko Sakkinen

On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:
> [  504.149548] [ cut here ]
> [  504.149550] kernel BUG at /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
> [  504.150288] invalid opcode:  [#1] SMP
> [  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 4.20.0-rc2+ 
> #267
> [  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 
> 02/06/2015
> [  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
> [  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
> [  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 00 45 31 
> e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 
> 49 89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 e8 fe ee
> [  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
> [  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 
> 888467af16e0
> [  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 
> 88846bf9e798
> [  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
> 
> [  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
> 
> [  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 
> 888467af1b60
> [  504.157378] FS:  () GS:88846f70() 
> knlGS:
> [  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
> [  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 
> 00360ee0
> [  504.159054] DR0:  DR1:  DR2: 
> 
> [  504.159623] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [  504.160193] Call Trace:
> [  504.160406]  __split_vma+0x16f/0x180
> [  504.160706]  ? __switch_to_asm+0x40/0x70
> [  504.161024]  __do_munmap+0xfb/0x450
> [  504.161308]  sgx_encl_release_worker+0x44/0x70
> [  504.161675]  process_one_work+0x200/0x3f0
> [  504.162004]  worker_thread+0x2d/0x3d0
> [  504.162301]  ? process_one_work+0x3f0/0x3f0
> [  504.162645]  kthread+0x113/0x130
> [  504.162912]  ? kthread_park+0x90/0x90
> [  504.163209]  ret_from_fork+0x35/0x40
> [  504.163503] Modules linked in: bridge stp llc
> [  504.163866] ---[ end trace 83076139fc25e3e0 ]---

There was a race with release and swapping code that I thought I fixed,
and this is looks like a race there. Have to recheck what I did not
consider. Anyway, though to share this if you have time to look at it.
That is the part where something is now unsync most probably.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-15 Thread Dr. Greg

On Fri, Dec 14, 2018 at 04:06:27PM -0800, Sean Christopherson wrote:

Good afternoon, I hope the weekend is going well for everyone.

> On Fri, Dec 14, 2018 at 05:59:17PM -0600, Dr. Greg wrote:
> > On Wed, Dec 12, 2018 at 08:00:36PM +0200, Jarkko Sakkinen wrote:
> > 
> > Good evening, I hope the week has gone well for everyone.
> > 
> > > On Mon, Dec 10, 2018 at 04:49:08AM -0600, Dr. Greg wrote:
> > > > In the meantime, I wanted to confirm that your jarkko-sgx/master
> > > > branch contains the proposed driver that is headed upstream.
> > > > Before adding the SFLC patches we thought it best to run the
> > > > driver through some testing in order to verify that any problems
> > > > we generated where attributable to our work and not the base
> > > > driver.
> > >
> > > The master branch is by definition unstable at the moment i.e. it
> > > can sometimes (not often) contain unfinished changes. Use next for
> > > testing.  I update next when I consider the master contents "stable
> > > enough".
> > 
> > I noticed in the last day or so that you appeared to sync
> > jarkko-sgx/master with jarkko-sgx/next, so I checked out a local
> > branch against jarkko-sgx/next and ran it against our unit tests.
> > Based on what we are seeing the driver is still experiencing issues
> > with initialization of a non-trivial enclave.

> master branch is broken, looks like the VMA code Jarkko is reworking is
> buggy.  I should be able to help debug this next week.
> 
> [  504.149548] [ cut here ]
> [  504.149550] kernel BUG at /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!

Rodger, dodger.

Let us know when you think you have something working pushed up into
one of the branches and we will put it on the bench here in the lab
and see what our runtime is able to do with it.

BTW, your new vDSO work appears to be shaping up well.  Just out of
curiosity though, how are you testing and validating the new vDSO
based exception handler if it isn't possible to initialize and run an
enclave with the new driver?

We will look forward to hearing from you.

Have a good remainder of the weekend.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.   Specializing in information infra-structure
Fargo, ND  58102development.
PH: 701-281-1686
FAX: 701-281-3949   EMAIL: g...@enjellic.com
--
"Don't worry about people stealing your ideas.  If your ideas are any
 good, you'll have to ram them down people's throats."
-- Howard Aiken

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-14 Thread Sean Christopherson

On Fri, Dec 14, 2018 at 05:59:17PM -0600, Dr. Greg wrote:
> On Wed, Dec 12, 2018 at 08:00:36PM +0200, Jarkko Sakkinen wrote:
> 
> Good evening, I hope the week has gone well for everyone.
> 
> > On Mon, Dec 10, 2018 at 04:49:08AM -0600, Dr. Greg wrote:
> > > In the meantime, I wanted to confirm that your jarkko-sgx/master
> > > branch contains the proposed driver that is headed upstream.
> > > Before adding the SFLC patches we thought it best to run the
> > > driver through some testing in order to verify that any problems
> > > we generated where attributable to our work and not the base
> > > driver.
> >
> > The master branch is by definition unstable at the moment i.e. it
> > can sometimes (not often) contain unfinished changes. Use next for
> > testing.  I update next when I consider the master contents "stable
> > enough".
> 
> I noticed in the last day or so that you appeared to sync
> jarkko-sgx/master with jarkko-sgx/next, so I checked out a local
> branch against jarkko-sgx/next and ran it against our unit tests.
> Based on what we are seeing the driver is still experiencing issues
> with initialization of a non-trivial enclave.

master branch is broken, looks like the VMA code Jarkko is reworking is
buggy.  I should be able to help debug this next week.

[  504.149548] [ cut here ]
[  504.149550] kernel BUG at /home/sean/go/src/kernel.org/linux/mm/mmap.c:669!
[  504.150288] invalid opcode:  [#1] SMP
[  504.150614] CPU: 2 PID: 237 Comm: kworker/u20:2 Not tainted 4.20.0-rc2+ #267
[  504.151165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 
02/06/2015
[  504.151818] Workqueue: sgx-encl-wq sgx_encl_release_worker
[  504.152267] RIP: 0010:__vma_adjust+0x64a/0x820
[  504.152626] Code: ff 48 89 50 18 e9 6f fc ff ff 4c 8b ab 88 00 00 00 45 31 
e4 e9 61 fb ff ff 31 c0 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 49 
89 de 49 83 c6 20 0f 84 06 fe ff ff 49 8d 7e e0 e8 fe ee
[  504.154109] RSP: :c94ebd60 EFLAGS: 00010206
[  504.154535] RAX: 7fd92ef7e000 RBX: 888467af16c0 RCX: 888467af16e0
[  504.155104] RDX: 888458fd09e0 RSI: 7fd954021000 RDI: 88846bf9e798
[  504.155673] RBP: 888467af1480 R08: 88845bea2000 R09: 
[  504.156242] R10: 8000 R11: fefefefefefefeff R12: 
[  504.156810] R13: 88846bf9e790 R14: 888467af1b70 R15: 888467af1b60
[  504.157378] FS:  () GS:88846f70() 
knlGS:
[  504.158021] CS:  0010 DS:  ES:  CR0: 80050033
[  504.158483] CR2: 7f2c56e99000 CR3: 05009001 CR4: 00360ee0
[  504.159054] DR0:  DR1:  DR2: 
[  504.159623] DR3:  DR6: fffe0ff0 DR7: 0400
[  504.160193] Call Trace:
[  504.160406]  __split_vma+0x16f/0x180
[  504.160706]  ? __switch_to_asm+0x40/0x70
[  504.161024]  __do_munmap+0xfb/0x450
[  504.161308]  sgx_encl_release_worker+0x44/0x70
[  504.161675]  process_one_work+0x200/0x3f0
[  504.162004]  worker_thread+0x2d/0x3d0
[  504.162301]  ? process_one_work+0x3f0/0x3f0
[  504.162645]  kthread+0x113/0x130
[  504.162912]  ? kthread_park+0x90/0x90
[  504.163209]  ret_from_fork+0x35/0x40
[  504.163503] Modules linked in: bridge stp llc
[  504.163866] ---[ end trace 83076139fc25e3e0 ]---

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-14 Thread Dr. Greg

On Wed, Dec 12, 2018 at 08:00:36PM +0200, Jarkko Sakkinen wrote:

Good evening, I hope the week has gone well for everyone.

> On Mon, Dec 10, 2018 at 04:49:08AM -0600, Dr. Greg wrote:
> > In the meantime, I wanted to confirm that your jarkko-sgx/master
> > branch contains the proposed driver that is headed upstream.
> > Before adding the SFLC patches we thought it best to run the
> > driver through some testing in order to verify that any problems
> > we generated where attributable to our work and not the base
> > driver.
>
> The master branch is by definition unstable at the moment i.e. it
> can sometimes (not often) contain unfinished changes. Use next for
> testing.  I update next when I consider the master contents "stable
> enough".

I noticed in the last day or so that you appeared to sync
jarkko-sgx/master with jarkko-sgx/next, so I checked out a local
branch against jarkko-sgx/next and ran it against our unit tests.
Based on what we are seeing the driver is still experiencing issues
with initialization of a non-trivial enclave.

On the first test boot of the new kernel, the EINIT ioctl consistently
returned EBUSY over multiple invocations of the unit test.  This did
not appear to generate any negative issues with the kernel at large.

We rebooted the box to run the test against a fresh kernel load.  This
time around we experienced issues similar to what we had previously
described.  The EINIT ioctl generates a segmentation fault which seems
to largely incapacitate the kernel.

Here are the logs and first backtrace from the test:

---
Dec 14 13:25:06 nuc2 kernel: PGD 4f001067 P4D 4f001067 PUD 0 
Dec 14 13:25:06 nuc2 kernel: BUG: unable to handle kernel paging request at 
97bf3ae916fe
Dec 14 13:25:06 nuc2 kernel: Oops: 0002 [#1] SMP PTI
Dec 14 13:25:06 nuc2 kernel: CPU: 1 PID: 34 Comm: kworker/1:1 Not tainted 
4.20.0-rc2-sgx-nuc2+ #12
Dec 14 13:25:06 nuc2 kernel: Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, 
BIOS JYGLKCPX.86A.0046.2018.1103.1316 11/03/2018
Dec 14 13:25:06 nuc2 kernel: Workqueue: events cache_reap
Dec 14 13:25:06 nuc2 kernel: RIP: 0010:free_block+0xe3/0x182
Dec 14 13:25:06 nuc2 kernel: Code: 20 45 29 d4 41 d3 ec 0f b6 4f 1d 45 01 e2 41 
d3 ea 41 8b 49 30 ff c9 49 83 79 20 00 41 89 49 30 75 04 4d 89 59 20 4d 8b 59 
20 <45> 88 14 0b 49 8d 49 08 41 83 79 30 00 75 1a 4c 8b 50 28 49 89 4a
Dec 14 13:25:06 nuc2 kernel: RSP: 0018:b90800123db0 EFLAGS: 00210046
Dec 14 13:25:06 nuc2 kernel: RAX: 97be3b419080 RBX: 000f RCX: 
ff7e
Dec 14 13:25:06 nuc2 kernel: RDX: 0018 RSI: d907ffc82b70 RDI: 
97be3b44c200
Dec 14 13:25:06 nuc2 kernel: RBP: b90800123dd8 R08: b90800123e10 R09: 
f9b345eba440
Dec 14 13:25:06 nuc2 kernel: R10: 0051f663 R11: 97be3ae91780 R12: 
11ede5c3
Dec 14 13:25:06 nuc2 kernel: R13: 8000 R14: 97be3b419088 R15: 
97be3b4190a8
Dec 14 13:25:06 nuc2 kernel: FS:  () 
GS:97be3be8() knlGS:
Dec 14 13:25:06 nuc2 kernel: CS:  0010 DS:  ES:  CR0: 80050033
Dec 14 13:25:06 nuc2 kernel: CR2: 97bf3ae916fe CR3: 4ec0a000 CR4: 
00340ee0
Dec 14 13:25:06 nuc2 kernel: Call Trace:
Dec 14 13:25:06 nuc2 kernel:  drain_array_locked+0x50/0x75
Dec 14 13:25:06 nuc2 kernel:  drain_array.constprop.67+0x57/0x72
Dec 14 13:25:06 nuc2 kernel:  cache_reap+0x58/0x101
Dec 14 13:25:06 nuc2 kernel:  process_one_work+0x183/0x271
Dec 14 13:25:06 nuc2 kernel:  worker_thread+0x1e5/0x2b4
Dec 14 13:25:06 nuc2 kernel:  ? cancel_delayed_work_sync+0x10/0x10
Dec 14 13:25:06 nuc2 kernel:  kthread+0x116/0x11e
Dec 14 13:25:06 nuc2 kernel:  ? kthread_park+0x7e/0x7e
Dec 14 13:25:06 nuc2 kernel:  ret_from_fork+0x1f/0x40
Dec 14 13:25:06 nuc2 kernel: Modules linked in:
Dec 14 13:25:06 nuc2 kernel: CR2: 97bf3ae916fe
Dec 14 13:25:06 nuc2 kernel: ---[ end trace 7f5dc24edc7285b3 ]---
Dec 14 13:25:06 nuc2 kernel: RIP: 0010:free_block+0xe3/0x182
Dec 14 13:25:06 nuc2 kernel: Code: 20 45 29 d4 41 d3 ec 0f b6 4f 1d 45 01 e2 41 
d3 ea 41 8b 49 30 ff c9 49 83 79 20 00 41 89 49 30 75 04 4d 89 59 20 4d 8b 59 
20 <45> 88 14 0b 49 8d 49 08 41 83 79 30 00 75 1a 4c 8b 50 28 49 89 4a
Dec 14 13:25:06 nuc2 kernel: RSP: 0018:b90800123db0 EFLAGS: 00210046
Dec 14 13:25:06 nuc2 kernel: RAX: 97be3b419080 RBX: 000f RCX: 
ff7e
Dec 14 13:25:06 nuc2 kernel: RDX: 0018 RSI: d907ffc82b70 RDI: 
97be3b44c200
Dec 14 13:25:06 nuc2 kernel: RBP: b90800123dd8 R08: b90800123e10 R09: 
f9b345eba440
Dec 14 13:25:06 nuc2 kernel: R10: 0051f663 R11: 97be3ae91780 R12: 
11ede5c3
Dec 14 13:25:06 nuc2 kernel: R13: 8000 R14: 97be3b419088 R15: 
97be3b4190a8
Dec 14 13:25:06 nuc2 kernel: FS:  () 
GS:97be3be8() knlGS:
Dec 14 13:25:06 nuc2 kernel: CS:  0010 DS:  ES:

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-12 Thread Jarkko Sakkinen

On Mon, Dec 10, 2018 at 04:49:08AM -0600, Dr. Greg wrote:
> In the meantime, I wanted to confirm that your jarkko-sgx/master
> branch contains the proposed driver that is headed upstream.  Before
> adding the SFLC patches we thought it best to run the driver through
> some testing in order to verify that any problems we generated where
> attributable to our work and not the base driver.

The master branch is by definition unstable at the moment i.e. it can
sometimes (not often) contain unfinished changes. Use next for testing.
I update next when I consider the master contents "stable enough".

Thanks.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-10 Thread Dr. Greg

On Sun, Dec 09, 2018 at 06:01:32PM +0100, Pavel Machek wrote:

> Hi!

Good morning to everyone.

> > On Thu, Nov 15, 2018 at 5:08 PM Jarkko Sakkinen
> >  wrote:
> > >
> > > Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> > > can be used by applications to set aside private regions of code and
> > > data. The code outside the enclave is disallowed to access the memory
> > > inside the enclave by the CPU access control.
> > >
> > > SGX driver provides a ioctl API for loading and initializing enclaves.
> > > Address range for enclaves is reserved with mmap() and they are
> > > destroyed with munmap(). Enclave construction, measurement and
> > > initialization is done with the provided the ioctl API.
> > 
> > I brought this up a while back, and I think I should re-ask it now
> > that this driver is getting close to ready:
> > 
> > As it stands, there's just one SGX character device, and I imagine
> > that it'll be available to unprivileged applications.  I'm concerned
> > that this isn't quite what we want.  I certainly think that everyone,
> > or at least almost everyone, ought to be able to run normal
> > enclaves.

> I don't think nobody or postfix or guest should be running enclaves
> on my systems. First, I'd like to be able to debug my systems.
>
> Second, sgx quite complex and tricky. It may turn out to be secure
> in the end, but I'd not be surprised if we got few CVEs before we
> get there.
>
> Last, I'd hate to find out in few years that I can't switch to amd
> cpu because firefox now requires sgx.
>
> Just make it root-only or 660 by default. Users can get permission
> in similar way they get rights to audio..

I'm not sure that using root or group restricted access to a character
device is going to stop an ISV from embracing a technology, but that
is an alternate debate.

Relying on discretionary, or mandatory for that matter, access
controls is not consistent with the security architecture of SGX.  The
technology was designed to provide robustness in the face of
aggressors who may have compromised the operating system or hardware
platform.

The lingua franca of SGX security and access controls are MRSIGNER
values.  The SFLC patches that we will be making available, once we
are convinced the upstream driver is working, implement MRSIGNER based
security controls with an absolutely minimal TCB footprint in the
kernel.  This strategy allows the platform owner to use SGX compliant
and cryptographically enforced access controls.

Just as an aside, secondary to our perception that this technology and
what it can do is not widely understood, we are developing a 2-part
LWN article series on SGX and its implications for Linux.

>   Pavel

Best wishes for a good day and a productive week.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.   Specializing in information infra-structure
Fargo, ND  58102development.
PH: 701-281-1686
FAX: 701-281-3949   EMAIL: g...@enjellic.com
--
"... remember that innovation is saying 'no' to 1000 things."
-- Moxie Marlinspike

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-10 Thread Dr. Greg

On Wed, Nov 28, 2018 at 11:22:28AM -0800, Jarkko Sakkinen wrote:

Good morning, I hope everyone had a pleasant weekend.

> On Wed, Nov 28, 2018 at 04:49:41AM -0600, Dr. Greg wrote:
> > We've been carrying a patch, that drops in on top of the proposed
> > kernel driver, that implements the needed policy management framework
> > for DAC fragile (FLC) platforms.  After a meeting yesterday with the
> > client that is funding the work, a decision was made to release the
> > enhancements when the SGX driver goes mainline.  That will at least
> > give developers the option of creating solutions on Linux that
> > implement the security guarantees that SGX was designed to deliver.

> We do not need yet another policy management framework to the *kernel*.
>
> The token based approach that Andy is proposing is proven and well
> established method to create a mechanism. You can then create a
> daemon to user space that decides who it wants to send tokes.

I guess there will be plenty of time to argue about all of that.

In the meantime, I wanted to confirm that your jarkko-sgx/master
branch contains the proposed driver that is headed upstream.  Before
adding the SFLC patches we thought it best to run the driver through
some testing in order to verify that any problems we generated where
attributable to our work and not the base driver.

At the current time jarkko-sgx/master appears to be having difficulty
initializing the unit test enclave for our trusted runtime API
librarary.  Enclave creation and loading appear to work fine, things
go south after the EINIT ioctl is called on the loaded image.

We specifically isolated the regressions to occur secondary to the
EINIT ioctl being called.  We modified our sgx-load test utility to
pause with the image loaded, but not initialized.  We generated a fair
amount of system activity while the process was holding the enclave
image open and there were no issues.  The process was then allowed to
unmap the virtual memory image without calling EINIT and the system
was fine after that as well.

Symptoms vary, but in all cases appear to be linked to corruption of
the virtual memory infrastructure.  In all cases, the kernel ends up
at a point where any attempt to start a new process hangs and becomes
uninterruptible.  The full kernel failure does not appear to be
synchronous with when EINIT is called, which would support the notion
that something is going wrong with the VM management that is being
workqueue deferred.

This is with your MPX patch applied that corrects issues with the
wrong memory management context being acted upon by that system.  In
any event, the kernel configuration being used for testing does not
have MPX support even enabled.  Given that the changelog for the patch
is indicating the new driver is attempting something unique with
workqueue deferred VM management, it would seem possible that the
driver is tickling bad and possibly untested behavior elsewhere in the
kernel as well.

The enclave in question is not terribly sophisticated by the standards
of our other enclaves, but it is a non-trivial test of SGX
functionality.  It weighs in at about 156K and is generated and signed
in debug mode with version 1.4 compliant metadata.  Obviously it
initializes and runs fine with the out-of-tree driver.

We managed to capture two separate sets of error logs/backtraces that
are included below.  As I'm sure you know, without module support,
working on all of this is a bit painful as it requires the classic
edit-compile-link-boot-whimper procedure :-)

Given that the self-test committed to the kernel sources is a trivial
one page enclave and the proposed driver ABI is incompatible with the
released Intel Linux PSW/SDK, this may be the most challenging test
the driver has been put through.  Unless your PSW/SDK team is testing
the new driver behind the scenes.

Obviously let us know if jarkko-master/sgx is not where the action is
at or if you would like us to move forward with alternative testing.

Regression traces follow:

Event 1: ---
Dec  9 07:35:15 nuc2 kernel: general protection fault:  [#1] SMP PTI
Dec  9 07:35:15 nuc2 kernel: CPU: 1 PID: 1594 Comm: less Not tainted 
4.20.0-rc2-sgx-nuc2+ #11
Dec  9 07:35:15 nuc2 kernel: Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, 
BIOS JYGLKCPX.86A.0046.2018.1103.1316 11/03/2018
Dec  9 07:35:15 nuc2 kernel: RIP: 0010:unmap_vmas+0x3c/0x83
Dec  9 07:35:15 nuc2 kernel: Code: 49 89 cc 53 48 89 f3 4c 8b 6e 40 49 83 bd a0 
03 00 00 00 74 32 b9 01 00 00 00 4c 89 e2 4c 89 f6 4c 89 ef e8 db be 01 00 eb 
1d <4c> 39 23 73 1d 48 89 de 45 31 c0 4c 89 e1 4c 89 f2 4c 89 ff e8 cb
Dec  9 07:35:15 nuc2 kernel: RSP: 0018:9fd7404c7d90 EFLAGS: 00010282
Dec  9 07:35:15 nuc2 kernel: RAX: 0007755e RBX: 0f66fad412e0 RCX: 

Dec  9 07:35:15 nuc2 kernel: RDX: 8b66f9e42ee0 RSI: 8b66f9e42c00 RDI: 
9fd7404c7dc8
Dec  9 07:35:15 nuc2 kernel: RBP:

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-09 Thread Pavel Machek

Hi!

> On Thu, Nov 15, 2018 at 5:08 PM Jarkko Sakkinen
>  wrote:
> >
> > Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> > can be used by applications to set aside private regions of code and
> > data. The code outside the enclave is disallowed to access the memory
> > inside the enclave by the CPU access control.
> >
> > SGX driver provides a ioctl API for loading and initializing enclaves.
> > Address range for enclaves is reserved with mmap() and they are
> > destroyed with munmap(). Enclave construction, measurement and
> > initialization is done with the provided the ioctl API.
> >
> 
> I brought this up a while back, and I think I should re-ask it now
> that this driver is getting close to ready:
> 
> As it stands, there's just one SGX character device, and I imagine
> that it'll be available to unprivileged applications.  I'm concerned
> that this isn't quite what we want.  I certainly think that everyone,
> or at least almost everyone, ought to be able to run normal
> enclaves.

I don't think nobody or postfix or guest should be running enclaves on
my systems. First, I'd like to  be able to debug my systems.

Second, sgx quite complex and tricky. It may turn out to be secure in
the end, but I'd not be surprised if we got few CVEs before we get
there.

Last, I'd hate to find out in few years that I can't switch to amd
cpu because firefox now requires sgx.

Just make it root-only or 660 by default. Users can get permission in
similar way they get rights to audio..

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-09 Thread Pavel Machek

Hi!

> > There would be three types of users:
> >
> > 1. Ones that have access to neither of the devices.
> > 2. Ones that have access to unprivileged. Who are these?
> 
> Either 0666 (world) or an sgx group.

Sgx group, please. Or even better, what is generic term for sgx? We
probably want to use that, as sgx is likely trademarked and certainly
Intelism.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-28 Thread Andy Lutomirski

On Tue, Nov 27, 2018 at 12:55 AM Dr. Greg  wrote:
> Since the thread has become a bit divergent I wanted to note that we
> have offered a proposal for a general policy management framework
> based on MRSIGNER values.  This framework is consistent with the SGX
> security model, ie. cryptographic rather then DAC based policy
> controls.  This framework also allows a much more flexible policy
> implementation that doesn't result in combinatoric issues.

Can you give a concrete explanation of a problem that your proposal
would solve?  As far as I can tell, it gets rid of a case in which an
unprivileged attacker who can run enclaves but hasn't compromised the
kernel can learn the PPID and other SGX-related permanent platform
identifiers, but it does nothing to prevent the same attacker from
learning non-SGX-related permanent identifiers, nor does it prevent
the attacker from using the Intel quoting enclave (unless configured
in a surprising way) and thus attesting to a remote system.

So what problem does it solve?

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-28 Thread Andy Lutomirski

On Tue, Nov 27, 2018 at 12:55 AM Dr. Greg  wrote:
> Since the thread has become a bit divergent I wanted to note that we
> have offered a proposal for a general policy management framework
> based on MRSIGNER values.  This framework is consistent with the SGX
> security model, ie. cryptographic rather then DAC based policy
> controls.  This framework also allows a much more flexible policy
> implementation that doesn't result in combinatoric issues.

Can you give a concrete explanation of a problem that your proposal
would solve?  As far as I can tell, it gets rid of a case in which an
unprivileged attacker who can run enclaves but hasn't compromised the
kernel can learn the PPID and other SGX-related permanent platform
identifiers, but it does nothing to prevent the same attacker from
learning non-SGX-related permanent identifiers, nor does it prevent
the attacker from using the Intel quoting enclave (unless configured
in a surprising way) and thus attesting to a remote system.

So what problem does it solve?

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-28 Thread Jarkko Sakkinen

On Wed, Nov 28, 2018 at 04:49:41AM -0600, Dr. Greg wrote:
> We've been carrying a patch, that drops in on top of the proposed
> kernel driver, that implements the needed policy management framework
> for DAC fragile (FLC) platforms.  After a meeting yesterday with the
> client that is funding the work, a decision was made to release the
> enhancements when the SGX driver goes mainline.  That will at least
> give developers the option of creating solutions on Linux that
> implement the security guarantees that SGX was designed to deliver.

We do not need yet another policy management framework to the *kernel*.

The token based approach that Andy is proposing is proven and well
established method to create a mechanism. You can then create a daemon
to user space that decides who it wants to send tokes.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-28 Thread Jarkko Sakkinen

On Wed, Nov 28, 2018 at 04:49:41AM -0600, Dr. Greg wrote:
> We've been carrying a patch, that drops in on top of the proposed
> kernel driver, that implements the needed policy management framework
> for DAC fragile (FLC) platforms.  After a meeting yesterday with the
> client that is funding the work, a decision was made to release the
> enhancements when the SGX driver goes mainline.  That will at least
> give developers the option of creating solutions on Linux that
> implement the security guarantees that SGX was designed to deliver.

We do not need yet another policy management framework to the *kernel*.

The token based approach that Andy is proposing is proven and well
established method to create a mechanism. You can then create a daemon
to user space that decides who it wants to send tokes.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-28 Thread Dr. Greg

On Tue, Nov 27, 2018 at 09:55:45AM -0800, Andy Lutomirski wrote:

> > On Nov 27, 2018, at 8:41 AM, Jarkko Sakkinen 
> >  wrote:
> > 
> >> On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
> >> Since the thread has become a bit divergent I wanted to note that we
> >> have offered a proposal for a general policy management framework
> >> based on MRSIGNER values.  This framework is consistent with the SGX
> >> security model, ie. cryptographic rather then DAC based policy
> >> controls.  This framework also allows a much more flexible policy
> >> implementation that doesn't result in combinatoric issues.
> >> 
> >> Our framework also allows the preservation of the current ABI which
> >> allows an EINITTOKEN to be passed in from userspace.  The framework
> >> also supports the ability to specify that only a kernel based launch
> >> enclave (LE) should be available if the platform owner or distribution
> >> should desire to implement such a model.
> >> 
> >> The policy management framework is straight forward.  Three linked
> >> lists or their equivalent which are populated through /sysfs
> >> pseudo-files or equivalent plumbing.  Each list is populated with
> >> MRSIGNER values for signing keys that are allowed to initialize
> >> enclaves under three separate conditions.
> >> 
> >> 1.) General enclaves without special attribute bits.
> >> 
> >> 2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
> >> 'Provisioning Enclaves'.
> >> 
> >> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
> >> Enclaves'.
> >> 
> >> An all-null MRSIGNER value serves as a 'sealing' value that locks a
> >> list from any further modifications.
> >> 
> >> This architecture allows platform policies to be specified and then
> >> sealed at early boot by the root user.  At that point cryptographic
> >> policy controls are in place rather then DAC based controls, the
> >> latter of which have perpetual security liabilities in addition to the
> >> useability constraints inherent in a DAC or device node model.
> >> 
> >> We have developed an independent implementation of the PSW and
> >> arguably have as much experience with issues surrounding how to
> >> interact with the device driver as anyone.  We have spent a lot of
> >> time thinking about these issues and the above framework provides the
> >> most flexible architecture available.
> > 
> > Sounds like a lot bloat and policy added to the kernel whereas with
> > Andy's proposal you can implement logic to a daemon and provide only
> > mechanism to do it.

> Well, almost. We'd need SGX_IOC_FREEZE_MR{ENCLAVE,SIGNER} or
> similar.  Or maybe the daemon could handle the entire loading process.
> But this can wait until after the main driver is upstream.
>
> This does lead to a question: enclaves are kind-of-sort-of mapped
> into a given address space. What happens if you issue the various
> ioctls in the context of a different mm?  For that matter, can two
> processes mmap the same enclave?

Fascinating.

We've been carrying a patch, that drops in on top of the proposed
kernel driver, that implements the needed policy management framework
for DAC fragile (FLC) platforms.  After a meeting yesterday with the
client that is funding the work, a decision was made to release the
enhancements when the SGX driver goes mainline.  That will at least
give developers the option of creating solutions on Linux that
implement the security guarantees that SGX was designed to deliver.

Most importantly, since it implements a driver consistent with the
design of SGX, it has the added benefit of allowing system
administrators the ability to enable the driver to work on non-FLC
(locked) platforms.  Since Jarkko confirmed that FLC is the option of
platform vendors, this would seem to be important as SGX on Linux will
only work in a random fashion dependent on the whims of hardware OEM's
in probably a SKU dependent fashion.  Which is why the client has
interest in the work.

Best wishes for a productive remainder of the week.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.   Specializing in information infra-structure
Fargo, ND  58102development.
PH: 701-281-1686
FAX: 701-281-3949   EMAIL: g...@enjellic.com
--
"Five year projections, are you kidding me.  We don't know what we are
 supposed to be doing at the 4 o'clock meeting this afternoon."
-- Terry Wieland
   Resurrection

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-28 Thread Dr. Greg

On Tue, Nov 27, 2018 at 09:55:45AM -0800, Andy Lutomirski wrote:

> > On Nov 27, 2018, at 8:41 AM, Jarkko Sakkinen 
> >  wrote:
> > 
> >> On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
> >> Since the thread has become a bit divergent I wanted to note that we
> >> have offered a proposal for a general policy management framework
> >> based on MRSIGNER values.  This framework is consistent with the SGX
> >> security model, ie. cryptographic rather then DAC based policy
> >> controls.  This framework also allows a much more flexible policy
> >> implementation that doesn't result in combinatoric issues.
> >> 
> >> Our framework also allows the preservation of the current ABI which
> >> allows an EINITTOKEN to be passed in from userspace.  The framework
> >> also supports the ability to specify that only a kernel based launch
> >> enclave (LE) should be available if the platform owner or distribution
> >> should desire to implement such a model.
> >> 
> >> The policy management framework is straight forward.  Three linked
> >> lists or their equivalent which are populated through /sysfs
> >> pseudo-files or equivalent plumbing.  Each list is populated with
> >> MRSIGNER values for signing keys that are allowed to initialize
> >> enclaves under three separate conditions.
> >> 
> >> 1.) General enclaves without special attribute bits.
> >> 
> >> 2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
> >> 'Provisioning Enclaves'.
> >> 
> >> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
> >> Enclaves'.
> >> 
> >> An all-null MRSIGNER value serves as a 'sealing' value that locks a
> >> list from any further modifications.
> >> 
> >> This architecture allows platform policies to be specified and then
> >> sealed at early boot by the root user.  At that point cryptographic
> >> policy controls are in place rather then DAC based controls, the
> >> latter of which have perpetual security liabilities in addition to the
> >> useability constraints inherent in a DAC or device node model.
> >> 
> >> We have developed an independent implementation of the PSW and
> >> arguably have as much experience with issues surrounding how to
> >> interact with the device driver as anyone.  We have spent a lot of
> >> time thinking about these issues and the above framework provides the
> >> most flexible architecture available.
> > 
> > Sounds like a lot bloat and policy added to the kernel whereas with
> > Andy's proposal you can implement logic to a daemon and provide only
> > mechanism to do it.

> Well, almost. We'd need SGX_IOC_FREEZE_MR{ENCLAVE,SIGNER} or
> similar.  Or maybe the daemon could handle the entire loading process.
> But this can wait until after the main driver is upstream.
>
> This does lead to a question: enclaves are kind-of-sort-of mapped
> into a given address space. What happens if you issue the various
> ioctls in the context of a different mm?  For that matter, can two
> processes mmap the same enclave?

Fascinating.

We've been carrying a patch, that drops in on top of the proposed
kernel driver, that implements the needed policy management framework
for DAC fragile (FLC) platforms.  After a meeting yesterday with the
client that is funding the work, a decision was made to release the
enhancements when the SGX driver goes mainline.  That will at least
give developers the option of creating solutions on Linux that
implement the security guarantees that SGX was designed to deliver.

Most importantly, since it implements a driver consistent with the
design of SGX, it has the added benefit of allowing system
administrators the ability to enable the driver to work on non-FLC
(locked) platforms.  Since Jarkko confirmed that FLC is the option of
platform vendors, this would seem to be important as SGX on Linux will
only work in a random fashion dependent on the whims of hardware OEM's
in probably a SKU dependent fashion.  Which is why the client has
interest in the work.

Best wishes for a productive remainder of the week.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.   Specializing in information infra-structure
Fargo, ND  58102development.
PH: 701-281-1686
FAX: 701-281-3949   EMAIL: g...@enjellic.com
--
"Five year projections, are you kidding me.  We don't know what we are
 supposed to be doing at the 4 o'clock meeting this afternoon."
-- Terry Wieland
   Resurrection

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Jarkko Sakkinen

On Sat, Nov 24, 2018 at 08:45:34AM -0800, Jarkko Sakkinen wrote:
> On Fri, Nov 23, 2018 at 04:39:23AM -0600, Dr. Greg wrote:
> > Jarkko, when this driver lands it will set the SGX ABI in stone for
> > Linux.  It would be very, very helpful to the development community if
> > there was some official guidance from Intel on whether or not FLC will
> > be a universal feature on all hardware and the date that is going to
> > happen or has happened.
> 
> I seriously don't know but I can take this message to the mothership...

LC enabling is essentially a platform vendors choice, not Intels choice,
like many other CPU features that Linux is dependent on. Of course, if
Linux ends supporting only LC that will without doubt have a big impact
on vendors so in that way it is indirectly also communitys choice.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Jarkko Sakkinen

On Sat, Nov 24, 2018 at 08:45:34AM -0800, Jarkko Sakkinen wrote:
> On Fri, Nov 23, 2018 at 04:39:23AM -0600, Dr. Greg wrote:
> > Jarkko, when this driver lands it will set the SGX ABI in stone for
> > Linux.  It would be very, very helpful to the development community if
> > there was some official guidance from Intel on whether or not FLC will
> > be a universal feature on all hardware and the date that is going to
> > happen or has happened.
> 
> I seriously don't know but I can take this message to the mothership...

LC enabling is essentially a platform vendors choice, not Intels choice,
like many other CPU features that Linux is dependent on. Of course, if
Linux ends supporting only LC that will without doubt have a big impact
on vendors so in that way it is indirectly also communitys choice.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Andy Lutomirski




> On Nov 27, 2018, at 8:41 AM, Jarkko Sakkinen 
>  wrote:
> 
>> On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
>> Since the thread has become a bit divergent I wanted to note that we
>> have offered a proposal for a general policy management framework
>> based on MRSIGNER values.  This framework is consistent with the SGX
>> security model, ie. cryptographic rather then DAC based policy
>> controls.  This framework also allows a much more flexible policy
>> implementation that doesn't result in combinatoric issues.
>> 
>> Our framework also allows the preservation of the current ABI which
>> allows an EINITTOKEN to be passed in from userspace.  The framework
>> also supports the ability to specify that only a kernel based launch
>> enclave (LE) should be available if the platform owner or distribution
>> should desire to implement such a model.
>> 
>> The policy management framework is straight forward.  Three linked
>> lists or their equivalent which are populated through /sysfs
>> pseudo-files or equivalent plumbing.  Each list is populated with
>> MRSIGNER values for signing keys that are allowed to initialize
>> enclaves under three separate conditions.
>> 
>> 1.) General enclaves without special attribute bits.
>> 
>> 2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
>> 'Provisioning Enclaves'.
>> 
>> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
>> Enclaves'.
>> 
>> An all-null MRSIGNER value serves as a 'sealing' value that locks a
>> list from any further modifications.
>> 
>> This architecture allows platform policies to be specified and then
>> sealed at early boot by the root user.  At that point cryptographic
>> policy controls are in place rather then DAC based controls, the
>> latter of which have perpetual security liabilities in addition to the
>> useability constraints inherent in a DAC or device node model.
>> 
>> We have developed an independent implementation of the PSW and
>> arguably have as much experience with issues surrounding how to
>> interact with the device driver as anyone.  We have spent a lot of
>> time thinking about these issues and the above framework provides the
>> most flexible architecture available.
> 
> Sounds like a lot bloat and policy added to the kernel whereas with
> Andy's proposal you can implement logic to a daemon and provide only
> mechanism to do it.
> 
> 

Well, almost. We’d need SGX_IOC_FREEZE_MR{ENCLAVE,SIGNER} or similar.  Or maybe 
the daemon could handle the entire loading process.  But this can wait until 
after the main driver is upstream.

This does lead to a question: enclaves are kind-of-sort-of mapped into a given 
address space. What happens if you issue the various ioctls in the context of a 
different mm?  For that matter, can two processes mmap the same enclave?

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Andy Lutomirski




> On Nov 27, 2018, at 8:41 AM, Jarkko Sakkinen 
>  wrote:
> 
>> On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
>> Since the thread has become a bit divergent I wanted to note that we
>> have offered a proposal for a general policy management framework
>> based on MRSIGNER values.  This framework is consistent with the SGX
>> security model, ie. cryptographic rather then DAC based policy
>> controls.  This framework also allows a much more flexible policy
>> implementation that doesn't result in combinatoric issues.
>> 
>> Our framework also allows the preservation of the current ABI which
>> allows an EINITTOKEN to be passed in from userspace.  The framework
>> also supports the ability to specify that only a kernel based launch
>> enclave (LE) should be available if the platform owner or distribution
>> should desire to implement such a model.
>> 
>> The policy management framework is straight forward.  Three linked
>> lists or their equivalent which are populated through /sysfs
>> pseudo-files or equivalent plumbing.  Each list is populated with
>> MRSIGNER values for signing keys that are allowed to initialize
>> enclaves under three separate conditions.
>> 
>> 1.) General enclaves without special attribute bits.
>> 
>> 2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
>> 'Provisioning Enclaves'.
>> 
>> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
>> Enclaves'.
>> 
>> An all-null MRSIGNER value serves as a 'sealing' value that locks a
>> list from any further modifications.
>> 
>> This architecture allows platform policies to be specified and then
>> sealed at early boot by the root user.  At that point cryptographic
>> policy controls are in place rather then DAC based controls, the
>> latter of which have perpetual security liabilities in addition to the
>> useability constraints inherent in a DAC or device node model.
>> 
>> We have developed an independent implementation of the PSW and
>> arguably have as much experience with issues surrounding how to
>> interact with the device driver as anyone.  We have spent a lot of
>> time thinking about these issues and the above framework provides the
>> most flexible architecture available.
> 
> Sounds like a lot bloat and policy added to the kernel whereas with
> Andy's proposal you can implement logic to a daemon and provide only
> mechanism to do it.
> 
> 

Well, almost. We’d need SGX_IOC_FREEZE_MR{ENCLAVE,SIGNER} or similar.  Or maybe 
the daemon could handle the entire loading process.  But this can wait until 
after the main driver is upstream.

This does lead to a question: enclaves are kind-of-sort-of mapped into a given 
address space. What happens if you issue the various ioctls in the context of a 
different mm?  For that matter, can two processes mmap the same enclave?

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Jarkko Sakkinen

On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
> Enclaves'.

Kernel does not have to manage this. If the MSRs are read-only, they
should match your LE. If the MSRs writable, you don't need an LE.

This whole scheme sounds like adding own SELinux for SGX and it is
only words. No code available.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Jarkko Sakkinen

On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
> Enclaves'.

Kernel does not have to manage this. If the MSRs are read-only, they
should match your LE. If the MSRs writable, you don't need an LE.

This whole scheme sounds like adding own SELinux for SGX and it is
only words. No code available.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Jarkko Sakkinen

On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
> Since the thread has become a bit divergent I wanted to note that we
> have offered a proposal for a general policy management framework
> based on MRSIGNER values.  This framework is consistent with the SGX
> security model, ie. cryptographic rather then DAC based policy
> controls.  This framework also allows a much more flexible policy
> implementation that doesn't result in combinatoric issues.
> 
> Our framework also allows the preservation of the current ABI which
> allows an EINITTOKEN to be passed in from userspace.  The framework
> also supports the ability to specify that only a kernel based launch
> enclave (LE) should be available if the platform owner or distribution
> should desire to implement such a model.
> 
> The policy management framework is straight forward.  Three linked
> lists or their equivalent which are populated through /sysfs
> pseudo-files or equivalent plumbing.  Each list is populated with
> MRSIGNER values for signing keys that are allowed to initialize
> enclaves under three separate conditions.
> 
> 1.) General enclaves without special attribute bits.
> 
> 2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
> 'Provisioning Enclaves'.
> 
> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
> Enclaves'.
> 
> An all-null MRSIGNER value serves as a 'sealing' value that locks a
> list from any further modifications.
> 
> This architecture allows platform policies to be specified and then
> sealed at early boot by the root user.  At that point cryptographic
> policy controls are in place rather then DAC based controls, the
> latter of which have perpetual security liabilities in addition to the
> useability constraints inherent in a DAC or device node model.
> 
> We have developed an independent implementation of the PSW and
> arguably have as much experience with issues surrounding how to
> interact with the device driver as anyone.  We have spent a lot of
> time thinking about these issues and the above framework provides the
> most flexible architecture available.

Sounds like a lot bloat and policy added to the kernel whereas with
Andy's proposal you can implement logic to a daemon and provide only
mechanism to do it.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Jarkko Sakkinen

On Tue, Nov 27, 2018 at 02:55:33AM -0600, Dr. Greg wrote:
> Since the thread has become a bit divergent I wanted to note that we
> have offered a proposal for a general policy management framework
> based on MRSIGNER values.  This framework is consistent with the SGX
> security model, ie. cryptographic rather then DAC based policy
> controls.  This framework also allows a much more flexible policy
> implementation that doesn't result in combinatoric issues.
> 
> Our framework also allows the preservation of the current ABI which
> allows an EINITTOKEN to be passed in from userspace.  The framework
> also supports the ability to specify that only a kernel based launch
> enclave (LE) should be available if the platform owner or distribution
> should desire to implement such a model.
> 
> The policy management framework is straight forward.  Three linked
> lists or their equivalent which are populated through /sysfs
> pseudo-files or equivalent plumbing.  Each list is populated with
> MRSIGNER values for signing keys that are allowed to initialize
> enclaves under three separate conditions.
> 
> 1.) General enclaves without special attribute bits.
> 
> 2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
> 'Provisioning Enclaves'.
> 
> 3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
> Enclaves'.
> 
> An all-null MRSIGNER value serves as a 'sealing' value that locks a
> list from any further modifications.
> 
> This architecture allows platform policies to be specified and then
> sealed at early boot by the root user.  At that point cryptographic
> policy controls are in place rather then DAC based controls, the
> latter of which have perpetual security liabilities in addition to the
> useability constraints inherent in a DAC or device node model.
> 
> We have developed an independent implementation of the PSW and
> arguably have as much experience with issues surrounding how to
> interact with the device driver as anyone.  We have spent a lot of
> time thinking about these issues and the above framework provides the
> most flexible architecture available.

Sounds like a lot bloat and policy added to the kernel whereas with
Andy's proposal you can implement logic to a daemon and provide only
mechanism to do it.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Dr. Greg

On Mon, Nov 26, 2018 at 03:04:36PM -0800, Jarkko Sakkinen wrote:

Good morning to everyone.

> On Mon, Nov 26, 2018 at 01:51:45PM -0800, Jarkko Sakkinen wrote:
> > > ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
> > > 
> > > This requires extra syscalls, but it doesn???t have the combinatorial
> > > explosion problem.
> > 
> > I like this design because it is extendable. I'm now also in the same
> > page why we need to protect provisioning in the first place. I would
> > slight restructure this as
> > 
> > /dev/sgx/control
> > /dev/sgx/attributes/provision

> I guess it would be OK to upstream only control node first as long
> as provision attribute is denied in order to keep the already huge
> patch set a tiny bit smaller?

At this point in time I believe there is a consensus that the driver
needs a policy management framework of some type for an optimum
implementation.  The PROVISION attribute has privacy implications and
unrestricted access to release mode (full security) is problematic.

Since the thread has become a bit divergent I wanted to note that we
have offered a proposal for a general policy management framework
based on MRSIGNER values.  This framework is consistent with the SGX
security model, ie. cryptographic rather then DAC based policy
controls.  This framework also allows a much more flexible policy
implementation that doesn't result in combinatoric issues.

Our framework also allows the preservation of the current ABI which
allows an EINITTOKEN to be passed in from userspace.  The framework
also supports the ability to specify that only a kernel based launch
enclave (LE) should be available if the platform owner or distribution
should desire to implement such a model.

The policy management framework is straight forward.  Three linked
lists or their equivalent which are populated through /sysfs
pseudo-files or equivalent plumbing.  Each list is populated with
MRSIGNER values for signing keys that are allowed to initialize
enclaves under three separate conditions.

1.) General enclaves without special attribute bits.

2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
'Provisioning Enclaves'.

3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
Enclaves'.

An all-null MRSIGNER value serves as a 'sealing' value that locks a
list from any further modifications.

This architecture allows platform policies to be specified and then
sealed at early boot by the root user.  At that point cryptographic
policy controls are in place rather then DAC based controls, the
latter of which have perpetual security liabilities in addition to the
useability constraints inherent in a DAC or device node model.

We have developed an independent implementation of the PSW and
arguably have as much experience with issues surrounding how to
interact with the device driver as anyone.  We have spent a lot of
time thinking about these issues and the above framework provides the
most flexible architecture available.

> /Jarkko

We would be happy to discuss specific aspects of the implementation.

Have a good day.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.   Specializing in information infra-structure
Fargo, ND  58102development.
PH: 701-281-1686
FAX: 701-281-3949   EMAIL: g...@enjellic.com
--
"Remember that when you take down the fishhouse you can't put
 the minnows back into the lake, so throw them out on the ice.
 Make sure you stomp on any of the live ones so they don't suffer."
-- Fritz Wettstein
   At the lake

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-27 Thread Dr. Greg

On Mon, Nov 26, 2018 at 03:04:36PM -0800, Jarkko Sakkinen wrote:

Good morning to everyone.

> On Mon, Nov 26, 2018 at 01:51:45PM -0800, Jarkko Sakkinen wrote:
> > > ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
> > > 
> > > This requires extra syscalls, but it doesn???t have the combinatorial
> > > explosion problem.
> > 
> > I like this design because it is extendable. I'm now also in the same
> > page why we need to protect provisioning in the first place. I would
> > slight restructure this as
> > 
> > /dev/sgx/control
> > /dev/sgx/attributes/provision

> I guess it would be OK to upstream only control node first as long
> as provision attribute is denied in order to keep the already huge
> patch set a tiny bit smaller?

At this point in time I believe there is a consensus that the driver
needs a policy management framework of some type for an optimum
implementation.  The PROVISION attribute has privacy implications and
unrestricted access to release mode (full security) is problematic.

Since the thread has become a bit divergent I wanted to note that we
have offered a proposal for a general policy management framework
based on MRSIGNER values.  This framework is consistent with the SGX
security model, ie. cryptographic rather then DAC based policy
controls.  This framework also allows a much more flexible policy
implementation that doesn't result in combinatoric issues.

Our framework also allows the preservation of the current ABI which
allows an EINITTOKEN to be passed in from userspace.  The framework
also supports the ability to specify that only a kernel based launch
enclave (LE) should be available if the platform owner or distribution
should desire to implement such a model.

The policy management framework is straight forward.  Three linked
lists or their equivalent which are populated through /sysfs
pseudo-files or equivalent plumbing.  Each list is populated with
MRSIGNER values for signing keys that are allowed to initialize
enclaves under three separate conditions.

1.) General enclaves without special attribute bits.

2.) Enclaves with the SGX_FLAGS_PROVISION_KEY attribute set. - i.e.,
'Provisioning Enclaves'.

3.) Enclaves with the SGX_FLAGS_LICENSE_KEY attribute set - i.e., 'Launch
Enclaves'.

An all-null MRSIGNER value serves as a 'sealing' value that locks a
list from any further modifications.

This architecture allows platform policies to be specified and then
sealed at early boot by the root user.  At that point cryptographic
policy controls are in place rather then DAC based controls, the
latter of which have perpetual security liabilities in addition to the
useability constraints inherent in a DAC or device node model.

We have developed an independent implementation of the PSW and
arguably have as much experience with issues surrounding how to
interact with the device driver as anyone.  We have spent a lot of
time thinking about these issues and the above framework provides the
most flexible architecture available.

> /Jarkko

We would be happy to discuss specific aspects of the implementation.

Have a good day.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.   Specializing in information infra-structure
Fargo, ND  58102development.
PH: 701-281-1686
FAX: 701-281-3949   EMAIL: g...@enjellic.com
--
"Remember that when you take down the fishhouse you can't put
 the minnows back into the lake, so throw them out on the ice.
 Make sure you stomp on any of the live ones so they don't suffer."
-- Fritz Wettstein
   At the lake

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Mon, Nov 26, 2018 at 01:51:45PM -0800, Jarkko Sakkinen wrote:
> > ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
> > 
> > This requires extra syscalls, but it doesn’t have the combinatorial
> > explosion problem.
> 
> I like this design because it is extendable. I'm now also in the same
> page why we need to protect provisioning in the first place. I would
> slight restructure this as
> 
> /dev/sgx/control
> /dev/sgx/attributes/provision

I guess it would be OK to upstream only control node first as long as
provision attribute is denied in order to keep the already huge patch
set a tiny bit smaller?

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Mon, Nov 26, 2018 at 01:51:45PM -0800, Jarkko Sakkinen wrote:
> > ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
> > 
> > This requires extra syscalls, but it doesn’t have the combinatorial
> > explosion problem.
> 
> I like this design because it is extendable. I'm now also in the same
> page why we need to protect provisioning in the first place. I would
> slight restructure this as
> 
> /dev/sgx/control
> /dev/sgx/attributes/provision

I guess it would be OK to upstream only control node first as long as
provision attribute is denied in order to keep the already huge patch
set a tiny bit smaller?

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Mon, Nov 26, 2018 at 05:00:39AM -0600, Dr. Greg wrote:
> We will be interested in your comments as to why the proposal is
> insufficient in the real world of FLC.
> 
> I believe the proposed architecture can be defended as being effective
> in the real world, as it allows the root user to use cryptographic
> protections of access to the PROVISION bit and to enclave execution in
> general.  On FLC that is the strongest guarantee that can be
> delivered.
> 
> When we speak of 'unauthorized' users I believe we are speaking in the
> parlance of discretionary access controls which has a much wider TCB
> scope then the cryptographic model we are proposing.  The model we
> propose allows the platform owner (root) to effectively implement the
> same level of security over the PROVISION bit that current locked
> platforms have, in a free and open fashion of course.
> 
> We can certainly attempt to explain our position further.

I think kernel controlled provision would in all cases lower the
mitigations of thread scenarios (at least what you've presented so far)
assuming that a compromissed kernel could be detected fairly quickly,
wouldn't it?

Even without SGX, having a compromissed kernel, you can anyhow stealth
your malware in many ways.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Mon, Nov 26, 2018 at 05:00:39AM -0600, Dr. Greg wrote:
> We will be interested in your comments as to why the proposal is
> insufficient in the real world of FLC.
> 
> I believe the proposed architecture can be defended as being effective
> in the real world, as it allows the root user to use cryptographic
> protections of access to the PROVISION bit and to enclave execution in
> general.  On FLC that is the strongest guarantee that can be
> delivered.
> 
> When we speak of 'unauthorized' users I believe we are speaking in the
> parlance of discretionary access controls which has a much wider TCB
> scope then the cryptographic model we are proposing.  The model we
> propose allows the platform owner (root) to effectively implement the
> same level of security over the PROVISION bit that current locked
> platforms have, in a free and open fashion of course.
> 
> We can certainly attempt to explain our position further.

I think kernel controlled provision would in all cases lower the
mitigations of thread scenarios (at least what you've presented so far)
assuming that a compromissed kernel could be detected fairly quickly,
wouldn't it?

Even without SGX, having a compromissed kernel, you can anyhow stealth
your malware in many ways.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Sun, Nov 25, 2018 at 08:22:35AM -0800, Andy Lutomirski wrote:
> Agreed. What I’m proposing adds additional security if the kernel is
> *not* compromised.

And even if the kernel is compromised evil use will detected quicker
i.e. compromissed kernel is "better" than a kernel that allows to
use provisioning freely.

> There are other ways to accomplish it that might be better in some
> respects.  For example, there could be /dev/sgx and
> /dev/sgx_rights/provision.  The former exposes the whole sgx API,
> except that it doesn’t allow provisioning by default. The latter does
> nothing by itself. To run a provisioning enclave, you open both nodes,
> then do something like:
> 
> ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
> 
> This requires extra syscalls, but it doesn’t have the combinatorial
> explosion problem.

I like this design because it is extendable. I'm now also in the same
page why we need to protect provisioning in the first place. I would
slight restructure this as

/dev/sgx/control
/dev/sgx/attributes/provision

Looks cleaner and the root /dev directory is less polluted.

BTW, off-topic from this but should we remove ENCLAVE from IOC names as
they all concern enclaves anyway? Seems kind of redundant. I.e.

SGX_IOC_ENCLAVE_CREATE -> SGX_IOC_CREATE
SGX_IOC_ENCLAVE_ADD_PAGE -> SGX_IOC_ADD_PAGE
SGX_IOC_ENCLAVE_INIT -> SGX_IOC_INIT

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Sun, Nov 25, 2018 at 08:22:35AM -0800, Andy Lutomirski wrote:
> Agreed. What I’m proposing adds additional security if the kernel is
> *not* compromised.

And even if the kernel is compromised evil use will detected quicker
i.e. compromissed kernel is "better" than a kernel that allows to
use provisioning freely.

> There are other ways to accomplish it that might be better in some
> respects.  For example, there could be /dev/sgx and
> /dev/sgx_rights/provision.  The former exposes the whole sgx API,
> except that it doesn’t allow provisioning by default. The latter does
> nothing by itself. To run a provisioning enclave, you open both nodes,
> then do something like:
> 
> ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
> 
> This requires extra syscalls, but it doesn’t have the combinatorial
> explosion problem.

I like this design because it is extendable. I'm now also in the same
page why we need to protect provisioning in the first place. I would
slight restructure this as

/dev/sgx/control
/dev/sgx/attributes/provision

Looks cleaner and the root /dev directory is less polluted.

BTW, off-topic from this but should we remove ENCLAVE from IOC names as
they all concern enclaves anyway? Seems kind of redundant. I.e.

SGX_IOC_ENCLAVE_CREATE -> SGX_IOC_CREATE
SGX_IOC_ENCLAVE_ADD_PAGE -> SGX_IOC_ADD_PAGE
SGX_IOC_ENCLAVE_INIT -> SGX_IOC_INIT

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Sat, Nov 24, 2018 at 02:13:18PM -0600, Dr. Greg wrote:
> This isn't about an enclave being able to tell that it is really an
> enclave.  As I noted in my previous reply, access to the provisioning
> bit allows an enclave author to create a perpetual hardware identifier
> for a platform based on a signing key of their choosing, along with a
> few other incidentals, all of which are completely under the control
> of the enclave author.

I think I'm now in the same page with the issue now. Thanks for the
patience explaining this.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Sat, Nov 24, 2018 at 02:13:18PM -0600, Dr. Greg wrote:
> This isn't about an enclave being able to tell that it is really an
> enclave.  As I noted in my previous reply, access to the provisioning
> bit allows an enclave author to create a perpetual hardware identifier
> for a platform based on a signing key of their choosing, along with a
> few other incidentals, all of which are completely under the control
> of the enclave author.

I think I'm now in the same page with the issue now. Thanks for the
patience explaining this.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Sat, Nov 24, 2018 at 01:24:54PM -0600, Dr. Greg wrote:
> On Sat, Nov 24, 2018 at 08:15:21AM -0800, Jarkko Sakkinen wrote:
> 
> > On Tue, Nov 20, 2018 at 05:15:08AM -0600, Dr. Greg wrote:
> > > Malware would not necessarily need the Intel attestation service.
> > > Once access to the PROVISION bit is available, malware teams could
> > > simply build their own attestation service.
> 
> > AFAIK not possible as they wouldn't have access to the root
> > provisioning key. Can be confirmed from the SDM's key derivation
> > table (41-56).
> 
> What provisioning and attestation is all about is establishing an
> identity binding for a platform in question.  The standard Intel
> service binds the identity of a platform to an EPID private key.
> 
> With access to the SGX_FLAGS_PROVISION_BIT an enclave can generate a
> perpetual identity for a platform based on the identity modulus
> signature (MRSIGNER) of the key that signs the signature structure of
> the enclave.  Without access to the root provisioning key a security
> quorum or group has to be implemented via a subscription or enrollment
> model but that is arguably not much of an obstacle.
> 
> That is pretty much the way standard botware works now.
> 
> Without provisions for cryptographically secure authorization and
> policy enforcement in the driver, we will be creating infrastructure
> for a new generation of botware/malware whose mothership will know
> that a participating platform is running with full confidentiality and
> integrity protections.

OK, I think I got what you mean.

With free access to the provision the bot net controller could be sure
that a node is running inside an enclave. Is this what you are worried
about? Please correct if not or even if there is a slight drift on what
you are trying to state.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Jarkko Sakkinen

On Sat, Nov 24, 2018 at 01:24:54PM -0600, Dr. Greg wrote:
> On Sat, Nov 24, 2018 at 08:15:21AM -0800, Jarkko Sakkinen wrote:
> 
> > On Tue, Nov 20, 2018 at 05:15:08AM -0600, Dr. Greg wrote:
> > > Malware would not necessarily need the Intel attestation service.
> > > Once access to the PROVISION bit is available, malware teams could
> > > simply build their own attestation service.
> 
> > AFAIK not possible as they wouldn't have access to the root
> > provisioning key. Can be confirmed from the SDM's key derivation
> > table (41-56).
> 
> What provisioning and attestation is all about is establishing an
> identity binding for a platform in question.  The standard Intel
> service binds the identity of a platform to an EPID private key.
> 
> With access to the SGX_FLAGS_PROVISION_BIT an enclave can generate a
> perpetual identity for a platform based on the identity modulus
> signature (MRSIGNER) of the key that signs the signature structure of
> the enclave.  Without access to the root provisioning key a security
> quorum or group has to be implemented via a subscription or enrollment
> model but that is arguably not much of an obstacle.
> 
> That is pretty much the way standard botware works now.
> 
> Without provisions for cryptographically secure authorization and
> policy enforcement in the driver, we will be creating infrastructure
> for a new generation of botware/malware whose mothership will know
> that a participating platform is running with full confidentiality and
> integrity protections.

OK, I think I got what you mean.

With free access to the provision the bot net controller could be sure
that a node is running inside an enclave. Is this what you are worried
about? Please correct if not or even if there is a slight drift on what
you are trying to state.

/Jarkko

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Andy Lutomirski

On Mon, Nov 26, 2018 at 3:00 AM Dr. Greg  wrote:
>
> On Sun, Nov 25, 2018 at 04:37:00PM -0800, Andy Lutomirski wrote:
>
> > Bah, I hit send on a partially written draft. I???ll try again soon.
> >
> > --Andy
>
> Your first issue seems to be complete so I will respond to that in
> order to make sure we are not talking past one another.

It wasn't, but your answer is enlightening!  I've read the SGX
*manual*, but I hadn't dug through the actual Intel-supplied enclaves.
So, when I said that the LE isn't an important part of the overall
trust model, I meant that it isn't *in hardware*.  It's certainly
possible to write SGX software that weakens the security of the
overall system, and Intel seems to have done so:

>
> The Intel supplied launch enclave (LE) specifically denies
> initialization of enclaves which have the PROVISION attribute set,
> with the exception of enclaves whose MRSIGNER values match those of
> the keys that Intel uses to sign the PCE and PVE enclaves.  See line
> 219 of psw/ae/le/launch_enclave.cpp in the Intel SDK.

This seems entirely reasonable.  (But see below...)

> This report, along with the PEK, is submitted to the PCE enclave in
> order to generate the Platform Provisioning IDentity (PPID), which is
> the privacy critical identifier.  The PCE verifies the report
> generated by the PVE and rejects the request to generate the PPID if
> the report was generated by an enclave that was not initialized with
> the PROVISION bit set, see line 109 of psw/ae/pce/pce.cpp.

...and this seems entirely unreasonable.  Your description does indeed
appear consistent with the code: the PCE will hand out the PPID to any
requesting enclave that has the PROVISION bit set, so you are correct
that:

> Once again, the important design factor in all of this is the premise
> that the launch enclave will not allow enclaves other then the PCE and
> PVE to access the PROVISION bit.

But here's where the whole thing goes off the rails.  I would argue
that the Intel-supplied (and Intel-signed, apparently!) PCE is just
straight-up buggy.  What Intel is *trying* to do is to hand out the
PPID to an appropriately signed enclave.  What they actually did is to
hand out the PPID to any enclave that has the PROVISION bit set.  This
is poor design because it overload the PROVISION bit.  That bit is
supposed to mean "may use EGETKEY to obtain provisioning and
provisioning seal keys", which is not actually what Intel wants here.
It's also poor design without FLC because it pointlessly relies on the
LE to enforce a restriction on the use of provisioning enclaves, when
the code could instead have checked MRSIGNER..  And it's just straight
up wrong with FLC because there is no guarantee whatsoever that the LE
being used is Intel's.  And, for that matter, there is no guarantee
that the requesting enclave doesn't have the DEBUG bit set.

(It's also poor design because the PCE doesn't appear to verify that
the report passed in is actually intended to be associated with a call
to get_ppid().  There appear to be reports associated with provision
"msg1" and "msg3".  If it's possible to get a valid report for msg3 to
be accepted as a msg1 report of vice versa, then it might be game
over.)

Sorry, but this is not Linux's problem.  The right fix is, in my
opinion, entirely clear: the PCE should check MRSIGNER and possibly
even MRENCLAVE in the report it receives.  Intel needs to fix their
PCE, sign a fixed version, and find some way to revoke, to the extent
possible, the old one.  And the SGX enclave authors need to understand
something that is apparently subtle: THE LAUNCH POLICY IS NOT PART OF
THE TCB AND SHOULD NOT BE RELIED UPON.  Enclaves can and should be
written to remain secure even in the complete absence of any form of
launch control.

I went and filed a bug on github.  Let's see what happens:

https://github.com/intel/linux-sgx/issues/345

Also, the whole SGX report mechanism seems to be misused in the SDK.
An SGX report is a cryptographic primitive that essentially acts like
a signed blob.  Building a secure protocol on top of signed messages
or on top of reports takes more than just making up ad hoc blob
formats and signing them.  There needs to be domain separation between
different messages, and this seems to be entirely missing.  Do you
know if Intel has had a serious audit of their platform enclave code
done?

>
> > No, I have no clue why Intel did it this way.  I consider it to be a
> > mistake.
>
> Are you referring to there being a mistake in the trust relationships
> that the provisioning protocol implements or the overall concept of a
> provisioning key?

I'm referring to the hardware's policy as to when keys that don't
depend on OWNEREPOCH can be obtained.  As far as I know, the only real
need for such keys is to verify that the running platform is a real
Intel platform, which means that access to the provisioning key is
only useful to Intel-approved services.  Why didn't Intel enforce this
in hardware or

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Andy Lutomirski

On Mon, Nov 26, 2018 at 3:00 AM Dr. Greg  wrote:
>
> On Sun, Nov 25, 2018 at 04:37:00PM -0800, Andy Lutomirski wrote:
>
> > Bah, I hit send on a partially written draft. I???ll try again soon.
> >
> > --Andy
>
> Your first issue seems to be complete so I will respond to that in
> order to make sure we are not talking past one another.

It wasn't, but your answer is enlightening!  I've read the SGX
*manual*, but I hadn't dug through the actual Intel-supplied enclaves.
So, when I said that the LE isn't an important part of the overall
trust model, I meant that it isn't *in hardware*.  It's certainly
possible to write SGX software that weakens the security of the
overall system, and Intel seems to have done so:

>
> The Intel supplied launch enclave (LE) specifically denies
> initialization of enclaves which have the PROVISION attribute set,
> with the exception of enclaves whose MRSIGNER values match those of
> the keys that Intel uses to sign the PCE and PVE enclaves.  See line
> 219 of psw/ae/le/launch_enclave.cpp in the Intel SDK.

This seems entirely reasonable.  (But see below...)

> This report, along with the PEK, is submitted to the PCE enclave in
> order to generate the Platform Provisioning IDentity (PPID), which is
> the privacy critical identifier.  The PCE verifies the report
> generated by the PVE and rejects the request to generate the PPID if
> the report was generated by an enclave that was not initialized with
> the PROVISION bit set, see line 109 of psw/ae/pce/pce.cpp.

...and this seems entirely unreasonable.  Your description does indeed
appear consistent with the code: the PCE will hand out the PPID to any
requesting enclave that has the PROVISION bit set, so you are correct
that:

> Once again, the important design factor in all of this is the premise
> that the launch enclave will not allow enclaves other then the PCE and
> PVE to access the PROVISION bit.

But here's where the whole thing goes off the rails.  I would argue
that the Intel-supplied (and Intel-signed, apparently!) PCE is just
straight-up buggy.  What Intel is *trying* to do is to hand out the
PPID to an appropriately signed enclave.  What they actually did is to
hand out the PPID to any enclave that has the PROVISION bit set.  This
is poor design because it overload the PROVISION bit.  That bit is
supposed to mean "may use EGETKEY to obtain provisioning and
provisioning seal keys", which is not actually what Intel wants here.
It's also poor design without FLC because it pointlessly relies on the
LE to enforce a restriction on the use of provisioning enclaves, when
the code could instead have checked MRSIGNER..  And it's just straight
up wrong with FLC because there is no guarantee whatsoever that the LE
being used is Intel's.  And, for that matter, there is no guarantee
that the requesting enclave doesn't have the DEBUG bit set.

(It's also poor design because the PCE doesn't appear to verify that
the report passed in is actually intended to be associated with a call
to get_ppid().  There appear to be reports associated with provision
"msg1" and "msg3".  If it's possible to get a valid report for msg3 to
be accepted as a msg1 report of vice versa, then it might be game
over.)

Sorry, but this is not Linux's problem.  The right fix is, in my
opinion, entirely clear: the PCE should check MRSIGNER and possibly
even MRENCLAVE in the report it receives.  Intel needs to fix their
PCE, sign a fixed version, and find some way to revoke, to the extent
possible, the old one.  And the SGX enclave authors need to understand
something that is apparently subtle: THE LAUNCH POLICY IS NOT PART OF
THE TCB AND SHOULD NOT BE RELIED UPON.  Enclaves can and should be
written to remain secure even in the complete absence of any form of
launch control.

I went and filed a bug on github.  Let's see what happens:

https://github.com/intel/linux-sgx/issues/345

Also, the whole SGX report mechanism seems to be misused in the SDK.
An SGX report is a cryptographic primitive that essentially acts like
a signed blob.  Building a secure protocol on top of signed messages
or on top of reports takes more than just making up ad hoc blob
formats and signing them.  There needs to be domain separation between
different messages, and this seems to be entirely missing.  Do you
know if Intel has had a serious audit of their platform enclave code
done?

>
> > No, I have no clue why Intel did it this way.  I consider it to be a
> > mistake.
>
> Are you referring to there being a mistake in the trust relationships
> that the provisioning protocol implements or the overall concept of a
> provisioning key?

I'm referring to the hardware's policy as to when keys that don't
depend on OWNEREPOCH can be obtained.  As far as I know, the only real
need for such keys is to verify that the running platform is a real
Intel platform, which means that access to the provisioning key is
only useful to Intel-approved services.  Why didn't Intel enforce this
in hardware or

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Dr. Greg

On Sun, Nov 25, 2018 at 04:37:00PM -0800, Andy Lutomirski wrote:

> Bah, I hit send on a partially written draft. I???ll try again soon.
> 
> --Andy

Your first issue seems to be complete so I will respond to that in
order to make sure we are not talking past one another.

> > On Nov 25, 2018, at 1:59 PM, Andy Lutomirski  wrote:
> >> On Nov 25, 2018, at 10:55 AM, Dr. Greg  wrote:
> >> 
> >> The notion of a launch enclave is critical to establishing these
> >> guarantees.  As soon as the kernel becomes involved in implementing
> >> SGX security policy the architecture becomes vulnerable to kernel
> >> and/or privilege modification attacks.
> >> 
> >> We've talked at length about the provisioning bit, I won't go into
> >> details unless people are interested, but the EPID provisioning
> >> protocol implements an SGX mediated cryptographic contract that a
> >> perpetual platform identifier will not be disclosed to anyone but
> >> Intel.  

> As a reviewer, and as an occasional academic cryptographer, I need
> to put my foot down here.  This is inaccurate.

I certainly wouldn't try to engage in a debate at the level of
academic cryptography, so I want to clarify that I was speaking
specifically with respect to the ability to use the Intel supplied
Platform Certification Enclave (PCE) to obtain a perpetual platform
identifier.

There could certainly be an academic level weakness in SGX, or in the
provisioning protocol, but at the end of the day the important issue
seems to be whether or not a PCE enclave can be exploited by anyone
with execution access to the enclave to generate a perpetual
identifier for a platform.

> There is an SGX-mediated contract that says:
>
> 1. For any given public key p, a perpetual platform identifier ID_p
> exists and will only be disclosed to the holder of the corresponding
> private key p_priv or to someone to whom the private key holder
> permits (intentionally or otherwise) to use that identifier.
>
> 2. The ability described in #1 is available to anyone whom the
> kernel and launch enclave (if the MSRs are locked ) permits
> (intentionally or otherwise) to use it.

Let me see if I can respond directly to point two as it seems the most
important.

In the EPID provisioning model, the PCE and the ProVisioning Enclave
(PVE) both have posession of the PROVISION attribute and thus access
to the derivation of an MRSIGNER specific provisioning key.

The Intel supplied launch enclave (LE) specifically denies
initialization of enclaves which have the PROVISION attribute set,
with the exception of enclaves whose MRSIGNER values match those of
the keys that Intel uses to sign the PCE and PVE enclaves.  See line
219 of psw/ae/le/launch_enclave.cpp in the Intel SDK.

In the message one phase of the provisioning protocol Intel supplies
the platform with a 3K RSA public key (PEK).  The identity of that key
is confirmed by an ECC256 based signature over the key.  Intel embeds
the public gx and gy curve points for the signature key in their SDK,
see ae/data/constants/linux/peksk_pub.hh.

The PVE verifies the signature of the PEK and generates a SHA256
verification hash over a message containing the key and uses that as
the data field in an attestation report that is generated against the
target information for the PCE.  The PVE rejects generation of the
report if the PCE target information does not have the PROVISION
attribute set.  See line 124 of psw/ae/pve/provision_msg1.cpp.

This report, along with the PEK, is submitted to the PCE enclave in
order to generate the Platform Provisioning IDentity (PPID), which is
the privacy critical identifier.  The PCE verifies the report
generated by the PVE and rejects the request to generate the PPID if
the report was generated by an enclave that was not initialized with
the PROVISION bit set, see line 109 of psw/ae/pce/pce.cpp.

The PCE enclave then recomputes the message hash using the PEK that it
is provided and verifies that the value matches the value in the data
field of the attestation report from the PVE enclave.  If the values
do not match the PPID generation request is denied.  The PCE enclave
then encrypts the PPID with the PEK key and the encrypted PPID is
returned to Intel to use as the platform identifier.

The PPID is the CMAC over a 16 byte null message which uses a derived
provisioning key based on CPUSVN and ISVSVN values set to zero.  See
the get_ppid() function in psw/ae/pce/pce_helper.cpp.

I believe this effectively denies the ability of anyone other then
Intel, who holds the private portion of the ECC256 signature key used
to authenticate the PEK, from using the PCE enclave to generate a
platform identifier.

As I conceded above, there could be an academic deficiency in all
this, I'm not qualified to comment.  I believe there is a reasonably
solid functional guarantee that on a locked platform the process
cannot be easily subverted by a privacy aggressor.

We contend that the model we propose below can also deliver this
guarantee

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-26 Thread Dr. Greg

On Sun, Nov 25, 2018 at 04:37:00PM -0800, Andy Lutomirski wrote:

> Bah, I hit send on a partially written draft. I???ll try again soon.
> 
> --Andy

Your first issue seems to be complete so I will respond to that in
order to make sure we are not talking past one another.

> > On Nov 25, 2018, at 1:59 PM, Andy Lutomirski  wrote:
> >> On Nov 25, 2018, at 10:55 AM, Dr. Greg  wrote:
> >> 
> >> The notion of a launch enclave is critical to establishing these
> >> guarantees.  As soon as the kernel becomes involved in implementing
> >> SGX security policy the architecture becomes vulnerable to kernel
> >> and/or privilege modification attacks.
> >> 
> >> We've talked at length about the provisioning bit, I won't go into
> >> details unless people are interested, but the EPID provisioning
> >> protocol implements an SGX mediated cryptographic contract that a
> >> perpetual platform identifier will not be disclosed to anyone but
> >> Intel.  

> As a reviewer, and as an occasional academic cryptographer, I need
> to put my foot down here.  This is inaccurate.

I certainly wouldn't try to engage in a debate at the level of
academic cryptography, so I want to clarify that I was speaking
specifically with respect to the ability to use the Intel supplied
Platform Certification Enclave (PCE) to obtain a perpetual platform
identifier.

There could certainly be an academic level weakness in SGX, or in the
provisioning protocol, but at the end of the day the important issue
seems to be whether or not a PCE enclave can be exploited by anyone
with execution access to the enclave to generate a perpetual
identifier for a platform.

> There is an SGX-mediated contract that says:
>
> 1. For any given public key p, a perpetual platform identifier ID_p
> exists and will only be disclosed to the holder of the corresponding
> private key p_priv or to someone to whom the private key holder
> permits (intentionally or otherwise) to use that identifier.
>
> 2. The ability described in #1 is available to anyone whom the
> kernel and launch enclave (if the MSRs are locked ) permits
> (intentionally or otherwise) to use it.

Let me see if I can respond directly to point two as it seems the most
important.

In the EPID provisioning model, the PCE and the ProVisioning Enclave
(PVE) both have posession of the PROVISION attribute and thus access
to the derivation of an MRSIGNER specific provisioning key.

The Intel supplied launch enclave (LE) specifically denies
initialization of enclaves which have the PROVISION attribute set,
with the exception of enclaves whose MRSIGNER values match those of
the keys that Intel uses to sign the PCE and PVE enclaves.  See line
219 of psw/ae/le/launch_enclave.cpp in the Intel SDK.

In the message one phase of the provisioning protocol Intel supplies
the platform with a 3K RSA public key (PEK).  The identity of that key
is confirmed by an ECC256 based signature over the key.  Intel embeds
the public gx and gy curve points for the signature key in their SDK,
see ae/data/constants/linux/peksk_pub.hh.

The PVE verifies the signature of the PEK and generates a SHA256
verification hash over a message containing the key and uses that as
the data field in an attestation report that is generated against the
target information for the PCE.  The PVE rejects generation of the
report if the PCE target information does not have the PROVISION
attribute set.  See line 124 of psw/ae/pve/provision_msg1.cpp.

This report, along with the PEK, is submitted to the PCE enclave in
order to generate the Platform Provisioning IDentity (PPID), which is
the privacy critical identifier.  The PCE verifies the report
generated by the PVE and rejects the request to generate the PPID if
the report was generated by an enclave that was not initialized with
the PROVISION bit set, see line 109 of psw/ae/pce/pce.cpp.

The PCE enclave then recomputes the message hash using the PEK that it
is provided and verifies that the value matches the value in the data
field of the attestation report from the PVE enclave.  If the values
do not match the PPID generation request is denied.  The PCE enclave
then encrypts the PPID with the PEK key and the encrypted PPID is
returned to Intel to use as the platform identifier.

The PPID is the CMAC over a 16 byte null message which uses a derived
provisioning key based on CPUSVN and ISVSVN values set to zero.  See
the get_ppid() function in psw/ae/pce/pce_helper.cpp.

I believe this effectively denies the ability of anyone other then
Intel, who holds the private portion of the ECC256 signature key used
to authenticate the PEK, from using the PCE enclave to generate a
platform identifier.

As I conceded above, there could be an academic deficiency in all
this, I'm not qualified to comment.  I believe there is a reasonably
solid functional guarantee that on a locked platform the process
cannot be easily subverted by a privacy aggressor.

We contend that the model we propose below can also deliver this
guarantee

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-25 Thread Andy Lutomirski

Bah, I hit send on a partially written draft. I’ll try again soon.

--Andy

> On Nov 25, 2018, at 1:59 PM, Andy Lutomirski  wrote:
> 
> 
> 
>> On Nov 25, 2018, at 10:55 AM, Dr. Greg  wrote:
>> 
> 
>> 
>> 
>> The notion of a launch enclave is critical to establishing these
>> guarantees.  As soon as the kernel becomes involved in implementing
>> SGX security policy the architecture becomes vulnerable to kernel
>> and/or privilege modification attacks.
>> 
>> We've talked at length about the provisioning bit, I won't go into
>> details unless people are interested, but the EPID provisioning
>> protocol implements an SGX mediated cryptographic contract that a
>> perpetual platform identifier will not be disclosed to anyone but
>> Intel.  
> 
> As a reviewer, and as an occasional academic cryptographer, I need to put my 
> foot down here.  This is inaccurate.
> 
> There is an SGX-mediated contract that says:
> 
> 1. For any given public key p, a perpetual platform identifier ID_p exists 
> and will only be disclosed to the holder of the corresponding private key 
> p_priv or to someone to whom the private key holder permits (intentionally or 
> otherwise) to use that identifier.
> 
> 2. The ability described in #1 is available to anyone whom the kernel and 
> launch enclave (if the MSRs are locked ) permits (intentionally or otherwise) 
> to use it.
> 
> No, I have no clue why Intel did it this way.  I consider it to be a mistake.
> 
>> The launch enclave is critical to that guarantee.
>> 
>> It is completely understandable why a locked down, (non-FLC) hardware
>> platform, is undesirable in this community.  That doesn't mean that a
>> launch enclave as a concept is unneeded or necessarily evil.
>> 
>> In an FLC environment the kernel assumes responsibility for SGX
>> privacy and security.  This means that we need to do at least as well
>> with a kernel based model as to what is currently available.
>> 
>>> There are other ways to accomplish it that might be better in some
>>> respects.  For example, there could be /dev/sgx and
>>> /dev/sgx_rights/provision.  The former exposes the whole sgx API,
>>> except that it doesn???t allow provisioning by default. The latter
>>> does nothing by itself. To run a provisioning enclave, you open both
>>> nodes, then do something like:
>>> 
>>> ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
>>> 
>>> This requires extra syscalls, but it doesn't have the combinatorial
>>> explosion problem.
>> 
>> Here is a proposal for the driver to add the needed policy control
>> that is 'SGXy' in nature.  The 'SGXy' way is to use MRSIGNER values as
>> the currency for security policy management.
>> 
>> The driver should establish the equivalent of three linked lists,
>> maintainable via /sysfs pseudo-files or equivalent plumbing.  The
>> lists are referenced by the kernel to enforce the following policies.
>> 
>> 1.) The right to initialize an enclave without special attributes.
>> 
>> 2.) The right to initialize an enclave with the PROVISION_KEY attribute set.
>> 
>> 3.) The right to initialize an enclave with the LICENSE_KEY attribute set.
>> 
>> The lists are populated with MRSIGNER values of enclaves that are
>> allowed to initialize under the specified conditions.
> 
> NAK because this is insufficient.  You’re thinking of a model in which 
> SGX-like protection is all that’s needed.  This is an inadequate model of the 
> real world.  The attack I’m most concerned about wrt provisioning is an 
> attack in which an unauthorized user is permitted 
> 
> The use case I see for attestation *privacy*
> 
>> 
>> The driver should either establish a 'seal' file or value,
>> ie. MRSIGNER value of all zero's, that once written will not allow
>> further modifications of the list(s).  This will allow
>> cryptographically guaranteed policies to be setup at early boot that
>> will limit the ability of subsequent DAC compromises to affect policy
>> management.
>> 
>> The lists are obviously vulnerable to a kernel compromise but the
>> vulnerability scope is significantly limited vs. 'can I get root or
>> some other userid'.  If we are really concerned about the scope of
>> that vulnerability there could be an option on TPM based systems to
>> verify a hash value over the lists once sealed on each enclave
>> initialization.  We have already conceded that EINIT isn't going to be
>> any type of speed daemon.
>> 
>> On an FLC system the driver verifies that the submitted enclave has an
>> MRSIGNER value on one of the lists consistent with the attributes of
>> the enclave before loading the value into the identity modulus
>> signature registers.
>> 
>> In this model, I would argue that the driver does not need to
>> arbitrarily exclude launch enclaves as it does now, since the kernel
>> has the ability to specify acceptable launch enclaves.  The driver API
>> can alaso continue to accept an EINITTOKEN which maintains
>> compatibility with the current ABI.  Punishment can be inflicted on
>> non-FLC

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-11-25 Thread Andy Lutomirski

Bah, I hit send on a partially written draft. I’ll try again soon.

--Andy

> On Nov 25, 2018, at 1:59 PM, Andy Lutomirski  wrote:
> 
> 
> 
>> On Nov 25, 2018, at 10:55 AM, Dr. Greg  wrote:
>> 
> 
>> 
>> 
>> The notion of a launch enclave is critical to establishing these
>> guarantees.  As soon as the kernel becomes involved in implementing
>> SGX security policy the architecture becomes vulnerable to kernel
>> and/or privilege modification attacks.
>> 
>> We've talked at length about the provisioning bit, I won't go into
>> details unless people are interested, but the EPID provisioning
>> protocol implements an SGX mediated cryptographic contract that a
>> perpetual platform identifier will not be disclosed to anyone but
>> Intel.  
> 
> As a reviewer, and as an occasional academic cryptographer, I need to put my 
> foot down here.  This is inaccurate.
> 
> There is an SGX-mediated contract that says:
> 
> 1. For any given public key p, a perpetual platform identifier ID_p exists 
> and will only be disclosed to the holder of the corresponding private key 
> p_priv or to someone to whom the private key holder permits (intentionally or 
> otherwise) to use that identifier.
> 
> 2. The ability described in #1 is available to anyone whom the kernel and 
> launch enclave (if the MSRs are locked ) permits (intentionally or otherwise) 
> to use it.
> 
> No, I have no clue why Intel did it this way.  I consider it to be a mistake.
> 
>> The launch enclave is critical to that guarantee.
>> 
>> It is completely understandable why a locked down, (non-FLC) hardware
>> platform, is undesirable in this community.  That doesn't mean that a
>> launch enclave as a concept is unneeded or necessarily evil.
>> 
>> In an FLC environment the kernel assumes responsibility for SGX
>> privacy and security.  This means that we need to do at least as well
>> with a kernel based model as to what is currently available.
>> 
>>> There are other ways to accomplish it that might be better in some
>>> respects.  For example, there could be /dev/sgx and
>>> /dev/sgx_rights/provision.  The former exposes the whole sgx API,
>>> except that it doesn???t allow provisioning by default. The latter
>>> does nothing by itself. To run a provisioning enclave, you open both
>>> nodes, then do something like:
>>> 
>>> ioctl(sgx, SGX_IOC_ADD_RIGHT, sgx_provisioning);
>>> 
>>> This requires extra syscalls, but it doesn't have the combinatorial
>>> explosion problem.
>> 
>> Here is a proposal for the driver to add the needed policy control
>> that is 'SGXy' in nature.  The 'SGXy' way is to use MRSIGNER values as
>> the currency for security policy management.
>> 
>> The driver should establish the equivalent of three linked lists,
>> maintainable via /sysfs pseudo-files or equivalent plumbing.  The
>> lists are referenced by the kernel to enforce the following policies.
>> 
>> 1.) The right to initialize an enclave without special attributes.
>> 
>> 2.) The right to initialize an enclave with the PROVISION_KEY attribute set.
>> 
>> 3.) The right to initialize an enclave with the LICENSE_KEY attribute set.
>> 
>> The lists are populated with MRSIGNER values of enclaves that are
>> allowed to initialize under the specified conditions.
> 
> NAK because this is insufficient.  You’re thinking of a model in which 
> SGX-like protection is all that’s needed.  This is an inadequate model of the 
> real world.  The attack I’m most concerned about wrt provisioning is an 
> attack in which an unauthorized user is permitted 
> 
> The use case I see for attestation *privacy*
> 
>> 
>> The driver should either establish a 'seal' file or value,
>> ie. MRSIGNER value of all zero's, that once written will not allow
>> further modifications of the list(s).  This will allow
>> cryptographically guaranteed policies to be setup at early boot that
>> will limit the ability of subsequent DAC compromises to affect policy
>> management.
>> 
>> The lists are obviously vulnerable to a kernel compromise but the
>> vulnerability scope is significantly limited vs. 'can I get root or
>> some other userid'.  If we are really concerned about the scope of
>> that vulnerability there could be an option on TPM based systems to
>> verify a hash value over the lists once sealed on each enclave
>> initialization.  We have already conceded that EINIT isn't going to be
>> any type of speed daemon.
>> 
>> On an FLC system the driver verifies that the submitted enclave has an
>> MRSIGNER value on one of the lists consistent with the attributes of
>> the enclave before loading the value into the identity modulus
>> signature registers.
>> 
>> In this model, I would argue that the driver does not need to
>> arbitrarily exclude launch enclaves as it does now, since the kernel
>> has the ability to specify acceptable launch enclaves.  The driver API
>> can alaso continue to accept an EINITTOKEN which maintains
>> compatibility with the current ABI.  Punishment can be inflicted on
>> non-FLC

1 2 >

1 - 100 of 152 matches

Mail list logo