On Thu, 2023-10-26 at 09:01 -0700, Chatre, Reinette wrote:
>
> On 10/25/2023 4:58 PM, Huang, Kai wrote:
> > On Wed, 2023-10-25 at 07:31 -0700, Hansen, Dave wrote:
> > > On 10/19/23 19:53, Haitao Huang wrote:
> > > > In the EAUG on page fault path, VM_FAULT_OOM is returned when the
> > > > Enclave Page Cache (EPC) runs out. This may trigger unneeded OOM kill
> > > > that will not free any EPCs. Return VM_FAULT_SIGBUS instead.
>
> This commit message does not seem accurate to me. From what I can tell
> VM_FAULT_SIGBUS is indeed returned when EPC runs out. What is addressed
> with this patch is the error returned when kernel (not EPC) memory runs
> out.
>
> > > So, when picking an error code and we look the documentation for the
> > > bits, we see:
> > >
> > > > * @VM_FAULT_OOM: Out Of Memory
> > > > * @VM_FAULT_SIGBUS: Bad access
> > >
> > > So if anything we'll need a bit more changelog where you explain how
> > > running out of enclave memory is more "Bad access" than "Out Of Memory".
> > > Because on the surface this patch looks wrong.
> > >
> > > But that's just a naming thing. What *behavior* is bad here? With the
> > > old code, what happens? With the new code, what happens? Why is the
> > > old better than the new?
> >
> > I think Haitao meant if we return OOM, the core-MM fault handler will believe
> > the fault couldn't be handled because of running out of memory, and then it
> > could invoke the OOM killer which might select an unrelated victim who might
> > have no EPC at all.
>
> Since the issue is that system is out of kernel memory the resolution may need
> to look further than owners with EPC memory.

Oh right, I didn't look into the sgx_encl_page_alloc():

	encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
	if (!encl_page)
		return ERR_PTR(-ENOMEM);

>
> ...
>
> >
> > (Also, currently the non-EAUG code path (ELDU) in sgx_vma_fault() also returns
> > SIGBUS if it fails to allocate EPC, so making EAUG code path return SIGBUS also
> > matches the ELDU path.)
> >
>
> These errors all seem related to EPC memory to me, not kernel memory.

Right.
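
To make the distinction concrete, here is a rough userspace sketch of how I
read the split discussed above: a failed kernel allocation for the page
metadata maps to OOM, while a failed EPC allocation maps to SIGBUS. The
MOCK_* values and mock_eaug_fault() are made up for illustration and are not
kernel definitions; this is an assumption-laden model, not the actual fault
handler.

```c
#include <assert.h>
#include <errno.h>

/* Mock fault codes for illustration only; not the kernel's definitions. */
enum mock_vm_fault {
	MOCK_VM_FAULT_OOM    = 0x1,
	MOCK_VM_FAULT_SIGBUS = 0x2,
};

/*
 * Hypothetical model of the EAUG fault path: which resource ran out
 * decides the fault code. Both arguments are 0 or a negative errno.
 */
static int mock_eaug_fault(int page_alloc_err, int epc_alloc_err)
{
	assert(page_alloc_err <= 0 && epc_alloc_err <= 0);

	/* kzalloc() of the page metadata failed: kernel memory is short. */
	if (page_alloc_err)
		return MOCK_VM_FAULT_OOM;

	/* No EPC page available: an OOM kill is not guaranteed to free EPC. */
	if (epc_alloc_err)
		return MOCK_VM_FAULT_SIGBUS;

	/* Fault handled. */
	return 0;
}
```

With this model, mock_eaug_fault(-ENOMEM, 0) yields the OOM code and
mock_eaug_fault(0, -ENOMEM) yields the SIGBUS code, which is the split
Reinette describes.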