On 09/26/2018 01:44 PM, Sean Christopherson wrote:
> On Wed, Sep 26, 2018 at 01:16:59PM -0700, Dave Hansen wrote:
>> We also need to clarify how this can happen.  Is it through something
>> that an app does, or is it solely when the hardware does something under
>> the covers, like suspend/resume?
> 
> Are you looking for something in the changelog, the comment, or just
> a response?  If it's the latter...

Comments, please.

> On bare metal with a bug-free kernel, the only scenario I'm aware of
> where we'll encounter these faults is when hardware pulls the rug out
> from under us.  In a virtualized environment all bets are off because
> the architecture allows VMMs to silently "destroy" the EPC at will,
> e.g. KVM, and I believe Hyper-V, will take advantage of this behavior
> to support live migration.  Post migration, the destination system
> will generate PF_SGX because the EPC{M} can't be migrated between
> systems, i.e. the destination EPCM sees all EPC pages as invalid.

OK, cool.

That's good background fodder for the changelog.

But, for the comment, I'm happy with something like this:

        /*
         * The fault resulted from violation of SGX-specific access-
         * controls.  This is expected to be the result of some lower
         * layer action (CPU suspend/resume, VM migration) and is
         * not related to anything the OS did.  Treat it as an access
         * error to ensure it is passed up to the app via a signal where
         * it can be handled.
         */

I really don't think we need to delve too deeply into the relationship
between EPCM and PTEs or anything.  Let's just say, "it's not the
kernel's fault, it's not the app's fault, so throw up our hands".
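
To make the placement concrete, here is a minimal userspace sketch of the
check that comment would sit on.  It is not the actual patch: the helper
name and the PF_SGX_SKETCH macro are made up for illustration, it assumes
the SGX bit is bit 15 of the page fault error code, and all the other
checks a real access_error()-style helper performs are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: SGX access-control faults set bit 15 of the error code. */
#define PF_SGX_SKETCH (UINT32_C(1) << 15)

/* Hypothetical stand-in for an access_error()-style helper. */
static bool access_error_sketch(uint32_t error_code)
{
	/*
	 * The fault came from a lower layer (CPU suspend/resume, VM
	 * migration), not from anything the OS did.  Report it as an
	 * access error so it reaches the app as a signal.
	 */
	if (error_code & PF_SGX_SKETCH)
		return true;

	/* The usual protection-key and permission checks are omitted here. */
	return false;
}
```

The point of returning early is that nothing the kernel could do to the
PTEs would make the access succeed, so there is no reason to go any
further down the fault path.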
