Hi,
 
> For TD guest kernel, it has its own reason to turn SEPT_VE on or off. E.g.,
> linux TD guest requires SEPT_VE to be disabled to avoid #VE on syscall gap
> [1].

Why is that a problem for a TD guest kernel?  Installing exception
handlers is done quite early in the boot process, certainly before any
userspace code runs.  So I think we should never see a syscall without
a #VE handler being installed.  /me is confused.

Or do you want to tell me Linux has no #VE handler?

> Frankly speaking, this bit is better to be configured by TD guest
> kernel, however current TDX architecture makes the design to let VMM
> configure.

Indeed.  Requiring users to know the guest kernel's capabilities and to
manually configure the VMM accordingly looks fragile to me.

Even better would be to not have that bit in the first place and require
TD guests to properly handle #VE exceptions.

> This can cause problems with the "system call gap": a malicious
> hypervisor might trigger a #VE for example on the system call entry
> code, and when a user process does a system call it would trigger a #VE,
> and SYSCALL relies on the kernel code to switch to the kernel stack,
> this would lead to kernel code running on the ring 3 stack.

Hmm?  Exceptions switch to kernel context too ...

take care,
  Gerd

