Ackerley Tng <[email protected]> writes:

>
> [...snip...]
>
> guest_memfd's populate will first check that the memory is shared, then
> also set the memory to private after the populate.
>
> [...snip...]
>
Looking at this again, the above basically means that the entire
conversion process needs to be performed within populate.

In addition to setting the attributes in guest_memfd as private, for
consistency, populate will also have to do all the associated
operations, especially unmapping from the host, checking refcounts,
and the list of work in conversion will only increase in future with
direct map removal/restoration and huge page merging.

The complexity of conversion also means possible errors (EAGAIN for
elevated refcounts and ENOMEM when we're out of memory), and error
information like the offset where the elevated refcount was.

It doesn't look like there's room for this information to be plumbed out
through the platform-specific ioctls, and even if we make space, it
seems odd to have conversion-related error information returned through
the platform-specific call.


I agree with the goal of not having KVM touch private memory contents as
tracked by guest_memfd, so I'd like to propose that we distinguish:

1. private as tracked by KVM (guest_memfd/vm_memory_attributes)
2. private as tracked by trusted entity


Currently, in TDX's populate flow, KVM doesn't do any copying, it only
instructs TDX to do the copying.

"TDH.MEM.PAGE.ADD Operands Information Definition" in the TDX module
ABI spec says the source is accessed as shared, the destination is
accessed as private, and in the case that source and destination are
the same, the "In-Place Add" section says the source will be converted
to private.

In SNP's populate flow, SNP-specific code in KVM does the copying from
shared memory into the destination, then makes the destination address
private in the RMP table before telling the firmware to do the
in-place encryption.


I think we should think of the populate ioctls as being
platform-specific ioctls that, when called, accept

+ destination address: private (as tracked by guest_memfd)
+ source address: shared (as tracked by guest_memfd) or NULL

KVM doesn't touch private memory contents, even in this case, because
it's really a platform-specific ioctl that handles loading, and the
platform does expect the destination is private for both TDX and SNP
at the firmware boundary.

Since SNP (platform-specific) only allows in-place launch update, and
KVM had to provide an interface that allows a different source address
for support before in-place conversion, then SNP has to continue
supporting the to-be-deprecated path by doing the copying within
platform-specific code.

For consistency, guest_memfd can continue to check that it tracks the
destination address as private, and sev_gmem_populate will then hide
the copying code away just to support the legacy case.


The flow before in-place conversion is

1. Load memory (shared or non-guest_memfd memory)
2. KVM_SEV_SNP_LAUNCH_UPDATE or KVM_TDX_INIT_MEM_REGION (destination:
   gfn for separate private memory, source: shared memory)

The proposed flow for in-place conversion is

1. INIT_SHARED or convert to shared
2. Load memory while it is shared
3. Convert to private (PRESERVE, or new flag?)
4. KVM_SEV_SNP_LAUNCH_UPDATE or KVM_TDX_INIT_MEM_REGION (destination:
   gfn for converted private memory, source: NULL)


TLDR:

+ Think of populate ioctls not as KVM touching memory, but platform
  handling population.
+ KVM code (kvm_gmem_populate) still doesn't touch memory contents
+ post_populate is platform-specific code that handles loading into
  private destination memory just to support legacy non-in-place
  conversion.
+ Don't complicate populate ioctls by doing conversion just to support
  legacy use-cases where platform-specific code has to do copying on
  the host.

>>>
>>> [...snip...]
>>>

Reply via email to