On 4/1/2026 7:12 PM, Marc-André Lureau wrote:
In Confidential Computing (CoCo) environments such as Intel TDX or AMD
SEV-SNP, hotplugged memory must be explicitly "accepted" (transitioned to
a private/encrypted state) before it can be safely used by the guest.
Conversely, before returning memory to the hypervisor during an unplug
operation, it must be converted back to a shared/decrypted state.

It's not a must to convert it back to shared. The memory is going to be unplugged, so the guest doesn't need to care about its state unless there is a restriction that private memory cannot be unplugged. But we don't have such a restriction.

As I explained in the QEMU thread[1], the VMM needs to discard the memory (both shared and private) on unplug. If the VMM fails to do so, the memory is not actually unplugged and the guest is still able to access it.

If the VMM fails to discard/remove the private memory, whether unintentionally or intentionally, that is a VMM bug. For TDX, this kind of VMM bug can lead to a re-accept error. To make the TDX guest more robust, we can let the guest release the memory itself on unplug, as suggested by Paolo[2] and Kiryl[3], so that it can survive even with a buggy VMM. Converting the memory to shared is another way for the guest to proactively "release" the private memory. But the justification for it is not "the guest must do so".

[1] https://lore.kernel.org/qemu-devel/[email protected]/
[2] https://lore.kernel.org/lkml/CABgObfZ7_w8Q-dW=Sd4YA3P==bun1edpv7ty4eppyu8ctw6...@mail.gmail.com/
[3] https://lore.kernel.org/lkml/acprNlPP7J_ttMrz@thinkstation/

Attempting to handle memory acceptance automatically using generic
architecture-level memory hotplug notifiers (e.g., MEM_GOING_ONLINE)
is not viable for devices like virtio-mem:

1. Granularity Mismatch: virtio-mem can dynamically hot(un)plug memory
    at a subblock granularity (e.g., 2MB chunks within a 128MB memory
    block). Generic memory notifiers operate on the entire memory block.
2. Lifecycle Control: Memory must be explicitly accepted *before* it is
    handed to the core memory management subsystem (the buddy allocator),
    and it must be decrypted *before* being handed back to the device.
3. State Tracking (Offline -> Re-online): If memory is offlined and
    re-onlined without proper state transitions, TDX will panic on
    attempting to accept an already-accepted page
    (TDX_EPT_ENTRY_STATE_INCORRECT).
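
To make points 1 and 3 concrete, here is a small userspace model, not
kernel code. It uses the 128MB block and 2MB subblock sizes from the
example above. Every name in it (struct mem_block, model_accept(),
accept_plugged_subblocks()) is hypothetical, and skipping subblocks that
are already accepted is an assumption of this sketch, not necessarily
what the patch does.

```c
/* Userspace model only; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MB_SIZE   (128UL << 20)          /* memory block size from the example */
#define SB_SIZE   (2UL << 20)            /* subblock size from the example */
#define SB_PER_MB (MB_SIZE / SB_SIZE)    /* 64 subblocks per memory block */

struct mem_block {
    bool plugged[SB_PER_MB];   /* subblocks the device has actually plugged */
    bool accepted[SB_PER_MB];  /* modeled private ("accepted") state */
};

/*
 * Models acceptance of one subblock. Accepting a subblock twice is
 * reported as an error, standing in for the already-accepted failure
 * described in point 3.
 */
static int model_accept(struct mem_block *mb, size_t sb)
{
    assert(sb < SB_PER_MB);
    if (mb->accepted[sb])
        return -1;
    mb->accepted[sb] = true;
    return 0;
}

/*
 * Accept only subblocks that are plugged and not yet accepted.
 * Returns the number of subblocks accepted, or -1 on error.
 */
static int accept_plugged_subblocks(struct mem_block *mb)
{
    int n = 0;

    for (size_t sb = 0; sb < SB_PER_MB; sb++) {
        if (!mb->plugged[sb] || mb->accepted[sb])
            continue;
        if (model_accept(mb, sb))
            return -1;
        n++;
    }
    return n;
}
```

A notifier that only sees the whole memory block cannot tell which of
the 64 subblocks are plugged. In this model, a second pass over the same
block (the offline -> re-online case) accepts nothing, because the
per-subblock state is tracked.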

To address this, this patch implements explicit CoCo memory conversions
directly within the virtio-mem driver using set_memory_encrypted() and
set_memory_decrypted():

- During hotplug, explicitly accepts only the physically plugged subblocks
   right before fake-onlining them into the buddy allocator.
- During unplug, memory is explicitly transitioned to the shared state
   before being handed back to the host. If the unplug operation fails,
   the driver attempts to re-accept (encrypt) the memory. If this
   re-acceptance fails, the memory is intentionally leaked to prevent
   confidentiality breaches or fatal hypervisor faults.
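
The unplug path described in the second bullet can be sketched as a
small userspace state machine. This is not the patch's code:
model_decrypt() and model_encrypt() are hypothetical stand-ins for
set_memory_decrypted() and set_memory_encrypted(), and
model_request_unplug() stands in for the unplug request to the device.
The two fail_* flags exist only to inject failures for illustration.

```c
/* Userspace model only; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum sb_state { SB_PRIVATE, SB_SHARED, SB_UNPLUGGED, SB_LEAKED };

struct model {
    enum sb_state state;
    bool fail_unplug;   /* inject: the device rejects the unplug request */
    bool fail_encrypt;  /* inject: re-acceptance after a failed unplug fails */
};

/* Stand-in for converting the range to the shared state. */
static int model_decrypt(struct model *m)
{
    assert(m->state == SB_PRIVATE);
    m->state = SB_SHARED;
    return 0;
}

/* Stand-in for converting the range back to the private state. */
static int model_encrypt(struct model *m)
{
    if (m->fail_encrypt)
        return -1;
    m->state = SB_PRIVATE;
    return 0;
}

/* Stand-in for asking the device to unplug the range. */
static int model_request_unplug(struct model *m)
{
    if (m->fail_unplug)
        return -1;
    m->state = SB_UNPLUGGED;
    return 0;
}

/* Decrypt, unplug, and on failure try to re-accept; leak if that fails too. */
static int unplug_subblock(struct model *m)
{
    int rc = model_decrypt(m);

    if (rc)
        return rc;

    rc = model_request_unplug(m);
    if (!rc)
        return 0;

    /* Unplug failed: the memory must be private again before reuse. */
    if (model_encrypt(m))
        m->state = SB_LEAKED;  /* never hand it back to the allocator */

    return rc;
}
```

In this model the caller sees the unplug failure in both error cases.
The difference is whether the subblock ends up private again (usable)
or leaked (never reused).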

This was discovered while testing virtio-mem resize with TDX guests.
The associated QEMU virtio-mem + TDX patch series is under review at:
https://patchew.org/QEMU/[email protected]/

Note that QEMU punches the guest_memfd on KVM_HC_MAP_GPA_RANGE, when the
guest memory is decrypted. There is thus no need to discard the guest_memfd
in the virtio-mem device.

This patch is a follow-up and supersedes "[PATCH 0/2] x86/tdx: Fix
memory hotplug in TDX guests".



