Sean Christopherson <[email protected]> writes: > On Tue, Jun 23, 2026, Yan Zhao wrote: >> On Tue, Jun 23, 2026 at 01:16:14PM +0800, Yan Zhao wrote: >> > On Mon, Jun 22, 2026 at 06:22:45PM -0700, Sean Christopherson wrote: >> > > On Mon, Jun 22, 2026, Yan Zhao wrote: >> > > > On Thu, Jun 18, 2026 at 05:32:00PM -0700, Ackerley Tng via B4 Relay >> > > > wrote: >> > > > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c >> > > > > index ffe9d0db58c59..56d10333c61a7 100644 >> > > > > --- a/arch/x86/kvm/vmx/tdx.c >> > > > > +++ b/arch/x86/kvm/vmx/tdx.c >> > > > > @@ -3198,8 +3198,12 @@ static int tdx_gmem_post_populate(struct kvm >> > > > > *kvm, gfn_t gfn, kvm_pfn_t pfn, >> > > > > if (KVM_BUG_ON(kvm_tdx->page_add_src, kvm)) >> > > > > return -EIO; >> > > > > >> > > > > - if (!src_page) >> > > > > - return -EOPNOTSUPP; >> > > > > + if (!src_page) { >> > > > > + if (!gmem_in_place_conversion) >> > > > When userspace turns on gmem_in_place_conversion while creating >> > > > guest_memfd >> > > > without the MMAP flag, the absence of src_page should still be treated >> > > > as an >> > > > error. >> > > >> > > Why MMAP? >> > Hmm, I was showing a scenario that in-place conversion couldn't occur. >> > I didn't mean that with the MMAP flag, mmap() and user write must occur. >> > >> > > Shouldn't this be a general "if (!src_page && !up-to-date)"? Just >> > > because userspace _can_ mmap() the memory doesn't mean userspace _has_ >> > > mmap()'d >> > > and written memory. And when write() lands, MMAP wouldn't be necessary >> > > to >> > > initialize the memory. >> > Do you mean using up-to-date flag as below? > > Yes? I didn't actually look at the implementation details. > >> > if (!src_page) { >> > src_page = pfn_to_page(pfn); >> > if (!folio_test_uptodate(page_folio(src_page))) >> > return -EOPNOTSUPP; >> > }
Yan is right that with the earlier patch "Zero page while getting pfn", folio_test_uptodate() here will always return true. Actually, this is an alternative fix for the issue Sashiko pointed out on v7 where userspace can do a populate() (either TDX or SNP) without first allocating the page, with src_address == NULL, and leak uninitialized memory into the guest. Advantage of using the uptodate check in populate: if the host never allocates the page, populate doesn't incur zeroing before writing the page anyway in populate(). Disadvantage: Both TDX and SNP will have to implement this uptodate check. guest_memfd can't check centrally because for SNP, for a PAGE_TYPE_ZERO, !src_page should be allowed with a !uptodate page since firmware will zero and there's no leakage of uninitialized host memory? >> >> Another concern with this fix is that: >> commit "KVM: guest_memfd: Zero page while getting pfn" [1] always marks the >> folio uptodate before reaching post_populate(). >> >> [1] >> https://lore.kernel.org/all/[email protected]/ >> >> > One concern is that TDX now does not much care about the up-to-date flag >> > since >> > TDX doesn't rely on the flag to clear pages on conversions. >> > I'm not sure if the flag can be reliably checked in this case. e.g., >> > now the whole folio is marked up-to-date even if only part of it is >> > faulted by >> > user access. >> > Ensuring that the up-to-date flag works correctly with huge page support >> > seems >> > to have more effort than introducing a dedicated flag for TDX. >> > >> > > > Additionally, to properly enable in-place copying for the TDX initial >> > > > memory >> > > > region, userspace must not only specify source_addr to NULL, but also >> > > > follow >> > > > a specific sequence (where steps 1/2/3/7 are required only for >> > > > in-place copy): >> > > > 1. create guest_memfd with MMAP flag >> > > > 2. mmap the guest_memfd. >> > > > 3. convert the initial memory range to shared. >> > > > 4. copy initial content to the source page. >> > > > 5. convert the initial memory range to private >> > > > 6. invoke ioctl KVM_TDX_INIT_MEM_REGION. >> > > > 7. do not unmap the source backend. >> > > > >> > > > So, would it be reasonable to introduce a dedicated flag that allows >> > > > userspace >> > > > to explicitly opt into the in-place copy functionality? e.g., >> > > >> > > Why? It's userspace's responsibility to get the above right. If >> > > userspace fails >> > > to provide a src_page when it doesn't want in-place copy, that's a >> > > userspace bug. Yan, is your concern that userspace forgot to update the code and forgets to provide a src_page, and if we keep the "Zero page while getting pfn" patch, ends up with the guest silently having a zero page? I think that would be found quite early in userspace VMM testing... >> > I mean if userspace specifies a NULL source_addr by mistake, it's better >> > for >> > kernel to detect this mistake, similar to how it validates whether >> > source_addr >> > is PAGE_ALIGNED. > > The alignment case is different. If userspace provides an unaligned value, > KVM > *can't* do what userspace is asking because hardware and thus KVM only > supports > converting on page boundaries. > > For a NULL source, KVM can still do what userspace is asking. Rejecting > userspace's > request would then be making assumptions about what userspace wants. > Also, +1 on this, what if userspace, knowing that pages are zeroed on allocation, actually wants to rely on that to get a zero page in the guest? >> > Since userspace already needs to perform additional steps to enable >> > in-place >> > copy, specifying a dedicated flag to indicate that the NULL source_addr is >> > intentional seems like a reasonable burden. > > I don't see how it adds any value. I wouldn't be at all surprised if most > VMMs > just wen up with code that does: > > if (in-place) { > src = NULL; > flags |= KVM_TDX_IN_PLACE_COPY_INITIAL_MEMORY_REGION; > }
