On Wed, 20 May 2026 at 22:44, Ackerley Tng <[email protected]> wrote: > > Fuad Tabba <[email protected]> writes: > > > > > [...snip...] > > > >> +unsigned long kvm_gmem_get_memory_attributes(struct kvm *kvm, gfn_t gfn) > >> +{ > >> + struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn); > >> + struct inode *inode; > >> + > >> + /* > >> + * If this gfn has no associated memslot, there's no chance of the > >> gfn > >> + * being backed by private memory, since guest_memfd must be used > >> for > >> + * private memory, and guest_memfd must be associated with some > >> memslot. > >> + */ > >> + if (!slot) > >> + return 0; > >> + > >> + CLASS(gmem_get_file, file)(slot); > >> + if (!file) > >> + return 0; > >> + > >> + inode = file_inode(file); > >> + > >> + /* > >> + * Rely on the maple tree's internal RCU lock to ensure a > >> + * stable result. This result can become stale as soon as the > >> + * lock is dropped, so the caller _must_ still protect > >> + * consumption of private vs. shared by checking > >> + * mmu_invalidate_retry_gfn() under mmu_lock to serialize > >> + * against ongoing attribute updates. > >> + */ > >> + return kvm_gmem_get_attributes(inode, kvm_gmem_get_index(slot, > >> gfn)); > >> +} > > > > Doesn't this imply that all consumers of kvm_mem_is_private() should > > validate the result using mmu_lock and the invalidation sequence? > > Let me know how I can improve the comment.
Given Sean's context, the comment is good I think. I would quibble with the the "_must_ still protect" phrasing being a bit too strict. Maybe just soften it slightly to acknowledge the exception? Something like: * lock is dropped, so callers that require a strict result _must_ protect * consumption of private vs. shared by checking mmu_invalidate_retry_gfn() * under mmu_lock to serialize against ongoing attribute updates. Callers * doing lockless reads must be able to tolerate a stale result. That aligns the comment with how KVM is actually using it today. That said, this is nitpicking. Feel free to use or ignore. > > I think the "consumption" of private vs shared here actually means > something like "don't commit a page being faulted into page tables based > on the result of kvm_gmem_get_memory_attributes() without checking > kvm->mmu_invalidate_in_progress.", since a racing conversion may > complete before you commit. > > kvm_mem_is_private() is used from these places: > > 1. Fault handling in KVM, like page_fault_can_be_fast(), > kvm_mmu_faultin_pfn(), kvm_mmu_page_fault(): this already handles the > entire mmu_lock and invalidation dance. No fault will be committed if > a racing conversion happened after kvm_mem_is_private() but before > the commit. > > 2. kvm_mmu_max_mapping_level() from recovering huge pages after > disabling dirty logging: Other than that it can't be used with > guest_memfd now since dirty logging can't be used with guest_memfd > and guest_memfd memslots are not updatable, this holds mmu_lock > throughout until the huge page recovery is done. invalidate_begin > also involves zapping the pages in the range, so if the order of > events is > > | Thread A | Thread B | > |------------------------------|-------------------| > | invalidate_begin + zap | | > | update attributes maple_tree | recover huge page | > | invalidate_end | | > > Then recovering will never see the zapped pages, nothing to > recover, no kvm_mem_is_private() lookup. > > 3. kvm_arch_vcpu_pre_fault_memory() > > This eventually calls kvm_tdp_mmu_page_fault(), which checks > is_page_fault_stale(), so it does check before committing. > > Were there any other calls I missed? The one I was looking at was `sev_handle_rmp_fault()`, which does a lockless read without the retry loop. But as Sean just pointed out, that path can tolerate false positives/negatives and relies on the guest faulting again, so the lack of synchronization there is existing behavior and considered "fine". > > > sev_handle_rmp_fault() calls kvm_mem_is_private() without holding > > mmu_lock and without any retry mechanism. Is that a problem? > > > > Sean already replied on your actual question separately :) > > > Cheers, > > /fuad > > > > > >> > >> [...snip...] > >>
