On Thu, Mar 26, 2026 at 07:33:03PM -0700, James Houghton wrote: > On Fri, Mar 6, 2026 at 9:19 AM Mike Rapoport <[email protected]> wrote: > > > > From: Nikita Kalyazin <[email protected]> > > > > userfaultfd notifications about page faults used for live migration > > and snapshotting of VMs. > > > > MISSING mode allows post-copy live migration and MINOR mode allows > > optimization for post-copy live migration for VMs backed with shared > > hugetlbfs or tmpfs mappings as described in detail in commit > > 7677f7fd8be7 ("userfaultfd: add minor fault registration mode"). > > > > To use the same mechanisms for VMs that use guest_memfd to map their > > memory, guest_memfd should support userfaultfd operations. > > > > Add implementation of vm_uffd_ops to guest_memfd. > > > > Signed-off-by: Nikita Kalyazin <[email protected]> > > Co-developed-by: Mike Rapoport (Microsoft) <[email protected]> > > Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> > > Overall looks fine to me, but I am slightly concerned about in-place > conversion[1], and I think you're going to want to implement a > kvm_gmem_folio_present() op or something (like I was saying on the > previous patch[2]).
Let's solve each problem in it's time :) > [1]: > https://lore.kernel.org/kvm/[email protected]/ > [2]: > https://lore.kernel.org/linux-mm/cadrl8hvuj5fl97d9ytxp2wxos6hs+u+ycpsi5vxffsw9vac...@mail.gmail.com/ > > Some in-line comments below. > > > --- > > mm/filemap.c | 1 + > > virt/kvm/guest_memfd.c | 84 +++++++++++++++++++++++++++++++++++++++++- > > 2 files changed, 83 insertions(+), 2 deletions(-) > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > index 6cd7974d4ada..19dfcebcd23f 100644 > > --- a/mm/filemap.c > > +++ b/mm/filemap.c > > @@ -107,6 +108,12 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, > > struct kvm_memory_slot *slot, > > return __kvm_gmem_prepare_folio(kvm, slot, index, folio); > > } > > > > +static struct folio *kvm_gmem_get_folio_noalloc(struct inode *inode, > > pgoff_t pgoff) > > +{ > > + return __filemap_get_folio(inode->i_mapping, pgoff, > > + FGP_LOCK | FGP_ACCESSED, 0); > > +} > > When in-place conversion is supported, I wonder what the semantics > should be for when we get userfaults. > > Upon a userspace access to a file offset that is populated but > private, should we get a userfault or a SIGBUS? > > I guess getting a userfault is strictly more useful for userspace, but > I'm not sure which choice is more correct. Me neither :) We can deliver userfault, but just block UFFDIO_COPY, can't we? > > +static int kvm_gmem_filemap_add(struct folio *folio, > > + struct vm_area_struct *vma, > > + unsigned long addr) > > +{ > > + struct inode *inode = file_inode(vma->vm_file); > > + struct address_space *mapping = inode->i_mapping; > > + pgoff_t pgoff = linear_page_index(vma, addr); > > + int err; > > + > > + __folio_set_locked(folio); > > + err = filemap_add_folio(mapping, folio, pgoff, GFP_KERNEL); > > This is going to get more interesting with in-place conversion. I'm > not really sure how to synchronize with it, but we'll probably need to > take the invalidate lock for reading. And then we'll need a separate > uffd_op to drop it after we install the PTE... I think. I think we can start simple and then move on along with the in-place conversion work. If there will be a need for a new uffd_ops callback we can add it then. -- Sincerely yours, Mike.

