On Thu, Mar 26, 2026 at 07:33:03PM -0700, James Houghton wrote:
> On Fri, Mar 6, 2026 at 9:19 AM Mike Rapoport <[email protected]> wrote:
> >
> > From: Nikita Kalyazin <[email protected]>
> >
> > userfaultfd notifications about page faults used for live migration
> > and snapshotting of VMs.
> >
> > MISSING mode allows post-copy live migration and MINOR mode allows
> > optimization for post-copy live migration for VMs backed with shared
> > hugetlbfs or tmpfs mappings as described in detail in commit
> > 7677f7fd8be7 ("userfaultfd: add minor fault registration mode").
> >
> > To use the same mechanisms for VMs that use guest_memfd to map their
> > memory, guest_memfd should support userfaultfd operations.
> >
> > Add implementation of vm_uffd_ops to guest_memfd.
> >
> > Signed-off-by: Nikita Kalyazin <[email protected]>
> > Co-developed-by: Mike Rapoport (Microsoft) <[email protected]>
> > Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
> 
> Overall looks fine to me, but I am slightly concerned about in-place
> conversion[1], and I think you're going to want to implement a
> kvm_gmem_folio_present() op or something (like I was saying on the
> previous patch[2]).

Let's solve each problem in it's time :)
 
> [1]: 
> https://lore.kernel.org/kvm/[email protected]/
> [2]: 
> https://lore.kernel.org/linux-mm/cadrl8hvuj5fl97d9ytxp2wxos6hs+u+ycpsi5vxffsw9vac...@mail.gmail.com/
> 
> Some in-line comments below.
> 
> > ---
> >  mm/filemap.c           |  1 +
> >  virt/kvm/guest_memfd.c | 84 +++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 83 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 6cd7974d4ada..19dfcebcd23f 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -107,6 +108,12 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, 
> > struct kvm_memory_slot *slot,
> >         return __kvm_gmem_prepare_folio(kvm, slot, index, folio);
> >  }
> >
> > +static struct folio *kvm_gmem_get_folio_noalloc(struct inode *inode, 
> > pgoff_t pgoff)
> > +{
> > +       return __filemap_get_folio(inode->i_mapping, pgoff,
> > +                                  FGP_LOCK | FGP_ACCESSED, 0);
> > +}
> 
> When in-place conversion is supported, I wonder what the semantics
> should be for when we get userfaults.
> 
> Upon a userspace access to a file offset that is populated but
> private, should we get a userfault or a SIGBUS?
> 
> I guess getting a userfault is strictly more useful for userspace, but
> I'm not sure which choice is more correct.
 
Me neither :)

We can deliver userfault, but just block UFFDIO_COPY, can't we?

> > +static int kvm_gmem_filemap_add(struct folio *folio,
> > +                               struct vm_area_struct *vma,
> > +                               unsigned long addr)
> > +{
> > +       struct inode *inode = file_inode(vma->vm_file);
> > +       struct address_space *mapping = inode->i_mapping;
> > +       pgoff_t pgoff = linear_page_index(vma, addr);
> > +       int err;
> > +
> > +       __folio_set_locked(folio);
> > +       err = filemap_add_folio(mapping, folio, pgoff, GFP_KERNEL);
> 
> This is going to get more interesting with in-place conversion. I'm
> not really sure how to synchronize with it, but we'll probably need to
> take the invalidate lock for reading. And then we'll need a separate
> uffd_op to drop it after we install the PTE... I think.

I think we can start simple and then move on along with the in-place
conversion work. If there will be a need for a new uffd_ops callback we can
add it then.
 
-- 
Sincerely yours,
Mike.

Reply via email to