On Fri, May 08, 2026 at 04:55:20PM +0100, Kiryl Shutsemau (Meta) wrote:
> Add the userspace interface for read-write protection tracking:
>
>   - UFFDIO_REGISTER_MODE_RWP  register a range for RWP tracking
>   - UFFD_FEATURE_RWP          capability bit
>   - UFFDIO_RWPROTECT          install / remove RWP on a range
>
> Registration sets VM_UFFD_RWP on the VMA. Combining MODE_WP with
> MODE_RWP is rejected because both modes claim the uffd PTE bit.
>
> UFFDIO_RWPROTECT is the bidirectional counterpart of
> UFFDIO_WRITEPROTECT:
>
>   - MODE_RWP   change_protection() with MM_CP_UFFD_RWP
>                installs PAGE_NONE and sets the uffd bit on
>                present PTEs
>   - !MODE_RWP  change_protection() with MM_CP_UFFD_RWP_RESOLVE
>                restores vma->vm_page_prot and clears the bit
>
> userfaultfd_clear_vma() runs the same resolve pass on unregister so
> RWP state cannot outlive the uffd.
>
> Re-registering a range must not drop a mode that installs per-PTE
> markers (WP or RWP); doing so returns -EBUSY. This also closes a
> pre-existing window where re-registering without MODE_WP would strand
> uffd-wp markers: before, those caused extra write-faults but were
> otherwise benign; with RWP preservation in place, a subsequent
> mprotect() on a VM_UFFD_RWP VMA would silently promote the stale
> markers to RWP.
>
> The feature is not yet advertised. UFFDIO_REGISTER_MODE_RWP,
> UFFD_FEATURE_RWP, and _UFFDIO_RWPROTECT are intentionally absent from
> UFFD_API_REGISTER_MODES, UFFD_API_FEATURES, and UFFD_API_RANGE_IOCTLS,
> so UFFDIO_API masks them out and the register-mode validator rejects
> the bit. The follow-up patch adds fault dispatch and exposes the UAPI.
>
> Signed-off-by: Kiryl Shutsemau <[email protected]>
> Assisted-by: Claude:claude-opus-4-6

Reviewed-by: Mike Rapoport (Microsoft) <[email protected]>

with a comment below

> ---
>  Documentation/admin-guide/mm/userfaultfd.rst | 10 ++
>  fs/userfaultfd.c                             | 84 +++++++++++++++++
>  include/linux/userfaultfd_k.h                |  2 +
>  include/uapi/linux/userfaultfd.h             | 19 ++++
>  mm/userfaultfd.c                             | 97 +++++++++++++++++++-
>  5 files changed, 209 insertions(+), 3 deletions(-)
>
> +	/*
> +	 * Pre-scan the range: validate every spanned VMA before applying
> +	 * any change_protection() so a partial failure cannot leave the
> +	 * process with only a prefix of the range re-protected.
> +	 */
> +	err = -ENOENT;
> +	for_each_vma_range(vmi, dst_vma, end) {
> +		if (!userfaultfd_rwp(dst_vma))
> +			return -ENOENT;
> +
> +		if (is_vm_hugetlb_page(dst_vma)) {
> +			unsigned long page_mask;
> +
> +			page_mask = vma_kernel_pagesize(dst_vma) - 1;
> +			if ((start & page_mask) || (len & page_mask))
> +				return -EINVAL;
> +		}
> +		err = 0;
> +	}
> +	if (err)
> +		return err;

It's an interesting way to say "no VMA found in range" :)
I think

	bool found

and

	if (!found)
		return -ENOENT;

looks more readable.

-- 
Sincerely yours,
Mike.
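
For illustration only, the suggested "bool found" form of the pre-scan could look roughly like the userspace sketch below. It is not the patch's code: struct mock_vma, its fields, and prescan() are hypothetical stand-ins for the VMA iterator, userfaultfd_rwp(), is_vm_hugetlb_page() and vma_kernel_pagesize(), so that the control flow can be compiled and checked outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-VMA state the pre-scan inspects. */
struct mock_vma {
	bool rwp;		/* stands in for userfaultfd_rwp(vma) */
	bool hugetlb;		/* stands in for is_vm_hugetlb_page(vma) */
	unsigned long pagesize;	/* stands in for vma_kernel_pagesize(vma) */
};

/*
 * Validate every VMA in the range before anything is modified.
 * An explicit flag records whether the loop body ran at least once,
 * instead of priming err with -ENOENT and clearing it per iteration.
 */
static int prescan(const struct mock_vma *vmas, size_t n,
		   unsigned long start, unsigned long len)
{
	bool found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct mock_vma *vma = &vmas[i];

		/* A VMA not registered for RWP fails the whole request. */
		if (!vma->rwp)
			return -ENOENT;

		/* hugetlb ranges must be aligned to the huge page size. */
		if (vma->hugetlb) {
			unsigned long page_mask = vma->pagesize - 1;

			if ((start & page_mask) || (len & page_mask))
				return -EINVAL;
		}
		found = true;
	}

	/* No VMA found in range. */
	if (!found)
		return -ENOENT;

	return 0;
}
```

The behaviour is meant to match the quoted hunk: an empty range and an unregistered VMA both yield -ENOENT, a misaligned hugetlb range yields -EINVAL, and only a fully validated range returns 0.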

