On 3/6/26 18:18, Mike Rapoport wrote: > From: "Mike Rapoport (Microsoft)" <[email protected]>
Nit: "." at the end of the patch subject > > There is a lengthy code chunk in mfill_atomic() that establishes the PMD > for UFFDIO operations. This code may be called twice: first time when > the copy is performed with VMA/mm locks held and the other time after > the copy is retried with locks dropped. > > Move the code that establishes a PMD into a helper function so it can be > reused later during refactoring of mfill_atomic_pte_copy(). > > Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> > --- > mm/userfaultfd.c | 103 ++++++++++++++++++++++++----------------------- > 1 file changed, 53 insertions(+), 50 deletions(-) > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c > index e68d01743b03..224b55804f99 100644 > --- a/mm/userfaultfd.c > +++ b/mm/userfaultfd.c > @@ -157,6 +157,57 @@ static void uffd_mfill_unlock(struct vm_area_struct *vma) > } > #endif > > +static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address) > +{ > + pgd_t *pgd; > + p4d_t *p4d; > + pud_t *pud; > + > + pgd = pgd_offset(mm, address); > + p4d = p4d_alloc(mm, pgd, address); > + if (!p4d) > + return NULL; > + pud = pud_alloc(mm, p4d, address); > + if (!pud) > + return NULL; > + /* > + * Note that we didn't run this because the pmd was > + * missing, the *pmd may be already established and in > + * turn it may also be a trans_huge_pmd. > + */ > + return pmd_alloc(mm, pud, address); > +} > + > +static int mfill_get_pmd(struct mfill_state *state) > +{ > + struct mm_struct *dst_mm = state->ctx->mm; > + pmd_t *dst_pmd; > + pmd_t dst_pmdval; I'd just have both on a single line. > + > + dst_pmd = mm_alloc_pmd(dst_mm, state->dst_addr); > + if (unlikely(!dst_pmd)) > + return -ENOMEM; > + > + dst_pmdval = pmdp_get_lockless(dst_pmd); > + if (unlikely(pmd_none(dst_pmdval)) && > + unlikely(__pte_alloc(dst_mm, dst_pmd))) > + return -ENOMEM; > + > + dst_pmdval = pmdp_get_lockless(dst_pmd); > + /* > + * If the dst_pmd is THP don't override it and just be strict. > + * (This includes the case where the PMD used to be THP and > + * changed back to none after __pte_alloc().) > + */ > + if (unlikely(!pmd_present(dst_pmdval) || pmd_trans_huge(dst_pmdval))) Can we directly switch to pmd_leaf() while touching that? > + return -EEXIST; > + if (unlikely(pmd_bad(dst_pmdval))) > + return -EFAULT; > + > + state->pmd = dst_pmd; > + return 0; > +} [...] > /* > * Sanitize the command parameters: > @@ -809,41 +838,15 @@ static __always_inline ssize_t mfill_atomic(struct > userfaultfd_ctx *ctx, > while (state.src_addr < src_start + len) { > VM_WARN_ON_ONCE(state.dst_addr >= dst_start + len); > > - pmd_t dst_pmdval; > - > - dst_pmd = mm_alloc_pmd(dst_mm, state.dst_addr); > - if (unlikely(!dst_pmd)) { > - err = -ENOMEM; > + err = mfill_get_pmd(&state); > + if (err) It's a bit odd that a "get" function doesn't return a PMD pointer but instead stores it in the state. Maybe more like "mfill_prepare_pmd" ? But actually you want to have a pte table. mfill_prepare_pte_table() or alternatively mfill_alloc_pte_table() / mfill_alloc_dst_pte_table() -- Cheers, David

