> + for (; addr < end; addr += PAGE_SIZE) {
> + vm_fault_t ret;
> +
> + ret = handle_mm_fault(vma, addr, fault_flags, NULL);
> +
> + if (ret & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)) {
> + /*
> + * The mmap lock has been dropped by the fault handler.
> + * Record the failing address and signal lock-drop to
> + * the caller.
> + */
> + *hmm_vma_walk->locked = 0;
> + hmm_vma_walk->last = addr;
> + return -EAGAIN;
Okay, so we'll return straight from hmm_vma_fault() to the
hmm_vma_handle_pte()/hmm_vma_walk_pmd() -> walk_page_range() machinery.
Hopefully none of those paths dereference the MM/VMA after the lock has been
dropped? It would be nicer if hmm_vma_fault() could be called by the caller of
walk_page_range(), but that's tricky, I guess, as hmm_vma_fault() consumes the
walk structure and requires the vma in there.
Note: am I wrong, or is hmm_vma_fault() really always called with
required_fault=true?
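To make the contract concrete, here is a small userspace sketch (all names
like fake_vma_fault/fake_range_fault and the flag values are hypothetical
stand-ins, not the real HMM code): the inner loop mirrors the patch hunk
above, and the outer loop shows what the caller of walk_page_range() would
have to do with the -EAGAIN + ->last combination.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's fault flags and errnos. */
#define VM_FAULT_RETRY     0x1
#define VM_FAULT_COMPLETED 0x2
#define VM_FAULT_ERROR     0x4
#define EAGAIN 11
#define EFAULT 14

struct fake_walk {
	int *locked;        /* 1 while the (simulated) mmap lock is held */
	unsigned long last; /* failing address, for the caller's restart */
};

static int faults_until_retry; /* test knob: force one RETRY, then succeed */

/* Simulated handle_mm_fault(): drops the lock once, then completes. */
static unsigned int fake_handle_mm_fault(unsigned long addr)
{
	(void)addr;
	if (faults_until_retry-- > 0)
		return VM_FAULT_RETRY;
	return 0;
}

/* Mirrors the loop in the patch: on RETRY/COMPLETED the lock is gone,
 * so record the failing address and hand -EAGAIN back up the walk. */
static int fake_vma_fault(struct fake_walk *walk, unsigned long addr,
			  unsigned long end)
{
	for (; addr < end; addr += 4096) {
		unsigned int ret = fake_handle_mm_fault(addr);

		if (ret & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)) {
			*walk->locked = 0;
			walk->last = addr;
			return -EAGAIN;
		}
		if (ret & VM_FAULT_ERROR)
			return -EFAULT;
	}
	return 0;
}

/* What the caller of walk_page_range() would have to do: on -EAGAIN,
 * re-take the lock and restart the walk from walk->last. */
static int fake_range_fault(struct fake_walk *walk, unsigned long start,
			    unsigned long end)
{
	int ret;

	do {
		ret = fake_vma_fault(walk, start, end);
		if (ret == -EAGAIN) {
			*walk->locked = 1;  /* mmap_read_lock() again */
			start = walk->last; /* restart at the failing address */
		}
	} while (ret == -EAGAIN);
	return ret;
}
```

The point of the sketch: once the fault path may drop the lock, the restart
loop has to live somewhere above walk_page_range(), and ->last is the only
thing connecting the two.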
> + }
> +
> + if (ret & VM_FAULT_ERROR)
> return -EFAULT;
> + }
> return -EBUSY;
> }
>
> @@ -566,6 +585,17 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
> if (required_fault) {
> int ret;
>
> + /*
> + * Faulting hugetlb pages on the unlockable path is not
> + * supported. The walk framework holds hugetlb_vma_lock_read
> + * which must be dropped before handle_mm_fault, but if the
> + * mmap lock is also dropped (VM_FAULT_RETRY), the vma may
> + * be freed and the walk framework's unconditional unlock
> + * becomes a use-after-free.
> + */
> + if (hmm_vma_walk->locked)
> + return -EFAULT;
Just because the walk is unlockable doesn't mean you must actually unlock.
Can't this path be kept working as is, by simply behaving here as if the walk
were not unlockable?
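Roughly what that suggestion could look like, as a userspace sketch
(hypothetical names and flag values throughout; assuming clearing
FAULT_FLAG_ALLOW_RETRY is what keeps the fault handler from dropping the
lock): instead of failing hugetlb entries with -EFAULT on the unlockable
path, mask out the retry flag for this one fault so the lock cannot be
dropped underneath hugetlb_vma_lock_read.

```c
#include <assert.h>

/* Hypothetical flag values; the sketch only cares that clearing
 * FAULT_FLAG_ALLOW_RETRY prevents the handler from dropping the lock. */
#define FAULT_FLAG_ALLOW_RETRY 0x1
#define VM_FAULT_RETRY         0x1
#define VM_FAULT_ERROR         0x2
#define EFAULT 14

/* Simulated fault handler: may only return RETRY when retries are allowed. */
static unsigned int fake_handle_mm_fault(unsigned int flags, int wants_retry)
{
	if (wants_retry && (flags & FAULT_FLAG_ALLOW_RETRY))
		return VM_FAULT_RETRY;
	return 0; /* handled synchronously, lock still held */
}

/* Hugetlb entry: rather than bailing out with -EFAULT when the walk is
 * unlockable, behave as if it weren't: clear the retry flag so the mmap
 * lock can't be dropped while hugetlb_vma_lock_read is held. */
static int fake_hugetlb_fault(unsigned int fault_flags, int unlockable,
			      int handler_wants_retry)
{
	unsigned int ret;

	if (unlockable)
		fault_flags &= ~FAULT_FLAG_ALLOW_RETRY;

	ret = fake_handle_mm_fault(fault_flags, handler_wants_retry);
	if (ret & VM_FAULT_RETRY)
		return -EFAULT; /* unreachable once the flag is cleared */
	if (ret & VM_FAULT_ERROR)
		return -EFAULT;
	return 0;
}
```

The trade-off the sketch makes visible: the hugetlb fault becomes a
synchronous, possibly-blocking fault with the lock held, but the walk
framework's unconditional unlock stays safe.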
--
Cheers,
David