[Public]

> -----Original Message-----
> From: Chen, Xiaogang <[email protected]>
> Sent: Friday, October 24, 2025 3:15 AM
> To: SHANMUGAM, SRINIVASAN <[email protected]>;
> Koenig, Christian <[email protected]>; Deucher, Alexander
> <[email protected]>
> Cc: [email protected]; Yang, Philip <[email protected]>; Khatri,
> Sunil <[email protected]>
> Subject: Re: [PATCH v3] drm/amdkfd: Fix use-after-free of HMM range in
> svm_range_validate_and_map()
>
>
> On 10/23/2025 9:34 AM, Srinivasan Shanmugam wrote:
> > The function svm_range_validate_and_map() was freeing `range` when
> > amdgpu_hmm_range_get_pages() failed. But later, the code still used
> > the same `range` pointer and freed it again. This could cause a
> > use-after-free and double-free issue.
> >
> > The fix sets `range = NULL` right after it is freed and checks for
> > `range` before using or freeing it again.
> >
> > v2: Removed duplicate !r check in the condition for clarity.
> >
> > v3: In amdgpu_hmm_range_get_pages(), when hmm_range_fault() fails, we
> > kvfree(pfns) but leave the pointer in hmm_range->hmm_pfns still
> > pointing to freed memory. The caller (or amdgpu_hmm_range_free(range))
> > may try to free range->hmm_range.hmm_pfns again, causing a double
> > free, Setting hmm_range->hmm_pfns = NULL immediately after
> > kvfree(pfns) prevents both double free. (Philip)
> what you fix is not "use-after-free", it is preventing double free, right?
> > In svm_range_validate_and_map(), When r == 0, it means success → range
> > is not NULL.  When r != 0, it means failure → already made range = NULL.
> > So checking both (!r && range) is unnecessary because the moment r ==
> > 0, we automatically know range exists and is safe to use. (Philip)
> >
> > Fixes: c5e357c924e5 ("drm/amdgpu: update the functions to use amdgpu
> > version of hmm") Reported by: Dan Carpenter <[email protected]>
> > Cc: Philip Yang <[email protected]>
> > Cc: Sunil Khatri <[email protected]>
> > Cc: Christian König <[email protected]>
> > Cc: Alex Deucher <[email protected]>
> > Signed-off-by: Srinivasan Shanmugam <[email protected]>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 1 +
> >   drivers/gpu/drm/amd/amdkfd/kfd_svm.c    | 6 ++++--
> >   2 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> > index d6f903a2d573..90d26d820bac 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> > @@ -221,6 +221,7 @@ int amdgpu_hmm_range_get_pages(struct
> > mmu_interval_notifier *notifier,
> >
> >   out_free_pfns:
> >     kvfree(pfns);
> > +   hmm_range->hmm_pfns = NULL;
> >   out_free_range:
> >     if (r == -EBUSY)
> >             r = -EAGAIN;
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > index f041643308ca..103acb925c2b 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > @@ -1744,6 +1744,7 @@ static int svm_range_validate_and_map(struct
> mm_struct *mm,
> >                     WRITE_ONCE(p->svms.faulting_task, NULL);
> >                     if (r) {
> >                             amdgpu_hmm_range_free(range);
> > +                           range = NULL;
> >                             pr_debug("failed %d to get svm range pages\n", 
> > r);
> >                     }
> >             } else {
> > @@ -1761,7 +1762,7 @@ static int svm_range_validate_and_map(struct
> mm_struct *mm,
> >             svm_range_lock(prange);
> >
> >             /* Free backing memory of hmm_range if it was initialized
> > -            * Overrride return value to TRY AGAIN only if prior returns
> > +            * Override return value to TRY AGAIN only if prior returns
> >              * were successful
> >              */
> >             if (range && !amdgpu_hmm_range_valid(range) && !r) { @@ -1769,7
> > +1770,8 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
> >                     r = -EAGAIN;
> >             }
> >             /* Free the hmm range */
> > -           amdgpu_hmm_range_free(range);
> > +           if (range)
>
> Can just check if(!r) here and remove "range=NULL" above? if r is not 0 range 
> has
> been freed, if r is 0 free range here.


But there are later spots where r becomes non-zero after get_pages() succeeded 
and range is valid:
svm_range_dma_map(...) can fail → sets r != 0.
!amdgpu_hmm_range_valid(range) → you set r = -EAGAIN.
!list_empty(&prange->child_list) → you set r = -EAGAIN.
In all three cases, your new tail logic if (!r) amdgpu_hmm_range_free(range); 
skips freeing because r != 0 now. Since you also removed range = NULL; after 
the early free, nothing else frees it → memory leak.

Regards,
Srini

>
> Regards
>
> Xiaogang
>
> > +                   amdgpu_hmm_range_free(range);
> >
> >
> >             if (!r && !list_empty(&prange->child_list)) {

Reply via email to