AMD General

Greetings @Alex Deucher

Thanks for the ACK.  Waiting for the Reviewed-by:



> -----Original Message-----
> From: Alex Deucher <[email protected]>
> Sent: Thursday, May 28, 2026 3:06 PM
> To: Martin, Andrew <[email protected]>
> Cc: [email protected]
> Subject: Re: [PATCH v1] drm/amdkfd: Fix buffer overflow in SDMA queue
> checkpoint/restore on GFX11
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On Thu, May 28, 2026 at 1:34 PM Andrew Martin <[email protected]>
> wrote:
> >
> > The v11 MQD manager incorrectly assigned the CP-compute variants of
> > checkpoint_mqd/restore_mqd for KFD_MQD_TYPE_SDMA queues. These
> > functions use sizeof(struct v11_compute_mqd) (2048 bytes) instead of
> > sizeof(struct
> > v11_sdma_mqd) (512 bytes), causing a 1536-byte overflow.
> >
> > During CRIU checkpoint of an SDMA queue on Navi3x:
> > - checkpoint_mqd() reads 2048 bytes from a 512-byte SDMA MQD buffer,
> >   leaking 1536 bytes of adjacent GTT memory to userspace
> >
> > During CRIU restore:
> > - restore_mqd() writes 2048 bytes into a 512-byte SDMA MQD buffer,
> >   corrupting 1536 bytes of adjacent GTT memory (often the ring buffer
> >   or neighboring MQDs)
> >
> > This is a copy-paste regression unique to v11. All other ASIC backends
> > (cik, vi, v9, v10, v12) correctly use the SDMA-specific variants.
> >
> > Add checkpoint_mqd_sdma() and restore_mqd_sdma() functions that
> > properly handle the smaller v11_sdma_mqd structure, matching the
> > pattern used in other MQD managers.
> >
> > Fixes: cc009e613de6 ("drm/amdkfd: Add KFD support for soc21 v3")
> > Assisted-by: Claude:Sonnet 4-5
> > Signed-off-by: Andrew Martin <[email protected]>
>
> Acked-by: Alex Deucher <[email protected]>
>
> > ---
> >  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c  | 40
> > ++++++++++++++++++-
> >  1 file changed, 38 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > index 4d8cf6008a77..ce0f5e8e5c29 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > @@ -355,6 +355,42 @@ static void restore_mqd(struct mqd_manager *mm,
> void **mqd,
> >         qp->is_active = 0;
> >  }
> >
> > +static void checkpoint_mqd_sdma(struct mqd_manager *mm,
> > +                               void *mqd,
> > +                               void *mqd_dst,
> > +                               void *ctl_stack_dst) {
> > +       struct v11_sdma_mqd *m;
> > +
> > +       m = get_sdma_mqd(mqd);
> > +
> > +       memcpy(mqd_dst, m, sizeof(struct v11_sdma_mqd)); }
> > +
> > +static void restore_mqd_sdma(struct mqd_manager *mm, void **mqd,
> > +                            struct kfd_mem_obj *mqd_mem_obj, uint64_t 
> > *gart_addr,
> > +                            struct queue_properties *qp,
> > +                            const void *mqd_src,
> > +                            const void *ctl_stack_src,
> > +                            const u32 ctl_stack_size) {
> > +       uint64_t addr;
> > +       struct v11_sdma_mqd *m;
> > +
> > +       m = (struct v11_sdma_mqd *) mqd_mem_obj->cpu_ptr;
> > +       addr = mqd_mem_obj->gpu_addr;
> > +
> > +       memcpy(m, mqd_src, sizeof(*m));
> > +
> > +       m->sdmax_rlcx_doorbell_offset =
> > +               qp->doorbell_off <<
> > + SDMA0_QUEUE0_DOORBELL_OFFSET__OFFSET__SHIFT;
> > +
> > +       *mqd = m;
> > +       if (gart_addr)
> > +               *gart_addr = addr;
> > +
> > +       qp->is_active = 0;
> > +}
> >
> >  static void init_mqd_hiq(struct mqd_manager *mm, void **mqd,
> >                         struct kfd_mem_obj *mqd_mem_obj, uint64_t
> > *gart_addr, @@ -539,8 +575,8 @@ struct mqd_manager
> *mqd_manager_init_v11(enum KFD_MQD_TYPE type,
> >                 mqd->update_mqd = update_mqd_sdma;
> >                 mqd->destroy_mqd = kfd_destroy_mqd_sdma;
> >                 mqd->is_occupied = kfd_is_occupied_sdma;
> > -               mqd->checkpoint_mqd = checkpoint_mqd;
> > -               mqd->restore_mqd = restore_mqd;
> > +               mqd->checkpoint_mqd = checkpoint_mqd_sdma;
> > +               mqd->restore_mqd = restore_mqd_sdma;
> >                 mqd->mqd_size = sizeof(struct v11_sdma_mqd);
> >                 mqd->mqd_stride = kfd_mqd_stride;  #if
> > defined(CONFIG_DEBUG_FS)
> > --
> > 2.43.0
> >

Reply via email to