> -----Original Message-----
> From: Kamal, Asad <[email protected]>
> Sent: Friday, June 5, 2026 9:19 PM
> To: [email protected]
> Cc: Lazar, Lijo <[email protected]>; Zhang, Hawking
> <[email protected]>; Ma, Le <[email protected]>; Zhang, Morris
> <[email protected]>; Deucher, Alexander <[email protected]>;
> Wang, Yang(Kevin) <[email protected]>; Kamal, Asad
> <[email protected]>; SHANMUGAM, SRINIVASAN
> <[email protected]>
> Subject: [PATCH] drm/amdgpu/gfx: fix cleaner shader IB buffer overflow
>
> The cleaner shader sysfs path allocates a 16-dword (64-byte) IB but
> incorrectly fills (align_mask + 1) dwords. On GFX rings align_mask is
> 0xff, so the loop wrote 256 dwords into a 64-byte buffer, causing a
> kernel page fault.
>
> The IB only needs to be a minimal NOP shell to schedule the job; the cleaner
> shader itself is emitted on the ring via emit_cleaner_shader().
> Fill 16 dwords to match the allocation.
>
> Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader")
>
> Suggested-by: Lijo Lazar <[email protected]>
> Signed-off-by: Asad Kamal <[email protected]>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index ff5a55f5f3c9..f2c536929446 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -1694,7 +1694,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	struct amdgpu_job *job;
>  	struct amdgpu_ib *ib;
>  	void *owner;
> -	int i, r;
> +	int r;
>
>  	/* Initialize the scheduler entity */
>  	r = drm_sched_entity_init(&entity, DRM_SCHED_PRIORITY_NORMAL,
> @@ -1722,9 +1722,8 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	job->run_cleaner_shader = true;
>
>  	ib = &job->ibs[0];
> -	for (i = 0; i <= ring->funcs->align_mask; ++i)
> -		ib->ptr[i] = ring->funcs->nop;
> -	ib->length_dw = ring->funcs->align_mask + 1;
> +	memset32(ib->ptr, ring->funcs->nop, 16);
> +	ib->length_dw = 16;

The fix correctly limits the NOP fill to the allocated 64-byte (16-dword) IB
and avoids the overflow. As a minor cleanup, it may be worth defining constants
for the IB size and its dword count, to avoid magic numbers and keep the
allocation and fill sizes tied together. Something like:

+#define AMDGPU_CLEANER_SHADER_IB_SIZE	64
+#define AMDGPU_CLEANER_SHADER_IB_DW \
+	(AMDGPU_CLEANER_SHADER_IB_SIZE / sizeof(u32))
+
 static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 {
@@
-	r = amdgpu_job_alloc_with_ib(ring->adev, &entity, owner,
-				     64, 0, &job,
+	r = amdgpu_job_alloc_with_ib(ring->adev, &entity, owner,
+				     AMDGPU_CLEANER_SHADER_IB_SIZE, 0, &job,
 				     AMDGPU_KERNEL_JOB_ID_CLEANER_SHADER);
@@
-	memset32(ib->ptr, ring->funcs->nop, 16);
-	ib->length_dw = 16;
+	memset32(ib->ptr, ring->funcs->nop,
+		 AMDGPU_CLEANER_SHADER_IB_DW);
+	ib->length_dw = AMDGPU_CLEANER_SHADER_IB_DW;
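
For reference, the size arithmetic behind the overflow can be checked with a
small userspace sketch. This is not driver code: fill_dwords() is a stand-in
for the kernel's memset32(), the helper names and the NOP value used by callers
are hypothetical, and the macros only mirror the constants suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the suggested constants: 64-byte IB, expressed in dwords. */
#define CLEANER_SHADER_IB_SIZE 64u
#define CLEANER_SHADER_IB_DW (CLEANER_SHADER_IB_SIZE / sizeof(uint32_t))

/* Userspace stand-in for memset32(): store v into n consecutive dwords. */
static void fill_dwords(uint32_t *dst, uint32_t v, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = v;
}

/* Dwords written by the old loop, which ran i = 0 .. align_mask inclusive. */
static size_t old_fill_dwords(uint32_t align_mask)
{
	return (size_t)align_mask + 1;
}

/* Bytes the old loop wrote past the end of the IB (0 if it fit). */
static size_t old_fill_overrun_bytes(uint32_t align_mask)
{
	size_t bytes = old_fill_dwords(align_mask) * sizeof(uint32_t);

	return bytes > CLEANER_SHADER_IB_SIZE ?
	       bytes - CLEANER_SHADER_IB_SIZE : 0;
}

/* Fill exactly the allocated IB with NOPs and return its length in dwords. */
static size_t fill_nop_ib(uint32_t *ib, uint32_t nop)
{
	assert(ib != NULL);
	fill_dwords(ib, nop, CLEANER_SHADER_IB_DW);
	return CLEANER_SHADER_IB_DW;
}
```

With align_mask = 0xff the old loop covers 256 dwords (1024 bytes), which is
960 bytes past the 64-byte allocation. Deriving the dword count from the byte
size means the fill cannot drift from the allocation if either is changed later.
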
Thanks,
Srini