On 06-Jun-26 1:30 AM, Alex Deucher wrote:
On Fri, Jun 5, 2026 at 11:59 AM Asad Kamal <[email protected]> wrote:

The cleaner shader sysfs path allocates a 16-dword (64 byte) IB but
incorrectly fills (align_mask + 1) dwords. On GFX rings align_mask is
0xff, so the loop writes 256 dwords into a 64-byte buffer, causing a
kernel page fault.
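
The arithmetic behind the overflow can be sketched in plain userspace C. This is only an illustration of the numbers quoted above, not driver code: the helper names old_fill_dwords() and overflow_bytes() are hypothetical, and a dword is assumed to be 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dwords written by the old loop, i = 0..align_mask. */
static inline size_t old_fill_dwords(uint32_t align_mask)
{
	return (size_t)align_mask + 1;
}

/* Hypothetical helper: bytes written past an IB that holds ib_dwords dwords. */
static inline size_t overflow_bytes(uint32_t align_mask, size_t ib_dwords)
{
	size_t written = old_fill_dwords(align_mask);

	if (written <= ib_dwords)
		return 0;
	return (written - ib_dwords) * sizeof(uint32_t);
}
```

With align_mask = 0xff and a 16-dword IB, old_fill_dwords() gives 256 and overflow_bytes() gives 960, i.e. 1024 bytes written into a 64-byte allocation.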

It would be better to set the job alloc size to
ring->funcs->align_mask + 1.  The whole point of the align mask is to
align to the hardware's fetch boundary.


Hi Alex,

This is for the IB packet. Is this a restriction from the FW? For 9.4.3, the CP team mentioned that the hardware doesn't have any such restriction.

As a side note (not related to IB), within the primary queue, the default RPTR_BLOCK_SIZE is 64 DWs, which is the block-size granularity for RPTR updates.

Thanks,
Lijo

Alex


The IB only needs to be a minimal NOP shell to schedule the job; the
cleaner shader itself is emitted on the ring via emit_cleaner_shader().
Fill 16 dwords to match the allocation.
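
A rough userspace model of the fixed fill, for illustration only. The names struct ib_model, IB_DWORDS, fill32() and fill_nop_ib() are hypothetical stand-ins (fill32() plays the role of the kernel's memset32()), and the NOP value is supplied by the caller rather than taken from any real ring.

```c
#include <stddef.h>
#include <stdint.h>

#define IB_DWORDS 16	/* assumed IB allocation: 16 dwords = 64 bytes */

/* Hypothetical stand-in for the ptr/length_dw fields of an IB. */
struct ib_model {
	uint32_t ptr[IB_DWORDS];
	uint32_t length_dw;
};

/* Userspace stand-in for memset32(): store value into count dwords. */
static void fill32(uint32_t *dst, uint32_t value, size_t count)
{
	for (size_t i = 0; i < count; i++)
		dst[i] = value;
}

/* Fill exactly as many dwords as were allocated, independent of align_mask. */
static void fill_nop_ib(struct ib_model *ib, uint32_t nop)
{
	fill32(ib->ptr, nop, IB_DWORDS);
	ib->length_dw = IB_DWORDS;
}
```

The point of the sketch is that the fill count and length_dw both come from the allocation size, so the write can never exceed the buffer regardless of the ring's align_mask.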

Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader")

Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Asad Kamal <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 +++----
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index ff5a55f5f3c9..f2c536929446 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -1694,7 +1694,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
         struct amdgpu_job *job;
         struct amdgpu_ib *ib;
         void *owner;
-       int i, r;
+       int r;

         /* Initialize the scheduler entity */
         r = drm_sched_entity_init(&entity, DRM_SCHED_PRIORITY_NORMAL,
@@ -1722,9 +1722,8 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
         job->run_cleaner_shader = true;

         ib = &job->ibs[0];
-       for (i = 0; i <= ring->funcs->align_mask; ++i)
-               ib->ptr[i] = ring->funcs->nop;
-       ib->length_dw = ring->funcs->align_mask + 1;
+       memset32(ib->ptr, ring->funcs->nop, 16);
+       ib->length_dw = 16;

         f = amdgpu_job_submit(job);

--
2.46.0

