Reviewed-by: Iago Toral Quiroga <[email protected]>
On Tue, 2026-06-02 at 14:50 -0300, Maíra Canal wrote:
> A compute shader dispatch encodes its workgroup counts in the CFG0..CFG2
> registers. Kicking off a dispatch with a zero count in any of the three
> dimensions is invalid. First, the hardware will process 0 as 65536, while
> the user-space driver exposes a maximum of 65535. Moreover, a submission
> with a zeroed workgroup dimension should be a no-op.
>
> These zeroed counts can reach the dispatch path through an indirect CSD
> job, whose workgroup counts are only known once the indirect buffer is
> read and may legitimately be zero, but such a scenario should only result
> in a no-op.
>
> Overwrite the indirect CSD job workgroup counts with the indirect BO
> ones, even if they are zeroed, and don't submit the job to the hardware
> when any of the workgroup counts is zero, so the job completes
> immediately instead of running the shader.
>
> Cc: [email protected]
> Fixes: d223f98f0209 ("drm/v3d: Add support for compute shader dispatch.")
> Suggested-by: Jose Maria Casanova Crespo <[email protected]>
> Signed-off-by: Maíra Canal <[email protected]>
> ---
> drivers/gpu/drm/v3d/v3d_sched.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 47f83936cd73..8a635a9ec046 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -352,6 +352,16 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
>  		return NULL;
>  	}
>  
> +	/* The HW interprets a workgroup size of 0 as 65536; however, the
> +	 * user-space driver exposes a maximum of 65535. Therefore, a 0 in
> +	 * any dimension means that we have no workgroups and the compute
> +	 * shader should not be dispatched.
> +	 */
> +	if (!V3D_GET_FIELD(job->args.cfg[0], V3D_CSD_QUEUED_CFG0_NUM_WGS_X) ||
> +	    !V3D_GET_FIELD(job->args.cfg[1], V3D_CSD_QUEUED_CFG1_NUM_WGS_Y) ||
> +	    !V3D_GET_FIELD(job->args.cfg[2], V3D_CSD_QUEUED_CFG2_NUM_WGS_Z))
> +		return NULL;
> +
>  	v3d->queue[V3D_CSD].active_job = &job->base;
>  
>  	v3d_invalidate_caches(v3d);
> @@ -402,13 +412,13 @@ v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job)
>  
>  	wg_counts = (uint32_t *)(bo->vaddr + indirect_csd->offset);
>  
> -	if (wg_counts[0] == 0 || wg_counts[1] == 0 || wg_counts[2] == 0)
> -		goto unmap_bo;
> -
>  	args->cfg[0] = wg_counts[0] << V3D_CSD_CFG012_WG_COUNT_SHIFT;
>  	args->cfg[1] = wg_counts[1] << V3D_CSD_CFG012_WG_COUNT_SHIFT;
>  	args->cfg[2] = wg_counts[2] << V3D_CSD_CFG012_WG_COUNT_SHIFT;
>  
> +	if (wg_counts[0] == 0 || wg_counts[1] == 0 || wg_counts[2] == 0)
> +		goto unmap_bo;
> +
>  	num_batches = DIV_ROUND_UP(indirect_csd->wg_size, 16) *
>  		      (wg_counts[0] * wg_counts[1] * wg_counts[2]);
>
>
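
For anyone following the thread, the ordering that the two hunks rely on
can be sketched outside the kernel as below. This is only an illustration,
not the driver code: struct csd_args, wg_count(), rewrite_wg_counts() and
csd_should_dispatch() are hypothetical names, and the field layout (a
16-bit count in bits 31:16 of each CFG word) is an assumption made for the
sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the workgroup count sits in bits 31:16 of each CFG word. */
#define WG_COUNT_SHIFT 16u
#define WG_COUNT_MASK  0xffff0000u

/* Hypothetical stand-in for the cfg[] part of the CSD submit arguments. */
struct csd_args {
	uint32_t cfg[3];
};

/* Extract the workgroup count field from one CFG word. */
static uint32_t wg_count(uint32_t cfg)
{
	return (cfg & WG_COUNT_MASK) >> WG_COUNT_SHIFT;
}

/*
 * Copy the counts read from the indirect buffer into the job arguments.
 * The copy is unconditional: zeroed counts are written too, so that the
 * later dispatch check sees them instead of stale non-zero values.
 */
static void rewrite_wg_counts(struct csd_args *args, const uint32_t wg_counts[3])
{
	assert(args && wg_counts);

	for (int i = 0; i < 3; i++)
		args->cfg[i] = wg_counts[i] << WG_COUNT_SHIFT;
}

/* A job with a zero count in any dimension must not reach the hardware. */
static bool csd_should_dispatch(const struct csd_args *args)
{
	return wg_count(args->cfg[0]) &&
	       wg_count(args->cfg[1]) &&
	       wg_count(args->cfg[2]);
}
```

With counts of {4, 0, 2} read from the indirect buffer, rewrite_wg_counts()
leaves cfg[1] zeroed and csd_should_dispatch() returns false, which
corresponds to the early return in v3d_csd_job_run(). With the old ordering,
the stale counts from submission time would have stayed in cfg[] and the
shader would have run anyway.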