Currently there are two different behaviour modes when userspace tries to operate on not present HW IP blocks. On a machine without UVD, VCE and VPE blocks, this can be observed for example like this:
$ sudo ./amd_fuzzing --r cs-wait-fuzzing ... amd_cs_wait_fuzzing DRM_IOCTL_AMDGPU_CTX r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_GFX r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_COMPUTE r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_DMA r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_UVD r -1 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_VCE r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_UVD_ENC r -1 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_VCN_DEC r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_VCN_ENC r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_VCN_JPEG r 0 amd_cs_wait_fuzzing AMDGPU_WAIT_CS AMD_IP_VPE r 0 We can see that UVD returns an errno (-EINVAL) from the CS_WAIT ioctl, while VCE and VPE return unexpected successes. The difference stems from the fact the UVD is a load balancing engine which retains the context, so with a workaround implemented in amdgpu_ctx_init_entity(), but which does not account for the fact hardware block may not be present. This causes a single NULL scheduler to be passed to drm_sched_entity_init(), which immediately rejects this with -EINVAL. The not present VCE and VPE cases on the other hand pass zero schedulers to drm_sched_entity_init(), which is explicitly allowed and results in unusable entities. As the UVD case however shows, call paths can handle the errors, so we can consolidate this into a single path which will always return -EINVAL if the HW IP block is not present. We do this by rejecting it early and not calling drm_sched_entity_init() when there is no backing hardware. This also removes the need for the drm_sched_entity_init() to handle the zero schedulers and NULL scheduler cases, which means that we can follow up by removing the special casing from the DRM scheduler. Signed-off-by: Tvrtko Ursulin <[email protected]> References: f34e8bb7d6c6 ("drm/sched: fix null-ptr-deref in init entity") Cc: Christian König <[email protected]> --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index afedea02188d..a5f85ea9fbb6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -239,6 +239,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip, goto error_free_entity; } + if (num_scheds == 0) { + r = -EINVAL; + goto error_free_entity; + } + /* disable load balance if the hw engine retains context among dependent jobs */ if (hw_ip == AMDGPU_HW_IP_VCN_ENC || hw_ip == AMDGPU_HW_IP_VCN_DEC || -- 2.52.0
