[AMD Official Use Only - AMD Internal Distribution Only]

I can confirm that during world switch the entire gfx block (including gfx,
compute and sdma for gfx10+) is switched together.

Regards
Shaoyun.liu

-----Original Message-----
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Alex Deucher
Sent: Friday, September 5, 2025 9:32 AM
To: Christian König <ckoenig.leichtzumer...@gmail.com>
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; 
amd-gfx@lists.freedesktop.org; timur.kris...@gmail.com
Subject: Re: [PATCH 2/2] drm/amdgpu: reject gang submissions under SRIOV

On Fri, Sep 5, 2025 at 8:47 AM Christian König 
<ckoenig.leichtzumer...@gmail.com> wrote:
>
> Gang submission means that the kernel driver guarantees that multiple
> submissions are executed on the HW at the same time on different engines.
>
> Background is that those submissions then depend on each other and
> none of them can finish on its own.
>
> SRIOV now uses world switch to preempt submissions on the engines to
> allow sharing the HW resources between multiple VFs.
>
> The problem is now that the SRIOV world switch can't know about such
> interdependencies and will cause a timeout if it waits for a
> partially running gang submission.
>
> To conclude, SRIOV and gang submissions are fundamentally incompatible
> at the moment. For now just disable them.

Are you sure about this?  Thinking about this more, most gang submissions are 
between gfx and compute.  The entire GC block (gfx, compute, and sdma on 
gfx10+) gets preempted on world switch so all of the active queues would be 
preempted.  Everything gets resumed when the VF gets switched back.  VCN/JPEG 
gets switched independently so that could be a problem if you have a gang with 
VCN and GC, but I think all gangs within GC should in theory be ok.
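
To make the distinction concrete, here is a minimal standalone sketch (not
driver code) of a narrower check: under SRIOV, reject only gangs that span
more than the GC preemption domain. Every name below (engine_class,
engine_in_gc_domain, gang_allowed) is hypothetical and does not exist in
amdgpu. Treating pre-gfx10 sdma as outside the GC domain is an assumption
read from the "sdma on gfx10+" wording above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical engine classes; these are not the amdgpu IP/ring enums. */
enum engine_class {
	ENGINE_GFX,
	ENGINE_COMPUTE,
	ENGINE_SDMA,
	ENGINE_VCN,
	ENGINE_JPEG,
};

/*
 * Assumption taken from this thread: gfx, compute and (on gfx10+) sdma
 * are preempted and resumed together on world switch, while VCN/JPEG
 * are switched independently.
 */
static bool engine_in_gc_domain(enum engine_class e, bool gfx10_plus)
{
	switch (e) {
	case ENGINE_GFX:
	case ENGINE_COMPUTE:
		return true;
	case ENGINE_SDMA:
		return gfx10_plus;
	default:
		return false;
	}
}

/* Returns true if a gang over the given engines could be accepted. */
static bool gang_allowed(const enum engine_class *engines, size_t count,
			 bool is_sriov_vf, bool gfx10_plus)
{
	size_t i;

	assert(engines != NULL || count == 0);

	if (count == 0)
		return false;	/* mirrors the existing !gang_size rejection */
	if (!is_sriov_vf || count == 1)
		return true;	/* bare metal or a single job: nothing to co-schedule */

	/* Under SRIOV every member must be world-switched as one unit. */
	for (i = 0; i < count; i++)
		if (!engine_in_gc_domain(engines[i], gfx10_plus))
			return false;

	return true;
}
```

With this sketch a gfx + compute gang on a VF is accepted and a gfx + VCN
gang is rejected. The patch as posted rejects both.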

Alex

>
> Signed-off-by: Christian König <christian.koe...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 2ac9729e4c86..434a551365c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -286,7 +286,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
>                 }
>         }
>
> -       if (!p->gang_size) {
> +       if (!p->gang_size || (amdgpu_sriov_vf(p->adev) && p->gang_size > 1)) {
>                 ret = -EINVAL;
>                 goto free_all_kdata;
>         }
> --
> 2.43.0
>
