On 1/13/26 22:17, Alex Deucher wrote:
> On Tue, Jan 13, 2026 at 8:57 AM Christian König
> <[email protected]> wrote:
>>
>> Patches #1-#3: Reviewed-by: Christian König <[email protected]>
>>
>> Comment on patch #4 which also affects patches #5-#26.
> 
> What was your comment on patch 4?  I don't see that reply on the mailing list.

That we didn't use the job because we couldn't allocate memory while in the
GPU reset.

We could use GFP_ATOMIC when allocating from the GPU reset IB pool to solve 
this.
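
As a rough illustration of that idea only: the sketch below is a userspace toy model, not amdgpu code. The names `ib_pool`, `pool_alloc`, `pool_free`, and `alloc_mode` are made up for the example, and `ALLOC_ATOMIC` merely stands in for the kernel's GFP_ATOMIC. It shows a pre-reserved pool where an allocation that might sleep is refused during a reset, while a non-sleeping one either succeeds at once or fails fast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp flags; not the real kernel definitions. */
enum alloc_mode { ALLOC_MAY_SLEEP, ALLOC_ATOMIC };

#define POOL_SLOTS 4
#define SLOT_SIZE  256

/* Toy model of a pre-reserved IB pool used by the reset path. */
struct ib_pool {
    unsigned char mem[POOL_SLOTS][SLOT_SIZE];
    bool used[POOL_SLOTS];
    bool in_reset; /* set while a (simulated) GPU reset is running */
};

/* Returns a free slot, or NULL if the request cannot be served right now. */
static void *pool_alloc(struct ib_pool *p, size_t size, enum alloc_mode mode)
{
    if (size > SLOT_SIZE)
        return NULL;

    /* A sleeping allocation inside the reset path is what must be avoided. */
    if (p->in_reset && mode == ALLOC_MAY_SLEEP)
        return NULL;

    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!p->used[i]) {
            p->used[i] = true;
            return p->mem[i];
        }
    }

    /* Pool exhausted: fail immediately instead of waiting for a free slot. */
    return NULL;
}

/* Returns a slot to the pool; the pointer must come from pool_alloc(). */
static void pool_free(struct ib_pool *p, void *ptr)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (ptr == (void *)p->mem[i]) {
            assert(p->used[i]);
            p->used[i] = false;
            return;
        }
    }
    assert(!"pointer does not belong to this pool");
}
```

The caller in the reset path then has to handle a NULL return, since a non-sleeping allocation cannot wait for memory to become available.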

Christian.

> 
> Alex
> 
>>
>> Comment on patch #27 and #28. If #28 comes before #27, that would
>> potentially solve the issue with #27.
>>
>> Patch #31: Reviewed-by: Christian König <[email protected]>
>>
>> Patches #32-#40: that looks extremely questionable to me. I've intentionally
>> removed that state from the job because it isn't job dependent and sometimes
>> has inter-job meaning.
>>
>> Patch #41: Absolutely clear NAK! We have exercised that nonsense to the max
>> and I'm clearly against doing that over and over again. Saving the ring
>> content clearly seems to be the safer approach.
>>
>> Regards,
>> Christian.
>>
>> On 1/8/26 15:48, Alex Deucher wrote:
>>> This set contains a number of bug fixes and cleanups for
>>> IB handling that I worked on over the holidays.
>>>
>>> Patches 1-2:
>>> Simple bug fixes.
>>>
>>> Patches 3-26:
>>> Removes the direct submit path for IBs and requires
>>> that all IB submissions use a job structure.  This
>>> greatly simplifies the IB submission code.
>>>
>>> Patches 27-42:
>>> Split IB state setup and ring emission.  This keeps all
>>> of the IB state in the job.  This greatly simplifies
>>> re-emission of non-timed-out jobs after a ring reset and
>>> allows for re-emission multiple times if multiple resets
>>> happen in a row.  It also properly handles the dma fence
>>> error handling for timed-out jobs with adapter resets.
>>>
>>> Alex Deucher (42):
>>>   drm/amdgpu/jpeg4.0.3: remove redundant sr-iov check
>>>   drm/amdgpu: fix error handling in ib_schedule()
>>>   drm/amdgpu: add new job ids
>>>   drm/amdgpu/vpe: switch to using job for IBs
>>>   drm/amdgpu/gfx6: switch to using job for IBs
>>>   drm/amdgpu/gfx7: switch to using job for IBs
>>>   drm/amdgpu/gfx8: switch to using job for IBs
>>>   drm/amdgpu/gfx9: switch to using job for IBs
>>>   drm/amdgpu/gfx9.4.2: switch to using job for IBs
>>>   drm/amdgpu/gfx9.4.3: switch to using job for IBs
>>>   drm/amdgpu/gfx10: switch to using job for IBs
>>>   drm/amdgpu/gfx11: switch to using job for IBs
>>>   drm/amdgpu/gfx12: switch to using job for IBs
>>>   drm/amdgpu/gfx12.1: switch to using job for IBs
>>>   drm/amdgpu/si_dma: switch to using job for IBs
>>>   drm/amdgpu/cik_sdma: switch to using job for IBs
>>>   drm/amdgpu/sdma2.4: switch to using job for IBs
>>>   drm/amdgpu/sdma3: switch to using job for IBs
>>>   drm/amdgpu/sdma4: switch to using job for IBs
>>>   drm/amdgpu/sdma4.4.2: switch to using job for IBs
>>>   drm/amdgpu/sdma5: switch to using job for IBs
>>>   drm/amdgpu/sdma5.2: switch to using job for IBs
>>>   drm/amdgpu/sdma6: switch to using job for IBs
>>>   drm/amdgpu/sdma7: switch to using job for IBs
>>>   drm/amdgpu/sdma7.1: switch to using job for IBs
>>>   drm/amdgpu: require a job to schedule an IB
>>>   drm/amdgpu: mark fences with errors before ring reset
>>>   drm/amdgpu: rename amdgpu_fence_driver_guilty_force_completion()
>>>   drm/amdgpu: don't call drm_sched_stop/start() in asic reset
>>>   drm/amdgpu: drop drm_sched_increase_karma()
>>>   drm/amdgpu: plumb timedout fence through to force completion
>>>   drm/amdgpu: change function signature for emit_pipeline_sync()
>>>   drm/amdgpu: drop extra parameter for vm_flush
>>>   drm/amdgpu: move need_ctx_switch into amdgpu_job
>>>   drm/amdgpu: store vm flush state in amdgpu_job
>>>   drm/amdgpu: split fence init and emit logic
>>>   drm/amdgpu: split vm flush and vm flush emit logic
>>>   drm/amdgpu: split ib schedule and ib emit logic
>>>   drm/amdgpu: move drm sched stop/start into amdgpu_job_timedout()
>>>   drm/amdgpu: add an all_instance_rings_reset ring flag
>>>   drm/amdgpu: rework reset reemit handling
>>>   drm/amdgpu: simplify per queue reset code
>>>
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c  |   2 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   2 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  13 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c   | 136 +++------
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c      | 289 ++++++++++----------
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  40 ++-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |  13 +
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c    |  67 -----
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |  37 +--
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c    |   4 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c     |   2 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c     |  21 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 141 +++++-----
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c     |  45 +--
>>>  drivers/gpu/drm/amd/amdgpu/cik_sdma.c       |  36 ++-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c      |  41 ++-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c      |  41 ++-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c      |  41 ++-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v12_1.c      |  33 ++-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c       |  28 +-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c       |  30 +-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c       | 143 +++++-----
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c       | 149 +++++-----
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c     |  26 +-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c     |  38 +--
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c      |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c      |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c      |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c      |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c    |   6 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c    |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c    |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c    |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/jpeg_v5_3_0.c    |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c      |  43 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c      |  43 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c      |  43 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c    |  45 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c      |  46 ++--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c      |  45 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c      |  45 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c      |  45 +--
>>>  drivers/gpu/drm/amd/amdgpu/sdma_v7_1.c      |  45 +--
>>>  drivers/gpu/drm/amd/amdgpu/si_dma.c         |  34 ++-
>>>  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c       |   8 +-
>>>  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c       |   4 +-
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c       |   2 +
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c       |   2 +
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c       |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c     |   4 +-
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c     |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c     |   3 +-
>>>  drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c     |   4 +-
>>>  54 files changed, 952 insertions(+), 966 deletions(-)
>>>
>>
