[Public] Hi Alex and Lijo,
Revised the patch based on your comments. Please review [PATCH] drm/amdgpu: Put GPU CG/PG ungate in device_fini_hw and device_halt Best Regards, Yifan -----Original Message----- From: Lazar, Lijo <[email protected]> Sent: Saturday, January 31, 2026 2:05 PM To: Alex Deucher <[email protected]> Cc: Deucher, Alexander <[email protected]>; [email protected]; Zhang, Yifan <[email protected]> Subject: Re: [PATCH 01/12] drm/amdgpu/gfx9: handle gfxoff in interrupt set functions On 30-Jan-26 9:19 PM, Alex Deucher wrote: > On Fri, Jan 30, 2026 at 10:05 AM Lazar, Lijo <[email protected]> wrote: >> >> >> >> On 30-Jan-26 8:18 PM, Alex Deucher wrote: >>> On Tue, Jan 27, 2026 at 12:02 AM Lazar, Lijo <[email protected]> wrote: >>>> >>>> >>>> >>>> On 27-Jan-26 1:37 AM, Alex Deucher wrote: >>>>> Need to make sure gfxoff is disallowed when we touch GC registers >>>>> over MMIO. >>>>> >>>> >>>> I think interrupt enable/disable sequence is only supposed to be >>>> done under ip power/clock ungate sequence like in hw >>>> init/resume/suspend sequences. The fix probably should be in the >>>> higher level sequence which doesn't take care of that. >>> >>> In that case, Yifan's original patch is probably fine as is. >>> Someone should still double check all of the call paths though. >>> >> >> Original one is also not correct. For example, if this is happening >> after reset re initialization, only that sequence needs to be >> modified to keep the affected IPs ungated during reinit. > > It also gets called in amdgpu_device_fini_hw(). > Checked now. What I meant is - we should call ip cg/pg ungate insid amdgpu_device_fini_hw() to protect all other accesses within that sequence and not restricted to irq_disable/enable all. Now that happens a while later in amdgpu_device_ip_fini_early. Basically, the high level sequence needs IP pg ungate protection. Thanks, Lijo > Alex > >> >> Thanks, >> Lijo >> >>> Alex >>> >>>> >>>> Thanks, >>>> Lijo >>>> >>>>> Cc: Yifan Zhang <[email protected]> >>>>> Signed-off-by: Alex Deucher <[email protected]> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 ++++++++++ >>>>> 1 file changed, 10 insertions(+) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c >>>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c >>>>> index 36f0300a21bfa..05178ee8e0e3a 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c >>>>> @@ -6056,6 +6056,7 @@ static int gfx_v9_0_set_priv_reg_fault_state(struct >>>>> amdgpu_device *adev, >>>>> u32 cp_int_cntl_reg, cp_int_cntl; >>>>> int i, j; >>>>> >>>>> + amdgpu_gfx_off_ctrl(adev, false); >>>>> switch (state) { >>>>> case AMDGPU_IRQ_STATE_DISABLE: >>>>> case AMDGPU_IRQ_STATE_ENABLE: >>>>> @@ -6080,6 +6081,7 @@ static int gfx_v9_0_set_priv_reg_fault_state(struct >>>>> amdgpu_device *adev, >>>>> default: >>>>> break; >>>>> } >>>>> + amdgpu_gfx_off_ctrl(adev, true); >>>>> >>>>> return 0; >>>>> } >>>>> @@ -6092,6 +6094,7 @@ static int gfx_v9_0_set_bad_op_fault_state(struct >>>>> amdgpu_device *adev, >>>>> u32 cp_int_cntl_reg, cp_int_cntl; >>>>> int i, j; >>>>> >>>>> + amdgpu_gfx_off_ctrl(adev, false); >>>>> switch (state) { >>>>> case AMDGPU_IRQ_STATE_DISABLE: >>>>> case AMDGPU_IRQ_STATE_ENABLE: >>>>> @@ -6116,6 +6119,7 @@ static int gfx_v9_0_set_bad_op_fault_state(struct >>>>> amdgpu_device *adev, >>>>> default: >>>>> break; >>>>> } >>>>> + amdgpu_gfx_off_ctrl(adev, true); >>>>> >>>>> return 0; >>>>> } >>>>> @@ -6125,6 +6129,7 @@ static int >>>>> gfx_v9_0_set_priv_inst_fault_state(struct amdgpu_device *adev, >>>>> unsigned type, >>>>> enum amdgpu_interrupt_state >>>>> state) >>>>> { >>>>> + amdgpu_gfx_off_ctrl(adev, false); >>>>> switch (state) { >>>>> case AMDGPU_IRQ_STATE_DISABLE: >>>>> case AMDGPU_IRQ_STATE_ENABLE: >>>>> @@ -6135,6 +6140,7 @@ static int >>>>> gfx_v9_0_set_priv_inst_fault_state(struct amdgpu_device *adev, >>>>> default: >>>>> break; >>>>> } >>>>> + amdgpu_gfx_off_ctrl(adev, true); >>>>> >>>>> return 0; >>>>> } >>>>> @@ -6152,6 +6158,7 @@ static int gfx_v9_0_set_cp_ecc_error_state(struct >>>>> amdgpu_device *adev, >>>>> unsigned type, >>>>> enum amdgpu_interrupt_state >>>>> state) >>>>> { >>>>> + amdgpu_gfx_off_ctrl(adev, false); >>>>> switch (state) { >>>>> case AMDGPU_IRQ_STATE_DISABLE: >>>>> WREG32_FIELD15(GC, 0, CP_INT_CNTL_RING0, @@ >>>>> -6173,6 +6180,7 @@ static int gfx_v9_0_set_cp_ecc_error_state(struct >>>>> amdgpu_device *adev, >>>>> default: >>>>> break; >>>>> } >>>>> + amdgpu_gfx_off_ctrl(adev, true); >>>>> >>>>> return 0; >>>>> } >>>>> @@ -6183,6 +6191,7 @@ static int gfx_v9_0_set_eop_interrupt_state(struct >>>>> amdgpu_device *adev, >>>>> unsigned type, >>>>> enum amdgpu_interrupt_state >>>>> state) >>>>> { >>>>> + amdgpu_gfx_off_ctrl(adev, false); >>>>> switch (type) { >>>>> case AMDGPU_CP_IRQ_GFX_ME0_PIPE0_EOP: >>>>> gfx_v9_0_set_gfx_eop_interrupt_state(adev, state); >>>>> @@ -6214,6 +6223,7 @@ static int gfx_v9_0_set_eop_interrupt_state(struct >>>>> amdgpu_device *adev, >>>>> default: >>>>> break; >>>>> } >>>>> + amdgpu_gfx_off_ctrl(adev, true); >>>>> return 0; >>>>> } >>>>> >>>> >>
