[PATCH] drm/amdkfd: Check debug trap enable before write dbg_ev_file

2024-05-06 Thread Lin . Cao
In interrupt context, the write to dbg_ev_file is deferred to a work queue. The write can therefore execute after debug_trap_disable, which causes a NULL pointer access. v2: cancel the work "debug_event_workarea" before setting dbg_ev_file to NULL. Signed-off-by: Lin.Cao ---
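The ordering described in v2 (cancel the deferred work first, then clear the pointer) can be modelled in user space. This is a minimal single-threaded sketch, not the driver code: struct dbg_state_sketch, queue_event(), trap_disable() and run_pending() are hypothetical names, and the work_pending flag stands in for the kernel work item, which a driver would typically cancel with cancel_work_sync().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the debug state; not the amdkfd structures. */
struct dbg_state_sketch {
    bool work_pending;        /* a deferred event write is queued */
    const char *dbg_ev_file;  /* NULL once the debug trap is disabled */
    int writes;               /* number of event writes performed */
};

/* Interrupt path: only queue the write, do not perform it here. */
void queue_event(struct dbg_state_sketch *s)
{
    s->work_pending = true;
}

/* Teardown: cancel the queued work first, then clear the pointer. */
void trap_disable(struct dbg_state_sketch *s)
{
    s->work_pending = false;
    s->dbg_ev_file = NULL;
}

/* Worker: runs the queued write if any; returns true if it wrote. */
bool run_pending(struct dbg_state_sketch *s)
{
    if (!s->work_pending)
        return false;
    s->work_pending = false;
    if (s->dbg_ev_file == NULL)  /* defensive check, as in the v1 idea */
        return false;
    s->writes++;
    return true;
}
```

With this ordering, a worker that runs after trap_disable() finds nothing queued and never touches the cleared pointer.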

[PATCH] drm/amdkfd: Check debug trap enable before write dbg_ev_file

2024-04-23 Thread Lin . Cao
In interrupt context, the write to dbg_ev_file is deferred to a work queue. The write can therefore execute after debug_trap_disable, which causes a NULL pointer access. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)

[PATCH] drm/amd/pm set pp_dpm_*clk as read only for SRIOV one VF mode

2024-03-15 Thread Lin . Cao
pp_dpm_*clk should be read only in SRIOV one VF mode: remove the S_IWUGO flag and the _store function of these debugfs entries in one VF mode. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git

[PATCH v2] drm/amdgpu doorbell range should be set when gpu recovery

2023-10-30 Thread Lin . Cao
GFX doorbell range should be set after FLR; otherwise the GFX doorbell range will overlap with MEC. v2: remove the "amdgpu_sriov_vf" and "amdgpu_in_reset" checks, and add grbm select for the case of 2 gfx rings. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +++ 1 file

[PATCH] drm/amdgpu set doorbell range when gpu recovery in sriov environment

2023-10-27 Thread Lin . Cao
GFX doorbell range should be set after FLR; otherwise the GFX doorbell range will overlap with MEC. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c

[PATCH] drm/amd check num of link levels when update pcie param

2023-10-19 Thread Lin . Cao
In an SR-IOV environment, the value of pcie_table->num_of_link_levels will be 0, and num_of_levels - 1 will cause an array index out of bounds. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git
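The failure mode is an unsigned "count - 1" used as an index when the count is 0. A minimal user-space sketch of the guard, assuming a hypothetical table layout: struct pcie_table_sketch, MAX_LINK_LEVELS and highest_link_gen() are illustrative names, not the smu_v13_0 code.

```c
#include <stdint.h>

#define MAX_LINK_LEVELS 3  /* hypothetical table capacity */

/* Hypothetical stand-in for the driver's PCIe link-level table. */
struct pcie_table_sketch {
    uint32_t num_of_link_levels;
    uint8_t pcie_gen[MAX_LINK_LEVELS];
};

/*
 * Stores the gen of the highest populated level in *gen_out and
 * returns 0. Returns -1 without touching *gen_out when the table is
 * empty (the SR-IOV case above) or the count exceeds the capacity.
 */
int highest_link_gen(const struct pcie_table_sketch *t, uint8_t *gen_out)
{
    if (t->num_of_link_levels == 0 ||
        t->num_of_link_levels > MAX_LINK_LEVELS)
        return -1;

    /* Safe: the count is at least 1, so count - 1 cannot wrap. */
    *gen_out = t->pcie_gen[t->num_of_link_levels - 1];
    return 0;
}
```

Without the early return, a count of 0 would wrap to UINT32_MAX and index far outside the array.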

[PATCH] drm/amdgpu remove restriction of sriov max_pfn on Vega10

2023-10-17 Thread Lin . Cao
Remove the restriction on sriov max_pfn so that TBA and TMA can move to the high 47-bit address range. Regression test: change the range alloc flag of libdrm to AMDGPU_VA_RANGE_HIGH; no FLR occurs when running amdgpu_test of drm. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7

[PATCH] drm/amdgpu: save VCN instances init info before jpeg init

2023-10-10 Thread Lin . Cao
JPEG init header will overwrite the VCN init header info, which loses some debug information. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c

[PATCH] drm/amdgpu: Return -EINVAL when MMSCH init status incorrect

2023-10-08 Thread Lin . Cao
Return -EINVAL when MMSCH init fails, so that the failure can be handled correctly by the function amdgpu_device_reset_sriov. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c

[PATCH] drm/amdkfd: update struct pm4_mes_runlist. Struct pm4_mes_runlist in amdgpu conflicts with the spec; add the last dword of the spec's design to struct pm4_mes_runlist

2023-09-06 Thread Lin . Cao
Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h index 8b6b2bd5c148..ed937f70895c 100644 ---

[PATCH] SWDEV-420310 - struct pm4_mes_runlist in amdgpu conflicts with spec. struct pm4_mes_runlist differs from the mes pm4 packet nv10 spec. Modification: add last dword of the design of spec into

2023-09-05 Thread Lin . Cao
Signed-off-by: Lin.Cao Change-Id: I1322c010d1428b2c1df5080b72da94e90cf17fec --- drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 12 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h

[PATCH] drm/amdgpu: Fix vram recover doesn't work after whole GPU reset

2023-05-04 Thread Lin . Cao
v1: vmbo->shadow is used to back up the vram bo when vram is lost, so we should set shadow to vmbo->shadow to recover vmbo->bo. v2: change "if (vmbo->shadow) shadow = vmbo->shadow" to "if (!vmbo->shadow) continue;". Fix: 'commit e18aaea733da ("drm/amdgpu: move shadow_list to amdgpu_bo_vm")' Signed-off-by:
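The v2 form of the check skips entries that have no shadow copy instead of conditionally assigning one. A minimal user-space sketch of that loop shape, with hypothetical types: struct bo_sketch, struct vmbo_sketch and count_recoverable() are illustrative names, not the amdgpu_device.c code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the buffer object types. */
struct bo_sketch {
    int id;
};

struct vmbo_sketch {
    struct bo_sketch bo;       /* the vram buffer to be recovered */
    struct bo_sketch *shadow;  /* backup copy, NULL if none exists */
};

/*
 * Walks the list and counts the entries that could be restored from
 * their shadow. Entries without a shadow are skipped up front, so the
 * rest of the loop body never sees a NULL shadow pointer.
 */
size_t count_recoverable(const struct vmbo_sketch *list, size_t n)
{
    size_t restored = 0;

    for (size_t i = 0; i < n; i++) {
        if (!list[i].shadow)
            continue;
        restored++;  /* a real driver would schedule the restore here */
    }
    return restored;
}
```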

[PATCH] drm/amdgpu: Recover vram from vmbo->shadow rather than vmbo->bo

2023-04-26 Thread Lin . Cao
vmbo->shadow is used to back up the vram bo when vram is lost, so we should set shadow to vmbo->shadow to recover vmbo->bo. Fix: 'commit e18aaea733da ("drm/amdgpu: move shadow_list to amdgpu_bo_vm")' Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++- 1 file

[PATCH] drm/amdgpu: Call trace info was found in dmesg when loading amdgpu

2022-07-13 Thread lin cao
In the case of SRIOV, the register smnMp1_PMI_3_FIFO returns an invalid value, which causes a "shift out of bound". On Ubuntu 22.04, this issue is detected and a related call trace is reported in dmesg. Signed-off-by: lin cao --- drivers/gpu/drm/amd/pm/s
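In C, shifting a 32-bit value by 32 or more is undefined behaviour, which is what a shift-out-of-bounds report flags. A minimal user-space sketch of validating a register-derived shift count before use; size_from_shift() is a hypothetical helper, and returning 0 for an invalid field is an assumption of this sketch, not the behaviour of the patch.

```c
#include <stdint.h>

/*
 * Returns 1 << shift for a shift count read from hardware, or 0 when
 * the count is out of range for a 32-bit value (for example when the
 * register read returned an invalid value).
 */
uint32_t size_from_shift(uint32_t shift)
{
    if (shift >= 32)
        return 0;  /* invalid field: refuse to shift */
    return UINT32_C(1) << shift;
}
```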