[PATCH] drm/amd/amdgpu: consolidate PSP TA init shared buf functions

2021-08-23 Thread Candice Li
Change-Id: I779f4fb52ecc661c25c42ced487719f08f3d875d Signed-off-by: Candice Li Reviewed-by: John Clements --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 142 +++- 1 file changed, 43 insertions(+), 99 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b

[PATCH] drm/amd/amdgpu: add name field back to ras_common_if

2021-08-23 Thread Candice Li
Add the name field back to ras_common_if to work around an error injection failure with the amdgpuras tool. Change-Id: I9d181a4153b055e22ac6adeb3b51a521c8c2793b Signed-off-by: Candice Li Reviewed-by: John Clements --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 1 + 1 file changed, 1 insertion(+) diff

[PATCH] drm/amd: consolidate TA shared memory structures

2021-08-17 Thread Candice Li
Change-Id: I81be5a824fced3d2244cf209444c2391f6bc6c50 Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 218 +- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 68 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 4 +- .../gpu/drm/amd/amdgpu

[PATCH] drm/amd/amdgpu: consolidate PSP TA unload function

2021-08-27 Thread Candice Li
Create common PSP TA unload function and replace all common TA unloading sequences. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 160 ++-- 1 file changed, 40 insertions(+), 120 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b

[PATCH] drm/amd/amdgpu: add mpio to ras block

2021-08-29 Thread Candice Li
Add MPIO to RAS block Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 ++ drivers/gpu/drm/amd/amdgpu/ta_ras_if.h | 1 + 3 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu

[PATCH] drm/amdgpu: Create common PSP TA load function

2021-09-06 Thread Candice Li
Create a common PSP TA load function and update PSP ta_mem_context with size information. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 280 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 17 +- 2 files changed, 93 insertions(+), 204 deletions

[PATCH] drm/amdgpu: Unify PSP TA context

2021-09-10 Thread Candice Li
Remove all TA binary structures and add the specific binary structure in struct ta_context. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 23 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 122 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 23

[PATCH] drm/amdgpu: Conform ASD header/loading to generic TA systems

2021-09-13 Thread Candice Li
Update asd_context structure and add asd_initialize function to conform ASD header/loading to generic TA systems. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 60 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 10 ++--- 2 files changed, 26

[PATCH] drm/amdgpu: Update PSP TA unload function

2021-09-13 Thread Candice Li
Update PSP TA unload function to use PSP TA context as input argument. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init

2021-09-23 Thread Candice Li
All code paths under the EAGAIN path in RAS late init are unused. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 33 + drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 --- 2 files changed, 1 insertion(+), 35 deletions(-) diff --git a/drivers/gpu

[PATCH] drm/amdgpu: Update PSP TA Invoke to use common TA context as input

2021-09-24 Thread Candice Li
Update invoke to use the new common TA structure, similarly to load/unload. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Update TA version output in driver

2021-10-24 Thread Candice Li
TA version should only be displayed in firmware version column. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 12 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 4 ++-- drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Add recovery_lock to save bad pages function

2021-11-16 Thread Candice Li
Fix race condition failure during UMC UE injection. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index

[PATCH 1/3] drm/amdgpu: Add RREG64_PCIE_EXT/WREG64_PCIE_EXT functions

2023-09-04 Thread Candice Li
1. Add 64-bit register access support for registers whose address is wider than 32 bits. 2. Update RREG32_PCIE_EXT/WREG32_PCIE_EXT. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 11 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 119
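
The idea of a 64-bit access to a register whose address does not fit in 32 bits can be sketched as below. This is a minimal illustration and not the driver's implementation: the mock register space and every `mock_*` name are hypothetical, and placing the upper 32 data bits at address + 4 is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_SLOTS 8

/* Hypothetical sparse register space keyed by a 64-bit address. */
struct mock_reg {
	uint64_t addr;
	uint32_t val;
	int used;
};

static struct mock_reg mock_space[MOCK_SLOTS];

/* Find the slot for addr, claiming a free slot on first use. */
static struct mock_reg *mock_slot(uint64_t addr)
{
	struct mock_reg *free_slot = NULL;

	for (size_t i = 0; i < MOCK_SLOTS; i++) {
		if (mock_space[i].used && mock_space[i].addr == addr)
			return &mock_space[i];
		if (!mock_space[i].used && !free_slot)
			free_slot = &mock_space[i];
	}
	assert(free_slot != NULL); /* mock space exhausted */
	free_slot->used = 1;
	free_slot->addr = addr;
	free_slot->val = 0;
	return free_slot;
}

static uint32_t mock_rreg32(uint64_t addr)
{
	return mock_slot(addr)->val;
}

static void mock_wreg32(uint64_t addr, uint32_t val)
{
	mock_slot(addr)->val = val;
}

/* 64-bit read built from two 32-bit reads (assumed layout: low word first). */
static uint64_t mock_rreg64(uint64_t addr)
{
	uint64_t lo = mock_rreg32(addr);
	uint64_t hi = mock_rreg32(addr + 4);

	return lo | (hi << 32);
}

/* 64-bit write split into two 32-bit writes with the same layout. */
static void mock_wreg64(uint64_t addr, uint64_t val)
{
	mock_wreg32(addr, (uint32_t)(val & 0xffffffffu));
	mock_wreg32(addr + 4, (uint32_t)(val >> 32));
}
```

The address is carried as a `uint64_t` throughout, so an offset above 4 GiB is never truncated on the way to the access helpers.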

[PATCH 3/3] drm/amdgpu: Add umc v12_0 ras functions

2023-09-04 Thread Candice Li
Add umc v12_0 ras error querying. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/Makefile| 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 +- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 256 + drivers/gpu/drm/amd/amdgpu/umc_v12_0.h

[PATCH 2/3] drm/amd: Add umc v12_0_0 ip headers

2023-09-04 Thread Candice Li
Add umc v12_0_0 ip headers. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- .../include/asic_reg/umc/umc_12_0_0_offset.h | 33 +++ .../include/asic_reg/umc/umc_12_0_0_sh_mask.h | 95 +++ 2 files changed, 128 insertions(+) create mode 100644 drivers/gpu/drm/amd/include

[PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check

2023-10-27 Thread Candice Li
Drop checking deferred error which can be handled by poison consumption. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c

[PATCH] drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table

2023-10-25 Thread Candice Li
Retrieve correctable error count from ce_count_lo_chip instead of mca_umc_status. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_10.c b/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Identify data parity error corrected in replay mode

2023-10-25 Thread Candice Li
Use ErrorCodeExt field to identify data parity error in replay mode. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 32 ++ 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c

[PATCH] drm/amdgpu: Log UE corrected by replay as correctable error

2023-10-18 Thread Candice Li
Support replay mode where UE could be converted to CE. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c b/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Update RAS EEPROM support on smu v13_0_6.

2023-08-16 Thread Candice Li
RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu

[PATCH v2 2/2] drm/amdgpu: Add debugfs TA load/unload/invoke support

2022-04-20 Thread Candice Li
v1: Add debugfs support to load/unload/invoke TA at runtime. v2: 1. Update some variables to static. 2. Use PAGE_ALIGN to calculate shared buf size directly. 3. Remove fp check. 4. Update debugfs from read to write. Signed-off-by: John Clements Signed-off-by: Candice Li --- drivers/gpu/drm
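
For readers unfamiliar with page alignment, the rounding that a PAGE_ALIGN-style computation performs on a buffer size can be sketched as follows. The 4096-byte page size and the `sketch_` names are assumptions for illustration; the kernel takes its page size from the architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round len up to the next multiple of the page size (a power of two). */
static size_t sketch_page_align(size_t len)
{
	/* Guard against wrap-around when adding the rounding term. */
	assert(len <= (size_t)-1 - (SKETCH_PAGE_SIZE - 1));
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

A size that is already a multiple of the page size is returned unchanged, so the computed buffer never grows by a spurious extra page.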

[PATCH v2 1/2] drm/amdgpu: Use indirect buffer and save response status for TA load/invoke

2022-04-20 Thread Candice Li
The upcoming TA debugfs interface needs to use indirect buffer when performing TA invoke and check psp response status for TA load and invoke. Signed-off-by: John Clements Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 54 + drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Fix build warning for TA debugfs interface

2022-04-27 Thread Candice Li
Remove the redundant conditional group to fix build warning when CONFIG_DEBUG_FS is disabled. Reported-by: Randy Dunlap Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c b

[PATCH] drm/amdgpu: Resolve pcie_bif RAS recovery bug

2022-05-20 Thread Candice Li
Check shared buf instead of init flag for xgmi ta shared buf init during xgmi ta initialization. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm

[PATCH] drm/amdgpu: Resolve RAS GFX error count issue after cold boot on Arcturus

2022-06-01 Thread Candice Li
Adjust the sequence for ras late init and separate ras reset error status from query status. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 27 - 2 files changed, 26 insertions(+), 8 deletions

[PATCH] drm/amdgpu: Resolve RAS GFX error count issue v2

2022-06-01 Thread Candice Li
Fix misleading indentation Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 99c1a2d3dae84d..424990e1bec10c

[PATCH v2] drm/amdgpu: Resolve RAS GFX error count issue v2

2022-06-01 Thread Candice Li
Fix misleading indentation and add ras unsupported checking for gfx ras late init. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd

[PATCH 1/2] drm/amdgpu: Use indirect buffer and save response status for TA load/invoke

2022-04-17 Thread Candice Li
The upcoming TA debugfs interface needs to use indirect buffer when performing TA invoke and check psp response status for TA load and invoke. Signed-off-by: John Clements Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 60 +++-- drivers/gpu/drm/amd

[PATCH 2/2] drm/amdgpu: Add debugfs TA load/unload/invoke support

2022-04-17 Thread Candice Li
Add debugfs support to load/unload/invoke TA at runtime. Signed-off-by: John Clements Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 312

[PATCH v3] drm/amdgpu: Fix build warning for TA debugfs interface

2022-04-28 Thread Candice Li
Remove the redundant code to fix a build warning when CONFIG_DEBUG_FS is disabled. Reported-by: Randy Dunlap Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 40 -- drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.h | 1 - 2 files changed, 14 insertions

[PATCH v2] drm/amdgpu: Fix build warning for TA debugfs interface

2022-04-27 Thread Candice Li
Remove the redundant code to fix a build warning when CONFIG_DEBUG_FS is disabled. Reported-by: Randy Dunlap Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 43 ++ 1 file changed, 12 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Fix UBSAN shift-out-of-bounds for gfx v9_0

2022-08-24 Thread Candice Li
Check the shift number to avoid doing a shift operation when the number of bits shifted is equal to or greater than the number of bits in the operand. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu

[PATCH] drm/amdgpu: Rely on MCUMC_STATUS for umc v8_10 correctable error counter only

2022-09-07 Thread Candice Li
Only check MCUMC_STATUS for CE counter for umc v8_10. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_10.c b/drivers/gpu/drm/amd/amdgpu/umc_v8_10.c index

[PATCH] drm/amdgpu: Enable full reset when RAS is supported on gc v11_0_0

2022-09-07 Thread Candice Li
Enable full reset for RAS supported configuration on gc v11_0_0. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/soc21.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c index a26c5723c46e27..81f32d77c98cd5

[PATCH] drm/amdgpu: Add EEPROM I2C address support for ip discovery

2022-10-17 Thread Candice Li
1. Update EEPROM_I2C_MADDR_SMU_13_0_0 to EEPROM_I2C_MADDR_54H 2. Add EEPROM I2C address support for smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 20 +-- 1 file changed, 18 insertions(+), 2

[PATCH] drm/amdgpu: Update ras eeprom support for smu v13_0_0 and v13_0_10

2022-10-17 Thread Candice Li
Enable RAS EEPROM support for smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd

[PATCH 1/2] drm/amdgpu: Optimize RAS TA initialization and TA unload funcs

2022-10-25 Thread Candice Li
1. Save TA unload psp response status 2. Add RAS TA loading status check for initialization 3. Drop RAS context teardown to allow RAS TA to be reloaded Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff

[PATCH 2/2] drm/amdgpu: Optimize TA load/unload/invoke debugfs interfaces

2022-10-25 Thread Candice Li
1. Add a function pointer structure ta_funcs to psp context 2. Make the interfaces generic to all TAs 3. Leverage existing TA context and remove unused functions 4. Fix return code bugs Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 38 +--- drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Enable GFX RAS feature for gfx v11_0_3

2022-10-27 Thread Candice Li
v1: Support gfx ras feature enablement for gfx v11_0_3. v2: Update function name and error message. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/drivers/gpu/drm

[PATCH] drm/amdgpu: added support for ras driver loading

2022-09-09 Thread Candice Li
From: John Clements copy ras driver to psp if present Signed-off-by: John Clements --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 15 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 1 +

[PATCH] drm/amdgpu: Skip reset error status for psp v13_0_0

2022-09-09 Thread Candice Li
No need to reset error status since only umc ras is supported on psp v13_0_0. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Add EEPROM I2C address for smu v13_0_0

2022-09-09 Thread Candice Li
Set correct EEPROM I2C address for smu v13_0_0. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index

[PATCH v2] drm/amdgpu: Enable full reset when RAS is supported on gc v11_0_0

2022-09-08 Thread Candice Li
Enable full reset for RAS supported configuration on gc v11_0_0. v2: simplify the code. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/soc21.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c index

[PATCH 1/2] drm/amdgpu: Update umc v8_10_0 headers

2022-10-10 Thread Candice Li
Add GeccCtrl offset and mask to umc v8_10_0 headers. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_10_0_offset.h | 2 ++ drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_10_0_sh_mask.h | 3 +++ 2 files changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd

[PATCH 2/2] drm/amdgpu: Add poison mode query for umc v8_10_0

2022-10-10 Thread Candice Li
Add poison mode query support on umc v8_10_0. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_10.c b/drivers/gpu/drm/amd/amdgpu/umc_v8_10.c index

[PATCH v2] drm/amdgpu: Fix UBSAN shift-out-of-bounds for gfx v9_0

2022-08-15 Thread Candice Li
Check the shift number to avoid doing a shift operation when the number of bits shifted is equal to or greater than the number of bits in the operand. v2: Only calculate shift number for non-zero data and fix build warning. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 8
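
The class of bug described here comes from C leaving a shift by the operand width (or more) undefined. A minimal sketch of the guard, using hypothetical helper names that are not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Shift left only when the count is valid for a 32-bit operand;
 * counts of 32 or more yield 0 instead of undefined behaviour. */
static uint32_t safe_shl32(uint32_t value, unsigned int shift)
{
	if (shift >= 32)
		return 0;
	return value << shift;
}

/* Build a mask with the low `width` bits set without ever shifting by 32.
 * The zero case is handled first, so the shift count is only computed
 * for non-zero input. */
static uint32_t low_bits_mask32(unsigned int width)
{
	assert(width <= 32);
	if (width == 0)
		return 0;
	return UINT32_MAX >> (32 - width);
}
```

Checking the count before shifting keeps the result well defined on every compiler, which is also what silences the UBSAN report.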

[PATCH] drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup.

2022-08-17 Thread Candice Li
No need to set up rb when there are no gfx rings. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 7f187558220e9a..1d6d3a852a0b3d

[PATCH] drm/amd/pm: Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10

2023-01-12 Thread Candice Li
Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li Reviewed-by: Evan Quan --- .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 42 +-- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 6 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h

[PATCH 1/2] drm/amdgpu: Add df v4_3 headers

2022-12-14 Thread Candice Li
Add df v4_3 header files. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- .../amd/include/asic_reg/df/df_4_3_offset.h | 30 .../amd/include/asic_reg/df/df_4_3_sh_mask.h | 157 ++ 2 files changed, 187 insertions(+) create mode 100644 drivers/gpu/drm/amd

[PATCH 2/2] drm/amdgpu: Add poison mode query for df v4_3

2022-12-14 Thread Candice Li
Add poison mode query support on df v4_3. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/Makefile | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 4 ++ drivers/gpu/drm/amd/amdgpu/df_v4_3.c | 61 +++ drivers/gpu

[PATCH] drm/amd/pm: Enable bad memory page/channel recording support for smu v13_0_0

2022-11-18 Thread Candice Li
Send message to SMU to update bad memory page and bad channel info. Signed-off-by: Candice Li Reviewed-by: Evan Quan --- .../pm/swsmu/inc/pmfw_if/smu_v13_0_0_ppsmc.h | 8 +++- drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 4 +- .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 39

[PATCH] drm/amdgpu: Add psp_13_0_10_ta firmware to modinfo

2022-11-12 Thread Candice Li
TA firmware is loaded on psp v13_0_10, but it is missing from modinfo. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c

[PATCH] drm/amdgpu: Make umc_v8_10_convert_error_address static and remove unused variable

2023-02-23 Thread Candice Li
Fixes following warnings: warning: no previous prototype for 'umc_v8_10_convert_error_address' warning: variable 'channel_index' set but not used Reported-by: kernel test robot Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 15 +-- 1 file changed, 5

[PATCH 1/2] drm/amdgpu: Add convert_error_address function for umc v8_10

2023-02-21 Thread Candice Li
Add convert_error_address for umc v8_10. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 73 +++--- 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_10.c b/drivers/gpu/drm/amd

[PATCH 2/2] drm/amdgpu: Add ecc info query interface for umc v8_10

2023-02-21 Thread Candice Li
Support ecc info query for umc v8_10. v2: Simplified by convert_error_address. v3: Remove unused variable and invalid checking. Signed-off-by: Candice Li Reviewed-by: Tao Zhou Reviewed-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 134 + 1 file changed

[PATCH] drm/amdgpu: Support umc node harvest config on umc v8_10

2023-02-28 Thread Candice Li
No need to query error count and error address on harvested umc nodes. v2: Fix code bug, use active_mask instead of harvest_config and remove unnecessary argument in LOOP macro. v3: Leave adev->gmc.num_umc unchanged. Signed-off-by: Candice Li Reviewed-by: Tao Zhou --- drivers/gpu/drm/
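
Skipping inactive nodes with a bit mask while leaving the node count untouched can be sketched like this. The function name and the 32-node limit are assumptions for illustration, not the driver's LOOP macro.

```c
#include <assert.h>
#include <stdint.h>

/* Walk all node indices but act only on those whose bit is set in
 * active_mask; returns how many nodes were visited. */
static unsigned int count_active_nodes(uint32_t active_mask,
				       unsigned int num_nodes)
{
	unsigned int visited = 0;

	assert(num_nodes <= 32);
	for (unsigned int node = 0; node < num_nodes; node++) {
		if (!(active_mask & (UINT32_C(1) << node)))
			continue; /* inactive node: skip the error query */
		visited++;        /* a real driver would query this node here */
	}
	return visited;
}
```

Because the loop bound stays at the full node count, the indexing of the remaining nodes does not shift when some of them are masked out.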

[PATCH] drm/amd/pm: Enable ecc_info table support for smu v13_0_10

2023-02-28 Thread Candice Li
Support EccInfoTable which includes umc ras error count and error address. Signed-off-by: Candice Li Reviewed-by: Evan Quan --- .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 75 +++ 1 file changed, 75 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13

[PATCH] drm/amdgpu: Drop pcie_bif ras check from fatal error handler

2023-04-19 Thread Candice Li
Some ASICs support fatal error event but do not support pcie_bif ras. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c

[PATCH 2/2] drm/amdgpu: Add channel_dis_num to ras init flags

2023-06-13 Thread Candice Li
Add disabled channel number to ras init flags. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 + drivers/gpu/drm/amd/amdgpu/ta_ras_if.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b

[PATCH 1/2] drm/amdgpu: Update total channel number for umc v8_10

2023-06-13 Thread Candice Li
Update total channel number for umc v8_10. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 ++ drivers/gpu/drm/amd/amdgpu/umc_v8_10.h| 3 ++- 3 files changed, 5 insertions

[PATCH] drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta

2023-07-13 Thread Candice Li
Allow the initramfs generator to automatically include psp_13_0_6_ta firmware to initramfs. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c b/drivers/gpu

[PATCH] Align eccinfo table structure with smu v13_0_0 interface

2023-06-08 Thread Candice Li
Update eccinfo table structure according to smu v13_0_0 interface. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/umc_v8_10.h | 3 +++ drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion

[PATCH] drm/amdgpu: Extend poison mode check to SDMA/VCN/JPEG

2023-08-08 Thread Candice Li
Treat SDMA/VCN/JPEG as RAS capable IP blocks in poison mode. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index

[PATCH] drm/amdgpu: Add I2C EEPROM support on smu v13_0_6

2023-08-10 Thread Candice Li
Support I2C EEPROM on smu v13_0_6. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 4287743e121245

[PATCH v2] drm/amdgpu: Add I2C EEPROM support on smu v13_0_6

2023-08-10 Thread Candice Li
Support I2C EEPROM on smu v13_0_6. v2: Move IP_VERSION(13, 0, 6) ahead of IP_VERSION(13, 0, 10). Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm
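
The reordering in v2 follows numeric order of the packed version value. A small sketch, assuming a packing of major, minor and revision into successive bytes (the `SKETCH_` macro and the helper are illustrative stand-ins, not the kernel definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing: major in bits 16+, minor in bits 8-15, revision in bits 0-7. */
#define SKETCH_IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/* Hypothetical helper: case labels kept in ascending version order. */
static int sketch_is_listed_version(uint32_t ver)
{
	switch (ver) {
	case SKETCH_IP_VERSION(13, 0, 0):
	case SKETCH_IP_VERSION(13, 0, 6):
	case SKETCH_IP_VERSION(13, 0, 10):
		return 1;
	default:
		return 0;
	}
}
```

Under this packing, revision 6 compares lower than revision 10, so listing it first keeps the case labels sorted.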

[PATCH v2] drm/amd/pm: Align eccinfo table structure with smu v13_0_0 interface

2023-06-13 Thread Candice Li
Update eccinfo table structure according to smu v13_0_0 interface. v2: Calculate array size instead of using macro definition. Signed-off-by: Candice Li Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions
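
Deriving a loop bound from the array itself, rather than from a separate macro constant, can be sketched as follows. The table layout, the 24-entry size, and all `sketch_` names are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element count of a true array (not a pointer). */
#define SKETCH_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table type; field names are illustrative only. */
struct sketch_ecc_entry {
	uint64_t status;
	uint64_t addr;
};

struct sketch_ecc_table {
	struct sketch_ecc_entry entries[24];
};

/* Fill every entry, taking the bound from the array declaration so the
 * loop stays correct if the array length changes. Returns entries written. */
static size_t sketch_fill_entries(struct sketch_ecc_table *table,
				  uint64_t status)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < SKETCH_ARRAY_SIZE(table->entries); i++) {
		table->entries[i].status = status;
		table->entries[i].addr = 0;
	}
	return i;
}
```

The benefit is that the structure definition becomes the single source of truth for the length, so the loop and the interface cannot drift apart.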

[PATCH 1/2] drm/amdgpu: Log deferred error separately

2024-01-10 Thread Candice Li
Separate deferred error from UE and CE and log it individually. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c | 11 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 116 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 6 + drivers/gpu/drm

[PATCH 2/2] drm/amdgpu: Do bad page retirement for deferred errors

2024-01-10 Thread Candice Li
Bad page retirement is also needed for deferred errors. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c index

[PATCH] drm/amd/pm: Enable smu v13_0_6 eccinfo in firmware query mode

2024-01-09 Thread Candice Li
smu v13_0_6 eccinfo is supported in firmware query mode only. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Drop unnecessary sentences about CE and deferred error.

2024-01-03 Thread Candice Li
Remove "no user action is needed" for correctable and deferred error to avoid confusion. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +- drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 3 +-- drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 3 +-- drive

[PATCH] drm/amdgpu: Support poison error injection via ras_ctrl debugfs

2024-01-03 Thread Candice Li
Support poison error injection. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index caf00df669bf7e..5851c7a80a5a8c

[PATCH] drm/amdgpu: Update EEPROM I2C address for smu v13_0_0

2023-11-23 Thread Candice Li
Check smu v13_0_0 SKU type to select EEPROM I2C address. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd

[PATCH] drm/amd/pm: Retrieve UMC ODECC error count from aca bank

2024-02-02 Thread Candice Li
Retrieve the UMC ODECC error count from the aca bank instead of software managed counters. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13

[PATCH] drm/amdgpu: Update setting EEPROM table version

2024-03-18 Thread Candice Li
Use helper function instead of umc callback to set EEPROM table version. Signed-off-by: Candice Li --- .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 22 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 2 -- drivers/gpu/drm/amd/amdgpu/umc_v8_10.c| 6 - 3 files

[PATCH] drm/amdgpu: Update EEPROM RAS table for mismatched table version

2024-03-27 Thread Candice Li
Update table version and restore bad page records to EEPROM RAS table for mismatched table version case. Otherwise force to reset the table. Signed-off-by: Candice Li --- .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 88 --- 1 file changed, 78 insertions(+), 10 deletions