RE: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2
[AMD Public Use] Thanks Guchun & Jiansong. Yes, I had same concern as Jiansong. BR Evan -Original Message- From: Chen, Jiansong (Simon) Sent: Thursday, March 18, 2021 5:36 PM To: Chen, Guchun ; Quan, Evan ; amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Quan, Evan Subject: RE: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2 We still need reserve "return 0", otherwise may trigger warning "not all control paths return a value". Regards, Jiansong -Original Message- From: amd-gfx On Behalf Of Chen, Guchun Sent: Thursday, March 18, 2021 5:28 PM To: Quan, Evan ; amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Quan, Evan Subject: RE: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2 [AMD Public Use] One comment inline. Other than this, the patch is: Reviewed-by: Guchun Chen Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: Thursday, March 18, 2021 5:21 PM To: amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Quan, Evan Subject: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2 The RLC was put into a wrong state on runtime suspend. Thus the RLC autoload will fail on the succeeding runtime resume. By adding an intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved, designed for PnP), we can bring RLC back into the desired state. V2: integrate INTERRUPTS_ENABLED flag clearing into current mp1 state set routines Change-Id: I4eb3d5d76068412a6ab228af7fe7f794e53c1eaa Signed-off-by: Evan Quan Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 -- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 7 + drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 28 +++ .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 1 + .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 25 + .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 14 ++ .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 14 ++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 28 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 3 ++ 9 files changed, 102 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 809f4cb738f4..ab6f4059630c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -2160,9 +2160,12 @@ static int psp_load_smu_fw(struct psp_context *psp) if (!ucode->fw || amdgpu_sriov_vf(psp->adev)) return 0; - - if (amdgpu_in_reset(adev) && ras && ras->supported && - adev->asic_type == CHIP_ARCTURUS) { + if ((amdgpu_in_reset(adev) && +ras && ras->supported && +adev->asic_type == CHIP_ARCTURUS) || +(adev->in_runpm && + adev->asic_type >= CHIP_NAVI10 && + adev->asic_type <= CHIP_NAVI12)) { ret = amdgpu_dpm_set_mp1_state(adev, PP_MP1_STATE_UNLOAD); if (ret) { DRM_WARN("Failed to set MP1 state prepare for reload\n"); diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 629a988b069d..21c3c149614c 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -836,6 +836,13 @@ struct pptable_funcs { */ int (*check_fw_status)(struct smu_context *smu); + /** +* @set_mp1_state: put SMU into a correct state for comming +* resume from runpm or gpu reset. +*/ + int (*set_mp1_state)(struct smu_context *smu, +enum pp_mp1_state mp1_state); + /** * @setup_pptable: Initialize the power play table and populate it with * default values. diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index bae9016fedea..470ca4e5d4d7 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1927,36 +1927,16 @@ int smu_set_mp1_state(void *handle, enum pp_mp1_state mp1_state) { struct smu_context *smu = handle; - uint16_t msg; - int ret; + int ret = 0; if (!smu->pm_enabled) return -EOPNOTSUPP; mutex_lock(>mutex); - switch (mp1_state) { - case PP_MP1_STATE_SHUTDOWN: - msg = SMU_MSG_PrepareMp1ForShutdown; - break; - case PP_MP1_STATE_UNLOAD: - msg = SMU_MSG_PrepareMp1ForUnload; - break; - case PP_MP1_STATE_RESET: - msg = SMU_MSG_PrepareMp1ForReset; - break; - case PP_MP1_STATE_NONE: - default: - mutex_unlock(>mutex)
RE: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2
We still need reserve "return 0", otherwise may trigger warning "not all control paths return a value". Regards, Jiansong -Original Message- From: amd-gfx On Behalf Of Chen, Guchun Sent: Thursday, March 18, 2021 5:28 PM To: Quan, Evan ; amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Quan, Evan Subject: RE: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2 [AMD Public Use] One comment inline. Other than this, the patch is: Reviewed-by: Guchun Chen Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: Thursday, March 18, 2021 5:21 PM To: amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Quan, Evan Subject: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2 The RLC was put into a wrong state on runtime suspend. Thus the RLC autoload will fail on the succeeding runtime resume. By adding an intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved, designed for PnP), we can bring RLC back into the desired state. V2: integrate INTERRUPTS_ENABLED flag clearing into current mp1 state set routines Change-Id: I4eb3d5d76068412a6ab228af7fe7f794e53c1eaa Signed-off-by: Evan Quan Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 -- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 7 + drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 28 +++ .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 1 + .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 25 + .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 14 ++ .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 14 ++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 28 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 3 ++ 9 files changed, 102 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 809f4cb738f4..ab6f4059630c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -2160,9 +2160,12 @@ static int psp_load_smu_fw(struct psp_context *psp) if (!ucode->fw || amdgpu_sriov_vf(psp->adev)) return 0; - - if (amdgpu_in_reset(adev) && ras && ras->supported && - adev->asic_type == CHIP_ARCTURUS) { + if ((amdgpu_in_reset(adev) && +ras && ras->supported && +adev->asic_type == CHIP_ARCTURUS) || +(adev->in_runpm && + adev->asic_type >= CHIP_NAVI10 && + adev->asic_type <= CHIP_NAVI12)) { ret = amdgpu_dpm_set_mp1_state(adev, PP_MP1_STATE_UNLOAD); if (ret) { DRM_WARN("Failed to set MP1 state prepare for reload\n"); diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 629a988b069d..21c3c149614c 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -836,6 +836,13 @@ struct pptable_funcs { */ int (*check_fw_status)(struct smu_context *smu); + /** +* @set_mp1_state: put SMU into a correct state for comming +* resume from runpm or gpu reset. +*/ + int (*set_mp1_state)(struct smu_context *smu, +enum pp_mp1_state mp1_state); + /** * @setup_pptable: Initialize the power play table and populate it with * default values. diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index bae9016fedea..470ca4e5d4d7 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1927,36 +1927,16 @@ int smu_set_mp1_state(void *handle, enum pp_mp1_state mp1_state) { struct smu_context *smu = handle; - uint16_t msg; - int ret; + int ret = 0; if (!smu->pm_enabled) return -EOPNOTSUPP; mutex_lock(>mutex); - switch (mp1_state) { - case PP_MP1_STATE_SHUTDOWN: - msg = SMU_MSG_PrepareMp1ForShutdown; - break; - case PP_MP1_STATE_UNLOAD: - msg = SMU_MSG_PrepareMp1ForUnload; - break; - case PP_MP1_STATE_RESET: - msg = SMU_MSG_PrepareMp1ForReset; - break; - case PP_MP1_STATE_NONE: - default: - mutex_unlock(>mutex); - return 0; - } - - ret = smu_send_smc_msg(smu, msg, NULL); - /* some asics may not support those messages */ - if (ret == -EINVAL) - ret = 0; - if (ret) - dev_err(smu->adev->dev, "[PrepareMp1] Failed!\n"); + if (smu->ppt_funcs && + smu->ppt_
RE: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2
[AMD Public Use] One comment inline. Other than this, the patch is: Reviewed-by: Guchun Chen Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: Thursday, March 18, 2021 5:21 PM To: amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Quan, Evan Subject: [PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2 The RLC was put into a wrong state on runtime suspend. Thus the RLC autoload will fail on the succeeding runtime resume. By adding an intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved, designed for PnP), we can bring RLC back into the desired state. V2: integrate INTERRUPTS_ENABLED flag clearing into current mp1 state set routines Change-Id: I4eb3d5d76068412a6ab228af7fe7f794e53c1eaa Signed-off-by: Evan Quan Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 -- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 7 + drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 28 +++ .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 1 + .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 25 + .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 14 ++ .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 14 ++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 28 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 3 ++ 9 files changed, 102 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 809f4cb738f4..ab6f4059630c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -2160,9 +2160,12 @@ static int psp_load_smu_fw(struct psp_context *psp) if (!ucode->fw || amdgpu_sriov_vf(psp->adev)) return 0; - - if (amdgpu_in_reset(adev) && ras && ras->supported && - adev->asic_type == CHIP_ARCTURUS) { + if ((amdgpu_in_reset(adev) && +ras && ras->supported && +adev->asic_type == CHIP_ARCTURUS) || +(adev->in_runpm && + adev->asic_type >= CHIP_NAVI10 && + adev->asic_type <= CHIP_NAVI12)) { ret = amdgpu_dpm_set_mp1_state(adev, PP_MP1_STATE_UNLOAD); if (ret) { DRM_WARN("Failed to set MP1 state prepare for reload\n"); diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 629a988b069d..21c3c149614c 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -836,6 +836,13 @@ struct pptable_funcs { */ int (*check_fw_status)(struct smu_context *smu); + /** +* @set_mp1_state: put SMU into a correct state for comming +* resume from runpm or gpu reset. +*/ + int (*set_mp1_state)(struct smu_context *smu, +enum pp_mp1_state mp1_state); + /** * @setup_pptable: Initialize the power play table and populate it with * default values. diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index bae9016fedea..470ca4e5d4d7 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1927,36 +1927,16 @@ int smu_set_mp1_state(void *handle, enum pp_mp1_state mp1_state) { struct smu_context *smu = handle; - uint16_t msg; - int ret; + int ret = 0; if (!smu->pm_enabled) return -EOPNOTSUPP; mutex_lock(>mutex); - switch (mp1_state) { - case PP_MP1_STATE_SHUTDOWN: - msg = SMU_MSG_PrepareMp1ForShutdown; - break; - case PP_MP1_STATE_UNLOAD: - msg = SMU_MSG_PrepareMp1ForUnload; - break; - case PP_MP1_STATE_RESET: - msg = SMU_MSG_PrepareMp1ForReset; - break; - case PP_MP1_STATE_NONE: - default: - mutex_unlock(>mutex); - return 0; - } - - ret = smu_send_smc_msg(smu, msg, NULL); - /* some asics may not support those messages */ - if (ret == -EINVAL) - ret = 0; - if (ret) - dev_err(smu->adev->dev, "[PrepareMp1] Failed!\n"); + if (smu->ppt_funcs && + smu->ppt_funcs->set_mp1_state) + ret = smu->ppt_funcs->set_mp1_state(smu, mp1_state); mutex_unlock(>mutex); diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c index 69aa60a2e8b7..e033aa6c7d9b 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu
[PATCH] drm/amd/pm: fix Navi1x runtime resume failure V2
The RLC was put into a wrong state on runtime suspend. Thus the RLC autoload will fail on the succeeding runtime resume. By adding an intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved, designed for PnP), we can bring RLC back into the desired state. V2: integrate INTERRUPTS_ENABLED flag clearing into current mp1 state set routines Change-Id: I4eb3d5d76068412a6ab228af7fe7f794e53c1eaa Signed-off-by: Evan Quan Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 -- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 7 + drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 28 +++ .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 1 + .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 25 + .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 14 ++ .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 14 ++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 28 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 3 ++ 9 files changed, 102 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 809f4cb738f4..ab6f4059630c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -2160,9 +2160,12 @@ static int psp_load_smu_fw(struct psp_context *psp) if (!ucode->fw || amdgpu_sriov_vf(psp->adev)) return 0; - - if (amdgpu_in_reset(adev) && ras && ras->supported && - adev->asic_type == CHIP_ARCTURUS) { + if ((amdgpu_in_reset(adev) && +ras && ras->supported && +adev->asic_type == CHIP_ARCTURUS) || +(adev->in_runpm && + adev->asic_type >= CHIP_NAVI10 && + adev->asic_type <= CHIP_NAVI12)) { ret = amdgpu_dpm_set_mp1_state(adev, PP_MP1_STATE_UNLOAD); if (ret) { DRM_WARN("Failed to set MP1 state prepare for reload\n"); diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 629a988b069d..21c3c149614c 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -836,6 +836,13 @@ struct pptable_funcs { */ int (*check_fw_status)(struct smu_context *smu); + /** +* @set_mp1_state: put SMU into a correct state for comming +* resume from runpm or gpu reset. +*/ + int (*set_mp1_state)(struct smu_context *smu, +enum pp_mp1_state mp1_state); + /** * @setup_pptable: Initialize the power play table and populate it with * default values. diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index bae9016fedea..470ca4e5d4d7 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1927,36 +1927,16 @@ int smu_set_mp1_state(void *handle, enum pp_mp1_state mp1_state) { struct smu_context *smu = handle; - uint16_t msg; - int ret; + int ret = 0; if (!smu->pm_enabled) return -EOPNOTSUPP; mutex_lock(>mutex); - switch (mp1_state) { - case PP_MP1_STATE_SHUTDOWN: - msg = SMU_MSG_PrepareMp1ForShutdown; - break; - case PP_MP1_STATE_UNLOAD: - msg = SMU_MSG_PrepareMp1ForUnload; - break; - case PP_MP1_STATE_RESET: - msg = SMU_MSG_PrepareMp1ForReset; - break; - case PP_MP1_STATE_NONE: - default: - mutex_unlock(>mutex); - return 0; - } - - ret = smu_send_smc_msg(smu, msg, NULL); - /* some asics may not support those messages */ - if (ret == -EINVAL) - ret = 0; - if (ret) - dev_err(smu->adev->dev, "[PrepareMp1] Failed!\n"); + if (smu->ppt_funcs && + smu->ppt_funcs->set_mp1_state) + ret = smu->ppt_funcs->set_mp1_state(smu, mp1_state); mutex_unlock(>mutex); diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c index 69aa60a2e8b7..e033aa6c7d9b 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c @@ -2367,6 +2367,7 @@ static const struct pptable_funcs arcturus_ppt_funcs = { .get_fan_parameters = arcturus_get_fan_parameters, .interrupt_work = smu_v11_0_interrupt_work, .set_light_sbr = smu_v11_0_set_light_sbr, + .set_mp1_state = smu_cmn_set_mp1_state, }; void arcturus_set_ppt_funcs(struct smu_context *smu) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index 88634b44f8ff..bd95d41fe7a9 100644 ---