On Fri, 2025-08-08 at 15:04 -0400, Alex Deucher wrote: > On Fri, Aug 8, 2025 at 4:22 AM Timur Kristóf > <timur.kris...@gmail.com> wrote: > > > > On Mon, 2025-08-04 at 13:31 -0400, Alex Deucher wrote: > > > On Mon, Aug 4, 2025 at 12:16 PM Timur Kristóf > > > <timur.kris...@gmail.com> wrote: > > > > > > > > On Mon, 2025-08-04 at 11:32 -0400, Alex Deucher wrote: > > > > > On Mon, Aug 4, 2025 at 9:42 AM Timur Kristóf > > > > > <timur.kris...@gmail.com> wrote: > > > > > > > > > > > > The si_upload_smc_data function uses > > > > > > si_write_smc_soft_register > > > > > > to set some register values in the SMC, and expects the > > > > > > result > > > > > > to be PPSMC_Result_OK which is 1. > > > > > > > > > > > > The PPSMC_Result_OK / PPSMC_Result_Failed values are used > > > > > > for > > > > > > checking the result of a command sent to the SMC. > > > > > > > > > > > > However, the si_write_smc_soft_register actually doesn't > > > > > > send > > > > > > any commands to the SMC and returns zero on success, > > > > > > so this check was incorrect. > > > > > > > > > > > > Fix that by correctly interpreting zero as success. > > > > > > This seems to fix an SMC hang that happens in > > > > > > si_set_sw_state. > > > > > > > > > > > > Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") > > > > > > Signed-off-by: Timur Kristóf <timur.kris...@gmail.com> > > > > > > --- > > > > > > drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 31 > > > > > > +++++++++++++- > > > > > > ---- > > > > > > ---- > > > > > > 1 file changed, 19 insertions(+), 12 deletions(-) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > > index 33b9d4beec84..e9f034ade214 100644 > > > > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > > @@ -5820,6 +5820,7 @@ static int si_upload_smc_data(struct > > > > > > amdgpu_device *adev) > > > > > > { > > > > > > struct amdgpu_crtc *amdgpu_crtc = NULL; > > > > > > int i; > > > > > > + int ret; > > > > > > > > > > > > if (adev->pm.dpm.new_active_crtc_count == 0) > > > > > > return 0; > > > > > > @@ -5837,20 +5838,26 @@ static int > > > > > > si_upload_smc_data(struct > > > > > > amdgpu_device *adev) > > > > > > if (amdgpu_crtc->line_time <= 0) > > > > > > return 0; > > > > > > > > > > > > - if (si_write_smc_soft_register(adev, > > > > > > - > > > > > > SI_SMC_SOFT_REGISTER_crtc_index, > > > > > > - amdgpu_crtc- > > > > > > >crtc_id) != > > > > > > PPSMC_Result_OK) > > > > > > - return 0; > > > > > > + ret = si_write_smc_soft_register( > > > > > > + adev, > > > > > > + SI_SMC_SOFT_REGISTER_crtc_index, > > > > > > + amdgpu_crtc->crtc_id); > > > > > > + if (ret) > > > > > > + return ret; > > > > > > > > > > > > - if (si_write_smc_soft_register(adev, > > > > > > - > > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min, > > > > > > - amdgpu_crtc->wm_high > > > > > > / > > > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK) > > > > > > - return 0; > > > > > > + ret = si_write_smc_soft_register( > > > > > > + adev, > > > > > > + > > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min, > > > > > > + amdgpu_crtc->wm_high / amdgpu_crtc- > > > > > > >line_time); > > > > > > + if (ret) > > > > > > + return ret; > > > > > > > > > > > > - if (si_write_smc_soft_register(adev, > > > > > > - > > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max, > > > > > > - amdgpu_crtc->wm_low > > > > > > / > > > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK) > > > > > > - return 0; > > > > > > + ret = si_write_smc_soft_register( > > > > > > + adev, > > > > > > + > > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max, > > > > > > + amdgpu_crtc->wm_low / amdgpu_crtc- > > > > > > >line_time); > > > > > > + if (ret) > > > > > > + return ret; > > > > > > > > > > This patch changes the behavior of this function (i.e., it > > > > > always > > > > > returns 0 before this patch). Not sure if that matters or > > > > > not. > > > > > I > > > > > think this could be simplified to something like the > > > > > following to > > > > > retain the current behavior. > > > > > > > > Actually now that I think of it more, I think it may be > > > > entirely > > > > unnecessary to check the return value. > > > > > > > > si_upload_smc_data calls: > > > > si_write_smc_soft_register > > > > amdgpu_si_write_smc_sram_dword > > > > si_set_smc_sram_address > > > > > > > > This last one, si_set_smc_sram_address returns -EINVAL when its > > > > smc_address parameter is not dword-aligned or out of bounds. > > > > Otherwise > > > > all of the above functions return 0 (success). Considering that > > > > all > > > > of > > > > the addresses passed by si_upload_smc_data are compile time > > > > constants, > > > > we know they are correct so there is no reason why any of those > > > > functions would return an error. > > > > > > > > Looking at other callers of si_write_smc_soft_register, I see > > > > that > > > > they > > > > don't check the return value at all. > > > > > > > > So, I'd actually simplify this even more and just not check the > > > > return > > > > values. What do you think about that? > > > > > > Sure. Works for me. > > > > > > Alex > > > > Alex, before I send a new version of this series, can you please > > clarify what these registers are and verify that the actual > > programming > > of these SMC registers is correct? > > This code was based on what the windows code did.
Makes sense. Let's assume that it's correct then. > > > > > The reason I ask is because due the the bug being fixed by these > > patch, > > these registers were never actually written, which makes me wonder > > if > > the value we program them to is actually correct. > > > > I mean the values that we program these registers to: > > > > SI_SMC_SOFT_REGISTER_crtc_index - we just program the index of the > > first active CRTC, seems straightforward enough, but it's unclear > > what > > the SMC uses this for. Why does the SMC care which crtc we use? > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min - programmed to the > > high > > display watermark divided by the line time. But I can't find any > > information about what this information represents or what the SMC > > uses > > it for. Judging by the name it has to do with mclk switching? > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max - same concern as > > _min. > > For mclk switching, the mclk has to be changed during the display > blanking period to avoid display artifacts. This is presumably part > of that, but I don't remember exactly what all of these do anymore. > > Alex Thank you. I will make the changes that we discussed above and send a second version of this series next week. > > > > > Thanks, > > Timur > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > index 52e732be59e36..3dd0115aa15f8 100644 > > > > > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c > > > > > @@ -5836,17 +5836,17 @@ static int si_upload_smc_data(struct > > > > > amdgpu_device *adev) > > > > > > > > > > if (si_write_smc_soft_register(adev, > > > > > > > > > > SI_SMC_SOFT_REGISTER_crtc_index, > > > > > - amdgpu_crtc->crtc_id) > > > > > != > > > > > PPSMC_Result_OK) > > > > > + amdgpu_crtc->crtc_id)) > > > > > return 0; > > > > > > > > > > if (si_write_smc_soft_register(adev, > > > > > > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_min, > > > > > - amdgpu_crtc->wm_high / > > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK) > > > > > + amdgpu_crtc->wm_high / > > > > > amdgpu_crtc->line_time)) > > > > > return 0; > > > > > > > > > > if (si_write_smc_soft_register(adev, > > > > > > > > > > SI_SMC_SOFT_REGISTER_mclk_change_block_cp_max, > > > > > - amdgpu_crtc->wm_low / > > > > > amdgpu_crtc->line_time) != PPSMC_Result_OK) > > > > > + amdgpu_crtc->wm_low / > > > > > amdgpu_crtc->line_time)) > > > > > return 0; > > > > > > > > > > return 0; > > > > > > > > > > > > > > > > > > > > > > return 0; > > > > > > } > > > > > > -- > > > > > > 2.50.1 > > > > > >