Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider: - [Medium] Unconditional 100ms delay added to all normal S3 resumes due to a missing suspend-abort check. - [Medium] Flawed condition or contradictory documentation for detecting suspend abort. --
commit f65d16d9bd092a9093fda22e85afb7762ebbe6a5 Author: Jakob Linke <[email protected]> drm/amdgpu/soc24: reset dGPU if suspend got aborted For SOC24 ASICs (RDNA4 / Navi 4x dGPUs) re-enabling PM features fails if an S3 suspend got aborted. Apply the same workaround as soc15 and soc21 to detect the aborted-suspend state and reset the device before re-init. > diff --git a/drivers/gpu/drm/amd/amdgpu/soc24.c > b/drivers/gpu/drm/amd/amdgpu/soc24.c > index 265db9331d0bb..9dce30d2bb8d3 100644 > --- a/drivers/gpu/drm/amd/amdgpu/soc24.c > +++ b/drivers/gpu/drm/amd/amdgpu/soc24.c [ ... ] > @@ -496,8 +496,36 @@ static int soc24_common_suspend(struct amdgpu_ip_block > *ip_block) > return soc24_common_hw_fini(ip_block); > } > > +static bool soc24_need_reset_on_resume(struct amdgpu_device *adev) > +{ > + u32 sol_reg1, sol_reg2; > + > + /* Will reset for the following suspend abort cases. > + * 1) Only reset dGPU side. > + * 2) S3 suspend got aborted and TOS is active. > + * As for dGPU suspend abort cases the SOL value > + * will be kept as zero at this resume point. > + */ > + if (!(adev->flags & AMD_IS_APU) && adev->in_s3) { > + sol_reg1 = RREG32_SOC15(MP0, 0, regMPASP_SMN_C2PMSG_81); > + msleep(100); [Severity: Medium] Does this add an unconditional 100ms delay to all S3 resumes? Since adev->in_s3 is true during all S3 resumes, not just aborted ones, this path seems to execute and sleep every time the system normally wakes up. Could this use a fast-path check, like skipping the sleep if sol_reg1 is non-zero, or perhaps checking !pm_resume_via_firmware() similar to what the soc15 workaround does? > + sol_reg2 = RREG32_SOC15(MP0, 0, regMPASP_SMN_C2PMSG_81); > + > + return (sol_reg1 != sol_reg2); [Severity: Medium] Is the block comment above accurate about the SOL value behavior? The comment states that for dGPU suspend abort cases, the SOL value will be kept as zero at this resume point. If both sol_reg1 and sol_reg2 read as zero, the condition (0 != 0) will evaluate to false, and the reset will not be triggered. If the SOL registers stay zero during an abort, does this logic actually fail to detect the abort case entirely? > + } > + > + return false; > +} -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
