[AMD Official Use Only - AMD Internal Distribution Only]

This is not a workaround; you have misunderstood the intent of this patch. All ASIC load sensors must be constrained to the 0-100 range. In other words, the KMD must not blindly trust the value returned by the firmware; it has to validate it first. Invalid values can arise from issues such as memory corruption, for example.
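To illustrate the intent, here is a minimal userspace sketch of the validation step, not the patch itself. The enum and the function name `sketch_sanitize_sensor` are hypothetical stand-ins for illustration only; the actual change uses `clamp_t` on the AMDGPU_PP_SENSOR_*_LOAD cases inside amdgpu_pm_get_sensor_generic().

```c
#include <stdint.h>

/* Hypothetical stand-ins for the sensor IDs; these are NOT the real
 * enum amd_pp_sensors values from the driver. */
enum sketch_sensor {
	SKETCH_SENSOR_GPU_LOAD,
	SKETCH_SENSOR_MEM_LOAD,
	SKETCH_SENSOR_VCN_LOAD,
	SKETCH_SENSOR_OTHER,	/* any sensor that is not a load percentage */
};

/* Return a value that is safe to report for the given sensor.
 * Load sensors are limited to 100; since the value is unsigned,
 * the lower bound of 0 always holds and only the upper bound
 * needs an explicit check. Other sensors pass through unchanged. */
static uint32_t sketch_sanitize_sensor(enum sketch_sensor sensor, uint32_t raw)
{
	switch (sensor) {
	case SKETCH_SENSOR_GPU_LOAD:
	case SKETCH_SENSOR_MEM_LOAD:
	case SKETCH_SENSOR_VCN_LOAD:
		return raw > 100 ? 100 : raw;
	default:
		return raw;
	}
}
```

With this shape, a corrupted reading such as UINT32_MAX from a load sensor is reported as 100, while sensors that are not percentages are left untouched.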
Best Regards,
Kevin

-----Original Message-----
From: Lazar, Lijo <[email protected]>
Sent: Friday, February 27, 2026 13:40
To: Wang, Yang(Kevin) <[email protected]>; Alex Deucher <[email protected]>
Cc: [email protected]; Deucher, Alexander <[email protected]>; Zhang, Hawking <[email protected]>; Feng, Kenneth <[email protected]>
Subject: Re: [PATCH] drm/amd/pm: restrict sensor load values to 0-100

On 27-Feb-26 10:14 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Ping...
>

Please restrict this workaround to the affected SOC. Otherwise, if there
are bogus values, we will fix it at the right place.

Thanks,
Lijo

> Best Regards,
> Kevin
>
> -----Original Message-----
> From: Alex Deucher <[email protected]>
> Sent: Wednesday, February 25, 2026 10:24 PM
> To: Lazar, Lijo <[email protected]>
> Cc: Wang, Yang(Kevin) <[email protected]>;
> [email protected]; Deucher, Alexander
> <[email protected]>; Zhang, Hawking <[email protected]>;
> Feng, Kenneth <[email protected]>
> Subject: Re: [PATCH] drm/amd/pm: restrict sensor load values to 0-100
>
> On Wed, Feb 25, 2026 at 7:14 AM Lazar, Lijo <[email protected]> wrote:
>>
>>
>>
>> On 25-Feb-26 3:04 PM, Yang Wang wrote:
>>> Limit GPU/MEM/VCN load sensor values to 0-100 range via clamp_t to
>>> ensure validity.
>>>
>>
>> Is this a workaround? If it's not within range, it indicates some
>> underlying issue.
>
> Likely for:
> https://gitlab.freedesktop.org/drm/amd/-/issues/4905
>
> Alex
>
>>
>> Thanks,
>> Lijo
>>
>>> Signed-off-by: Yang Wang <[email protected]>
>>> ---
>>>   drivers/gpu/drm/amd/pm/amdgpu_pm.c | 27 +++++++++++++++++++++++----
>>>   1 file changed, 23 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> index 938361ecae05..86ef1ffbf1dd 100644
>>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> @@ -1414,20 +1414,39 @@ static ssize_t
>>> amdgpu_set_pp_power_profile_mode(struct device *dev,
>>>
>>>   static int amdgpu_pm_get_sensor_generic(struct amdgpu_device *adev,
>>>                                           enum amd_pp_sensors sensor,
>>> -                                         void *query)
>>> +                                         uint32_t *val)
>>>   {
>>> -         int r, size = sizeof(uint32_t);
>>> +         uint32_t tmp = UINT_MAX, size = sizeof(tmp);
>>> +         int r;
>>> +
>>> +         if (!val)
>>> +                 return -EINVAL;
>>>
>>>           r = amdgpu_pm_get_access_if_active(adev);
>>>           if (r)
>>>                   return r;
>>>
>>>           /* get the sensor value */
>>> -         r = amdgpu_dpm_read_sensor(adev, sensor, query, &size);
>>> +         r = amdgpu_dpm_read_sensor(adev, sensor, (void *)&tmp, &size);
>>>
>>>           amdgpu_pm_put_access(adev);
>>>
>>> -         return r;
>>> +         if (r)
>>> +                 return r;
>>> +
>>> +         switch (sensor) {
>>> +         case AMDGPU_PP_SENSOR_GPU_LOAD:
>>> +         case AMDGPU_PP_SENSOR_MEM_LOAD:
>>> +         case AMDGPU_PP_SENSOR_VCN_LOAD:
>>> +                 tmp = clamp_t(uint32_t, tmp, 0, 100);
>>> +                 break;
>>> +         default:
>>> +                 break;
>>> +         }
>>> +
>>> +         *val = tmp;
>>> +
>>> +         return 0;
>>>   }
>>>
>>>   /**
>>
