Hi Calvin,

On Mon, Mar 09, 2026 at 09:24:57PM -0700, Calvin Owens wrote:
> Commit e1b385726f7f ("drm/amd/display: Add additional checks for PSP
> footer size") introduced a use of an uninitialized stack variable
> in dm_dmub_sw_init() (region_params.bss_data_size).
> 
> Interestingly, this seems to cause no issue on normal kernels. But when
> full LTO is enabled, it causes the compiler to "optimize" out huge
> swaths of amdgpu initialization code, and the driver is unusable:

Yeah, this appears to be a very unfortunate case of "clang encountered known
undefined behavior and stopped code generation". We would like to avoid
that, but figuring out a proper upstreamable solution is hard. The most
recent attempt:

  https://github.com/llvm/llvm-project/pull/146791

My guess is that LTO allows inlining of
dmub_srv_get_fw_meta_info_from_raw_fw() into dm_dmub_sw_init(), at which
point the compiler can see that the uninitialized
region_params.bss_data_size is used through
fw_meta_info_params.fw_bss_data, so it gives up generating the rest of
the function.
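
To make the pattern concrete, here is a minimal standalone sketch of what
I think is going on. The struct layouts and the pick_bss() helper are
made up for illustration and are not the real driver definitions, and the
le32_to_cpu() conversions are left out. The buggy read is kept only as a
comment so the snippet itself stays well defined:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures; not the real layouts. */
struct fw_header {
	uint32_t inst_const_bytes;
	uint32_t bss_data_bytes;
};

struct region_params {
	uint32_t inst_const_size;
	uint32_t bss_data_size;
};

struct meta_info_params {
	const uint8_t *fw_bss_data;
};

/*
 * Same shape as the fixed code: the condition reads the firmware header,
 * which is always initialized, instead of a field of a stack struct that
 * has not been written yet.
 */
static const uint8_t *pick_bss(const uint8_t *fw, const struct fw_header *hdr)
{
	struct region_params region_params; /* deliberately not initialized */
	struct meta_info_params meta;

	/*
	 * Buggy form (undefined behavior, shown as a comment only):
	 *   meta.fw_bss_data = region_params.bss_data_size ? ... : NULL;
	 * Once everything is inlined, the optimizer is allowed to assume
	 * that this read never happens and drop whatever depends on it.
	 */
	meta.fw_bss_data = hdr->bss_data_bytes ? fw + hdr->inst_const_bytes : NULL;

	/* In this sketch, region_params is only written after the use above. */
	region_params.bss_data_size = hdr->bss_data_bytes;
	(void)region_params;

	return meta.fw_bss_data;
}
```

With a header that reports a nonzero bss size, pick_bss() returns a pointer
past the instruction constants; with a zero size it returns NULL.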

>     amdgpu 0000:03:00.0: [drm] Loading DMUB firmware via PSP: version=0x07002F00
>     amdgpu 0000:03:00.0: sw_init of IP block <dm> failed 5
>     amdgpu 0000:03:00.0: amdgpu_device_ip_init failed
>     amdgpu 0000:03:00.0: Fatal error during GPU init
> 
> It surprises me that neither gcc nor clang emit a warning about this: I
> only found it by bisecting the LTO breakage.

gcc's -Wmaybe-uninitialized is disabled by default for the kernel, but
even enabling it with KCFLAGS does not flag this instance, which I find
quite surprising. For clang, it is harder because the warning is issued
early in the frontend, where it might not be able to track a value that
well.

> Fix by using the old value for region_params.bss_data_size in place of
> the uninitialized reference, which makes amdgpu work with LTO again.
> 
> Fixes: e1b385726f7f ("drm/amd/display: Add additional checks for PSP footer size")
> Signed-off-by: Calvin Owens <[email protected]>
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index b3d6f2cd8ab6..e69e61163ae9 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -2554,7 +2554,7 @@ static int dm_dmub_sw_init(struct amdgpu_device *adev)
>       fw_meta_info_params.fw_inst_const = adev->dm.dmub_fw->data +
>                                           le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
>                                           PSP_HEADER_BYTES_256;
> -     fw_meta_info_params.fw_bss_data = region_params.bss_data_size ? adev->dm.dmub_fw->data +
> +     fw_meta_info_params.fw_bss_data = le32_to_cpu(hdr->bss_data_bytes) ? adev->dm.dmub_fw->data +

Maybe it would be better to use fw_meta_info_params.bss_data_size
instead of le32_to_cpu(hdr->bss_data_bytes)? Obviously it is the same
value but it would result in a smaller change. It seems likely that this
was just a copy and paste failure.
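
Roughly what I have in mind, as a trimmed-down sketch. The struct and the
fill_meta() helper here are hypothetical, the le32_to_cpu() conversions
are omitted, and it assumes the size field is assigned before the pointer
is derived, which would need to be checked against the actual ordering in
dm_dmub_sw_init():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the real parameter struct. */
struct meta_info_params {
	uint32_t bss_data_size;
	const uint8_t *fw_bss_data;
};

static void fill_meta(struct meta_info_params *p, const uint8_t *fw,
		      uint32_t inst_const_bytes, uint32_t bss_data_bytes)
{
	/* Assumption: the size is stored first, so the read below is defined. */
	p->bss_data_size = bss_data_bytes;

	/* Only the struct name in the condition differs from the buggy line. */
	p->fw_bss_data = p->bss_data_size ? fw + inst_const_bytes : NULL;
}
```

The result is the same as testing the header value directly, but the diff
against the original line is a single identifier.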

>                                         le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
>                                         le32_to_cpu(hdr->inst_const_bytes) : NULL;
>       fw_meta_info_params.custom_psp_footer_size = 0;
> -- 
> 2.47.3
> 
