On 13.10.25 09:09, Lazar, Lijo wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
>> -----Original Message-----
>> From: Zhang, Jesse(Jie) <[email protected]>
>> Sent: Monday, October 13, 2025 11:25 AM
>> To: Lazar, Lijo <[email protected]>; [email protected]; dri-
>> [email protected]
>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>> <[email protected]>; Yang, Philip <[email protected]>
>> Subject: RE: [PATCH] drm/ttm: Add NULL check in
>> ttm_resource_manager_usage
>>
>>> -----Original Message-----
>>> From: Lazar, Lijo <[email protected]>
>>> Sent: Monday, October 13, 2025 12:37 PM
>>> To: Zhang, Jesse(Jie) <[email protected]>;
>>> [email protected]; [email protected]
>>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>>> <[email protected]>; Zhang, Jesse(Jie) <[email protected]>;
>>> Yang, Philip <[email protected]>; Zhang, Jesse(Jie)
>>> <[email protected]>
>>> Subject: RE: [PATCH] drm/ttm: Add NULL check in
>>> ttm_resource_manager_usage
>>>
>>> The specific trace through amdgpu_mem_info_vram_used_show should
>>> be fixed by this patch: "drm/amdgpu: hide VRAM sysfs attributes on
>>> GPUs without VRAM"
>> Thanks @Lazar, Lijo. Maybe we can still use this patch to fix the crash
>> when calling AMDGPU_CS and querying AMDGPU_INFO_VRAM_USAGE,
>> or add a check like the one in the previous patch.
>>
> [lijo]
> 
> Agreed, there are indeed multiple call sites of ttm_resource_manager_usage.
> You may use the same check as in the hide VRAM patch,
> ttm_resource_manager_used, in case ttm doesn't take this change.

Yeah, agree.

When the VRAM manager isn't initialized we shouldn't be calling any of its 
functions in the first place.

Maybe it is a good idea to add something like
"if (WARN_ON_ONCE(!man)) return 0;" to prevent the crashes and only get a
nice warning into the system log.
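
A minimal userspace sketch of that guard, for illustration only. The names
sketch_manager, sketch_warn_on_once and sketch_manager_usage are hypothetical
stand-ins, not TTM code: the real function would use the kernel's
WARN_ON_ONCE() macro and reads man->usage under man->bdev->lru_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ttm_resource_manager. The real
 * structure has more fields, and usage is protected by bdev->lru_lock. */
struct sketch_manager {
	uint64_t usage;
};

/* Counts emitted warnings so the "once" behaviour can be checked. */
static unsigned int sketch_warn_count;

/* Userspace imitation of WARN_ON_ONCE(): report only the first hit and
 * return whether the condition was true. */
static bool sketch_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

static uint64_t sketch_manager_usage(struct sketch_manager *man)
{
	static bool warned;

	/* Guard: warn once and report zero usage instead of crashing. */
	if (sketch_warn_on_once(!man, &warned, "manager is NULL"))
		return 0;

	/* The kernel function takes man->bdev->lru_lock around this read. */
	return man->usage;
}

/* Small self-check: valid manager, then two NULL calls, one warning. */
static void sketch_selfcheck(void)
{
	struct sketch_manager vram = { .usage = 4096 };

	assert(sketch_manager_usage(&vram) == 4096);
	assert(sketch_manager_usage(NULL) == 0);
	assert(sketch_manager_usage(NULL) == 0);
	assert(sketch_warn_count == 1);
}
```

With a guard like this a caller that passes NULL gets 0 back and the log
gets a single warning, so the buggy call site stays visible instead of
being silently papered over.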

Regards,
Christian.

> 
> Thanks,
> Lijo
> 
>> Regards
>> Jesse
>>
>> [  911.954646] BUG: kernel NULL pointer dereference, address: 00000000000008f8
>> [  911.962437] #PF: supervisor write access in kernel mode
>> [  912.007045] RIP: 0010:_raw_spin_lock+0x1e/0x40
>> [  912.105151]  amdttm_resource_manager_usage+0x1f/0x40 [amdttm]
>> [  912.111579]  amdgpu_cs_parser_bos.isra.0+0x543/0x800 [amdgpu]
>>
>>>
>>> Thanks,
>>> Lijo
>>>> -----Original Message-----
>>>> From: amd-gfx <[email protected]> On Behalf Of
>>>> Jesse.Zhang
>>>> Sent: Monday, October 13, 2025 7:25 AM
>>>> To: [email protected]; [email protected]
>>>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>>>> <[email protected]>; Zhang, Jesse(Jie) <[email protected]>;
>>>> Yang, Philip <[email protected]>; Zhang, Jesse(Jie)
>>>> <[email protected]>
>>>> Subject: [PATCH] drm/ttm: Add NULL check in
>>>> ttm_resource_manager_usage
>>>>
>>>> Add a NULL pointer check in ttm_resource_manager_usage() to prevent
>>>> kernel NULL pointer dereferences when the function is called with an
>>>> uninitialized resource manager.
>>>>
>>>> This fixes a kernel OOPS observed on APU devices where the VRAM
>>>> resource manager is not fully initialized, but various sysfs and
>>>> debug interfaces still attempt to query VRAM usage statistics.
>>>>
>>>> The crash backtrace showed:
>>>>    BUG: kernel NULL pointer dereference, address: 00000000000008f8
>>>>    Call Trace:
>>>>     amdttm_resource_manager_usage+0x1f/0x40 [amdttm]
>>>>     amdgpu_mem_info_vram_used_show+0x1e/0x40 [amdgpu]
>>>>     dev_attr_show+0x1d/0x40
>>>>     kernfs_seq_show+0x27/0x30
>>>>
>>>> By returning 0 for NULL managers, we allow callers to safely query
>>>> usage information even when the underlying resource manager is not
>>>> available, which is the expected behavior for devices without
>>>> dedicated VRAM like APUs.
>>>>
>>>> Suggested-by: Philip Yang <[email protected]>
>>>> Signed-off-by: Jesse Zhang <[email protected]>
>>>> ---
>>>> drivers/gpu/drm/ttm/ttm_resource.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
>>>> index e2c82ad07eb4..e4d45f75e40a 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_resource.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
>>>> @@ -587,6 +587,9 @@ uint64_t ttm_resource_manager_usage(struct ttm_resource_manager *man)
>>>>  {
>>>>       uint64_t usage;
>>>>
>>>> +      if (!man)
>>>> +              return 0;
>>>> +
>>>>       spin_lock(&man->bdev->lru_lock);
>>>>       usage = man->usage;
>>>>       spin_unlock(&man->bdev->lru_lock);
>>>> --
>>>> 2.49.0
>>>
>>
> 
